# Fictional Army - Filtering and Sorting

### Introduction:

This exercise was inspired by this [page](http://chrisalbon.com/python/).

Special thanks to: https://github.com/chrisalbon for sharing the dataset and materials.

### Step 1. Import the necessary libraries

In [1]:
import pandas as pd
import numpy as np

### Step 2. This is the data given as a dictionary

In [2]:
# Raw data for a fictional army, one list per column
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'],
            'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'],
            'deaths': [523, 52, 25, 616, 43, 234, 523, 62, 62, 73, 37, 35],
            'battles': [5, 42, 2, 2, 4, 7, 8, 3, 4, 7, 8, 9],
            'size': [1045, 957, 1099, 1400, 1592, 1006, 987, 849, 973, 1005, 1099, 1523],
            'veterans': [1, 5, 62, 26, 73, 37, 949, 48, 48, 435, 63, 345],
            'readiness': [1, 2, 3, 3, 2, 1, 2, 3, 2, 1, 2, 3],
            'armored': [1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
            'deserters': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
            'origin': ['Arizona', 'California', 'Texas', 'Florida', 'Maine', 'Iowa', 'Alaska', 'Washington', 'Oregon', 'Wyoming', 'Louisana', 'Georgia']}

### Step 3. Create a dataframe and assign it to a variable called army. 

#### Don't forget to include the column names

In [5]:
army = pd.DataFrame(raw_data)

In [6]:
army.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 10 columns):
armored      12 non-null int64
battles      12 non-null int64
company      12 non-null object
deaths       12 non-null int64
deserters    12 non-null int64
origin       12 non-null object
readiness    12 non-null int64
regiment     12 non-null object
size         12 non-null int64
veterans     12 non-null int64
dtypes: int64(7), object(3)
memory usage: 1.0+ KB
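
The `info()` output above lists the columns alphabetically, which appears to come from an older pandas release; recent releases keep the insertion order of the dictionary instead. A minimal sketch of pinning the order with the `columns` argument follows. The three-row `sample` dictionary and the name `army_sample` are illustrative stand-ins, not part of the exercise.

```python
import pandas as pd

# A three-row subset of the exercise data, kept small for illustration.
sample = {
    'regiment': ['Nighthawks', 'Dragoons', 'Scouts'],
    'company': ['1st', '2nd', '1st'],
    'deaths': [523, 62, 73],
    'origin': ['Arizona', 'Washington', 'Wyoming'],
}

# columns= fixes the column order explicitly, so it does not depend on
# how the installed pandas version orders dictionary keys.
army_sample = pd.DataFrame(
    sample, columns=['regiment', 'company', 'deaths', 'origin']
)

print(list(army_sample.columns))
# ['regiment', 'company', 'deaths', 'origin']
```

Pinning the order matters for the position-based steps later on, since `iloc` picks columns by position rather than by name.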


### Step 4. Set the 'origin' column as the index of the dataframe

In [9]:
army.set_index(['origin'], inplace=True)
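
As a hedged alternative to `inplace=True`, `set_index` can also return a new DataFrame that is assigned back to a name. The sketch below uses a small hypothetical frame called `sample` to show that the indexed column moves out of the regular columns.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the army data.
sample = pd.DataFrame({
    'origin': ['Arizona', 'Maine', 'Alaska'],
    'deaths': [523, 43, 523],
})

# Without inplace=True, set_index returns a new frame and leaves
# the original untouched.
indexed = sample.set_index('origin')

print(indexed.index.name)            # origin
print('origin' in indexed.columns)   # False
print('origin' in sample.columns)    # True
```

This is why 'origin' no longer shows up when the column names are printed after the index has been set.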

### Step 5. Print only the column veterans

In [10]:
army['veterans']

origin
Arizona         1
California      5
Texas          62
Florida        26
Maine          73
Iowa           37
Alaska        949
Washington     48
Oregon         48
Wyoming       435
Louisana       63
Georgia       345
Name: veterans, dtype: int64

### Step 6. Print the columns 'veterans' and 'deaths'

In [14]:
army[['veterans', 'deaths']]

| origin | veterans | deaths |
| --- | --- | --- |
| Arizona | 1 | 523 |
| California | 5 | 52 |
| Texas | 62 | 25 |
| Florida | 26 | 616 |
| Maine | 73 | 43 |
| Iowa | 37 | 234 |
| Alaska | 949 | 523 |
| Washington | 48 | 62 |
| Oregon | 48 | 62 |
| Wyoming | 435 | 73 |
| Louisana | 63 | 37 |
| Georgia | 345 | 35 |
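
Steps 5 and 6 differ only in the brackets, but the result types differ too. The sketch below, on a hypothetical two-row frame named `sample`, shows that a single label returns a Series while a list of labels returns a DataFrame, even when the list holds one name.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the army data.
sample = pd.DataFrame(
    {'veterans': [1, 5], 'deaths': [523, 52]},
    index=pd.Index(['Arizona', 'California'], name='origin'),
)

one_column = sample['veterans']       # single label: a Series
as_frame = sample[['veterans']]       # list of labels: a DataFrame

print(type(one_column).__name__)      # Series
print(type(as_frame).__name__)        # DataFrame
```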


### Step 7. Print the names of all the columns

In [15]:
army.columns

Index(['armored', 'battles', 'company', 'deaths', 'deserters', 'readiness',
       'regiment', 'size', 'veterans'],
      dtype='object')

### Step 8. Select the 'deaths', 'size' and 'deserters' columns from Maine and Alaska

In [17]:
army.loc[['Maine', 'Alaska']][['deaths', 'size', 'deserters']]

| origin | deaths | size | deserters |
| --- | --- | --- | --- |
| Maine | 43 | 1592 | 3 |
| Alaska | 523 | 987 | 24 |
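
The same selection can be written as a single `.loc` call that takes the row labels and the column labels together, which avoids building an intermediate frame. A minimal sketch on a hypothetical three-row frame named `sample`:

```python
import pandas as pd

# Hypothetical frame with 'origin' already set as the index.
sample = pd.DataFrame(
    {
        'deaths': [523, 43, 523],
        'size': [1045, 1592, 987],
        'deserters': [4, 3, 24],
        'battles': [5, 4, 8],
    },
    index=pd.Index(['Arizona', 'Maine', 'Alaska'], name='origin'),
)

# Rows first, columns second, both by label.
subset = sample.loc[['Maine', 'Alaska'], ['deaths', 'size', 'deserters']]
print(subset)
```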


### Step 9. Select the rows 3 to 7 and the columns 3 to 6

In [18]:
army.iloc[2:7, 2:6]

| origin | company | deaths | deserters | readiness |
| --- | --- | --- | --- | --- |
| Texas | 2nd | 25 | 31 | 3 |
| Florida | 2nd | 616 | 2 | 3 |
| Maine | 1st | 43 | 3 | 2 |
| Iowa | 1st | 234 | 4 | 1 |
| Alaska | 2nd | 523 | 24 | 2 |
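
The slice bounds look off by one because `iloc` counts from zero and excludes the stop position. The sketch below uses a hypothetical 8x7 grid of integers, not the army data, to show how "rows 3 to 7, columns 3 to 6" in 1-based counting maps to `2:7` and `2:6`.

```python
import numpy as np
import pandas as pd

# Hypothetical 8x7 grid; the cell at row r, column c holds r * 7 + c.
grid = pd.DataFrame(np.arange(56).reshape(8, 7), columns=list('abcdefg'))

# 1-based rows 3..7 are positions 2..6, so the slice is 2:7.
# 1-based columns 3..6 are positions 2..5, so the slice is 2:6.
block = grid.iloc[2:7, 2:6]

print(block.shape)           # (5, 4)
print(list(block.columns))   # ['c', 'd', 'e', 'f']
```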


### Step 10. Select every row after the fourth row

In [20]:
army.iloc[4:]

| origin | armored | battles | company | deaths | deserters | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Maine | 0 | 4 | 1st | 43 | 3 | 2 | Dragoons | 1592 | 73 |
| Iowa | 1 | 7 | 1st | 234 | 4 | 1 | Dragoons | 1006 | 37 |
| Alaska | 0 | 8 | 2nd | 523 | 24 | 2 | Dragoons | 987 | 949 |
| Washington | 1 | 3 | 2nd | 62 | 31 | 3 | Dragoons | 849 | 48 |
| Oregon | 0 | 4 | 1st | 62 | 2 | 2 | Scouts | 973 | 48 |
| Wyoming | 0 | 7 | 1st | 73 | 3 | 1 | Scouts | 1005 | 435 |
| Louisana | 1 | 8 | 2nd | 37 | 2 | 2 | Scouts | 1099 | 63 |
| Georgia | 1 | 9 | 2nd | 35 | 3 | 3 | Scouts | 1523 | 345 |


### Step 11. Select every row up to the 4th row

In [24]:
army.iloc[:4]

| origin | armored | battles | company | deaths | deserters | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Arizona | 1 | 5 | 1st | 523 | 4 | 1 | Nighthawks | 1045 | 1 |
| California | 0 | 42 | 1st | 52 | 24 | 2 | Nighthawks | 957 | 5 |
| Texas | 1 | 2 | 2nd | 25 | 31 | 3 | Nighthawks | 1099 | 62 |
| Florida | 1 | 2 | 2nd | 616 | 2 | 3 | Nighthawks | 1400 | 26 |


### Step 12. Select the 3rd column up to the 7th column

In [25]:
army.iloc[:,2:7]

| origin | company | deaths | deserters | readiness | regiment |
| --- | --- | --- | --- | --- | --- |
| Arizona | 1st | 523 | 4 | 1 | Nighthawks |
| California | 1st | 52 | 24 | 2 | Nighthawks |
| Texas | 2nd | 25 | 31 | 3 | Nighthawks |
| Florida | 2nd | 616 | 2 | 3 | Nighthawks |
| Maine | 1st | 43 | 3 | 2 | Dragoons |
| Iowa | 1st | 234 | 4 | 1 | Dragoons |
| Alaska | 2nd | 523 | 24 | 2 | Dragoons |
| Washington | 2nd | 62 | 31 | 3 | Dragoons |
| Oregon | 1st | 62 | 2 | 2 | Scouts |
| Wyoming | 1st | 73 | 3 | 1 | Scouts |
| Louisana | 2nd | 37 | 2 | 2 | Scouts |
| Georgia | 2nd | 35 | 3 | 3 | Scouts |


### Step 13. Select rows where deaths is greater than 50

In [26]:
army[army['deaths'] > 50]

| origin | armored | battles | company | deaths | deserters | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Arizona | 1 | 5 | 1st | 523 | 4 | 1 | Nighthawks | 1045 | 1 |
| California | 0 | 42 | 1st | 52 | 24 | 2 | Nighthawks | 957 | 5 |
| Florida | 1 | 2 | 2nd | 616 | 2 | 3 | Nighthawks | 1400 | 26 |
| Iowa | 1 | 7 | 1st | 234 | 4 | 1 | Dragoons | 1006 | 37 |
| Alaska | 0 | 8 | 2nd | 523 | 24 | 2 | Dragoons | 987 | 949 |
| Washington | 1 | 3 | 2nd | 62 | 31 | 3 | Dragoons | 849 | 48 |
| Oregon | 0 | 4 | 1st | 62 | 2 | 2 | Scouts | 973 | 48 |
| Wyoming | 0 | 7 | 1st | 73 | 3 | 1 | Scouts | 1005 | 435 |
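
The filter works in two stages: the comparison builds a boolean Series aligned with the index, and indexing with that Series keeps only the rows marked `True`. A minimal sketch on a hypothetical four-row frame named `sample`:

```python
import pandas as pd

# Hypothetical four-row frame standing in for the army data.
sample = pd.DataFrame(
    {'deaths': [523, 25, 43, 62]},
    index=pd.Index(['Arizona', 'Texas', 'Maine', 'Oregon'], name='origin'),
)

# Stage 1: the comparison yields one True/False per row.
mask = sample['deaths'] > 50
print(mask.tolist())        # [True, False, False, True]

# Stage 2: indexing with the mask keeps only the True rows.
heavy = sample.loc[mask]
print(list(heavy.index))    # ['Arizona', 'Oregon']
```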


### Step 14. Select rows where deaths is greater than 500 or less than 50

In [29]:
army[(army['deaths'] > 500) | (army['deaths'] < 50)]

| origin | armored | battles | company | deaths | deserters | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Arizona | 1 | 5 | 1st | 523 | 4 | 1 | Nighthawks | 1045 | 1 |
| Texas | 1 | 2 | 2nd | 25 | 31 | 3 | Nighthawks | 1099 | 62 |
| Florida | 1 | 2 | 2nd | 616 | 2 | 3 | Nighthawks | 1400 | 26 |
| Maine | 0 | 4 | 1st | 43 | 3 | 2 | Dragoons | 1592 | 73 |
| Alaska | 0 | 8 | 2nd | 523 | 24 | 2 | Dragoons | 987 | 949 |
| Louisana | 1 | 8 | 2nd | 37 | 2 | 2 | Scouts | 1099 | 63 |
| Georgia | 1 | 9 | 2nd | 35 | 3 | 3 | Scouts | 1523 | 345 |
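
The parentheses around each comparison are required because `|` binds more tightly than `>` and `<` in Python. Two other spellings that should give the same rows are sketched below on a hypothetical frame named `sample`: negating `Series.between`, which is inclusive on both ends by default, and `DataFrame.query`.

```python
import pandas as pd

# Hypothetical values chosen to include the boundary cases 50 and 500.
sample = pd.DataFrame({'deaths': [523, 52, 25, 616, 43, 500, 50]})

# Explicit boolean OR, as in the step above.
either = sample[(sample['deaths'] > 500) | (sample['deaths'] < 50)]

# "Not between 50 and 500": between() includes both endpoints,
# so its negation is strictly above 500 or strictly below 50.
outside = sample[~sample['deaths'].between(50, 500)]

# The same condition as a query string.
queried = sample.query('deaths > 500 or deaths < 50')

print(either['deaths'].tolist())   # [523, 25, 616, 43]
```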


### Step 15. Select all the regiments not named "Dragoons"

In [31]:
army[army['regiment'] != 'Dragoons']['regiment'].unique()

array(['Nighthawks', 'Scouts'], dtype=object)
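
The answer above returns only the distinct regiment names. If the goal is the matching rows themselves, the filter can be used without the trailing `['regiment'].unique()`. A minimal sketch on a hypothetical five-row frame named `sample` shows both readings:

```python
import pandas as pd

# Hypothetical five-row frame standing in for the army data.
sample = pd.DataFrame(
    {'regiment': ['Nighthawks', 'Dragoons', 'Scouts', 'Dragoons', 'Scouts']},
    index=pd.Index(
        ['Arizona', 'Maine', 'Oregon', 'Iowa', 'Georgia'], name='origin'
    ),
)

# Reading 1: every row whose regiment is not Dragoons.
not_dragoons = sample[sample['regiment'] != 'Dragoons']
print(list(not_dragoons.index))    # ['Arizona', 'Oregon', 'Georgia']

# Reading 2: just the distinct names among those rows.
names = not_dragoons['regiment'].unique()
print(list(names))                 # ['Nighthawks', 'Scouts']
```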

### Step 16. Select the rows called Texas and Arizona

In [34]:
army.loc[['Texas', 'Arizona']]

| origin | armored | battles | company | deaths | deserters | readiness | regiment | size | veterans |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Texas | 1 | 2 | 2nd | 25 | 31 | 3 | Nighthawks | 1099 | 62 |
| Arizona | 1 | 5 | 1st | 523 | 4 | 1 | Nighthawks | 1045 | 1 |


### Step 17. Select the third cell in the row named Arizona

In [35]:
army.loc['Arizona'].iloc[2]

'1st'

### Step 18. Select the third cell down in the column named deaths

In [39]:
army.loc[:, 'deaths'].iloc[2]

25
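
Single-cell lookups like the last two steps can also be written with the scalar accessors `.at` (by label) and `.iat` (by position). The sketch below uses a hypothetical three-row frame named `sample`; all three expressions reach the same cell.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the army data.
sample = pd.DataFrame(
    {'company': ['1st', '1st', '2nd'], 'deaths': [523, 52, 25]},
    index=pd.Index(['Arizona', 'California', 'Texas'], name='origin'),
)

by_label = sample.at['Texas', 'deaths']    # row label, column label
by_position = sample['deaths'].iloc[2]     # third value in the column
by_both = sample.iat[2, 1]                 # row position 2, column position 1

print(by_label, by_position, by_both)      # 25 25 25
```

Being explicit about label versus position avoids the ambiguity of a bare `[2]` on a Series whose index holds strings.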