# Fictional Army - Filtering and Sorting

### Introduction:

This exercise was inspired by this [page](http://chrisalbon.com/python/).

Special thanks to: https://github.com/chrisalbon for sharing the dataset and materials.

### Step 1. Import the necessary libraries

In [1]:
import pandas as pd

### Step 2. This is the data given as a dictionary

In [2]:
# Create an example dataframe about a fictional army
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'],
            'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'],
            'deaths': [523, 52, 25, 616, 43, 234, 523, 62, 62, 73, 37, 35],
            'battles': [5, 42, 2, 2, 4, 7, 8, 3, 4, 7, 8, 9],
            'size': [1045, 957, 1099, 1400, 1592, 1006, 987, 849, 973, 1005, 1099, 1523],
            'veterans': [1, 5, 62, 26, 73, 37, 949, 48, 48, 435, 63, 345],
            'readiness': [1, 2, 3, 3, 2, 1, 2, 3, 2, 1, 2, 3],
            'armored': [1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
            'deserters': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
            'origin': ['Arizona', 'California', 'Texas', 'Florida', 'Maine', 'Iowa', 'Alaska', 'Washington', 'Oregon', 'Wyoming', 'Louisana', 'Georgia']}

### Step 3. Create a dataframe and assign it to a variable called army (this notebook uses fict_war)

#### Don't forget to include the column names

In [46]:
# First attempt: passing index= at construction labels the rows by origin,
# but 'origin' also stays as a regular column, so it appears twice.
fict_war = pd.DataFrame(raw_data, index=raw_data['origin'], columns=raw_data.keys())
fict_war.index.name = 'Origin'
fict_war.head(5)

# Better: build the frame first and set the index afterwards (Step 4).
fict_war = pd.DataFrame(raw_data, columns=raw_data.keys())

fict_war[:2]

,regiment,company,deaths,battles,size,veterans,readiness,armored,deserters,origin
0,Nighthawks,1st,523,5,1045,1,1,1,4,Arizona
1,Nighthawks,1st,52,42,957,5,2,0,24,California


### Step 4. Set the 'origin' column as the index of the dataframe

In [15]:
fict_war.set_index('origin', drop=True, inplace=True)

fict_war[:2]

origin,regiment,company,deaths,battles,size,veterans,readiness,armored,deserters
Arizona,Nighthawks,1st,523,5,1045,1,1,1,4
California,Nighthawks,1st,52,42,957,5,2,0,24
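
The duplication noted in Step 3 and the fix in Step 4 can be reproduced on a trimmed copy of the data. This is only a sketch: `raw`, `dup`, and `moved` are hypothetical names, and three rows stand in for the full table.

```python
import pandas as pd

# Trimmed, illustrative copy of the army data (not the full table).
raw = {
    'regiment': ['Nighthawks', 'Dragoons', 'Scouts'],
    'deaths': [523, 43, 62],
    'origin': ['Arizona', 'Maine', 'Oregon'],
}

# index= at construction labels the rows but keeps 'origin' as a column too.
dup = pd.DataFrame(raw, index=raw['origin'])

# set_index moves the column into the index (drop=True is the default).
moved = pd.DataFrame(raw).set_index('origin')

print('origin' in dup.columns)    # True
print('origin' in moved.columns)  # False
print(moved.index.name)           # origin
```

Because `set_index` returns a new frame by default, assigning its result is an alternative to `inplace=True`.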


### Step 5. Print only the column veterans

In [16]:
fict_war['veterans']

origin
Arizona         1
California      5
Texas          62
Florida        26
Maine          73
Iowa           37
Alaska        949
Washington     48
Oregon         48
Wyoming       435
Louisana       63
Georgia       345
Name: veterans, dtype: int64

### Step 6. Print the columns 'veterans' and 'deaths'

In [18]:
fict_war[['veterans', 'deaths']]

origin,veterans,deaths
Arizona,1,523
California,5,52
Texas,62,25
Florida,26,616
Maine,73,43
Iowa,37,234
Alaska,949,523
Washington,48,62
Oregon,48,62
Wyoming,435,73
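
The difference between Steps 5 and 6 is the return type: a single label gives a Series, while a list of labels gives a DataFrame. A minimal sketch on a three-row excerpt (`demo` is a hypothetical name used only for illustration):

```python
import pandas as pd

# Three-row excerpt of the army data, indexed by origin.
demo = pd.DataFrame(
    {'veterans': [1, 5, 62], 'deaths': [523, 52, 25]},
    index=pd.Index(['Arizona', 'California', 'Texas'], name='origin'),
)

one_col = demo['veterans']               # single label -> Series
two_cols = demo[['veterans', 'deaths']]  # list of labels -> DataFrame
as_frame = demo[['veterans']]            # one-element list -> still a DataFrame

print(type(one_col).__name__)   # Series
print(type(two_cols).__name__)  # DataFrame
print(as_frame.shape)           # (3, 1)
```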


### Step 7. Print the name of all the columns.

In [21]:
fict_war.columns

Index(['regiment', 'company', 'deaths', 'battles', 'size', 'veterans',
       'readiness', 'armored', 'deserters'],
      dtype='object')

### Step 8. Select the 'deaths', 'size' and 'deserters' columns from Maine and Alaska

In [60]:
# 'origin' must be the index (Step 4) for the label lookup to find these rows
fict_war.loc[['Maine', 'Alaska'], ['deaths', 'size', 'deserters']]

origin,deaths,size,deserters
Maine,43,1592,3
Alaska,523,987,24
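
Selecting rows by label and columns by name in one step can be sketched with `.loc`. The example assumes a small excerpt of the data with `origin` as the index; `demo`, `cols`, `picked`, and `via_mask` are hypothetical names.

```python
import pandas as pd

# Excerpt of the army data, indexed by origin.
demo = pd.DataFrame(
    {
        'deaths': [43, 234, 523],
        'battles': [4, 7, 8],
        'size': [1592, 1006, 987],
        'deserters': [3, 4, 24],
    },
    index=pd.Index(['Maine', 'Iowa', 'Alaska'], name='origin'),
)

cols = ['deaths', 'size', 'deserters']

# Row labels and column labels in a single .loc call.
picked = demo.loc[['Maine', 'Alaska'], cols]

# The same rows through a boolean mask on the index.
via_mask = demo.loc[demo.index.isin(['Maine', 'Alaska']), cols]

print(picked)
print(picked.equals(via_mask))  # True
```

If the index is still the default integer range, neither form finds 'Maine' or 'Alaska': the mask version returns an empty frame and the label version raises a `KeyError`.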


### Step 9. Select the rows 3 to 7 and the columns 3 to 6

In [62]:
# Positional selection: rows at positions 3-6 and columns at positions 3-5
# (iloc slices are 0-based and exclude the end position)
fict_war.iloc[3:7, 3:6]

,battles,size,veterans
3,2,1400,26
4,4,1592,73
5,7,1006,37
6,8,987,949
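
Since `iloc` slices are 0-based and half-open, it helps to check the bounds on a small frame. The sketch below uses made-up numbers in a hypothetical `demo` frame, not the army data:

```python
import pandas as pd

# Illustrative frame: 8 rows, 4 columns, default integer index.
demo = pd.DataFrame({
    'a': list(range(0, 8)),
    'b': list(range(10, 18)),
    'c': list(range(20, 28)),
    'd': list(range(30, 38)),
})

# Rows at positions 3, 4, 5, 6 and columns at positions 1, 2.
block = demo.iloc[3:7, 1:3]

print(block.shape)             # (4, 2)
print(block.index.tolist())    # [3, 4, 5, 6]
print(block.columns.tolist())  # ['b', 'c']
```

A slice `start:stop` always yields `stop - start` positions, which makes it easy to check whether an exercise's "3 to 7" is meant inclusively.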


### Step 10. Select every row after the fourth row

In [64]:
# The fourth row sits at position 3, so "after the fourth row" starts at position 4
fict_war.iloc[4:]

,regiment,company,deaths,battles,size,veterans,readiness,armored,deserters,origin
4,Dragoons,1st,43,4,1592,73,2,0,3,Maine
5,Dragoons,1st,234,7,1006,37,1,1,4,Iowa
6,Dragoons,2nd,523,8,987,949,2,0,24,Alaska
7,Dragoons,2nd,62,3,849,48,3,1,31,Washington
8,Scouts,1st,62,4,973,48,2,0,2,Oregon
9,Scouts,1st,73,7,1005,435,1,0,3,Wyoming
10,Scouts,2nd,37,8,1099,63,2,1,2,Louisana
11,Scouts,2nd,35,9,1523,345,3,1,3,Georgia


### Step 11. Select every row up to the 4th row

In [65]:
fict_war[:3]

,regiment,company,deaths,battles,size,veterans,readiness,armored,deserters,origin
0,Nighthawks,1st,523,5,1045,1,1,1,4,Arizona
1,Nighthawks,1st,52,42,957,5,2,0,24,California
2,Nighthawks,2nd,25,2,1099,62,3,1,31,Texas


### Step 12. Select the 3rd column up to the 7th column

In [37]:
fict_war.iloc[: , 3:7]

origin,battles,size,veterans,readiness
Arizona,5,1045,1,1
California,42,957,5,2
Texas,2,1099,62,3
Florida,2,1400,26,3
Maine,4,1592,73,2
Iowa,7,1006,37,1
Alaska,8,987,949,2
Washington,3,849,48,3
Oregon,4,973,48,2
Wyoming,7,1005,435,1


### Step 13. Select rows where df.deaths is greater than 50

In [70]:
fict_war[fict_war['deaths'] > 50]

,regiment,company,deaths,battles,size,veterans,readiness,armored,deserters,origin
0,Nighthawks,1st,523,5,1045,1,1,1,4,Arizona
1,Nighthawks,1st,52,42,957,5,2,0,24,California
3,Nighthawks,2nd,616,2,1400,26,3,1,2,Florida
5,Dragoons,1st,234,7,1006,37,1,1,4,Iowa
6,Dragoons,2nd,523,8,987,949,2,0,24,Alaska
7,Dragoons,2nd,62,3,849,48,3,1,31,Washington
8,Scouts,1st,62,4,973,48,2,0,2,Oregon
9,Scouts,1st,73,7,1005,435,1,0,3,Wyoming
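
The comparison inside the brackets produces a boolean Series, and indexing with it keeps only the rows marked `True`. A minimal sketch on four rows of the data (`demo` and `mask` are hypothetical names):

```python
import pandas as pd

# Four rows of the 'deaths' column, indexed by origin.
demo = pd.DataFrame(
    {'deaths': [523, 52, 25, 616]},
    index=pd.Index(['Arizona', 'California', 'Texas', 'Florida'], name='origin'),
)

mask = demo['deaths'] > 50        # boolean Series aligned with demo's index
print(mask.tolist())              # [True, True, False, True]
print(int(mask.sum()))            # 3 rows satisfy the condition
print(demo[mask].index.tolist())  # ['Arizona', 'California', 'Florida']
```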


### Step 14. Select rows where df.deaths is greater than 500 or less than 50

In [74]:
# Combine the conditions with the element-wise | operator and give each
# comparison its own parentheses; Python's `or` keyword does not work on a boolean Series.
fict_war[(fict_war['deaths'] > 500) | (fict_war['deaths'] < 50)]
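
On the `or` question: a hedged sketch of why `|` is needed, using five values from the `deaths` column. The names `deaths`, `mask`, and `raised` are illustrative only.

```python
import pandas as pd

# Five values from the 'deaths' column, indexed by origin.
deaths = pd.Series(
    [523, 52, 25, 616, 43],
    index=['Arizona', 'California', 'Texas', 'Florida', 'Maine'],
)

# | compares element by element; the parentheses are required because
# | binds more tightly than > and <.
mask = (deaths > 500) | (deaths < 50)
print(deaths[mask].index.tolist())  # ['Arizona', 'Texas', 'Florida', 'Maine']

# `or` asks for the truth value of a whole Series, which pandas refuses to guess.
raised = False
try:
    (deaths > 500) or (deaths < 50)
except ValueError:
    raised = True
print(raised)  # True
```

The same rule applies to `&` versus `and`, and to `~` versus `not`.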

### Step 15. Select all the regiments not named "Dragoons"

In [44]:
fict_war[fict_war.regiment != 'Dragoons']

origin,regiment,company,deaths,battles,size,veterans,readiness,armored,deserters
Arizona,Nighthawks,1st,523,5,1045,1,1,1,4
California,Nighthawks,1st,52,42,957,5,2,0,24
Texas,Nighthawks,2nd,25,2,1099,62,3,1,31
Florida,Nighthawks,2nd,616,2,1400,26,3,1,2
Oregon,Scouts,1st,62,4,973,48,2,0,2
Wyoming,Scouts,1st,73,7,1005,435,1,0,3
Louisana,Scouts,2nd,37,8,1099,63,2,1,2
Georgia,Scouts,2nd,35,9,1523,345,3,1,3
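
Two side notes on this filter, sketched on a hypothetical three-row `demo` frame: `~isin` gives the same result as `!=` and extends to several names, and attribute access such as `fict_war.regiment` is not safe for every column, because `size` is already a DataFrame attribute.

```python
import pandas as pd

# Three-row excerpt with the 'regiment' and 'size' columns.
demo = pd.DataFrame({
    'regiment': ['Nighthawks', 'Dragoons', 'Scouts'],
    'size': [1045, 1592, 973],
})

not_dragoons = demo[demo['regiment'] != 'Dragoons']
also = demo[~demo['regiment'].isin(['Dragoons'])]
print(not_dragoons.equals(also))  # True

# demo.size is the number of cells (3 rows * 2 columns), not the column.
print(demo.size)                  # 6
print(demo['size'].tolist())      # [1045, 1592, 973]
```

Bracket access (`demo['size']`) works for every column name, so it is the safer habit.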


### Step 16. Select the rows called Texas and Arizona

In [77]:
fict_war[fict_war.index.isin(['Arizona', 'Texas'])]

# .ix was removed in pandas 1.0; .loc selects rows by label
fict_war.loc[['Arizona', 'Texas']]

origin,regiment,company,deaths,battles,size,veterans,readiness,armored,deserters
Arizona,Nighthawks,1st,523,5,1045,1,1,1,4
Texas,Nighthawks,2nd,25,2,1099,62,3,1,31
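
The two ways of picking rows by name differ in one detail: a label list passed to `.loc` returns rows in the order requested, while an `isin` mask keeps the frame's own order. A small sketch, with `demo`, `by_label`, and `by_mask` as hypothetical names:

```python
import pandas as pd

# Three rows of the 'deaths' column, indexed by origin.
demo = pd.DataFrame(
    {'deaths': [523, 52, 25]},
    index=pd.Index(['Arizona', 'California', 'Texas'], name='origin'),
)

by_label = demo.loc[['Texas', 'Arizona']]                # order of the list
by_mask = demo[demo.index.isin(['Texas', 'Arizona'])]    # order of the frame

print(by_label.index.tolist())  # ['Texas', 'Arizona']
print(by_mask.index.tolist())   # ['Arizona', 'Texas']
```

In recent pandas versions `.loc` also raises a `KeyError` when a requested label is missing, whereas the mask simply matches nothing.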


### Step 17. Select the third cell in the row named Arizona

In [78]:
# Label-based lookup; 'deaths' is the third column once 'origin' is the index
fict_war.loc['Arizona', 'deaths']

523

### Step 18. Select the third cell down in the column named deaths

In [79]:
# Third value by position in the 'deaths' column
fict_war['deaths'].iloc[2]

25
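
For single cells, the label-based and position-based accessors can be compared side by side. This sketch assumes a three-row excerpt with `origin` as the index; `demo` is a hypothetical name.

```python
import pandas as pd

# Three-row excerpt; 'deaths' is the third column (position 2).
demo = pd.DataFrame(
    {
        'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks'],
        'company': ['1st', '1st', '2nd'],
        'deaths': [523, 52, 25],
    },
    index=pd.Index(['Arizona', 'California', 'Texas'], name='origin'),
)

print(demo.loc['Arizona', 'deaths'])  # 523  (row label, column label)
print(demo.iloc[0, 2])                # 523  (row position, column position)
print(demo['deaths'].iloc[2])         # 25   (third value in the column)

# .at and .iat are the scalar-only counterparts of .loc and .iloc.
print(demo.at['Arizona', 'deaths'], demo.iat[2, 2])  # 523 25
```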