### Topics:
- Set index/reset index
- Using loc and iloc

In [1]:
import pandas as pd

### Import the jamesbond.csv

In [2]:
bond = pd.read_csv('../datasets/jamesbond.csv')

bond.head()

Unnamed: 0,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
0,Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
1,From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
2,Goldfinger,1964,Sean Connery,Guy Hamilton,820.4,18.6,3.2
3,Thunderball,1965,Sean Connery,Terence Young,848.1,41.9,4.7
4,Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,


### Setting and resetting the index
- You can set the index on import by using the index_col parameter.
- The index can contain duplicates.
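
The two points above can be checked without the CSV file. This is a minimal sketch that assumes a few rows copied from this notebook's outputs and held in memory with io.StringIO; the names csv_text and films are illustrative, not part of the original notebook.

```python
import io

import pandas as pd

# A handful of rows taken from this notebook's outputs (illustrative subset).
csv_text = """Film,Year,Actor
Dr. No,1962,Sean Connery
Casino Royale,1967,David Niven
Casino Royale,2006,Daniel Craig
"""

# index_col turns the Film column into the row index at read time.
films = pd.read_csv(io.StringIO(csv_text), index_col="Film")

print(films.index.name)                     # name of the index
print(films.index.is_unique)                # False when labels repeat
print(films.index.duplicated(keep=False))   # marks every repeated label
```

Checking index.is_unique early is a cheap way to find out whether a label lookup might return more than one row.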

In [3]:
bond = pd.read_csv('../datasets/jamesbond.csv', index_col='Film')

bond.head(3)

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
Goldfinger,1964,Sean Connery,Guy Hamilton,820.4,18.6,3.2


#### You can also explicitly set the index using a method - set_index( )
- Rerun cell 2 first so that Film is a regular column again.

In [7]:
bond.set_index(keys='Film', inplace=True)

bond.head(1)

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6


### Reset the index using reset_index( )
- To make the change permanent, use inplace=True (otherwise a new DataFrame is returned and the original is left as-is)

In [12]:
bond.reset_index(inplace=True)

bond.head(3)

Unnamed: 0,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
0,Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
1,From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
2,Goldfinger,1964,Sean Connery,Guy Hamilton,820.4,18.6,3.2
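
As an alternative to inplace=True, the result of reset_index( ) can be assigned to a variable. The sketch below assumes a small hand-built frame (the name films and its three rows are illustrative) and also shows the drop=True option, which discards the old index instead of turning it back into a column.

```python
import pandas as pd

# Small illustrative frame indexed by Film.
films = pd.DataFrame(
    {"Year": [1962, 1963, 1964]},
    index=pd.Index(["Dr. No", "From Russia with Love", "Goldfinger"], name="Film"),
)

restored = films.reset_index()          # Film becomes a column again
dropped = films.reset_index(drop=True)  # Film labels are discarded

print(list(restored.columns))
print(list(dropped.columns))
print(films.index.name)  # the original frame is unchanged without inplace=True
```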


### Changing the index once it is set using inplace=True

In [13]:
bond.set_index("Film", inplace=True)

bond.head(3)

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
Goldfinger,1964,Sean Connery,Guy Hamilton,820.4,18.6,3.2


In [14]:
# Setting Year as the index now would replace Film as the index and drop Film from the DataFrame.
# If you don't want that, reset the index first so Film becomes a column again:
bond.reset_index(inplace=True)

bond.set_index('Year', inplace=True)

bond.head(3)

Year,Film,Actor,Director,Box Office,Budget,Bond Actor Salary
1962,Dr. No,Sean Connery,Terence Young,448.8,7.0,0.6
1963,From Russia with Love,Sean Connery,Terence Young,543.8,12.6,1.6
1964,Goldfinger,Sean Connery,Guy Hamilton,820.4,18.6,3.2
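
To see why the reset matters, the two paths can be compared side by side. This is a minimal sketch on a small hand-built frame; the names films, by_film, lost and kept are illustrative.

```python
import pandas as pd

films = pd.DataFrame({
    "Film": ["Dr. No", "From Russia with Love", "Goldfinger"],
    "Year": [1962, 1963, 1964],
    "Actor": ["Sean Connery"] * 3,
})

by_film = films.set_index("Film")

# Setting a new index directly replaces the old one, so the Film labels are gone.
lost = by_film.set_index("Year")

# Resetting first moves Film back into the columns before Year takes over.
kept = by_film.reset_index().set_index("Year")

print("Film" in lost.columns)
print("Film" in kept.columns)
```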


### Retrieve rows by index label with .loc

In [16]:
# set index first
bond = pd.read_csv('../datasets/jamesbond.csv', index_col="Film")
bond.head(1)

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6


### Sort the index first so label lookups are faster

In [18]:
bond.sort_index(inplace=True)
bond.head()

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
A View to a Kill,1985,Roger Moore,John Glen,275.2,54.5,9.1
Casino Royale,2006,Daniel Craig,Martin Campbell,581.5,145.3,3.3
Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,
Diamonds Are Forever,1971,Sean Connery,Guy Hamilton,442.5,34.7,5.8
Die Another Day,2002,Pierce Brosnan,Lee Tamahori,465.4,154.2,17.9


In [19]:
# use the .loc accessor to get data by index label; a unique label returns a Series holding that row's values
bond.loc['Goldfinger']

Year                         1964
Actor                Sean Connery
Director             Guy Hamilton
Box Office                  820.4
Budget                       18.6
Bond Actor Salary             3.2
Name: Goldfinger, dtype: object

In [20]:
# what about duplicate labels? if a label appears more than once, a DataFrame is returned.
bond.loc['Casino Royale']

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Casino Royale,2006,Daniel Craig,Martin Campbell,581.5,145.3,3.3
Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,
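
Because the return type depends on how many rows match, code that consumes the result may need to handle both cases. The sketch below uses a small hand-built frame (films and the variable names are illustrative) and shows that passing the label inside a list returns a DataFrame even for a unique label.

```python
import pandas as pd

films = pd.DataFrame(
    {
        "Year": [1964, 1967, 2006],
        "Actor": ["Sean Connery", "David Niven", "Daniel Craig"],
    },
    index=pd.Index(["Goldfinger", "Casino Royale", "Casino Royale"], name="Film"),
)

single = films.loc["Goldfinger"]         # unique label: one row as a Series
repeated = films.loc["Casino Royale"]    # repeated label: a DataFrame
always_frame = films.loc[["Goldfinger"]] # list of labels: a DataFrame

print(type(single).__name__)
print(type(repeated).__name__)
print(type(always_frame).__name__)
```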


In [23]:
# we can also use a slice to extract all movies from one index label to another; both endpoints are included.
bond.loc['Diamonds Are Forever':'From Russia with Love']

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Diamonds Are Forever,1971,Sean Connery,Guy Hamilton,442.5,34.7,5.8
Die Another Day,2002,Pierce Brosnan,Lee Tamahori,465.4,154.2,17.9
Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
For Your Eyes Only,1981,Roger Moore,John Glen,449.4,60.2,
From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
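
The label slice above includes its end label, which differs from ordinary Python slicing. A minimal sketch on five films from this notebook (the frame films and the name subset are illustrative) makes the boundary visible:

```python
import pandas as pd

films = pd.DataFrame(
    {"Year": [1962, 1964, 1971, 2002, 1963]},
    index=pd.Index(
        [
            "Dr. No",
            "Goldfinger",
            "Diamonds Are Forever",
            "Die Another Day",
            "From Russia with Love",
        ],
        name="Film",
    ),
).sort_index()  # alphabetical order, as in the notebook

# Both the start label and the end label are part of the result.
subset = films.loc["Diamonds Are Forever":"From Russia with Love"]

print(subset.index.tolist())
```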


### Extract more than one row label - use a list!

In [25]:
# order matters: rows come back in the order the labels are listed
bond.loc[['Die Another Day', 'Octopussy']]

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Die Another Day,2002,Pierce Brosnan,Lee Tamahori,465.4,154.2,17.9
Octopussy,1983,Roger Moore,John Glen,373.8,53.9,7.8
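
A short sketch of how list order carries through to the result, on a small hand-built frame (films and the variable names are illustrative). It also assumes a recent pandas release (1.0 or later), where a list containing a label that is not in the index raises a KeyError; the label "Not A Bond Film" is made up for the demonstration.

```python
import pandas as pd

films = pd.DataFrame(
    {"Year": [2002, 1964, 1983]},
    index=pd.Index(["Die Another Day", "Goldfinger", "Octopussy"], name="Film"),
)

forward = films.loc[["Die Another Day", "Octopussy"]]
backward = films.loc[["Octopussy", "Die Another Day"]]

print(forward.index.tolist())
print(backward.index.tolist())

# Hypothetical label that is not in the index.
try:
    films.loc[["Octopussy", "Not A Bond Film"]]
    missing_raises = False
except KeyError:
    missing_raises = True

print(missing_raises)
```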


### Retrieve rows by index position with .iloc

In [26]:
# keep standard numeric default index
bond = pd.read_csv('../datasets/jamesbond.csv')
bond.head(1)

Unnamed: 0,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
0,Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6


In [27]:
# grab a row by index position
bond.iloc[0]

Film                        Dr. No
Year                          1962
Actor                 Sean Connery
Director             Terence Young
Box Office                   448.8
Budget                         7.0
Bond Actor Salary              0.6
Name: 0, dtype: object

In [28]:
# grab the 15th movie (position 14, since positions start at 0)
bond.iloc[14]

Film                   Octopussy
Year                        1983
Actor                Roger Moore
Director               John Glen
Box Office                 373.8
Budget                      53.9
Bond Actor Salary            7.8
Name: 14, dtype: object

### Get more than one row

In [29]:
bond.iloc[[15,20]]

Unnamed: 0,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
15,A View to a Kill,1985,Roger Moore,John Glen,275.2,54.5,9.1
20,The World Is Not Enough,1999,Pierce Brosnan,Michael Apted,439.5,158.3,13.5


### Using a slice
- When slicing by position, the end position is not included, just like a normal list slice.
- Behavior is exactly the same as a list slice.
- When the index holds labels, iloc can still be used to select rows by position.

In [30]:
# this returns rows 10-19; to include row 20, use 21 as the end position, because positional slices exclude the end, unlike label slices.
bond.iloc[10:20]

Unnamed: 0,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
10,The Spy Who Loved Me,1977,Roger Moore,Lewis Gilbert,533.0,45.1,
11,Moonraker,1979,Roger Moore,Lewis Gilbert,535.0,91.5,
12,For Your Eyes Only,1981,Roger Moore,John Glen,449.4,60.2,
13,Never Say Never Again,1983,Sean Connery,Irvin Kershner,380.0,86.0,
14,Octopussy,1983,Roger Moore,John Glen,373.8,53.9,7.8
15,A View to a Kill,1985,Roger Moore,John Glen,275.2,54.5,9.1
16,The Living Daylights,1987,Timothy Dalton,John Glen,313.5,68.8,5.2
17,Licence to Kill,1989,Timothy Dalton,John Glen,250.9,56.7,7.9
18,GoldenEye,1995,Pierce Brosnan,Martin Campbell,518.5,76.9,5.1
19,Tomorrow Never Dies,1997,Pierce Brosnan,Roger Spottiswoode,463.2,133.9,10.0
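
The contrast between positional and label slices, and the fact that iloc still works on a label index, can be sketched on a small hand-built frame (films and the variable names are illustrative):

```python
import pandas as pd

films = pd.DataFrame(
    {"Year": [1962, 1963, 1964, 1965]},
    index=pd.Index(
        ["Dr. No", "From Russia with Love", "Goldfinger", "Thunderball"],
        name="Film",
    ),
)

# Positional slice: the end position (3) is excluded, as with a list.
by_position = films.iloc[1:3]

# Label slice: both endpoints are included.
by_label = films.loc["From Russia with Love":"Thunderball"]

print(by_position.index.tolist())
print(by_label.index.tolist())

# iloc ignores the labels entirely; negative positions count from the end.
print(films.iloc[-1].name)
```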
