## Different choices for indexing 

### .loc is primarily label based, but may also be used with a boolean array. Allowed inputs are:
  - A single label, e.g. `5` or `'a'` (note that `5` is interpreted as a label of the index, not as an integer position).
  - A list or array of labels, e.g. `['a', 'b', 'c']`.
  - A slice object with labels, e.g. `'a':'f'` (when slicing with labels, both endpoints are included).
  - A boolean array (any NA values will be treated as False).
  - A callable function with one argument that returns valid output for indexing (one of the above).
  - Note: `.loc` raises `KeyError` when the items are not found.
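The allowed inputs listed above can be sketched on a tiny frame. This is a minimal illustration, not the notebook's own code: the `films` frame is built by hand from a few rows that appear in the outputs of this section, and the variable names are made up for the example.

```python
import pandas as pd

# Hand-built stand-in for the James Bond data (assumption: only two columns are needed here).
films = pd.DataFrame(
    {
        "Year": [1962, 1964, 2012, 2015],
        "Actor": ["Sean Connery", "Sean Connery", "Daniel Craig", "Daniel Craig"],
    },
    index=pd.Index(["Dr. No", "Goldfinger", "Skyfall", "Spectre"], name="Film"),
)

single = films.loc["Skyfall"]                    # single label -> one row as a Series
several = films.loc[["Spectre", "Dr. No"]]       # list of labels -> rows in the order given
label_slice = films.loc["Goldfinger":"Skyfall"]  # label slice -> both endpoints included
by_mask = films.loc[films["Year"] > 2000]        # boolean array -> rows where the mask is True
by_callable = films.loc[lambda d: d["Actor"] == "Sean Connery"]  # callable receives the frame

print(label_slice)
```

The callable form is mostly useful in method chains, where the intermediate frame has no name to refer to.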


In [12]:
import os
import pandas as pd
os.chdir("/home/mediaworker/anaconda2/envs/pandas_playground/")

df = pd.read_csv("datasets/jamesbond.csv", index_col="Film")

# sort based on index "Film"
df.sort_index(inplace=True)

In [19]:
df.head(10)

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
A View to a Kill,1985,Roger Moore,John Glen,275.2,54.5,9.1
Casino Royale,2006,Daniel Craig,Martin Campbell,581.5,145.3,3.3
Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,
Diamonds Are Forever,1971,Sean Connery,Guy Hamilton,442.5,34.7,5.8
Die Another Day,2002,Pierce Brosnan,Lee Tamahori,465.4,154.2,17.9
Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
For Your Eyes Only,1981,Roger Moore,John Glen,449.4,60.2,
From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
GoldenEye,1995,Pierce Brosnan,Martin Campbell,518.5,76.9,5.1
Goldfinger,1964,Sean Connery,Guy Hamilton,820.4,18.6,3.2


In [6]:
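# Note: the output below appears to come from an earlier run (cell 6), before "Film" was set
# as the index, so it still shows a RangeIndex and lists Film as a regular column
df.info()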

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26 entries, 0 to 25
Data columns (total 7 columns):
Film                 26 non-null object
Year                 26 non-null int64
Actor                26 non-null object
Director             26 non-null object
Box Office           26 non-null float64
Budget               26 non-null float64
Bond Actor Salary    18 non-null float64
dtypes: float64(3), int64(1), object(3)
memory usage: 1.5+ KB


In [18]:
# extract information about film "Moonraker"
df.loc["Moonraker"]

# Note: "Moonraker" is a label in the index ("Film"), not a value in a regular column

Year                          1979
Actor                  Roger Moore
Director             Lewis Gilbert
Box Office                     535
Budget                        91.5
Bond Actor Salary              NaN
Name: Moonraker, dtype: object
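A single label returns a Series only when that label is unique in the index. The data above contains "Casino Royale" twice, so the same call returns a DataFrame for it. A minimal sketch, assuming a hand-built `films` frame with three rows taken from the outputs in this section:

```python
import pandas as pd

# Hand-built stand-in (assumption): "Casino Royale" appears twice, "Moonraker" once.
films = pd.DataFrame(
    {
        "Year": [2006, 1967, 1979],
        "Actor": ["Daniel Craig", "David Niven", "Roger Moore"],
    },
    index=pd.Index(["Casino Royale", "Casino Royale", "Moonraker"], name="Film"),
)

unique_hit = films.loc["Moonraker"]         # unique label -> Series
duplicate_hit = films.loc["Casino Royale"]  # repeated label -> DataFrame with both rows
always_frame = films.loc[["Moonraker"]]     # one-element list -> DataFrame even for a unique label

print(type(unique_hit).__name__, type(duplicate_hit).__name__, type(always_frame).__name__)
```

Passing a one-element list is a simple way to get a consistent return type when the index may contain duplicates.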

In [21]:
# extract all values for index labels from "Casino Royale" to "From Russia with Love"
df.loc["Casino Royale":"From Russia with Love"]

# Note: "Casino Royale" is the start label and "From Russia with Love" is the stop label.
# Unlike Python list slicing, .loc label slicing includes both the start and the stop label

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Casino Royale,2006,Daniel Craig,Martin Campbell,581.5,145.3,3.3
Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,
Diamonds Are Forever,1971,Sean Connery,Guy Hamilton,442.5,34.7,5.8
Die Another Day,2002,Pierce Brosnan,Lee Tamahori,465.4,154.2,17.9
Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
For Your Eyes Only,1981,Roger Moore,John Glen,449.4,60.2,
From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
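To make the inclusive endpoint concrete, the sketch below compares a label slice with the equivalent positional slice. It assumes a small hand-built `films` frame with unique, sorted labels taken from the output above; `Index.get_loc` is used only to translate labels into positions for the comparison.

```python
import pandas as pd

# Hand-built stand-in (assumption): four films with unique, alphabetically sorted labels.
films = pd.DataFrame(
    {"Year": [1985, 1971, 2002, 1962]},
    index=pd.Index(
        ["A View to a Kill", "Diamonds Are Forever", "Die Another Day", "Dr. No"],
        name="Film",
    ),
)

# Label slice: both "Diamonds Are Forever" and "Dr. No" are part of the result.
by_label = films.loc["Diamonds Are Forever":"Dr. No"]

# Positional slice over the same range: the stop position is left out.
start = films.index.get_loc("Diamonds Are Forever")
stop = films.index.get_loc("Dr. No")
by_position = films.iloc[start:stop]

# Adding one to the stop position reproduces the label slice.
same_as_label = films.iloc[start:stop + 1]

print(len(by_label), len(by_position))
```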


In [23]:
# extract all rows from index label "Skyfall" to the end of the DataFrame
df.loc["Skyfall":]

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Skyfall,2012,Daniel Craig,Sam Mendes,943.5,170.2,14.5
Spectre,2015,Daniel Craig,Sam Mendes,726.7,206.3,
The Living Daylights,1987,Timothy Dalton,John Glen,313.5,68.8,5.2
The Man with the Golden Gun,1974,Roger Moore,Guy Hamilton,334.0,27.7,
The Spy Who Loved Me,1977,Roger Moore,Lewis Gilbert,533.0,45.1,
The World Is Not Enough,1999,Pierce Brosnan,Michael Apted,439.5,158.3,13.5
Thunderball,1965,Sean Connery,Terence Young,848.1,41.9,4.7
Tomorrow Never Dies,1997,Pierce Brosnan,Roger Spottiswoode,463.2,133.9,10.0
You Only Live Twice,1967,Sean Connery,Lewis Gilbert,514.2,59.9,4.4


In [24]:
# extract all rows from the start of the DataFrame up to and including "Diamonds Are Forever"
df.loc[:"Diamonds Are Forever"]

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
A View to a Kill,1985,Roger Moore,John Glen,275.2,54.5,9.1
Casino Royale,2006,Daniel Craig,Martin Campbell,581.5,145.3,3.3
Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,
Diamonds Are Forever,1971,Sean Connery,Guy Hamilton,442.5,34.7,5.8


In [25]:
# extract specific list of index labels
df.loc[["Diamonds Are Forever", "Skyfall"]]

# Note: The order of resulting rows will depend on the order given in above filter list

Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
Diamonds Are Forever,1971,Sean Connery,Guy Hamilton,442.5,34.7,5.8
Skyfall,2012,Daniel Craig,Sam Mendes,943.5,170.2,14.5


In [32]:
# .loc raises KeyError if any requested label is missing from the index
# df.loc[["Diamonds Are Forever", "Skyfall2"]]
# =================================================================================
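The commented-out call above can be explored safely with a `try`/`except`. The sketch below assumes pandas 1.0 or later, where a list containing a missing label raises `KeyError` (older releases behaved differently), and uses a hand-built two-row `films` frame. The two workarounds shown, `Index.intersection` and `DataFrame.reindex`, are common options rather than something the notebook itself uses.

```python
import pandas as pd

# Hand-built stand-in (assumption): only the two films used in the example above.
films = pd.DataFrame(
    {"Year": [1971, 2012]},
    index=pd.Index(["Diamonds Are Forever", "Skyfall"], name="Film"),
)
wanted = ["Diamonds Are Forever", "Skyfall2"]  # "Skyfall2" is not in the index

try:
    films.loc[wanted]
    raised = False
except KeyError:
    raised = True

# Option 1: keep only the labels that actually exist in the index.
present = films.index.intersection(wanted)
existing_only = films.loc[present]

# Option 2: keep every requested label; rows for missing labels are filled with NaN.
with_gaps = films.reindex(wanted)

print(raised)
print(with_gaps)
```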

### .iloc is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array. Allowed inputs are:

 - An integer e.g. 5.
 - A list or array of integers [4, 3, 0].
 - A slice object with ints 1:7.
 - A boolean array (any NA values will be treated as False).
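The four input types listed above can be sketched as follows. The `films` frame is a hand-built stand-in holding the first five rows shown later in this section. The boolean mask is converted to a NumPy array before it is passed to `.iloc`, since `.iloc` is position based and is not meant to align a boolean Series by its index.

```python
import pandas as pd

# Hand-built stand-in (assumption): first five films with the default integer index.
films = pd.DataFrame(
    {
        "Film": ["Dr. No", "From Russia with Love", "Goldfinger", "Thunderball", "Casino Royale"],
        "Year": [1962, 1963, 1964, 1965, 1967],
    }
)

one_row = films.iloc[4]          # integer -> row at position 4 as a Series
picked = films.iloc[[4, 3, 0]]   # list of integers -> rows in the order given
sliced = films.iloc[1:3]         # slice -> positions 1 and 2 (stop is excluded)

mask = (films["Year"] > 1963).to_numpy()  # plain boolean array, one entry per row
masked = films.iloc[mask]                 # boolean array -> rows where the mask is True

print(picked)
```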

In [35]:
df = pd.read_csv("datasets/jamesbond.csv")
df.head()

,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
0,Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
1,From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
2,Goldfinger,1964,Sean Connery,Guy Hamilton,820.4,18.6,3.2
3,Thunderball,1965,Sean Connery,Terence Young,848.1,41.9,4.7
4,Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,


In [37]:
# extract the first row of the DataFrame by integer position
df.iloc[0]

Film                        Dr. No
Year                          1962
Actor                 Sean Connery
Director             Terence Young
Box Office                   448.8
Budget                           7
Bond Actor Salary              0.6
Name: 0, dtype: object

In [39]:
# extract the 1st and 13th rows (positions 0 and 12) by integer position
df.iloc[[0, 12]]

,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
0,Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
12,For Your Eyes Only,1981,Roger Moore,John Glen,449.4,60.2,


In [41]:
# extract several rows at once by passing a list of integer positions
df.iloc[[0, 12, 14, 16, 17]]

,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
0,Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
12,For Your Eyes Only,1981,Roger Moore,John Glen,449.4,60.2,
14,Octopussy,1983,Roger Moore,John Glen,373.8,53.9,7.8
16,The Living Daylights,1987,Timothy Dalton,John Glen,313.5,68.8,5.2
17,Licence to Kill,1989,Timothy Dalton,John Glen,250.9,56.7,7.9


In [47]:
# iloc slicing: the start position is included, the stop position is excluded
df.iloc[:5]

,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
0,Dr. No,1962,Sean Connery,Terence Young,448.8,7.0,0.6
1,From Russia with Love,1963,Sean Connery,Terence Young,543.8,12.6,1.6
2,Goldfinger,1964,Sean Connery,Guy Hamilton,820.4,18.6,3.2
3,Thunderball,1965,Sean Connery,Terence Young,848.1,41.9,4.7
4,Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,


In [48]:
df.iloc[2:5]

,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
2,Goldfinger,1964,Sean Connery,Guy Hamilton,820.4,18.6,3.2
3,Thunderball,1965,Sean Connery,Terence Young,848.1,41.9,4.7
4,Casino Royale,1967,David Niven,Ken Hughes,315.0,85.0,


In [49]:
df.iloc[20:]

,Film,Year,Actor,Director,Box Office,Budget,Bond Actor Salary
20,The World Is Not Enough,1999,Pierce Brosnan,Michael Apted,439.5,158.3,13.5
21,Die Another Day,2002,Pierce Brosnan,Lee Tamahori,465.4,154.2,17.9
22,Casino Royale,2006,Daniel Craig,Martin Campbell,581.5,145.3,3.3
23,Quantum of Solace,2008,Daniel Craig,Marc Forster,514.2,181.4,8.1
24,Skyfall,2012,Daniel Craig,Sam Mendes,943.5,170.2,14.5
25,Spectre,2015,Daniel Craig,Sam Mendes,726.7,206.3,
