# Advanced indexing

In [1]:
%matplotlib inline

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
try:
    import seaborn
except ImportError:
    pass


This dataset is borrowed from the [PyCon tutorial of Brandon Rhodes](https://github.com/brandon-rhodes/pycon-pandas-tutorial/) (so all credit to him!). You can download the data files here: [`titles.csv`](https://drive.google.com/file/d/0B3G70MlBnCgKa0U4WFdWdGdVOFU/view?usp=sharing) and [`cast.csv`](https://drive.google.com/file/d/0B3G70MlBnCgKRzRmTWdQTUdjNnM/view?usp=sharing). Put them in a `data/` folder next to this notebook, since the cells below read them from the relative paths `data/cast.csv` and `data/titles.csv`.

In [2]:
cast = pd.read_csv('data/cast.csv')
cast.head()

Unnamed: 0,title,year,name,type,character,n
0,Suuri illusioni,1985,Homo $,actor,Guests,22.0
1,Gangsta Rap: The Glockumentary,2007,Too $hort,actor,Himself,
2,Menace II Society,1993,Too $hort,actor,Lew-Loc,27.0
3,Porndogs: The Adventures of Sadie,2009,Too $hort,actor,Bosco,3.0
4,Stop Pepper Palmer,2014,Too $hort,actor,Himself,


In [3]:
titles = pd.read_csv('data/titles.csv')
titles.head()

Unnamed: 0,title,year
0,The Rising Son,1990
1,Ashes of Kukulcan,2016
2,The Thousand Plane Raid,1969
3,Crucea de piatra,1993
4,The 86,2015


## Setting columns as the index

Why is it useful to have an index?

- Meaningful labels make it easier to remember which data are where
- It unlocks some powerful methods, e.g. with a `DatetimeIndex` for time series
- Selecting data becomes easier and faster

It is this last one we are going to explore here!
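As a brief aside on the second point in the list above, here is a minimal sketch of what a `DatetimeIndex` makes possible. The series `ts` is made-up illustration data, not part of the movie dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series covering 2020-01-01 through 2020-02-29 (60 days).
dates = pd.date_range("2020-01-01", periods=60, freq="D")
ts = pd.Series(np.arange(60), index=dates)

# Partial string indexing: select all of February 2020 with a single label.
feb = ts.loc["2020-02"]

# Resampling to monthly means relies on the index being a DatetimeIndex.
monthly_mean = ts.resample("MS").mean()

print(len(feb))
print(monthly_mean)
```

Neither operation is available when the dates sit in an ordinary column, which is one reason to move them into the index.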

Setting the `title` column as the index:

In [8]:
pd.options.display.max_rows = 10

In [9]:
cast

Unnamed: 0,title,year,name,type,character,n
0,Suuri illusioni,1985,Homo $,actor,Guests,22
1,Gangsta Rap: The Glockumentary,2007,Too $hort,actor,Himself,
2,Menace II Society,1993,Too $hort,actor,Lew-Loc,27
3,Porndogs: The Adventures of Sadie,2009,Too $hort,actor,Bosco,3
4,Stop Pepper Palmer,2014,Too $hort,actor,Himself,
...,...,...,...,...,...,...
3333690,Stuttur Frakki,1993,Sveinbjörg Þórhallsdóttir,actress,Flugfreyja,24
3333691,Foxtrot,1988,Lilja Þórisdóttir,actress,Dóra,24
3333692,Niceland (Population. 1.000.002),2004,Sigríður Jóna Þórisdóttir,actress,Woman in Bus,26
3333693,U.S.S.S.S...,2003,Kristín Andrea Þórðardóttir,actress,Afgr.dama á bensínstöð,17


In [10]:
c = cast.set_index('title')

In [11]:
c.head()

title,year,name,type,character,n
Suuri illusioni,1985,Homo $,actor,Guests,22.0
Gangsta Rap: The Glockumentary,2007,Too $hort,actor,Himself,
Menace II Society,1993,Too $hort,actor,Lew-Loc,27.0
Porndogs: The Adventures of Sadie,2009,Too $hort,actor,Bosco,3.0
Stop Pepper Palmer,2014,Too $hort,actor,Himself,


Instead of filtering with a boolean mask on the `title` column:

In [12]:
%%time
cast[cast['title'] == 'Hamlet']

CPU times: user 480 ms, sys: 8 ms, total: 488 ms
Wall time: 489 ms


Unnamed: 0,title,year,name,type,character,n
1456,Hamlet,1996,Riz Abbasi,actor,Attendant to Claudius,1
6710,Hamlet,1921,Fritz Achterberg,actor,"Fortinbras,",9
10274,Hamlet,2009,Hayden Adams,actor,Laertes,7
10275,Hamlet,2009,Hayden Adams,actor,Player,7
12743,Hamlet,1913,Eric Adeney,actor,Reynaldo,14
...,...,...,...,...,...,...
3212647,Hamlet,1964,Carol Teitel,actress,Lady,
3236803,Hamlet,1969,Jennifer Tudor,actress,Court lady,23
3257437,Hamlet,2000,Diane Venora,actress,Gertrude,3
3284728,Hamlet,1996,Perdita Weeks,actress,Second Player,44


we can now select rows by label with `.loc` (shown here for a different title):

In [15]:
%%time
c.loc['Stop Pepper Palmer']

CPU times: user 176 ms, sys: 8 ms, total: 184 ms
Wall time: 183 ms


title,year,name,type,character,n
Stop Pepper Palmer,2014,Too $hort,actor,Himself,
Stop Pepper Palmer,2014,Spencer Belnap,actor,Lawyer,
Stop Pepper Palmer,2014,Carlton Bluford,actor,Jerome Johnson,
Stop Pepper Palmer,2014,Nasheda Caudle,actor,Back up Dancer,
Stop Pepper Palmer,2014,Andrew Diaz,actor,Pepper Palmer,
...,...,...,...,...,...
Stop Pepper Palmer,2014,Rachel Patten-Moskios,actress,Club Patron,
Stop Pepper Palmer,2014,Amy Savannah,actress,Too Short's Girlfriend,
Stop Pepper Palmer,2014,Eve Speer,actress,Trustee,
Stop Pepper Palmer,2014,Catherine Standiford,actress,Club Patron,
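To compare the two selection styles on the same title, here is a small sketch using a made-up stand-in for `cast`. The frame `toy` and its rows are hypothetical; only the column names mirror the real data.

```python
import pandas as pd

# Toy stand-in for `cast` (invented rows, same column names).
toy = pd.DataFrame({
    "title": ["Hamlet", "Macbeth", "Hamlet", "Othello"],
    "year": [1996, 2015, 2000, 1995],
    "name": ["A", "B", "C", "D"],
})

# Boolean mask: compare every row's title with the target value.
by_mask = toy[toy["title"] == "Hamlet"]

# Label lookup: the title is now the index, so .loc searches the index.
by_label = toy.set_index("title").loc["Hamlet"]

# Both approaches select the same two rows.
print(by_mask)
print(by_label)
```

Because `Hamlet` occurs more than once in the index, `.loc["Hamlet"]` returns a DataFrame with all matching rows. A label that occurs exactly once returns a Series instead; `.loc[["Hamlet"]]` (a list of labels) always returns a DataFrame.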


You can also set multiple columns as the index, which gives a **multi-index** (also called a **hierarchical index**):

In [16]:
c = cast.set_index(['title', 'year'])

In [17]:
c.head()

title,year,name,type,character,n
Suuri illusioni,1985,Homo $,actor,Guests,22.0
Gangsta Rap: The Glockumentary,2007,Too $hort,actor,Himself,
Menace II Society,1993,Too $hort,actor,Lew-Loc,27.0
Porndogs: The Adventures of Sadie,2009,Too $hort,actor,Bosco,3.0
Stop Pepper Palmer,2014,Too $hort,actor,Himself,


In [22]:
%%time
c.loc[('Hamlet', 2000),:]

CPU times: user 32 ms, sys: 12 ms, total: 44 ms
Wall time: 44.8 ms




title,year,name,type,character,n
Hamlet,2000,Casey Affleck,actor,Fortinbras,15
Hamlet,2000,Paul Bartel,actor,Osric,14
Hamlet,2000,Paul Ferriter,actor,Special Guest Appearance,23
Hamlet,2000,Larry Fessenden,actor,Kissing Man,24
Hamlet,2000,Karl Geary,actor,Horatio,8
Hamlet,...,...,...,...,...
Hamlet,2000,Anne (II) Nixon,actress,Special Guest Appearance,34
Hamlet,2000,India Reed Kotis,actress,Special Guest Appearance,29
Hamlet,2000,Kelly Sebastian,actress,Secretary,39
Hamlet,2000,Julia Stiles,actress,Ophelia,7
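The tuple passed to `.loc` above supplies one label per index level. The sketch below shows a few related ways to select from a multi-index on a small invented frame (`toy` is hypothetical data, not the real `cast` table).

```python
import pandas as pd

# Toy frame with the same two index levels as `c` (invented rows).
toy = pd.DataFrame({
    "title": ["Hamlet", "Hamlet", "Hamlet", "Macbeth"],
    "year": [1996, 2000, 2000, 2015],
    "name": ["A", "B", "C", "D"],
}).set_index(["title", "year"])

# Full key: a tuple with one label per level, followed by ':' for all columns.
hamlet_2000 = toy.loc[("Hamlet", 2000), :]

# Partial key: only the first level is given, so every year of 'Hamlet' matches.
# The 'title' level is dropped from the result, leaving 'year' as the index.
all_hamlets = toy.loc["Hamlet"]

# Selecting on an inner level alone is possible with xs().
year_2000 = toy.xs(2000, level="year")

print(hamlet_2000)
print(all_hamlets)
print(year_2000)
```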


In [19]:
c2 = c.sort_index()

In [23]:
%%time
c2.loc[('Hamlet', 2000),:]

CPU times: user 8 ms, sys: 0 ns, total: 8 ms
Wall time: 9.64 ms


title,year,name,type,character,n
Hamlet,2000,Casey Affleck,actor,Fortinbras,15
Hamlet,2000,Paul Bartel,actor,Osric,14
Hamlet,2000,Paul Ferriter,actor,Special Guest Appearance,23
Hamlet,2000,Larry Fessenden,actor,Kissing Man,24
Hamlet,2000,Karl Geary,actor,Horatio,8
Hamlet,...,...,...,...,...
Hamlet,2000,Anne (II) Nixon,actress,Special Guest Appearance,34
Hamlet,2000,India Reed Kotis,actress,Special Guest Appearance,29
Hamlet,2000,Kelly Sebastian,actress,Secretary,39
Hamlet,2000,Julia Stiles,actress,Ophelia,7
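In the timings above, the same lookup drops from about 45 ms on `c` to about 10 ms on the sorted `c2`. The likely reason is that pandas can search a sorted index more efficiently than an unsorted one. The following sketch, on invented rows, shows how to check whether an index is sorted and confirms that sorting changes only the row order, not which rows are selected.

```python
import pandas as pd

# Toy frame whose multi-index is deliberately out of order (invented rows).
toy = pd.DataFrame({
    "title": ["Macbeth", "Hamlet", "Othello", "Hamlet", "Hamlet"],
    "year": [2015, 2000, 1995, 1996, 2000],
    "name": ["A", "B", "C", "D", "E"],
}).set_index(["title", "year"])

print(toy.index.is_monotonic_increasing)         # False

# sort_index() returns a new frame ordered by title, then by year.
toy_sorted = toy.sort_index()
print(toy_sorted.index.is_monotonic_increasing)  # True

# The lookup selects the same rows either way.
# (pandas may emit a PerformanceWarning for the unsorted lookup.)
unsorted_hit = toy.loc[("Hamlet", 2000), :]
sorted_hit = toy_sorted.loc[("Hamlet", 2000), :]
print(sorted(unsorted_hit["name"]) == sorted(sorted_hit["name"]))
```

Sorting once after `set_index` is usually worthwhile when many lookups follow, and label slicing on a multi-index generally requires a sorted index.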


In [26]:
c2.reset_index(['title', 'year'])

Unnamed: 0,title,year,name,type,character,n
0,#1 Serial Killer,2013,Michael Alton,actor,Detective Roberts,17
1,#1 Serial Killer,2013,Aaron Aoki,actor,Plastic Bag Victim,21
2,#1 Serial Killer,2013,Zachary (X) Brown,actor,Africian American Teen,18
3,#1 Serial Killer,2013,Yvis Cannavale,actor,Homeless Man,25
4,#1 Serial Killer,2013,Patrick Chien,actor,Cleaver Victim,22
...,...,...,...,...,...,...
3333690,xXx: State of the Union,2005,Deborah S. Smith,actress,Business Woman,
3333691,xXx: State of the Union,2005,Gina St. John,actress,Field Reporter,28
3333692,xXx: State of the Union,2005,Paola (III) Torres,actress,DC Executive,
3333693,xXx: State of the Union,2005,Samantha Tyler,actress,Corvette Girl,38
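`reset_index` is the inverse of `set_index`: it moves index levels back into ordinary columns and replaces them with a default integer index, which is why the output above starts again at 0 but keeps the sorted row order. A minimal sketch on invented rows:

```python
import pandas as pd

# Toy frame (invented rows) indexed and sorted like `c2`.
toy = pd.DataFrame({
    "title": ["Macbeth", "Hamlet"],
    "year": [2015, 2000],
    "name": ["A", "B"],
})
indexed = toy.set_index(["title", "year"]).sort_index()

# With no arguments, every index level becomes a column again.
flat = indexed.reset_index()

# Naming a single level moves only that level; 'title' stays in the index.
partly = indexed.reset_index("year")

print(flat)
print(partly)
```

Passing both level names, as in the cell above, has the same effect here as calling `reset_index()` with no arguments.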
