<a href='http://www.scienceacademy.ca'> <img style="float: left;height:70px" src="Logo_SA.png"></a>

## Hierarchical Indexing
Hierarchical indexing is an important feature of pandas. It makes it possible to have multiple (two or more) index levels on an axis. Somewhat abstractly, it provides a way to work with higher dimensional data in a lower dimensional form. <br>

Let’s start with a simple example for **Series**:

In [1]:
import numpy as np
import pandas as pd
# Create a Series with a list of lists (or arrays) as the index:
index = [['a','a','a','b','b','b','c','c','d','d'], # level 1 index
         [1,2,3,1,2,3,1,2,1,2]]                     # level 2 index
ser = pd.Series(np.random.randn(10),index = index)
ser

a  1    1.930055
   2   -2.085373
   3    0.153083
b  1    0.409857
   2   -1.084681
   3   -0.303109
c  1   -0.028327
   2    0.988131
d  1   -0.483826
   2   -0.719108
dtype: float64
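
To make the "higher dimensional data in a lower dimensional form" idea concrete, here is a minimal sketch. It assumes a copy of the Series called `ser_demo` (a name introduced here for illustration) with fixed values instead of random ones, so the result is reproducible. `unstack()` pivots the inner index level into columns.

```python
import numpy as np
import pandas as pd

index = [['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd'],  # level 1 index
         [1, 2, 3, 1, 2, 3, 1, 2, 1, 2]]                      # level 2 index

# ser_demo is an illustrative stand-in for ser, with values 0.0 .. 9.0
ser_demo = pd.Series(np.arange(10, dtype=float), index=index)

# The index is a MultiIndex with two levels
n_levels = ser_demo.index.nlevels

# Pivot the inner level into columns: rows a..d, columns 1..3
wide = ser_demo.unstack()
print(wide)
```

Combinations that do not exist in the Series, such as `('c', 3)`, show up as `NaN` in the wide table.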

With a hierarchically indexed object, so-called partial indexing is possible: passing only an outer-level label selects every entry under that label, which makes selecting subsets of the data concise.

In [2]:
# Data retrieval  
ser['a']

1    1.930055
2   -2.085373
3    0.153083
dtype: float64

In [3]:
# single value
ser['a'][2]

-2.0853727905916641
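
Beyond selecting by the outer label, `.loc` can also slice the outer level, select by the inner level, or fetch one value in a single lookup. The sketch below assumes the same illustrative `ser_demo` Series with fixed values (not the random `ser` above).

```python
import numpy as np
import pandas as pd

index = [['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd'],
         [1, 2, 3, 1, 2, 3, 1, 2, 1, 2]]
ser_demo = pd.Series(np.arange(10, dtype=float), index=index)

# Label slice on the outer level; both end points are included
outer_slice = ser_demo.loc['b':'c']

# Every entry whose inner label is 2, indexed by the outer label
inner_pick = ser_demo.loc[:, 2]

# A single value in one lookup, using a tuple of (outer, inner) labels
single = ser_demo.loc[('a', 2)]
```

The tuple form avoids chaining two separate selections, as `ser['a'][2]` does.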

**Example with DataFrame:**<br>
With a DataFrame, either axis can have a hierarchical index.<br>

In [4]:
df = pd.DataFrame(np.arange(12).reshape((4, 3)),
                  index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]], 
                  columns=['AB', 'ON', 'BC'])

In [5]:
df

     AB  ON  BC
a 1   0   1   2
  2   3   4   5
b 1   6   7   8
  2   9  10  11


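How do we index the DataFrame above?<br>
* On the column axis, just use normal bracket notation, `df[]`.
* On the row axis, use `df.loc[]`.

Selecting a column returns a Series that keeps the full hierarchical row index, while selecting an outer-level row label returns the sub-DataFrame under that label.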

In [6]:
df['AB']

a  1    0
   2    3
b  1    6
   2    9
Name: AB, dtype: int64

In [7]:
df.loc['a']

   AB  ON  BC
1   0   1   2
2   3   4   5
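
Since either axis can have a hierarchical index, the columns can be grouped in the same way. This is a minimal sketch: the frame `df_cols` and the outer column labels `'East'` and `'West'` are hypothetical, introduced only for illustration. Bracket notation with an outer column label then returns the sub-DataFrame under that label.

```python
import numpy as np
import pandas as pd

# 'East' / 'West' are made-up outer column labels for this sketch
df_cols = pd.DataFrame(
    np.arange(12).reshape((4, 3)),
    index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
    columns=[['East', 'West', 'West'], ['ON', 'AB', 'BC']],
)

# Partial indexing on the column axis: all columns grouped under 'West'
west = df_cols['West']
print(west)
```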


To **grab a single value**, the idea is to **go from the outside in**: select the outer row label, then the inner row label, then the column. For example, to grab `11`:

In [8]:
#df.loc['b']
#df.loc['b'].loc[2]
df.loc['b'].loc[2]['BC']

11
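
The same value can also be reached in a single `.loc` call by passing the row labels as a tuple, followed by the column label. A minimal sketch, assuming a rebuilt copy of the frame named `df_demo` (an illustrative name):

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame(np.arange(12).reshape((4, 3)),
                       index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                       columns=['AB', 'ON', 'BC'])

# (outer row label, inner row label), then the column label
value = df_demo.loc[('b', 2), 'BC']

# The whole row for ('b', 2), returned as a Series
row = df_demo.loc[('b', 2), :]
```

A single lookup like this is usually preferable to chaining several selections, especially when assigning values.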

The hierarchical levels can have names (as strings or any Python objects). If so, these will show up in the console output:

In [9]:
df.index.names

FrozenList([None, None])

Let's name the index levels `L_1` and `L_2`:

In [10]:
df.index.names = ['L_1', 'L_2']

In [11]:
df

         AB  ON  BC
L_1 L_2            
a   1     0   1   2
    2     3   4   5
b   1     6   7   8
    2     9  10  11
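
Once the levels have names, other methods can refer to a level by name instead of by position. The sketch below shows a few such calls on a rebuilt copy of the frame, `df_demo` (an illustrative name).

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame(np.arange(12).reshape((4, 3)),
                       index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                       columns=['AB', 'ON', 'BC'])
df_demo.index.names = ['L_1', 'L_2']

# All labels of one level, looked up by its name
outer_labels = df_demo.index.get_level_values('L_1')

# Turn the index levels into ordinary columns named L_1 and L_2
flat = df_demo.reset_index()

# Swap the order of the two levels, again by name
swapped = df_demo.swaplevel('L_1', 'L_2')
```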


## Good to know!
### <code>xs()</code>
A very useful built-in method for grabbing data from a multilevel index is `xs()`. <br>
It can reach inside a multilevel index and select by a label at any level. <br>

In [12]:
# Returns a cross-section (row(s) or column(s)) from the Series/DataFrame.
df.xs('a')

     AB  ON  BC
L_2            
1     0   1   2
2     3   4   5


If we want to grab all the data in `df` where index level `L_2` is `1`, this is awkward with the `loc` method, but `xs` handles it directly.<br>
For example:<br>
Tell `xs()` which label you want (`1` here) and which level to look in (`L_2` in this case).

In [13]:
df.xs(1, level='L_2')

     AB  ON  BC
L_1            
a     0   1   2
b     6   7   8
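
For comparison, the inner-level selection can also be written with `loc`, although it is more verbose. This sketch assumes a rebuilt copy of the frame, `df_demo` (an illustrative name). It uses `pd.IndexSlice` for the `loc` form and `drop_level=False` so that `xs()` keeps the `L_2` level in its result.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame(np.arange(12).reshape((4, 3)),
                       index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                       columns=['AB', 'ON', 'BC'])
df_demo.index.names = ['L_1', 'L_2']

# xs() drops the selected level by default; drop_level=False keeps it
kept = df_demo.xs(1, level='L_2', drop_level=False)

# The loc version: every L_1 label, L_2 equal to 1, all columns
via_loc = df_demo.loc[pd.IndexSlice[:, 1], :]
```

Both results contain the rows `('a', 1)` and `('b', 1)`; `xs()` is simply the shorter way to say it.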


# Excellent! 
Let's do a quick revision and move on to our next topic!<br> 
Congratulations, you are making great progress. Keep it up!