## Hierarchical Indexing
### Working with pandas
*Curtis Miller*

In this notebook I explore the creation and use of hierarchical indices.

Let's see different ways to create a hierarchical index.

In [1]:
import pandas as pd
from pandas import Series, DataFrame
import numpy as np

In [2]:
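# Directly with the MultiIndex constructor.
# First argument: the unique labels of each level.
# Second argument (called `codes` in current pandas, `labels` in older
# releases): for each row, the position of its label within each level.
midx = pd.MultiIndex([['a', 'b'], ['alpha', 'beta'], [1, 2]],
                     [[0, 0, 0, 0, 1, 1, 1, 1],
                      [0, 0, 1, 1, 0, 0, 1, 1],
                      [0, 1, 0, 1, 0, 1, 0, 1]])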
Series(np.arange(8), index=midx)

a  alpha  1    0
          2    1
   beta   1    2
          2    3
b  alpha  1    4
          2    5
   beta   1    6
          2    7
dtype: int32

In [4]:
# Implicitly, by passing one array of labels per level as the Series index
srs = Series(np.arange(8),
             index=[['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
                    ['alpha', 'alpha', 'beta', 'beta',
                     'alpha', 'alpha', 'beta', 'beta'],
                    [1, 2, 1, 2, 1, 2, 1, 2]])
srs

a  alpha  1    0
          2    1
   beta   1    2
          2    3
b  alpha  1    4
          2    5
   beta   1    6
          2    7
dtype: int32
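
A third way to build the same index is with the `MultiIndex` class-method constructors. The block below is a minimal sketch, not part of the original notebook: it uses `pd.MultiIndex.from_product` to take the Cartesian product of the level labels and `pd.MultiIndex.from_tuples` to build the index from explicit tuples. The names `midx_prod`, `midx_tup`, and `srs_prod` are my own.

```python
import numpy as np
import pandas as pd

# Cartesian product of the three levels; the first level varies slowest,
# so the rows come out in the same order as in the examples above
midx_prod = pd.MultiIndex.from_product([['a', 'b'], ['alpha', 'beta'], [1, 2]])

# The same index built from an explicit list of (level0, level1, level2) tuples
midx_tup = pd.MultiIndex.from_tuples(list(midx_prod))

srs_prod = pd.Series(np.arange(8), index=midx_prod)
```

`from_product` is convenient when every combination of labels occurs, while `from_tuples` handles indices where some combinations are absent.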

Let's first see what slicing a Series with a multi-index looks like.

In [5]:
srs.loc['b']

alpha  1    4
       2    5
beta   1    6
       2    7
dtype: int32

In [6]:
srs.loc['b', 'alpha']    # Bare labels work for a Series; for a DataFrame, pass a tuple

1    4
2    5
dtype: int32

In [7]:
srs.loc['b', 'alpha', 1]

4

In [8]:
srs.loc['a', :, 1]

a  alpha  1    0
   beta   1    2
dtype: int32
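
Selecting on an inner level alone can also be written with the `xs` (cross-section) method. This is a small sketch rather than something from the original notebook; to keep it self-contained it rebuilds the same Series using `pd.MultiIndex.from_product`, and the variable names are my own.

```python
import numpy as np
import pandas as pd

midx = pd.MultiIndex.from_product([['a', 'b'], ['alpha', 'beta'], [1, 2]])
srs = pd.Series(np.arange(8), index=midx)

# Rows whose third level equals 1; by default that level is dropped
third_is_one = srs.xs(1, level=2)

# Same rows, but keeping all three index levels
third_is_one_full = srs.xs(1, level=2, drop_level=False)
```

Unlike `srs.loc['a', :, 1]`, this does not require fixing or slicing the outer levels first.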

Now we look at managing a hierarchical index attached to a `DataFrame`.

In [9]:
df = DataFrame(np.random.randn(8, 3), index=midx,
               columns=['AAA', 'BBB', 'CCC'])
df.loc['b']

              AAA       BBB       CCC
alpha 1 -0.310013 -1.164880  0.479013
      2 -0.458267 -0.483928 -0.001625
beta  1 -0.714172 -0.837795 -0.807873
      2 -0.331394  0.646078  0.133713

In [10]:
df.loc[('b', 'alpha')]    # Must use a tuple here

        AAA       BBB       CCC
1 -0.310013 -1.164880  0.479013
2 -0.458267 -0.483928 -0.001625

In [11]:
df.loc[('b', 'alpha', 1)]

AAA   -0.310013
BBB   -1.164880
CCC    0.479013
Name: (b, alpha, 1), dtype: float64

In [12]:
df.loc[('b', slice(None), 1), :]    # The trailing ", :" (all columns) is required when a level is sliced

                AAA       BBB       CCC
b alpha 1 -0.310013 -1.164880  0.479013
  beta  1 -0.714172 -0.837795 -0.807873

In [13]:
df.loc[(slice(None, 'b'), slice(None), 1), ['AAA', 'BBB']]    # slice(None, 'b') is the tuple form of :'b'

                AAA       BBB
a alpha 1 -0.533517 -0.064212
  beta  1  0.645208 -0.729256
b alpha 1 -0.310013 -1.164880
  beta  1 -0.714172 -0.837795

In [None]:
df.loc[(slice(None), slice(None), 1), 'CCC']
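
The `slice(None)` placeholders get verbose quickly. pandas provides `pd.IndexSlice`, which lets the same selections be written with ordinary `:` syntax. The following is a sketch, not part of the original notebook: it rebuilds a DataFrame of the same shape (with fresh random values, using `pd.MultiIndex.from_product` for the index) so that it runs on its own, and the names `idx`, `via_slice`, `via_idx`, and `upto_b` are my own.

```python
import numpy as np
import pandas as pd

midx = pd.MultiIndex.from_product([['a', 'b'], ['alpha', 'beta'], [1, 2]])
df = pd.DataFrame(np.random.randn(8, 3), index=midx,
                  columns=['AAA', 'BBB', 'CCC'])

idx = pd.IndexSlice

# Explicit slice objects, as in the cell above
via_slice = df.loc[(slice(None), slice(None), 1), 'CCC']

# The same selection written with IndexSlice
via_idx = df.loc[idx[:, :, 1], 'CCC']

# Equivalent of (slice(None, 'b'), slice(None), 1) with a column list
upto_b = df.loc[idx[:'b', :, 1], ['AAA', 'BBB']]
```

Both forms return the same rows; `IndexSlice` only changes how the row indexer is spelled.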