## Hierarchical indexing

Hierarchical indexing, also known as multi-indexing, is a pandas feature that lets a single axis carry multiple index levels (a `MultiIndex`).
This makes it possible to represent higher-dimensional data within the familiar two-dimensional DataFrame and one-dimensional Series structures.

Why use hierarchical indexing?
* Store complex, multi-dimensional data in a more compact form.
* Slice, group, and aggregate by index level in a more intuitive way.
* Reshape data easily for analysis or presentation.

### Creating a MultiIndex

In [1]:
import pandas as pd
import numpy as np

divisions = ['Class A', 'Class A', 'Class B', 'Class B']
student_names = ['Anu', 'Kiran', 'Karthik', 'Aksa']


index = pd.MultiIndex.from_arrays([divisions, student_names], names=['Division', 'Student'])
data = {
    'Mathematics': [85, 90, 78, 88],
    'Science': [92, 84, 79, 91]
}

df = pd.DataFrame(data, index=index)
print("DataFrame with MultiIndex:")
df


DataFrame with MultiIndex:


| Division | Student | Mathematics | Science |
| --- | --- | --- | --- |
| Class A | Anu | 85 | 92 |
| Class A | Kiran | 90 | 84 |
| Class B | Karthik | 78 | 79 |
| Class B | Aksa | 88 | 91 |
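As a minimal sketch of the slicing and aggregating mentioned earlier, the example below rebuilds the same marks table and selects by one or both index levels. The variable names `class_a`, `anu_math`, and `division_means` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Rebuild the marks table so the example is self-contained
index = pd.MultiIndex.from_arrays(
    [['Class A', 'Class A', 'Class B', 'Class B'],
     ['Anu', 'Kiran', 'Karthik', 'Aksa']],
    names=['Division', 'Student'],
)
df = pd.DataFrame(
    {'Mathematics': [85, 90, 78, 88], 'Science': [92, 84, 79, 91]},
    index=index,
)

# Partial indexing: give only the outer level to get every student in a division
class_a = df.loc['Class A']

# Full indexing: a (Division, Student) tuple plus a column label returns one value
anu_math = df.loc[('Class A', 'Anu'), 'Mathematics']

# Aggregate over the outer level: mean marks per division
division_means = df.groupby(level='Division').mean()

print(class_a)
print(anu_math)
print(division_means)
```

Selecting with only the outer label drops that level from the result, so `class_a` is indexed by `Student` alone.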


In [3]:
index = [('California', 2000), ('California', 2010),
         ('New York', 2000), ('New York', 2010),
         ('Texas', 2000), ('Texas', 2010)]
populations = [33871648, 37253956,
               18976457, 19378102,
               20851820, 25145561]
let = pd.Series(populations, index=index)
let

(California, 2000)    33871648
(California, 2010)    37253956
(New York, 2000)      18976457
(New York, 2010)      19378102
(Texas, 2000)         20851820
(Texas, 2010)         25145561
dtype: int64

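In [5]:
# from_tuples creates one index entry per (state, year) tuple;
# from_arrays would instead treat each tuple as a separate level
index = pd.MultiIndex.from_tuples(index)
index

MultiIndex([('California', 2000),
            ('California', 2010),
            (  'New York', 2000),
            (  'New York', 2010),
            (     'Texas', 2000),
            (     'Texas', 2010)],
           )

In [7]:
let = let.reindex(index)
let

California  2000    33871648
            2010    37253956
New York    2000    18976457
            2010    19378102
Texas       2000    20851820
            2010    25145561
dtype: int64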
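A short sketch of two ways to build a state/year `MultiIndex` and of partial selection on the resulting Series. The level names `state` and `year` and the variable names are assumptions made for illustration; the notebook itself leaves the levels unnamed.

```python
import pandas as pd

tuples = [('California', 2000), ('California', 2010),
          ('New York', 2000), ('New York', 2010),
          ('Texas', 2000), ('Texas', 2010)]
populations = [33871648, 37253956,
               18976457, 19378102,
               20851820, 25145561]

# One entry per (state, year) tuple
from_tuples = pd.MultiIndex.from_tuples(tuples, names=['state', 'year'])

# Cartesian product of the two level lists; yields the same six entries
from_product = pd.MultiIndex.from_product(
    [['California', 'New York', 'Texas'], [2000, 2010]],
    names=['state', 'year'],
)
same_index = from_tuples.equals(from_product)

pop = pd.Series(populations, index=from_tuples)

# Select every state for one year by slicing across the outer level
pop_2010 = pop.loc[:, 2010]

# Select every year for one state by giving only the outer label
california = pop.loc['California']

print(same_index)
print(pop_2010)
print(california)
```

`from_product` is convenient when every combination of levels is present, while `from_tuples` also handles incomplete combinations.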

### MultiIndex as extra dimension

In [10]:
let_df = let.unstack()
let_df

|  | 2000 | 2010 |
| --- | --- | --- |
| California | 33871648 | 37253956 |
| New York | 18976457 | 19378102 |
| Texas | 20851820 | 25145561 |


In [12]:
let_df.stack()

California  2000    33871648
            2010    37253956
New York    2000    18976457
            2010    19378102
Texas       2000    20851820
            2010    25145561
dtype: int64
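The relationship between `unstack()` and `stack()` can be sketched as a round trip: with a correctly built state/year `MultiIndex`, unstacking moves the inner level into the columns and stacking moves it back. The names `pop`, `wide`, and `long_again` are illustrative only.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['California', 'New York', 'Texas'], [2000, 2010]]
)
pop = pd.Series([33871648, 37253956,
                 18976457, 19378102,
                 20851820, 25145561], index=index)

# Inner level (year) becomes the columns: one row per state
wide = pop.unstack()

# Columns move back into the index: one row per (state, year) pair
long_again = wide.stack()

print(wide)
print(long_again)
```

Because every state has a value for every year here, no missing values appear in `wide` and the round trip recovers the original six entries.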

In [16]:
# let_df = pd.DataFrame({'total': let,
#                        'under18': [9267089, 9284094,
#                                    4687374, 4318033,
#                                    5906301, 6879014]})
# let_df
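The commented-out cell above appears to add a second column of under-18 counts alongside the totals. A self-contained sketch of that idea follows, using the figures from the cell; the names `total`, `pop_df`, and `fraction_under18` are hypothetical, and the ratio calculation is an added illustration rather than something the notebook performs.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['California', 'New York', 'Texas'], [2000, 2010]]
)
total = pd.Series([33871648, 37253956,
                   18976457, 19378102,
                   20851820, 25145561], index=index)

# The Series supplies the MultiIndex; the list is aligned to it by position
pop_df = pd.DataFrame({
    'total': total,
    'under18': [9267089, 9284094,
                4687374, 4318033,
                5906301, 6879014],
})

# Column arithmetic keeps the MultiIndex, so the result can be unstacked by year
fraction_under18 = (pop_df['under18'] / pop_df['total']).unstack()

print(pop_df)
print(fraction_under18)
```

Each extra column acts as another measured quantity over the same (state, year) pairs, which is how a `MultiIndex` stands in for an additional dimension.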