#  Data Science Learning Journey  
*Curiosity to Capability — One Notebook at a Time*

---
Compiled and authored by **Partho Sarothi Das**   
	Dhaka, Bangladesh  
	Bachelor's & Master's in Statistics  
	Investment Banking Professional → Aspiring Data Scientist 
    
---

# MultiIndex

In pandas, a MultiIndex (also called a hierarchical index) lets a Series or DataFrame carry multiple index levels on one axis. It is particularly useful for representing higher-dimensional data in a 2D structure.

### What is a MultiIndex?

A MultiIndex is an index object that holds multiple levels of labels on an axis (rows or columns). Think of it as a way to group rows or columns under multiple keys.
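To make the idea of "levels" concrete, here is a minimal sketch that builds a small two-level index and inspects its structure. The names `index` and `s` and the sample values are illustrative assumptions.

```python
import pandas as pd

# A two-level index: the outer level is a letter, the inner level a number
index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)], names=["letter", "number"]
)
s = pd.Series([10, 20, 30, 40], index=index)

print(index.nlevels)                       # number of levels
print(list(index.names))                   # level names
print(index.get_level_values("letter"))    # labels of one level, row by row
print(s.loc["A"])                          # every row grouped under the key "A"
```

Each row is identified by one label per level, which is what allows rows to be selected by the outer key alone.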

# Creating a MultiIndex

### 1. From tuples:

In [6]:
import pandas as pd

index = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)], names=['letter', 'number'])
df = pd.DataFrame({'value': [10, 20, 30, 40]}, index=index)
df

| letter | number | value |
|---|---|---|
| A | 1 | 10 |
| A | 2 | 20 |
| B | 1 | 30 |
| B | 2 | 40 |


### 2. From arrays:

In [8]:
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('letter', 'number'))
df = pd.DataFrame({'value': [10, 20, 30, 40]}, index=index)
df

| letter | number | value |
|---|---|---|
| A | 1 | 10 |
| A | 2 | 20 |
| B | 1 | 30 |
| B | 2 | 40 |
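The same index can also be built with `pd.MultiIndex.from_product`, which takes only the unique values of each level and generates every combination. The sketch below uses assumed names (`product_index`, `arrays_index`, `df_product`) to show that the two constructors agree for this data.

```python
import pandas as pd

# Cartesian product of the level values: (A,1), (A,2), (B,1), (B,2)
product_index = pd.MultiIndex.from_product(
    [["A", "B"], [1, 2]], names=["letter", "number"]
)

# The equivalent index written out element by element
arrays_index = pd.MultiIndex.from_arrays(
    [["A", "A", "B", "B"], [1, 2, 1, 2]], names=["letter", "number"]
)

df_product = pd.DataFrame({"value": [10, 20, 30, 40]}, index=product_index)
print(product_index.equals(arrays_index))
```

`from_product` is convenient when every combination of level values is present; `from_arrays` or `from_tuples` is the better fit when some combinations are missing.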


### 3. From a DataFrame:

In [10]:
city = pd.DataFrame({
    'city': ['Dinajpur', 'Dinajpur', 'Sylhet', 'Sylhet'],
    'year': [2000, 2001, 2000, 2001],
    'pop': [1.5, 1.7, 2.4, 2.9]
})
city = city.set_index(['city', 'year'])  # Create MultiIndex from columns
city

| city | year | pop |
|---|---|---|
| Dinajpur | 2000 | 1.5 |
| Dinajpur | 2001 | 1.7 |
| Sylhet | 2000 | 2.4 |
| Sylhet | 2001 | 2.9 |
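As a small follow-up sketch, `reset_index()` reverses `set_index()` and turns the index levels back into ordinary columns. The variable names `flat` and `dinajpur` are assumptions added for illustration.

```python
import pandas as pd

city = pd.DataFrame({
    "city": ["Dinajpur", "Dinajpur", "Sylhet", "Sylhet"],
    "year": [2000, 2001, 2000, 2001],
    "pop": [1.5, 1.7, 2.4, 2.9],
}).set_index(["city", "year"])

# Move both index levels back into regular columns
flat = city.reset_index()
print(flat.columns.tolist())

# Select all years for one city using only the outer level
dinajpur = city.loc["Dinajpur"]
print(dinajpur)
```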


# Accessing Data in a MultiIndex DataFrame

### .loc[] with a tuple:

In [13]:
df.loc[('A', 1)]

value    10
Name: (A, 1), dtype: int64
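A full tuple picks a single row. Two related lookups are sketched below under the same sample data: selecting by the outer level only, and taking a cross-section on an inner level with `xs`. The names `rows_a` and `number_one` are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)], names=["letter", "number"]
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=index)

# Outer level only: all rows under "A", indexed by the remaining level
rows_a = df.loc["A"]

# Cross-section on the inner level: rows where number == 1, indexed by letter
number_one = df.xs(1, level="number")

# Tuple plus column label returns a single scalar
single = df.loc[("A", 1), "value"]

print(rows_a)
print(number_one)
print(single)
```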

### Slicing with pd.IndexSlice:

In [15]:
idx = pd.IndexSlice
df.loc[idx['A':'B', 1], :]


| letter | number | value |
|---|---|---|
| A | 1 | 10 |
| B | 1 | 30 |
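`pd.IndexSlice` is shorthand for building tuples of slices. As a sketch on the same sample data, `idx[:, 2]` means "every letter, number 2", which is equivalent to writing `(slice(None), 2)` by hand. The names `with_idx` and `with_slice` are assumptions.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)], names=["letter", "number"]
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=index)

idx = pd.IndexSlice

# All letters, number == 2
with_idx = df.loc[idx[:, 2], :]

# The same selection spelled out with explicit slice objects
with_slice = df.loc[(slice(None), 2), :]

print(with_idx)
print(with_idx.equals(with_slice))
```

The trailing `, :` selects all columns and makes it explicit that the tuple refers to row levels rather than a row/column pair.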


# MultiIndex Columns

In [17]:
columns = pd.MultiIndex.from_product([['Math', 'Science'], ['Score', 'Grade']])
df = pd.DataFrame([[90, 'A', 85, 'B+'], [88, 'B', 89, 'A']], columns=columns)
df

| | Math | Math | Science | Science |
|---|---|---|---|---|
| | Score | Grade | Score | Grade |
| 0 | 90 | A | 85 | B+ |
| 1 | 88 | B | 89 | A |
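Selecting from MultiIndex columns follows the same rules as rows. The sketch below rebuilds the table under the assumed name `grades` and shows three common selections.

```python
import pandas as pd

columns = pd.MultiIndex.from_product([["Math", "Science"], ["Score", "Grade"]])
grades = pd.DataFrame([[90, "A", 85, "B+"], [88, "B", 89, "A"]], columns=columns)

# Outer level only: a DataFrame with the inner columns Score and Grade
math = grades["Math"]

# Full tuple: a single column as a Series
math_score = grades[("Math", "Score")]

# Cross-section on the inner level: the Score column of every subject
scores = grades.xs("Score", axis=1, level=1)

print(math)
print(math_score)
print(scores)
```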


# Practical Uses of MultiIndex

- Time-series with hierarchical structure (e.g., region, year, quarter)

- Panel data (e.g., firm-level data over time)

- Nested grouping in groupby()
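As an illustration of the last point, grouping by more than one key returns a result indexed by a MultiIndex. The `sales` table and its figures below are made-up sample data, not from any real source.

```python
import pandas as pd

# Hypothetical sales records, used only for illustration
sales = pd.DataFrame({
    "region": ["North", "North", "North", "South", "South"],
    "year": [2023, 2023, 2024, 2023, 2024],
    "revenue": [100, 50, 120, 80, 90],
})

# Grouping by two keys yields a Series with a (region, year) MultiIndex
totals = sales.groupby(["region", "year"])["revenue"].sum()
print(totals)

# unstack() pivots the inner level into columns for a wide view
wide = totals.unstack("year")
print(wide)
```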

# Tips for Working with MultiIndex

- Always name your index levels to avoid confusion.

- Call sort_index() before slicing: label slices on an unsorted MultiIndex can raise UnsortedIndexError, and lookups on it can emit a PerformanceWarning.

- Use pd.IndexSlice to filter on several levels at once; it reads more clearly than nested slice(None) tuples.
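The sorting tip can be sketched as follows. The frame `unsorted` is a deliberately shuffled version of the earlier sample data. In the pandas versions this sketch was written against, the label slice fails with `UnsortedIndexError` (a `KeyError` subclass), so the code catches `KeyError` to stay tolerant of version differences.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("B", 2), ("A", 1), ("B", 1), ("A", 2)], names=["letter", "number"]
)
unsorted = pd.DataFrame({"value": [40, 10, 30, 20]}, index=index)
print(unsorted.index.is_monotonic_increasing)  # False: rows are out of order

# Label slicing relies on sorted labels, so this is expected to fail
try:
    unsorted.loc["A":"B"]
    slice_failed = False
except KeyError as exc:
    slice_failed = True
    print(type(exc).__name__, exc)

# Sorting the index makes the same slice well defined
sorted_df = unsorted.sort_index()
print(sorted_df.index.is_monotonic_increasing)  # True
print(sorted_df.loc[("A", 2):("B", 1)])
```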