## How do you handle multi-index DataFrames (MultiIndex)?

## What are multi-index DataFrames?

Multi-index DataFrames, also known as hierarchically indexed DataFrames, are pandas DataFrames that have more than one level of row and/or column index.

They are designed to handle structured data with hierarchical relationships effectively.
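
To make the idea concrete, here is a minimal sketch that builds a two-level row index by hand. The name `toy` and the three rows are only for illustration (the values are copied from the output shown later in this notebook); this is not how the real file is loaded.

```python
import pandas as pd

# Hand-typed toy data: three (date, market) pairs and their prices.
index = pd.MultiIndex.from_tuples(
    [
        ("2012-04-15", "Chennai"),
        ("2012-04-15", "Dindigul"),
        ("2012-07-15", "Chennai"),
    ],
    names=["date", "market"],
)
toy = pd.DataFrame({"price": [18.78, 21.61, 25.77]}, index=index)

print(toy.index.nlevels)  # number of index levels
print(toy.index.names)    # names of the levels
```

Each row is now identified by the pair (date, market) rather than by a single label.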

## Importing pandas library

In [1]:
import pandas as pd

## Reading my file

Here I create a MultiIndex by setting the 'date' and 'market' columns as the index.

In [4]:
df = pd.read_csv('tn_tomato_price.csv', index_col=['date', 'market'])
df.head()

date,market,admin1,admin2,price,usdprice
2012-04-15,Chennai,Tamil Nadu,Chennai,18.78,0.36
2012-04-15,Dindigul,Tamil Nadu,Dindigul,21.61,0.42
2012-04-15,Thiruchirapalli,Tamil Nadu,Tiruchchirappalli,21.17,0.41
2012-07-15,Chennai,Tamil Nadu,Chennai,25.77,0.47
2012-07-15,Dindigul,Tamil Nadu,Dindigul,24.18,0.44
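
If you do not have the CSV file at hand, the same call can be tried on an in-memory stand-in. The sketch below assumes the file has exactly the six columns shown above (the real file may differ), and `csv_text` and `sample` are hypothetical names holding only the five rows from the output.

```python
import io

import pandas as pd

# Stand-in for tn_tomato_price.csv (assumed layout, five rows only).
csv_text = """date,market,admin1,admin2,price,usdprice
2012-04-15,Chennai,Tamil Nadu,Chennai,18.78,0.36
2012-04-15,Dindigul,Tamil Nadu,Dindigul,21.61,0.42
2012-04-15,Thiruchirapalli,Tamil Nadu,Tiruchchirappalli,21.17,0.41
2012-07-15,Chennai,Tamil Nadu,Chennai,25.77,0.47
2012-07-15,Dindigul,Tamil Nadu,Dindigul,24.18,0.44
"""
sample = pd.read_csv(io.StringIO(csv_text), index_col=["date", "market"])

# Inspect the resulting MultiIndex.
print(sample.index.names)
print(sample.index.get_level_values("market").unique())
```

`index_col` accepts a list of column names, and the order of that list becomes the order of the index levels.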


## Accessing Data in MultiIndex DataFrames

## Selecting rows with specific values in the MultiIndex

In [7]:
df_rows = df.loc[('2012-07-15', 'Chennai')]
df_rows

admin1      Tamil Nadu
admin2         Chennai
price            25.77
usdprice          0.47
Name: (2012-07-15, Chennai), dtype: object
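
A full tuple picks out one row, but you can also select on just one level. The following is a small sketch on hand-typed sample rows (the hypothetical `sample` frame holds only the five rows shown earlier, with the two numeric columns): a partial key with `.loc` returns every market for one date, and `.xs` returns one market across all dates.

```python
import pandas as pd

# Five sample rows copied from the output above (numeric columns only).
rows = [
    ("2012-04-15", "Chennai", 18.78, 0.36),
    ("2012-04-15", "Dindigul", 21.61, 0.42),
    ("2012-04-15", "Thiruchirapalli", 21.17, 0.41),
    ("2012-07-15", "Chennai", 25.77, 0.47),
    ("2012-07-15", "Dindigul", 24.18, 0.44),
]
sample = pd.DataFrame(
    rows, columns=["date", "market", "price", "usdprice"]
).set_index(["date", "market"])

# Partial key on the outer level: all markets for one date.
# The 'date' level is dropped from the result.
one_date = sample.loc["2012-07-15"]

# Cross-section on the inner level: one market across all dates.
chennai = sample.xs("Chennai", level="market")

print(one_date)
print(chennai)
```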

## Selecting specific columns

In [8]:
# Use a list of labels; a tuple is read as one multi-level key when the columns are a MultiIndex
df_col = df.loc[:, ['price', 'usdprice']]
df_col

date,market,price,usdprice
2012-04-15,Chennai,18.78,0.36
2012-04-15,Dindigul,21.61,0.42
2012-04-15,Thiruchirapalli,21.17,0.41
2012-07-15,Chennai,25.77,0.47
2012-07-15,Dindigul,24.18,0.44
...,...,...,...
2021-07-15,Dindigul,19.71,0.26
2021-07-15,Ramanathapuram,23.92,0.32
2021-07-15,Thiruchirapalli,19.74,0.26
2021-07-15,Tirunelveli,20.30,0.27
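
Column selection keeps the row MultiIndex untouched. As a rough sketch on the same kind of hand-typed sample rows (hypothetical `sample` frame, not the full dataset), this shows the difference between a single label and a list, and how a row key and a column label combine to give one cell.

```python
import pandas as pd

# Five sample rows copied from the output above (numeric columns only).
rows = [
    ("2012-04-15", "Chennai", 18.78, 0.36),
    ("2012-04-15", "Dindigul", 21.61, 0.42),
    ("2012-04-15", "Thiruchirapalli", 21.17, 0.41),
    ("2012-07-15", "Chennai", 25.77, 0.47),
    ("2012-07-15", "Dindigul", 24.18, 0.44),
]
sample = pd.DataFrame(
    rows, columns=["date", "market", "price", "usdprice"]
).set_index(["date", "market"])

price_series = sample["price"]   # single label: a Series
price_frame = sample[["price"]]  # list with one label: a DataFrame

# Row tuple plus column label: a single value.
cell = sample.loc[("2012-07-15", "Chennai"), "price"]
print(cell)
```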


## Reshaping MultiIndex DataFrames

In [9]:
# Unstacking to create a wide-format DataFrame
wide_df = df.unstack()
wide_df

,admin1,admin1,admin1,admin1,admin1,admin1,admin1,admin1,admin1,admin2,...,price,usdprice,usdprice,usdprice,usdprice,usdprice,usdprice,usdprice,usdprice,usdprice
market,Chennai,Coimbatore,Cuddalore,Dharmapuri,Dindigul,Ramanathapuram,Thiruchirapalli,Tirunelveli,Vellore,Chennai,...,Vellore,Chennai,Coimbatore,Cuddalore,Dharmapuri,Dindigul,Ramanathapuram,Thiruchirapalli,Tirunelveli,Vellore
date,,,,,,,,,,,,,,,,,,,,,
2012-04-15,Tamil Nadu,,,,Tamil Nadu,,Tamil Nadu,,,Chennai,...,,0.36,,,,0.42,,0.41,,
2012-07-15,Tamil Nadu,,,,Tamil Nadu,,Tamil Nadu,,,Chennai,...,,0.47,,,,0.44,,0.45,,
2012-08-15,Tamil Nadu,,,,Tamil Nadu,,Tamil Nadu,,,Chennai,...,,0.29,,,,0.22,,0.21,,
2012-09-15,Tamil Nadu,,,,Tamil Nadu,,Tamil Nadu,,,Chennai,...,,0.28,,,,0.22,,0.23,,
2012-10-15,Tamil Nadu,,,,Tamil Nadu,,Tamil Nadu,,,Chennai,...,,0.24,,,,0.19,,0.15,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2021-03-15,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Chennai,...,15.08,0.30,0.23,0.21,0.15,0.24,0.33,0.21,0.21,0.21
2021-04-15,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Chennai,...,10.00,0.25,0.14,0.17,0.12,0.17,0.23,0.16,0.15,0.13
2021-05-15,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,,Tamil Nadu,Tamil Nadu,Tamil Nadu,Chennai,...,9.69,0.19,0.17,0.18,0.13,0.15,,0.17,0.19,0.13
2021-06-15,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Tamil Nadu,Chennai,...,12.37,0.16,0.18,0.25,0.22,0.22,0.27,0.22,0.24,0.17
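
After `unstack()`, the columns themselves form a MultiIndex (original column name on top, market underneath), and combinations that do not exist in the data are filled with NaN. Here is a small sketch on hand-typed sample rows; the `sample` frame is hypothetical and holds only the five rows from `df.head()`, so it has more gaps than the real data.

```python
import pandas as pd

# Five sample rows copied from the head of the data (numeric columns only).
rows = [
    ("2012-04-15", "Chennai", 18.78, 0.36),
    ("2012-04-15", "Dindigul", 21.61, 0.42),
    ("2012-04-15", "Thiruchirapalli", 21.17, 0.41),
    ("2012-07-15", "Chennai", 25.77, 0.47),
    ("2012-07-15", "Dindigul", 24.18, 0.44),
]
sample = pd.DataFrame(
    rows, columns=["date", "market", "price", "usdprice"]
).set_index(["date", "market"])

# By default the innermost index level ('market') moves to the columns.
wide = sample.unstack()

# Selecting the top column level gives one column per market.
price_by_market = wide["price"]

# A different level can be moved by name.
by_date = sample.unstack(level="date")

print(price_by_market)
```

In this sample, Thiruchirapalli has no row for 2012-07-15, so that cell of `price_by_market` is NaN.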


In [11]:
# Stacking the columns into the index to create a long-format Series
long_df = df.stack()
long_df

date        market               
2012-04-15  Chennai      admin1      Tamil Nadu
                         admin2         Chennai
                         price            18.78
                         usdprice          0.36
            Dindigul     admin1      Tamil Nadu
                                        ...    
2021-07-15  Tirunelveli  usdprice          0.27
            Vellore      admin1      Tamil Nadu
                         admin2         Vellore
                         price            17.74
                         usdprice          0.24
Length: 1448, dtype: object
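
The result above has dtype `object` because text and numeric columns end up in the same Series. A minimal sketch on hand-typed, numeric-only sample rows (hypothetical `sample` frame) shows that stacking adds a third index level and that `unstack()` brings the columns back.

```python
import pandas as pd

# Five sample rows copied from the head of the data (numeric columns only).
rows = [
    ("2012-04-15", "Chennai", 18.78, 0.36),
    ("2012-04-15", "Dindigul", 21.61, 0.42),
    ("2012-04-15", "Thiruchirapalli", 21.17, 0.41),
    ("2012-07-15", "Chennai", 25.77, 0.47),
    ("2012-07-15", "Dindigul", 24.18, 0.44),
]
sample = pd.DataFrame(
    rows, columns=["date", "market", "price", "usdprice"]
).set_index(["date", "market"])

# Column labels become the innermost index level: (date, market, column).
stacked = sample.stack()

# Unstacking that innermost level restores one column per original label.
roundtrip = stacked.unstack()

print(stacked.index.nlevels)
print(stacked.loc[("2012-04-15", "Chennai", "price")])
```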

In [10]:
# Resetting the index to regular columns
df_reset = df.reset_index()
df_reset

,date,market,admin1,admin2,price,usdprice
0,2012-04-15,Chennai,Tamil Nadu,Chennai,18.78,0.36
1,2012-04-15,Dindigul,Tamil Nadu,Dindigul,21.61,0.42
2,2012-04-15,Thiruchirapalli,Tamil Nadu,Tiruchchirappalli,21.17,0.41
3,2012-07-15,Chennai,Tamil Nadu,Chennai,25.77,0.47
4,2012-07-15,Dindigul,Tamil Nadu,Dindigul,24.18,0.44
...,...,...,...,...,...,...
357,2021-07-15,Dindigul,Tamil Nadu,Dindigul,19.71,0.26
358,2021-07-15,Ramanathapuram,Tamil Nadu,Ramanathapuram,23.92,0.32
359,2021-07-15,Thiruchirapalli,Tamil Nadu,Tiruchchirappalli,19.74,0.26
360,2021-07-15,Tirunelveli,Tamil Nadu,Tirunelveli Kattabo,20.30,0.27
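
`reset_index()` can also move just one level back to a column, and `set_index()` reverses the whole operation. This is a small sketch on hand-typed sample rows; the names `sample`, `flat`, `only_market`, and `restored` are only for illustration.

```python
import pandas as pd

# Five sample rows copied from the head of the data (numeric columns only).
rows = [
    ("2012-04-15", "Chennai", 18.78, 0.36),
    ("2012-04-15", "Dindigul", 21.61, 0.42),
    ("2012-04-15", "Thiruchirapalli", 21.17, 0.41),
    ("2012-07-15", "Chennai", 25.77, 0.47),
    ("2012-07-15", "Dindigul", 24.18, 0.44),
]
sample = pd.DataFrame(
    rows, columns=["date", "market", "price", "usdprice"]
).set_index(["date", "market"])

# Move both index levels back to ordinary columns.
flat = sample.reset_index()

# Move only the 'market' level; 'date' stays as the index.
only_market = sample.reset_index(level="market")

# Rebuild the MultiIndex from the flat frame.
restored = flat.set_index(["date", "market"])

print(flat.columns.tolist())
print(only_market.index.name)
```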


These are a few of the things we can do with a MultiIndex. There is a lot more, which I will discuss in the future.