---
title: "Working with MultiIndex DataFrames"
author: "Mohammed Adil Siraju"
date: "2025-09-23"
categories: [pandas, dataframe, multi-index]
description: "Guide to creating, inspecting, and working with MultiIndex DataFrames in pandas, including integration with NumPy."
---
# Working with MultiIndex DataFrames

This notebook covers MultiIndex DataFrames in pandas: creation, inspection, and combining with NumPy arrays. MultiIndex allows hierarchical indexing for complex data structures.

## Introduction to MultiIndex

A MultiIndex gives a DataFrame two or more index levels on its rows or columns. This suits hierarchical data, such as a time series recorded for several categories.

In [9]:
import pandas as pd

## Creating a MultiIndex DataFrame

Use `pd.MultiIndex.from_arrays()` to create a MultiIndex from arrays. Here, we create a DataFrame with a two-level row index.

In [10]:
# One list per level: position i of each list labels row i
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('First', 'Second'))
df = pd.DataFrame({'Data': [10, 20, 30, 40]}, index=index)

df

              Data
First Second
A     1         10
      2         20
B     1         30
      2         40
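In this example every combination of the two label sets occurs, so the same index can probably be built more compactly. The following is a minimal sketch using `pd.MultiIndex.from_product()`; the variable name `index_product` is chosen here for illustration and is not part of the original notebook.

```python
import pandas as pd

# Index built as in the notebook, from parallel arrays
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('First', 'Second'))

# Same index built from the Cartesian product of the unique labels
# (index_product is an illustrative name, not from the original notebook)
index_product = pd.MultiIndex.from_product(
    [['A', 'B'], [1, 2]], names=('First', 'Second')
)

same = index_product.equals(index)  # compare labels and order
```

`from_product()` only fits when all combinations are present; for irregular data, `from_arrays()` as used above remains the more general choice.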


## Inspecting the MultiIndex

`df.index` returns the MultiIndex itself. Its repr lists one tuple of labels per row, followed by the level names.

In [11]:
df.index

MultiIndex([('A', 1),
            ('A', 2),
            ('B', 1),
            ('B', 2)],
           names=['First', 'Second'])
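Beyond the repr, a few attributes expose individual pieces of the hierarchy. The sketch below rebuilds the same DataFrame so it runs on its own; the result variable names are illustrative.

```python
import pandas as pd

# Rebuild the two-level DataFrame from the section above
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('First', 'Second'))
df = pd.DataFrame({'Data': [10, 20, 30, 40]}, index=index)

n_levels = df.index.nlevels              # how many index levels there are
level_names = list(df.index.names)       # names given at construction
first_unique = list(df.index.levels[0])  # unique labels of the outer level

# One label per row for a single level, looked up by level name
second_values = list(df.index.get_level_values('Second'))
```

`levels` holds the unique labels of each level, while `get_level_values()` returns the labels row by row, so the two differ in length whenever labels repeat.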

## Combining pandas with NumPy

pandas is built on top of NumPy, so the two work together directly: you can create DataFrames from NumPy arrays and apply NumPy functions to DataFrame data.

## Creating DataFrames from NumPy Arrays

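Pass a 2-D NumPy array to `pd.DataFrame()`; each row of the array becomes a row of the DataFrame. Supply `columns=` to name the columns, since pandas otherwise labels them with integers starting at 0.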

In [12]:
import numpy as np

In [13]:
array_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df_np = pd.DataFrame(array_data, columns=['Sales A', 'Sales B', 'Sales C'])

df_np

   Sales A  Sales B  Sales C
0        1        2        3
1        4        5        6
2        7        8        9
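This section also mentions using NumPy functions on DataFrame data. A minimal sketch of the round trip follows, assuming the same `df_np` as above; `squared`, `values`, and `column_totals` are illustrative names.

```python
import numpy as np
import pandas as pd

array_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df_np = pd.DataFrame(array_data, columns=['Sales A', 'Sales B', 'Sales C'])

# An element-wise NumPy function applied to a DataFrame
# returns a DataFrame with the same row and column labels
squared = np.square(df_np)

# to_numpy() hands the underlying values back as a plain ndarray
values = df_np.to_numpy()
column_totals = values.sum(axis=0)  # sum down each column
```

Keeping the result as a DataFrame preserves the labels; converting with `to_numpy()` drops them, which is usually only worthwhile when a downstream function expects a bare array.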


## Best Practices

- Use meaningful names for MultiIndex levels (e.g., 'Category', 'Subcategory').
- Select rows with `.loc[]` and a tuple of labels, one per level (e.g., `df.loc[('A', 1)]`).
- Reset index with `df.reset_index()` if you need to flatten the hierarchy.
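
The selection and flattening tips above can be sketched on the DataFrame from earlier in the notebook. The variable names `value`, `group_a`, and `flat` are illustrative.

```python
import pandas as pd

arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('First', 'Second'))
df = pd.DataFrame({'Data': [10, 20, 30, 40]}, index=index)

# A full tuple (one label per level) plus a column label selects one cell
value = df.loc[('B', 1), 'Data']

# A partial key selects every row under that outer label;
# the result is indexed by the remaining level, 'Second'
group_a = df.loc['A']

# reset_index() moves both levels into ordinary columns
flat = df.reset_index()
```

After `reset_index()` the frame has a default integer index and the columns `First`, `Second`, and `Data`, which is often easier to export or merge.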

## Summary

This notebook demonstrated creating and inspecting MultiIndex DataFrames and integrating pandas with NumPy arrays. MultiIndex is powerful for complex data but can be tricky, so practice with real datasets!