---
title: "Multi-Index DataFrame FINISH LATER"
description: "A MultiIndex object is a way to create a hierarchical structure of columns and/or rows. You can think of a MultiIndex as a list of unique tuples.

Use multi-indexing to create a hierarchy of dimensions for your data. If all columns are tuples of the same size, then they are understood as a multi-index. The same goes for row index labels."
tags: Pandas
URL: https://github.com/ageron/handson-ml
Licence: Apache License 2.0
Creator: 
Meta: ""

---

 <div>
    	<img src="./coco.png" style="float: left;height: 55px">
    	<div style="height: 150px;text-align: center; padding-top:5px">
        <h1>
      	Multi-Index DataFrame FINISH LATER
        </h1>
        <p>A MultiIndex object is a way to create a hierarchical structure of columns and/or rows. You can think of a MultiIndex as a list of unique tuples.</p>
    	</div>
		</div> 

 <div style="height:40px">
		<div style="width:100%; text-align:center; border-bottom: 1px solid #000; line-height:0.1em; margin:40px 0 20px;">
    	<span style="background:#fff; padding:0 10px; font-size:25px; font-family: 'Open Sans', sans-serif;">
        Key Code
    	</span>
		</div>
		</div>
			

In [None]:
import numpy as np  # needed for np.nan in the example below
import pandas as pd

https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html
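
The idea that a MultiIndex behaves like a list of unique tuples can be sketched by building one explicitly with `pd.MultiIndex.from_tuples`. The level names `city` and `name`, the variable names, and the sample values are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Hypothetical row labels: each tuple is one (city, name) pair.
tuples = [("Paris", "alice"), ("Paris", "bob"), ("London", "charles")]
row_index = pd.MultiIndex.from_tuples(tuples, names=["city", "name"])

# A Series indexed by the two-level MultiIndex.
birthyear = pd.Series([1985, 1984, 1992], index=row_index, name="birthyear")

print(row_index.nlevels)    # number of levels in the hierarchy
print(birthyear["Paris"])   # all entries whose outer label is "Paris"
```

Selecting by the outer label alone returns the matching entries, indexed by the remaining inner level.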

 <div style="height:40px">
		<div style="width:100%; text-align:center; border-bottom: 1px solid #000; line-height:0.1em; margin:40px 0 20px;">
    	<span style="background:#fff; padding:0 10px; font-size:25px; font-family: 'Open Sans', sans-serif;">
        Example
    	</span>
		</div>
		</div>
			

If all columns are tuples of the same size, then they are understood as a multi-index. The same goes for row index labels. For example:

In [59]:
df = pd.DataFrame(
    {
        ("public", "birthyear"):
            {("Paris", "alice"): 1985, ("Paris", "bob"): 1984, ("London", "charles"): 1992},
        # No hobby is given for charles, so that cell becomes NaN.
        ("public", "hobby"):
            {("Paris", "alice"): "Biking", ("Paris", "bob"): "Dancing"},
        ("private", "weight"):
            {("Paris", "alice"): 68, ("Paris", "bob"): 83, ("London", "charles"): 112},
        ("private", "children"):
            {("Paris", "alice"): np.nan, ("Paris", "bob"): 3, ("London", "charles"): 0},
    }
)
df

,,private,private,public,public
,,children,weight,birthyear,hobby
London,charles,0.0,112,1992,
Paris,alice,,68,1985,Biking
Paris,bob,3.0,83,1984,Dancing


You can now get a `DataFrame` containing all the `"public"` columns very simply:

In [60]:
df["public"]

,,birthyear,hobby
London,charles,1992,
Paris,alice,1985,Biking
Paris,bob,1984,Dancing


In [61]:
df["public", "hobby"]  # Same result as df["public"]["hobby"]

London  charles        NaN
Paris   alice       Biking
        bob        Dancing
Name: (public, hobby), dtype: object
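
Since the same rules apply to row index labels, rows can be selected by level in much the same way. The sketch below uses a smaller, hypothetical frame named `people` (not the one above) and shows `.loc` with an outer label, `.loc` with full tuples, and `xs` for picking by an inner column level.

```python
import pandas as pd

# A small frame with two-level rows and two-level columns (illustrative data).
rows = pd.MultiIndex.from_tuples(
    [("London", "charles"), ("Paris", "alice"), ("Paris", "bob")]
)
cols = pd.MultiIndex.from_tuples(
    [("private", "weight"), ("public", "birthyear")]
)
people = pd.DataFrame(
    [[112, 1992], [68, 1985], [83, 1984]], index=rows, columns=cols
)

# All rows under the outer label "Paris"; the outer level is dropped.
paris = people.loc["Paris"]

# A single cell, addressed by a full row tuple and a full column tuple.
bob_year = people.loc[("Paris", "bob"), ("public", "birthyear")]

# Columns whose inner (level 1) label is "weight".
weights = people.xs("weight", axis=1, level=1)
```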

## Dropping a level
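Let's look at the full DataFrame again, this time through a copy named `d5`, so that the changes below leave `df` untouched:

In [62]:
d5 = df.copy()
d5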

,,private,private,public,public
,,children,weight,birthyear,hobby
London,charles,0.0,112,1992,
Paris,alice,,68,1985,Biking
Paris,bob,3.0,83,1984,Dancing


There are two levels of columns, and two levels of indices. We can drop a column level by calling `droplevel()` (the same goes for indices):

In [63]:
d5.columns = d5.columns.droplevel(level=0)
d5

,,children,weight,birthyear,hobby
London,charles,0.0,112,1992,
Paris,alice,,68,1985,Biking
Paris,bob,3.0,83,1984,Dancing
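
Dropping a row index level works the same way. A minimal sketch, assuming a hypothetical frame `weights` with level names `city` and `name`: either reassign the index after `Index.droplevel`, or call `DataFrame.droplevel`, which returns a new frame.

```python
import pandas as pd

rows = pd.MultiIndex.from_tuples(
    [("London", "charles"), ("Paris", "alice"), ("Paris", "bob")],
    names=["city", "name"],  # hypothetical level names
)
weights = pd.DataFrame({"weight": [112, 68, 83]}, index=rows)

# Option 1: replace the index with a version that lacks the outer level.
by_name = weights.copy()
by_name.index = by_name.index.droplevel(level=0)

# Option 2: DataFrame.droplevel returns a new frame; the level can be given by name.
also_by_name = weights.droplevel("city", axis=0)
```

The second form leaves `weights` itself unchanged, which is convenient in method chains.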


## Transposing
You can swap columns and indices using the `T` attribute:

In [64]:
d6 = d5.T
d6

,London,Paris,Paris
,charles,alice,bob
children,0.0,,3
weight,112.0,68,83
birthyear,1992.0,1985,1984
hobby,,Biking,Dancing
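
Transposing also works when both axes are multi-indexed: the two MultiIndex objects simply trade places. The frame below is a small illustrative stand-in, not the one used above.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("private", "weight"), ("public", "birthyear")])
rows = pd.MultiIndex.from_tuples([("London", "charles"), ("Paris", "alice")])
frame = pd.DataFrame([[112, 1992], [68, 1985]], index=rows, columns=cols)

flipped = frame.T

# The former column MultiIndex is now the row index, and vice versa.
print(flipped.index.equals(frame.columns))
print(flipped.columns.equals(frame.index))
```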


## Stacking and unstacking levels
Calling the `stack()` method moves the innermost column level into the row index, where it becomes the new innermost index level:

In [65]:
d7 = d6.stack()
d7

,,London,Paris
children,bob,,3
children,charles,0.0,
weight,alice,,68
weight,bob,,83
weight,charles,112.0,
birthyear,alice,,1985
birthyear,bob,,1984
birthyear,charles,1992.0,
hobby,alice,,Biking
hobby,bob,,Dancing


Note that many `NaN` values appeared. This makes sense because many of the new label combinations did not exist before (e.g., there was no `bob` in `London`).

Calling `unstack()` will do the reverse, once again creating many `NaN` values.

In [66]:
d8 = d7.unstack()
d8

,London,London,London,Paris,Paris,Paris
,alice,bob,charles,alice,bob,charles
children,,,0.0,,3,
weight,,,112.0,68,83,
birthyear,,,1992.0,1985,1984,
hobby,,,,Biking,Dancing,
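
To see the round trip without the `NaN` noise, here is a minimal sketch on dense data, where every (city, year) combination exists. The frame `sales` and its level names are illustrative assumptions. Note that how `stack()` treats all-missing rows may differ between pandas versions, so the example avoids missing values altogether.

```python
import pandas as pd

# Dense illustrative data: every (city, year) combination exists, so no NaN appears.
cols = pd.MultiIndex.from_product(
    [["London", "Paris"], [2020, 2021]], names=["city", "year"]
)
sales = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8]], index=["apples", "pears"], columns=cols
)

stacked = sales.stack()       # innermost column level ("year") moves into the rows
restored = stacked.unstack()  # innermost row level ("year") moves back to the columns

print(stacked.shape, restored.shape)
```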


If we call `unstack()` again, no row index level is left to keep, so we end up with a `Series` object:

In [67]:
d9 = d8.unstack()
d9

London  alice    children        None
                 weight           NaN
                 birthyear        NaN
                 hobby            NaN
        bob      children         NaN
                 weight           NaN
                 birthyear        NaN
                 hobby            NaN
        charles  children           0
                 weight           112
                 birthyear       1992
                 hobby           None
Paris   alice    children        None
                 weight            68
                 birthyear       1985
                 hobby         Biking
        bob      children           3
                 weight            83
                 birthyear       1984
                 hobby        Dancing
        charles  children         NaN
                 weight           NaN
                 birthyear        NaN
                 hobby           None
dtype: object

The `stack()` and `unstack()` methods let you select the `level` to stack/unstack. You can even stack/unstack multiple levels at once:

In [68]:
d10 = d9.unstack(level=(0, 1))
d10

,London,London,London,Paris,Paris,Paris
,alice,bob,charles,alice,bob,charles
children,,,0.0,,3,
weight,,,112.0,68,83,
birthyear,,,1992.0,1985,1984,
hobby,,,,Biking,Dancing,
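
Levels can be selected by name as well as by position, which tends to be easier to read. A small sketch, assuming a hypothetical Series `values` whose index levels are named `city`, `name`, and `field`:

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [["London", "Paris"], ["alice", "bob"], ["weight", "birthyear"]],
    names=["city", "name", "field"],  # hypothetical level names
)
values = pd.Series(range(8), index=idx)

# Unstack a single level, chosen by name: "field" becomes the columns.
by_field = values.unstack(level="field")

# Unstack two levels at once: (city, name) pairs become the columns.
wide = values.unstack(level=["city", "name"])

print(by_field.shape, wide.shape)
```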


## Most methods return modified copies
As you may have noticed, the `stack()` and `unstack()` methods do not modify the object they apply to. Instead, they work on a copy and return that copy. This is true of most methods in pandas.
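
A quick way to check this is to call a method and then inspect the original object. In the sketch below (the Series `weights` is illustrative data), `unstack()` returns a new `DataFrame` while the Series it was called on keeps its two-level index.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("Paris", "alice"), ("Paris", "bob"), ("London", "charles")]
)
weights = pd.Series([68, 83, 112], index=idx)

wide = weights.unstack()  # returns a new DataFrame

# The original Series is untouched; keep the returned object to use the result.
print(type(weights).__name__, weights.index.nlevels)
print(type(wide).__name__, wide.shape)
```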