<div class="alert alert-block alert-success">
    <h1 align="center">Pandas Trick 28</h1>
    <h3 align="center">Stack & Unstack in Pandas</h3>
    <h4 align="center"><a href="http://www.iran-machinelearning.ir">Soheil Tehranipour</a></h4>
</div>

<img src = "https://www.datasciencemadesimple.com/wp-content/uploads/2017/11/stack-in-pandas-python.png?ezimgfmt=ng%3Awebp%2Fngcb1%2Frs%3Adevice%2Frscb1-1">

In [64]:
import numpy as np
import pandas as pd

<img src = "https://pandas.pydata.org/docs/_images/reshaping_stack.png">

In [65]:
df_single_level_cols = pd.DataFrame([[0, 2], [3, 4]],
                                    index=['deer', 'monkey'],
                                    columns=['weight', 'height'])

Stacking a DataFrame with a single-level column axis returns a Series:

In [66]:
df_single_level_cols

        weight  height
deer         0       2
monkey       3       4


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack.png" alt="Pandas: DataFrame - Stack Single level column." title="Title text" height="" width="" />

In [67]:
df_single_level_cols.stack()

deer    weight    0
        height    2
monkey  weight    3
        height    4
dtype: int64

<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-1.png" alt="Pandas: DataFrame - Stacking a dataframe with a single level column axis returns a Series." title="Title text" height="" width="" />
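
To make the change of shape concrete, the following minimal sketch rebuilds the same frame and inspects what stack() returns. The names `stacked` and `restored` are illustrative only and do not appear in the original notebook.

```python
import pandas as pd

df = pd.DataFrame([[0, 2], [3, 4]],
                  index=['deer', 'monkey'],
                  columns=['weight', 'height'])

# Stacking moves the column labels into a new innermost index level,
# so the 2x2 frame becomes a Series with four entries.
stacked = df.stack()
print(type(stacked).__name__)   # Series
print(stacked.index.nlevels)    # 2

# Each value is addressed by (row label, former column label).
print(stacked.loc[('monkey', 'height')])  # 4

# unstack() moves the innermost index level back to the columns.
restored = stacked.unstack()
print(restored.shape)           # (2, 2)
```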

*Multi-level columns: simple case*

In [68]:
multicol1 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('weight', 'pounds')])
df_multi_level_cols1 = pd.DataFrame([[3, 4], [4, 5]],
                                    index=['deer', 'monkey'],
                                    columns=multicol1)

Stacking a DataFrame with a multi-level column axis moves the innermost column level into the row index:

In [69]:
df_multi_level_cols1

        weight
            kg  pounds
deer         3       4
monkey       4       5


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-2.png" alt="Pandas: DataFrame - Stacking a dataframe with a multi-level column axis." title="Title text" height="" width="" />

In [70]:
df_multi_level_cols1.stack()


               weight
deer   kg           3
       pounds       4
monkey kg           4
       pounds       5


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-3.png" alt="Pandas: DataFrame - Stacking a dataframe with a multi-level column axis." title="Title text" height="" width="" />

**Missing values**

In [71]:
multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols2 = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

It is common to get missing values when stacking a DataFrame with multi-level columns,<br>
because the stacked result typically has more cells than the original has values. The empty cells<br>
are filled with NaN:

In [72]:
df_multi_level_cols2

        weight  height
            kg       m
deer       2.0     3.0
monkey     4.0     5.0


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-4.png" alt="Pandas: DataFrame - Missing values stacking a dataframe with multi-level column." title="Title text" height="" width="" />

In [73]:
df_multi_level_cols2.stack()


           weight  height
deer   kg     2.0     NaN
       m      NaN     3.0
monkey kg     4.0     NaN
       m      NaN     5.0


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-5.png" alt="Pandas: DataFrame - Missing values are filled with NaNs." title="Title text" height="" width="" />
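
The NaN cells appear because not every combination of the two column levels exists in the original frame. The sketch below, using illustrative variable names that are not part of the notebook, counts the values before and after stacking to show that no data is created or lost.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples([('weight', 'kg'), ('height', 'm')])
df = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                  index=['deer', 'monkey'], columns=columns)

stacked = df.stack()

# Count real values before and after; stacking only rearranges them.
n_original = int(df.notna().sum().sum())
n_present = int(stacked.notna().sum().sum())
n_missing = int(stacked.isna().sum().sum())
print(n_original, n_present, n_missing)

# ('deer', 'kg') has a weight but no height, because the original
# frame has no ('height', 'kg') column.
print(stacked.loc[('deer', 'kg')])
```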

## Specifying the level(s) to stack
The first parameter, level, controls which column level or levels are stacked:

In [74]:
df_multi_level_cols2.stack(0)


                kg    m
deer   height  NaN  3.0
       weight  2.0  NaN
monkey height  NaN  5.0
       weight  4.0  NaN


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-6.png" alt="Pandas: DataFrame - The first parameter controls which level or levels are stacked." title="Title text" height="" width="" />

In [75]:
df_multi_level_cols2.stack([0, 1])


deer    height  m     3.0
        weight  kg    2.0
monkey  height  m     5.0
        weight  kg    4.0
dtype: float64

<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-7.png" alt="Pandas: DataFrame - The first parameter controls which level or levels are stacked." title="Title text" height="" width="" />
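
Levels can also be selected by name rather than by position, which tends to be easier to read. In the sketch below the level names `measure` and `unit` are hypothetical labels added for illustration; the original notebook leaves the column levels unnamed.

```python
import pandas as pd

# 'measure' and 'unit' are made-up level names for this example.
columns = pd.MultiIndex.from_tuples([('weight', 'kg'), ('height', 'm')],
                                    names=['measure', 'unit'])
df = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                  index=['deer', 'monkey'], columns=columns)

# Selects the same level as stack(0), but does not depend on its position.
by_measure = df.stack('measure')
print(list(by_measure.index.names))              # [None, 'measure']
print(sorted(by_measure.columns))                # ['kg', 'm']
print(by_measure.loc[('deer', 'weight'), 'kg'])  # 2.0
```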

## Dropping missing values

In [76]:
df_multi_level_cols3 = pd.DataFrame([[None, 2.0], [3.0, 4.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

Note that rows where all values are missing are dropped by default, but this behaviour<br>
can be controlled via the dropna keyword parameter:

In [77]:
df_multi_level_cols3

        weight  height
            kg       m
deer       NaN     2.0
monkey     3.0     4.0


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-8.png" alt="Pandas: DataFrame - rows where all values are missing are dropped by default." title="Title text" height="" width="" />

In [78]:
df_multi_level_cols3.stack(dropna=False)


           weight  height
deer   kg     NaN     NaN
       m      NaN     2.0
monkey kg     3.0     NaN
       m      NaN     4.0


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-9.png" alt="Pandas: DataFrame - dropna=false value." title="Title text" height="" width="" />

In [79]:
df_multi_level_cols3.stack(dropna=True)


           weight  height
deer   m      NaN     2.0
monkey kg     3.0     NaN
       m      NaN     4.0


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-stack-10.png" alt="Pandas: DataFrame - dropna=true value." title="Title text" height="" width="" />
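
How stack() treats the dropna keyword may differ between pandas releases, so it is worth checking against the installed version. As a sketch that does not rely on that keyword, the same cleanup can be done by stacking with the defaults and then calling DataFrame.dropna(how='all'). The name `cleaned` is illustrative.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_tuples([('weight', 'kg'), ('height', 'm')])
df = pd.DataFrame([[np.nan, 2.0], [3.0, 4.0]],
                  index=['deer', 'monkey'], columns=columns)

# Stack with defaults, then drop only the rows in which every value
# is missing. Partially filled rows are left untouched.
cleaned = df.stack().dropna(how='all')
print(len(cleaned))                      # 3
print(('deer', 'kg') in cleaned.index)   # False
print(('deer', 'm') in cleaned.index)    # True
```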

## Unstack the dataframe

<img src="https://pandas.pydata.org/docs/_images/reshaping_unstack.png">

In [80]:
index = pd.MultiIndex.from_tuples([('one', 'x'), ('one', 'y'),
                                   ('two', 'x'), ('two', 'y')])
s = pd.Series(np.arange(2.0, 6.0), index=index)
s

one  x    2.0
     y    3.0
two  x    4.0
     y    5.0
dtype: float64

<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-unstack.svg" alt="Pandas: Dataframe - unstack." title="Title text" />

In [81]:
s.unstack(level=-1)

       x    y
one  2.0  3.0
two  4.0  5.0


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-unstack-1.svg" alt="Pandas: Dataframe - unstack level=-1." title="Title text" />

In [82]:
s.unstack(level=0)

   one  two
x  2.0  4.0
y  3.0  5.0


<img src="https://www.w3resource.com/w3r_images/pandas-dataframe-unstack-2.svg" alt="Pandas: Dataframe - unstack level=0." title="Title text" />
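
For a Series with a two-level index, the level argument simply decides which level becomes the columns, so the two results above hold the same values in transposed layouts. A small sketch, with the illustrative names `inner` and `outer`:

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([('one', 'x'), ('one', 'y'),
                                   ('two', 'x'), ('two', 'y')])
s = pd.Series(np.arange(2.0, 6.0), index=index)

inner = s.unstack(level=-1)   # inner level ('x', 'y') becomes the columns
outer = s.unstack(level=0)    # outer level ('one', 'two') becomes the columns

print(list(inner.columns))    # ['x', 'y']
print(list(outer.columns))    # ['one', 'two']

# The same value, reached through each layout.
print(inner.loc['two', 'x'], outer.loc['x', 'two'])  # 4.0 4.0
```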

In [83]:
df = s.unstack(level=0)
df.unstack()

one  x    2.0
     y    3.0
two  x    4.0
     y    5.0
dtype: float64
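
Calling unstack() on a DataFrame whose index has a single level returns a Series, with the column labels as the outer index level. That is why the result above matches the original Series. The sketch below checks this and contrasts it with stack(), which places the column labels in the inner level instead. The names `roundtrip` and `stacked` are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([('one', 'x'), ('one', 'y'),
                                   ('two', 'x'), ('two', 'y')])
s = pd.Series(np.arange(2.0, 6.0), index=index)

df = s.unstack(level=0)       # rows: x, y; columns: one, two

# unstack(): column labels become the OUTER index level.
roundtrip = df.unstack()
print(roundtrip.to_dict() == s.to_dict())   # True

# stack(): column labels become the INNER index level.
stacked = df.stack()
print(stacked.loc[('x', 'two')])            # 4.0
```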

<img src="https://webna.ir/wp-content/uploads/2018/08/%D9%85%DA%A9%D8%AA%D8%A8-%D8%AE%D9%88%D9%86%D9%87.png" width=50% />