<div class="alert alert-block alert-success">
    <h1 align="center">Pandas Trick 28</h1>
    <h3 align="center">Stack & Unstack in Pandas</h3>
    <h4 align="center"><a href="https://github.com/AliBinary">Ali Ghanbari</a></h4>
</div>

![image.png](attachment:image.png)

In [None]:
import numpy as np
import pandas as pd

![image.png](attachment:image.png)

In [None]:
df_single_level_cols = pd.DataFrame([[0, 2], [3, 4]],
                                    index=['deer', 'monkey'],
                                    columns=['weight', 'height'])

Stacking a dataframe with a single-level column axis returns a Series:

In [None]:
df_single_level_cols

![image.png](attachment:image.png)

In [None]:
df_single_level_cols.stack()

![image.png](attachment:image.png)
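
As a quick check of the statement above, the following minimal sketch rebuilds the same frame and inspects what `stack()` returns. The variable name `stacked` is introduced here for illustration only.

```python
import pandas as pd

df_single_level_cols = pd.DataFrame([[0, 2], [3, 4]],
                                    index=['deer', 'monkey'],
                                    columns=['weight', 'height'])

# stack() pivots the column labels into a new innermost index level
stacked = df_single_level_cols.stack()

print(type(stacked).__name__)            # class of the stacked result
print(stacked.index.nlevels)             # row labels plus former column labels
print(stacked.loc[('deer', 'height')])   # one value, addressed by (row, column)
```

Each value is now addressed by a `(row label, column label)` pair instead of by a separate row and column.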

**Multi-level columns: simple case**

In [None]:
multicol1 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('weight', 'pounds')])
df_multi_level_cols1 = pd.DataFrame([[3, 4], [4, 5]],
                                    index=['deer', 'monkey'],
                                    columns=multicol1)

Stacking a dataframe with a multi-level column axis moves the innermost column level into the index:

In [None]:
df_multi_level_cols1

![image.png](attachment:image.png)

In [None]:
df_multi_level_cols1.stack()

![image.png](attachment:image.png)
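
To make the shape change explicit, this short sketch repeats the example and prints the remaining column labels and the new row index. It assumes the default `level=-1`, which stacks the innermost column level; `stacked` is an illustrative name.

```python
import pandas as pd

multicol1 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('weight', 'pounds')])
df_multi_level_cols1 = pd.DataFrame([[3, 4], [4, 5]],
                                    index=['deer', 'monkey'],
                                    columns=multicol1)

# Only the inner level ('kg', 'pounds') is moved; the outer level stays as columns
stacked = df_multi_level_cols1.stack()

print(list(stacked.columns))     # remaining column labels
print(stacked.index.tolist())    # (animal, unit) pairs
```

Because one column level is left over, the result is still a DataFrame rather than a Series.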

**Missing values**

In [None]:
multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols2 = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

Stacking a dataframe with multi-level columns commonly produces missing values,<br>
because the stacked dataframe typically has more cells than the original. Cells with no<br>
corresponding value in the original are filled with NaN:

In [None]:
df_multi_level_cols2

![image.png](attachment:image.png)

In [None]:
df_multi_level_cols2.stack()

![image.png](attachment:image.png)
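
The growth in cell count can be measured directly. The sketch below is a minimal check under the same data as above; `stacked` and `n_missing` are names introduced here for illustration.

```python
import pandas as pd

multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols2 = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

stacked = df_multi_level_cols2.stack()

# Count the cells that have no counterpart in the original frame
n_missing = int(stacked.isna().sum().sum())

print(df_multi_level_cols2.size)  # number of cells before stacking
print(stacked.size)               # number of cells after stacking
print(n_missing)                  # cells filled with NaN
```

Combinations such as `('weight', 'm')` never existed in the original columns, so those positions have nothing to hold.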

## Prescribing the level(s) to be stacked
The first parameter, `level`, controls which column level or levels are stacked:

In [None]:
df_multi_level_cols2

In [None]:
df_multi_level_cols2.stack(0)

![image.png](attachment:image.png)

In [None]:
df_multi_level_cols2

In [None]:
df_multi_level_cols2.stack([0, 1])

![image.png](attachment:image.png)
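
The two calls above can be compared side by side. This is a small sketch on the same data; the names `by_outer` and `both` are hypothetical and chosen only for this example.

```python
import pandas as pd

multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols2 = pd.DataFrame([[2.0, 3.0], [4.0, 5.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

# level=0 moves the outer labels ('weight', 'height') into the index,
# leaving the units ('kg', 'm') as columns
by_outer = df_multi_level_cols2.stack(level=0)

# Stacking every column level leaves no columns, so the result is a Series
both = df_multi_level_cols2.stack(level=[0, 1])

print(sorted(by_outer.columns))
print(type(both).__name__)
print(both.loc[('deer', 'weight', 'kg')])
```

Passing a list stacks several levels in one call, and the index gains one level per stacked column level.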

## Dropping missing values

In [None]:
df_multi_level_cols3 = pd.DataFrame([[None, 2.0], [3.0, 4.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

Note that rows where all values are missing are dropped by default, but this behaviour<br>
can be controlled via the `dropna` keyword parameter:

In [None]:
df_multi_level_cols3

![image.png](attachment:image.png)

In [None]:
df_multi_level_cols3.stack(dropna=False)

![image.png](attachment:image.png)

In [None]:
df_multi_level_cols3.stack(dropna=True)

![image.png](attachment:image.png)
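
The `dropna` keyword belongs to the original `stack` implementation. In pandas 2.1 and later, a reworked implementation (opt-in via `future_stack=True`) does not accept `dropna`. The sketch below is therefore an alternative, not the approach used in the cells above: it avoids the keyword and removes all-missing rows explicitly with `DataFrame.dropna(how='all')`, which should give the same rows regardless of the implementation in use. The name `cleaned` is illustrative.

```python
import pandas as pd

multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols3 = pd.DataFrame([[None, 2.0], [3.0, 4.0]],
                                    index=['deer', 'monkey'],
                                    columns=multicol2)

stacked = df_multi_level_cols3.stack()

# Explicitly drop rows in which every value is missing;
# this is a no-op if stack() has already removed them
cleaned = stacked.dropna(how='all')

print(cleaned)
```

The `('deer', 'kg')` row is the only one with no data at all, so it is the only row removed.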

## Unstack the dataframe

![image.png](attachment:image.png)

In [None]:
index = pd.MultiIndex.from_tuples([('one', 'x'), ('one', 'y'),
                                   ('two', 'x'), ('two', 'y')])
s = pd.Series(np.arange(2.0, 6.0), index=index)
s

![image.png](attachment:image.png)

In [None]:
s.unstack(level=-1)

![image.png](attachment:image.png)

In [None]:
s.unstack(level=0)

![image.png](attachment:image.png)
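
The `level` argument decides which index level becomes the columns. The following sketch looks up the same value in both layouts; `inner_as_cols` and `outer_as_cols` are names assumed for this example.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([('one', 'x'), ('one', 'y'),
                                   ('two', 'x'), ('two', 'y')])
s = pd.Series(np.arange(2.0, 6.0), index=index)

# level=-1: the inner labels ('x', 'y') become columns
inner_as_cols = s.unstack(level=-1)

# level=0: the outer labels ('one', 'two') become columns
outer_as_cols = s.unstack(level=0)

# The same entry, s[('one', 'y')], is addressed with swapped row/column labels
print(inner_as_cols.loc['one', 'y'])
print(outer_as_cols.loc['y', 'one'])
```

For a two-level index, the two results are transposes of each other.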

In [None]:
print(s)
df = s.unstack(level=0)
print(df)
df.unstack()
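
In the cell above, `df` has a single-level index, so `df.unstack()` returns a Series whose outer index level comes from the columns. The sketch below checks that this round trip recovers the labels and values of `s`; `roundtrip` is an illustrative name.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([('one', 'x'), ('one', 'y'),
                                   ('two', 'x'), ('two', 'y')])
s = pd.Series(np.arange(2.0, 6.0), index=index)

df = s.unstack(level=0)      # columns: 'one', 'two'; rows: 'x', 'y'
roundtrip = df.unstack()     # columns move back in as the outer index level

# Compare labels and values with the original Series
print(roundtrip.index.tolist() == s.index.tolist())
print(roundtrip.tolist() == s.tolist())
```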