# Reshaping in Pandas with stack() and unstack()

This is a Notebook for the Medium article [Creating a dual-axis Combo Chart in Python](https://bindichen.medium.com/creating-a-dual-axis-combo-chart-in-python-52624b187834).

Please check out the article for instructions.

**License**: [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause)

* https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.stack.html
* https://medium.com/swlh/reshaping-in-pandas-with-stack-and-unstack-functions-bb169f64467d
* https://towardsdatascience.com/wide-to-long-data-how-and-when-to-use-pandas-melt-stack-and-wide-to-long-7c1e0f462a98

#### Versions of packages used in this Notebook

In [2]:
import numpy as np
import pandas as pd

# Make sure your installed versions are at least the ones printed below
print('numpy: ', np.__version__)
print('pandas: ', pd.__version__)

numpy:  1.18.1
pandas:  1.1.4


# Tutorial

In [23]:
import pandas as pd
import numpy as np

## 1. Single level

In [5]:
df_single_level_cols = pd.DataFrame([['Mostly cloudy', 10], ['Sunny', 12]],
                                    index=['London', 'Oxford'],
                                    columns=['Weather', 'Wind'])

df_single_level_cols

              Weather  Wind
London  Mostly cloudy    10
Oxford          Sunny    12


In [6]:
df_single_level_cols.stack()

London  Weather    Mostly cloudy
        Wind                  10
Oxford  Weather            Sunny
        Wind                  12
dtype: object
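
As a quick sanity check, the stacked Series can be pivoted back into the wide layout with `unstack()`. The following is a minimal sketch that rebuilds the frame from this section; the names `stacked` and `restored` are introduced here for illustration only. The values and labels come back unchanged, though the column dtypes may differ from the original because the stacked Series holds strings and integers together.

```python
import pandas as pd

df_single_level_cols = pd.DataFrame([['Mostly cloudy', 10], ['Sunny', 12]],
                                    index=['London', 'Oxford'],
                                    columns=['Weather', 'Wind'])

# stack() moves the column labels into an inner index level
stacked = df_single_level_cols.stack()

# unstack() pivots that inner level back into columns
restored = stacked.unstack()

print(restored)
```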

## 2. Multi level columns: simple case

In [10]:
multicol1 = pd.MultiIndex.from_tuples([('Wind', 'mph'),
                                       ('Wind', 'm/s')])

In [12]:
df_multi_level_cols1 = pd.DataFrame([[13, 5.5], [19, 8.5]],
                                    index=['London', 'Oxford'],
                                    columns=multicol1)

df_multi_level_cols1

       Wind
        mph  m/s
London   13  5.5
Oxford   19  8.5


In [13]:
df_multi_level_cols1.stack()

            Wind
London m/s   5.5
       mph  13.0
Oxford m/s   8.5
       mph  19.0


## 3. Missing values

In [21]:
multicol2 = pd.MultiIndex.from_tuples([('Wind', 'mph'),
                                       ('Temperature', '°C')])

In [24]:
df_multi_level_cols2 = pd.DataFrame([[13, 8], [19, 6]],
                                    index=['London', 'Oxford'],
                                    columns=multicol2)

In [25]:
df_multi_level_cols2

       Wind Temperature
        mph          °C
London   13           8
Oxford   19           6


In [26]:
df_multi_level_cols2.stack()

            Temperature  Wind
London mph          NaN  13.0
       °C           8.0   NaN
Oxford mph          NaN  19.0
       °C           6.0   NaN


## Prescribing the level(s) to be stacked

In [27]:
df_multi_level_cols2.stack(0)

                     mph   °C
London Temperature   NaN  8.0
       Wind         13.0  NaN
Oxford Temperature   NaN  6.0
       Wind         19.0  NaN


In [28]:
df_multi_level_cols2.stack([0, 1])

London  Temperature  °C      8.0
        Wind         mph    13.0
Oxford  Temperature  °C      6.0
        Wind         mph    19.0
dtype: float64
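
Levels can also be selected by name when the column `MultiIndex` has names. The sketch below assumes two illustrative level names, `measure` and `unit`, which are not part of the original example, and compares stacking by position with stacking by name.

```python
import pandas as pd

# Same columns as multicol2, but with level names (the names are illustrative)
named_cols = pd.MultiIndex.from_tuples([('Wind', 'mph'), ('Temperature', '°C')],
                                       names=['measure', 'unit'])
df_named = pd.DataFrame([[13, 8], [19, 6]],
                        index=['London', 'Oxford'],
                        columns=named_cols)

by_position = df_named.stack(0)      # outermost column level, by position
by_name = df_named.stack('measure')  # the same level, selected by name

# Both calls select the same level, so the two results can be compared directly
print(by_name.equals(by_position))
```

Referring to levels by name tends to be easier to read than positional integers once a frame has more than two column levels.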

## Dropping missing values

In [15]:
multicol3 = pd.MultiIndex.from_tuples([('weight', 'kg'),
                                       ('height', 'm')])
df_multi_level_cols3 = pd.DataFrame([[None, 1.0], [2.0, 3.0]],
                                    index=['cat', 'dog'],
                                    columns=multicol3)

In [16]:
df_multi_level_cols3

    weight height
        kg      m
cat    NaN    1.0
dog    2.0    3.0


In [17]:
df_multi_level_cols3.stack(dropna=False)

        height  weight
cat kg     NaN     NaN
    m      1.0     NaN
dog kg     NaN     2.0
    m      3.0     NaN


In [19]:
df_multi_level_cols3.stack(dropna=True)

        height  weight
cat m      1.0     NaN
dog kg     NaN     2.0
    m      3.0     NaN
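
The `dropna` argument behaves as shown with the pandas version used in this Notebook (1.1.4). More recent pandas releases appear to deprecate it, which is an assumption worth checking against the documentation for your installed version. A sketch that avoids the argument is to stack with the defaults and then remove rows where every value is missing using `DataFrame.dropna(how='all')`. The frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

multicol3 = pd.MultiIndex.from_tuples([('weight', 'kg'), ('height', 'm')])
df_multi_level_cols3 = pd.DataFrame([[None, 1.0], [2.0, 3.0]],
                                    index=['cat', 'dog'],
                                    columns=multicol3)

# Stack with the default settings, then drop rows where every value is missing.
# If stack() has already removed such rows, the dropna() call changes nothing.
stacked = df_multi_level_cols3.stack().dropna(how='all')

print(stacked)
```

The row and column order of the result may vary between pandas versions, so it is safer to look values up by label than by position.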


## unstack

In [20]:
index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'),
                                   ('two', 'a'), ('two', 'b')])

In [25]:
s = pd.Series(np.arange(1.0, 5.0), index=index)
s

one  a    1.0
     b    2.0
two  a    3.0
     b    4.0
dtype: float64

In [26]:
s.unstack(level=-1)

       a    b
one  1.0  2.0
two  3.0  4.0


In [27]:
s.unstack(level=0)

   one  two
a  1.0  3.0
b  2.0  4.0
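
When the index does not contain every combination of labels, `unstack()` fills the gaps with `NaN`. The sketch below uses a hypothetical Series, `s_incomplete`, that lacks the `('two', 'b')` entry, and shows the `fill_value` parameter of `Series.unstack` as one way to substitute a different placeholder.

```python
import numpy as np
import pandas as pd

# Same idea as the Series above, but the ('two', 'b') entry is absent
index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'), ('two', 'a')])
s_incomplete = pd.Series(np.arange(1.0, 4.0), index=index)

with_nan = s_incomplete.unstack()                 # missing cell becomes NaN
with_zero = s_incomplete.unstack(fill_value=0.0)  # missing cell becomes 0.0

print(with_nan)
print(with_zero)
```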
