In [6]:
import pandas as pd
import numpy as np

### Pandas Melt
Pandas Melt turns a DataFrame from wide (many columns) to tall (many rows).

Let's run through 2 examples:
1. Standard Pandas Melt
2. Standard Pandas Melt with custom column names

First, let's create a wide DataFrame. I'll use restaurant purchases per day. It's common to see data laid out with one column per day, but that layout is tough to analyze.

In [33]:
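# One row per restaurant, one column per day. The values are random, so they
# change on every run; the melt outputs further down come from a different run
# of this cell and show different numbers than the table printed here.
df = pd.DataFrame({"Name": ['Liho Liho', 'Tompkins', 'The Square', 'Chambers'],
                   "8/4/2020": np.random.randint(10, 200, size=4),
                   "8/5/2020": np.random.randint(12, 200, size=4),
                   "8/6/2020": np.random.randint(12, 200, size=4),
                   "8/7/2020": np.random.randint(12, 200, size=4)})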
df

,Name,8/4/2020,8/5/2020,8/6/2020,8/7/2020
0,Liho Liho,123,136,129,75
1,Tompkins,179,164,112,184
2,The Square,42,54,58,54
3,Chambers,73,25,66,188


### 1. Standard Pandas Melt
In this example, I want to take all of the date columns to the right of "Name" and turn them into rows.

This means I'll end up with 4 Liho Liho rows, one for each date column.

I'll specify id_vars="Name" to tell pandas that's the column I want to 'unpivot' around. Put another way: by specifying "Name" in id_vars, I'm telling pandas to take all of the other columns (besides "Name") and stack them into two new columns, one holding the old column names (*variable*) and one holding the values (*value*).

Notice how we now have 16 rows: 4 rows (because there were 4 date columns) for every 1 row we had before. Only the first 10 rows are shown below.

In [26]:
df.melt(id_vars='Name')

,Name,variable,value
0,Liho Liho,8/4/2020,31
1,Tompkins,8/4/2020,212
2,The Square,8/4/2020,135
3,Chambers,8/4/2020,54
4,Liho Liho,8/5/2020,83
5,Tompkins,8/5/2020,201
6,The Square,8/5/2020,33
7,Chambers,8/5/2020,50
8,Liho Liho,8/6/2020,192
9,Tompkins,8/6/2020,190
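
To check the row arithmetic without depending on random values, here is a minimal sketch. The frame `wide` is an assumption made for reproducibility: it hard-codes the numbers that appear in the outputs of this notebook instead of generating them.

```python
import pandas as pd

# Fixed values (copied from the outputs shown in this notebook) so the
# result is the same on every run.
wide = pd.DataFrame({
    "Name": ["Liho Liho", "Tompkins", "The Square", "Chambers"],
    "8/4/2020": [31, 212, 135, 54],
    "8/5/2020": [83, 201, 33, 50],
    "8/6/2020": [192, 190, 23, 141],
    "8/7/2020": [164, 205, 35, 170],
})

tall = wide.melt(id_vars="Name")

# 4 restaurants x 4 date columns = 16 rows.
print(wide.shape)  # (4, 5)
print(tall.shape)  # (16, 3)

# Each restaurant now appears once per date column.
print(tall["Name"].value_counts())
```

The three columns of `tall` are `Name`, `variable` (the old column headers), and `value`.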


If you only want to melt a subset of your columns, specify them directly through *value_vars*. Notice how the columns I did not list (8/6/2020 and 8/7/2020) are dropped from the result. Make sure this is what you intend.

In [27]:
df.melt(id_vars='Name', value_vars=['8/4/2020', '8/5/2020'])

,Name,variable,value
0,Liho Liho,8/4/2020,31
1,Tompkins,8/4/2020,212
2,The Square,8/4/2020,135
3,Chambers,8/4/2020,54
4,Liho Liho,8/5/2020,83
5,Tompkins,8/5/2020,201
6,The Square,8/5/2020,33
7,Chambers,8/5/2020,50
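
If typing out every column for *value_vars* gets tedious, one option is to build the list from the DataFrame's own columns. This is only a sketch: `wide`, `date_cols`, and `first_two` are names made up for the example, and the values are hard-coded so the result is reproducible.

```python
import pandas as pd

wide = pd.DataFrame({
    "Name": ["Liho Liho", "Tompkins", "The Square", "Chambers"],
    "8/4/2020": [31, 212, 135, 54],
    "8/5/2020": [83, 201, 33, 50],
    "8/6/2020": [192, 190, 23, 141],
    "8/7/2020": [164, 205, 35, 170],
})

# Every column except the identifier, in their original order.
date_cols = [c for c in wide.columns if c != "Name"]
first_two = date_cols[:2]

tall = wide.melt(id_vars="Name", value_vars=first_two)

# Only the listed columns survive; 8/6/2020 and 8/7/2020 are dropped.
print(tall["variable"].unique())
print(tall.shape)  # (8, 3)
```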


**Warning**: This is a confusing example, and I'm not sure when you would use it.

If you don't want your data dropped like above, you can list the columns you want to keep in *id_vars*. Everything else gets unpivoted.

In [28]:
df.melt(id_vars=['Name','8/4/2020', '8/5/2020'])

,Name,8/4/2020,8/5/2020,variable,value
0,Liho Liho,31,83,8/6/2020,192
1,Tompkins,212,201,8/6/2020,190
2,The Square,135,33,8/6/2020,23
3,Chambers,54,50,8/6/2020,141
4,Liho Liho,31,83,8/7/2020,164
5,Tompkins,212,201,8/7/2020,205
6,The Square,135,33,8/7/2020,35
7,Chambers,54,50,8/7/2020,170
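
One situation where a list of *id_vars* seems more natural is when several columns describe the row rather than hold a measurement. The sketch below assumes a hypothetical `City` column with placeholder values; it is not part of the data used in this notebook.

```python
import pandas as pd

# "City" is a hypothetical descriptive column, added only for this sketch.
wide = pd.DataFrame({
    "Name": ["Liho Liho", "Tompkins", "The Square", "Chambers"],
    "City": ["City A", "City A", "City B", "City B"],
    "8/4/2020": [31, 212, 135, 54],
    "8/5/2020": [83, 201, 33, 50],
})

# Both descriptive columns are repeated on every row;
# only the two date columns are unpivoted.
tall = wide.melt(id_vars=["Name", "City"])
print(tall)
```

Each restaurant keeps its `City` on both of its rows, so nothing descriptive is lost in the reshape.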


### 2. Standard Pandas Melt with custom column names
Very similar to above, but in this case, we will use custom names for our *new* columns.

You can name the new column that holds the old column headers (what's been unpivoted) through *var_name*, and the value column through *value_name*.

In [31]:
df.melt(id_vars='Name', var_name='Date', value_name='Transaction_Amount')

,Name,Date,Transaction_Amount
0,Liho Liho,8/4/2020,31
1,Tompkins,8/4/2020,212
2,The Square,8/4/2020,135
3,Chambers,8/4/2020,54
4,Liho Liho,8/5/2020,83
5,Tompkins,8/5/2020,201
6,The Square,8/5/2020,33
7,Chambers,8/5/2020,50
8,Liho Liho,8/6/2020,192
9,Tompkins,8/6/2020,190
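
Once the data is tall, the analysis that was awkward in the wide layout becomes a short groupby. The following is a sketch under a few assumptions: the values are hard-coded from the outputs in this notebook, the date strings are taken to be month/day/year, and `totals` and `daily` are names chosen for the example.

```python
import pandas as pd

wide = pd.DataFrame({
    "Name": ["Liho Liho", "Tompkins", "The Square", "Chambers"],
    "8/4/2020": [31, 212, 135, 54],
    "8/5/2020": [83, 201, 33, 50],
    "8/6/2020": [192, 190, 23, 141],
    "8/7/2020": [164, 205, 35, 170],
})

tall = wide.melt(id_vars="Name", var_name="Date", value_name="Transaction_Amount")

# The melted "Date" column holds strings; parse them as month/day/year (assumed).
tall["Date"] = pd.to_datetime(tall["Date"], format="%m/%d/%Y")

# Total per restaurant and total per day.
totals = tall.groupby("Name")["Transaction_Amount"].sum()
daily = tall.groupby("Date")["Transaction_Amount"].sum()

print(totals)
print(daily)
```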
