# Pandas Pivot & Pivot Tables

## Pandas `pivot` Method

In [1]:
import pandas as pd
import numpy as np

Create a DataFrame containing the future winners of the MyHEAT Cars, Cash and Cribs Lotto:

In [2]:
df = pd.DataFrame({'prize': ['Car', 'Cash', 'Crib', 'Car', 'Cash', 'Crib'],
                   'month': ['JAN', 'FEB', 'MAR', 'JAN', 'FEB', 'MAR'],
                   'year': ['2023', '2024', '2025', '2026', '2027', '2028'],
                   'winner': ['Dave', 'Aateka', 'Zach', 'Salar', 'Dave', 'Dave']})
df

,prize,month,year,winner
0,Car,JAN,2023,Dave
1,Cash,FEB,2024,Aateka
2,Crib,MAR,2025,Zach
3,Car,JAN,2026,Salar
4,Cash,FEB,2027,Dave
5,Crib,MAR,2028,Dave


Let's see what happens when we index the table by `year` and set the column headers by `winner`:

In [3]:
df.pivot(index='year', columns='winner')

,prize,prize,prize,prize,month,month,month,month
winner,Aateka,Dave,Salar,Zach,Aateka,Dave,Salar,Zach
year,,,,,,,,
2023,,Car,,,,JAN,,
2024,Cash,,,,FEB,,,
2025,,,,Crib,,,,MAR
2026,,,Car,,,,JAN,
2027,,Cash,,,,FEB,,
2028,,Crib,,,,MAR,,


The result above is not very helpful: with no `values` argument, every remaining column (`prize` and `month`) is spread across the winners. How about we use just `prize` as our values:

In [4]:
df.pivot(index='year', columns='winner', values='prize')

winner,Aateka,Dave,Salar,Zach
year,,,,
2023,,Car,,
2024,Cash,,,
2025,,,,Crib
2026,,,Car,
2027,,Cash,,
2028,,Crib,,


What if we want to know who won what prize in what year?

In [5]:
df.pivot(index='winner', columns='prize', values='year')

prize,Car,Cash,Crib
winner,,,
Aateka,,2024,
Dave,2023,2027,2028
Salar,2026,,
Zach,,,2025


However, the pivot tables above are **all** fundamentally flawed, because `Dave` is not a unique ID at MyHEAT :P Any two people named Dave are merged into a single row or column.
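To make the uniqueness problem concrete, here is a minimal sketch using a small hypothetical DataFrame (`dup`, not part of the lotto data above) in which `Dave` wins a car twice. It assumes the usual pandas behaviour: `pivot` refuses to reshape when an index/column pair repeats, while `pivot_table` resolves the clash by aggregating.

```python
import pandas as pd

# Hypothetical data: 'Dave' wins a Car twice, so the (winner, prize) pair repeats.
dup = pd.DataFrame({'prize': ['Car', 'Car', 'Cash'],
                    'year': [2023, 2026, 2024],
                    'winner': ['Dave', 'Dave', 'Aateka']})

try:
    dup.pivot(index='winner', columns='prize', values='year')
    pivot_failed = False
except ValueError as err:
    # pivot cannot decide which of the two years belongs in the (Dave, Car) cell
    pivot_failed = True
    print(f"pivot failed: {err}")

# pivot_table aggregates the clashing rows instead; here we keep the earliest year.
first_win = dup.pivot_table(index='winner', columns='prize',
                            values='year', aggfunc='min')
print(first_win)
```

Choosing `'min'` here is only one option. Which aggregation makes sense depends on the question being asked, and a real fix would be to pivot on a genuinely unique identifier rather than a first name.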

## Pandas `pivot_table`

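`pivot_table` is used for pivoting tables where values need to be aggregated, for example when several rows share the same index/column combination. The default aggregation is the mean.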

In [6]:
df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                         "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two"],
                   "C": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"],
                   "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
                   "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
df

,A,B,C,D,E
0,foo,one,small,1,2
1,foo,one,large,2,4
2,foo,one,large,2,5
3,foo,two,small,3,5
4,foo,two,small,3,6
5,bar,one,large,4,6
6,bar,one,small,5,8
7,bar,two,small,6,9
8,bar,two,large,7,9


In [7]:
# With no aggfunc given, pivot_table averages rows that share the same (A, B, C) combination
pd.pivot_table(df, values='D', index=['A', 'B'], columns='C')

,C,large,small
A,B,,
bar,one,4.0,5.0
bar,two,7.0,6.0
foo,one,2.0,1.0
foo,two,,3.0
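One way to think about the call above is as a group-by followed by an unstack. The sketch below rebuilds the same DataFrame so that it runs on its own, then compares the two approaches. The variable names `pivoted` and `grouped` are illustrative, and the equivalence is meant as a mental model rather than a description of pandas internals.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                         "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two"],
                   "C": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"],
                   "D": [1, 2, 2, 3, 3, 4, 5, 6, 7]})

# Default pivot_table: mean of D for every (A, B) row and C column
pivoted = pd.pivot_table(df, values='D', index=['A', 'B'], columns='C')

# The same idea spelled out: group, average, then move C into the columns
grouped = df.groupby(['A', 'B', 'C'])['D'].mean().unstack('C')

print(pivoted)
print(grouped)
```

Combinations that never occur in the data, such as `foo`/`two`/`large`, show up as `NaN` in both results.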


In [8]:
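# Sum instead of averaging; the string name 'sum' avoids the FutureWarning that recent pandas raises for np.sum
pd.pivot_table(df, values='D', index=['A', 'B'], columns='C', aggfunc='sum')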

,C,large,small
A,B,,
bar,one,4.0,5.0
bar,two,7.0,6.0
foo,one,4.0,1.0
foo,two,,6.0
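The mean and sum outputs differ only in the cells that are built from more than one row. A quick way to check which cells those are is to count the rows per group. This is a minimal sketch that rebuilds the same data so it runs on its own; `counts` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                         "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two"],
                   "C": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"],
                   "D": [1, 2, 2, 3, 3, 4, 5, 6, 7]})

# Number of D values that feed each cell of the pivot table
counts = pd.pivot_table(df, values='D', index=['A', 'B'], columns='C',
                        aggfunc='count')
print(counts)
```

Only `foo`/`one`/`large` and `foo`/`two`/`small` contain two rows, which is why those are the only cells where the sum (4 and 6) differs from the mean (2 and 3).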


In [9]:
# fill_value replaces the missing (foo, two, large) cell with 0, so the result stays integer-typed
pd.pivot_table(df, values='D', index=['A', 'B'], columns='C', aggfunc='sum', fill_value=0)

,C,large,small
A,B,,
bar,one,4,5
bar,two,7,6
foo,one,4,1
foo,two,0,6


In [10]:
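# A dict maps each value column to its own aggregation; with no columns argument, D and E become the columns
pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'], aggfunc={'D': 'mean', 'E': 'mean'})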

,,D,E
A,C,,
bar,large,5.5,7.5
bar,small,5.5,8.5
foo,large,2.0,4.5
foo,small,2.333333,4.333333


In [11]:
# A list of aggregations for one column adds a second column level (max, mean, min) under E
pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'], aggfunc={'D': 'mean', 'E': ['min', 'max', 'mean']})

,,D,E,E,E
,,mean,max,mean,min
A,C,,,,
bar,large,5.5,9,7.5,6
bar,small,5.5,9,8.5,8
foo,large,2.0,5,4.5,4
foo,small,2.333333,6,4.333333,2
