
## A Simple `.melt()` and `.pivot_table()` Example

_Authors: Kiefer Katovich (SF)_

---


In [2]:
import pandas as pd
import numpy as np

#### Create some fake data about beer quality and ratings.

In [3]:
beers = {
    'name':['coors','bud','natural light','keystone ice',
            'sierra nevada', 'sam adams', 'new belgium',
            'odouls',
            'pbr','stella','chimay','magnolia','21amendment'],
    'class':['crap','crap','crap','crap',
             'mid','mid','mid',
             'notabeer',
             'pretentious','pretentious','pretentious','pretentious','pretentious']
}
beers['price'] = np.concatenate([np.array([1.5, 1.7, 1.2, 1.2]),
                                np.array([2., 1.9, 2.1]),
                                np.array([3.]),
                                np.array([0.5,3.5, 10., 15.,2.])])
beers['rating'] = np.random.normal(5, 2, size=13)

# Sanity check: every column should have the same length (13).
for k, v in beers.items():
    print(k, len(v))
beers = pd.DataFrame(beers)
beers

rating 13
price 13
name 13
class 13


| | class | name | price | rating |
|---|---|---|---|---|
| 0 | crap | coors | 1.5 | 7.769475 |
| 1 | crap | bud | 1.7 | 3.732404 |
| 2 | crap | natural light | 1.2 | 2.641919 |
| 3 | crap | keystone ice | 1.2 | 3.482523 |
| 4 | mid | sierra nevada | 2.0 | 5.097718 |
| 5 | mid | sam adams | 1.9 | 7.094218 |
| 6 | mid | new belgium | 2.1 | 7.385905 |
| 7 | notabeer | odouls | 3.0 | 5.99259 |
| 8 | pretentious | pbr | 0.5 | 6.866919 |
| 9 | pretentious | stella | 3.5 | 8.434656 |


#### Melt the beer data into long format using the name of the beer as the ID.

In [4]:
beers_long = pd.melt(beers, id_vars=['name'])
beers_long.sort_values('name', axis=0)

| | name | variable | value |
|---|---|---|---|
| 38 | 21amendment | rating | 5.41369 |
| 25 | 21amendment | price | 2 |
| 12 | 21amendment | class | pretentious |
| 1 | bud | class | crap |
| 27 | bud | rating | 3.7324 |
| 14 | bud | price | 1.7 |
| 36 | chimay | rating | 3.02179 |
| 23 | chimay | price | 10 |
| 10 | chimay | class | pretentious |
| 26 | coors | rating | 7.76948 |
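
To make the reshaping easier to follow, here is a minimal sketch of the same `pd.melt` call on a tiny frame. The `demo` data is hypothetical, and the `value_vars`, `var_name`, and `value_name` arguments are spelled out only for illustration; the notebook relies on their defaults.

```python
import pandas as pd

# Hypothetical two-beer frame with a subset of the columns in `beers`.
demo = pd.DataFrame({
    'name': ['coors', 'chimay'],
    'class': ['crap', 'pretentious'],
    'price': [1.5, 10.0],
})

# Every non-ID column is stacked into (variable, value) pairs,
# so 2 beers x 2 melted columns gives 4 rows.
demo_long = pd.melt(
    demo,
    id_vars=['name'],               # columns kept as identifiers
    value_vars=['class', 'price'],  # columns to stack (default: all others)
    var_name='variable',            # name of the new "which column" field
    value_name='value',             # name of the new "cell value" field
)
print(demo_long)
```

Because `class` holds strings and `price` holds numbers, the melted `value` column ends up holding mixed types, which is why the prices in the long table above print as `2` and `10` rather than `2.0` and `10.0`.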


#### Use `.pivot_table()` on the original beer DataFrame to summarize the mean and standard deviation for rating and price by class.

In [5]:
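# String aggregator names avoid the deprecation warning that recent pandas
# versions emit when NumPy functions such as np.mean are passed as aggfunc.
beer_class_summary = pd.pivot_table(beers, index='class', values=['rating','price'],
                                   aggfunc=['mean', 'std'])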

In [6]:
beer_class_summary

| class | mean price | mean rating | std price | std rating |
|---|---|---|---|---|
| crap | 1.4 | 4.40658 | 0.244949 | 2.289941 |
| mid | 2.0 | 6.525947 | 0.1 | 1.245451 |
| notabeer | 3.0 | 5.99259 | NaN | NaN |
| pretentious | 6.2 | 4.97465 | 6.109419 | 2.92799 |
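
The missing standard deviations for `notabeer` are consistent with a sample standard deviation (`ddof=1`), which is undefined for a group with a single member. The sketch below reproduces that behavior with an equivalent `groupby` on a small hypothetical frame (`demo` is not part of the notebook); it is one way to cross-check a pivot table, not the method used above.

```python
import pandas as pd

# Hypothetical data: two beers in one class, a single beer in another.
demo = pd.DataFrame({
    'class': ['crap', 'crap', 'notabeer'],
    'price': [1.5, 1.7, 3.0],
})

# groupby + agg computes the same per-class statistics as a pivot table
# with index='class' and aggfunc=['mean', 'std'].
summary = demo.groupby('class')['price'].agg(['mean', 'std'])
print(summary)

# With only one observation, the sample standard deviation is NaN.
print(summary.loc['notabeer', 'std'])
```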


#### Go from long format back to wide format using `.pivot_table()` and a custom aggregate function.

In [15]:
beers_long

| | name | variable | value |
|---|---|---|---|
| 0 | coors | class | crap |
| 1 | bud | class | crap |
| 2 | natural light | class | crap |
| 3 | keystone ice | class | crap |
| 4 | sierra nevada | class | mid |
| 5 | sam adams | class | mid |
| 6 | new belgium | class | mid |
| 7 | odouls | class | notabeer |
| 8 | pbr | class | pretentious |
| 9 | stella | class | pretentious |


In [20]:
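def first_item(series):
    # Each (name, variable) group holds exactly one value, so "aggregating"
    # just means returning that single item unchanged.
    item = series.iloc[0]
    print(series)  # show the group passed in by pivot_table
    print()        # blank line between groups
    return item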


new_beers_wide = pd.pivot_table(beers_long, index='name', values='value',
                                columns=['variable'], aggfunc=first_item)
new_beers_wide.reset_index()

12    pretentious
Name: value, dtype: object

25    2
Name: value, dtype: object

38    5.41369
Name: value, dtype: object

1    crap
Name: value, dtype: object

14    1.7
Name: value, dtype: object

27    3.7324
Name: value, dtype: object

10    pretentious
Name: value, dtype: object

23    10
Name: value, dtype: object

36    3.02179
Name: value, dtype: object

0    crap
Name: value, dtype: object

13    1.5
Name: value, dtype: object

26    7.76948
Name: value, dtype: object

3    crap
Name: value, dtype: object

16    1.2
Name: value, dtype: object

29    3.48252
Name: value, dtype: object

11    pretentious
Name: value, dtype: object

24    15
Name: value, dtype: object

37    1.13619
Name: value, dtype: object

2    crap
Name: value, dtype: object

15    1.2
Name: value, dtype: object

28    2.64192
Name: value, dtype: object

6    mid
Name: value, dtype: object

19    2.1
Name: value, dtype: object

32    7.38591
Name: value, dtype: object

7    notabeer


| | name | class | price | rating |
|---|---|---|---|---|
| 0 | 21amendment | pretentious | 2.0 | 5.41369 |
| 1 | bud | crap | 1.7 | 3.7324 |
| 2 | chimay | pretentious | 10.0 | 3.02179 |
| 3 | coors | crap | 1.5 | 7.76948 |
| 4 | keystone ice | crap | 1.2 | 3.48252 |
| 5 | magnolia | pretentious | 15.0 | 1.13619 |
| 6 | natural light | crap | 1.2 | 2.64192 |
| 7 | new belgium | mid | 2.1 | 7.38591 |
| 8 | odouls | notabeer | 3.0 | 5.99259 |
| 9 | pbr | pretentious | 0.5 | 6.86692 |
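
When every `(name, variable)` pair occurs exactly once, as it does here, a plain `.pivot()` is one possible alternative to a custom aggregate function. The sketch below uses a small hypothetical long-format frame (`demo_long`) and also converts `price` back to a numeric dtype, since values that pass through a mixed-type `value` column come back as generic objects. Treat it as an illustration under those assumptions rather than a replacement for the approach above.

```python
import pandas as pd

# Hypothetical long-format data: one row per (name, variable) pair.
demo_long = pd.DataFrame({
    'name': ['bud', 'bud', 'coors', 'coors'],
    'variable': ['class', 'price', 'class', 'price'],
    'value': ['crap', 1.7, 'crap', 1.5],
})

# .pivot() only reshapes; it raises an error if a (name, variable)
# pair is duplicated, whereas pivot_table() would aggregate instead.
demo_wide = demo_long.pivot(index='name', columns='variable',
                            values='value').reset_index()

# The price column comes back with object dtype; make it numeric again.
demo_wide['price'] = pd.to_numeric(demo_wide['price'])
print(demo_wide.dtypes)
```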
