In [1]:
import numpy as np
import pandas as pd

### Pivot Tables

Pivot tables can be produced using groupby operations and hierarchical indexing, although doing so can be complicated. To simplify the work, DataFrames have a *pivot_table* method.

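The pivot_table method has many optional arguments, and in practice at least one grouping key must be supplied. The index argument names the column or columns whose values become the row labels of the result, that is, where the pivot takes place. The aggfunc argument specifies how the values within each group are aggregated; it defaults to 'mean'.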
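As a minimal sketch of these two arguments, assume a small hand-written DataFrame; the name `toy` and its values are made up for illustration, chosen so the group results are easy to check by hand. The `values` argument is used here only to state explicitly which column is aggregated.

```python
import pandas as pd

# Hypothetical toy data: two females (10, 20) and two males (30, 50).
toy = pd.DataFrame({
    'gender': ['F', 'F', 'M', 'M'],
    'weight': [10.0, 20.0, 30.0, 50.0],
})

# index: the column whose distinct values become the row labels.
# aggfunc: how the values within each group are combined.
toy_means = toy.pivot_table(index='gender', values='weight', aggfunc='mean')
toy_sums = toy.pivot_table(index='gender', values='weight', aggfunc='sum')

print(toy_means)  # F -> 15.0, M -> 40.0
print(toy_sums)   # F -> 30.0, M -> 80.0
```

Both results are DataFrames indexed by gender, so a single entry can be read with, for example, `toy_means.loc['F', 'weight']`.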

##### Example 1

In [8]:
# Ten random pets: gender and type are drawn uniformly, weight is normal
# around 25, and num_feet is weighted heavily toward 3 or 4.
pets = pd.DataFrame({'gender':np.random.choice(np.array(['M','F']), size = (10,)),
                    'type':np.random.choice(np.array(['dog','cat']), size = (10,)),
                    'weight':10*np.random.normal(size = (10,))+25,
                    'num_feet':np.random.choice(range(1,5), size = (10,), p=[0.05, 0.05, 0.3, 0.6])})
pets

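| | gender | type | weight | num_feet |
|---|---|---|---|---|
| 0 | F | cat | 20.236837 | 4 |
| 1 | M | dog | 27.085906 | 3 |
| 2 | F | cat | 41.589539 | 4 |
| 3 | F | dog | 34.345957 | 4 |
| 4 | M | dog | 28.293106 | 3 |
| 5 | M | cat | 42.184208 | 4 |
| 6 | F | cat | 37.143714 | 4 |
| 7 | F | dog | 22.093624 | 4 |
| 8 | F | cat | 34.792727 | 4 |
| 9 | M | cat | 19.776972 | 2 |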


We pivot on gender.

In [9]:
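# Aggregate only the numeric columns; the string column 'type' has no mean.
pets.pivot_table(index = ['gender'], values = ['num_feet', 'weight'], aggfunc = 'mean')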

| gender | num_feet | weight |
|---|---|---|
| F | 4 | 31.7004 |
| M | 3 | 29.335048 |


Now, we will pivot on both gender and type.

In [10]:
pets.pivot_table(index = ['gender', 'type'], aggfunc = 'mean')

| gender | type | num_feet | weight |
|---|---|---|---|
| F | cat | 4 | 33.440704 |
| F | dog | 4 | 28.21979 |
| M | cat | 3 | 30.98059 |
| M | dog | 3 | 27.689506 |


Now, we change the aggfunc argument from 'mean' to 'sum'.

In [11]:
pets.pivot_table(index = ['gender', 'type'], aggfunc = 'sum')

| gender | type | num_feet | weight |
|---|---|---|---|
| F | cat | 16 | 133.762818 |
| F | dog | 8 | 56.439581 |
| M | cat | 6 | 61.96118 |
| M | dog | 6 | 55.379012 |
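The aggfunc argument also accepts a dict that maps column names to aggregations, so different columns can be summarized differently in one call. The sketch below is illustrative only: `pets_fixed` is a hypothetical name, and its rows are the ten rows of the pets table typed in by hand, because the original frame was generated randomly.

```python
import pandas as pd

# The ten displayed rows of pets, entered by hand for reproducibility.
pets_fixed = pd.DataFrame({
    'gender': ['F', 'M', 'F', 'F', 'M', 'M', 'F', 'F', 'F', 'M'],
    'type': ['cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'dog', 'cat', 'cat'],
    'weight': [20.236837, 27.085906, 41.589539, 34.345957, 28.293106,
               42.184208, 37.143714, 22.093624, 34.792727, 19.776972],
    'num_feet': [4, 3, 4, 4, 3, 4, 4, 4, 4, 2],
})

# Sum the feet but average the weights within each (gender, type) group.
mixed = pets_fixed.pivot_table(
    index=['gender', 'type'],
    aggfunc={'num_feet': 'sum', 'weight': 'mean'},
)
print(mixed)
```

The num_feet column of this result should match the 'sum' table, and the weight column should match the 'mean' table for the same index.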


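Note that we could have obtained the same result with a groupby. Listing the groups shows which rows fall into each (gender, type) pair, and summing them reproduces the pivot table.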

In [12]:
list(pets[['num_feet', 'weight']].groupby([pets['gender'], pets['type']]))

[(('F', 'cat'),
     num_feet     weight
  0         4  20.236837
  2         4  41.589539
  6         4  37.143714
  8         4  34.792727),
 (('F', 'dog'),
     num_feet     weight
  3         4  34.345957
  7         4  22.093624),
 (('M', 'cat'),
     num_feet     weight
  5         4  42.184208
  9         2  19.776972),
 (('M', 'dog'),
     num_feet     weight
  1         3  27.085906
  4         3  28.293106)]

In [13]:
pets[['num_feet', 'weight']].groupby([pets['gender'], pets['type']]).sum()

| gender | type | num_feet | weight |
|---|---|---|---|
| F | cat | 16 | 133.762818 |
| F | dog | 8 | 56.439581 |
| M | cat | 6 | 61.96118 |
| M | dog | 6 | 55.379012 |
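One way to confirm that the two approaches agree is to compare them programmatically. The following is a sketch under stated assumptions: `pets_demo`, the seed, and the sample size are hypothetical choices made so that the block runs on its own, and the groupby is written with column names rather than Series, which is an equivalent spelling.

```python
import numpy as np
import pandas as pd

# Seeded generator so the illustration is reproducible (seed chosen arbitrarily).
rng = np.random.default_rng(0)
n = 20
pets_demo = pd.DataFrame({
    'gender': rng.choice(['M', 'F'], size=n),
    'type': rng.choice(['dog', 'cat'], size=n),
    'weight': rng.normal(25, 10, size=n),
    'num_feet': rng.integers(1, 5, size=n),  # integers from 1 to 4
})

# Same aggregation expressed two ways.
by_pivot = pets_demo.pivot_table(index=['gender', 'type'],
                                 values=['num_feet', 'weight'],
                                 aggfunc='sum')
by_groupby = pets_demo.groupby(['gender', 'type'])[['num_feet', 'weight']].sum()

# Raises AssertionError if labels or values differ; dtypes are not compared.
pd.testing.assert_frame_equal(by_pivot, by_groupby, check_dtype=False)
print(by_pivot)
```

If the assertion passes silently, both calls produced the same index, columns, and sums.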


$\Box$