# 13. Pivot Tables - Grouping but with a Different Shape
A pivot table performs the same computation as a groupby but returns the result in a different shape, one that is often more readable.

For instance, to find the average salary by race and gender, we could use the following groupby.

In [1]:
import pandas as pd
emp = pd.read_csv('data/employee.csv')
emp.head()

,title,dept,salary,race,gender,experience
0,POLICE OFFICER,Houston Police Department-HPD,45279.0,White,Male,1
1,ENGINEER/OPERATOR,Houston Fire Department (HFD),63166.0,White,Male,34
2,SENIOR POLICE OFFICER,Houston Police Department-HPD,66614.0,Black,Male,32
3,ENGINEER,Public Works & Engineering-PWE,71680.0,Asian,Male,4
4,CARPENTER,Houston Airport System (HAS),42390.0,White,Male,3


In [2]:
emp.groupby(['race', 'gender']).agg({'salary': 'mean'})

race,gender,salary
Asian,Female,58304.222222
Asian,Male,60622.956522
Black,Female,48133.381643
Black,Male,51853.0
Hispanic,Female,44216.96
Hispanic,Male,55493.064057
Native American,Female,58844.333333
Native American,Male,68850.5
White,Female,66415.527778
White,Male,63439.195745


# A Groupby produces long data - hard to compare
The output above makes it hard to compare female and male average salaries because the two values for each race sit in separate rows. A pivot table produces exactly the same information but places one of the grouping columns along the columns.

In [4]:
# Cast the mean salaries to integers (this truncates the decimals) for easier reading
emp.pivot_table(index='race', columns='gender', values='salary').astype('int')

gender,Female,Male
race,,
Asian,58304,60622
Black,48133,51853
Hispanic,44216,55493
Native American,58844,68850
White,66415,63439
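
To check that the pivot table really holds the same numbers as the groupby, here is a minimal sketch that compares the two. It assumes a small made-up DataFrame in place of `data/employee.csv` (the six rows are hypothetical, not taken from the real dataset) and reshapes the groupby result with `unstack`, which moves the `gender` index level into the columns.

```python
import pandas as pd

# Hypothetical stand-in for the employee data (values are invented)
emp = pd.DataFrame({
    'race':   ['Asian', 'Asian', 'Black', 'Black', 'Asian', 'Black'],
    'gender': ['Female', 'Male', 'Female', 'Male', 'Female', 'Male'],
    'salary': [50000.0, 60000.0, 40000.0, 45000.0, 70000.0, 55000.0],
})

# Long result: one row per (race, gender) pair
long_result = emp.groupby(['race', 'gender'])['salary'].mean()

# Move the 'gender' level of the index into the columns
unstacked = long_result.unstack('gender')

# Wide result produced directly by a pivot table
pivoted = emp.pivot_table(index='race', columns='gender', values='salary')

print(unstacked)
print(pivoted)
```

Both printed tables should show one row per race and one column per gender, which is why the pivot table can be read as a groupby in a wider shape.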


## By default a pivot table takes the mean - change the aggregation with `aggfunc`

In [5]:
emp.pivot_table(index='race', columns='gender', values='salary', aggfunc='median')

gender,Female,Male
race,,
Asian,51514.5,55461.0
Black,40581.0,49150.0
Hispanic,42837.5,55437.0
Native American,58855.0,69351.0
White,62783.0,62540.0
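
As a rough illustration of how the choice of `aggfunc` changes the result, the sketch below runs the same pivot three times on a small invented DataFrame (again a hypothetical stand-in for the employee data): once with the default mean, once with `'median'`, and once with `'count'` to see how many rows feed each cell.

```python
import pandas as pd

# Hypothetical stand-in for the employee data (values are invented)
emp = pd.DataFrame({
    'race':   ['Asian', 'Asian', 'Asian', 'Asian',
               'Black', 'Black', 'Black', 'Black', 'Black'],
    'gender': ['Female', 'Female', 'Female', 'Male',
               'Female', 'Female', 'Male', 'Male', 'Male'],
    'salary': [40000.0, 50000.0, 90000.0, 60000.0,
               30000.0, 50000.0, 45000.0, 55000.0, 80000.0],
})

# Default aggregation is the mean
mean_table = emp.pivot_table(index='race', columns='gender', values='salary')

# Same layout, but each cell holds the median salary
median_table = emp.pivot_table(index='race', columns='gender',
                               values='salary', aggfunc='median')

# Same layout, but each cell holds the number of salaries in the group
count_table = emp.pivot_table(index='race', columns='gender',
                              values='salary', aggfunc='count')

print(mean_table)
print(median_table)
print(count_table)
```

In this toy data the single 90000 salary pulls the Asian/Female mean above the median, which is the kind of difference that motivates switching the aggregation.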


# Use a pivot table when you have two grouping columns
Choose one column as the **`index`** and the other as the **`columns`**. The aggregating column is set as the **`values`**, and the aggregating function as **`aggfunc`**.
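
Which grouping column goes to `index` and which to `columns` is a matter of presentation. The sketch below, using a small invented DataFrame as a hypothetical stand-in for the employee data, spells out all four parameters and then swaps `index` and `columns` to show that the same numbers come back transposed.

```python
import pandas as pd

# Hypothetical stand-in for the employee data (values are invented)
emp = pd.DataFrame({
    'race':   ['Asian', 'Asian', 'Black', 'Black', 'Asian', 'Black'],
    'gender': ['Female', 'Male', 'Female', 'Male', 'Female', 'Male'],
    'salary': [50000.0, 60000.0, 40000.0, 45000.0, 70000.0, 55000.0],
})

# race down the rows, gender across the columns
by_race = emp.pivot_table(index='race', columns='gender',
                          values='salary', aggfunc='mean')

# gender down the rows, race across the columns
by_gender = emp.pivot_table(index='gender', columns='race',
                            values='salary', aggfunc='mean')

print(by_race)
print(by_gender)
```

Picking the column with fewer unique values for `columns` tends to keep the table narrow enough to read at a glance.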

# Ask and then Answer your own questions
Use any of the datasets we have looked at so far.