In [1]:
import pandas as pd

### Pandas Crosstab
Pandas `crosstab` is very similar to pandas `pivot_table`. In fact, `crosstab` calls `pivot_table` in its [source code](https://github.com/pandas-dev/pandas/blob/d9fff2792bf16178d4e450fe7384244e50635733/pandas/core/reshape/pivot.py#L616).

You use crosstab when you want to turn two or more columns (row labels, column labels, and optionally values to aggregate) into a summary table. It's mostly used when your data does *not* start as a DataFrame, but rather as separate lists of items.

Examples we'll run through:
1. Simple crosstab exercises
2. Simple crosstab exercises with sum aggregate function
3. Exploring crosstab parameters

But first, let's start with a few lists describing purchases at restaurants in San Francisco:

In [2]:
res_names = ['FC', 'LL', 'FC', '5C', 'TS', 'FC', '5C']
purchase_type = ['Food', 'Food', 'Food', 'Drink', 'Food', 'Drink', 'Drink']
price = [12, 25, 32, 10, 15, 22, 18]

print('Restaurant Names: {}'.format(res_names))
print('Purchase Type: {}'.format(purchase_type))
print('Price: {}'.format(price))

Restaurant Names: ['FC', 'LL', 'FC', '5C', 'TS', 'FC', '5C']
Purchase Type: ['Food', 'Food', 'Food', 'Drink', 'Food', 'Drink', 'Drink']
Price: [12, 25, 32, 10, 15, 22, 18]


### 1. Simple crosstab exercises
When you create a crosstab table, you'll need to specify what you want on the rows, how to split the columns, and what you'd like to include in the values.

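Notice that I wrap `res_names` and `purchase_type` in a list before passing them to `index` and `columns`. Both parameters accept a list of arrays, which is how you would pass several grouping keys at once; a single array-like also works.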

In [3]:
pd.crosstab(index=[res_names], columns=[purchase_type])

| row_0 / col_0 | Drink | Food |
|---|---|---|
| 5C | 2 | 0 |
| FC | 1 | 2 |
| LL | 0 | 1 |
| TS | 0 | 1 |
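
As a quick check on what the call above is doing, here is a minimal sketch comparing three ways of getting the same counts: the wrapped lists, the plain lists, and a manual `groupby`. The DataFrame and its column names (`restaurant`, `purchase_type`) are made up for illustration and are not part of the original example.

```python
import pandas as pd

res_names = ['FC', 'LL', 'FC', '5C', 'TS', 'FC', '5C']
purchase_type = ['Food', 'Food', 'Food', 'Drink', 'Food', 'Drink', 'Drink']

# Wrapped in a list, as in the cell above
wrapped = pd.crosstab(index=[res_names], columns=[purchase_type])

# Plain lists, with no extra wrapping
plain = pd.crosstab(index=res_names, columns=purchase_type)

# The same counts by hand: group on both keys, count rows, pivot to wide form.
# The column names here are illustrative assumptions.
df = pd.DataFrame({'restaurant': res_names, 'purchase_type': purchase_type})
by_hand = (
    df.groupby(['restaurant', 'purchase_type'])
      .size()
      .unstack(fill_value=0)  # combinations that never occur become 0
)

print(plain)
print(by_hand)
```

All three tables should hold the same counts; crosstab is mostly a convenience for not having to build the DataFrame yourself.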


### 2. Simple crosstab exercises with sum aggregate function
By default (as in the example above), crosstab counts how often each combination of row and column value occurs. Notice that the '5C' and 'Drink' combination occurs twice, so it's listed as '2' in the values.

But what if we wanted to summarize the prices by summing them instead? You can do that by passing `values` and `aggfunc`. Combinations that never occur show up as NaN (missing) rather than 0.

In [4]:
pd.crosstab(index=[res_names], columns=[purchase_type], values=price, aggfunc='sum')

| row_0 / col_0 | Drink | Food |
|---|---|---|
| 5C | 28.0 | NaN |
| FC | 22.0 | 44.0 |
| LL | NaN | 25.0 |
| TS | NaN | 15.0 |
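
Since crosstab is built on pivot table, the same summed table can be reproduced with `pivot_table` once the lists are in a DataFrame. This is a minimal sketch of that equivalence; the DataFrame and its column names (`restaurant`, `purchase_type`, `price`) are assumptions made for illustration.

```python
import pandas as pd

res_names = ['FC', 'LL', 'FC', '5C', 'TS', 'FC', '5C']
purchase_type = ['Food', 'Food', 'Food', 'Drink', 'Food', 'Drink', 'Drink']
price = [12, 25, 32, 10, 15, 22, 18]

# Hypothetical DataFrame holding the same data as the three lists
df = pd.DataFrame({
    'restaurant': res_names,
    'purchase_type': purchase_type,
    'price': price,
})

# Route 1: pivot_table on the DataFrame
via_pivot = df.pivot_table(index='restaurant', columns='purchase_type',
                           values='price', aggfunc='sum')

# Route 2: crosstab on the DataFrame's columns
via_crosstab = pd.crosstab(index=df['restaurant'], columns=df['purchase_type'],
                           values=df['price'], aggfunc='sum')

# Optional: show missing combinations as 0 instead of NaN
filled = via_crosstab.fillna(0)

print(via_pivot)
print(filled)
```

If your data already lives in a DataFrame, `pivot_table` is usually the more direct call; crosstab is handy when you only have loose lists or arrays.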


### 3. Exploring crosstab parameters
Crosstab comes with many other parameters you can use. Check out the [documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html) for reference.

In [23]:
pd.crosstab(index=[res_names],
            columns=[purchase_type],
            values=price,
            aggfunc=lambda x: x.sum()**2, # Setting a custom agg function
            rownames=["Restaurants"], # Giving a title to my rows
            colnames=['Food Types'], # Giving a title to my columns
            margins=True, # Adding margins (Subtotals on the ends)
            margins_name="Totals") # Give my subtotals a title

| Restaurants / Food Types | Drink | Food | Totals |
|---|---|---|---|
| 5C | 784.0 | NaN | 784 |
| FC | 484.0 | 1936.0 | 4356 |
| LL | NaN | 625.0 | 625 |
| TS | NaN | 225.0 | 225 |
| Totals | 2500.0 | 7056.0 | 17956 |
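
One thing worth noticing in the table above: with a custom `aggfunc`, the margins are not the sum of the cells in that row or column. They appear to come from applying the function to all the underlying prices for that row or column. The sketch below checks this for the 'FC' row; the helper variable names are made up for illustration.

```python
import pandas as pd

res_names = ['FC', 'LL', 'FC', '5C', 'TS', 'FC', '5C']
purchase_type = ['Food', 'Food', 'Food', 'Drink', 'Food', 'Drink', 'Drink']
price = [12, 25, 32, 10, 15, 22, 18]

table = pd.crosstab(index=[res_names],
                    columns=[purchase_type],
                    values=price,
                    aggfunc=lambda x: x.sum() ** 2,  # same custom function as above
                    margins=True,
                    margins_name='Totals')

# All FC prices, regardless of purchase type: 12, 32, 22
fc_prices = [p for r, p in zip(res_names, price) if r == 'FC']

# aggfunc applied to every FC price at once: (12 + 32 + 22) ** 2
fc_margin_expected = sum(fc_prices) ** 2

# Simply adding the two FC cells instead: 484 + 1936
fc_cells_added = table.loc['FC', ['Drink', 'Food']].sum()

print(table.loc['FC', 'Totals'], fc_margin_expected, fc_cells_added)
```

So the 'Totals' row and column only read as plain subtotals when the aggregate function is additive, as `sum` is.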
