# Iowa whiskey sales

Let's look at whiskey sales in Iowa. This is a subset of the data from the [Iowa Liquor Sales dataset](https://data.iowa.gov/Sales-Distribution/Iowa-Liquor-Sales/m3tr-qhgy).

In [1]:
import icanexplain as ice

sales = ice.datasets.load_iowa_whiskey_sales()
sales.head().style.format()

,date,category,vendor,sales_amount,price_per_bottle,bottles_sold,bottle_volume_ml,year
0,2012-06-04,CANADIAN WHISKIES,"CONSTELLATION WINE COMPANY, INC.",94.02,15.67,6,1750,2012
1,2016-01-05,STRAIGHT BOURBON WHISKIES,CAMPARI(SKYY),18.76,9.38,2,375,2016
2,2016-05-25,CANADIAN WHISKIES,DIAGEO AMERICAS,11.03,11.03,1,300,2016
3,2016-01-20,CANADIAN WHISKIES,PHILLIPS BEVERAGE COMPANY,33.84,11.28,3,750,2016
4,2012-03-19,CANADIAN WHISKIES,"CONSTELLATION WINE COMPANY, INC.",94.02,15.67,6,1750,2012


The `sales_amount` column represents the bill a customer paid for a given transaction. We can group by year and sum it to see how the total sales amount evolves over time.

In [2]:
import locale

locale.setlocale(locale.LC_MONETARY, 'en_US.UTF-8')
def fmt_currency(x):
    return locale.currency(x, grouping=True)

(
    sales.groupby('year')['sales_amount']
    .sum()
    .to_frame()
    .assign(diff=lambda x: x.diff())
    .style.format(lambda x: fmt_currency(x) if x > 0 else '')
)

year,sales_amount,diff
2012,"$1,842,098.86",
2016,"$2,298,505.88","$456,407.02"
2020,"$3,378,164.43","$1,079,658.55"
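
The `fmt_currency` helper above depends on the `en_US.UTF-8` locale being installed; `locale.setlocale` raises `locale.Error` when it is not. As a fallback, here is a minimal locale-independent sketch. The name `fmt_currency_plain` is hypothetical and not part of the original notebook; it mirrors the cell above by returning an empty string for non-positive and missing values.

```python
def fmt_currency_plain(x):
    """Format a number as US dollars without relying on the system locale.

    Mirrors the notebook's behaviour: anything that is not strictly
    positive (zero, negative, NaN) is rendered as an empty string.
    """
    # NaN > 0 evaluates to False, so missing diffs fall through to ''.
    return f"${x:,.2f}" if x > 0 else ""


print(fmt_currency_plain(1842098.86))
print(repr(fmt_currency_plain(float("nan"))))
```

It could be passed to `.style.format(fmt_currency_plain)` in place of the lambda.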


OK, but why? We can use icanexplain to break down the evolution into two effects:

1. The inner effect: how much the average transaction value changed.
2. The mix effect: how much the number of transactions changed.
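
To build some intuition for these two effects, here is a small self-contained sketch on made-up numbers. It uses one common way of splitting a change in a sum into an average-value part and a count part. The formulas are an assumption for illustration and are not confirmed to be the exact ones icanexplain uses; the function name `sum_decomposition` is hypothetical.

```python
from statistics import mean


def sum_decomposition(prev, curr):
    """Split the change in total between two periods into (inner, mix).

    Assumed convention (not necessarily icanexplain's):
      inner = (mean_curr - mean_prev) * count_curr
      mix   = (count_curr - count_prev) * mean_prev
    With this convention, inner + mix equals sum(curr) - sum(prev).
    """
    inner = (mean(curr) - mean(prev)) * len(curr)
    mix = (len(curr) - len(prev)) * mean(prev)
    return inner, mix


# Made-up transaction amounts for a single category in two periods.
sales_prev = [10.0, 20.0, 30.0]        # 3 transactions, average 20
sales_curr = [25.0, 25.0, 25.0, 25.0]  # 4 transactions, average 25

inner, mix = sum_decomposition(sales_prev, sales_curr)
total_change = sum(sales_curr) - sum(sales_prev)

print(inner, mix, total_change)  # 20.0 20.0 40.0
```

In this toy case, half of the 40 increase comes from a higher average transaction value and half from the extra transaction.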

In [3]:
import icanexplain as ice

explainer = ice.SumExplainer(
    fact='sales_amount',
    period='year',
    group='category'
)
explanation = explainer(sales)
(
    explanation.style
    # Caveat: any non-positive value (zero or negative) is rendered as '$0'
    .format(lambda x: fmt_currency(x) if x > 0 else '$0')
    .set_properties(**{'text-align': 'right'})
)

year,category,inner,mix
2016,BLENDED WHISKIES,"$17,854.43","$7,356.77"
2016,CANADIAN WHISKIES,$0,"$225,902.66"
2016,CORN WHISKIES,$0,"$4,113.90"
2016,IRISH WHISKIES,"$22,144.48","$75,122.83"
2016,SCOTCH WHISKIES,"$19,591.97",$0
2016,SINGLE BARREL BOURBON WHISKIES,"$1,852.03","$6,375.43"
2016,STRAIGHT BOURBON WHISKIES,"$107,144.93","$97,934.50"
2016,STRAIGHT RYE WHISKIES,$0,$0
2020,BLENDED WHISKIES,"$83,342.60","$59,768.58"
2020,CANADIAN WHISKIES,"$224,022.62","$149,363.35"


For instance, we see that the change in the average transaction amount for blended whiskies contributed a $17,854 increase in sales from 2012 to 2016. This is the inner effect. The mix effect for blended whiskies, on the other hand, contributed a $7,357 increase in sales.

Here's another example: the mix effect of Canadian whiskies is $225,903. This is the increase attributable to the additional number of Canadian whisky transactions. The inner effect, on the other hand, is displayed as $0. Because the formatting function above renders every non-positive value as $0, this tells us that the average transaction value for Canadian whiskies did not increase between 2012 and 2016 (it either stayed flat or fell), and therefore didn't add to the increase in sales.

A visual way to interpret the above table is to use a waterfall chart. The idea is that the contributions sum to the difference between two periods. In this case, the difference in sales from 2012 to 2016 is $456,407. The waterfall chart shows how the inner and mix effects contributed to this difference.

In [4]:
explainer.plot(sales)
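
To make the waterfall idea concrete, the bookkeeping behind such a chart can be sketched in a few lines. The labels and numbers below are made up for illustration and are not taken from the table above, and this is not how `explainer.plot` is implemented. Each bar starts where the previous one ended, so the last bar ends at the starting total plus the sum of all contributions.

```python
from itertools import accumulate

# Made-up starting total and contributions (illustrative only).
start_total = 100.0
contributions = [
    ("A inner", 15.0),
    ("A mix", -5.0),
    ("B inner", 0.0),
    ("B mix", 30.0),
]

# Running totals: ends[i] is where bar i starts, ends[i + 1] where it stops.
ends = list(accumulate((value for _, value in contributions), initial=start_total))
bars = [
    (label, start, end)
    for (label, _), start, end in zip(contributions, ends, ends[1:])
]

for label, start, end in bars:
    print(f"{label:8s} {start:7.1f} -> {end:7.1f}")

end_total = ends[-1]
# The contributions account for the whole difference between the two totals.
print(end_total - start_total)
```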