GitHub - numericlee/BeerForecast

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
inputs		inputs
studies		studies
.Rhistory		.Rhistory
ABI_analysis.py		ABI_analysis.py
ABI_data_collation.py		ABI_data_collation.py
README.rmd		README.rmd

Repository files navigation

## Project Overview ##

This project was completed as part of a 48 hour hackathon sponsored by ABInBev.

The company presented disguised data on the monthly volume performance of various SKUs in 58 Markets over a 60 month period.
(The markets are described in the data as "Agencies.") The company also presents industry, demographic, and weather data. Hackathon participants were asked to forecast sales volume for the following month.

Separately, participants were asked to recommend two SKUs the company should launch in two new target markets.

We post the code used in our analysis but not our recommendations

### Forecast Methodology ###

#### Background/Data Exploration: ####

Unit Prices (measured in Sales per HL) have been rising (roughly 6%/year in USD terms) while organic volume has declined. Overall volume has remained steadier due to SKU proliferation and a few new Markets. We excluded from the analysis a few spurious data points (e.g. unit price low or negative)

#### Data Collation ####

One python module contains code which synthesizes data from various sources, excludes some observations, and identifies other datapoints which appear to be spurious.

### Model ###

Predictor = VolPred2 + Second Order Effects

*where*

VolPred2 = SAV (Market, SKU) x Monthly Seasonality

#### Seasonality ####

We derived monthly seasonality factors from organic (“same store”) volume in a subset of Market-SKU (“MS”) pairs where the company was active for the entire 60 month period. Note that industry seasonality differs from the industry at large. We assume the same seasonality applies at every Market and SKU.

The monthly factors utilized here capture most of the Holiday Effect

#### Seasonally Annualized Volume ####

For ~295 MS pairs with 40+ months of data, the model extrapolates volume for the 61st month (January 2018)using a linear regression of SAV 1against time trend, MS-specific unit price, Promotions, and MaxTemp. Estimate Jan.18 weather(Max_Temp) based on the preceding 5 Januaries for same Market. Estimate unit price with a separate linear regression of prices of the MS pair against time and seasonality (the monthly factors derived above did double duty.) Promotion spend is
projected at 1.151x 2017 monthly average

For the remaining 55 MS pairs with short history, use 83% of 2017 SAV. Many of these are being wound down.

#### Second Order Effects ####

While the monthly seasonality factors determined above were intended to capture the impact of holidays, a comparison of the predictions generated by the above two factors against actual volume indicated the model was not fully capturing the effect of Christmas and New Year’s. (The other holidays did not have statistically significant
impact.)

A second regression based on Volume – 1.0007 * Vol Pred2 found some of the residual could be explained by a combination of the two Holidays, unit price, Seasonally Annualized Volume (SAV)
and Max_temp. Industry volume may also explain some of the residual. If we had access to a good monthly industry forecast, we would want to retool our model.

#### Potential Refinements ####

- Our extrapolation of a Market and SKU currently reflects only the 60 monthly observations. The extrapolation model could give weight to the development of other comparable Markets and other SKUs in the same Market.
A related point: If an Market stops carrying (or de-emphasizes) another SKU, the remaining SKUs could benefit. (and vice versa)

- The impact of promotion almost certainly extends beyond the current period.

- Demand could be pulled forward or shifted back due to price changes or inventory effects. The model does not incorporate these catchup effects.

- Price history should be considered. Perhaps once consumers get acclimated to higher price (past sticker shock) demand recovers

- If two Markets cover the same territory, the success of one could impinge on the other

### Product Recommendations for New Markets: Methodology ###

- We analyzed the 12 SKUs with highest Revenue Per Capita in their respective Markets

- We regress SKU-specific Seasonally Adjusted Revenue/Population against Weather, Household Income, and Time Trend. Revenue was derived from Unit Price x Volume. Table 1 shows Monthly Revenue per Capita (USD Seas Adjusted) by SKU

- We sought to maximize revenue three years after product launch

#### Potential Refinements ####

- The analysis assumes demand for each product is independent of the others. Only a handful of Markets don’t carry all five of the company's largest SKUs. This means we can’t be sure how much revenue we would lose if we carried one of those products without the others, particularly if consumers see these SKUs as close substitutes. In the handful of Markets with different product lineups, there is no evidence there is a better 2 product combination. SKUs 18 and 22 have not demonstrated the same level of success as the flagship products.
- Some Markets have been carrying the company's products for less than five years and may not have achieved their full potential