# Where should a drinks company run promotions?

## 📖 Background
Your company owns a chain of stores across Russia that sell a variety of alcoholic drinks. The company recently ran a wine promotion in Saint Petersburg that was very successful. Due to the cost to the business, it isn’t possible to run the promotion in all regions. The marketing team would like to target 10 other regions whose buying habits are similar to Saint Petersburg's, where they would expect the promotion to be similarly successful.

### The data
The marketing team has provided historical sales volumes per capita for several drink types.

- "year" - year (1998-2016)
- "region" - name of a federal subject of Russia: an oblast, republic, krai, autonomous okrug, federal city, or the single autonomous oblast
- "wine" - wine sales in litres per capita per year
- "beer" - beer sales in litres per capita per year
- "vodka" - vodka sales in litres per capita per year
- "champagne" - champagne sales in litres per capita per year
- "brandy" - brandy sales in litres per capita per year
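
One possible way to read "similar buying habits" is as a small distance between regions in the space of the five drink columns. The sketch below is only an illustration of that idea and not necessarily the approach this notebook takes: it uses the six 1998 rows that appear in the tables of this notebook, standardizes each column (an assumption, so that beer's larger volumes do not dominate), and ranks regions by Euclidean distance to Saint Petersburg. The names `sample`, `z`, and `closest` are hypothetical.

```python
import numpy as np
import pandas as pd

drinks = ["wine", "beer", "vodka", "champagne", "brandy"]

# 1998 rows copied from the tables shown in this notebook (a tiny sample,
# not the full dataset).
sample = pd.DataFrame(
    [
        ["Republic of Adygea", 1.9, 8.8, 3.4, 0.3, 0.1],
        ["Altai Krai", 3.3, 19.2, 11.3, 1.1, 0.1],
        ["Amur Oblast", 2.1, 21.2, 17.3, 0.7, 0.4],
        ["Arkhangelsk Oblast", 4.3, 10.6, 11.7, 0.4, 0.3],
        ["Astrakhan Oblast", 2.9, 18.0, 9.5, 0.8, 0.2],
        ["Saint Petersburg", 2.7, 27.9, 12.3, 1.2, 0.5],
    ],
    columns=["region"] + drinks,
).set_index("region")

# Assumption: z-score each drink so every column contributes on the same scale.
z = (sample - sample.mean()) / sample.std()

# Euclidean distance from every region to Saint Petersburg.
target = z.loc["Saint Petersburg"]
distances = np.sqrt(((z - target) ** 2).sum(axis=1)).drop("Saint Petersburg")

# Smallest distance first; on the full data, head(10) would give ten candidates.
closest = distances.sort_values().head(3)
print(closest)
```

On the full dataset the same idea would usually be applied to per-region averages over several years rather than a single year, which is another choice left open here.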

In [1]:
import numpy as np
import pandas as pd

import seaborn as sns

pd.set_option("display.max_rows", 100)

## Reading dataset

In [2]:
alcohol_sales_df = pd.read_csv("../data/russian_alcohol_consumption.csv")

display(alcohol_sales_df.head())

Unnamed: 0,year,region,wine,beer,vodka,champagne,brandy
0,1998,Republic of Adygea,1.9,8.8,3.4,0.3,0.1
1,1998,Altai Krai,3.3,19.2,11.3,1.1,0.1
2,1998,Amur Oblast,2.1,21.2,17.3,0.7,0.4
3,1998,Arkhangelsk Oblast,4.3,10.6,11.7,0.4,0.3
4,1998,Astrakhan Oblast,2.9,18.0,9.5,0.8,0.2


### a. Dataset's shape

In [3]:
alcohol_sales_df.shape

(1615, 7)

### b. Calculating `total_consumption`

In [4]:
# Total per-capita sales across the five drink types (litres per year)
alcohol_sales_df["total_consumption"] = (
    alcohol_sales_df["wine"] + alcohol_sales_df["beer"] + alcohol_sales_df["vodka"]
    + alcohol_sales_df["champagne"] + alcohol_sales_df["brandy"]
)

display(alcohol_sales_df.head())

Unnamed: 0,year,region,wine,beer,vodka,champagne,brandy,total_consumption
0,1998,Republic of Adygea,1.9,8.8,3.4,0.3,0.1,14.5
1,1998,Altai Krai,3.3,19.2,11.3,1.1,0.1,35.0
2,1998,Amur Oblast,2.1,21.2,17.3,0.7,0.4,41.7
3,1998,Arkhangelsk Oblast,4.3,10.6,11.7,0.4,0.3,27.3
4,1998,Astrakhan Oblast,2.9,18.0,9.5,0.8,0.2,31.4
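
A row-wise `sum` is a common alternative to chaining `+`, and the two differ in how they treat missing values. The sketch below is a minimal illustration, assuming the dataset might contain gaps (that has not been checked at this point): the first two rows come from the table above, while the third row is invented purely to show the behaviour.

```python
import numpy as np
import pandas as pd

drinks = ["wine", "beer", "vodka", "champagne", "brandy"]

df = pd.DataFrame(
    [
        [1.9, 8.8, 3.4, 0.3, 0.1],     # Republic of Adygea, 1998 (from the table)
        [3.3, 19.2, 11.3, 1.1, 0.1],   # Altai Krai, 1998 (from the table)
        [2.0, np.nan, 5.0, 0.5, 0.2],  # hypothetical row with a missing beer value
    ],
    columns=drinks,
)

# Chained addition: any missing drink makes the whole total NaN.
chained = df["wine"] + df["beer"] + df["vodka"] + df["champagne"] + df["brandy"]

# Row-wise sum that keeps the same NaN behaviour as chained addition.
strict = df[drinks].sum(axis=1, skipna=False)

# Default row-wise sum: missing values are skipped, so the total is understated.
lenient = df[drinks].sum(axis=1)

print(pd.DataFrame({"chained": chained, "strict": strict, "lenient": lenient}))
```

Which behaviour is preferable depends on whether a partial total is meaningful for the analysis; a NaN total at least makes incomplete rows visible.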


## Analyzing sales in Saint Petersburg

The problem statement tells us that the promotion the company ran in **Saint Petersburg** was successful.

**We will try to answer the following question:**

### What factor led to the success of the promotion in Saint Petersburg?

In [5]:
saint_petersburg_df = alcohol_sales_df[alcohol_sales_df["region"] == "Saint Petersburg"]

display(saint_petersburg_df.head())

Unnamed: 0,year,region,wine,beer,vodka,champagne,brandy,total_consumption
59,1998,Saint Petersburg,2.7,27.9,12.3,1.2,0.5,44.6
144,1999,Saint Petersburg,2.6,57.4,13.0,1.7,0.6,75.3
229,2000,Saint Petersburg,4.4,68.2,14.7,2.0,0.9,90.2
314,2001,Saint Petersburg,6.2,101.0,15.5,2.4,0.8,125.9
399,2002,Saint Petersburg,6.3,104.6,17.2,2.6,0.9,131.6
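
One way to start on this question is to look at the mix of drinks rather than the raw volumes. The sketch below is a hedged illustration using only the five Saint Petersburg rows shown above: it divides each drink by the row total to get its share of consumption. The names `spb` and `shares` are hypothetical, and whether the drink mix actually explains the promotion's success is an open assumption.

```python
import pandas as pd

drinks = ["wine", "beer", "vodka", "champagne", "brandy"]

# Saint Petersburg rows for 1998-2002, copied from the table above.
spb = pd.DataFrame(
    [
        [1998, 2.7, 27.9, 12.3, 1.2, 0.5],
        [1999, 2.6, 57.4, 13.0, 1.7, 0.6],
        [2000, 4.4, 68.2, 14.7, 2.0, 0.9],
        [2001, 6.2, 101.0, 15.5, 2.4, 0.8],
        [2002, 6.3, 104.6, 17.2, 2.6, 0.9],
    ],
    columns=["year"] + drinks,
).set_index("year")

# Share of each drink in the yearly per-capita total (each row sums to 1).
shares = spb.div(spb.sum(axis=1), axis=0)

print(shares.round(3))
```

In these five rows beer has the largest share every year, while wine stays a small fraction of the total, so comparing regions on shares and on raw litres may give different pictures.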
