## Price Elasticity of Demand (cross-price elasticity follows price elasticity)

In the following analysis, we use Dan Murphy products as the main sample for the price elasticity analysis. The longer-term goal is a model that can be applied to every product category across the business.

**Hypothesis Proposed**
   
- Based on the 2021 Dan Murphy Craft Beer sales sample data, is impression demand sensitive to changes in a product's own price? If so, how sensitive is it?

**Machine Learning Model**
    
- Linear Regression

**Price Elasticity Formula**

- The price elasticity of demand is defined as the percentage change in quantity demanded divided by the percentage change in price (OECD, 2003). In this model, price elasticity measures how sensitive impression demand is to a change in price. It is evaluated at the sample means, with the regression slope supplying the change in quantity per unit change in price:

   **(Change in quantity / Change in price) * (Price mean / Quantity mean)** (Doe, 2019)
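
As a minimal worked sketch of this calculation, the snippet below fits a straight line to a handful of made-up (price, quantity) pairs and scales the slope by mean price over mean quantity. The numbers are illustrative assumptions, not Dan Murphy data, and `np.polyfit` stands in for the regression step.

```python
import numpy as np

# Hypothetical observations for one product (not taken from the sales sample)
price = np.array([20.0, 21.0, 22.0, 23.0, 24.0])
quantity = np.array([1000.0, 950.0, 900.0, 850.0, 800.0])

# Slope of quantity on price: change in quantity per one-dollar change in price
slope, intercept = np.polyfit(price, quantity, deg=1)

# Elasticity at the means: slope * (mean price / mean quantity)
price_elasticity = slope * price.mean() / quantity.mean()

print(round(slope, 2), round(price_elasticity, 3))
```

With these numbers the slope is -50 and the elasticity is about -1.22, so at the means a 1% price increase is associated with a drop in quantity of roughly 1.22%.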
   
## Content

### Price Elasticity

- **3.1.1 Sample Selection**
- **3.1.2 Sample Imputation**
- **3.1.3 Linear Regression Model**
- **3.1.4 Price Elasticity Null Hypothesis**
- **3.1.5 Price Elasticity Results**

### Cross-Price Elasticity Matrix

- **3.2.1 Cross-Price Elasticity Definition**
- **3.2.2 Cross-Price Elasticity Matrix Function (Multi Linear Regression)**
- **3.2.3 Cross-Price Elasticity 12 MacBook (Mid 2017, Silver) Case**
- **3.2.4 Cross-Price Elasticity 12 MacBook (Mid 2017, Silver) Conclusion**


In [235]:
%matplotlib inline

from statsmodels.compat import lzip
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import statsmodels.api as sm
from statsmodels.formula.api import ols
import csv


In [236]:
beer = pd.read_csv('beer.csv')

In [237]:
beer.head(10)

|   | brand | product | size | q1 | q1_sales | q2 | q2_sales | q3 | q3_sales | q4 | q4_sales |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Stone & Wood | Pacific Ale Bottles 330mL | 6 | $24.49 | 21819 | $24.49 | 19134 | $24.49 | 21826 | $24.49 | 15137 |
| 1 | James Squire | One Fifty Lashes Pale Ale Bottles 345mL | 6 | $22.99 | 19836 | $22.99 | 17690 | $22.99 | 21701 | $22.99 | 19772 |
| 2 | Little Creatures | Pale Ale Bottles 330mL | 6 | $21.95 | 21978 | $21.95 | 17007 | $21.95 | 16613 | $21.95 | 16784 |
| 3 | Balter | XPA Cans 375mL | 4 | $18.99 | 21079 | $18.99 | 15369 | $18.99 | 15261 | $18.99 | 20366 |
| 4 | Mountain Goat | Very Enjoyable Beer Cans 375mL | 6 | $18.99 | 18231 | $18.99 | 16390 | $18.99 | 16251 | $18.99 | 21239 |
| 5 | Furphy | Refreshing Ale Bottles 375mL | 6 | $20.99 | 18980 | $20.99 | 16750 | $20.99 | 21439 | $20.99 | 17814 |
| 6 | Burleigh | Big Head No Carb Beer 330mL | 6 | $21.95 | 18773 | $21.95 | 21194 | $21.95 | 16165 | $21.95 | 17109 |
| 7 | Young Henrys | Newtowner Pale Ale Cans 375mL | 6 | $21.45 | 16300 | $21.45 | 18018 | $21.45 | 15534 | $21.45 | 15106 |
| 8 | Gage Roads | Single Fin Summer Ale Bottles 330mL | 6 | $19.95 | 20299 | $19.95 | 19876 | $19.95 | 19806 | $19.95 | 18910 |
| 9 | 4 Pines | Pale Ale Bottles 330mL | 6 | $19.95 | 17544 | $19.95 | 21325 | $19.95 | 18370 | $19.95 | 19873 |


In [238]:
beer.isnull().sum()

brand       0
product     0
size        0
q1          0
q1_sales    0
q2          0
q2_sales    0
q3          0
q3_sales    0
q4          0
q4_sales    0
dtype: int64

In [239]:
# Change headers to slugs

In [240]:
## Regular expression to extract the numeric price from each quarter's price column

In [241]:
import re

In [242]:
# Extract the numeric part of the Q1 price string (e.g. '$24.49' -> '24.49').
# The result is still a string (dtype object), not a float.

beer['q1_price'] = beer['q1'].str.extract(r"(\d+\.\d+)", expand=True)
beer['q1_price']

0     24.49
1     22.99
2     21.95
3     18.99
4     18.99
5     20.99
6     21.95
7     21.45
8     19.95
9     19.95
10    18.99
11    14.95
12    18.99
13    19.99
14    20.99
15    18.99
16    21.49
17    18.99
18    18.99
19    23.39
20    26.49
21    23.99
22    21.49
23    24.99
24    24.99
25    23.99
26    19.99
27    21.49
28    15.99
Name: q1_price, dtype: object

In [243]:
# Extract the numeric part of the Q2 price string (e.g. '$24.49' -> '24.49').
# The result is still a string (dtype object), not a float.

beer['q2_price'] = beer['q2'].str.extract(r"(\d+\.\d+)", expand=True)
beer['q2_price']

0     24.49
1     22.99
2     21.95
3     18.99
4     18.99
5     20.99
6     21.95
7     21.45
8     19.95
9     19.95
10    18.99
11    14.95
12    18.99
13    19.99
14    20.99
15    18.99
16    21.49
17    18.99
18    18.99
19    23.39
20    26.49
21    23.99
22    21.49
23    24.99
24    24.99
25    23.99
26    19.99
27    21.49
28    15.99
Name: q2_price, dtype: object

In [244]:
# Extract the numeric part of the Q3 price string (e.g. '$24.49' -> '24.49').
# The result is still a string (dtype object), not a float.

beer['q3_price'] = beer['q3'].str.extract(r"(\d+\.\d+)", expand=True)
beer['q3_price']

0     24.49
1     22.99
2     21.95
3     18.99
4     18.99
5     20.99
6     21.95
7     21.45
8     19.95
9     19.95
10    18.99
11    14.95
12    18.99
13    19.99
14    20.99
15    18.99
16    21.49
17    18.99
18    18.99
19    23.39
20    26.49
21    23.99
22    21.49
23    24.99
24    24.99
25    23.99
26    19.99
27    21.49
28    15.99
Name: q3_price, dtype: object

In [245]:

# Extract the numeric part of the Q4 price string (e.g. '$24.49' -> '24.49').
# The result is still a string (dtype object), not a float.

beer['q4_price'] = beer['q4'].str.extract(r"(\d+\.\d+)", expand=True)
beer['q4_price']

0     24.49
1     22.99
2     21.95
3     18.99
4     18.99
5     20.99
6     21.95
7     21.45
8     19.95
9     19.95
10    18.99
11    14.95
12    18.99
13    19.99
14    20.99
15    18.99
16    21.49
17    18.99
18    18.99
19    23.39
20    26.49
21    23.99
22    21.49
23    24.99
24    24.99
25    23.99
26    19.99
27    21.49
28    15.99
Name: q4_price, dtype: object
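
The four extraction cells above repeat the same step, and each leaves the price as a string. A compact sketch of the same idea, assuming a small stand-in frame named `beer_demo` (hypothetical, not the real `beer` data), loops over the quarters and converts the extracted text to floats in one pass:

```python
import pandas as pd

# Hypothetical two-row frame mimicking the quarterly price columns
beer_demo = pd.DataFrame({
    "q1": ["$24.49", "$22.99"],
    "q2": ["$24.49", "$22.99"],
    "q3": ["$24.49", "$22.99"],
    "q4": ["$24.49", "$22.99"],
})

for quarter in ["q1", "q2", "q3", "q4"]:
    # expand=False returns a Series; astype(float) turns '24.49' into 24.49
    beer_demo[f"{quarter}_price"] = (
        beer_demo[quarter].str.extract(r"(\d+\.\d+)", expand=False).astype(float)
    )

print(beer_demo.dtypes)
```

Converting to `float` at this point means later numeric checks such as `np.isnan` operate on numbers rather than on text.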

In [246]:
import warnings
warnings.filterwarnings('ignore')

In [247]:
beer['title'] = beer['brand'] + ' ' + beer['product']

In [248]:
beer['date'] = '31-03-2021'

In [249]:
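# Total units sold across the four quarters
beer['quantity'] = beer['q1_sales'] + beer['q2_sales'] + beer['q3_sales'] + beer['q4_sales']
# The extracted price is a string, so convert it to a float before any numeric work
beer['price'] = beer['q1_price'].astype(float)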
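
Summing the four quarters leaves a single (price, quantity) pair per product, while a regression needs several observations per product. One possible alternative, sketched under the assumption of a small stand-in frame named `wide` with hypothetical product labels, keeps one row per product per quarter:

```python
import pandas as pd

# Hypothetical wide frame: one row per product, one price/sales pair per quarter
wide = pd.DataFrame({
    "title": ["Product A", "Product B"],
    "q1_price": [24.49, 22.99], "q1_sales": [21819, 19836],
    "q2_price": [24.49, 22.99], "q2_sales": [19134, 17690],
    "q3_price": [24.49, 22.99], "q3_sales": [21826, 21701],
    "q4_price": [24.49, 22.99], "q4_sales": [15137, 19772],
})

# Stack the quarters so each product contributes four observations
frames = []
for quarter in ["q1", "q2", "q3", "q4"]:
    frames.append(pd.DataFrame({
        "title": wide["title"],
        "quarter": quarter,
        "price": wide[f"{quarter}_price"],
        "quantity": wide[f"{quarter}_sales"],
    }))
long = pd.concat(frames, ignore_index=True)

print(long)
```

In the sample shown in this notebook the price of each product is identical in all four quarters, so even the long layout has no price variation to regress on. A slope can only be estimated for products whose price actually changes between observations.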

In [250]:
# Keep the first three products and only the columns needed for the regression
small_set = beer.iloc[:3][['title', 'date', 'quantity', 'price']]

In [255]:
small_set

|   | title | date | quantity | price |
|---|---|---|---|---|
| 0 | Stone & Wood Pacific Ale Bottles 330mL | 31-03-2021 | 77916 | 24.49 |
| 1 | James Squire One Fifty Lashes Pale Ale Bottles... | 31-03-2021 | 78999 | 22.99 |
| 2 | Little Creatures Pale Ale Bottles 330mL | 31-03-2021 | 72382 | 21.95 |


In [256]:
# Build a dataframe of prices (x values): one row per date, one column per product
x_pivot = small_set.pivot(index='date', columns='title', values='price')
x_values = pd.DataFrame(x_pivot.to_records())
print(x_values)

         date James Squire One Fifty Lashes Pale Ale Bottles 345mL  \
0  31-03-2021                                              22.99     

  Little Creatures Pale Ale Bottles 330mL  \
0                                   21.95   

  Stone & Wood Pacific Ale Bottles 330mL  
0                                  24.49  


In [257]:
#Format and build a dataframe with y_values for each product within the category
y_pivot = small_set.pivot( index='date', columns='title' ,values='quantity' )
y_values = pd.DataFrame(y_pivot.to_records())
print(y_values)

         date  James Squire One Fifty Lashes Pale Ale Bottles 345mL  \
0  31-03-2021                                              78999      

   Little Creatures Pale Ale Bottles 330mL  \
0                                    72382   

   Stone & Wood Pacific Ale Bottles 330mL  
0                                   77916  


In [261]:
results_values = {
    "title": [],
    "price_elasticity": [],
    "price_mean": [],
    "quantity_mean": [],
    "intercept": [],
    "t_score": [],
    "slope": [],
    "coefficient_pvalue": [],
}

# Pair each product's prices (x) with its quantities (y)
for column in x_values.columns[1:]:
    # The extracted prices are strings (dtype object), and np.isnan raises a
    # TypeError on them, so coerce to numeric and drop incomplete pairs instead
    df = pd.DataFrame({
        "x_value": pd.to_numeric(x_values[column], errors="coerce"),
        "y_value": pd.to_numeric(y_values[column], errors="coerce"),
    }).dropna()

    # A slope cannot be tested without at least three observations and some
    # variation in price, so skip products that do not have both
    if len(df) < 3 or df["x_value"].nunique() < 2:
        continue

    # Linear regression model: quantity = intercept + slope * price
    x_value = df["x_value"]
    y_value = df["y_value"]
    X = sm.add_constant(x_value)
    result = sm.OLS(y_value, X).fit()

    # Null hypothesis test: keep the product only if the p-value is below 0.05
    if result.f_pvalue < 0.05:
        coefficient_pvalue = result.f_pvalue
        intercept, slope = result.params
        mean_price = np.mean(x_value)
        mean_quantity = np.mean(y_value)
        t_intercept, t_score = result.tvalues

        # Price elasticity formula, evaluated at the sample means
        price_elasticity = slope * (mean_price / mean_quantity)

        # Append results into the dictionary used to build the dataframe
        results_values["title"].append(column)
        results_values["price_elasticity"].append(price_elasticity)
        results_values["price_mean"].append(mean_price)
        results_values["quantity_mean"].append(mean_quantity)
        results_values["intercept"].append(intercept)
        results_values["t_score"].append(t_score)
        results_values["slope"].append(slope)
        results_values["coefficient_pvalue"].append(coefficient_pvalue)

final_df = pd.DataFrame.from_dict(results_values)
df_elasticity = final_df[['title', 'price_elasticity', 't_score', 'coefficient_pvalue', 'slope', 'price_mean', 'quantity_mean', 'intercept']]
df_elasticity

### References
- (Amazon, 2021) Ilya Katsov, Algorithmic Marketing
- (Cabada) Ileana Cabada, Medium post (https://towardsdatascience.com/identifying-your-price-competitors-with-cross-price-elasticities-a-practical-approach-26c19f12b1ee)
- (Doe, 2019) John Doe, Cost and Economics in Pricing Strategy, University of Virginia
- (OECD, 2003) OECD (https://stats.oecd.org/glossary/detail.asp?ID=3206)