# Customer Analytics in FMCG Industry (Part 2)
#### by Sooyeon Won

### Keywords 

- Marketing Mix 
- STP framework
- Purchase Analytics 
- Predictive Analysis
- Price Elasticity
- Modeling Brand Choice


### Contents 

<ul>
<li><a href="#Introduction">1. Introduction</a></li> 
<li><a href="#Preparation">2. Data Preparation</a></li>
<li><a href="#Exploration">3. Data Exploration</a></li>
<li><a href="#Analysis">4. Data Analysis</a></li>
&emsp;4.1. Customer Analytics<br>
&emsp;4.2. Purchase Analytics <br>
&emsp;&emsp;&emsp;i. Descriptive Analyses by Segment <br>
&emsp;&emsp;&emsp;&emsp;&emsp; i-1. The Proportion of each Segment <br>
&emsp;&emsp;&emsp;&emsp;&emsp; i-2. Purchase Occasions and Purchase Incidences <br>
&emsp;&emsp;&emsp;&emsp;&emsp; i-3. Brand Choice <br>
&emsp;&emsp;&emsp;&emsp;&emsp; i-4. Revenue Comparison between segments <br>
&emsp;&emsp;&emsp;ii. Predictive Analyses <br>
&emsp;&emsp;&emsp;&emsp;&emsp; ii-1. Modeling Purchase Incidence<br>
&emsp;&emsp;&emsp;&emsp;&emsp; ii-2. Modeling Brand Choice <br>
&emsp;&emsp;&emsp;&emsp;&emsp; ii-3. Modeling Purchase Quantity <br>
<li><a href="#Conclusion">5. Conclusion</a></li>
</ul>


## 4. Data Analysis
### 4.2. Purchase Analytics
- Data Preparation from Part 1
- ii. Predictive Analyses <br>
&emsp;&emsp;&emsp; ii-1. Modeling Purchase Incidence<br>
&emsp;&emsp;&emsp; **ii-2. Modeling Brand Choice** <br>
&emsp;&emsp;&emsp; ii-3. Modeling Purchase Quantity <br>

### Data Preparation from Part 1

In [None]:
# Import the relevant libraries 
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt 
import matplotlib.axes as axs
%matplotlib inline
plt.rc("font", size=14)
import seaborn as sns
sns.set()
sns.set(style="whitegrid", color_codes=True)

from sklearn.preprocessing import StandardScaler 
from sklearn.decomposition import PCA 
from sklearn.cluster import KMeans 

# Import pickle in order to be able to load our pickled objects.
import pickle

# Import the Logistic Regression module from scikit-learn for the purchase probability model.
from sklearn.linear_model import LogisticRegression 

In [None]:
# Load data
purchase_df = pd.read_csv('purchase data.csv', index_col = 0)
price_elasticities = pd.read_csv('price_elasticities.csv', index_col = 0)

# Import Scaler, PCA, K-Means
scaler = pickle.load(open('scaler.pickle', 'rb'))
pca = pickle.load(open('pca.pickle', 'rb'))
kmeans_pca = pickle.load(open('kmeans_pca.pickle', 'rb'))

# Standardisation
features = purchase_df[['Sex', 'Marital status', 'Age', 'Education', 'Income', 'Occupation', 'Settlement size']]
purchase_segm_std = scaler.transform(features)
# Apply PCA
purchase_segm_pca = pca.transform(purchase_segm_std)
# Segment data
purchase_segm_kmeans_pca = kmeans_pca.predict(purchase_segm_pca)

# Create a copy of the data frame
purchase_predictors = purchase_df.copy()

# Add segment labels
purchase_predictors['Segment'] = purchase_segm_kmeans_pca
segment_dummies = pd.get_dummies(purchase_segm_kmeans_pca, prefix = 'Segment', prefix_sep = '_')
purchase_predictors = purchase_predictors.reset_index()
purchase_predictors = purchase_predictors.merge(segment_dummies, how='inner',
                                                left_on = purchase_predictors.index,
                                                right_on = segment_dummies.index)

In [None]:
df_pa = purchase_predictors.set_index('ID').iloc[:,1:].copy()
df_pa.head() # Data is ready to conduct predictive analysis 

### ii. Predictive Analysis 
In this section, I focus on predictive analytics and use machine learning models to estimate price elasticities. The scenario in this part of the analysis is that customers are in the store and have already decided to purchase the product we are interested in; the open question is which brand they will choose. Furthermore, I analyse how the customer's choice changes as competitors' prices change.

### ii-2. Modeling Brand Choice <br>
<li><a href="#BrandChoice">ii-2-1. Brand Choice</a></li>
<li><a href="#OPEB5">ii-2-2. Own Price Elasticity Brand 5</a></li>
<li><a href="#CPEB54">ii-2-3. Cross Price Elasticity Brand 5, Cross Brand 4</a></li> 
<li><a href="#OaCPEbySeg">ii-2-4. Own and Cross-Price Elasticity by Segment</a></li> 

> The chocolate brand variable takes values from 0 to 5, indicating which of the 5 brands was chosen. The entry is 0 if no purchase was made during the particular shopping trip. To investigate brand choice, I focus on the entries where the brand is between 1 and 5. By doing so, we can be sure that a purchase was made and a certain brand was preferred. <br><br>
The final goal of the brand choice model is to determine the probability of choosing a certain brand. This helps marketers analyse their customers' behaviour. Based on the analysis, marketers can increase sales and also improve customer satisfaction. I use a logistic regression algorithm again. This time I apply it to a multi-class scenario, because there are now 5 classes (brands). This approach is known as **"Multinomial Logistic Regression"**.
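
To make the multinomial model concrete, the sketch below shows how such a classifier turns one linear score per brand into choice probabilities through the softmax function. The coefficient matrix, intercepts, and prices are made-up illustrative values, not estimates from this data set, and the function name is hypothetical.

```python
import numpy as np

def brand_probabilities(prices, coef, intercept):
    """Softmax over one linear score per brand (multinomial logistic regression)."""
    scores = coef @ prices + intercept   # one score per brand
    scores = scores - scores.max()       # stabilise the exponentials
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

# Hypothetical values for illustration only (rows: brands, columns: prices).
toy_coef = np.array([
    [-3.0,  0.5,  0.4,  0.3,  0.2],
    [ 0.6, -2.5,  0.3,  0.2,  0.1],
    [ 0.4,  0.3, -2.0,  0.2,  0.1],
    [ 0.3,  0.2,  0.2, -1.5,  0.4],
    [ 0.2,  0.1,  0.1,  0.5, -1.0],
])
toy_intercept = np.zeros(5)
toy_prices = np.array([1.4, 1.8, 2.0, 2.2, 2.7])

toy_probs = brand_probabilities(toy_prices, toy_coef, toy_intercept)
print(toy_probs.round(3), toy_probs.sum())
```

The five probabilities always sum to one, which is why a price change for one brand necessarily shifts the choice probabilities of the others.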


<a id='BrandChoice'></a>
### ii-2-1. Brand Choice

In [None]:
brand_choice = df_pa.query('Incidence ==1')
brand_choice.head()

In [None]:
# The target variable of the model is "brand".
output = brand_choice['Brand']
# Predict based on the prices for the five brands.
features = ['Price_1', 'Price_2', 'Price_3', 'Price_4', 'Price_5']
inputs = brand_choice[features]
# Brand Choice Model fit.
model_brand_choice = LogisticRegression(solver = 'sag', multi_class = 'multinomial')
model_brand_choice.fit(inputs, output)

> As in the previous analysis, I predict the brand based on price, but now we are interested in the prices of each individual brand and the interactions between them.

In [None]:
# Turn off scientific notation in pandas 
pd.set_option('display.float_format', lambda x: '%.2f' % x)
# Make some transformations on the coefficients data frame to increase readability.
# Transpose the data frame, to keep with the conventional representation of results.
# Add labels for the columns and the index, which represent the coefficients of the brands and prices, respectively. 
coefficients = ['Coef_Brand_1', 'Coef_Brand_2', 'Coef_Brand_3', 'Coef_Brand_4', 'Coef_Brand_5']
prices = ['Price_1', 'Price_2', 'Price_3', 'Price_4', 'Price_5']
bc_coef = pd.DataFrame(np.transpose(model_brand_choice.coef_), columns = coefficients, index = prices)

bc_coef

> The dataframe "bc_coef" presents the beta values. 
>- "Coef_Brand_1": For Brand 1, the coefficient on its own price is negative (-3.92), while the coefficients on the other brands' prices are positive, except for Brand 5. We know that the higher the price of the own product, the lower the probability of it being purchased. So it makes sense for the own brand price coefficient to be negative.
>- On the other hand, the more the price of a competitor increases, the higher the probability of customers switching to our own brand. Hence there is a positive relationship between the purchase probability of our own brand (Brand 5) and a competitor brand increasing its price. 
>- The choice probability for the own brand and the choice probabilities for all the other brands are clearly interrelated: a marketing mix tool applied to the own brand affects not only the choice probability for that brand but the choice probabilities for all other brands as well. These effects are known as **"Own Brand Effects"** and **"Cross Brand Effects"**. Throughout this analysis, I examine both in more detail.


<a id='OPEB5'></a>
### ii-2-2. Own Price Elasticity Brand 5

In this section, I take Brand 5 as the own product to be analysed. I then build on the findings to develop a strategy for targeting customers. 

First, I examine the effect of changes in the own brand's price. Then I explore the effect of competitors' price changes on Brand 5. This information is contained in the Brand 5 own price elasticity and the cross price elasticities, respectively. 

For this analysis, I chose the Brand 5 chocolate product, the most expensive brand. Like before, in order to estimate the brand elasticities, I generate a dataframe that contains the same features used to fit the brand choice model. I vary the price of Brand 5 over a range of price points and apply the `predict_proba` method of the brand choice model to obtain the predicted purchase probability of Brand 5 at each point.
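
The own-price elasticity computed in the cells that follow has the form beta × price × (1 − Pr(own brand)). As a sanity check on that expression, here is a minimal sketch that compares it with a finite-difference estimate. It assumes a conditional-logit form with a single shared price coefficient, which is the setting where this closed form holds exactly; all numbers and names are hypothetical and unrelated to the fitted model.

```python
import numpy as np

def choice_probabilities(prices, intercepts, beta):
    """Choice probabilities of a conditional logit with one shared price coefficient."""
    utilities = intercepts + beta * prices
    exp_u = np.exp(utilities - utilities.max())  # subtract the max for numerical stability
    return exp_u / exp_u.sum()

# Hypothetical values for illustration only.
toy_prices = np.array([1.4, 1.8, 2.0, 2.2, 2.7])
toy_intercepts = np.array([0.5, 0.2, 0.0, 0.3, 0.1])
toy_beta = -1.5
own = 4  # position of the own brand (Brand 5)

# Closed form: beta * price * (1 - Pr(own brand)).
pr_own = choice_probabilities(toy_prices, toy_intercepts, toy_beta)[own]
closed_form = toy_beta * toy_prices[own] * (1 - pr_own)

# Finite-difference check: % change in probability per % change in own price.
step = 1e-6
bumped = toy_prices.copy()
bumped[own] += step
pr_bumped = choice_probabilities(bumped, toy_intercepts, toy_beta)[own]
numeric = (pr_bumped - pr_own) / step * toy_prices[own] / pr_own

print(closed_form, numeric)
```

The two numbers should agree closely, and both are negative: raising the own price lowers the own purchase probability.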

In [None]:
price_range = np.arange(0.5, 3.5, 0.01)

In [None]:
# Calculate price elasticity of brand choice.
# Create a data frame with price columns, which the model will use to predict the brand choice probabilities.
own_brand_5 = pd.DataFrame( {'Price_1':  brand_choice['Price_1'].mean(),
                             'Price_2':  brand_choice['Price_2'].mean(), 
                             'Price_3':  brand_choice['Price_3'].mean(),
                             'Price_4':  brand_choice['Price_4'].mean(),
                             'Price_5':  price_range} , index = np.arange(price_range.size))
own_brand_5.head()

In [None]:
# Brand Choice Model prediction.
pred_brand_5 = model_brand_choice.predict_proba(own_brand_5)

# The model returns the probabilities of choosing each of the 5 brands. 
# Since, we are interested in the probability for the fifth brand we need to obtain the last column located on position 4,
# as we're starting to count from 0.
pred_own_brand_5 = pred_brand_5[: ][:, 4]

# We're interested in choosing brand 5. 
# Therefore, the beta coefficient we require is that of the brand 5 coefficient and price 5.
beta5 = bc_coef.iloc[4, 4]
np.round(beta5, 4)

In [None]:
own_price_elasticity_brand_5 = beta5 * price_range * (1 - pred_own_brand_5)
price_elasticities.loc['Brand5_own_Avg'] = own_price_elasticity_brand_5
price_elasticities

In [None]:
# Plot elasticities of purchase probability for brand 5.
plt.figure(figsize = (9, 6))
plt.plot(price_range, own_price_elasticity_brand_5, color = 'grey')
plt.xlabel('Price 5')
plt.ylabel('Elasticity')
plt.title('Own Price Elasticity of Purchase Probability for Brand 5')

<a id='CPEB54'></a>
### ii-2-3. Cross Price Elasticity Brand 5, Cross Brand 4

Now, what would happen to the purchase probability of Brand 5 if a competitor changed its pricing? Among the other brands, I compare Brand 5 with Brand 4. The reasoning is that Brand 5, the most expensive chocolate, is likely (though certainly not always) among the highest in quality. Under this assumption, Brand 4 appears to be the closest to Brand 5, so it makes the most sense to compare these two. I determine the cross price elasticities of "Brand 5" with respect to "Brand 4".

In [None]:
# Column names match the features the model was fitted on (Price_1 ... Price_5),
# because scikit-learn validates feature names at prediction time.
brand5_cross_brand4 = pd.DataFrame({'Price_1':  brand_choice['Price_1'].mean(),
                                    'Price_2':  brand_choice['Price_2'].mean(), 
                                    'Price_3':  brand_choice['Price_3'].mean(),
                                    'Price_4':  price_range,
                                    'Price_5':  brand_choice['Price_5'].mean()}, index = np.arange(price_range.size))

brand5_cross_brand4.head()

In [None]:
# Compute the choice probabilities with the brand choice model fitted earlier.
pred_brand5_cross_brand4 = model_brand_choice.predict_proba(brand5_cross_brand4)

In [None]:
# The probability of choosing the competitor brand 
# Select the purchase probability for brand 4, contained in the 4th column with index 3. 
pred_brand_4 = pred_brand5_cross_brand4[:][:, 3]

In [None]:
# We're interested in choosing brand 5. 
# Therefore, the beta coefficient we require is that of the brand 5 coefficient and price 5.
np.round(beta5, 4)

[Price Elasticity - Reference](https://365datascience.com/price-elasticity/)
$$ E = -\beta_{\text{own product price}} \times \text{Price}_{\text{cross brand}} \times \Pr(\text{cross brand}) $$ 

The price elasticity of the brand choice probability with respect to a competitor's price equals the negative of the own brand's price coefficient, multiplied by the price of the cross brand, and further multiplied by the probability of choosing the cross brand.
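
The cross-price formula can be checked numerically in the same spirit. The sketch below again assumes a conditional logit with one shared price coefficient (the setting where the expression is exact) and uses hypothetical values throughout; it is an illustration, not the fitted brand choice model.

```python
import numpy as np

def logit_probabilities(prices, intercepts, beta):
    """Choice probabilities of a conditional logit with one shared price coefficient."""
    utilities = intercepts + beta * prices
    exp_u = np.exp(utilities - utilities.max())  # subtract the max for numerical stability
    return exp_u / exp_u.sum()

# Hypothetical values for illustration only.
toy_prices = np.array([1.4, 1.8, 2.0, 2.2, 2.7])
toy_intercepts = np.array([0.5, 0.2, 0.0, 0.3, 0.1])
toy_beta = -1.5
own, cross = 4, 3  # own brand (Brand 5) and cross brand (Brand 4)

base = logit_probabilities(toy_prices, toy_intercepts, toy_beta)

# Closed form: -beta * price(cross brand) * Pr(cross brand).
closed_form_cross = -toy_beta * toy_prices[cross] * base[cross]

# Finite-difference check: % change in Pr(own brand) per % change in the cross brand's price.
step = 1e-6
bumped = toy_prices.copy()
bumped[cross] += step
pr_own_bumped = logit_probabilities(bumped, toy_intercepts, toy_beta)[own]
numeric_cross = (pr_own_bumped - base[own]) / step * toy_prices[cross] / base[own]

print(closed_form_cross, numeric_cross)
```

With a negative price coefficient the result is positive, matching the interpretation of the two brands as substitutes.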

In [None]:
brand5_cross_brand4_price_elasticity = -beta5 * price_range * pred_brand_4

In [None]:
# Update price elasticities data frame to include the cross price elasticities for brand 5 with respect to brand 4.
pd.options.display.max_columns = None
price_elasticities.loc['Brand5_Cross_Brand_4_Avg'] = brand5_cross_brand4_price_elasticity
price_elasticities

> From the last row in the table, we can examine the cross price elasticity of purchase probability for Brand 5 with respect to Brand 4. The values are all positive. This implies that as the price of the competitor brand increases, so does the probability of purchasing the own brand (Brand 5). Even though the elasticity starts to decrease from the price point 1.48, it is still positive, meaning that the purchase probability for the own brand keeps increasing, only more slowly.
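
The price point at which an elasticity curve peaks and starts to decline can be located with `np.argmax`. The sketch below does this for a made-up logistic curve; the parameters are hypothetical, so the peak it finds is not the 1.48 reported for the fitted model.

```python
import numpy as np

# Hypothetical cross-elasticity curve built from a made-up logistic probability.
toy_beta = -1.5
toy_grid = np.arange(0.5, 3.5, 0.01)
toy_pr_rival = 1 / (1 + 5 * np.exp(-3 - toy_beta * toy_grid))  # rival's choice probability
toy_cross_curve = -toy_beta * toy_grid * toy_pr_rival

# Index and price at which the elasticity is largest.
peak_index = int(np.argmax(toy_cross_curve))
peak_price = toy_grid[peak_index]
print(round(peak_price, 2))
```

Applied to the notebook's own arrays, the same two lines would use `price_range` and the cross elasticity array in place of the toy values.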

In [None]:
plt.figure(figsize = (9, 6))
plt.plot(price_range, brand5_cross_brand4_price_elasticity, color = 'grey', label ="Cross Price Elasticity")
plt.plot(price_range, np.abs(own_price_elasticity_brand_5), color = 'grey',  
         linestyle = '--',label ="Own Product Price Elasticity in Absolute Values (Brand 5)" )
plt.xlabel('Price Range')
plt.ylabel('Elasticity')
plt.title('Cross Price Elasticity of Brand 5 wrt Brand 4')
plt.axvline(x=1.65, color='blue', linewidth=1, label = "Strong Substitute cut-off line")
plt.axvspan(0.5, 1.65, facecolor='b', alpha=0.2, label = "Strong Substitute Price Zone for own Brand")
# show the legend
plt.legend( bbox_to_anchor = (1, 0.5), loc = 6)
plt.show()

>- The positive elasticities across the price range indicate that if competitor Brand 4 increases its price, the purchase probability for Brand 5 increases. In other words, when our competitor raises prices, customers buy our own product more, and the elasticities show exactly how much more. 
>- If the cross price elasticity is greater than 0, the two products are considered "Substitutes". If, however, we compared Brand 5 with an unrelated type of product (products having nothing in common), the cross price elasticity would not necessarily be positive.
>- In the above example all cross price elasticities are positive, since all brands are substitutes for one another. Furthermore, if the cross price elasticity at some price point is greater in absolute terms than our own price elasticity, the alternative brand is considered a **"Strong Substitute"**. In this sense, Brand 4 could be a strong substitute for Brand 5, depending on the price point. In the graph, Brand 4 is a strong substitute for Brand 5 for all prices up to 1.65 (blue-coloured zone). However, these prices lie outside the natural price domain of Brand 4, whose observed prices range from 1.76 (min. price) to 2.26 (max. price). If Brand 4 had a substantially lower price, it would be an extraordinarily strong competitor to Brand 5. Thus, when it comes to the **"average"** customer, Brand 4 is a weak substitute for Brand 5. 
>- Also, the elasticity is gradually decreasing. This signals that with an increase in price the purchase probability changes more slowly. Therefore, our purchase probability still increases with the increase in the price of Brand 4, but at a slower rate.
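
The strong-substitute rule described above (cross elasticity larger than the own elasticity in absolute terms) is an element-wise comparison of two curves. A minimal sketch with hypothetical toy arrays, where `strong_substitute_mask` is an illustrative helper name:

```python
import numpy as np

def strong_substitute_mask(cross_elasticity, own_elasticity):
    """True where the cross elasticity exceeds the own elasticity in absolute terms."""
    return np.abs(cross_elasticity) > np.abs(own_elasticity)

# Hypothetical curves on a coarse price grid, for illustration only.
toy_price_grid = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
toy_cross = np.array([0.9, 0.8, 0.6, 0.4, 0.2])
toy_own = np.array([-0.2, -0.5, -0.7, -1.2, -1.8])

mask = strong_substitute_mask(toy_cross, toy_own)
strong_prices = toy_price_grid[mask]
print(strong_prices)  # prices at which the competitor acts as a strong substitute
```

The largest price in the resulting array plays the role of the cut-off line drawn in the plot.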


<a id='OaCPEbySeg'></a>
### ii-2-4. Own and Cross-Price Elasticity by Segment

Until now, I have analysed the own product price elasticity and the cross price elasticity of Brand 5 for the **"average"** customer. However, targeting the average customer can be quite cumbersome, sometimes impossible. No brand can satisfy everyone, but a brand can make a certain customer "Segment" more satisfied. For these reasons, in this section I analyse **"Own and Cross Price Elasticities by Customer Segments"**.<br><br>

<li><a href="#Segment3">Well-Off (Segment 3)</a></li>
<li><a href="#Segment0">Standard (Segment 0)</a></li>
<li><a href="#Segment1">Career-Focused (Segment 1)</a></li>
<li><a href="#Segment2">Fewer-Opportunities (Segment 2)</a></li>


<a id='Segment3'></a>
### Well-Off (Segment 3) 

> From the descriptive analysis, I found that the "Well-Off" segment has a strong preference for Brand 4. Therefore, it is interesting to first observe their behaviour with respect to price changes in Brand 4. To analyse the purchase probability of choosing Brand 5 by segment, I filtered the data to contain only the purchase incidences of segment 3: Well-Off.

In [None]:
brand_choice_WO = df_pa.query('Incidence ==1 & Segment == 3')

# Brand Choice Model estimation.
output_WO = brand_choice_WO['Brand']
brand_choice_WO = pd.get_dummies(brand_choice_WO, columns=['Brand'], prefix = 'Brand', prefix_sep = '_')
inputs_WO = brand_choice_WO[features]

model_brand_choice_WO = LogisticRegression(solver = 'sag', multi_class = 'multinomial', max_iter = 300)
model_brand_choice_WO.fit(inputs_WO, output_WO)

# Coefficients table for well-off.
bc_coef = pd.DataFrame(np.transpose(model_brand_choice_WO.coef_), columns = coefficients, index = prices)
bc_coef

### (Well-Off) Own-Brand Price Elasticity: Brand 5

In [None]:
# Calculate own-brand price elasticity for brand 5 and the Well-off segment.
# Column names match the features the model was fitted on, because scikit-learn validates them at prediction time.
own_brand_5_WO = pd.DataFrame({'Price_1':  brand_choice_WO['Price_1'].mean(),
                               'Price_2':  brand_choice_WO['Price_2'].mean(), 
                               'Price_3':  brand_choice_WO['Price_3'].mean(),
                               'Price_4':  brand_choice_WO['Price_4'].mean(),
                               'Price_5':  price_range} , index = np.arange(price_range.size))

own_brand_5_WO.head()

In [None]:
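pred_own_brand_5_WO = model_brand_choice_WO.predict_proba(own_brand_5_WO)
pr_own_brand_5_WO = pred_own_brand_5_WO[:, 4]
# Note: beta5 is still the coefficient from the all-customer model; bc_coef now holds the
# Well-Off coefficients, so bc_coef.iloc[4, 4] would give a segment-specific value instead.
own_price_elasticity_brand_5_WO =  beta5 * price_range * (1 - pr_own_brand_5_WO)
price_elasticities.loc['Brand5_own_PE_WellOff'] = own_price_elasticity_brand_5_WO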

### (Well-Off) Cross-Brand Price Elasticity: Brand 5 vs. 4

In [None]:
# Calculate cross-brand price elasticity for brand 5 with respect to brand 4 for the Well-off segment.
# Column names match the features the model was fitted on, because scikit-learn validates them at prediction time.
brand5_cross_brand4_WO = pd.DataFrame({'Price_1':  brand_choice_WO['Price_1'].mean(),
                                       'Price_2':  brand_choice_WO['Price_2'].mean(), 
                                       'Price_3':  brand_choice_WO['Price_3'].mean(),
                                       'Price_4':  price_range,
                                       'Price_5':  brand_choice_WO['Price_5'].mean()},
                                        index = np.arange(price_range.size))

pred_brand5_cross_brand4_WO = model_brand_choice_WO.predict_proba(brand5_cross_brand4_WO)
pr_cross_brand_5_WO = pred_brand5_cross_brand4_WO[: ][: , 3]

# Update master data frame to include the newly obtained cross-brand price elasticities.
brand5_cross_brand4_price_elasticity_WO = -beta5 * price_range * pr_cross_brand_5_WO
price_elasticities.loc['Brand5_Cross_Brand_4_WellOff'] = brand5_cross_brand4_price_elasticity_WO

### (Well-Off) Visualizations

In [None]:
# Plot the own brand and cross-brand price elasticities for brand 5 cross brand 4 side by side.
fig, axs = plt.subplots(1, 2, figsize = (14, 4))
axs[0].plot(price_range, own_price_elasticity_brand_5_WO, color = 'r')
axs[0].set_title('Brand 5 Segment Well-Off')
axs[0].set_xlabel('Price 5')
axs[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='r', alpha=0.2, label = "Natural Price Domain of Brand 5")

axs[1].plot(price_range, brand5_cross_brand4_price_elasticity_WO, color = 'r')
axs[1].set_title('Cross Price Elasticity of Brand 5 wrt Brand 4 Segment Well-Off')
axs[1].set_xlabel('Price 4')
axs[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='r', alpha=0.4, label = "Natural Price Domain of Brand 4")

for ax in axs.flat:
    ax.set(ylabel = 'Elasticity')
    ax.legend(bbox_to_anchor = (0, 1.1), loc = 3)

In [None]:
# The natural price range of Brand 4
brand_choice.Price_4.min(), brand_choice.Price_4.max()

In [None]:
# The natural price range of Brand 5
brand_choice.Price_5.min(), brand_choice.Price_5.max()

#### Well-Off
> Notice that while the x-axes of both plots show the same price range, the left plot refers to changes in the price of "Brand 5" and the right plot to changes in the price of "Brand 4". The red-coloured zone in each plot indicates the natural price range of Brand 5 and Brand 4, respectively, and it is worth focusing on this part of the graph. 
>- **"Own Price Elasticity"** indicates that well-off customers are elastic with respect to the price of the own brand (Brand 5). This is consistent with the descriptive analysis table, which shows that over 60 percent of the well-off segment purchased Brand 4, followed by Brand 5 at about 20 percent.
>- **"Cross Price Elasticities"**: The elasticity values are positive, indicating that for the well-off segment Brand 4 is a substitute for Brand 5. 

> For example, assume that Brand 5 costs 2.40 dollars. Then the own price elasticity is -1.97. Moreover, when Brand 4 costs 2.00, the cross price elasticity is about 1.53. <br>
>- If the competitor Brand 4 decreases its price by 1%, the cross price elasticity of 1.53 implies that the purchase probability of our brand (Brand 5) falls by 1.53%.
>- To strike back, we can lower our own price by 1% as well. Here we look at the "own" price elasticity of our brand (the left plot): since it is negative, a 1% decrease in our price translates into a 1.97% increase in purchase probability. The net effect of the two price decreases is 1.97% - 1.53% = 0.44%. Therefore, we have reacted to the competitor's price change and have actually gained some market share.
>- With the price elasticities, we can also react to the competitor so as to keep the purchase probability constant. We established that if Brand 4 decreases its price by 1%, the purchase probability for Brand 5 decreases by 1.53%. Let X be the price decrease (in %) we need in order to gain 1.53% in purchase probability. With an own price elasticity of -1.97, the equation is X * 1.97 = 1.53%, so X is about 0.78%. If Brand 4 decreases its price by 1 percent, we can decrease ours by about 0.78% and theoretically we will not lose a single customer from the well-off segment.
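
The competitive-response arithmetic above can be written as two small helper functions. This is a sketch under the usual first-order reading of elasticities (percentage changes simply add up); the function names are hypothetical, and the elasticities are the values quoted in the text. Using the unrounded own-price elasticity of 1.97, the matching price cut comes out at roughly 0.78%; rounding the elasticity to 2 would give roughly 0.77%.

```python
def net_probability_change(own_elasticity, cross_elasticity, own_cut_pct, rival_cut_pct):
    """Approximate % change in our purchase probability after both brands cut prices."""
    gain_from_own_cut = -own_elasticity * own_cut_pct       # own elasticity is negative
    loss_from_rival_cut = cross_elasticity * rival_cut_pct  # rival's cut pulls customers away
    return gain_from_own_cut - loss_from_rival_cut

def matching_price_cut(own_elasticity, cross_elasticity, rival_cut_pct):
    """Own price cut (in %) that offsets a rival's price cut, to first order."""
    return cross_elasticity * rival_cut_pct / abs(own_elasticity)

own_e, cross_e = -1.97, 1.53  # elasticities quoted in the text above

net = net_probability_change(own_e, cross_e, own_cut_pct=1.0, rival_cut_pct=1.0)
cut = matching_price_cut(own_e, cross_e, rival_cut_pct=1.0)
print(round(net, 2), round(cut, 2))
```

Elasticities are local measures, so these figures are only reasonable for small price changes around the assumed price points.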


<a id='Segment0'></a>
### Standard (Segment 0) 

In [None]:
brand_choice_Strd= df_pa.query('Incidence ==1 & Segment == 0')

# Brand Choice Model estimation.
output_Strd = brand_choice_Strd['Brand']
brand_choice_Strd = pd.get_dummies(brand_choice_Strd, columns=['Brand'], prefix = 'Brand', prefix_sep = '_')
inputs_Strd = brand_choice_Strd[features]

model_brand_choice_Strd = LogisticRegression(solver = 'sag', multi_class = 'multinomial', max_iter = 300)
model_brand_choice_Strd.fit(inputs_Strd, output_Strd)

# Coefficients table for Standard. 
bc_coef = pd.DataFrame(np.transpose(model_brand_choice_Strd.coef_), columns = coefficients, index = prices)
bc_coef

### (Standard) Own-Brand Price Elasticity: Brand 5

In [None]:
# Calculate own-brand price elasticity for brand 5 and the Standard segment.
# Column names match the features the model was fitted on, because scikit-learn validates them at prediction time.
own_brand_5_Strd = pd.DataFrame({'Price_1':  brand_choice_Strd['Price_1'].mean(),
                                 'Price_2':  brand_choice_Strd['Price_2'].mean(), 
                                 'Price_3':  brand_choice_Strd['Price_3'].mean(),
                                 'Price_4':  brand_choice_Strd['Price_4'].mean(),
                                 'Price_5':  price_range} , index = np.arange(price_range.size))

own_brand_5_Strd.head()

In [None]:
pred_own_brand_5_Strd = model_brand_choice_Strd.predict_proba(own_brand_5_Strd)
pr_own_brand_5_Strd = pred_own_brand_5_Strd[: ][: , 4]

# Compute price elasticities and update master dataframe.
# Note: beta5 is the all-customer coefficient, not one refit for the Standard segment.
own_price_elasticity_brand_5_Strd =  beta5 * price_range * (1 - pr_own_brand_5_Strd) 
price_elasticities.loc['Brand5_own_PE_Standard'] = own_price_elasticity_brand_5_Strd

### (Standard) Cross-Brand Price Elasticity: Brand 5 vs. 4

In [None]:
# Calculate cross-brand price elasticity for brand 5 with respect to brand 4 for the Standard segment.
# Column names match the features the model was fitted on, because scikit-learn validates them at prediction time.
brand5_cross_brand4_Strd = pd.DataFrame({'Price_1':  brand_choice_Strd['Price_1'].mean(),
                                         'Price_2':  brand_choice_Strd['Price_2'].mean(), 
                                         'Price_3':  brand_choice_Strd['Price_3'].mean(),
                                         'Price_4':  price_range,
                                         'Price_5':  brand_choice_Strd['Price_5'].mean()},
                                          index = np.arange(price_range.size))

pred_brand5_cross_brand4_Strd = model_brand_choice_Strd.predict_proba(brand5_cross_brand4_Strd)
pr_cross_brand_5_Strd = pred_brand5_cross_brand4_Strd[: ][: , 3]

# Update master data frame to include the newly obtained cross-brand price elasticities.
brand5_cross_brand4_price_elasticity_Strd = -beta5 * price_range * pr_cross_brand_5_Strd
price_elasticities.loc['Brand5_Cross_Brand_4_Standard'] = brand5_cross_brand4_price_elasticity_Strd

### (Standard) Visualizations

In [None]:
# Plot the own brand and cross-brand price elasticities for brand 5 cross brand 4 side by side.
fig, axs = plt.subplots(1, 2, figsize = (14, 4))
axs[0].plot(price_range, own_price_elasticity_brand_5_Strd, color = 'g')
axs[0].set_title('Brand 5 Segment Standard')
axs[0].set_xlabel('Price 5')
axs[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='g', alpha=0.2, label = "Natural Price Domain of Brand 5")


axs[1].plot(price_range, brand5_cross_brand4_price_elasticity_Strd, color = 'g')
axs[1].set_title('Cross Price Elasticity of Brand 5 wrt Brand 4 Segment Standard')
axs[1].set_xlabel('Price 4')
axs[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='g', alpha=0.4, label = "Natural Price Domain of Brand 4")

for ax in axs.flat:
    ax.set(ylabel = 'Elasticity')
    ax.legend(bbox_to_anchor = (0, 1.1), loc = 3)

#### Standard Segment

Like before, note that the x-axes refer to the price of Brand 5 and the price of Brand 4, respectively. The y-axis shows the own price elasticity (left plot) or the cross price elasticity (right plot). The standard customer is more elastic than the average customer. The difference becomes even more pronounced when we compare the standard segment to the other segments.

The green-coloured area indicates the natural price domain of Brand 5 (left plot) and Brand 4 (right plot). Within it, the own price elasticity of the standard segment is between -1.42 and -1.27. Therefore its purchase probability for the own brand is elastic over the entire observed price range of the brand.

If the company plans to win some of the standard segment, the appropriate marketing strategy would be to lower prices within this range to increase the purchase probability for this segment. However, this segment isn't quite homogeneous, so a marketing strategy based only on this segment might be risky.
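
The elastic/inelastic wording used here follows the conventional threshold of 1 in absolute value. A minimal sketch of that rule, with `classify_elasticity` as a hypothetical helper name, applied to the two boundary values quoted above:

```python
def classify_elasticity(elasticity):
    """Label an elasticity by its magnitude relative to 1."""
    magnitude = abs(elasticity)
    if magnitude > 1:
        return "elastic"
    if magnitude < 1:
        return "inelastic"
    return "unit elastic"

# Boundary values of the Standard segment's own price elasticity quoted in the text.
for value in (-1.42, -1.27):
    print(value, classify_elasticity(value))
```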


<a id='Segment1'></a>
### Career-Focused (Segment 1) 

In [None]:
brand_choice_Cf= df_pa.query('Incidence ==1 & Segment == 1')

# Brand Choice Model estimation.
output_Cf = brand_choice_Cf['Brand']
brand_choice_Cf = pd.get_dummies(brand_choice_Cf, columns=['Brand'], prefix = 'Brand', prefix_sep = '_')
inputs_Cf = brand_choice_Cf[features]

model_brand_choice_Cf = LogisticRegression(solver = 'sag', multi_class = 'multinomial', max_iter = 300)
model_brand_choice_Cf.fit(inputs_Cf, output_Cf)

# Coefficients table for Career-Focused. 
bc_coef = pd.DataFrame(np.transpose(model_brand_choice_Cf.coef_), columns = coefficients, index = prices)
bc_coef

### (Career-Focused) Own-Brand Price Elasticity: Brand 5

In [None]:
# Calculate own-brand price elasticity for brand 5 and the Career-Focused segment.
# Column names match the features the model was fitted on, because scikit-learn validates them at prediction time.
own_brand_5_Cf = pd.DataFrame({'Price_1':  brand_choice_Cf['Price_1'].mean(),
                               'Price_2':  brand_choice_Cf['Price_2'].mean(), 
                               'Price_3':  brand_choice_Cf['Price_3'].mean(),
                               'Price_4':  brand_choice_Cf['Price_4'].mean(),
                               'Price_5':  price_range} , index = np.arange(price_range.size))

own_brand_5_Cf.head()

In [None]:
pred_own_brand_5_Cf = model_brand_choice_Cf.predict_proba(own_brand_5_Cf)
pr_own_brand_5_Cf = pred_own_brand_5_Cf[: ][: , 4]

# Compute price elasticities and update master dataframe.
# Note: beta5 is the all-customer coefficient, not one refit for the Career-Focused segment.
own_price_elasticity_brand_5_Cf =  beta5 * price_range * (1 - pr_own_brand_5_Cf) 
price_elasticities.loc['Brand5_own_PE_Career_Focused'] = own_price_elasticity_brand_5_Cf

### (Career-Focused) Cross-Brand Price Elasticity: Brand 5 vs. 4

In [None]:
# Calculate cross-brand price elasticity for brand 5 with respect to brand 4 for the Career-Focused segment.
# Column names match the features the model was fitted on, because scikit-learn validates them at prediction time.
brand5_cross_brand4_Cf = pd.DataFrame({'Price_1':  brand_choice_Cf['Price_1'].mean(),
                                       'Price_2':  brand_choice_Cf['Price_2'].mean(), 
                                       'Price_3':  brand_choice_Cf['Price_3'].mean(),
                                       'Price_4':  price_range,
                                       'Price_5':  brand_choice_Cf['Price_5'].mean()},
                                        index = np.arange(price_range.size))

pred_brand5_cross_brand4_Cf = model_brand_choice_Cf.predict_proba(brand5_cross_brand4_Cf)
pr_cross_brand_5_Cf = pred_brand5_cross_brand4_Cf[: ][: , 3]

# Update master data frame to include the newly obtained cross-brand price elasticities.
brand5_cross_brand4_price_elasticity_Cf = -beta5 * price_range * pr_cross_brand_5_Cf
price_elasticities.loc['Brand5_Cross_Brand_4_Career_Focused'] = brand5_cross_brand4_price_elasticity_Cf

In [None]:
# Plot the own brand and cross-brand price elasticities for brand 5 cross brand 4 side by side.
fig, axs = plt.subplots(1, 2, figsize = (14, 4))
axs[0].plot(price_range, own_price_elasticity_brand_5_Cf, color = 'orange')
axs[0].set_title('Brand 5 Segment Career-Focused')
axs[0].set_xlabel('Price 5')
axs[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='orange', alpha=0.2, label = "Natural Price Domain of Brand 5")


axs[1].plot(price_range, brand5_cross_brand4_price_elasticity_Cf, color = 'orange')
axs[1].set_title('Cross Price Elasticity of Brand 5 wrt Brand 4 Segment Career-Focused')
axs[1].set_xlabel('Price 4')
axs[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='orange', alpha=0.4, label = "Natural Price Domain of Brand 4")


for ax in axs.flat:
    ax.set(ylabel = 'Elasticity')
    ax.legend(bbox_to_anchor = (0, 1.1), loc = 3)

#### Career Focused 

Career-Focused customers are the least elastic of all segments. They appear to be inelastic throughout the whole price range. In other words, this segment is hardly affected by an increase in the price of the own brand.

In addition, the cross price elasticity values in this segment are also exceptionally low. This type of customer is unlikely to switch to the competitor brand. They are loyal to the brand, and the company could increase the price of its own brand without fear of losing too much market share.


<a id='Segment2'></a>
### Fewer-Opportunities (Segment 2) 

In [None]:
brand_choice_FO= df_pa.query('Incidence ==1 & Segment == 2')

# Brand Choice Model estimation.
output_FO = brand_choice_FO['Brand']
brand_choice_FO = pd.get_dummies(brand_choice_FO, columns=['Brand'], prefix = 'Brand', prefix_sep = '_')
inputs_FO = brand_choice_FO[features]

model_brand_choice_FO = LogisticRegression(solver = 'sag', multi_class = 'multinomial', max_iter = 300)
model_brand_choice_FO.fit(inputs_FO, output_FO)

# Coefficients table for the Fewer-Opportunities segment.
bc_coef_FO = pd.DataFrame(np.transpose(model_brand_choice_FO.coef_), columns = coefficients, index = prices)
bc_coef_FO

### (Fewer-Opportunities) Own-Brand Price Elasticity: Brand 5

In [None]:
# Calculate the own-brand price elasticity for brand 5 in the Fewer-Opportunities segment.
own_brand_5_FO = pd.DataFrame({'Price1_FO_OBPE':  brand_choice_FO['Price_1'].mean(),
                               'Price2_FO_OBPE':  brand_choice_FO['Price_2'].mean(), 
                               'Price3_FO_OBPE':  brand_choice_FO['Price_3'].mean(),
                               'Price4_FO_OBPE':  brand_choice_FO['Price_4'].mean(),
                               'Price5_FO_OBPE':  price_range} , index = np.arange(price_range.size))

own_brand_5_FO.head()

In [None]:
pred_own_brand_5_FO = model_brand_choice_FO.predict_proba(own_brand_5_FO)
pr_own_brand_5_FO = pred_own_brand_5_FO[:, 4]  # predicted probability of choosing brand 5

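# Compute price elasticities and update master dataframe.
# Use the Fewer-Opportunities model's own coefficient of Price 5 for brand 5 (fifth row, fifth column of bc_coef_FO).
own_price_elasticity_brand_5_FO = bc_coef_FO.iloc[4, 4] * price_range * (1 - pr_own_brand_5_FO)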
price_elasticities.loc['Brand5_own_PE_Fewer_Opportunities'] = own_price_elasticity_brand_5_FO

### (Fewer-Opportunities) Cross-Brand Price Elasticity: Brand 5 vs. 4

In [None]:
# Calculate cross-brand price elasticity for brand 5 with respect to brand 4 for the Fewer_Opportunities. 
brand5_cross_brand4_FO = pd.DataFrame({'B5CB4_Price_1':  brand_choice_FO['Price_1'].mean(),
                                       'B5CB4_Price_2':  brand_choice_FO['Price_2'].mean(), 
                                       'B5CB4_Price_3':  brand_choice_FO['Price_3'].mean(),
                                       'B5CB4_Price_4':  price_range,
                                       'B5CB4_Price_5':  brand_choice_FO['Price_5'].mean()},
                                        index = np.arange(price_range.size))

pred_brand5_cross_brand4_FO = model_brand_choice_FO.predict_proba(brand5_cross_brand4_FO)
pr_cross_brand_5_FO = pred_brand5_cross_brand4_FO[:, 3]  # predicted probability of choosing brand 4 (the cross brand)

# Update master data frame to include the newly obtained cross-brand price elasticities.
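# The cross-brand elasticity uses the segment model's own-brand price coefficient as well.
brand5_cross_brand4_price_elasticity_FO = -bc_coef_FO.iloc[4, 4] * price_range * pr_cross_brand_5_FO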
price_elasticities.loc['Brand5_Cross_Brand_4_Fewer_Opportunities'] = brand5_cross_brand4_price_elasticity_FO
pd.options.display.max_columns = None
price_elasticities.to_csv('price_elasticities_2.csv')
price_elasticities

In [None]:
# Plot the own brand and cross-brand price elasticities for brand 5 cross brand 4 side by side.
fig, axs = plt.subplots(1, 2, figsize = (14, 4))
axs[0].plot(price_range, own_price_elasticity_brand_5_FO, color = 'b')
axs[0].set_title('Brand 5 Segment Fewer-Opportunities')
axs[0].set_xlabel('Price 5')
axs[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='b', alpha=0.2, label = "Natural Price Domain of Brand 5")


axs[1].plot(price_range, brand5_cross_brand4_price_elasticity_FO, color = 'b')
axs[1].set_title('Cross Price Elasticity of Brand 5 wrt Brand 4 Segment Fewer-Opportunities')
axs[1].set_xlabel('Price 4')
axs[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='b', alpha=0.4, label = "Natural Price Domain of Brand 4")


for ax in axs.flat:
    ax.set(ylabel = 'Elasticity')
    ax.legend(bbox_to_anchor = (0, 1.1), loc = 3)

#### Fewer Opportunities segment

The own-price elasticity (left plot) has a more pronounced shape. This segment appears inelastic at lower price points and then rapidly becomes elastic at higher prices. Within the natural price range of brand 5 (the blue-shaded area), these customers are rather elastic. As with the Career-Focused segment, the cross-price elasticity suggests that Fewer-Opportunities customers are loyal to brand 5 relative to brand 4.
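
The switch from inelastic to elastic can be located numerically as the first price at which the absolute elasticity exceeds 1. The sketch below is a minimal illustration on a made-up curve; `first_elastic_price`, `toy_prices` and `toy_elasticity` are hypothetical names, and in the notebook the inputs would be `price_range` and one of the elasticity arrays computed above.

```python
import numpy as np

def first_elastic_price(prices, elasticities):
    """Return the lowest price at which |elasticity| > 1, or None if demand stays inelastic."""
    elastic = np.abs(elasticities) > 1
    if not elastic.any():
        return None
    # np.argmax returns the index of the first True value in a boolean array.
    return float(prices[np.argmax(elastic)])

# Hypothetical elasticity curve that becomes more negative as the price rises.
toy_prices = np.arange(0.5, 3.5, 0.01)
toy_elasticity = -0.4 * toy_prices ** 2

threshold = first_elastic_price(toy_prices, toy_elasticity)
print(threshold)
```

Comparing this threshold with the natural price range of the brand shows whether the prices actually observed fall in the inelastic or the elastic region.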

As shown in the descriptive analysis (Part 2-1), the Fewer-Opportunities segment almost never buys brand 5 or brand 4; the correlations are 0.066 and 0.098, respectively. This means we do not have enough observations to obtain an accurate model, which could be why both curves look so out of character. Therefore, to target this segment with brand 5 in particular, more purchase data from this segment would be needed. However, if a product is too pricey for a segment, we may never obtain more data about its behaviour. These customers are simply not the target group for brand 5.
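
One way to spot such thin data before fitting a segment-level model is to tabulate how often each brand is bought within each segment. The sketch below uses a tiny invented table with the same `Segment` and `Brand` column names as `df_pa`; the rows and the `min_obs` threshold are arbitrary assumptions for illustration only.

```python
import pandas as pd

# Hypothetical purchase records (one row per purchase occasion with Incidence == 1).
toy_purchases = pd.DataFrame({
    'Segment': [0, 0, 0, 1, 1, 2, 2, 2, 2, 3],
    'Brand':   [5, 4, 5, 2, 5, 1, 2, 2, 5, 4],
})

# Absolute number of purchases per segment and brand.
brand_counts = pd.crosstab(toy_purchases['Segment'], toy_purchases['Brand'])

# Share of each brand within a segment (each row sums to 1).
brand_shares = pd.crosstab(toy_purchases['Segment'], toy_purchases['Brand'], normalize='index')

# Flag segment/brand cells with fewer observations than an arbitrary minimum.
min_obs = 2
too_few = brand_counts < min_obs

print(brand_shares.round(2))
print(too_few)
```

Cells flagged in `too_few` mark segment and brand combinations whose estimated elasticity curves should be read with caution.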


### Visualizations of All Segments

In [None]:
minor_ticks = np.arange(-4, 2, 1)
minor_ticks

In [None]:
# Plot the own- and cross-brand price elasticities for the average customer and each of the four segments.
minor_ticks = np.arange(-4, 2, 1)
custom_ylim = (-4.5, 2.0)
fig1, (ax1, ax2, ax3, ax4, ax5) = plt.subplots(5, 2, figsize = (15, 12), sharex = True )

# Average
ax1[0].plot(price_range, own_price_elasticity_brand_5, 'tab:grey')
ax1[0].set_title('Brand 5 Average Customer')
ax1[0].set_ylabel('Elasticity')
ax1[0].set_ylim([-4.5, 2.0])
ax1[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='grey', alpha=0.2)
ax1[0].set_yticks(minor_ticks, minor=True)
ax1[0].grid(  which = 'minor', axis = 'y' )

ax1[1].plot(price_range, brand5_cross_brand4_price_elasticity, 'tab:grey')
ax1[1].set_title('Cross Brand 4 Average Customer')
ax1[1].set_ylim([-4.5, 2.0])
ax1[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='grey', alpha=0.2)
ax1[1].set_yticks(minor_ticks, minor=True)
ax1[1].grid(  which = 'minor', axis = 'y' )

# Well-off
ax2[0].plot(price_range, own_price_elasticity_brand_5_WO, 'tab:red')
ax2[0].set_title('Brand 5 Segment Well-off')
ax2[0].set_ylabel('Elasticity')
ax2[0].set_ylim([-4.5, 2.0])
ax2[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='r', alpha=0.2)
ax2[0].set_yticks(minor_ticks, minor=True)
ax2[0].grid(  which = 'minor', axis = 'y' )

ax2[1].plot(price_range, brand5_cross_brand4_price_elasticity_WO, 'tab:red')
ax2[1].set_title('Cross Brand 4 Segment Well-off')
ax2[1].set_ylim([-4.5, 2.0])
ax2[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='r', alpha=0.2)
ax2[1].set_yticks(minor_ticks, minor=True)
ax2[1].grid(  which = 'minor', axis = 'y' )


# Standard Segment
ax3[0].plot(price_range, own_price_elasticity_brand_5_Strd, 'tab:green')
ax3[0].set_title('Brand 5 Segment Standard')
ax3[0].set_ylabel('Elasticity')
ax3[0].set_ylim([-4.5, 2.0])
ax3[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='g', alpha=0.2)
ax3[0].set_yticks(minor_ticks, minor=True)
ax3[0].grid(  which = 'minor', axis = 'y' )
ax3[1].plot(price_range, brand5_cross_brand4_price_elasticity_Strd, 'tab:green')
ax3[1].set_title('Cross Brand 4 Segment Standard')
ax3[1].set_ylim([-4.5, 2.0])
ax3[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='g', alpha=0.2)
ax3[1].set_yticks(minor_ticks, minor=True)
ax3[1].grid(  which = 'minor', axis = 'y' )

# Career-Focused
ax4[0].plot(price_range, own_price_elasticity_brand_5_Cf, 'tab:orange')
ax4[0].set_title('Brand 5 Segment Career-Focused')
ax4[0].set_ylabel('Elasticity')
ax4[0].set_ylim([-4.5, 2.0])
ax4[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='orange', alpha=0.2)
ax4[0].set_yticks(minor_ticks, minor=True)
ax4[0].grid(  which = 'minor', axis = 'y' )
ax4[1].plot(price_range, brand5_cross_brand4_price_elasticity_Cf, 'tab:orange')
ax4[1].set_title('Cross Brand 4 Segment Career-Focused')
ax4[1].set_ylim([-4.5, 2.0])
ax4[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='orange', alpha=0.2)
ax4[1].set_yticks(minor_ticks, minor=True)
ax4[1].grid(  which = 'minor', axis = 'y' )

# Fewer-Opportunities
ax5[0].plot(price_range, own_price_elasticity_brand_5_FO, 'tab:blue')
ax5[0].set_title('Brand 5 Segment Fewer-Opportunities')
ax5[0].set_xlabel('Price 5')
ax5[0].set_ylabel('Elasticity')
ax5[0].set_ylim([-4.5, 2.0])
ax5[0].axvspan(brand_choice.Price_5.min(), brand_choice.Price_5.max(), facecolor='b', alpha=0.2)
ax5[0].set_yticks(minor_ticks, minor=True)
ax5[0].grid(  which = 'minor', axis = 'y' )

ax5[1].plot(price_range, brand5_cross_brand4_price_elasticity_FO, 'tab:blue')
ax5[1].set_title('Cross Brand 4 Segment Fewer-Opportunities')
ax5[1].set_xlabel('Price 4')
ax5[1].set_ylim([-4.5, 2.0])
ax5[1].axvspan(brand_choice.Price_4.min(), brand_choice.Price_4.max(), facecolor='b', alpha=0.2)
ax5[1].set_yticks(minor_ticks, minor=True)
ax5[1].grid(  which = 'minor', axis = 'y' )

To get more insight, all own-price and cross-price elasticities are presented side by side. The two plots in the first row represent the average customer, while the following rows represent the individual segments. All plots use the same x- and y-axis ranges, so the curves can be compared directly. It helps to read them alongside the findings of the descriptive analysis (Part 2-1).

It seems that the Career-Focused and Well-Off segments require the most attention, as these are the groups that actually purchase brand 5. The Well-Off segment is much more elastic than the Career-Focused one. In other words, a price increase would barely affect the Career-Focused segment but would seriously damage sales to the Well-Off segment.

If brand 4 were to decrease its price, this would affect the Well-Off segment but not the Career-Focused one. A small decrease in our own price would compensate for such a competitor move. If chocolate prices were to drop, we would have room to lower our own price, gaining solid market share in the Well-Off segment while practically retaining the Career-Focused customer base.
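
The size of such a compensating price move can be approximated to first order: the percentage change in purchase probability is roughly the own-price elasticity times the own price change plus the cross-price elasticity times the competitor's price change. The sketch below applies that approximation with invented elasticity values; the function names and all numbers are hypothetical and are not estimates from this notebook.

```python
def pct_change_in_purchase_probability(own_elasticity, cross_elasticity,
                                       own_price_change_pct, competitor_price_change_pct):
    """First-order approximation of the % change in the own brand's purchase probability."""
    return (own_elasticity * own_price_change_pct
            + cross_elasticity * competitor_price_change_pct)

def compensating_own_price_change(own_elasticity, cross_elasticity, competitor_price_change_pct):
    """Own price change (in %) that offsets a competitor price change, to first order."""
    if own_elasticity == 0:
        raise ValueError("own_elasticity must be non-zero")
    return -cross_elasticity * competitor_price_change_pct / own_elasticity

# Hypothetical values for an elastic segment: the competitor cuts its price by 5%.
own_e = -1.5
cross_e = 0.3
competitor_cut = -5.0

required_change = compensating_own_price_change(own_e, cross_e, competitor_cut)
net_effect = pct_change_in_purchase_probability(own_e, cross_e, required_change, competitor_cut)
print(required_change, net_effect)
```

In this made-up case a 1% cut in the own price offsets a 5% competitor cut. The approximation only holds for small price changes, because both elasticities vary along the price range.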
