# Predict Sales for a Catalog Launch (Part 2)

### By Sooyeon Won

> To make a recommendation for the business decision, I calculate the expected profit from sending the new catalog to the new customers.

In [1]:
# Import the relevant libraries 
import pandas as pd 
import numpy as np 
import statsmodels.api as sm 
import matplotlib.pyplot as plt 
from sklearn.linear_model import LinearRegression 
import seaborn as sns 
sns.set()

import pickle

In [2]:
test_data = pd.read_excel('p1-mailinglist.xlsx')

In [3]:
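# Load the saved regression model; the context manager closes the file after reading
with open('reg_p1.pickle', 'rb') as f:
    reg = pickle.load(f)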

In [4]:
test_data.head()

Unnamed: 0,Name,Customer_Segment,Customer_ID,Address,City,State,ZIP,Store_Number,Avg_Num_Products_Purchased,#_Years_as_Customer,Score_No,Score_Yes
0,A Giametti,Loyalty Club Only,2213,5326 S Lisbon Way,Centennial,CO,80015,105,3,0.2,0.694964,0.305036
1,Abby Pierson,Loyalty Club and Credit Card,2785,4344 W Roanoke Pl,Denver,CO,80236,101,6,0.6,0.527275,0.472725
2,Adele Hallman,Loyalty Club Only,2931,5219 S Delaware St,Englewood,CO,80110,101,7,0.9,0.421118,0.578882
3,Alejandra Baird,Loyalty Club Only,2231,2301 Lawrence St,Denver,CO,80205,103,2,0.6,0.694862,0.305138
4,Alice Dewitt,Loyalty Club Only,2530,5549 S Hannibal Way,Centennial,CO,80015,104,4,0.5,0.612294,0.387706


In [5]:
# Drop the columns that are not used for the prediction
data = test_data.drop(['Name', 'Customer_ID', 'Address', 'State', 'City', 'ZIP', 'Store_Number', '#_Years_as_Customer'], axis=1)
data.head()

Unnamed: 0,Customer_Segment,Avg_Num_Products_Purchased,Score_No,Score_Yes
0,Loyalty Club Only,3,0.694964,0.305036
1,Loyalty Club and Credit Card,6,0.527275,0.472725
2,Loyalty Club Only,7,0.421118,0.578882
3,Loyalty Club Only,2,0.694862,0.305138
4,Loyalty Club Only,4,0.612294,0.387706


In [6]:
# Dummy-encode Customer_Segment; drop_first=True drops the first category, which becomes the baseline
data = pd.get_dummies(data, drop_first=True)
data.head()  # Recheck

Unnamed: 0,Avg_Num_Products_Purchased,Score_No,Score_Yes,Customer_Segment_Loyalty Club Only,Customer_Segment_Loyalty Club and Credit Card,Customer_Segment_Store Mailing List
0,3,0.694964,0.305036,1,0,0
1,6,0.527275,0.472725,0,1,0
2,7,0.421118,0.578882,1,0,0
3,2,0.694862,0.305138,1,0,0
4,4,0.612294,0.387706,1,0,0
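
To make the effect of `drop_first=True` concrete, here is a minimal sketch on a toy frame. The rows are invented for illustration, and the segment name `Credit Card Only` is an assumption: it does not appear in the output above, so it stands in for whichever category pandas drops as the baseline. `dtype=int` is passed only so that the dummies print as 0/1 regardless of the pandas version.

```python
import pandas as pd

# Toy data, NOT the real mailing list; 'Credit Card Only' is a hypothetical baseline segment
toy = pd.DataFrame({
    "Customer_Segment": [
        "Credit Card Only",
        "Loyalty Club Only",
        "Store Mailing List",
        "Loyalty Club and Credit Card",
    ],
    "Avg_Num_Products_Purchased": [1, 3, 5, 6],
})

# drop_first=True removes the first category in sorted order,
# so a row of all-zero dummies represents that baseline category
encoded = pd.get_dummies(toy, drop_first=True, dtype=int)

print(encoded.columns.tolist())
print(encoded)
```

A customer in the dropped category is represented by zeros in all three remaining dummy columns, which avoids perfectly collinear predictors in the regression.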


In [8]:
col = ['Avg_Num_Products_Purchased', 'Customer_Segment_Loyalty Club Only',
       'Customer_Segment_Loyalty Club and Credit Card',
       'Customer_Segment_Store Mailing List',
       'Score_Yes']

In [9]:
data_rearranged = data[['Avg_Num_Products_Purchased','Customer_Segment_Loyalty Club Only', 
                        'Customer_Segment_Loyalty Club and Credit Card', 'Customer_Segment_Store Mailing List']]
data_rearranged.head()

Unnamed: 0,Avg_Num_Products_Purchased,Customer_Segment_Loyalty Club Only,Customer_Segment_Loyalty Club and Credit Card,Customer_Segment_Store Mailing List
0,3,1,0,0
1,6,0,1,0
2,7,1,0,0
3,2,1,0,0
4,4,1,0,0
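
The columns are rearranged because a linear model pairs each coefficient with a feature by position. The sketch below illustrates why that matters using plain Python. The coefficients, the intercept, and the helper `predict_row` are all made up for illustration; they are not the values stored in the pickled model.

```python
# Feature order assumed to match the order used when the model was fitted
FEATURE_ORDER = [
    "Avg_Num_Products_Purchased",
    "Customer_Segment_Loyalty Club Only",
    "Customer_Segment_Loyalty Club and Credit Card",
    "Customer_Segment_Store Mailing List",
]
COEFS = [10.0, -5.0, 20.0, -8.0]  # made-up coefficients, illustration only
INTERCEPT = 100.0                 # made-up intercept, illustration only


def predict_row(row, feature_order=FEATURE_ORDER):
    """Linear prediction: intercept + sum(coef_i * x_i), reading x_i in feature_order."""
    missing = [name for name in feature_order if name not in row]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return INTERCEPT + sum(c * row[name] for c, name in zip(COEFS, feature_order))


customer = {
    "Avg_Num_Products_Purchased": 3,
    "Customer_Segment_Loyalty Club Only": 1,
    "Customer_Segment_Loyalty Club and Credit Card": 0,
    "Customer_Segment_Store Mailing List": 0,
}

correct = predict_row(customer)

# Same values, but the first two columns are swapped:
# each coefficient now multiplies the wrong feature
shuffled_order = [FEATURE_ORDER[1], FEATURE_ORDER[0]] + FEATURE_ORDER[2:]
wrong = predict_row(customer, feature_order=shuffled_order)

print(correct, wrong)
```

The two results differ even though the customer is the same, which is why the prediction input is built with an explicit column list.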


In [10]:
data['Predicted_Revenue'] = reg.predict(data_rearranged) # Predict the revenue of each customer with the loaded model

In [11]:
data['Expected_Revenue'] = data['Predicted_Revenue']*data['Score_Yes'] # Expected revenue per customer = predicted revenue x purchase probability (Score_Yes)

In [12]:
data.head()

Unnamed: 0,Avg_Num_Products_Purchased,Score_No,Score_Yes,Customer_Segment_Loyalty Club Only,Customer_Segment_Loyalty Club and Credit Card,Customer_Segment_Store Mailing List,Predicted_Revenue,Expected_Revenue
0,3,0.694964,0.305036,1,0,0,355.036364,108.298804
1,6,0.527275,0.472725,0,1,0,987.159466,466.654501
2,7,0.421118,0.578882,1,0,0,622.941184,360.609345
3,2,0.694862,0.305138,1,0,0,288.060159,87.898046
4,4,0.612294,0.387706,1,0,0,422.012569,163.616744
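
As a quick sanity check, the expected revenue can be recomputed by hand from the displayed values. The sketch below uses the first three rows of the output above; the helper `expected_revenue` is a hypothetical name introduced here. Because the displayed numbers are rounded to six decimals, the results agree with the `Expected_Revenue` column only up to a small rounding difference.

```python
# (Predicted_Revenue, Score_Yes) pairs copied from the first three rows shown above
rows = [
    (355.036364, 0.305036),
    (987.159466, 0.472725),
    (622.941184, 0.578882),
]


def expected_revenue(predicted_revenue, p_yes):
    """Expected revenue = predicted revenue weighted by the purchase probability."""
    if not 0.0 <= p_yes <= 1.0:
        raise ValueError("p_yes must be a probability between 0 and 1")
    return predicted_revenue * p_yes


expected = [expected_revenue(revenue, p_yes) for revenue, p_yes in rows]
for value in expected:
    print(round(value, 2))
```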


In [14]:
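total_exp_rev = data['Expected_Revenue'].sum()
# Profit = total expected revenue x 50% average gross margin - USD 6.50 per catalog x 250 catalogs
total_exp_profit = total_exp_rev * 0.5 - (6.5 * 250)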

print('Total Expected Revenue: USD', total_exp_rev.round(2) )
print('Total Expected Profit: USD', total_exp_profit.round(2) )

Total Expected Revenue: USD 47224.87
Total Expected Profit: USD 21987.44


> **Findings:** Assuming this year’s catalog is sent to these 250 customers, the expected profit from the new catalog is USD 21,987.44.

## Conclusion 

> The company should send this year’s catalog to these new customers. According to the predictive analysis, the expected profit is USD 21,987.44, assuming the catalog is sent to all 250 new customers. This expected profit contribution exceeds USD 10,000, the cut-off suggested by management. The expected profit is calculated in the following steps.<br>
> First, the regression model from above predicts the sales amount for each of the 250 new customers. Each predicted sales amount is then multiplied by the probability that the customer will make a purchase (`Score_Yes`), which gives the expected revenue from that customer. The expected revenues sum to USD 47,224.87. Multiplying this sum by the average gross margin (50%) and subtracting the total cost of the catalogs for these 250 customers gives the expected profit: USD 21,987.44. <br> <br>
> USD 47,224.87 (sum of expected revenue) * 50% (average gross margin) - USD 6.50 * 250 (total cost of 250 catalogs) <br>= USD 21,987.44 (expected profit contribution)
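
The profit calculation described above can be sketched as a small function. The function names, the parameter names, and the `PROFIT_CUTOFF` constant are introduced here for illustration; the numeric inputs (50% gross margin, USD 6.50 per catalog, 250 catalogs, USD 10,000 cut-off, USD 47,224.87 total expected revenue) come from the analysis above. The second helper rearranges the same formula to show how much expected revenue would be needed to reach the cut-off.

```python
PROFIT_CUTOFF = 10_000  # USD, management's cut-off as stated in the conclusion


def expected_profit(total_expected_revenue, gross_margin=0.5,
                    cost_per_catalog=6.5, n_catalogs=250):
    """Expected profit = revenue x gross margin - total catalog cost."""
    return total_expected_revenue * gross_margin - cost_per_catalog * n_catalogs


def revenue_needed(profit_target, gross_margin=0.5,
                   cost_per_catalog=6.5, n_catalogs=250):
    """Total expected revenue required to reach a given profit target."""
    return (profit_target + cost_per_catalog * n_catalogs) / gross_margin


profit = expected_profit(47224.87)
print(round(profit, 2), profit > PROFIT_CUTOFF)
print(revenue_needed(PROFIT_CUTOFF))
```

Keeping the margin and the catalog cost as parameters makes it easy to rerun the decision under different cost assumptions.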
