# Introduction 
You just accepted a job at CausalMarketing, a firm that offers marketing consulting services. 
For your first assignment, you've been tasked with helping a large online clothing store, Nosara, with their targeted advertising campaign.


Nosara is promoting a new handbag model priced at ***$100***, which is not yet available for sale. 

They randomly selected 3 million past customers and sent them emails offering the opportunity to pre-order the handbag. 

These customers were divided into two groups: 
***one received only the pre-order invitation (control group)***, and 
***the other received a 20% discount if they pre-ordered (treatment group)***. 
Each email allows the purchase of ***a single handbag***.

Nosara has ***10,979,592*** past customers who haven't been emailed yet. 
They want you to help them design a data-driven solution to 
decide which of these customers should ***receive a discount email*** and which should ***receive a regular email*** to ***maximize*** sales after discounts.

Nosara has shared two datasets: one of customers who have already been emailed (nosara_labeled) and another of customers not yet emailed (nosara_unlabeled). The fields are:

Features (all numeric): f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11

treatment: Indicates group (1 = treated, 0 = control). 

preorder: Indicates if the customer pre-ordered (1=Yes, 0=No).

Note: Feature names were anonymized, and their values randomly projected to keep their predictive power without risking the privacy of Nosara’s customer base.

## Analysis

#### Business understanding 

The goal is to help Nosara decide which customers should receive the discount email so that revenue after discounts is maximized. 
The discount is only worth sending to a customer when its benefit exceeds its cost.

#### Cost and benefit matrix 

|    | pre-order (1) | do not pre-order (0) |
| -------- | ------- | ------ |
| regular email   | BC = 100 | 0 |
| 20% discount email | BC - DC = 80 | 0 |

Bag price (BC): 100

Discount cost (DC): 20 (20% of BC)
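
The matrix above translates into an expected revenue per email once a pre-order probability is known. The following is a minimal sketch of that arithmetic; the function name `expected_revenue` and the probabilities passed to it are made up for illustration and do not come from the Nosara data.

```python
PRICE = 100.0          # full handbag price (BC)
DISCOUNT_RATE = 0.20   # 20% discount offered in the treatment email


def expected_revenue(p_preorder: float, discounted: bool) -> float:
    """Expected revenue from one email, given the pre-order probability."""
    revenue_if_preorder = PRICE * (1 - DISCOUNT_RATE) if discounted else PRICE
    # A customer who does not pre-order contributes 0 in both rows of the matrix.
    return p_preorder * revenue_if_preorder


# Hypothetical probabilities, for illustration only
print(expected_revenue(0.05, discounted=True))    # 0.05 * 80
print(expected_revenue(0.04, discounted=False))   # 0.04 * 100
```

With these made-up numbers the two emails are worth the same in expectation, which shows that the discount needs a sufficiently large lift in pre-order probability to pay for itself.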

#### Decision to make 
For each of the 10,979,592 customers not yet emailed, choose between the regular email and the discount email.

#### Possible outcomes and their probabilities 

|    | pre-order (1) | do not pre-order (0) |
| -------- | ------- | ------ |
| regular email (control)  | P(1\|RE) | P(0\|RE) |
| 20% discount email (treatment) | P(1\|DE) | P(0\|DE) |

P(1|RE) and P(0|RE) are estimated from the control group.

P(1|DE) and P(0|DE) are estimated from the treatment group.
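
Because the groups were assigned at random, these four probabilities can be read directly from the observed pre-order rate in each group. A minimal sketch, assuming a DataFrame with the same `treatment` and `preorder` columns as nosara_labeled; the eight rows below are invented for illustration.

```python
import pandas as pd

# Tiny made-up sample using the column names of nosara_labeled
toy = pd.DataFrame({
    "treatment": [0, 0, 0, 0, 1, 1, 1, 1],
    "preorder":  [0, 0, 0, 1, 0, 1, 0, 1],
})

# The mean of a 0/1 column is the share of ones, i.e. the pre-order rate
rates = toy.groupby("treatment")["preorder"].mean()

p1_re = rates.loc[0]   # P(1|RE): control group
p1_de = rates.loc[1]   # P(1|DE): treatment group
p0_re = 1 - p1_re      # P(0|RE)
p0_de = 1 - p1_de      # P(0|DE)

print(p1_re, p0_re, p1_de, p0_de)
```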

#### Understand the business 

Nosara sells the handbag through pre-orders and wants to maximize revenue after discounts.

#### Understand the data 

Two datasets and two groups.

##### Group:

***one received only the pre-order invitation (control group)***  

***the other received a 20% discount if they pre-ordered (treatment group)***. 

##### Data: 

***nosara_labeled***: customers who have already been emailed, with treatment and preorder recorded

***nosara_unlabeled***: customers who have not been emailed yet

#### Modeling
Nothing to do here: the models are provided.



In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.linear_model import LogisticRegression

# Load Data
df_labeled = pd.read_csv("nosara_labeled.csv")
df_unlabeled = pd.read_csv("nosara_unlabeled.csv")

# Pre-process data
feature_names = ['f0','f1','f2','f3','f4','f5','f6','f7','f8','f9','f10','f11']
X = df_labeled[feature_names]
y = df_labeled.preorder
t = df_labeled.treatment
scaler = preprocessing.StandardScaler().fit(X)
X = scaler.transform(X)
X_unlabeled = scaler.transform(df_unlabeled[feature_names])

# Split data
X_tr, X_te, y_tr, y_te, t_tr, t_te = train_test_split(X, y, t, test_size=0.2, random_state=1337)

# Data is large, so be patient with the training. 
# Model for treated
model_treat = LogisticRegression().fit(X_tr[t_tr==1],y_tr[t_tr==1])
# Model for control
model_control = LogisticRegression().fit(X_tr[t_tr==0],y_tr[t_tr==0])
print("Done!")

Done!


In [2]:
# Predicted labels for the treatment group in the test set
treat_predict=model_treat.predict(X_te[t_te==1])


## Calculate the probabilities for 

|    | pre-order (1) | do not pre-order (0) |
| -------- | ------- | ------ |
| 20% discount email (treatment) | P(1\|DE) | P(0\|DE) |

In [None]:
from sklearn.metrics import confusion_matrix
# Predictions are passed as the first argument, so rows are predicted labels
# and columns are actual labels.
cm_treat = confusion_matrix(treat_predict, y_te[t_te==1])
print("test set size for the treatment group (DE) = {}".format(treat_predict.shape[0]))
print("confusion matrix (rows = predicted, columns = actual):")
print("[[TN, FN],")
print("[FP, TP]]")
print(cm_treat)
print("P(1|DE) = (FN + TP) / N")
P_1_DE = (cm_treat[0,1]+cm_treat[1,1])/treat_predict.shape[0]
print(f"P(1|DE)={P_1_DE}")
print("P(0|DE) = (TN + FP) / N")
P_0_DE = (cm_treat[0,0]+cm_treat[1,0])/treat_predict.shape[0]
print(f"P(0|DE)={P_0_DE}")


test set size for the treatment group (DE) = 509834
confusion matrix (rows = predicted, columns = actual):
[[TN, FN],
[FP, TP]]
[[480075  16702]
 [  5273   7784]]
P(1|DE) = (FN + TP) / N
P(1|DE)=0.04802739715279875
P(0|DE) = (TN + FP) / N
P(0|DE)=0.9519726028472012
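
scikit-learn's `confusion_matrix` expects the true labels first and the predictions second, and the argument order decides which axis holds the actual outcomes. The sketch below uses made-up labels to show the conventional orientation, how the observed positive rate is recovered from it, and that swapping the arguments transposes the matrix.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up labels, for illustration only
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 0, 1, 1, 0, 0, 1, 0])

# Conventional order: rows are actual labels, columns are predicted labels
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

# The observed positive rate depends only on the actual labels
observed_rate = (fn + tp) / cm.sum()

# Swapping the arguments transposes the matrix, so FP and FN trade places
cm_swapped = confusion_matrix(y_pred, y_true)

print(cm)
print(cm_swapped)
print(observed_rate)
```

The observed rate is the same either way as long as the cells that correspond to actual positives are the ones being summed.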


## Calculate the probabilities for 

|    | pre-order (1) | do not pre-order (0) |
| -------- | ------- | ------ |
| regular email (control)  | P(1\|RE) | P(0\|RE) |

In [59]:
# Predicted labels for the control group in the test set
control_predict=model_control.predict(X_te[t_te==0])
# Predictions are passed as the first argument, so rows are predicted labels
# and columns are actual labels.
cm_control = confusion_matrix(control_predict, y_te[t_te==0])
print("test set size for the control group (RE) = {}".format(control_predict.shape[0]))
print("confusion matrix (rows = predicted, columns = actual):")
print("[[TN, FN],")
print("[FP, TP]]")
print(cm_control)
print("P(1|RE) = (FN + TP) / N")
P_1_RE = (cm_control[0,1]+cm_control[1,1])/control_predict.shape[0]
print(f"P(1|RE)={P_1_RE}")
print("P(0|RE) = (TN + FP) / N")
P_0_RE = (cm_control[0,0]+cm_control[1,0])/control_predict.shape[0]
print(f"P(0|RE)={P_0_RE}")


test set size for the control group (RE) = 90166
confusion matrix (rows = predicted, columns = actual):
[[TN, FN],
[FP, TP]]
[[86060  2549]
 [  680   877]]
P(1|RE) = (FN + TP) / N
P(1|RE)=0.03799658407825566
P(0|RE) = (TN + FP) / N
P(0|RE)=0.9620034159217443

In [63]:
# Effect 
effect = P_1_DE * 80 - P_1_RE * 100
print(f"Effect={effect}")
# # prob pn P(1)
# print(f"E(1) = E(1|RE)+E(1|DE)")
# E_1_DE = cm_treat[0,1]+cm_treat[1,1]
# E_1 = E_1_DE 
# print(f"# of events sending discount email={E_1}")
# print(f"# of events sending email={X_te.shape[0]}")
# P_1 = E_1/X_te.shape[0]
# print(f"P(send discount email)={P_1}")
print("This estimates the effect of sending the discount email to everyone,")
print("so the probability of receiving the discount email is 1.")
# Effect of the decision rule: per-customer effect times the number of customers not yet emailed
effect_decision_rule = 10979592*1*effect
print(f"Effect Decision_Rule={effect_decision_rule}")

Effect=0.04253336439833433
This estimates the effect of sending the discount email to everyone,
so the probability of receiving the discount email is 1.
Effect Decision_Rule=466998.9874810364


Suppose the predictive models provided by Carlos can properly estimate the probability of pre-ordering for each individual. What decision rule (expressed as an inequality) should we use to decide which email to send? If the inequality is met, we send the discount email; otherwise, we send the regular email. Use Pt to represent the probability of pre-order with the discount email and Pc to represent the probability of pre-order with the regular email.

80 * Pt - 100 * Pc > 0, or equivalently Pt > 1.25 * Pc
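
The rule compares the expected revenue of the discount email (80 times Pt) with that of the regular email (100 times Pc). A minimal sketch of how it might be applied to arrays of predicted probabilities; the helper name `send_discount` and the probabilities are hypothetical.

```python
import numpy as np


def send_discount(pt, pc, price=100.0, discount_rate=0.20):
    """Return True where the discount email has the higher expected revenue."""
    pt = np.asarray(pt, dtype=float)
    pc = np.asarray(pc, dtype=float)
    discounted_price = price * (1 - discount_rate)
    return discounted_price * pt - price * pc > 0


# Made-up probabilities for three customers
pt = np.array([0.10, 0.05, 0.30])
pc = np.array([0.05, 0.05, 0.25])

decisions = send_discount(pt, pc)
print(decisions)
```

Only the first customer is targeted here: the second has no lift at all, and the third has a lift too small to cover the 20% price reduction.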

Apply your decision rule from the previous question to the test set using the predictions from the models provided by Carlos. What percentage of individuals is targeted by your rule? Enter your answer as a rounded integer. For example, if the percentage is 32.23%, enter 32 (not 0.3223).



In [87]:
# Predicted pre-order probabilities for every test customer under each model
treat_predict_prob = model_treat.predict_proba(X_te)

control_predict_prob = model_control.predict_proba(X_te)


In [88]:
# calculate the action 
# Calculate the values for column 'value'
values = (treat_predict_prob[:, 1] * 80) - (control_predict_prob[:, 1] * 100)

# Calculate the actions for column 'action'
actions = [1 if value > 0 else 0 for value in values]

# Create a DataFrame
df_action = pd.DataFrame({'value': values, 'predict': actions,'real': y_te})

df_passing_threhold = df_action[df_action["predict"]==1]
df_fail_threhold = df_action[df_action["predict"]==0]

In [89]:
P_on_sending_email =  df_passing_threhold.shape[0] / X_te.shape[0]

print(f"the probability of sending the discount email={P_on_sending_email}")

the probability of sending the discount email=0.4190316666666667
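
The fraction above, 0.4190, corresponds to 42 when entered as a rounded integer percentage. The same share can be computed without building a DataFrame by taking the mean of a boolean mask; a minimal sketch on made-up effect values:

```python
import numpy as np

# Made-up per-customer values of 80 * Pt - 100 * Pc
values = np.array([3.0, -1.0, 0.5, -2.0, 0.0])

# A customer is targeted only when the value is strictly positive
targeted = values > 0

# The mean of a boolean array is the share of True entries
pct_targeted = 100 * targeted.mean()
print(round(pct_targeted))
```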


What is the predicted average revenue effect after discounts for individuals targeted by your decision rule according to the models provided by Carlos?

## Calculate the probabilities for 

|    | pre-order (1) | do not pre-order (0) |
| -------- | ------- | ------ |
| apply rule  | P(1\|AD) | P(0\|AD) |
| do not apply rule  | P(1\|DD) | P(0\|DD) |

In [95]:
df_action_apply_rule = df_action[(df_action["predict"]==1)]
df_do_not_apply_rule = df_action[(df_action["predict"]==0)]

print(f"size of apply rule : {df_action_apply_rule.shape[0]}")
print(f"size of do not apply rule : {df_do_not_apply_rule.shape[0]}")

p_action_apply_rule = df_action_apply_rule.shape[0] / df_action.shape[0]
p_do_not_apply_rule = df_do_not_apply_rule.shape[0] / df_action.shape[0]

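# Share of targeted customers times the discounted price, minus the share of non-targeted customers times the full price
avg_effect = p_action_apply_rule * 80 - p_do_not_apply_rule*100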
print(f"avg_effect: {avg_effect}")
# print(f"Effect of Decision Rule: {10979592*P_on_sending_email*avg_effect}")

size of apply rule : 251419
size of do not apply rule : 348581
avg_effect: -24.574299999999994
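
One reading of the question is that the predicted average effect is the mean of the per-customer predicted effect, 80 * Pt - 100 * Pc, taken over the targeted customers only. The sketch below illustrates that reading on made-up probabilities; it is not a rerun on the Nosara data, and the variable names are hypothetical.

```python
import numpy as np

# Made-up predicted probabilities for four customers
pt = np.array([0.10, 0.05, 0.30, 0.20])   # with the discount email
pc = np.array([0.05, 0.05, 0.25, 0.10])   # with the regular email

# Predicted revenue effect of the discount for each customer
values = 80 * pt - 100 * pc

# Customers selected by the decision rule
targeted = values > 0

# Average predicted effect among the targeted customers only
predicted_avg_effect = values[targeted].mean()
print(predicted_avg_effect)
```

By construction this quantity is positive whenever at least one customer is targeted, because only positive values enter the average.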


Based on the treatment and preorder columns, what was the actual average revenue effect after discounts for those individuals targeted by your decision rule?

In [94]:
df_action_apply_rule_positive = df_action[(df_action["predict"]==1) & (df_action["real"]==1) ]
df_action_apply_rule_negative = df_action[(df_action["predict"]==1) & (df_action["real"]==0) ]
df_do_not_apply_rule_positive = df_action[(df_action["predict"]==0) & (df_action["real"]==1) ]
df_do_not_apply_rule_negative = df_action[(df_action["predict"]==0) & (df_action["real"]==0) ]

print(f"size of apply rule with pre order: {df_action_apply_rule_positive.shape[0]}")
print(f"size of apply rule without pre order: {df_action_apply_rule_negative.shape[0]}")
print(f"size of do not apply rule with pre order: {df_do_not_apply_rule_positive.shape[0]}")
print(f"size of do not apply rule without pre order: {df_do_not_apply_rule_negative.shape[0]}")

p_apply_rule_positive = df_action_apply_rule_positive.shape[0] / df_action.shape[0]
p_apply_rule_negative = df_action_apply_rule_negative.shape[0] / df_action.shape[0]
p_do_not_apply_rule_positive = df_do_not_apply_rule_positive.shape[0] / df_action.shape[0]

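# Share of customers who were targeted and pre-ordered times 80, minus the share who were not targeted and pre-ordered times 100
avg_effect = p_apply_rule_positive * 80 - p_do_not_apply_rule_positive*100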
print(f"avg_effect: {avg_effect}")
# print(f"Effect of Decision Rule: {10979592*P_on_sending_email*avg_effect}")

size of apply rule with pre order: 9145
size of apply rule without pre order: 242274
size of do not apply rule with pre order: 18767
size of do not apply rule without pre order: 329814
avg_effect: -1.9084999999999999
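
Since the question refers to the treatment and preorder columns, one way to estimate the actual effect is to compare, among the targeted customers only, the mean revenue of those who happened to receive the discount email with the mean revenue of those who happened to receive the regular email. This works because the emails were assigned at random. A minimal sketch on an invented table; the `targeted` column stands in for the rule's decision and is an assumption of this example.

```python
import pandas as pd

# Made-up test rows: rule decision, actual email received, actual outcome
toy = pd.DataFrame({
    "targeted":  [1, 1, 1, 1, 1, 1, 0, 0],
    "treatment": [1, 1, 1, 0, 0, 0, 1, 0],
    "preorder":  [1, 0, 1, 1, 0, 0, 0, 1],
})

# Revenue actually collected: 80 per discounted pre-order, 100 per regular one
toy["revenue"] = toy["preorder"] * toy["treatment"].map({1: 80.0, 0: 100.0})

# Keep only customers the rule would target, then compare the two groups
selected = toy[toy["targeted"] == 1]
mean_revenue = selected.groupby("treatment")["revenue"].mean()
actual_avg_effect = mean_revenue.loc[1] - mean_revenue.loc[0]
print(actual_avg_effect)
```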


Which of the two values you provided earlier is more reliable for assessing the model's performance in decision-making? Justify your answer.

For the remaining 10,979,592, give your best estimate of the total revenue after applying discounts if Nosara were to use your decision rule to determine which customers receive discount emails versus regular emails. Enter your answer as a rounded integer (e.g., if the revenue is $1,034.32, enter 1034).

In [96]:
print(f"Effect of Decision Rule: {10979592*P_on_sending_email*avg_effect}")

Effect of Decision Rule: -113061359.20687641
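
The question asks for total revenue rather than an effect. One possible way to estimate it is to combine the share of customers the rule targets with the average revenue per email observed in the test set: treated customers among the targeted, and control customers among the non-targeted. The sketch below only shows the arithmetic; the function name and every number passed to it are hypothetical.

```python
def estimate_total_revenue(n_customers, share_targeted,
                           revenue_per_discount_email,
                           revenue_per_regular_email):
    """Scale per-email revenue averages up to the un-emailed population.

    revenue_per_discount_email: average revenue per discount email among
        targeted customers (assumed to be measured on held-out data).
    revenue_per_regular_email: average revenue per regular email among
        non-targeted customers (assumed to be measured on held-out data).
    """
    revenue_per_customer = (
        share_targeted * revenue_per_discount_email
        + (1 - share_targeted) * revenue_per_regular_email
    )
    return n_customers * revenue_per_customer


# Made-up inputs, for illustration only
total = estimate_total_revenue(
    n_customers=1000,
    share_targeted=0.4,
    revenue_per_discount_email=4.0,
    revenue_per_regular_email=3.0,
)
print(round(total))
```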


Do you believe your current data-driven solution adds business value? Justify your answer.