## Final model
We have two marketing channels:
```
    Email (cost: $5/email, with 15% success rate)
    Personal Phone Call (cost: $100/call, with 50% success rate)
```

* The final model picks, for each customer, the channel with the higher expected (average) return.

* For each FREE-tier customer, get the probability (P) that their recent activity with the system matches that of PREMIUM customers.
* Predict the MRR (M) for that FREE-tier customer.
* Compute a score for each channel: the success-weighted revenue minus the channel cost weighted by its failure rate:
    * Email: `A1 = 0.15 * P * M - 5 * 0.85 = 0.15 * P * M - 4.25`
    * Phone: `A2 = 0.5 * P * M - 100 * 0.5 = 0.5 * P * M - 50`
* If both A1 and A2 are negative, do NOT contact the customer.
* Otherwise, if A1 >= A2, use email.
* Otherwise (A1 < A2), use phone.
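The rule above can be sketched as a small function, which makes it easy to check individual customers by hand. This is a minimal sketch rather than the notebook's own code: the constant and function names (`EMAIL_COST`, `channel_scores`, `choose_channel`) are hypothetical, and the two sample inputs are the P and MRR values shown for ids 5196 and 202 in the tables below.

```python
# Channel parameters taken from the list above.
EMAIL_COST, EMAIL_SUCCESS = 5.0, 0.15
PHONE_COST, PHONE_SUCCESS = 100.0, 0.50


def channel_scores(p, m):
    """Return (A1, A2): success-weighted revenue minus failure-weighted cost."""
    a1 = p * m * EMAIL_SUCCESS - EMAIL_COST * (1 - EMAIL_SUCCESS)
    a2 = p * m * PHONE_SUCCESS - PHONE_COST * (1 - PHONE_SUCCESS)
    return a1, a2


def choose_channel(p, m):
    """Apply the decision rule: skip if both scores are negative, else take the larger."""
    a1, a2 = channel_scores(p, m)
    if a1 < 0 and a2 < 0:
        return "Ignore"
    return "email" if a1 >= a2 else "phone"


print(choose_channel(0.9, 357.786567))  # id 5196: A1 is about 44.05, A2 about 111.00
print(choose_channel(0.0, 358.460809))  # id 202: both scores are negative
```

Solving the two formulas for P * M gives the break-even points implied by this rule: email becomes non-negative at P * M of about 28.33 (4.25 / 0.15), and phone overtakes email once P * M exceeds about 130.71 (45.75 / 0.35).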


In [1]:
import pandas as pd



In [6]:
# Read the premium activity predictions
premium_activity_predictions = pd.read_csv('./activity_predictions.csv')
premium_activity_predictions = premium_activity_predictions[['id',
                                                             'prediction',
                                                             'premium_activity_probability']]
# Read the mrr predictions
mrr_predictions = pd.read_csv('./mrr_predictions.csv')
mrr_predictions = mrr_predictions[['id', 'MRR']]

In [7]:
premium_activity_predictions

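| | id | prediction | premium_activity_probability |
|---|---|---|---|
| 0 | 202 | FREE | 0.0 |
| 1 | 203 | FREE | 0.0 |
| 2 | 204 | FREE | 0.0 |
| 3 | 205 | FREE | 0.0 |
| 4 | 206 | FREE | 0.0 |
| ... | ... | ... | ... |
| 3365 | 5195 | FREE | 0.0 |
| 3366 | 5196 | PREMIUM | 0.9 |
| 3367 | 5197 | FREE | 0.0 |
| 3368 | 5199 | FREE | 0.0 |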


In [8]:
mrr_predictions

| | id | MRR |
|---|---|---|
| 0 | 202 | 358.460809 |
| 1 | 203 | 359.038732 |
| 2 | 204 | 358.268169 |
| 3 | 205 | 359.520334 |
| 4 | 206 | 363.662110 |
| ... | ... | ... |
| 3365 | 5195 | 360.290897 |
| 3366 | 5196 | 357.786567 |
| 3367 | 5197 | 357.882887 |
| 3368 | 5199 | 357.786567 |


In [10]:
# Join the two prediction tables on the customer id (inner join: keep ids present in both)
combined_df = premium_activity_predictions.merge(mrr_predictions, how="inner", on="id")
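An inner join silently drops ids that appear in only one file and multiplies rows when an id is duplicated, so it may be worth validating the join. One hint that this matters here: the channel counts reported later sum to 3,370 rows, while both displayed inputs end at index 3368, which (assuming a default 0-based index) would suggest at least one duplicated id. The sketch below uses the `validate` and `indicator` options of `DataFrame.merge` on small made-up frames; the frame names and values are hypothetical, not the notebook's data.

```python
import pandas as pd

# Hypothetical toy inputs: id 204 has no MRR row, id 205 has no activity row.
activity = pd.DataFrame({"id": [202, 203, 204],
                         "premium_activity_probability": [0.0, 0.9, 0.2]})
mrr = pd.DataFrame({"id": [202, 203, 205],
                    "MRR": [358.46, 359.04, 359.52]})

# validate="one_to_one" raises MergeError if either side has duplicate ids;
# indicator=True adds a "_merge" column saying where each row came from.
checked = activity.merge(mrr, how="outer", on="id",
                         validate="one_to_one", indicator=True)

unmatched = checked[checked["_merge"] != "both"]
combined = checked[checked["_merge"] == "both"].drop(columns="_merge")
print(unmatched[["id", "_merge"]])

# A duplicated id on one side is reported instead of silently adding rows.
duplicate_error = None
mrr_with_duplicate = pd.concat([mrr, mrr.iloc[[0]]], ignore_index=True)
try:
    activity.merge(mrr_with_duplicate, on="id", validate="one_to_one")
except pd.errors.MergeError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```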

In [11]:
combined_df.head()

| | id | prediction | premium_activity_probability | MRR |
|---|---|---|---|---|
| 0 | 202 | FREE | 0.0 | 358.460809 |
| 1 | 203 | FREE | 0.0 | 359.038732 |
| 2 | 204 | FREE | 0.0 | 358.268169 |
| 3 | 205 | FREE | 0.0 | 359.520334 |
| 4 | 206 | FREE | 0.0 | 363.66211 |


In [12]:
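# Channel score = P * MRR * success_rate - cost * failure_rate
# Email: 15% success, $5 cost   -> 5 * 0.85 = 4.25
combined_df['email_score'] = combined_df['premium_activity_probability'] * combined_df["MRR"] * 0.15 - 4.25
# Phone: 50% success, $100 cost -> 100 * 0.5 = 50
combined_df['phone_score'] = combined_df['premium_activity_probability'] * combined_df["MRR"] * 0.5 - 50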

In [19]:
combined_df["Marketing Channel"] = None
# Pick the channel with the higher score (ties go to email)
combined_df.loc[(combined_df['email_score'] >= combined_df['phone_score']),"Marketing Channel"] = "email"
combined_df.loc[(combined_df['email_score'] < combined_df['phone_score']),"Marketing Channel"] = "phone"
# Applied last so it overrides the choice above when neither channel has a non-negative score
combined_df.loc[(combined_df['email_score'] < 0) & (combined_df['phone_score'] < 0),"Marketing Channel"] = "Ignore"
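The same assignment can be written in a single step with `numpy.select`, which evaluates the conditions in order and so makes the precedence of the "Ignore" rule explicit. This is an alternative sketch on a hypothetical three-row frame (`toy`), not the notebook's code. One behavioural difference to be aware of: rows with missing scores fall through to the default here, whereas the `.loc` version leaves them as `None`.

```python
import numpy as np
import pandas as pd

# Hypothetical scores for P * M = 0, 100 and about 322.
toy = pd.DataFrame({"email_score": [-4.25, 10.75, 44.05],
                    "phone_score": [-50.0, 0.0, 111.0]})

both_negative = (toy["email_score"] < 0) & (toy["phone_score"] < 0)
email_wins = toy["email_score"] >= toy["phone_score"]

# Conditions are checked in order: "Ignore" first, then "email", else "phone".
toy["Marketing Channel"] = np.select([both_negative, email_wins],
                                     ["Ignore", "email"],
                                     default="phone")
print(toy)
```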

In [20]:
combined_df["Marketing Channel"].value_counts()

Ignore    3223
phone      117
email       30
Name: Marketing Channel, dtype: int64

In [22]:
combined_df.to_csv("./final_predictions.csv", index=False)