# Scenario: 
- Your company is about to launch a promotional campaign, and you want to segment the customers you target with it.
- Note: Giving everyone this promotion is detrimental. Customers who would have bought anyway now buy at a discount, so you cannibalize your own revenue!
- We can segment our customers into four groups:
    - **Treatment Responders**: Customers that will purchase only if they receive an offer
    - **Treatment Non-Responders**: Customers that won't purchase in any case
    - **Control Responders**: Customers that will purchase without an offer
    - **Control Non-Responders**: Customers that will not purchase if they don't receive an offer

## Our Target: 
- Based on these four customer types, we need to target treatment responders and control non-responders.
- We must avoid treatment non-responders and control responders. We will not benefit from targeting them; we'll just waste our money and time.

## Two Simple Steps
1. Predict the probabilities of being in each group for all customers (aka a multi-class classification model)
2. Calculate the uplift score 

<img src="./visuals/upliftscore.png">

**Higher score means higher uplift**
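
Written out, the score is the same combination that is applied to the predicted class probabilities later in this notebook. The symbols $P_{TR}$, $P_{CN}$, $P_{TN}$ and $P_{CR}$ are shorthand introduced here for the predicted probability of each class:

$$\text{Uplift Score} = P_{TR} + P_{CN} - P_{TN} - P_{CR}$$

Because the four probabilities sum to 1, the score always lies between -1 and 1.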

In [1]:
# let's load the important functions from previous notebooks
# function to order clusters by the mean of a target field
def order_cluster(cluster_field_name, target_field_name,df,ascending):
    new_cluster_field_name = 'new_' + cluster_field_name
    df_new = df.groupby(cluster_field_name)[target_field_name].mean().reset_index()
    df_new = df_new.sort_values(by=target_field_name,ascending=ascending).reset_index(drop=True)
    df_new['index'] = df_new.index
    df_final = pd.merge(df,df_new[[cluster_field_name,'index']], on=cluster_field_name)
    df_final = df_final.drop([cluster_field_name],axis=1)
    df_final = df_final.rename(columns={"index":cluster_field_name})
    return df_final


#function for calculating the uplift
def calc_uplift(df):
    avg_order_value = 25
    
    #calculate conversions for each offer type
    base_conv = df[df.offer == 'No Offer']['conversion'].mean()
    disc_conv = df[df.offer == 'Discount']['conversion'].mean()
    bogo_conv = df[df.offer == 'Buy One Get One']['conversion'].mean()
    
    #calculate conversion uplift for discount and bogo
    disc_conv_uplift = disc_conv - base_conv
    bogo_conv_uplift = bogo_conv - base_conv
    
    #calculate order uplift
    disc_order_uplift = disc_conv_uplift * len(df[df.offer == 'Discount']['conversion'])
    bogo_order_uplift = bogo_conv_uplift * len(df[df.offer == 'Buy One Get One']['conversion'])
    
    #calculate revenue uplift
    disc_rev_uplift = disc_order_uplift * avg_order_value
    bogo_rev_uplift = bogo_order_uplift * avg_order_value
    
    
    print('Discount Conversion Uplift: {0}%'.format(np.round(disc_conv_uplift*100,2)))
    print('Discount Order Uplift: {0}'.format(np.round(disc_order_uplift,2)))
    print('Discount Revenue Uplift: ${0}\n'.format(np.round(disc_rev_uplift,2)))
    
    if len(df[df.offer == 'Buy One Get One']['conversion']) > 0:
          
        print('-------------- \n')
        print('BOGO Conversion Uplift: {0}%'.format(np.round(bogo_conv_uplift*100,2)))
        print('BOGO Order Uplift: {0}'.format(np.round(bogo_order_uplift,2)))
        print('BOGO Revenue Uplift: ${0}'.format(np.round(bogo_rev_uplift,2)))    

## Importing our data
- We keep our earlier assumption that the average order value is 25 dollars

In [5]:
import pandas as pd
import numpy as np

In [3]:
df_data = pd.read_csv('./datasets/data.csv')
df_data.head()

Unnamed: 0,recency,history,used_discount,used_bogo,zip_code,is_new,channel,offer,conversion
0,10,142.44,1,0,Surburban,0,Phone,Buy One Get One,0
1,6,329.08,1,1,Rural,1,Web,No Offer,0
2,7,180.65,0,1,Surburban,1,Web,Buy One Get One,0
3,9,675.83,1,0,Rural,1,Web,Discount,0
4,2,45.34,1,0,Urban,0,Web,Buy One Get One,0


In [6]:
# let's calculate the uplift
calc_uplift(df_data)

Discount Conversion Uplift: 7.66%
Discount Order Uplift: 1631.89
Discount Revenue Uplift: $40797.35

-------------- 

BOGO Conversion Uplift: 4.52%
BOGO Order Uplift: 967.4
BOGO Revenue Uplift: $24185.01
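
The three figures for each campaign are linked by simple arithmetic, which the sketch below retraces from the printed numbers. The group sizes (21,307 and 21,387) are taken from the offer counts shown later in this notebook; the `campaigns` dictionary and `summary` are illustrative names, not part of the original code.

```python
# Average order value assumed throughout the notebook (see calc_uplift).
AVG_ORDER_VALUE = 25

# Order uplift as printed above; treated-group sizes from the later offer counts.
campaigns = {
    "Discount": {"order_uplift": 1631.89, "treated": 21307},
    "BOGO": {"order_uplift": 967.40, "treated": 21387},
}

summary = {}
for name, c in campaigns.items():
    # Order uplift = conversion uplift * number of treated customers,
    # so dividing recovers the conversion uplift as a fraction.
    conv_uplift = c["order_uplift"] / c["treated"]
    # Revenue uplift = order uplift * average order value.
    rev_uplift = c["order_uplift"] * AVG_ORDER_VALUE
    summary[name] = (conv_uplift, rev_uplift)
    print(f"{name}: conversion uplift {conv_uplift:.2%}, revenue uplift ${rev_uplift:,.2f}")
```

Small differences from the printed revenue figures come from rounding the order uplift to two decimals before multiplying.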


## Feature Engineering
- We need to create four classes: TR, TN, CR, CN
- Customers who received a Discount or BOGO offer are the treatment group; the rest are the control group
- We will create a campaign group column to make this information explicit

In [11]:
#lets first assume all customers are in the treatment group 
df_data['campaign_group'] = 'treatment'

#for rows where the offer column is no offer, change the campaign group to control 
df_data.loc[df_data['offer'] == 'No Offer', 'campaign_group'] = 'control'

### Let's look back at our terminology 
 - **Treatment Responders**: Customers that will purchase only if they receive an offer
 - **Treatment Non-Responders**: Customers that won't purchase in any case
 - **Control Responders**: Customers that will purchase without an offer
 - **Control Non-Responders**: Customers that will not purchase if they don't receive an offer

In [13]:
# now let's create our labels

# default: customers that didn't purchase when they didn't get the offer
df_data['target_class'] = 0 #CN

# customers that didn't receive an offer but purchased an item regardless
df_data.loc[(df_data['campaign_group'] == 'control') & (df_data['conversion'] > 0),'target_class'] = 1 #CR

# customers who received the promotional offer but didn't purchase
df_data.loc[(df_data['campaign_group'] == 'treatment') & (df_data['conversion'] == 0),'target_class'] = 2 #TN

# customers who received an offer and purchased an item
df_data.loc[(df_data['campaign_group'] == 'treatment') & (df_data['conversion'] > 0),'target_class'] = 3 #TR

In [14]:
df_data['target_class'].value_counts()

2    35562
0    19044
3     7132
1     2262
Name: target_class, dtype: int64
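
As a quick consistency check, the class counts above should add up to the group sizes implied by the offer counts reported later in this notebook. The sketch below hard-codes both sets of numbers; the variable names are illustrative.

```python
# Class counts from value_counts() above: 0=CN, 1=CR, 2=TN, 3=TR.
class_counts = {0: 19044, 1: 2262, 2: 35562, 3: 7132}

# Offer counts as reported later in the notebook.
offer_counts = {"Buy One Get One": 21387, "Discount": 21307, "No Offer": 21306}

control_total = class_counts[0] + class_counts[1]
treatment_total = class_counts[2] + class_counts[3]

# Control customers are exactly the "No Offer" rows; treatment is everyone else.
assert control_total == offer_counts["No Offer"]
assert treatment_total == offer_counts["Buy One Get One"] + offer_counts["Discount"]

# Observed conversion rate in each group.
control_rate = class_counts[1] / control_total
treatment_rate = class_counts[3] / treatment_total
print(f"control: {control_rate:.2%}, treatment: {treatment_rate:.2%}")
```

The gap between the two rates is the pooled conversion uplift across both offers, which sits between the separate Discount and BOGO uplifts printed earlier.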

### Our Target Labels 
The classes map to the following labels:
    - 0 -> Control Non-Responders
    - 1 -> Control Responders
    - 2 -> Treatment Non-Responders
    - 3 -> Treatment Responders
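
The numbering is not arbitrary: each code can be read as `2 * treated + responded`. The helper below is a hypothetical illustration of that encoding (the name `target_class` as a function and the `LABELS` dictionary are not part of the original notebook).

```python
# Short labels for the four class codes used in this notebook.
LABELS = {0: "CN", 1: "CR", 2: "TN", 3: "TR"}


def target_class(campaign_group: str, conversion: int) -> int:
    """Encode (group, outcome) as 2 * treated + responded."""
    is_treatment = int(campaign_group == "treatment")
    responded = int(conversion > 0)
    return 2 * is_treatment + responded


for group in ("control", "treatment"):
    for conversion in (0, 1):
        code = target_class(group, conversion)
        print(group, conversion, "->", code, LABELS[code])
```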

## More Feature Engineering
- Clusters for history

In [15]:
from sklearn.cluster import KMeans

In [16]:
#instantiating and creating clusters for history
kmeans = KMeans(n_clusters=5)
kmeans.fit(df_data[['history']])
df_data['history_cluster'] = kmeans.predict(df_data[['history']])

#order the clusters 
df_data = order_cluster('history_cluster', 'history', df_data, True)

In [17]:
df_data.head()

Unnamed: 0,recency,history,used_discount,used_bogo,zip_code,is_new,channel,offer,conversion,campaign_group,target_class,history_cluster
0,10,142.44,1,0,Surburban,0,Phone,Buy One Get One,0,treatment,2,0
1,2,45.34,1,0,Urban,0,Web,Buy One Get One,0,treatment,2,0
2,6,134.83,0,1,Surburban,0,Phone,Buy One Get One,1,treatment,3,0
3,9,46.42,0,1,Urban,0,Phone,Buy One Get One,0,treatment,2,0
4,10,32.84,0,1,Urban,1,Web,Buy One Get One,0,treatment,2,0
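
KMeans assigns cluster ids in no particular order, which is why the clusters are relabelled so that a higher id means a higher average `history`. A minimal sketch of that relabelling idea on made-up data is shown below; it uses `map` instead of the merge in `order_cluster`, and the `toy` frame and its values are assumptions for illustration only.

```python
import pandas as pd

# Toy data: cluster ids are arbitrary, as KMeans would assign them.
toy = pd.DataFrame({
    "history": [500.0, 520.0, 30.0, 40.0, 200.0, 210.0],
    "history_cluster": [0, 0, 1, 1, 2, 2],
})

# Mean history per cluster, then rank the means in ascending order.
cluster_means = toy.groupby("history_cluster")["history"].mean()
new_ids = cluster_means.rank(method="first").astype(int) - 1

# Replace each old id with its rank, so 0 is the lowest-spending cluster.
toy["history_cluster"] = toy["history_cluster"].map(new_ids)
print(toy)
```

Mapping in place keeps the rows in their original order, whereas the rows in the output above appear in a different order than in the first `df_data.head()`.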


In [18]:
# now let's copy the df and drop the features that helped us create the target class
df_model = df_data.drop(['offer', 'conversion', 'campaign_group'], axis = 1)

In [20]:
# use the new df to convert categorical variables to dummy variables
df_model = pd.get_dummies(df_model)
df_model.head()

Unnamed: 0,recency,history,used_discount,used_bogo,is_new,target_class,history_cluster,zip_code_Rural,zip_code_Surburban,zip_code_Urban,channel_Multichannel,channel_Phone,channel_Web
0,10,142.44,1,0,0,2,0,0,1,0,0,1,0
1,2,45.34,1,0,0,2,0,0,0,1,0,0,1
2,6,134.83,0,1,0,3,0,0,1,0,0,1,0
3,9,46.42,0,1,0,2,0,0,0,1,0,1,0
4,10,32.84,0,1,1,2,0,0,0,1,0,0,1


In [21]:
#sanity
df_model.isnull().sum().sum()

0

## Model Prep/Model Time!!!

In [22]:
#create X and y
X = df_model.drop(['target_class'], axis = 1)
y = df_model[['target_class']]

In [23]:
from sklearn.model_selection import train_test_split

In [24]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 56)

In [25]:
from xgboost import XGBClassifier

In [26]:
# fit the model so we can predict the class probabilities
xgb_model = XGBClassifier().fit(X_train, y_train)

  y = column_or_1d(y, warn=True)
  y = column_or_1d(y, warn=True)


In [27]:
#predicting probabilities 
class_probs = xgb_model.predict_proba(X_test)

In [28]:
# let's see how this looks
# checking the probabilities for the first test customer
class_probs[0]

array([0.33018395, 0.01280369, 0.597153  , 0.05985933], dtype=float32)

**Let's look at our target class**
The classes map to the following labels:
    - 0 -> Control Non-Responders
    - 1 -> Control Responders
    - 2 -> Treatment Non-Responders
    - 3 -> Treatment Responders

## Calculating Uplift Score for Customer 1
Using the probabilities from `class_probs[0]` above (rounded):
- CN: 33.0%
- CR: 1.3%
- TN: 59.7%
- TR: 6.0%

**Thus**, our uplift score is 
$$0.330 + 0.060 - 0.013 - 0.597 = -0.22$$
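
The same calculation can be checked directly against the probabilities printed for `class_probs[0]`. This is a small sketch with the values copied by hand; the names `labels`, `probs` and `p` are illustrative.

```python
# Class order follows the target mapping: 0=CN, 1=CR, 2=TN, 3=TR.
labels = ["CN", "CR", "TN", "TR"]

# Values copied from the class_probs[0] output above.
probs = [0.33018395, 0.01280369, 0.597153, 0.05985933]
p = dict(zip(labels, probs))

# Uplift score: reward CN and TR, penalise TN and CR.
uplift_score = p["CN"] + p["TR"] - p["TN"] - p["CR"]
print(round(uplift_score, 4))

# Since the four probabilities sum to 1, the score equals 2 * (CN + TR) - 1.
print(round(2 * (p["CN"] + p["TR"]) - 1, 4))
```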

## Let's Apply this to all of our customers 

In [30]:
df_model.head()

Unnamed: 0,recency,history,used_discount,used_bogo,is_new,target_class,history_cluster,zip_code_Rural,zip_code_Surburban,zip_code_Urban,channel_Multichannel,channel_Phone,channel_Web
0,10,142.44,1,0,0,2,0,0,1,0,0,1,0
1,2,45.34,1,0,0,2,0,0,0,1,0,0,1
2,6,134.83,0,1,0,3,0,0,1,0,0,1,0
3,9,46.42,0,1,0,2,0,0,0,1,0,1,0
4,10,32.84,0,1,1,2,0,0,0,1,0,0,1


In [32]:
#predicting for ALL customers 
overall_proba = xgb_model.predict_proba(df_model.drop(['target_class'], axis = 1))

In [33]:
overall_proba

array([[0.31995273, 0.02575007, 0.57639676, 0.07790043],
       [0.28865084, 0.05036366, 0.5243716 , 0.13661386],
       [0.29879466, 0.03615357, 0.5490475 , 0.11600427],
       ...,
       [0.23988946, 0.03793477, 0.5920978 , 0.13007794],
       [0.28525636, 0.08949278, 0.40827736, 0.21697351],
       [0.27095157, 0.05023812, 0.4534411 , 0.22536916]], dtype=float32)

In [34]:
overall_proba.shape

(64000, 4)

In [36]:
#assign probabilities to 4 different columns
df_model['proba_CN'] = overall_proba[:,0] 
df_model['proba_CR'] = overall_proba[:,1] 
df_model['proba_TN'] = overall_proba[:,2] 
df_model['proba_TR'] = overall_proba[:,3]

In [37]:
df_model.head()

Unnamed: 0,recency,history,used_discount,used_bogo,is_new,target_class,history_cluster,zip_code_Rural,zip_code_Surburban,zip_code_Urban,channel_Multichannel,channel_Phone,channel_Web,proba_CN,proba_CR,proba_TN,proba_TR
0,10,142.44,1,0,0,2,0,0,1,0,0,1,0,0.319953,0.02575,0.576397,0.0779
1,2,45.34,1,0,0,2,0,0,0,1,0,0,1,0.288651,0.050364,0.524372,0.136614
2,6,134.83,0,1,0,3,0,0,1,0,0,1,0,0.298795,0.036154,0.549048,0.116004
3,9,46.42,0,1,0,2,0,0,0,1,0,1,0,0.300693,0.024607,0.564924,0.109776
4,10,32.84,0,1,1,2,0,0,0,1,0,0,1,0.308458,0.016537,0.597214,0.077791


In [38]:
# calculate the uplift score for all customers
df_model['uplift_score'] = df_model.eval('proba_CN + proba_TR - proba_TN - proba_CR')


In [39]:
#assign it back to main dataframe
df_data['uplift_score'] = df_model['uplift_score']

In [40]:
df_data.head()

Unnamed: 0,recency,history,used_discount,used_bogo,zip_code,is_new,channel,offer,conversion,campaign_group,target_class,history_cluster,uplift_score
0,10,142.44,1,0,Surburban,0,Phone,Buy One Get One,0,treatment,2,0,-0.204294
1,2,45.34,1,0,Urban,0,Web,Buy One Get One,0,treatment,2,0,-0.149471
2,6,134.83,0,1,Surburban,0,Phone,Buy One Get One,1,treatment,3,0,-0.170402
3,9,46.42,0,1,Urban,0,Phone,Buy One Get One,0,treatment,2,0,-0.179062
4,10,32.84,0,1,Urban,1,Web,Buy One Get One,0,treatment,2,0,-0.227502


## Checkpoint
- Is the model really working?
- Let's see if we can use this model in real life.

## Model Evaluation 
1. High Uplift Score: customers whose uplift score is above the 3rd quartile (75th percentile)
2. Low Uplift Score: customers whose uplift score is below the 2nd quartile (the median)

To see whether our model can make our actions more efficient, we are going to compare:
- Conversion uplift
- Revenue uplift per targeted customer
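
The two groups are defined by cut-offs on the score distribution. A minimal sketch of the split on made-up scores follows; the `scores` values are invented for illustration, and the cut-offs use `Series.quantile` in the same way as the cells further down.

```python
import pandas as pd

# Hypothetical uplift scores for eight customers.
scores = pd.Series([-0.30, -0.25, -0.20, -0.15, -0.10, -0.05, 0.00, 0.05])

q50 = scores.quantile(0.50)  # 2nd quartile (median)
q75 = scores.quantile(0.75)  # 3rd quartile

high = scores[scores > q75]  # high-uplift group: top quarter of scores
low = scores[scores < q50]   # low-uplift group: bottom half of scores

print(f"cut-offs: q50={q50:.4f}, q75={q75:.4f}")
print(f"high group size: {len(high)}, low group size: {len(low)}")
```

Note that the two groups are not complements: customers between the median and the 3rd quartile fall into neither.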

In [41]:
calc_uplift(df_data)

Discount Conversion Uplift: 7.66%
Discount Order Uplift: 1631.89
Discount Revenue Uplift: $40797.35

-------------- 

BOGO Conversion Uplift: 4.52%
BOGO Order Uplift: 967.4
BOGO Revenue Uplift: $24185.01


In [45]:
df_data['offer'].value_counts()

Buy One Get One    21387
Discount           21307
No Offer           21306
Name: offer, dtype: int64

## Benchmarks 
1. Discount Campaign
    - Total Discount Customer Count: 21,307
    - Discount Conversion Uplift: 7.66%
    - Discount Order Uplift: 1631.89
    - Revenue Uplift Per Targeted Customer: 1.91
2. BOGO Campaign 
    - Total BOGO Customer Count: 21,387
    - BOGO Conversion Uplift: 4.52%
    - BOGO Order Uplift: 967.4
    - Revenue Uplift Per Targeted Customer: 1.13
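
The per-customer benchmark is the revenue uplift divided by the number of customers who received that offer. The sketch below recomputes it from the figures listed above; `benchmarks` and `per_customer` are illustrative names.

```python
# (revenue uplift in dollars, number of targeted customers) per campaign,
# copied from the calc_uplift output and the offer counts above.
benchmarks = {
    "Discount": (40797.35, 21307),
    "BOGO": (24185.01, 21387),
}

# Revenue uplift per targeted customer for each campaign.
per_customer = {name: rev / n for name, (rev, n) in benchmarks.items()}

for name, value in per_customer.items():
    print(f"{name}: ${value:.2f} per targeted customer")
```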

## Calculating Uplift for Discount Campaign of High Uplift Scores 

In [48]:
df_data_lift = df_data.copy()
uplift_q_75 = df_data_lift['uplift_score'].quantile(0.75) 
df_data_lift = df_data_lift[(df_data_lift['offer'] != 'Buy One Get One') & (df_data_lift['uplift_score'] > uplift_q_75)].reset_index(drop=True)

In [49]:
df_data_lift['offer'].value_counts()

No Offer    5599
Discount    5269
Name: offer, dtype: int64

In [50]:
#calculate the uplift
calc_uplift(df_data_lift)

Discount Conversion Uplift: 12.55%
Discount Order Uplift: 661.51
Discount Revenue Uplift: $16537.67



In [52]:
len(df_data_lift[df_data_lift['offer'] == 'Discount'])

5269

In [53]:
16537/5269

3.13854621370279

## Discount Campaign of High Uplift Scores (greater than 3rd quartile)
- Discount Conversion Uplift: 12.55%
- Discount Order Uplift: 661.51
- Discount Revenue Uplift: 16,537.67
- Revenue Uplift Per Targeted Customer: 3.14

### Performance
- Revenue uplift per targeted customer increased by about 64%
    - 1.91 to 3.14
- Approximately 25% of the discount group contributes about 41% of the discount revenue uplift
    - 5269/21,307 = .25
    - 16537/40797 = .41
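
These performance figures can be recomputed from the reported values. The sketch below does so with unrounded inputs, so the last digit may differ slightly from the hand-rounded numbers in the text; all variable names are illustrative.

```python
# Full discount campaign (benchmark) and the high-uplift subset.
baseline_rev, baseline_n = 40797.35, 21307
high_rev, high_n = 16537.67, 5269

baseline_per = baseline_rev / baseline_n   # revenue uplift per customer, all discount customers
high_per = high_rev / high_n               # revenue uplift per customer, high-uplift customers

relative_gain = high_per / baseline_per - 1   # relative change versus the benchmark
share_customers = high_n / baseline_n         # share of discount customers targeted
share_revenue = high_rev / baseline_rev       # share of revenue uplift they account for

print(f"per customer: {baseline_per:.2f} -> {high_per:.2f} ({relative_gain:+.1%})")
print(f"{share_customers:.1%} of customers, {share_revenue:.1%} of revenue uplift")
```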

## Calculating Uplift for Discount Campaign of Low Uplift Scores (less than 2nd quartile)

In [54]:
df_data_lift = df_data.copy()
uplift_q_5 = df_data_lift.uplift_score.quantile(0.5)
df_data_lift = df_data_lift[(df_data_lift.offer != 'Buy One Get One') & (df_data_lift.uplift_score < uplift_q_5)].reset_index(drop=True)

#calculate the uplift
calc_uplift(df_data_lift)

Discount Conversion Uplift: 5.45%
Discount Order Uplift: 588.78
Discount Revenue Uplift: $14719.42



In [55]:
len(df_data_lift[df_data_lift['offer'] == 'Discount'])

10812

In [56]:
14719.42/10812

1.3613965963743988

## Discount Campaign of Low Uplift Scores (less than 2nd quartile)
- Discount Conversion Uplift: 5.45%
- Discount Order Uplift: 588.78
- Discount Revenue Uplift: 14,719.42
- Revenue Uplift Per Targeted Customer: 1.36

### Performance
- Revenue uplift per targeted customer decreased by about 29%
    - 1.91 to 1.36
- Approximately 51% of the discount group contributes only 36% of the discount revenue uplift
    - 10812/21,307 = .51
    - 14,719/40797 = .36

In [57]:
10812/21307

0.5074388698549772

In [59]:
14719/40797

0.3607863323283575