# Coupon redemption classification model
**We know from the dataset that 70% of customers never use the coupons they receive, which wastes money and time for the company.
My objective here is to build a classification model that predicts whether a customer will redeem their coupons during the last 5 campaigns of the year. Beyond knowing which customers will redeem their coupons, it can be even more useful for a company to identify which customers won't, so that it can either choose different marketing and communication actions to reach them or stop sending them coupons altogether and save money.**

### Table of Contents

* [1. Dataset Creation](#section_1)
    * [A. Demographic data](#section_1_1)
    * [B. Campaign data](#section_1_2)
    * [C. Coupon redemption data](#section_1_3)
    * [D. Transaction data](#section_1_4)  
    ___
* [2. Data Exploration](#section_2)
    * [A. Shape of our data](#section_2_1)
    * [B. Data types and data completeness](#section_2_2)
    * [C. Statistical summary](#section_2_3)
    * [D. Class distribution](#section_2_4)
    * [E. Variable encoding and split](#section_2_5)
    * [F. Skew of univariate distribution](#section_2_6)
    * [G. Correlation between attributes](#section_2_7)
    ---
* [3. Data Preprocessing](#section_3)
    * [A. Split into train test](#section_3_1)
    * [B. Data transformation](#section_3_2)
    * [C. Feature selection](#section_3_3)
   ___
* [4. Model Creation](#section_4)
    * [A. Baseline model](#section_4_1)
    * [B. Model hyperparameter tuning](#section_4_2)
    ___
* [5. Model Results](#section_5)
    * [A. Accuracy and Precision scores](#section_5_1)
    * [B. Learning curve](#section_5_2)
    * [C. Confusion matrix](#section_5_3)
    ---

# 1. Dataset Creation <a class="anchor" id="section_1"></a>

In [None]:
import warnings
import numpy as np
import pandas as pd
import sklearn as sk
import matplotlib
import seaborn as sns
import matplotlib.pyplot as plt
import collections
from collections import Counter
from sklearn.decomposition import PCA
%matplotlib inline
warnings.filterwarnings('ignore')

import xgboost as xgb
from xgboost import XGBClassifier
from scipy import stats
from sklearn import preprocessing
from sklearn.preprocessing import PowerTransformer, LabelEncoder
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score, train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, classification_report

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
data_folder = "/kaggle/input/dunnhumby-the-complete-journey/"

### Demographics data <a class="anchor" id="section_1_1"></a>
This table contains demographic information about __802 regular customers__. Let's load the data and start an exploratory data analysis.

In [None]:
df = dict()
df["hh_demographic"] = pd.read_csv(data_folder + "hh_demographic.csv")
demographic=df["hh_demographic"]
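#Map marital status codes to labels (A = Married, B = Single, U = Unknown in the dunnhumby data dictionary)
demographic["MARITAL_STATUS_CODE"] = demographic["MARITAL_STATUS_CODE"].replace({'A': 'Married', 'B': 'Single', 'U': 'Unknown'})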

demographic

### Campaign data <a class="anchor" id="section_1_2"></a>
The tables campaign_table and campaign_desc contain all the campaign information. As we want to predict which customers won't redeem their coupons for the __next 5 planned campaigns__, we will build our model on all the previous campaigns. 
We can see that the last five campaigns are campaigns number __21, 22, 23, 24 and 25__.

In [None]:
df["campaign_desc"] = pd.read_csv(data_folder+"campaign_desc.csv")
campaign_desc=df["campaign_desc"]
#Sort campaign by start date
campaign_desc=campaign_desc.sort_values(by=['START_DAY','CAMPAIGN'],ascending=True)
campaign_desc

In [None]:
#Exclude the last five campaigns by keeping only campaigns that start before day 615. Campaign 20 is not considered either
campaign_desc = campaign_desc[campaign_desc['START_DAY']<615]

The table campaign_table tells us which customers received a specific campaign.

In [None]:
df["campaign_table"] = pd.read_csv(data_folder+"campaign_table.csv")
campaign_table=df["campaign_table"]
campaign_table.head(10)

From this point we can merge the two campaign tables

In [None]:
#We call campaign the new dataframe merging the dataset
campaign = pd.merge(campaign_desc[['CAMPAIGN','START_DAY']],campaign_table[['household_key','CAMPAIGN']],on="CAMPAIGN",how="left")
#Count number of campaign per household
campaign['#campaign']=campaign.groupby(by='household_key')['CAMPAIGN'].transform('count')
#Delete useless column
campaign=campaign.drop(columns=['CAMPAIGN','START_DAY'])
#Delete duplicates
campaign.drop_duplicates(subset=['household_key', '#campaign'], keep="first", inplace=True)
campaign

### Coupon redemption data <a class="anchor" id="section_1_3"></a>
This table lists all the coupons redeemed by each customer. We will use it to count the number of campaigns in which a customer redeemed at least one coupon.<br>
We will follow the steps below:
- Load the dataset
- Keep the coupons redeemed before day 615
- Count the number of campaigns for which at least one coupon was redeemed
- Merge the campaign table, the coupon redemption table and the demographic table
- Define whether a customer is sensitive to coupons or not
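
The steps above can be sketched on toy data. This is a minimal illustration, not the notebook's actual tables: `toy_campaign`, `toy_redempt` and all their values are made up, and only the column names and the day-615 cutoff come from the notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real tables (all values are made up for illustration).
toy_campaign = pd.DataFrame({"household_key": [1, 2, 3]})
toy_redempt = pd.DataFrame({
    "household_key": [1, 1, 1, 2],
    "DAY":           [300, 310, 400, 700],   # household 2 redeemed after the cutoff
    "CAMPAIGN":      [8, 8, 13, 22],
})

CUTOFF_DAY = 615  # same cutoff as in the notebook

# Keep redemptions before the cutoff, then count distinct campaigns per household.
before_cutoff = toy_redempt[toy_redempt["DAY"] < CUTOFF_DAY]
redeemed = (
    before_cutoff.groupby("household_key")["CAMPAIGN"]
    .nunique()
    .rename("redeemed")
    .reset_index()
)

# The left join keeps households that never redeemed; their count is NaN.
toy_labels = toy_campaign.merge(redeemed, on="household_key", how="left")
toy_labels["Sensitivity"] = np.where(toy_labels["redeemed"] > 0, "Sensible", "Not sensible")
print(toy_labels)
```

Because `NaN > 0` evaluates to `False`, households without any redemption before the cutoff fall into the `Not sensible` class without an explicit `fillna`.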

In [None]:
#Read the coupon_redempt table
df["coupon_redempt"] = pd.read_csv(data_folder+"coupon_redempt.csv")
coupon_redempt=df["coupon_redempt"]
#Keep only coupon redeemed before DAY 615
coupon_redempt=coupon_redempt[coupon_redempt['DAY']<615]
#Drop useless columns
coupon_redempt=coupon_redempt.drop(columns=['DAY','COUPON_UPC'])
#Keep only one occurrence of a redeemed coupon per campaign
coupon_redempt.drop_duplicates(subset=['household_key', 'CAMPAIGN'], keep="first", inplace=True)
#Count the number of campaigns in which the customer redeemed at least one coupon
#(dict-based renaming inside agg() is rejected by recent pandas versions, so we rename explicitly)
redemption_per_household=coupon_redempt.groupby(['household_key'], as_index=False)['CAMPAIGN'].nunique().rename(columns={'CAMPAIGN': 'redeemed'})
redemption_per_household

In [None]:
#Merging of campaign and coupon redemption tables
temp = pd.merge(campaign, redemption_per_household, on=['household_key'],how="left")
#Creation of our output variable
temp["Sensitivity"]= np.where(temp["redeemed"]>0, 'Sensible', 'Not sensible')
#Creation of our aggregated dataset. We use an inner join to keep only customers for whom we have demographic data and who were part of at least one campaign
dataset= pd.merge(demographic, temp[['household_key','Sensitivity']], on=['household_key'],how="inner")

### Transactions data <a class="anchor" id="section_1_4"></a>
This table contains all the transactions made by customers over 2 years. It contains 12 columns and 2,595,732 rows.
We will use the transaction dataset to create some features for our model:
- Total sales between day 1 and day 615
- Total number of visits in the shops
- Median basket spend by customer
- Average product price purchased by customer

To clean the dataset, we will follow the steps below:
- Load the data
- Keep sales before day 615
- Exclude transactions with a sales value or quantity less than or equal to 0
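
As a small illustration of the cleaning rule and the four features listed above, here is a sketch on a hypothetical `toy_tx` table. The values are invented; only the column names and the feature names follow the notebook.

```python
import pandas as pd

# Toy transactions (hypothetical values): one row per product line in a basket.
toy_tx = pd.DataFrame({
    "household_key": [1, 1, 1, 1, 2],
    "BASKET_ID":     [10, 10, 11, 12, 20],
    "SALES_VALUE":   [2.0, 4.0, 10.0, 0.0, 5.0],   # the 0.0 row is filtered out
    "QUANTITY":      [1, 2, 1, 0, 1],
})

clean = toy_tx[(toy_tx["SALES_VALUE"] > 0) & (toy_tx["QUANTITY"] > 0)]

# Basket totals are needed first, because the median is taken over baskets.
basket_totals = clean.groupby(["household_key", "BASKET_ID"])["SALES_VALUE"].sum()

toy_features = pd.DataFrame({
    "Total_sales":   clean.groupby("household_key")["SALES_VALUE"].sum(),
    "total_visits":  clean.groupby("household_key")["BASKET_ID"].nunique(),
    "median_basket": basket_totals.groupby(level="household_key").median(),
    "avg_price":     clean.groupby("household_key")["SALES_VALUE"].mean(),
})
print(toy_features)
```

Note that `avg_price` is the mean sales value per transaction line, as in the notebook; it is not divided by `QUANTITY`, so it is only a proxy for a unit price.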

In [None]:
# load the dataset
df["transaction_data"] = pd.read_csv(data_folder+"transaction_data.csv")
transaction=df["transaction_data"]
#Keep transaction before day 615
transaction=transaction[transaction['DAY']<615]
#Exclude transactions related to returns
transaction=transaction[transaction['SALES_VALUE']>0]
transaction=transaction[transaction['QUANTITY']>0]
transaction.head(20)

__Feature creation and table merging__

In [None]:
#Calculate total sales per customer
total_sales=transaction.groupby(by='household_key', as_index=False)['SALES_VALUE'].sum().rename(columns={'SALES_VALUE': 'Total_sales'})
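#Calculate total number of visits per customer
#(dict-based renaming inside agg() is rejected by recent pandas versions, so we rename explicitly)
total_visits=transaction.groupby(['household_key'], as_index=False)['BASKET_ID'].nunique().rename(columns={'BASKET_ID': 'total_visits'})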
#Calculate median basket amount per customer
temp_basket=transaction.groupby(['household_key','BASKET_ID'], as_index=False)['SALES_VALUE'].sum()
temp_median_basket=temp_basket.groupby(['household_key'], as_index=False)['SALES_VALUE'].median().rename(columns={'SALES_VALUE': 'median_basket'})
#Calculate average product price bought per customer
temp_product=transaction.groupby(['household_key'], as_index=False)['SALES_VALUE'].mean().rename(columns={'SALES_VALUE': 'avg_price'})
dataset=dataset.merge(total_sales,on='household_key').merge(total_visits,on='household_key').merge(temp_median_basket,on='household_key').merge(temp_product,on='household_key')
dataset=dataset.drop(columns=['household_key'])
dataset

# 2. Data exploration <a class="anchor" id="section_2"></a>
Now that our dataset is ready, let's explore it with descriptive statistics.

### A. Shape of our data <a class="anchor" id="section_2_1"></a>
Let's use the __*shape*__ attribute to get the dimensions of our dataset:

In [None]:
print(dataset.shape)

We have __12 variables__ and __751 observations__ in our dataset

### B. Data types and data completeness <a class="anchor" id="section_2_2"></a>
Let's use the __*info()*__ method to get more information about our dataframe, such as variable types and data completeness:

In [None]:
print(dataset.info())

We have __8 categorical variables__ and __4 numerical variables__.  
We don't have any missing values among our 751 observations.

### C. Statistical Summary <a class="anchor" id="section_2_3"></a>
Let's use the __*describe()*__ function to have a basic description of our dataset. It will enable us to have various summary statistics :

In [None]:
pd.options.display.float_format = "{:.2f}".format
dataset.describe()

In [None]:
categorical_vars = ['AGE_DESC','MARITAL_STATUS_CODE','INCOME_DESC','HOMEOWNER_DESC','HH_COMP_DESC','KID_CATEGORY_DESC']
num_plots = len(categorical_vars)
total_cols = 2
total_rows = num_plots//total_cols
fig, axs = plt.subplots(nrows=total_rows, ncols=total_cols,
                        figsize=(7*total_cols, 7*total_rows), constrained_layout=True)
for i, var in enumerate(categorical_vars):
    row = i//total_cols
    pos = i % total_cols    
    plot = sns.countplot(x=var, data=dataset, ax=axs[row][pos],hue='Sensitivity',palette="Set1")

In [None]:
categorical_vars = ['AGE_DESC','MARITAL_STATUS_CODE','INCOME_DESC','HOMEOWNER_DESC','HH_COMP_DESC','KID_CATEGORY_DESC']
num_plots = len(categorical_vars)
total_cols = 2
total_rows = num_plots//total_cols
fig, axs = plt.subplots(nrows=total_rows, ncols=total_cols,
                        figsize=(7*total_cols, 7*total_rows), constrained_layout=True)
for i, var in enumerate(categorical_vars):
    row = i//total_cols
    pos = i % total_cols
    plot = sns.barplot(x=var, y='median_basket',data=dataset, ax=axs[row][pos],palette="Set1",estimator=np.median)

We can see the median basket is the __highest__ for :
- Married households
- Households with 3 or more kids
- Households with an income of 250K+

### D. Class distribution <a class="anchor" id="section_2_4"></a>
On classification problems, analyzing the class distribution is always an important step as highly imbalanced data are common and need special treatment. Let's check if data are imbalanced :

In [None]:
target = dataset['Sensitivity']
counter = Counter(target)
for k,v in counter.items():
    per = v / len(target) * 100
    print('Class=%s, Count=%d, Percentage=%.2f%%' % (k, v, per))

__62%__ of our customers are not sensitive to coupons. Hence our dataset is slightly imbalanced, with more customers who are not sensitive to coupons.<br> We will keep that in mind when choosing our cross-validation technique.
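
To see why the class balance matters for cross-validation, the sketch below checks that stratified folds keep roughly the same share of positives in every fold. The labels are synthetic (62 negatives, 38 positives, mirroring the proportions above), and `X_toy` is a placeholder feature matrix.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic labels with roughly the notebook's class balance (62% / 38%).
y_toy = np.array([0] * 62 + [1] * 38)
X_toy = np.zeros((len(y_toy), 1))  # features are irrelevant for the split itself

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=7)
fold_positive_rates = [y_toy[test_idx].mean() for _, test_idx in skf.split(X_toy, y_toy)]
print(fold_positive_rates)
```

With a plain shuffled split, a fold could end up with noticeably fewer positives by chance, which makes a metric such as precision noisier from fold to fold.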

### E. Variable encoding and split <a class="anchor" id="section_2_5"></a>
These two intermediate steps are needed in order to:
1. Separate the columns of our dataset into input patterns (X) and output patterns (Y)
2. Transform our eight categorical variables into numeric variables
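
For ordered categories such as income bands, the choice of encoding matters. The sketch below, using only a few of the `INCOME_DESC` bands and a made-up `toy_income` series, contrasts an explicit ordinal mapping with the alphabetical codes that a label encoder assigns.

```python
import pandas as pd

# A few income bands from INCOME_DESC, listed in their natural order.
income_order = ["Under 15K", "15-24K", "25-34K", "35-49K", "50-74K"]
income_codes = {band: rank for rank, band in enumerate(income_order)}

toy_income = pd.Series(["50-74K", "Under 15K", "25-34K"])

# Explicit mapping: the integer reflects the income rank.
ordinal = toy_income.map(income_codes)

# Alphabetical codes (what a label encoder assigns) scramble that rank.
alphabetical = {band: code for code, band in enumerate(sorted(income_order))}

print(ordinal.tolist())
print(alphabetical)
```

Alphabetically, `Under 15K` sorts last and would receive the highest code, which is why an explicit mapping is a reasonable choice for income and age, while label encoding is acceptable for unordered categories.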

In [None]:
#1. Split data into X and Y
X=dataset.drop(columns=['Sensitivity'])
Y=dataset['Sensitivity']

#2.A. Encode string class values as integers
label_encoder = preprocessing.LabelEncoder()
label_encoder = label_encoder.fit(dataset['Sensitivity'])
label_encoded_y = label_encoder.transform(dataset['Sensitivity'])

#2.A. Encode income bands as integers, preserving their natural order
income_order = ['Under 15K', '15-24K', '25-34K', '35-49K', '50-74K', '75-99K', '100-124K', '125-149K', '150-174K', '175-199K', '200-249K', '250K+']
X['INCOME_DESC'] = X['INCOME_DESC'].map({band: i for i, band in enumerate(income_order)})

#2.A. Encode age bands as integers, preserving their natural order
age_order = ['19-24', '25-34', '35-44', '45-54', '55-64', '65+']
X['AGE_DESC'] = X['AGE_DESC'].map({band: i for i, band in enumerate(age_order)})

#2.A. Label encoding the other categorical data
labelencoder_X_1 = LabelEncoder()
X['MARITAL_STATUS_CODE'] = labelencoder_X_1.fit_transform(X['MARITAL_STATUS_CODE'])
labelencoder_X_2 = LabelEncoder()
X['HOMEOWNER_DESC'] = labelencoder_X_2.fit_transform(X['HOMEOWNER_DESC'])
labelencoder_X_3 = LabelEncoder()
X['HH_COMP_DESC'] = labelencoder_X_3.fit_transform(X['HH_COMP_DESC'])
labelencoder_X_4 = LabelEncoder()
X['HOUSEHOLD_SIZE_DESC'] = labelencoder_X_4.fit_transform(X['HOUSEHOLD_SIZE_DESC'])
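#Map the non-numeric kid categories to numeric strings so the column can be cast to float below
X['KID_CATEGORY_DESC'] = X['KID_CATEGORY_DESC'].replace({'None/Unknown': '0', '3+': '3'})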
X['HOUSEHOLD_SIZE_DESC'] = X.HOUSEHOLD_SIZE_DESC.astype(float)
X['KID_CATEGORY_DESC'] = X.KID_CATEGORY_DESC.astype(float)

### F. Skew of univariate distribution <a class="anchor" id="section_2_6"></a>
This step is important because the performance of many algorithms depends on the variables being roughly normally distributed.
XGBoost is a non-parametric model, and non-parametric models are rarely affected by skewed features. However, normalizing the features will not hurt our model's performance. Let's check if our data are skewed:

In [None]:
#Let's plot the Skewness by decreasing order
num_feats=X.dtypes[X.dtypes!='object'].index
skew_feats=X[num_feats].skew().sort_values(ascending=False)
skewness=pd.DataFrame({'Skew':skew_feats})
print(skewness)

Values close to 0 indicate __less skew__.  
Plotting the distribution is the fastest way to see whether an attribute is Gaussian or skewed.
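
To make the reading of the skewness values concrete, here is a small sketch on synthetic data. The lognormal sample is only a stand-in for a right-tailed feature such as `Total_sales`; it is not drawn from the dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic right-tailed variable, a stand-in for a feature like Total_sales.
right_tailed = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=2000))
symmetric = np.log(right_tailed)  # the log of a lognormal sample is normal

skew_raw = right_tailed.skew()
skew_log = symmetric.skew()
print(f"raw skew: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

A clearly positive value signals a long right tail, while a value near 0 signals a roughly symmetric distribution.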

In [None]:
#Use a separate name so that the df dictionary of raw tables is not overwritten
df_plot = pd.DataFrame(data=X, columns=['AGE_DESC','MARITAL_STATUS_CODE','INCOME_DESC','HOMEOWNER_DESC','HH_COMP_DESC','KID_CATEGORY_DESC','Total_sales','total_visits','median_basket','avg_price'])
#Plot the distribution of every variable
nd = pd.melt(df_plot, value_vars=list(df_plot.columns))
n1 = sns.FacetGrid(nd, col='variable', col_wrap=5, sharex=False, sharey=False)
#histplot with a KDE overlay replaces the deprecated sns.distplot
n1 = n1.map(sns.histplot, 'value', kde=True)
n1

We can see that some variables follow a discrete, multinomial-like distribution while others have a Gaussian-like distribution with a long right tail. This is confirmed by the positive skewness values.

### G. Correlation between attributes <a class="anchor" id="section_2_7"></a>
Now we want to make sure we don't have several variables carrying similar information. To check this, we will plot the correlation matrix.
Correlation measures the relationship between two variables and how they move together.
Even though XGBoost handles correlated features very well, checking for multicollinearity is always a good step.
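
Besides reading a heatmap, highly correlated pairs can be listed programmatically. The sketch below uses synthetic columns and a hypothetical threshold of 0.8; neither comes from the notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.normal(size=200)
toy = pd.DataFrame({
    "a": a,
    "a_monotone": np.exp(a),          # monotone transform of a: Spearman = 1
    "noise": rng.normal(size=200),    # unrelated variable
})

corr = toy.corr(method="spearman").abs()

# Keep the upper triangle only so that each pair is reported once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
THRESHOLD = 0.8  # hypothetical cut-off, not taken from the notebook
high_pairs = [
    (row, col) for row in upper.index for col in upper.columns
    if upper.loc[row, col] > THRESHOLD
]
print(high_pairs)
```

Spearman correlation is rank-based, so it also captures monotone but non-linear relationships, which suits integer-encoded categories.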

In [None]:
# Finding the relations between the variables
plt.figure(figsize=(20,10))
c= X.corr(method='spearman')
sns.heatmap(c,annot=True)
c

We can see that HOUSEHOLD_SIZE_DESC, KID_CATEGORY_DESC and HH_COMP_DESC are correlated. Even though, as said, XGBoost handles correlated features, we will remove HOUSEHOLD_SIZE_DESC.

In [None]:
#remove attribute
X=X.drop(columns=['HOUSEHOLD_SIZE_DESC'])

# 3. Data preprocessing <a class="anchor" id="section_3"></a>

### A. Split data into train test <a class="anchor" id="section_3_1"></a>

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X,label_encoded_y ,
test_size=0.3, random_state=7,shuffle=True)

### B. Data transform <a class="anchor" id="section_3_2"></a>
To give our data a more Gaussian distribution, we will use a power transform (Yeo-Johnson), fitted on the training set only.
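
The fit-on-train, transform-on-test pattern can be checked on synthetic data. In this sketch `X_train_toy` and `X_test_toy` are random lognormal matrices standing in for the real splits.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
# Synthetic right-skewed features standing in for the real training/test sets.
X_train_toy = rng.lognormal(size=(500, 2))
X_test_toy = rng.lognormal(size=(200, 2))

pt_toy = PowerTransformer(method="yeo-johnson", standardize=True)
# Lambdas, means and variances are estimated on the training split only...
X_train_t = pt_toy.fit_transform(X_train_toy)
# ...and reused unchanged on the test split, so no test information leaks in.
X_test_t = pt_toy.transform(X_test_toy)

print("lambdas:", pt_toy.lambdas_)
print("train mean:", X_train_t.mean(axis=0), "train std:", X_train_t.std(axis=0))
```

The transformed training data has zero mean and unit variance by construction; the test data will only be approximately so, which is expected.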

In [None]:
#instantiate 
pt = PowerTransformer(method='yeo-johnson', standardize=True) 

#Fit the data to the powertransformer
rescaler = pt.fit(X_train)

#Lets get the Lambdas that were found
print (rescaler.lambdas_)

calc_lambdas = rescaler.lambdas_

#Transform the data 
X_train_resc = rescaler.transform(X_train)
X_test_resc=rescaler.transform(X_test)

#Pass the transformed data into a new dataframe 
df_xt = pd.DataFrame(data=X_train_resc, columns=['AGE_DESC','MARITAL_STATUS_CODE','INCOME_DESC','HOMEOWNER_DESC','HH_COMP_DESC','KID_CATEGORY_DESC','Total_sales','total_visits','median_basket','avg_price'])

df_xt.describe()

Now our variables are scaled to zero mean and unit variance thanks to standardization, and they have a more Gaussian distribution thanks to the power transform. We can now perform a PCA for feature selection.  
(PCA doesn't necessarily assume the dataset to be Gaussian distributed.)

### C. Feature selection <a class="anchor" id="section_3_3"></a>
We will use the PCA transformation. Our first step will be to choose the number of components

In [None]:
pca = PCA().fit(X_train_resc)

plt.rcParams["figure.figsize"] = (12,11)

fig, ax = plt.subplots()
xi = np.arange(1, 11, step=1)
y = np.cumsum(pca.explained_variance_ratio_)

plt.ylim(0.0,1.1)
plt.plot(xi, y, marker='o', linestyle='--', color='b')

plt.xlabel('Number of Components')
plt.xticks(np.arange(0, 11, step=1))
plt.ylabel('Cumulative explained variance (fraction)')
plt.title('The number of components needed to explain variance')

plt.axhline(y=0.90, color='r', linestyle='-')
plt.text(0.5, 0.90, '90% cut-off threshold', color = 'red', fontsize=16)

ax.grid(axis='x')
plt.show()

We will choose the number of components that __explains 90% of the variance__. From the graph we can see that we need __7 components__.
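
The number of components can also be read off numerically instead of from the graph. The sketch below uses a synthetic matrix in place of `X_train_resc`, so the resulting count is not the notebook's.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic stand-in for X_train_resc: 10 features with decreasing variance.
X_toy = rng.normal(size=(300, 10)) * np.linspace(3.0, 0.5, 10)

cum_var = np.cumsum(PCA().fit(X_toy).explained_variance_ratio_)

# First component count whose cumulative explained variance reaches 90%.
n_components_90 = int(np.argmax(cum_var >= 0.90) + 1)
print(n_components_90, cum_var[n_components_90 - 1])
```

scikit-learn's `PCA` also accepts a float, for example `PCA(n_components=0.90, svd_solver='full')`, which performs a similar selection in one step.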

In [None]:
# on standardized data
pca_std = PCA(n_components=7).fit(X_train_resc)
X_train_PCA = pca_std.transform(X_train_resc)
X_test_PCA = pca_std.transform(X_test_resc)
pca_std.explained_variance_ratio_

# 4. Model creation <a class="anchor" id="section_4"></a>

### A. Model baseline <a class="anchor" id="section_4_1"></a>

In [None]:
#Check the performance of the XGBoost model with default (untuned) parameters,
#using stratified 5-fold cross-validation on the training data
model = xgb.XGBClassifier()
kfold = StratifiedKFold(n_splits=5, random_state=7,shuffle=True)


Accuracy = cross_val_score(model, X_train_PCA, y_train, cv=kfold,scoring='accuracy')
Precision = cross_val_score(model, X_train_PCA, y_train, cv=kfold,scoring='precision')

print("Accuracy: %.1f%% (%.1f%%)" % (Accuracy.mean()*100, Accuracy.std()*100))
print("Precision: %.1f%% (%.1f%%)" % (Precision.mean()*100, Precision.std()*100))

To assess the performance of our model, we will concentrate on __two evaluation metrics :__  
- Accuracy
- Precision  

Precision is particularly interesting here, as we would like to minimize __Type 1 errors__, also known as false positives. Indeed, we would like to avoid the case where a customer is predicted to be sensitive to coupons while in fact they are not. This would lead to a loss of money, as we would print and send them coupons for nothing.  
Our model reports an __accuracy of 66.7%__ and a __precision of 55.8%__. This means that when our model predicts that a customer is sensitive to coupons, it is correct 55.8% of the time.
We will try to improve our model's performance with some tuning.
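
As a reminder of how the two metrics relate to false positives, here is a small sketch. The helper `precision_and_accuracy` and the counts passed to it are hypothetical and chosen only to keep the arithmetic simple.

```python
def precision_and_accuracy(tp, fp, tn, fn):
    """Return (precision, accuracy) from the four confusion-matrix counts."""
    precision = tp / (tp + fp)              # share of predicted positives that are right
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, accuracy


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
toy_precision, toy_accuracy = precision_and_accuracy(tp=30, fp=20, tn=40, fn=10)
print(f"precision={toy_precision:.2f}, accuracy={toy_accuracy:.2f}")
```

Every false positive is a coupon sent to a customer who will not use it, so reducing `fp` raises precision directly, while accuracy also rewards correctly identified non-redeemers.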

### B. Model hyperparameter tuning <a class="anchor" id="section_4_2"></a>
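
Before running an exhaustive search, it can help to estimate its cost. The sketch below counts the candidates for the same parameter values used in the grid-search cell of this section; the variable names are illustrative.

```python
from sklearn.model_selection import ParameterGrid

# Same candidate values as the grid-search cell.
param_grid_sizes = {
    "n_estimators": [50, 100, 200, 300, 400, 450, 500, 600, 1000],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "max_depth": list(range(2, 8)),
    "gamma": [0, 0.25, 0.5, 0.7, 0.9, 1.0],
}
n_candidates = len(ParameterGrid(param_grid_sizes))
n_fits = n_candidates * 5  # 5 stratified folds per candidate
print(n_candidates, n_fits)
```

The grid contains 1,296 combinations, so a 5-fold search trains 6,480 models, plus one final refit on the whole training set when `refit=True` (the `GridSearchCV` default).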

In [None]:
# grid search to tune algorithm
model = xgb.XGBClassifier()
n_estimators = [50,100, 200, 300, 400,450, 500,600,1000]
learning_rate = [0.01,0.05,0.1,0.2]
max_depth= range(2,8)
gamma=[0, 0.25, 0.5, 0.7, 0.9, 1.0]

param_grid = dict(gamma=gamma,learning_rate=learning_rate, n_estimators=n_estimators,max_depth=max_depth)
eval_set=[(X_train_PCA, y_train), (X_test_PCA, y_test)]
kfold = StratifiedKFold(n_splits=5, random_state=7,shuffle=True)
grid_search = GridSearchCV(model, param_grid,scoring='precision', n_jobs=-1, cv=kfold)
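#Passing early_stopping_rounds and eval_metric to fit() is deprecated since XGBoost 1.6 and no longer accepted by recent releases, which expect them in the XGBClassifier constructor
grid_result = grid_search.fit(X_train_PCA, y_train,early_stopping_rounds= 20,eval_metric= ["logloss"],eval_set=eval_set,verbose=20)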

# summarize result
print("Best: %.1f%% using %s" % (grid_result.best_score_*100, grid_result.best_params_))

__We now fit model on training data with the optimized parameters found__

In [None]:
# fit model on training data
model = xgb.XGBClassifier(learning_rate = 0.05,\
                          max_depth=2,\
                          n_estimators=200,\
                          gamma=0.7,\
                          objective = 'binary:logistic',\
                         )
fit_params={'early_stopping_rounds': 20, 
            'eval_metric': 'logloss',
            'verbose': False,
            'eval_set': [(X_train_PCA, y_train), (X_test_PCA, y_test)]}
                         
kfold =  StratifiedKFold(n_splits=5, random_state=7,shuffle=True)

Accuracy = cross_val_score(model, X_train_PCA, y_train, cv=kfold,scoring='accuracy',fit_params = fit_params)
Precision = cross_val_score(model, X_train_PCA, y_train, cv=kfold,scoring='precision',fit_params = fit_params)

print("Accuracy: %.1f%%" % (Accuracy.mean()*100))
print("Precision: %.1f%%" % (Precision.mean()*100))

After hyperparameter tuning, our accuracy __improved to 69%__ and our __precision to 62%__. Let's now evaluate the performance of our model on the test set.

# 5. Model results <a class="anchor" id="section_5"></a>

### A. Accuracy and Precision scores <a class="anchor" id="section_5_1"></a>

In [None]:
model = xgb.XGBClassifier(learning_rate = 0.05,\
                          max_depth=2,\
                          n_estimators=200,\
                          gamma=0.7,\
                          objective = 'binary:logistic',\
                          )

eval_set = [(X_train_PCA, y_train), (X_test_PCA, y_test)]
model.fit(X_train_PCA, y_train, early_stopping_rounds=20, eval_metric=["error","logloss"], eval_set=eval_set, verbose=False)
# make predictions for test data
predictions = model.predict(X_test_PCA)
# evaluate predictions
accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions)

print("accuracy: %.2f%%" % (accuracy * 100.0))
print("Precision: %.2f%%" % (precision * 100.0))
print(classification_report(y_test, predictions,   labels=[1,0]))

Performance on test set reports an __accuracy of 67%__ and a __precision of 70%__

### B. Learning curves <a class="anchor" id="section_5_2"></a>

In [None]:
# retrieve performance metrics
results = model.evals_result()
epochs = len(results['validation_0']['error'])
x_axis = range(0, epochs)
# plot log loss
fig, ax = plt.subplots()
ax.plot(x_axis, results['validation_0']['logloss'], label='Train')
ax.plot(x_axis, results['validation_1']['logloss'], label='Test')
ax.legend()
plt.ylabel('Log Loss')
plt.title('XGBoost Log Loss')
plt.show()
# plot classification error
fig, ax = plt.subplots()
ax.plot(x_axis, results['validation_0']['error'], label='Train')
ax.plot(x_axis, results['validation_1']['error'], label='Test')
ax.legend()
plt.ylabel('Classification Error')
plt.title('XGBoost Classification Error')
plt.show()

We can see from the graphs that __overfitting has been avoided__ thanks to early stopping. The best iteration was found at __round 105__, and this is confirmed graphically: the test curve starts slowly increasing again from this point.
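
The idea behind early stopping can be sketched in a few lines. This is a simplified illustration of the rule, not XGBoost's actual implementation; `best_iteration` and the loss values are hypothetical.

```python
def best_iteration(val_losses, patience):
    """Return the index of the best round under a simple early-stopping rule.

    Training stops once the validation loss has not improved for
    `patience` consecutive rounds; the best round seen so far is returned.
    """
    best_idx, best_loss = 0, float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss = i, loss
        elif i - best_idx >= patience:
            break  # no improvement for `patience` rounds: stop
    return best_idx


# Hypothetical validation curve: improves for three rounds, then degrades.
toy_losses = [0.70, 0.65, 0.62, 0.63, 0.64, 0.66, 0.61]
print(best_iteration(toy_losses, patience=3))
```

With a patience of 3, the loop stops before reaching the last value, so the later improvement to 0.61 is never seen. A larger patience trades extra training rounds for a lower risk of stopping too early.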

### C. Confusion matrix <a class="anchor" id="section_5_3"></a>

In [None]:
# confusion matrix for the test data
cm = sk.metrics.confusion_matrix(y_test, predictions,  labels=[1,0])

fig, ax= plt.subplots(figsize=(10,10))
sns.heatmap(cm, annot=True, fmt='g', ax = ax); 

# labels, title and ticks
ax.set_xlabel('Predicted labels');
ax.set_ylabel('True labels'); 
ax.set_title('Confusion Matrix'); 
ax.xaxis.set_ticklabels(['Sensible','Not sensible']); 
ax.yaxis.set_ticklabels(['Sensible','Not sensible']);

In conclusion, we can use this model to predict whether a customer is __sensitive__ to coupons __or not__. Previously, when we sent coupons to customers, only 38% of them redeemed their coupons, meaning 62% of them were not interested. Now, before each campaign, we can use our prediction model to identify customers who won't use their coupons and either remove them from the mailing list or choose different marketing actions to reach them.