---

<center><h1>ADMN5015 Artificial Intelligence in Marketing</h1>
<h2>Assignment 2: Classification using Tensorflow
<h3>Katrina Ong

---

**Summary**

This project aims to classify customers who are likely to churn or not likely to churn for a tour & travel company. The dataset was obtained through [Kaggle](https://www.kaggle.com/datasets/tejashvi14/tour-travels-customer-churn-prediction). 

To build a Tensorflow classification model, the following steps are implemented in this notebook: 
1. Import Packages
2. Source the Data
3. Explore the Data
4. Prepare the Data (Feature Engineering)
5. Build and Test Tensorflow Model
6. Evaluate Tensorflow Model
7. Predict on New Cases

More details can be found in each line of code below.

**Benefits of Customer Churn Prediction**

Being able to classify customers who will churn vs. customers who will not churn allows a company to take proactive approaches in retaining customers. By know which customers will likely churn, the company can target these customers, understand their behavior, and tailor their marketing strategy accordingly.

**Communication to Management**

Effective communication to management is key in adopting machine learning methods and analytics in an organization's marketing strategy. Benefits may be communicated in terms of the value of the business that could be saved if this model is adopted. For example, the value of business lost due to customers churning can be highlighted in comparison to the prediction accuracy of the model. 

The potential savings would amount to the following:

``` Potential Savings = (Value of Business Lost x Recall Rate) - Amount of Proactive Marketing Measures Undertaken ```

*Note: Recall Rate is used since it denotes the percentage of true positives (i.e., customers who will churn) detected*

**Results**

The Tensorflow model was able to achieve the following performance metrics:
- Accuracy: 0.8348 
- Recall: 0.8039 
- Area Under the Curve (AUC): 0.8694

**Sources:** [[1]](https://stackoverflow.com/questions/36288235/how-to-get-stable-results-with-tensorflow-setting-random-seed), [[2]](https://www.dlology.com/blog/how-to-choose-last-layer-activation-and-loss-function/), [[3]](https://www.analyticsvidhya.com/blog/2020/01/fundamentals-deep-learning-activation-functions-when-to-use-them/)

---

### 1) Import Packages

In [1]:
# Importing Standard packages
import numpy as np
from scipy import stats
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import missingno as msno
import random
import os

from datetime import datetime, date, timedelta
import time
from time import strptime

In [2]:
# Feature Engineering Packages
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, RobustScaler, MinMaxScaler
#from feature_engine.encoding import OneHotEncoder, OrdinalEncoder
import category_encoders as ce
from sklearn.impute import KNNImputer

ModuleNotFoundError: No module named 'category_encoders'

In [None]:
# Imbalanced learning library
from imblearn.datasets import fetch_datasets
from imblearn.under_sampling import EditedNearestNeighbours, TomekLinks
from imblearn.over_sampling import (
    SMOTE,
    BorderlineSMOTE,
    SVMSMOTE,
)

from imblearn.combine import SMOTEENN, SMOTETomek

In [None]:
# Train Test Split
from sklearn.model_selection import train_test_split

In [None]:
# Tensorflow
import tensorflow as tf

In [None]:
#Classification Metrics
from sklearn import metrics

In [None]:
# Set Options for display
pd.options.display.max_rows = 1000
pd.options.display.max_columns = 100
pd.options.display.float_format = '{:.4f}'.format
sns.set_style("whitegrid")
sns.set_context("paper", font_scale = 2)

%matplotlib inline

import warnings
warnings.filterwarnings("ignore")
warnings.simplefilter(action='ignore', category=FutureWarning)

pd.options.mode.chained_assignment = None # default='warn'

In [None]:
#Set random state
SEED = 17

def set_seeds(seed=SEED):
    os.environ['PYTHONHASHSEED'] = str(seed)
    random.seed(seed)
    tf.random.set_seed(seed)
    np.random.seed(seed)

In [None]:
#Set random state for tensorflow
def set_global_determinism(seed=SEED):
    set_seeds(seed=seed)

    os.environ['TF_DETERMINISTIC_OPS'] = '1'
    os.environ['TF_CUDNN_DETERMINISTIC'] = '1'
    
    tf.config.threading.set_inter_op_parallelism_threads(1)
    tf.config.threading.set_intra_op_parallelism_threads(1)

# Call the above function with seed value
set_global_determinism(seed=SEED)

[[Source]](https://stackoverflow.com/questions/36288235/how-to-get-stable-results-with-tensorflow-setting-random-seed) 

---

### 2) Source the Data

In [None]:
# Source data from CSV file
df = pd.read_csv('Customertravel.csv')

In [None]:
# Preview first few records
df.head(3)

In [None]:
# Preview last few records
df.tail(3)

---

### 3) Explore the Data

In [None]:
# Checking the datatypes per column
df.info()

In [None]:
# Check summary statistics of numeric variables
df.describe().T

In [None]:
# Identify and check the value counts/classes of the target variable
# 1 - Customer Churn
# 0 - Customer does not churn

df["Target"].value_counts() 

In [None]:
# Plot count of target class
plt.rcParams['font.size'] = 15
ax = sns.countplot(data = df, x = "Target")
abs_values = df['Target'].value_counts(ascending=False).values
ax.bar_label(container=ax.containers[0], labels=abs_values); #imbalanced

In [None]:
#Create Correlation Heatmap
sns.heatmap(df.corr(),annot = True, annot_kws={"fontsize":18}, cmap='OrRd');

In [None]:
# Creating a list of categorical variables
cat_cols = ['FrequentFlyer','AnnualIncomeClass','AccountSyncedToSocialMedia','BookedHotelOrNot']

In [None]:
# Plotting the counts of values for each categorical variable 
fig,axes = plt.subplots(2,2,figsize=(30,20))
plt.rcParams['font.size'] = 25

for idx,cat_col in enumerate(cat_cols):
    row,col = idx//2,idx%2
    sns.countplot(data=df,x=cat_col,ax=axes[row,col])
    abs_values = df[cat_col].value_counts(ascending=False).values
    axes[row,col].bar_label(container=axes[row,col].containers[0], labels=abs_values)

---

### 3) Prepare the Data

#### 3a) One-Hot Encoding of Categorical Columns

In [None]:
# Define columns for one-hot encodring
onehot_cols = cat_cols.copy()

In [None]:
# Removing Annual Income Class since it is an ordinal column
onehot_cols.remove("AnnualIncomeClass")
onehot_cols

In [None]:
# Instantiate One-Hot encoding
onehot_encoder = ce.one_hot.OneHotEncoder(cols=onehot_cols,use_cat_names = True)

In [None]:
# Fit-Transform to Dataset
df = onehot_encoder.fit_transform(df)

In [None]:
df.head(3)

#### 3b) Ordinal Encoding for Ordered Categorical Columns

In [None]:
# Mapping of categories for ordinal encoding - lower number for lower rank
d_incomeClass = {'col': 'AnnualIncomeClass', 'mapping': {'Low Income': 0, 'Middle Income': 1, 'High Income': 2}}

In [None]:
# Instantiate Ordinal Encoding
ordinal_encoder = ce.ordinal.OrdinalEncoder(cols='AnnualIncomeClass',mapping = [d_incomeClass])

In [None]:
# Fit-Transform to Dataset
df = ordinal_encoder.fit_transform(df)

In [None]:
#Preview Data
df.head(3)

In [None]:
df.tail(3)

#### 3c) Remove Missing Values

In [None]:
# Remove missing values
df = df.drop(df[df['FrequentFlyer_No Record'] == 1].index)

In [None]:
# Check if records with missing values have been removed
df['FrequentFlyer_No Record'].value_counts()

#### 3d) Delete Unnecessary Columns

In [None]:
# Drop No Record encoded column
df = df.drop('FrequentFlyer_No Record',axis=1)

In [None]:
# Checking datatypes and names of columns
df.info()

In [None]:
# List columns to delete
cols_to_delete = ['FrequentFlyer_No','AccountSyncedToSocialMedia_No','BookedHotelOrNot_No']

In [None]:
# Drop columns 
df = df.drop(cols_to_delete,axis=1)

In [None]:
# Checking if information has been deleted
df.info()

#### 3e) Visualize Cleaned Data 

In [None]:
# Plot count of target class
plt.rcParams['font.size'] = 15
ax = sns.countplot(data = df, x = "Target")
abs_values = df['Target'].value_counts(ascending=False).values
ax.bar_label(container=ax.containers[0], labels=abs_values); #imbalanced

In [None]:
# Creating a new list of categorical variables
cat_cols = ['FrequentFlyer_Yes','AnnualIncomeClass','AccountSyncedToSocialMedia_Yes','BookedHotelOrNot_Yes']

In [None]:
# Plotting the counts of values for each categorical variable 
fig,axes = plt.subplots(2,2,figsize=(30,20))
plt.rcParams['font.size'] = 25

for idx,cat_col in enumerate(cat_cols):
    row,col = idx//2,idx%2
    sns.countplot(data=df,x=cat_col,ax=axes[row,col])
    abs_values = df[cat_col].value_counts(ascending=False).values
    axes[row,col].bar_label(container=axes[row,col].containers[0], labels=abs_values)

#### 3f) Train-Test Split

In [None]:
#Separate Features from Target
X = df.drop('Target',axis=1)
y = df['Target']

In [None]:
# Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=17)

#### 3g) Resampling Imbalanced Data

In [None]:
#Instantiate SMOTE
sm = SMOTE(
    sampling_strategy='auto',  
    random_state=21,  
    k_neighbors=5,
    n_jobs=-1
)

In [None]:
#Instantiate ENN
enn = EditedNearestNeighbours(
    sampling_strategy='auto',
    n_neighbors=3,
    kind_sel='all',
    n_jobs=-1)

In [None]:
#Combine Rebalancing Methods
method = SMOTEENN(
    sampling_strategy='auto',  
    random_state=21,  
    smote=sm,
    enn=enn,
    n_jobs=-1
)

In [None]:
#Apply resampling to training data only
X_train_rs, y_train_rs = method.fit_resample(X_train,y_train)

In [None]:
y_train_rs.value_counts() #resampling improved the balance between majority class and minority class

#### 3h) Scale the Dataset

In [None]:
# Instantiate Column Transformer to apply Standard Scaler to relevant columns and ignore/passthrough remaning date time columns
ct = ColumnTransformer([("scaler", StandardScaler(),['Age','ServicesOpted'])],
                        remainder = 'passthrough') 

In [None]:
# Fit and Transform on Training Data
X_train_sc = ct.fit_transform(X_train_rs)

In [None]:
# Transform on Test Data
X_test_sc = ct.transform(X_test)

In [None]:
# Checking the dimensions of the training and testing data
X_train_sc.shape, X_test_sc.shape, y_train_rs.shape, y_test.shape

#### 3i) Convert to Tensorflow Objects

In [None]:
# Convert training and testing features to Tensorflow Objects 
X_train_tf = tf.convert_to_tensor(X_train_sc)
X_test_tf = tf.convert_to_tensor(X_test_sc)

---

### 4) Build and Test Tensorflow Model

In [None]:
# Define the model

def get_basic_model():
  model = tf.keras.Sequential([
    #normalizer,
    tf.keras.layers.Dense(6, activation='relu'),
    #tf.keras.layers.Dense(2, activation='relu'),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid') # Possible labels 0 or 1
  ])

  model.compile(optimizer='adam',
                #loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), #if output is more than 2
                loss=tf.keras.losses.BinaryCrossentropy(from_logits=True), #if output is 2, uncomment this
                metrics=['Accuracy','Recall','AUC'])
  
  return model

In [None]:
# Defining callback to signal the model to stop learning
callback = tf.keras.callbacks.EarlyStopping(
    monitor="loss",
    min_delta=0.0001,
    patience=30,
    verbose=0,
    mode="min",
    baseline=None,
    restore_best_weights=True,
)

In [None]:
# Train model

BATCH_SIZE = 2**8 #has to be in power of 2

model = get_basic_model()

model.fit(X_train_tf, y_train_rs, epochs=4000, batch_size=BATCH_SIZE, callbacks = [callback]);

---

### 5) Evaluate Tensorflow Model

In [None]:
# Evaluate test data based on training data

score = model.evaluate(X_test_tf, y_test, verbose=1)

print(f'Test loss: {score[0]} / Test accuracy: {score[1]} / Test Recall: {score[2]} / Test AUC: {score[3]}')

In [None]:
# Save model
model.save('travelchurn_model')

---

### 6) Predict on New Cases

In [None]:
#Create a list of columns
ind_features = list(df.columns.values).remove('Target')

In [None]:
# Create a dataframe for of 3 new cases
df_predict = pd.DataFrame(columns = ind_features)

In [None]:
#Creating new case
sample1 = {'Age': 34,
           'FrequentFlyer_Yes': 1,
           'AnnualIncomeClass': 0,
           'ServicesOpted':5,
           'AccountSyncedToSocialMedia_Yes': 1,
           'BookedHotelOrNot_Yes':0} 

In [None]:
#Creating new case
sample2 = {'Age': 45,
           'FrequentFlyer_Yes': 1,
           'AnnualIncomeClass': 2,
           'ServicesOpted':5,
           'AccountSyncedToSocialMedia_Yes': 1,
           'BookedHotelOrNot_Yes':1} 

In [None]:
#Creating new case
sample3 = {'Age': 60,
           'FrequentFlyer_Yes': 1,
           'AnnualIncomeClass': 2,
           'ServicesOpted':2,
           'AccountSyncedToSocialMedia_Yes': 0,
           'BookedHotelOrNot_Yes':1} 

In [None]:
#Add new samples to dataframe
df_predict = df_predict.append(sample1, ignore_index=True)
df_predict = df_predict.append(sample2, ignore_index=True)
df_predict = df_predict.append(sample3, ignore_index=True)


In [None]:
#Preview Dataframe
df_predict.head()

In [None]:
# Get List of columns
col_list = list(df_predict.columns.values)

In [None]:
# Scale Data
df_predict = ct.fit_transform(df_predict)

In [None]:
# Convert data back into a dataframe
df_predict = pd.DataFrame(df_predict, columns = col_list)

In [None]:
# Convert data to Tensorflow tensor

predict_numeric_features = tf.convert_to_tensor(df_predict)

predict_numeric_features

In [None]:
# Predict labels

class_names = ['Does not Churn', 'Churn']

predictions = model(predict_numeric_features, training=False)

# Create new columns in dataframe
df_predict['label'] = None
df_predict['certainty'] = None

for i, logits in enumerate(predictions):
  class_idx = tf.argmax(logits).numpy()
  p = tf.nn.softmax(logits)[class_idx]
  name = class_names[class_idx]
  print(f"Example {i} prediction: {name} ({100*p}%)")

  # Save predictions to dataframe
  df_predict["label"].iloc[i] = name
  df_predict['certainty'].iloc[i] = format(p)