# UCI_Bank_Marketing Dataset - Classification problem - _Multi-layer Perceptron_


### Approach

 1. Load the data
  - Read the dataset into a dataframe
  - Separate the dataframe into *numerical* dataframe and *categorical* dataframe
  - Scale the numerical variables in the numerical dataframe using *MinMax Scaler*
  - Create dummy variables for each categorical variable using *pandas* library
  - Concatenate the scaled numerical dataframe and the categorical dataframe (modified one with dummy variables)
 2. Define the predictor variables and the target variable as arrays (*X* and *y*). The number of columns in ***y*** equals the number of target classes.
 3. Split X and y as train and test datasets (90:10 Train-test ratio)
 4. Neural Network Model
  - Define a custom MLP model
  - Define the neural network model parameters (number of hidden layers, weights and biases for each layer, etc.)
  - Define the cost function and the optimizer that minimises the cost
  - Define the accuracy metric for the model
 5. Train the defined model on the training set.
  - Define the number of epochs.
  - For each epoch, iterate over a fixed number of batches
  - For each batch, randomly sample records from X_train and y_train
  - Calculate the average cost for each epoch and the accuracy on the last batch
 6. Validate the model on the test set - calculate the accuracy
  

### Libraries

In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

In [3]:
# Check if the GPU is available - optional - only when using google colab
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


### Load the dataset

If using the Google Colab platform, run the cells below to upload the dataset from the local machine. This works on Google Chrome only.

In [6]:
# Comment out these lines when running on a local machine
# Select the dataset CSV file with the Choose Files button and rerun the cell
from google.colab import files
uploaded = files.upload()



Saving bank-additional-full.csv to bank-additional-full (1).csv


In [7]:
import io
# Read the dataset
# Uncomment the line below if running on a local machine. Change the path of the dataset accordingly
#Banco = pd.read_csv('/home/sanjiv/Downloads/Python-Data-Science-and-Machine-Learning-Bootcamp/Python-Data-Science-and-Machine-Learning-Bootcamp/ML Tutorials/bank-additional-full.csv', sep=';')

# Comment out the line below if running on a local machine
Banco = pd.read_csv(io.StringIO(uploaded['bank-additional-full.csv'].decode('utf-8')), sep=';')

Banco.head(2)

Unnamed: 0,age,job,marital,education,default,housing,loan,contact,month,day_of_week,...,campaign,pdays,previous,poutcome,emp.var.rate,cons.price.idx,cons.conf.idx,euribor3m,nr.employed,y
0,56,housemaid,married,basic.4y,no,no,no,telephone,may,mon,...,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
1,57,services,married,high.school,unknown,no,no,telephone,may,mon,...,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no


In [None]:
# Replace 'yes' and 'no' in the target variable with 'Subscriber' and 'Non-subscriber'
# so that get_dummies later produces descriptive target column names
Banco['y'] = Banco['y'].replace({'yes': 'Subscriber', 'no': 'Non-subscriber'})

In [None]:
# Create two separate dataframes - categorical and numeric
Banco_cat = Banco.select_dtypes(exclude=[np.number])
Banco_num = Banco.select_dtypes(include=[np.number])

### Scale the variables

In [None]:
# Scale the numerical variables
min_max_scaler = MinMaxScaler()

Banco_num = pd.DataFrame(min_max_scaler.fit_transform(Banco_num), columns=Banco_num.columns)
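
As a quick check of what `MinMaxScaler` computes, the sketch below applies the min-max formula `(x - min) / (max - min)` by hand with NumPy. The helper `min_max_scale` is hypothetical, and the `age` bounds of 17 and 98 are an assumption (they are not shown in this notebook), chosen because they reproduce the scaled `age` values that appear in the `Banco_mod.head(2)` output.

```python
import numpy as np

def min_max_scale(values, lo, hi):
    """Rescale values linearly so that lo maps to 0 and hi maps to 1."""
    values = np.asarray(values, dtype=float)
    return (values - lo) / (hi - lo)

# Assumed bounds for the `age` column (hypothetical, not shown in the notebook)
AGE_MIN, AGE_MAX = 17, 98

# Raw ages of the first two records in the dataset preview
scaled_age = min_max_scale([56, 57], AGE_MIN, AGE_MAX)
print(np.round(scaled_age, 6))   # [0.481481 0.493827]
```

Note that the scaler here is fitted on the full dataset before the train-test split, so the minimum and maximum also reflect records that later end up in the test set.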

In [None]:
# Add the numeric dataframe to a new dataframe. Concatenate the categorical variables to the new dataframe after creating dummy variables
Banco_mod = Banco_num

### Convert categorical variables to dummy variables

In [None]:
# Get dummies for each categorical variable and concatenate each resulting dataframe to Banco_mod
for col in Banco_cat.columns:
    Banco_mod = pd.concat([Banco_mod, pd.get_dummies(Banco_cat[col])], axis=1)

In [13]:
# Check the concatenated dataframe
Banco_mod.head(2)

Unnamed: 0,age,duration,campaign,pdays,previous,emp.var.rate,cons.price.idx,cons.conf.idx,euribor3m,nr.employed,...,fri,mon,thu,tue,wed,failure,nonexistent,success,Non-subscriber,Subscriber
0,0.481481,0.05307,0.0,1.0,0.0,0.9375,0.698753,0.60251,0.957379,0.859735,...,0,1,0,0,0,0,1,0,1,0
1,0.493827,0.030297,0.0,1.0,0.0,0.9375,0.698753,0.60251,0.957379,0.859735,...,0,1,0,0,0,0,1,0,1,0


In [14]:
Banco_mod.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 41188 entries, 0 to 41187
Data columns (total 65 columns):
age                    41188 non-null float64
duration               41188 non-null float64
campaign               41188 non-null float64
pdays                  41188 non-null float64
previous               41188 non-null float64
emp.var.rate           41188 non-null float64
cons.price.idx         41188 non-null float64
cons.conf.idx          41188 non-null float64
euribor3m              41188 non-null float64
nr.employed            41188 non-null float64
admin.                 41188 non-null uint8
blue-collar            41188 non-null uint8
entrepreneur           41188 non-null uint8
housemaid              41188 non-null uint8
management             41188 non-null uint8
retired                41188 non-null uint8
self-employed          41188 non-null uint8
services               41188 non-null uint8
student                41188 non-null uint8
technician             41188 non-nu
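
The loop above calls `pd.get_dummies` on one column at a time without a prefix, so a label that occurs in more than one column (for example `no`, which appears in both `housing` and `loan` in the preview) produces repeated column names. This does not affect the model here, because only the underlying values are used, but it makes the dataframe harder to inspect. The sketch below uses a toy dataframe (hypothetical data) to contrast the two behaviours; passing the whole frame to `pd.get_dummies` is one possible alternative, not the approach used in this notebook.

```python
import pandas as pd

# Toy frame with two categorical columns that share the labels 'no' and 'yes'
toy = pd.DataFrame({
    "housing": ["no", "yes", "unknown"],
    "loan": ["no", "no", "yes"],
})

# Per-column encoding without a prefix, as in the loop above
unprefixed = pd.concat(
    [pd.get_dummies(toy[c], dtype=int) for c in toy.columns], axis=1
)
print(list(unprefixed.columns))
# ['no', 'unknown', 'yes', 'no', 'yes']

# Encoding the whole frame prefixes each dummy with its source column name
prefixed = pd.get_dummies(toy, dtype=int)
print(list(prefixed.columns))
# ['housing_no', 'housing_unknown', 'housing_yes', 'loan_no', 'loan_yes']
```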

### Train-test split

In [None]:
# Get X and y data - drop the dummy target columns from X
# Get the values of X and y as NumPy arrays instead of pandas dataframes
X = Banco_mod.drop(['Subscriber','Non-subscriber'],axis=1).values
y = Banco_mod[['Subscriber', 'Non-subscriber']].values           

In [None]:
# Train-test split - Considering 90:10 ratio for train and test respectively
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size= 0.1,random_state = 101)
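
The arithmetic behind the 90:10 split can be worked out by hand. The sketch below assumes that the test count is rounded up and that the remaining rows go to the training set; it also derives the number of mini-batches per epoch implied by the batch size used later in the notebook.

```python
import math

n_samples = 41188      # rows reported by Banco_mod.info()
test_size = 0.1        # fraction passed to train_test_split
batch_size = 100       # BATCH_SIZE used later in the notebook

# Assumption: the test count is rounded up and the rest goes to training
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

# Mini-batches per epoch, as computed by int(X_train.shape[0] / BATCH_SIZE)
batches_per_epoch = n_train // batch_size

print(n_train, n_test, batches_per_epoch)   # 37069 4119 370
```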

### Neural Network - Multi-layer perceptron Model

Define a neural network with four hidden layers. The *tanh* activation function is used for all the hidden layers and for the output layer, and dropout is applied after each hidden layer.

In [None]:
# Define Multi-layer perceptron model
def mlp(_X, _weights, _biases, dropout_keep_prob):
    
    # Layer 1
    layer1 = tf.nn.dropout(tf.nn.tanh(tf.add(tf.matmul(_X, _weights['h1']), _biases['b1'])), dropout_keep_prob)
    
    # Layer 2
    layer2 = tf.nn.dropout(tf.nn.tanh(tf.add(tf.matmul(layer1, _weights['h2']), _biases['b2'])), dropout_keep_prob)
    
    # Layer 3
    layer3 = tf.nn.dropout(tf.nn.tanh(tf.add(tf.matmul(layer2, _weights['h3']), _biases['b3'])), dropout_keep_prob)
    
    # Layer 4
    layer4 = tf.nn.dropout(tf.nn.tanh(tf.add(tf.matmul(layer3, _weights['h4']), _biases['b4'])), dropout_keep_prob)
    
    # Output layer
    out = tf.nn.tanh(tf.add(tf.matmul(layer4, _weights['out']), _biases['out']))
    
    return out
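
To make the data flow easier to follow, here is a minimal NumPy sketch of the same forward pass with dropout switched off (equivalent to a keep probability of 1). The function `mlp_forward`, the list-based parameters, and the random input batch are illustrative assumptions; the layer sizes match the ones defined in the network parameters cell.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass with tanh after every layer, dropout omitted."""
    activation = x
    for w, b in zip(weights, biases):
        activation = np.tanh(activation @ w + b)
    return activation

rng = np.random.default_rng(0)
layer_sizes = [63, 100, 200, 200, 50, 2]   # input, four hidden layers, output

weights = [rng.normal(0.0, 0.1, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [rng.normal(0.0, 1.0, size=n_out) for n_out in layer_sizes[1:]]

batch = rng.random((5, 63))                # five fake records scaled to [0, 1)
out = mlp_forward(batch, weights, biases)

print(out.shape)                           # (5, 2)
print(np.abs(out).max() <= 1.0)            # True
```

Because the output layer also uses tanh, every output value lies between -1 and 1.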

In [None]:
# Network parameters
# The number of neurons in each hidden layer is chosen arbitrarily
n_input = Banco_mod.shape[1] - 2   # Input size (65 columns minus the 2 target dummies = 63 inputs)
n_hidden_1 = 100                   # 1st layer
n_hidden_2 = 200                   # 2nd layer
n_hidden_3 = 200                   # 3rd layer
n_hidden_4 = 50                    # 4th layer
n_classes = 2                      # Number of output classes

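# X and y - tf placeholders
# (these rebind the names X and y; the NumPy data remains available in X_train, X_test, y_train and y_test)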
X = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
dropout_keep_prob = tf.placeholder(tf.float32)

In [None]:
# Define weights and biases for each layer as dictionaries, initialised from a normal distribution
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1],stddev=0.1)),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2],stddev=0.1)),
    'h3': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_3],stddev=0.1)),
    'h4': tf.Variable(tf.random_normal([n_hidden_3, n_hidden_4],stddev=0.1)),
    'out': tf.Variable(tf.random_normal([n_hidden_4, n_classes],stddev=0.1)),                                   
}

biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'b3': tf.Variable(tf.random_normal([n_hidden_3])),
    'b4': tf.Variable(tf.random_normal([n_hidden_4])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
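
The shapes above determine the number of trainable parameters. As a rough size check, the following sketch counts them from the layer sizes alone (63 inputs, hidden layers of 100, 200, 200 and 50 units, and 2 outputs).

```python
layer_sizes = [63, 100, 200, 200, 50, 2]

# Each layer contributes a weight matrix (n_in * n_out) plus a bias vector (n_out)
params_per_layer = [n_in * n_out + n_out
                    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

print(params_per_layer)        # [6400, 20200, 40200, 10050, 102]
print(sum(params_per_layer))   # 76952
```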

### Model

In [None]:
# Neural Net Parameters
LEARNING_RATE = 0.005                                 # Learning rate
TRAINING_EPOCHS = 2000                                # Number of epochs
BATCH_SIZE = 100                                      # Batch size

In [None]:
# Build model to predict the target classes
pred = mlp(X, weights, biases, dropout_keep_prob)

In [22]:
# Loss/cost function
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=pred, labels=y))

# Optimizer to minimise the loss/cost - using Adam Optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE).minimize(cost)



In [None]:
# Accuracy of the predicted value
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
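
The cost and accuracy definitions can be reproduced in a few lines of NumPy. The sketch below uses hypothetical helpers `softmax_cross_entropy` and `accuracy` and a two-record toy example. It also illustrates one consequence of the model design: since `mlp` applies tanh to the output layer, the values passed to the softmax lie between -1 and 1, so the cross-entropy of even a perfectly classified batch cannot drop below `log(1 + exp(-2))`, roughly 0.127.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between softmax(logits) and one-hot labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

def accuracy(logits, labels):
    """Fraction of rows where the arg-max of logits matches the one-hot label."""
    return float((logits.argmax(axis=1) == labels.argmax(axis=1)).mean())

labels = np.array([[1.0, 0.0], [0.0, 1.0]])

# Most confident correct outputs that a tanh layer can produce: +1 and -1
best_logits = np.array([[1.0, -1.0], [-1.0, 1.0]])

print(round(softmax_cross_entropy(best_logits, labels), 4))   # 0.1269
print(accuracy(best_logits, labels))                          # 1.0
```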

### Train the model

In [24]:
# Initialise variables
initialise = tf.global_variables_initializer()

# Launch session
sess = tf.Session()
sess.run(initialise)


In [25]:
# Training loop
print("Epoch\t\tCost\tAccuracy")
print("---------------------------------")
for epoch in range(TRAINING_EPOCHS):
    avg_cost = 0.
    total_batch = int(X_train.shape[0] / BATCH_SIZE)
    
    # Loop over all batches
    for i in range(total_batch):
        
        # Choose random records for each training batch
        choice = np.random.randint(X_train.shape[0], size = BATCH_SIZE)
        batch_xs = X_train[choice, :]
        batch_ys = y_train[choice, :]
        
        # Fit using batched data
        sess.run(optimizer, feed_dict={X: batch_xs, y: batch_ys, dropout_keep_prob: 0.9})
        
        # Calculate average cost 
        avg_cost += sess.run(cost, feed_dict={X: batch_xs, y: batch_ys, dropout_keep_prob:1.})/total_batch
    
    # Display progress every 20 epochs
    # Note: the accuracy shown is computed on the last mini-batch of the epoch only
    if epoch % 20 == 0:

        training_acc = sess.run(accuracy, feed_dict={X: batch_xs, y: batch_ys, dropout_keep_prob: 1.})
        print("%04d/%04d\t%.4f\t%.3f" % (epoch, TRAINING_EPOCHS, avg_cost, training_acc))
        

Epoch		Cost	Accuracy
---------------------------------
0000/2000	0.3473	0.840
0020/2000	0.2714	0.920
0040/2000	0.2692	0.910
0060/2000	0.2660	0.920
0080/2000	0.2677	0.930
0100/2000	0.2644	0.920
0120/2000	0.2572	0.890
0140/2000	0.2615	0.940
0160/2000	0.2567	0.920
0180/2000	0.2596	0.920
0200/2000	0.2581	0.950
0220/2000	0.2569	0.910
0240/2000	0.2531	0.870
0260/2000	0.2566	0.890
0280/2000	0.2580	0.910
0300/2000	0.2606	0.930
0320/2000	0.2598	0.900
0340/2000	0.2632	0.870
0360/2000	0.2591	0.870
0380/2000	0.2609	0.940
0400/2000	0.2555	0.930
0420/2000	0.2643	0.890
0440/2000	0.2581	0.850
0460/2000	0.2599	0.950
0480/2000	0.2569	0.930
0500/2000	0.2553	0.910
0520/2000	0.2564	0.940
0540/2000	0.2565	0.880
0560/2000	0.2593	0.890
0580/2000	0.2577	0.950
0600/2000	0.2581	0.880
0620/2000	0.2541	0.910
0640/2000	0.2553	0.880
0660/2000	0.2576	0.910
0680/2000	0.2556	0.960
0700/2000	0.2603	0.900
0720/2000	0.2572	0.940
0740/2000	0.2596	0.900
0760/2000	0.2626	0.920
0780/2000	0.2587	0.930
0800/2000	0.2606	0.920
08
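
The training loop draws every batch with `np.random.randint`, which samples with replacement, so an epoch in this notebook does not necessarily visit each training record exactly once. The sketch below contrasts that scheme with a common alternative (shuffling once per epoch and slicing consecutive batches). The sizes are toy values and the alternative is shown only for comparison; it is not what the notebook does.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, batch_size = 1000, 100

# Sampling with replacement, one draw per batch (mirrors the training loop)
with_replacement = np.concatenate(
    [rng.integers(n_train, size=batch_size) for _ in range(n_train // batch_size)]
)

# Alternative: shuffle once per epoch, then slice consecutive batches
order = rng.permutation(n_train)
shuffled_batches = [order[i:i + batch_size] for i in range(0, n_train, batch_size)]

# With replacement, some records are drawn repeatedly and others not at all
print(len(np.unique(with_replacement)) < n_train)          # True
# With a per-epoch shuffle, every record appears exactly once
print(len(np.unique(np.concatenate(shuffled_batches))))    # 1000
```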

### Validation on Test data

In [26]:
# Testing

testing_acc = sess.run(accuracy, feed_dict={X: X_test, y: y_test, dropout_keep_prob:1.})
print ("Test accuracy: %.4f" % (testing_acc))

Test accuracy: 0.9031
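
Accuracy alone can be hard to interpret if the two target classes are not equally frequent. One possible sanity check, sketched here with a hypothetical helper `majority_baseline_accuracy` and toy labels (not the real dataset), is to compare the test accuracy with the score obtained by always predicting the most frequent class.

```python
import numpy as np

def majority_baseline_accuracy(y_onehot):
    """Accuracy obtained by always predicting the most frequent class."""
    class_counts = y_onehot.sum(axis=0)
    return float(class_counts.max() / class_counts.sum())

# Toy one-hot labels (hypothetical): 1 'Subscriber' row and 9 'Non-subscriber' rows
y_toy = np.array([[1, 0]] + [[0, 1]] * 9)

print(majority_baseline_accuracy(y_toy))   # 0.9
```

In this notebook the same helper could be applied to `y_test`, which has the same one-hot layout.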


In [None]:
# Close the session
sess.close()

### Summary

| Number of epochs | Learning rate | Accuracy (%) |
|------------------|---------------|--------------|
| 500              | 0.001         | 89.05        |
| 1000             | 0.001         | 89.68        |
| 2000             | 0.005         | 90.31        |

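Accuracy is lower than that of the __*XGBoost*__ model; however, it may improve with more training epochs. Note that the learning rate also differs between the 1000-epoch and 2000-epoch runs, so the gain in the last row cannot be attributed to the number of epochs alone.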