# Machine Learning
## Lab \#6: Binary Classifier using ANN
### Textbook is available @ [https://www.github.com/a-mhamdi/isetbz](https://www.github.com/a-mhamdi/isetbz)
---

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Import `sklearn`.

In [2]:
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

Import `keras`.

In [3]:
from keras.models import Sequential
from keras.layers import Dense

Using TensorFlow backend.


Load the data using `pandas`.

In [4]:
df = pd.read_csv('./datasets/Churn_Modelling.csv')

In [5]:
df.head(3)

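| | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |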


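Separate the features from the target, then encode the categorical columns.

In [6]:
# Features: columns 3 to 12 (CreditScore ... EstimatedSalary); target: column 13 (Exited)
X = df.iloc[:, 3:13].values
y = df.iloc[:, 13].values

# Encode Geography (column 1) and Gender (column 2) as integers
label_encoder_X_country = LabelEncoder()
label_encoder_X_gender = LabelEncoder()

X[:, 1] = label_encoder_X_country.fit_transform(X[:, 1])
X[:, 2] = label_encoder_X_gender.fit_transform(X[:, 2])

# One-hot encode Geography; the encoded columns come first in the output
one_hot_encoder = ColumnTransformer([("Geography", OneHotEncoder(), [1])], remainder='passthrough')

X = one_hot_encoder.fit_transform(X)
X = np.array(X, dtype=float)
# Drop the first dummy column to avoid the dummy variable trap
X = X[:, 1:]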
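
The same encoding can be expressed more compactly. The sketch below is not the lab's method: it swaps in `OneHotEncoder(drop="first")` and `OrdinalEncoder` inside a single `ColumnTransformer`, and it runs on a small hypothetical frame (`toy`) rather than the churn dataset.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical toy frame that mimics three of the dataset's columns
toy = pd.DataFrame({
    "CreditScore": [619, 608, 502, 699],
    "Geography": ["France", "Spain", "France", "Germany"],
    "Gender": ["Female", "Female", "Female", "Male"],
})

encoder = ColumnTransformer(
    [
        # drop="first" removes the first category (France) to avoid the dummy variable trap
        ("geography", OneHotEncoder(drop="first"), ["Geography"]),
        # Gender has two categories, so a single 0/1 column is enough
        ("gender", OrdinalEncoder(), ["Gender"]),
    ],
    remainder="passthrough",  # keep CreditScore unchanged
    sparse_threshold=0,       # always return a dense array
)

# Column order: Geography_Germany, Geography_Spain, Gender, CreditScore
X_toy = encoder.fit_transform(toy).astype(float)
print(X_toy)
```

Selecting columns by name instead of by position keeps the code valid if the column order of the CSV file changes.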

Standardize the features to zero mean and unit variance.

In [7]:
sc = StandardScaler()
X = sc.fit_transform(X)

Split the dataset into training & testing sets.

In [8]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
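
The cells above fit the scaler on the full dataset before splitting, so the test rows contribute to the mean and variance used for scaling. A common alternative is to split first and fit the scaler on the training rows only. The sketch below illustrates that order on synthetic data; the names `X_demo`, `y_demo`, and `scaler` are hypothetical and the numbers are unrelated to the churn dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix and the binary target
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y_demo = rng.integers(0, 2, size=100)

# 1. Split first
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

# 2. Fit the scaler on the training rows only
scaler = StandardScaler()
X_tr_scaled = scaler.fit_transform(X_tr)

# 3. Reuse the training statistics to transform the test rows
X_te_scaled = scaler.transform(X_te)

print(X_tr_scaled.mean(axis=0), X_tr_scaled.std(axis=0))
```

With this order the training features have exactly zero mean and unit variance, while the test features are only approximately standardized, which reflects how unseen data would be treated.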

Define the artificial neural network architecture.

In [9]:
clf_ann = Sequential()

Input layer & first hidden layer

In [10]:
num_features = X_train.shape[1]
clf_ann.add(Dense(6, input_shape = (num_features, ), activation = 'relu'))

Second hidden layer

In [11]:
clf_ann.add(Dense(6, activation = 'relu'))

Output layer

In [12]:
num_outputs = 1  # a single sigmoid unit outputs the probability of the positive class (Exited = 1)
clf_ann.add(Dense(num_outputs, activation = 'sigmoid'))
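
To make the three `Dense` layers concrete, the following NumPy sketch computes the same kind of forward pass: two ReLU layers of 6 units followed by one sigmoid unit. The layer sizes come from the model summary (11 input features); the weights are random placeholders, not the trained ones, and the names `weights`, `biases`, and `forward` are illustrative only.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
layer_sizes = [11, 6, 6, 1]  # inputs, hidden 1, hidden 2, output

# One weight matrix and one bias vector per Dense layer (random placeholders)
weights = [rng.normal(scale=0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """Forward pass: ReLU on the hidden layers, sigmoid on the output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return sigmoid(h @ weights[-1] + biases[-1])

batch = rng.normal(size=(4, 11))  # four hypothetical standardized samples
probs = forward(batch)
print(probs.shape)  # one probability per sample
```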

Compile the model with the Adam optimizer and the binary cross-entropy loss.

In [13]:
clf_ann.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
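
The `binary_crossentropy` loss averages $-[y \log p + (1 - y) \log(1 - p)]$ over the samples, where $p$ is the predicted probability of the positive class. The sketch below is a plain NumPy version for illustration, not Keras's implementation; the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy between labels in {0, 1} and predicted probabilities."""
    p = np.clip(p_pred, eps, 1.0 - eps)  # assumed guard against log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([1, 0, 1, 0])
confident = np.array([0.9, 0.1, 0.8, 0.2])   # mostly correct, confident predictions
uncertain = np.array([0.5, 0.5, 0.5, 0.5])   # uninformative predictions

print(binary_crossentropy(y_true, confident))
print(binary_crossentropy(y_true, uncertain))  # equals log(2) for p = 0.5
```

Confident correct predictions give a lower loss than uninformative ones, which is what the optimizer exploits during training.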

Print a summary of the neural network architecture.

In [14]:
clf_ann.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 6)                 72        
_________________________________________________________________
dense_2 (Dense)              (None, 6)                 42        
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 7         
Total params: 121
Trainable params: 121
Non-trainable params: 0
_________________________________________________________________
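
The parameter counts in the summary can be checked by hand: a dense layer with `n_inputs` inputs and `n_units` units holds `n_inputs * n_units` weights plus `n_units` biases. The helper `dense_params` below is a hypothetical name used only for this check; the input size of 11 is inferred from the 72 parameters of the first layer.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return (n_inputs + 1) * n_units

layer_sizes = [11, 6, 6, 1]  # inputs, hidden 1, hidden 2, output
counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

print(counts)       # [72, 42, 7]
print(sum(counts))  # 121
```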


Fit the classifier.

In [15]:
clf_ann.fit(x=X_train, y=y_train, batch_size=200, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.callbacks.History at 0x7f46f71cefa0>

Evaluate the model on the test set.

In [16]:
scores = clf_ann.evaluate(x=X_test, y=y_test, batch_size=100, verbose=1)



Predict the churn probabilities for the test set, then threshold them at 0.5 to obtain class labels.

In [17]:
y_pred = clf_ann.predict(X_test)
y_pred = (y_pred > 0.5)

Compute the confusion matrix.

In [18]:
cm = confusion_matrix(y_test, y_pred)
# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = cm.ravel()

In [19]:
print('Accuracy is about {}%.'.format(100*(tp+tn)/cm.sum()))

Accuracy is about 83.2%.
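
For binary labels, scikit-learn's `confusion_matrix` puts the true classes in rows and the predicted classes in columns, so flattening it yields `tn, fp, fn, tp`. The sketch below shows that ordering on a small hypothetical pair of label vectors (`y_true_demo`, `y_pred_demo`) and derives accuracy, precision, and recall from the four counts.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels, unrelated to the churn dataset
y_true_demo = np.array([0, 0, 0, 1, 1, 0, 1, 0])
y_pred_demo = np.array([0, 1, 0, 1, 0, 0, 1, 0])

cm_demo = confusion_matrix(y_true_demo, y_pred_demo)
tn, fp, fn, tp = (int(v) for v in cm_demo.ravel())

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that are found

print(tn, fp, fn, tp)  # 4 1 1 2
print(accuracy)        # 0.75
```

Precision and recall are worth checking alongside accuracy when one class is much rarer than the other.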


In [20]:
loss, accuracy = scores
print('The loss value is: {}.\n\nThe accuracy percentage is: {}%. '.format(loss, 100*accuracy))

The loss value is: 0.4155806913971901.

The accuracy percentage is: 83.20000171661377%. 
