# Machine Learning
## Lab \#6: Binary Classifier using ANN
### Textbook is available @ [https://www.github.com/a-mhamdi/mlpy](https://www.github.com/a-mhamdi/mlpy)
---

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.rcParams['figure.dpi'] = 300

`sklearn` and `keras` are imported further down, in the cells where each is first used.

Load the data using `pandas`.

In [2]:
df = pd.read_csv('./datasets/Churn_Modelling.csv')

In [3]:
df.head(3)

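| | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |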


In [4]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

In [5]:
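# Features: columns 3..12 (CreditScore ... EstimatedSalary); target: column 13 (Exited)
X = df.iloc[:, 3:13].values
y = df.iloc[:, 13].values

# Encode the two categorical columns (Geography, Gender) as integers
label_encoder_X_country = LabelEncoder()
label_encoder_X_gender = LabelEncoder()

X[:, 1] = label_encoder_X_country.fit_transform(X[:, 1])
X[:, 2] = label_encoder_X_gender.fit_transform(X[:, 2])

# One-hot encode Geography; the remaining columns pass through unchanged
one_hot_encoder = ColumnTransformer([('Geography', OneHotEncoder(), [1])], remainder='passthrough')

X = one_hot_encoder.fit_transform(X)
X = np.array(X, dtype=float)
# Drop the first Geography dummy column (it is redundant given the other two)
X = X[:, 1:]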
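
The `X[:, 1:]` step above removes one of the three Geography dummy columns. As a minimal sketch of the same idea, `OneHotEncoder` can drop the first category itself through its `drop='first'` option. The four-row `geo` array below is a made-up toy input, not taken from the dataset.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy column with three distinct countries
geo = np.array([['France'], ['Spain'], ['Germany'], ['France']])

# Categories are sorted alphabetically (France, Germany, Spain);
# drop='first' removes the France column, leaving Germany and Spain
encoder = OneHotEncoder(drop='first')
dummies = encoder.fit_transform(geo).toarray()

print(encoder.categories_[0])
print(dummies)
```

A row of zeros then stands for the dropped category (France here), so no information is lost.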

Scale the features.

In [6]:
from sklearn.preprocessing import StandardScaler

In [7]:
sc = StandardScaler()
X = sc.fit_transform(X)

Split the dataset into training & testing sets.

In [8]:
from sklearn.model_selection import train_test_split

In [9]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
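
In this lab the scaler is fitted on the whole dataset before the split, so the test rows contribute to the mean and standard deviation used for scaling. A common alternative, sketched here on synthetic data, is to split first and fit the scaler on the training part only. The names `X_demo`, `y_demo`, `X_tr`, `X_te` and `scaler` are hypothetical and chosen to avoid clashing with the variables of the lab.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix and binary target (assumption)
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y_demo = rng.integers(0, 2, size=100)

# Split first ...
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# ... then learn the scaling statistics from the training rows only
scaler = StandardScaler()
X_tr_scaled = scaler.fit_transform(X_tr)
X_te_scaled = scaler.transform(X_te)  # reuse the training mean and std

print(X_tr_scaled.mean(axis=0).round(6), X_tr_scaled.std(axis=0).round(6))
```

With this ordering the scaled training columns have zero mean and unit variance, while the test columns are only approximately standardized.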

Define the artificial neural network architecture.

In [10]:
from keras.models import Sequential

In [11]:
clf_ann = Sequential()

In [12]:
from keras.layers import Dense

Input layer & first hidden layer

In [13]:
num_features = X_train.shape[1]
clf_ann.add(Dense(6, input_shape=(num_features, ), activation='relu'))

Second hidden layer

In [14]:
clf_ann.add(Dense(6, activation='relu'))

Output layer

In [15]:
num_classes = 1
clf_ann.add(Dense(num_classes, activation='sigmoid'))
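
To make the three `Dense` layers concrete, here is a rough NumPy sketch of the forward pass they describe: two ReLU layers of 6 units followed by a single sigmoid unit. The weights are random placeholders rather than trained values, and the input size of 11 is an assumption based on the feature count after encoding.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
layer_sizes = [11, 6, 6, 1]  # inputs, hidden 1, hidden 2, output (assumed)

# Random placeholder parameters: one weight matrix and bias vector per layer
weights = [rng.normal(scale=0.1, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

x = rng.normal(size=(4, 11))  # a batch of 4 hypothetical samples

h = x
for W, b in zip(weights[:-1], biases[:-1]):
    h = relu(h @ W + b)                              # hidden layers
proba = sigmoid(h @ weights[-1] + biases[-1])        # output layer

print(proba.shape)  # (4, 1)
```

Each row of `proba` is a value between 0 and 1 that can be read as the predicted probability of the positive class.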

In [16]:
clf_ann.compile('Adam', loss='binary_crossentropy', metrics=['Accuracy', 'Precision', 'Recall'])
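
For reference, the `binary_crossentropy` loss averages $-[y \log p + (1-y)\log(1-p)]$ over the samples. The NumPy sketch below illustrates that formula on made-up values. The helper name `binary_crossentropy_np` and the `eps` clipping are assumptions added here for numerical safety, not the Keras implementation.

```python
import numpy as np

def binary_crossentropy_np(y_true, p, eps=1e-7):
    """Mean binary cross-entropy between labels y_true and probabilities p."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

# Hypothetical labels and predicted probabilities
y_demo = np.array([1, 0, 1, 0])
p_demo = np.array([0.9, 0.2, 0.6, 0.4])

loss = binary_crossentropy_np(y_demo, p_demo)
print(round(loss, 4))  # 0.3375
```

Confident correct predictions (0.9 for a positive) contribute little to the loss, while hesitant ones (0.6 for a positive) contribute more.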

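The available metrics can be listed with the following command:
```python
from keras import metrics
# In an interactive session, type `metrics.` and press Tab for autocompletion,
# or list the public names directly:
print([name for name in dir(metrics) if not name.startswith('_')])
```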

An overall description of the neural network architecture.

In [17]:
clf_ann.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 6)                 72        
                                                                 
 dense_1 (Dense)             (None, 6)                 42        
                                                                 
 dense_2 (Dense)             (None, 1)                 7         
                                                                 
Total params: 121
Trainable params: 121
Non-trainable params: 0
_________________________________________________________________
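
The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` units holds `n_in * n_out` weights plus `n_out` biases. The short sketch below redoes that arithmetic, assuming 11 input features, which is what the 72 parameters of the first layer imply. `dense_params` is a hypothetical helper.

```python
def dense_params(n_in, n_out):
    """Number of trainable parameters of a fully connected layer."""
    return n_in * n_out + n_out  # weights + biases

layer_sizes = [11, 6, 6, 1]  # inputs, hidden 1, hidden 2, output
counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

print(counts, sum(counts))  # [72, 42, 7] 121
```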


Fit the classifier.

In [18]:
clf_ann.fit(x=X_train, y=y_train, batch_size=200, epochs=20, verbose=1)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7f93782b6920>

Evaluate the model.

In [19]:
scores = clf_ann.evaluate(x=X_test, y=y_test, batch_size=100, verbose=1)



In [20]:
scores

[0.3765339255332947,
 0.8420000076293945,
 0.7149758338928223,
 0.36543211340904236]

In [21]:
print('The loss value is: {}.\n\
The accuracy, precision and recall percentages are resp. {}%, {}% and {}%'.format(scores[0], 100*scores[1], 100*scores[2], 100*scores[3]))

The loss value is: 0.3765339255332947.
The accuracy, precision and recall percentages are resp. 84.20000076293945%, 71.49758338928223% and 36.543211340904236%


Let's predict an output.

In [22]:
y_pred = clf_ann.predict(X_test)
y_pred = (y_pred > 0.5)
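
The comparison with `0.5` turns the sigmoid outputs into class labels. A tiny sketch on made-up probabilities shows the effect. The array `probs_demo` is hypothetical and has the same column shape that `predict` returns for a single-output model.

```python
import numpy as np

# Hypothetical sigmoid outputs, one row per sample
probs_demo = np.array([[0.12], [0.50], [0.73], [0.49]])

# Strictly greater than 0.5 is labeled 1, everything else 0
labels_demo = (probs_demo > 0.5).astype(int).ravel()

print(labels_demo)  # [0 0 1 0]
```

Note that a probability of exactly 0.5 falls into class 0 with this strict comparison. Lowering the threshold would trade precision for recall.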



Get the confusion matrix.

In [23]:
from sklearn.metrics import confusion_matrix

In [24]:
cm = confusion_matrix(y_test, y_pred)
tn, fp, fn, tp = cm.ravel()  # for binary labels, sklearn flattens the matrix as tn, fp, fn, tp

In [25]:
print('Accuracy is about {}%.'.format(100 * (tp + tn) / cm.sum()))

Accuracy is about 84.2%.
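
Precision and recall, reported earlier by `evaluate`, can be derived from a confusion matrix in the same way as accuracy. The sketch below uses ten made-up labels rather than the lab's predictions. For binary labels, `confusion_matrix(...).ravel()` returns the counts in the order `tn, fp, fn, tp`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions
y_true_demo = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_hat_demo = np.array([0, 0, 1, 0, 1, 0, 1, 0, 1, 0])

cm_demo = confusion_matrix(y_true_demo, y_hat_demo)
tn_d, fp_d, fn_d, tp_d = cm_demo.ravel()

accuracy = (tp_d + tn_d) / cm_demo.sum()
precision = tp_d / (tp_d + fp_d)  # share of predicted positives that are correct
recall = tp_d / (tp_d + fn_d)     # share of actual positives that are found

print(accuracy, precision, recall)  # 0.8 0.75 0.75
```

A high accuracy combined with a low recall, as observed in this lab, typically indicates that the classifier misses many samples of the positive class.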
