# Deep Learning via ANN

## Importing libraries

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf

In [2]:
tf.__version__

'2.12.0'

## Importing dataset

In [3]:
dataset = pd.read_csv('American Express User Exit Prediction.csv')
X = dataset.iloc[:, 0:-1].values
y = dataset.iloc[:, -1].values

In [None]:
print(X)

[[553 'Delhi' 'Female' ... 4 1 274150]
 [447 'Bengaluru' 'Male' ... 4 1 519360]
 [501 'Delhi' 'Female' ... 4 1 545501]
 ...
 [627 'Mumbai' 'Female' ... 4 0 494067]
 [600 'Bengaluru' 'Female' ... 2 1 109375]
 [553 'Delhi' 'Male' ... 4 1 180031]]


## Encoding categorical data

### Gender column: Label Encoding

In [4]:
from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()
X[:, 2] = label_encoder.fit_transform(X[:, 2])

In [5]:
print(X)

[[553 'Delhi' 0 ... 4 1 274150]
 [447 'Bengaluru' 1 ... 4 1 519360]
 [501 'Delhi' 0 ... 4 1 545501]
 ...
 [627 'Mumbai' 0 ... 4 0 494067]
 [600 'Bengaluru' 0 ... 2 1 109375]
 [553 'Delhi' 1 ... 4 1 180031]]


### Geography column: One-Hot Encoding

In [6]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers = [('encoder', OneHotEncoder(), [1])], remainder='passthrough')
X = np.array(ct.fit_transform(X))

In [7]:
print(X)

[[0.0 1.0 0.0 ... 4 1 274150]
 [1.0 0.0 0.0 ... 4 1 519360]
 [0.0 1.0 0.0 ... 4 1 545501]
 ...
 [0.0 0.0 1.0 ... 4 0 494067]
 [1.0 0.0 0.0 ... 2 1 109375]
 [0.0 1.0 0.0 ... 4 1 180031]]


## Splitting dataset into Training & Test set

In [9]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 21)

## Feature Scaling

In [10]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
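
The cell above fits the scaler on the training set and only applies it to the test set. A minimal pure-Python sketch of that idea follows. The helper names `fit_standardizer` and `standardize` and the sample values are hypothetical, chosen for illustration; the sketch assumes the usual standardization formula z = (x - mean) / std with the population standard deviation.

```python
from statistics import fmean, pstdev

def fit_standardizer(column):
    """Return the mean and population standard deviation of one feature column."""
    return fmean(column), pstdev(column)

def standardize(column, mean, std):
    """Apply z = (x - mean) / std using statistics computed elsewhere."""
    return [(x - mean) / std for x in column]

# Hypothetical values for a single feature (not taken from the dataset).
train_col = [500.0, 550.0, 600.0, 650.0]
test_col = [575.0, 700.0]

# Statistics are learned from the training data only.
mean, std = fit_standardizer(train_col)

train_scaled = standardize(train_col, mean, std)
# The test data reuses the training statistics, so no information
# from the test set leaks into the preprocessing step.
test_scaled = standardize(test_col, mean, std)
```

Reusing the training mean and standard deviation keeps the test set a fair stand-in for unseen data, which is why the notebook calls `fit_transform` on `X_train` but only `transform` on `X_test`.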

## ANN

### Initialization

In [None]:
ann = tf.keras.models.Sequential()

### Adding input layer and first hidden layer

In [None]:
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))

### Adding second hidden layer

In [None]:
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))

### Adding output layer

In [None]:
ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

## ANN Training

### Compiling ANN

In [None]:
ann.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

### Training on training dataset

In [14]:
ann.fit(X_train, y_train, batch_size = 32, epochs = 120)

Epoch 1/120
Epoch 2/120
...

<keras.callbacks.History at 0x7d6aa8405120>


## ANN Code Explanation

1. `ann = tf.keras.models.Sequential()`: This line initializes a sequential model in Keras. A sequential model is a linear stack of layers, where you can add one layer at a time.

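2. `ann.add(tf.keras.layers.Dense(units = 6, activation = 'relu'))`: This line adds the first dense (fully connected) layer, which acts as the first hidden layer. No separate input layer is declared; Keras infers the number of input features the first time the model sees data. The arguments mean: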
   - `units = 6`: This specifies that the layer should have 6 neurons or units. The number of units in a layer is a hyperparameter and can be adjusted based on the problem at hand.
   - `activation = 'relu'`: The Rectified Linear Unit (ReLU) activation function is used in this layer. It's a common choice for hidden layers in neural networks. ReLU introduces non-linearity into the model.

3. `ann.add(tf.keras.layers.Dense(units = 6, activation = 'relu'))`: This line adds another hidden layer with 6 units and ReLU activation. Multiple hidden layers can help the network learn complex patterns in the data.

4. `ann.add(tf.keras.layers.Dense(units = 1, activation = 'sigmoid'))`: This line adds the output layer to the neural network. Here:
   - `units = 1`: A single output neuron, which is the standard setup for binary classification (a 0 or 1 label).
   - `activation = 'sigmoid'`: The sigmoid function squashes the output into the range 0 to 1, so it can be read as the predicted probability of the positive class.

5. `ann.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])`: This line compiles the neural network, specifying:
   - `optimizer = 'adam'`: Adam is a widely used optimizer for neural networks. It keeps running estimates of each parameter's gradient mean and squared gradient and uses them to adapt the step size per parameter.
   - `loss = 'binary_crossentropy'`: The standard loss function for binary classification. It penalizes the gap between the predicted probability and the actual 0/1 label, with confident wrong predictions penalized most heavily.
   - `metrics = ['accuracy']`: Accuracy is computed and displayed during training. It is only reported; it does not affect the weight updates.

6. `ann.fit(X_train, y_train, batch_size = 32, epochs = 120)`: This line trains the neural network:
   - `X_train` and `y_train` are the scaled training features and their labels.
   - `batch_size = 32`: The training data is processed in batches of 32 samples, and the weights are updated once per batch rather than once per sample or once per full pass.
   - `epochs = 120`: The number of full passes over the training set. The model sees the entire training dataset 120 times, with many batch-level weight updates in each pass.
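
The activation and loss functions named in the steps above can be sketched in plain Python. This is an illustrative re-implementation, not the Keras code itself: the function names, the sample inputs, and the `eps` clipping value (used here only to avoid `log(0)`) are assumptions.

```python
import math

def relu(z):
    """ReLU: pass positive values through, clamp negatives to zero."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid: map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one sample with true label y_true (0 or 1) and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip so the logarithm is always defined
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# Hypothetical pre-activation values for three hidden units.
hidden = [relu(z) for z in (-1.5, 0.0, 2.0)]

# A pre-activation of 0 at the output neuron gives a probability of exactly 0.5.
p = sigmoid(0.0)

# For a true label of 1, a confident correct prediction costs far less
# than a confident wrong one.
loss_good = binary_crossentropy(1, 0.9)
loss_bad = binary_crossentropy(1, 0.1)
```

With these inputs, `hidden` is `[0.0, 0.0, 2.0]`, and `loss_bad` is much larger than `loss_good`, which is the behavior that pushes the network toward well-calibrated probabilities.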

Overall, this code defines a simple feedforward neural network with two hidden layers and an output layer for binary classification. It is trained with the Adam optimizer and binary cross-entropy loss, and accuracy is tracked as the metric. The architecture and hyperparameters can be adjusted to suit the problem.
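
To get a feel for how small this network is, the sketch below counts its trainable parameters and the number of weight updates during training. It assumes 11 input features, which matches the length of the feature vector passed to the single prediction in this notebook. The training-set size of 8,000 rows is a hypothetical round number, and `dense_params` is an illustrative helper, not a Keras function.

```python
import math

def dense_params(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit in a dense layer."""
    return n_inputs * units + units

n_features = 11          # assumed from the 11-value single-prediction input
layer_units = [6, 6, 1]  # two hidden layers and the output layer

counts = []
n_in = n_features
for units in layer_units:
    counts.append(dense_params(n_in, units))
    n_in = units  # this layer's output feeds the next layer

total_params = sum(counts)  # 72 + 42 + 7

# Hypothetical training-set size; the last batch may be smaller than 32.
n_train = 8000
steps_per_epoch = math.ceil(n_train / 32)
total_updates = steps_per_epoch * 120  # 120 epochs
```

Under these assumptions the model has 121 trainable parameters and performs 250 weight updates per epoch, or 30,000 over the full run.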

## Predictions

### Single Prediction

In [16]:
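# Features follow the training column order: one-hot city (3 values) first, then the rest.
# The input is scaled with the fitted scaler, and the probability is thresholded at 0.5.
print(ann.predict(sc.transform([[0.0, 1.0, 0.0, 501, 0, 32, 2, 0.0, 4, 1, 545501]])) > 0.5)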

[[False]]


### Prediction on test set

In [18]:
y_pred = ann.predict(X_test)
y_pred = (y_pred > 0.5)  # convert probabilities to class labels
# Left column: predicted label, right column: actual label
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

[[0 0]
 [1 0]
 [1 1]
 ...
 [1 1]
 [0 0]
 [0 0]]


### Confusion Matrix

In [22]:
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[1536   66]
 [ 225  159]]


0.8534743202416919
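
Accuracy alone can hide how the model treats each class, so it can help to read precision and recall off the same matrix. The sketch below hard-codes the confusion matrix printed above and relies on scikit-learn's layout (rows are actual labels, columns are predicted labels). It assumes label 1 marks a customer who exited; the variable names are illustrative.

```python
# Confusion matrix copied from the output above:
# rows = actual class (0, 1), columns = predicted class (0, 1).
cm = [[1536, 66],
      [225, 159]]

(tn, fp), (fn, tp) = cm

total = tn + fp + fn + tp
accuracy = (tp + tn) / total    # share of all predictions that are correct
precision = tp / (tp + fp)      # of predicted exits, how many really exited
recall = tp / (tp + fn)         # of actual exits, how many were caught
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

The accuracy reproduces the 0.8535 reported by `accuracy_score`, while recall comes out at roughly 0.41: under the assumption that class 1 means exit, the model misses more than half of the customers who actually left.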