# Artificial Neural Networks: Algorithm

In the example below, an _Artificial Neural Network (ANN)_ is used to create a _Churn Model_. _Churn Modelling_ is the practice of analyzing current customer data to predict when and why customers will leave a business in the future.

## Installing Keras through Anaconda:

TensorFlow is an open-source _Deep Learning_ library developed by Google. Keras is an open-source _Neural Network_ library that serves as an abstraction layer for TensorFlow. Essentially, TensorFlow is the engine that executes the model, while Keras provides a simpler interface for creating it.

__NOTE:__ Since *TensorFlow* includes Keras as a module, only *TensorFlow* needs to be installed.

```bash
    # CPU Version
    conda install -c conda-forge tensorflow
    
    # GPU Version
    conda install -c anaconda tensorflow-gpu
```
<hr>

## ANN Ideology

### Rule of Thumb for Hidden Layer Neuron Count:
It is often stated as good practice for the hidden layers in an ANN to have a neuron count between the neuron counts of the input and output layers. A good starting point is therefore the average of the two:

$$\LARGE C_H = \frac{C_I + C_O}{2}$$

__Where:__
* $C_H$: Hidden Layer Neuron Count 
* $C_I$: Input Layer Neuron Count
* $C_O$: Output Layer Neuron Count
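
As a quick illustration of this rule of thumb, the sketch below computes a starting hidden-layer size for a network with 11 inputs and 1 output, which are the sizes used in the Code section of these notes. The function name `hidden_neuron_count` is hypothetical, and rounding up when the average is not a whole number is an assumption rather than part of the rule.

```python
import math

def hidden_neuron_count(input_count, output_count):
    """Average of the input and output layer sizes.

    Rounding up is an assumption for cases where the average is fractional.
    """
    return math.ceil((input_count + output_count) / 2)

# 11 input features and 1 output value.
print(hidden_neuron_count(11, 1))  # prints 6
```

This is only a starting point; the neuron count is still a hyperparameter that may need tuning.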


### Activation Functions

The hidden layers of an ANN commonly use the _rectifier_ activation function, while the output layer uses the _sigmoid_ activation function.

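__Sigmoid Function:__ The sigmoid function maps each value in the output layer to a number between 0 and 1, which can be read as the probability of that output occurring. Each output is transformed independently, so with several output neurons these values do not necessarily add up to 1 (that property belongs to the _softmax_ function). With a single output neuron, as in a binary classifier, the value is the probability of the positive class.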

__Rectifier Function:__ Because they squash their output into a fixed range, functions like the sigmoid saturate: inputs of large magnitude all map to nearly the same value, so the differences between them are lost. These functions are also comparatively expensive to compute. A purely linear function isn't preferred either, because it would not allow the model to capture complex, non-linear relationships within the dataset. These are some of the main issues addressed by the rectifier function.
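
To make the difference between the two functions concrete, here is a minimal pure-Python sketch. The helper names `sigmoid` and `relu` are chosen for illustration and are not taken from Keras.

```python
import math

def sigmoid(x):
    """Squashes any real number into the open range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectifier: passes positive values through and clamps negatives to 0."""
    return max(0.0, x)

# Compare both functions on a few sample inputs.
for x in (-10.0, 0.0, 2.0, 10.0, 20.0):
    print(f"x={x:6.1f}  sigmoid={sigmoid(x):.6f}  relu={relu(x):.1f}")
```

The sigmoid outputs for 10 and 20 are both very close to 1, which is the saturation described above, while the rectifier keeps the two inputs distinct.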

<hr>

## Keras Terminology

* __kernel_initializer__: Determines which statistical distribution or function to use when initializing weights.


* __activation__: The activation function to use in a neural layer.
    * __ReLU__: A unit employing the rectifier function is called a _Rectified Linear Unit_.


* __optimizer__: The function that determines how to tune an ANN's weights when performing backpropagation.
    * __adam__: Adam, whose name is derived from _adaptive moment estimation_, is one of the most popular optimizer functions to date.


* __loss__: Also known as the cost function.
    * __binary_crossentropy__: Can also be called _Sigmoid Cross-Entropy_. The cost is the cross-entropy between each data point's true label and its predicted probability (the sigmoid output), averaged over the data points.


* __metrics__: Used to evaluate model performance. The metrics specified are only reported; they are not used to train the ANN.
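
As an illustration of the loss described above, the following sketch computes binary cross-entropy by hand in plain Python. The function name, the sample values, and the clipping constant `eps` are assumptions made for this example; this is not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities away from 0 and 1 to avoid log(0) (eps is an assumed value).
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]                # made-up true labels
confident = [0.9, 0.1, 0.8, 0.2]     # predictions close to the labels
unsure = [0.6, 0.4, 0.5, 0.5]        # predictions close to 0.5

print(binary_crossentropy(labels, confident))  # roughly 0.16
print(binary_crossentropy(labels, unsure))     # roughly 0.60
```

Predictions that sit closer to the true labels produce a lower cost, which is what the optimizer tries to achieve during training.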

<hr>

## Code

__Setting up the Dataset:__

In [3]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values

<hr>

__Dataset Preprocessing:__

In [2]:
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split

# Transform gender descriptions to values 0 and 1.
X[:, 2] = LabelEncoder().fit_transform(X[:, 2])

# Perform dummy encoding on country descriptions.
ct = ColumnTransformer([('one_hot_encoder', OneHotEncoder(categories = 'auto'), [1])], remainder = 'passthrough')
X = ct.fit_transform(X)

# Prevent the dummy variable trap. 
X = X[:, 1:]

# Split the dataset into the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)


X_train[0]

array([-0.5698444 ,  1.74309049,  0.16958176, -1.09168714, -0.46460796,
        0.00666099, -1.21571749,  0.8095029 ,  0.64259497, -1.03227043,
        1.10643166])
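
The feature scaling step above standardizes each column to zero mean and unit variance, using statistics learned from the training set only and then reused on the test set. The plain-Python sketch below mirrors that idea for a single column; the helper names and the sample values are made up for illustration.

```python
import statistics

def fit_standardizer(column):
    """Learns the mean and population standard deviation from a training column."""
    return statistics.mean(column), statistics.pstdev(column)

def standardize(column, mean, std):
    """Applies (x - mean) / std to every value in the column."""
    return [(x - mean) / std for x in column]

train_ages = [30.0, 40.0, 50.0]  # made-up training values
test_ages = [40.0, 60.0]         # made-up test values

# Fit on the training column only, then reuse the same statistics for the test column.
mean, std = fit_standardizer(train_ages)
train_scaled = standardize(train_ages, mean, std)
test_scaled = standardize(test_ages, mean, std)

print(train_scaled)
print(test_scaled)
```

Reusing the training statistics on the test set is why the code above calls `fit_transform` on `X_train` but only `transform` on `X_test`.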

<hr>

__Creating the Artifical Neural Network Classifier:__

In [3]:
import tensorflow as tf
# The Sequential class holds the ANN structure as a linear stack of layers.
from tensorflow.keras.models import Sequential
# The Dense class represents a fully connected ANN layer.
from tensorflow.keras.layers import Dense

# ANN Initialization
classifier = Sequential()

# Adds the first hidden layer to the ANN. Because the first hidden layer doesn't know what its input variables 
# are going to be, an input dimension parameter is used to specify the number of independent variables.
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))

# Adds the second hidden layer that knows where its input values are going to come from to the ANN.
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))

# Adds the output layer consisting of one output value to the ANN.
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))

# Compiles the ANN classifier.
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

Using TensorFlow backend.


<hr>

__Fitting the Classifier & Predicting Results:__

In [5]:
# Keras performs mini-batch gradient descent. 'epochs' determines the number of times the dataset should be looped through.
classifier.fit(X_train, y_train, batch_size = 10, epochs = 100) 

# Predict churn probabilities for the test set.
y_pred = classifier.predict(X_test)

# Converts values <= 0.5 to false (0) and values > 0.5 to true (1).
y_pred = (y_pred > 0.5)

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

cm

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78/100
Epoch 79/100
Epoch 80/100
Epoch 81/100
Epoch 82/100
Epoch 83/100
Epoch 84/100
Epoch 85/100
Epoch 86/100
Epoch 87/100
Epoch 88/100
Epoch 89/100
Epoch 90/100
Epoch 91/100
Epoch 92/100
Epoch 93/100
Epoch 94/100
Epoch 95/100
Epoch 96/100
Epoch 97/100
Epoch 98/100
Epoch 99/100
Epoch 100/100


array([[1535,   60],
       [ 212,  193]], dtype=int64)
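
The confusion matrix above can be summarized with a few standard metrics. The sketch below hard-codes the reported values and assumes scikit-learn's layout (rows are actual classes, columns are predicted classes) and that label 1 marks customers who left.

```python
# Confusion matrix values copied from the output above.
cm = [[1535, 60],
      [212, 193]]

tn, fp = cm[0]  # actual 0: true negatives, false positives
fn, tp = cm[1]  # actual 1: false negatives, true positives

total = tn + fp + fn + tp
accuracy = (tn + tp) / total   # share of all test customers classified correctly
precision = tp / (tp + fp)     # share of predicted leavers who actually left
recall = tp / (tp + fn)        # share of actual leavers the model caught

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
# prints accuracy=0.864 precision=0.763 recall=0.477
```

The accuracy is about 86%, but the recall shows that fewer than half of the customers who left were identified, so accuracy alone can overstate how well a churn model performs.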