# ANN Implementation

An artificial neural network (ANN) has three kinds of layers-

Input Layer.
Hidden Layer.
Output Layer.

Data is passed to the input layer. The input layer then passes this data to the next layer, which is a hidden layer. The hidden layer performs certain operations and passes the result to the output layer.

# Input Layer

In an artificial neural network, the input layer contains the independent variables: independent variable 1, independent variable 2, and so on up to independent variable n.

The important thing you need to remember is that these independent variables are for one observation.

Another important point is that you usually need to standardize or normalize these independent variables. Which one you choose depends on the scenario. The main purpose of standardization or normalization is to bring all values into a comparable range.
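To make the difference concrete, here is a minimal sketch of both options in plain Python. The function names and the `ages` values are made up for illustration; in practice you would more likely use a library scaler.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to zero mean and unit standard deviation (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation; must not be 0
    return [(v - mu) / sigma for v in values]

def normalize(values):
    """Rescale values linearly to the range [0, 1] (min-max scaling)."""
    lo, hi = min(values), max(values)  # hi must differ from lo
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical values of one independent variable across five observations
ages = [20, 30, 40, 50, 60]
print(standardize(ages))
print(normalize(ages))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Standardization centers the values around 0, while normalization squeezes them between 0 and 1.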

# Output Layer

The output value can be-

Continuous (like a price).
Binary (in yes/no form).
Categorical.

If the output value is categorical, the important thing is that there is not just one output value. There may be several output values, typically one per category.
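One common way to get several output values from a categorical variable is to give every category its own 0/1 value. The sketch below assumes that encoding; the `one_hot` helper and the category names are hypothetical.

```python
def one_hot(label, classes):
    """Return a list with 1 at the position of `label` and 0 everywhere else."""
    return [1 if c == label else 0 for c in classes]

classes = ["red", "green", "blue"]  # hypothetical categories
print(one_hot("green", classes))  # [0, 1, 0]
```

With three categories, the output layer would then need three output values instead of one.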

# Synapses

Synapses are the connecting lines between the neurons of two layers.

A weight is assigned to each synapse. These weights are crucial to how artificial neural networks work, because weights are how neural networks learn. By adjusting the weights, the neural network decides which signals are important and which are not.

# Hidden Layer or Neuron

Inside a neuron, two main steps happen-

Weighted Sum.
Activation Function.

The first step is the weighted sum: each input value is multiplied by the weight assigned to its synapse, and the products are added up. Something like this-

$$x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_n w_n$$

After the weighted sum is calculated, the activation function is applied to it. The result determines what signal, if any, the neuron passes on to the next layer.
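The two steps can be sketched for a single neuron in plain Python. The input values and weights are made-up numbers, and the rectifier is used as the activation only as an example. The bias term that Keras layers add by default is left out, to match the formula above.

```python
def weighted_sum(inputs, weights):
    """Step 1: multiply each input by its weight and add up the products."""
    return sum(x * w for x, w in zip(inputs, weights))

def neuron_output(inputs, weights, activation):
    """Step 2: apply the activation function to the weighted sum."""
    return activation(weighted_sum(inputs, weights))

x = [1.0, 2.0, 3.0]     # hypothetical input values x1, x2, x3
w = [0.5, -1.0, 0.25]   # hypothetical weights w1, w2, w3

z = weighted_sum(x, w)  # 0.5 - 2.0 + 0.75
print(z)                                              # -0.75
print(neuron_output(x, w, lambda s: max(s, 0.0)))     # 0.0
```

Here the weighted sum is negative, so the rectifier passes on 0 and the neuron stays silent for this observation.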

# Implementation Steps

# Importing Essential libraries

In [1]:
import keras
from keras.models import Sequential
from keras.layers import Dense

# Initialize the Artificial Neural Network

In [2]:
classifier = Sequential()

The Sequential class lets us build the ANN as a sequence of layers.

# Add the input layer and the first hidden layer

In [None]:
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 1))

Dense is the Keras class used to add a fully connected layer to the ANN.

“add” is a method of the Sequential class. units is the number of neurons in the hidden layer (older Keras versions called this argument output_dim, and kernel_initializer was called init). There is no firm rule of thumb for this number, which is why I used 6. You can try other numbers and compare the results.

The rectifier activation function is a common choice for the hidden layers of a fully connected neural network. That’s why I use ‘relu’.

Our input layer has 1 neuron. Why…?

Because we have 1 independent variable.

That’s why input_dim = 1.

Now we have built our input layer and the first hidden layer.

# Add the second hidden layer

In [None]:
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))

Here again, we use 6 neurons in the second hidden layer (units = 6). Now we have added one input layer and two hidden layers. It’s time to add our output layer.

# Add the output layer

In [None]:
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))

In the output layer, we need 1 neuron (units = 1). Why…?

Because the dependent variable in the dataset is in binary form. That means we have to predict 0 or 1. That’s why only one neuron is required in the output layer.

The next thing is the activation function. In the output layer, we use the sigmoid activation function. Why…?

Because the sigmoid activation function not only lets us predict the class but also gives the probability that a customer leaves the bank.
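To get a feel for how many weights this small network has to learn, you can count them by hand. The sketch assumes the layer sizes used above (1 input, two hidden layers of 6 neurons, 1 output) and that every Dense layer also has one bias per neuron, which is the Keras default. The `dense_params` helper is just for illustration.

```python
def dense_params(n_inputs, n_units):
    """Trainable values in one fully connected layer: one weight per synapse plus one bias per neuron."""
    return n_inputs * n_units + n_units

layer_sizes = [1, 6, 6, 1]  # input, hidden 1, hidden 2, output
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(counts)       # [12, 42, 7]
print(sum(counts))  # 61
```

If you call classifier.summary() on the built model, the per-layer parameter counts it reports should match these numbers.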

# Train the ANN

# Compile the ANN

In [None]:
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

compile is a method of the Keras model. ‘adam’ is an optimizer based on stochastic gradient descent. The optimizer updates the weights during training to reduce the loss, and ‘binary_crossentropy’ is the loss used here because the output is binary.
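To see what the loss actually measures, here is a rough plain-Python sketch of binary cross-entropy. It is an illustration of the formula, not the Keras implementation (which, for example, also clips the probabilities), and the label and probability values are made up.

```python
import math

def binary_crossentropy(y_true, y_prob):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities.

    Probabilities must be strictly between 0 and 1 in this sketch.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [1, 0, 1]                 # hypothetical real results
confident_right = [0.9, 0.1, 0.8]  # probabilities close to the labels
mostly_wrong = [0.4, 0.6, 0.3]     # probabilities on the wrong side of 0.5

print(binary_crossentropy(labels, confident_right))
print(binary_crossentropy(labels, mostly_wrong))
```

The first loss is smaller than the second, which is the signal the optimizer uses when it adjusts the weights.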

To evaluate our ANN model, I am going to use the accuracy metric. That’s why metrics = [‘accuracy’].

# Fit the ANN to the Training set

In [None]:
classifier.fit(X_train, y_train, batch_size = 10, epochs = 100)

Instead of updating the weights after every single observation, the network compares its predictions with the real results in batches and updates the weights once per batch. That’s why I write batch_size = 10.

The neural network has to train for a certain number of epochs to improve the accuracy over time, where one epoch is one full pass over the training set. So I chose epochs = 100 (older Keras versions called this argument nb_epoch). When you run this code, you can see the accuracy for each epoch.
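The batch size and the number of epochs together decide how often the weights get updated. Here is a small worked example; the training set size of 8000 is an assumption for illustration, since the dataset size is not stated here.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of weight updates in one pass over the data (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)

n_samples = 8000  # hypothetical number of training observations
per_epoch = updates_per_epoch(n_samples, batch_size=10)
print(per_epoch)        # 800
print(per_epoch * 100)  # 80000 updates over 100 epochs
```

A smaller batch size means more updates per epoch, and usually a slower epoch.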

# Predict the Test Set Results-

In [None]:
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)

y_pred > 0.5 means that if a value of y_pred is between 0 and 0.5 (inclusive), the new y_pred becomes False (0). If it is larger than 0.5, the new y_pred becomes True (1).
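The same comparison can be tried on a few made-up probabilities. This sketch uses a plain Python list instead of the array that predict returns, but the comparison works the same way element by element.

```python
probabilities = [0.08, 0.5, 0.51, 0.93]  # hypothetical sigmoid outputs

labels = [p > 0.5 for p in probabilities]
print(labels)  # [False, False, True, True]
```

Note that a probability of exactly 0.5 ends up as False, because the comparison is strictly greater than.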

# Make the Confusion Matrix

For a small dataset, you can compare the predicted values with the real values one by one. But with a large dataset, that is practically impossible. That’s why we use a confusion matrix, to clear our confusion.

In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test,y_pred)
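To see what the numbers in the matrix mean, here is a hand-rolled sketch for 0/1 labels. It follows the same layout as confusion_matrix (rows are real values, columns are predictions), but the `confusion_counts` helper and the sample labels are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Return [[TN, FP], [FN, TP]] for 0/1 (or False/True) labels."""
    cm = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        cm[int(a)][int(p)] += 1  # row = real value, column = prediction
    return cm

actual = [0, 0, 1, 1, 1, 0]     # hypothetical real results
predicted = [0, 1, 1, 0, 1, 0]  # hypothetical predictions

cm = confusion_counts(actual, predicted)
accuracy = (cm[0][0] + cm[1][1]) / len(actual)  # correct predictions are on the diagonal
print(cm)  # [[2, 1], [1, 2]]
print(accuracy)
```

The diagonal holds the correct predictions, so accuracy is the diagonal sum divided by the number of observations.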

# How to choose Activation Function?

### Case 1- For Binary Variables

Suppose your dependent variable, the output signal, is in binary format (Y = 0 or 1). Which activation function should you use in that case?

There are 2 options you can approach this with. One is the Threshold Function, because its output is either 0 or 1. It therefore fits properly and gives the result only in 0 or 1 form.

The second option is the Sigmoid Function. Its output is also between 0 and 1, just as you need. But you want exactly 0 or 1, so it is not quite what you want. What you can do in this case is use it as the probability of Y being yes or no: the sigmoid function tells you the probability that Y is equal to 1. Basically, the closer you get to the top of the curve, the more likely it is a 1 or yes rather than a 0 or no.

### Case 2- Categorical Variables

Here, in the first layer, we have some inputs. They are sent to our first hidden layer, where the activation function is applied. The activation function most often used there is the Rectifier Function. After the Rectifier Function is applied, the signals pass on to the output layer, where the Sigmoid Function is applied. After that, you get your final output.

So this combination is very common in neural networks. That means –

Hidden Layer- Rectifier Function.

Output Layer- Sigmoid Function.

#### Rectifier Function-

The Rectifier Function is one of the most popular functions in artificial neural networks, even though it has a kink in the curve.

$$\varphi(x) = \max(x, 0)$$

The Rectifier Function is 0 for all inputs up to 0, and from 0 onward it increases linearly as the input value increases.

In the hidden layers, the rectifier function is used most often.
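A quick sketch of the formula in plain Python, evaluated on a few made-up weighted sums, shows the kink at 0:

```python
def rectifier(x):
    """Rectifier (ReLU): 0 for negative inputs, the input itself otherwise."""
    return max(x, 0.0)

print([rectifier(x) for x in (-2.0, -0.5, 0.0, 0.5, 2.0)])
# [0.0, 0.0, 0.0, 0.5, 2.0]
```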

#### Sigmoid Function-

The Sigmoid Function looks something like this-

$$\varphi(x) = \frac{1}{1 + e^{-x}}$$

Here, x is the value of the weighted sum. This is the function used in Logistic Regression.

It is a nice, smooth, gradual progression. For inputs below 0 the output drops off towards 0, and for inputs above 0 it approaches 1.

The sigmoid function is very useful in the final layer, that is, the output layer, especially when you are trying to predict probabilities.
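Here is a small sketch of the sigmoid in plain Python, evaluated on three made-up weighted sums. It is fine for moderate inputs; very large negative values would overflow in this simple form.

```python
import math

def sigmoid(x):
    """Sigmoid: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

for x in (-4, 0, 4):
    print(x, round(sigmoid(x), 3))
# -4 0.018
# 0 0.5
# 4 0.982
```

A weighted sum of 0 gives exactly 0.5, so the output layer is undecided at that point.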

#### Threshold Function-

In the threshold function, the X-axis is the weighted sum and the Y-axis takes the values 0 and 1. The threshold function is a very simple kind of function. Its formula is-

$$\varphi(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$$

According to the threshold function, if the value is less than 0, the function passes on 0. If the value is greater than or equal to 0, it passes on 1. The threshold function is a kind of yes/no function. It is very straightforward.
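As a quick check of the formula, here is the threshold function in plain Python on a few made-up weighted sums:

```python
def threshold(x):
    """Threshold (step) function: 1 if x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0

print([threshold(x) for x in (-2.0, -0.1, 0.0, 0.1, 2.0)])
# [0, 0, 1, 1, 1]
```

Unlike the sigmoid, there is nothing in between: the output jumps straight from 0 to 1 at x = 0.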