## Artificial Neural Network (ANN)
An artificial neural network (ANN) is a computational model loosely inspired by the network of neurons that make up a human brain, so that a computer can learn from examples and make decisions in a humanlike manner. ANNs are created by programming regular computers to behave as though they are interconnected brain cells: simple units whose connection weights are adjusted during training.

### Building a Neural Network using Keras for Binary Classification

### Binary Classification
Binary or binomial classification is the task of classifying the elements of a given set into two groups (predicting which group each one belongs to) on the basis of a classification rule.
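
As a minimal illustration of a classification rule, the sketch below assigns each score to one of two groups by comparing it with a cutoff. The function name `classify`, the scores and the cutoff of 0.5 are assumptions made up for this example.

```python
def classify(scores, threshold=0.5):
    """Return 1 for every score strictly above the threshold, otherwise 0."""
    return [1 if s > threshold else 0 for s in scores]


labels = classify([0.1, 0.7, 0.5, 0.93])
print(labels)
```

A score exactly equal to the cutoff falls into group 0 here because the comparison is strict.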


## What is Keras?
Keras is a high-level neural network API written in Python. It wraps the efficient numerical computation libraries <b>Theano and TensorFlow</b> and allows you to define and train neural network models in just a few lines of code.


Keras can be used as a deep learning library because it also supports <b>Convolutional and Recurrent Neural Networks</b>.


## For binary classification we are using Bank Customers data.

There are <b>10000 observations with 13 input variables and 1 output variable</b>.
Using this dataset we will build a model which will predict whether the customer will leave the bank or not.

### Below are the details recorded for each customer.

#### Variable Names:
1. RowNumber
2. CustomerId
3. Surname
4. CreditScore
5. Geography
6. Gender
7. Age(years)
8. Tenure
9. Balance
10. NumOfProducts
11. HasCrCard
12. IsActiveMember
13. EstimatedSalary
14. Exited(Target variable)

## Loading Dataset

First import the basic libraries
1. pandas.
2. numpy.
3. sklearn

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

dataset = pd.read_csv('Customer_data.csv',delimiter=',')

### Describe the dataset for better understanding

In [2]:
dataset.describe(include='all')

|  | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 10000.0 | 10000.0 | 10000 | 10000.0 | 10000 | 10000 | 10000.0 | 10000.0 | 10000.0 | 10000.0 | 10000.0 | 10000.0 | 10000.0 | 10000.0 |
| unique |  |  | 2932 |  | 3 | 2 |  |  |  |  |  |  |  |  |
| top |  |  | Smith |  | France | Male |  |  |  |  |  |  |  |  |
| freq |  |  | 32 |  | 5014 | 5457 |  |  |  |  |  |  |  |  |
| mean | 5000.5 | 15690940.0 |  | 650.5288 |  |  | 38.9218 | 5.0128 | 76485.889288 | 1.5302 | 0.7055 | 0.5151 | 100090.239881 | 0.2037 |
| std | 2886.89568 | 71936.19 |  | 96.653299 |  |  | 10.487806 | 2.892174 | 62397.405202 | 0.581654 | 0.45584 | 0.499797 | 57510.492818 | 0.402769 |
| min | 1.0 | 15565700.0 |  | 350.0 |  |  | 18.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 11.58 | 0.0 |
| 25% | 2500.75 | 15628530.0 |  | 584.0 |  |  | 32.0 | 3.0 | 0.0 | 1.0 | 0.0 | 0.0 | 51002.11 | 0.0 |
| 50% | 5000.5 | 15690740.0 |  | 652.0 |  |  | 37.0 | 5.0 | 97198.54 | 1.0 | 1.0 | 1.0 | 100193.915 | 0.0 |
| 75% | 7500.25 | 15753230.0 |  | 718.0 |  |  | 44.0 | 7.0 | 127644.24 | 2.0 | 1.0 | 1.0 | 149388.2475 | 0.0 |


### Note

Not all features are numerical: the dataset also contains categorical variables. We therefore need to convert the categorical variables to numbers, and we also need some <b>feature engineering</b> so that the model is trained on the most useful features, which helps its accuracy.


### What is Feature Engineering?

Feature engineering is a part of data pre-processing in which we analyze the dataset and try to find which features have the strongest relationship with the target variable. If every feature is strongly related to the target, we leave the dataset unchanged; otherwise we remove the columns that have no meaningful impact on the target variable. (Removing uninformative columns in this way is often called feature selection.)
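
One simple way to look for such relationships is to measure how strongly each numeric column correlates with the target. The sketch below does this on a small made-up table; the values are hypothetical and are not taken from `Customer_data.csv`, and correlation is only one possible heuristic, not necessarily the method used to pick features in this notebook.

```python
import pandas as pd

# Toy data with hypothetical values, only to illustrate the idea
toy = pd.DataFrame({
    "RowNumber": [1, 2, 3, 4, 5, 6],
    "Age": [25, 52, 30, 48, 60, 33],
    "Exited": [0, 1, 0, 1, 1, 0],
})

# Absolute Pearson correlation of each feature column with the target
corr = toy.drop(columns="Exited").corrwith(toy["Exited"]).abs()
print(corr.sort_values(ascending=False))
```

In this toy table `Age` correlates far more strongly with `Exited` than `RowNumber` does, which is the kind of signal that would suggest keeping the former and dropping the latter.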



### Analyzing Dataset

In [3]:
dataset.head(5)

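|  | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |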


### Observation
1. <b>RowNumber, CustomerId and Surname</b> do not influence our target variable, because a customer's decision to leave the bank does not depend on a row index, the <b>customer ID or the customer's surname</b>. We treat these as <b>weak features</b>: keeping them only adds noise and can reduce the accuracy of our model.
2. <b>Geography</b> and <b>Gender</b> are categorical, so we need to convert them to numbers.


### Modifying the dataset on the above observation

#### Removing RowNumber, CustomerId and Surname

In [4]:
# Features (independent variables): columns 3 to 12, CreditScore through EstimatedSalary
X = dataset.iloc[:, 3:13].values

# Target (dependent variable): the last column, Exited
Y = dataset.iloc[:, -1].values

# After removing RowNumber, CustomerId and Surname
print(X)

[[619 'France' 'Female' ... 1 1 101348.88]
 [608 'Spain' 'Female' ... 0 1 112542.58]
 [502 'France' 'Female' ... 1 0 113931.57]
 ...
 [709 'France' 'Female' ... 0 1 42085.58]
 [772 'Germany' 'Male' ... 1 0 92888.52]
 [792 'France' 'Female' ... 1 0 38190.78]]
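
Positional slicing with `iloc` depends on the column order staying fixed. As an alternative sketch (not the approach used in this notebook), the same selection can be written by column name with `DataFrame.drop`. The frame below holds two rows from the `head()` output with only some of the columns kept for brevity; the variable names `toy`, `features` and `target` are assumptions for this example.

```python
import pandas as pd

# Two rows from the head() output, with a subset of the columns
toy = pd.DataFrame({
    "RowNumber": [1, 2],
    "CustomerId": [15634602, 15647311],
    "Surname": ["Hargrave", "Hill"],
    "CreditScore": [619, 608],
    "Geography": ["France", "Spain"],
    "Exited": [1, 0],
})

# Drop the identifier columns and the target by name to get the features
features = toy.drop(columns=["RowNumber", "CustomerId", "Surname", "Exited"])
target = toy["Exited"]

print(list(features.columns))
```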


#### Data Conversion

In [5]:
from sklearn.preprocessing import LabelEncoder

# Encode Geography (column 1) as integers
labelencoder_X1 = LabelEncoder()
X[:, 1] = labelencoder_X1.fit_transform(X[:, 1])

# Encode Gender (column 2) with its own encoder
labelencoder_X2 = LabelEncoder()
X[:, 2] = labelencoder_X2.fit_transform(X[:, 2])

# After Data Conversion
print(X)

[[619 0 0 ... 1 1 101348.88]
 [608 2 0 ... 0 1 112542.58]
 [502 0 0 ... 1 0 113931.57]
 ...
 [709 0 0 ... 0 1 42085.58]
 [772 1 1 ... 1 0 92888.52]
 [792 0 0 ... 1 0 38190.78]]
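
Label encoding turns the three Geography values into the integers 0, 1 and 2, which a model may read as an ordering between countries. A common alternative, shown here only as a hedged sketch and not as what this notebook does, is one-hot encoding, where each category gets its own 0/1 column. The sample values and the names `geo` and `one_hot` are assumptions for this example.

```python
import pandas as pd

# A few Geography values, as they appear in the dataset
geo = pd.Series(["France", "Spain", "France", "Germany"], name="Geography")

# One 0/1 indicator column per category, in sorted category order
one_hot = pd.get_dummies(geo, prefix="Geography", dtype=int)
print(one_hot)
```

Each row has exactly one 1, so no category is treated as "larger" than another. Using this in the notebook would change the number of input features, so the rest of the pipeline would need to be adjusted accordingly.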


### Preparing Dataset for training and testing.

We now split the input features and the target variable into training and test sets. The test set will be 20% of the entire dataset.

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X,Y,test_size=0.2,random_state=0)
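
To see what the split produces, the sketch below runs `train_test_split` on a tiny made-up array. The `stratify` argument is an optional addition that this notebook does not use; it keeps the class proportions the same in both parts, which can matter when one class is much rarer than the other. The names `X_demo`, `y_demo` and the split variables are assumptions for this example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 10 toy samples with 2 features each and a balanced 0/1 target
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.array([0, 1] * 5)

# 20% of the rows go to the test set; stratify keeps the class balance
Xtr, Xte, ytr, yte = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

print(Xtr.shape, Xte.shape)
```

With the notebook's 10000 rows, the same `test_size=0.2` gives 8000 training rows and 2000 test rows.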

<b>Since our input features are at different scales we need to standardize the input.</b>

In [7]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Fit the scaler on the training data only, then reuse its mean and std for the test data
x_train = sc.fit_transform(X_train)
x_test = sc.transform(X_test)
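
Standardization subtracts each feature's mean and divides by its standard deviation. The statistics are learned from the training data and then reused unchanged for any other data, so that test samples are scaled exactly like training samples. The sketch below spells this out with NumPy on made-up numbers; the array names are assumptions for this example.

```python
import numpy as np

# Toy training and test data: two features on very different scales
train = np.array([[600.0, 30.0], [700.0, 40.0], [800.0, 50.0]])
test = np.array([[650.0, 35.0]])

# Statistics come from the training data only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), which StandardScaler also uses

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training mean and std

print(train_scaled)
print(test_scaled)
```

After scaling, every training column has mean 0 and standard deviation 1, while the test row is simply expressed in the same units.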

### Implementation with Keras

We have preprocessed the data and are now ready to build the neural network. Here we are using Keras to build it.

So let's import the Keras library to create our first neural network layers.

To create our neural network:
1. We will use the Sequential model to build our neural network.
2. We will use the Dense layer class to build the input, hidden and output layers of the neural network.


### Model Architecture
1. <b>Layers</b><br>
    We have <b>10 input features</b> and one target variable. The network has <b>2 hidden layers</b>, each with <b>6 nodes</b>, and an output layer with a single node.<br><br>

2. <b>Activation Function</b><br>
    <b>ReLU</b> will be the activation function for the hidden layers. As this is a binary classification problem, we will use <b>sigmoid as the activation function</b> of the output layer, so that the output is a value between 0 and 1.<br><br>

3. <b>Loss Function</b><br>
    As this is a binary classification problem, we use <b>binary_crossentropy</b> as the loss function between the actual output and the predicted output.<br><br>

4. <b>Optimizer</b><br>
    To optimize our neural network we use <b>Adam</b>. Adam stands for adaptive moment estimation and combines ideas from RMSProp and momentum.

<b>We use accuracy as the metric to measure the performance of the model.</b>
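
To make the architecture concrete, the following NumPy sketch runs a forward pass through a 10-6-6-1 network with ReLU hidden layers and a sigmoid output, counts the trainable parameters and computes the binary cross-entropy on a made-up batch. It is an illustration under assumptions, not how Keras implements these pieces: the weights are random, the inputs and labels are invented, and the helper names are chosen for this example.

```python
import numpy as np


def relu(z):
    return np.maximum(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip to avoid log(0), then average the per-sample losses
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob)
                          + (1.0 - y_true) * np.log(1.0 - y_prob)))


layer_sizes = [10, 6, 6, 1]  # inputs, hidden 1, hidden 2, output
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.1, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

# (10*6 + 6) + (6*6 + 6) + (6*1 + 1) trainable parameters
n_params = sum(w.size + b.size for w, b in zip(weights, biases))


def forward(x):
    a = x
    for w, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ w + b)                          # hidden layers
    return sigmoid(a @ weights[-1] + biases[-1])     # output probability


batch = rng.normal(size=(4, 10))                     # 4 made-up samples
probs = forward(batch)
loss = binary_crossentropy(np.array([[0], [1], [0], [1]]), probs)
print(n_params, probs.ravel(), loss)
```

The network has 115 trainable parameters in total, which is small enough to train quickly on 10 input features.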




## Graphical Representation Of Neural Network Architecture 

<img src='Neural_Net_Arch.png'>

In [8]:
import keras
from keras.models import Sequential
from keras.layers import Dense

classifier = Sequential()
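# Two hidden layers with 6 units each ('units' replaces the deprecated 'output_dim' argument)
classifier.add(Dense(units=6, activation='relu'))
classifier.add(Dense(units=6, activation='relu'))
# Single sigmoid output unit for the binary target
classifier.add(Dense(units=1, activation='sigmoid'))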

classifier.compile(
                    optimizer='adam',
                    loss = 'binary_crossentropy',
                    metrics = ['accuracy']
                )

Using TensorFlow backend.



### Model Training
Now we are going to fit the model we created to our training data. We use a batch_size of 10.
This implies that we use 10 samples per gradient update.

We train the model for 50 epochs. An epoch is one full pass over the entire training set.
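
The arithmetic behind these settings can be checked with a few lines. The sketch assumes 8000 training rows, which follows from keeping 80% of the 10000 observations; the variable names are chosen for this example.

```python
import math

n_train = 8000    # 80% of the 10000 observations
batch_size = 10
epochs = 50

# One gradient update per batch; the last batch may be smaller if sizes do not divide evenly
updates_per_epoch = math.ceil(n_train / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, total_updates)
```

So each epoch performs 800 weight updates, and the full run performs 40000.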

In [9]:
classifier.fit(x_train, y_train, batch_size=10, epochs=50)

Epoch 1/50
Epoch 2/50
...
Epoch 50/50


<keras.callbacks.History at 0x2774ab796d8>

#### The accuracy of our model is about 86% after 50 epochs.

<br><b>We can also compute the loss value and metric values of the model with the evaluate function. Here it is called on the training data.</b><br>

In [10]:
eval_model=classifier.evaluate(x_train, y_train)
eval_model



[0.3353754553794861, 0.86275]

### Model Testing

We now have a trained model, so it's time to test it on the test dataset.
We predict the output for the test dataset. If the predicted probability is greater than 0.5 the output is 1 (True), otherwise the output is 0 (False).

If the output of our model is <b>True</b>, the customer is predicted to close his/her account.

In [11]:
y_pred=classifier.predict(x_test)
result=y_pred>0.5
print(result)

[[False]
 [False]
 [False]
 ...
 [False]
 [False]
 [False]]
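
Printing the predictions does not yet say how good they are. A hedged sketch of one way to score them is to compare the thresholded predictions with the true labels and count the four possible outcomes. The labels and probabilities below are made up for illustration and are not results from this model.

```python
import numpy as np

# Hypothetical true labels and predicted probabilities
y_true = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.2, 0.6, 0.8, 0.4, 0.1, 0.9])

# Same rule as in the notebook: probability above 0.5 means class 1
y_hat = (y_prob > 0.5).astype(int)

tp = int(np.sum((y_hat == 1) & (y_true == 1)))  # correctly predicted leavers
tn = int(np.sum((y_hat == 0) & (y_true == 0)))  # correctly predicted stayers
fp = int(np.sum((y_hat == 1) & (y_true == 0)))  # predicted to leave, but stayed
fn = int(np.sum((y_hat == 0) & (y_true == 1)))  # predicted to stay, but left

accuracy = (tp + tn) / len(y_true)
print(tp, tn, fp, fn, accuracy)
```

Applied to this notebook, `y_true` would be `y_test` and `y_prob` the flattened output of `classifier.predict(x_test)`.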



In [12]:
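# Random check on a single customer. Feature order:
# CreditScore, Geography, Gender, Age, Tenure, Balance, NumOfProducts, HasCrCard, IsActiveMember, EstimatedSalary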
new_pre = sc.transform(np.array([[519,0,0,32,1,0.1,0,1,1,23000]]))
result = classifier.predict(new_pre)>0.5
print(result[0][0])

True


### The outputs above are our model's predictions. To run your own random test, update the values in the random check cell accordingly.