# Artificial Neural Network in Python with Keras

This notebook is a simple demonstration of how to build an Artificial Neural Network with Keras and TensorFlow for a classification problem.

**Problem Description**

The goal of this notebook is to build a classification model that predicts whether a given client will leave the bank within the next six months.

**Dataset Description**

The dataset consists of 10000 instances (rows) and 14 columns. The columns are:

-  RowNumber (not relevant to the model)
-  CustomerId (not relevant to the model)
-  Surname (not relevant to the model)
-  CreditScore (numerical variable)
-  Geography (categorical variable)
-  Gender (categorical variable)
-  Age (numerical variable)
-  Tenure (categorical variable)
-  Balance (numerical variable)
-  NumOfProducts (categorical variable)
-  HasCrCard (categorical variable)
-  IsActiveMember (categorical variable)
-  EstimatedSalary (numerical variable)
-  Exited (target)

# Data preprocessing

## Importing Libraries

In [1]:
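import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns # To draw the statistical plots

import keras # To build our ANN model
from keras import layers, metrics # Dropout layer and evaluation metrics
from keras.models import Sequential # To initialize the ANN
from keras.layers import Dense # To create the fully connected layers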

## Importing the Dataset

In [2]:
dataset = pd.read_csv('Churn_Modelling.csv')

## Inspecting the dataset

In [None]:
dataset.shape

In [None]:
dataset.dtypes

In [None]:
dataset.isna().sum()

In [3]:
dataset.head(10)

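|  | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |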


In [None]:
dataset.describe()

# Exploratory Data Analysis (EDA)

## Visualizing the categorical variables

### Target variable

In [None]:
dataset['Exited'].value_counts()

In [None]:
plt.figure(figsize=(15, 5))
plt.subplot(1, 2, 1)
sns.countplot(x=dataset['Exited'])
plt.subplot(1, 2, 2)
values = dataset.iloc[:, -1].value_counts(normalize = True).values # Share of each class, shown as a percentage in the pie chart
index = dataset.iloc[:, -1].value_counts(normalize = True).index
plt.pie(values, labels= index, autopct='%1.1f%%', colors=['b', 'tab:orange'])
plt.show()

### Other categorical variables

In [None]:
categorical_list = ['Geography', 'Gender', 'Tenure', 'NumOfProducts', 'HasCrCard','IsActiveMember', 'Exited']

In [None]:
data_cat = dataset[categorical_list]

In [None]:
fig = plt.figure(figsize=(15, 15))
plt.suptitle('Pie Chart Distribution', fontsize = 20)
for i in range(1, data_cat.shape[1]):
    plt.subplot(2, 3, i)
    f = plt.gca()
    f.axes.get_yaxis().set_visible(False)
    f.set_title(data_cat.columns.values[i - 1])
    # Share of each category, shown as a percentage in the pie chart
    values = data_cat.iloc[:, i - 1].value_counts(normalize = True).values
    index = data_cat.iloc[:, i -1].value_counts(normalize = True).index
    plt.pie(values, labels= index, autopct='%1.1f%%')
    plt.axis('equal')
#fig.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()

In [None]:
plt.figure(figsize=(15, 10))
for i in range(1, data_cat.shape[1]):
    plt.subplot(3, 3, i)
    sns.countplot(x=data_cat.iloc[: , i-1], hue=data_cat['Exited'])
plt.show()

## Visualizing the numerical variables

In [None]:
numerical_list = ['CreditScore', 'Age', 'Balance', 'EstimatedSalary', 'Exited']

In [None]:
data_num  = dataset[numerical_list]

### Distribution of numerical variables

In [None]:
plt.figure(figsize=(25,15))
plt.suptitle('Histograms of numerical variables (mean values)', fontsize = 20)
for i in range(1, data_num.shape[1]):
    plt.subplot(2, 2, i)
    f = plt.gca()
    sns.histplot(data_num.iloc[:, i-1], color = '#3F5D7D', kde= True)
plt.show()

In [None]:
plt.figure(figsize=(25,15))
plt.suptitle('Histograms of numerical variables (mean values)', fontsize = 20)
for i in range(1, data_num.shape[1]):
    plt.subplot(2, 2, i)
    f = plt.gca()
    sns.histplot(data=data_num, x=data_num.iloc[:, i-1], hue='Exited', kde = True)
plt.show()

## Correlation and PairPlot (scatter)

### Correlation with the response variable

In [None]:
column_drop = ['RowNumber', 'CustomerId', 'Surname', 'Exited']

In [None]:
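# Keep only the numeric columns: Geography and Gender are still strings at this point
dataset.drop(columns=column_drop).select_dtypes(include='number').corrwith(dataset.Exited).plot.bar(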
        figsize = (20, 10), title = "Correlation with Exited", fontsize = 15,
        rot = 45, grid = True)

### Correlation Between the Variables

In [None]:
column_drop = ['RowNumber', 'CustomerId', 'Surname']
## Correlation Matrix
sns.set(style="white")

# Compute the correlation matrix (numeric columns only)
corr = dataset.drop(columns=column_drop).select_dtypes(include='number').corr()

# Generate a mask for the upper triangle
mask = np.zeros_like(corr, dtype=bool) # np.bool was removed from recent NumPy releases
mask[np.triu_indices_from(mask)] = True

# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(10, 20))

# Generate a custom diverging colormap
cmap = sns.diverging_palette(220, 10, as_cmap=True)

# Draw the heatmap with the mask and correct aspect ratio
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=1, vmin=-1, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5}, annot = True)

### Pair plot for the numerical variables

In [None]:
sns.pairplot(data_num, hue = 'Exited', kind = 'scatter', corner=True, diag_kind=None)

## EDA conclusion

**Target variable**

The target is encoded as 0 (stay) or 1 (leave). About $79.6\%$ of the customers stayed with the bank, while $20.4\%$ left.

**Categorical variables** 

The analysis of these variables shows that most clients are French and that the majority are male. The category frequencies are otherwise roughly balanced, and nothing stands out as requiring special attention or manipulation (that is, feature engineering).

**Numerical variables**

The distributions of these variables are approximately normal, except for the salary. One interesting aspect is the age distribution of the clients who left the bank: most of them are between 40 and 50 years old. This can be seen clearly in the scatter plots.

**Correlation**

The correlations between the target and the independent variables are not large, but they are sufficient to build the model. The correlations among the independent variables are satisfactory: most of them are low.

# Building the Artificial Neural Network

## Data Preprocessing

### Dropping unimportant columns

In [None]:
dataset = dataset.drop(columns=['RowNumber', 'CustomerId', 'Surname'])

### Encoding Categorical Data

In [None]:
dataset['Gender'] = dataset['Gender'].astype('category').cat.codes

#### One-Hot Encoding

In [None]:
dataset = pd.get_dummies(dataset)

In [None]:
dataset.head()
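
As a small illustration of what `pd.get_dummies` does, the sketch below applies it to a toy frame. The values are made up for this example and are not taken from the dataset. Numeric columns are left untouched, while each string column is replaced by one indicator column per category.

```python
import pandas as pd

# Toy data, invented for illustration only
toy = pd.DataFrame({
    'CreditScore': [619, 608, 502],
    'Geography': ['France', 'Spain', 'France'],
})

# String columns are expanded into one indicator column per category
encoded = pd.get_dummies(toy)
print(list(encoded.columns))
```

In the notebook, `Gender` was already converted to integer codes, so only `Geography` is expanded in this way.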

### Defining the independent and target variable

In [None]:
response = dataset['Exited']

In [None]:
i_var = dataset.drop(columns=['Exited'])

### Splitting the Dataset into train and test set

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(i_var, response, test_size = 0.2, random_state = 0)

### Feature Scaling
Almost all ANN models require feature scaling. The scaler is fitted on the training set only and then applied to the test set.

In [None]:
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_train_bckp = pd.DataFrame(sc_X.fit_transform(X_train))
X_test_bckp = pd.DataFrame(sc_X.transform(X_test))
X_train_bckp.columns = X_train.columns.values
X_test_bckp.columns = X_test.columns.values
X_train_bckp.index = X_train.index.values
X_test_bckp.index = X_test.index.values
X_train = X_train_bckp
X_test = X_test_bckp
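
What the scaling step computes can be sketched with NumPy. The arrays below are toy values assumed for illustration. The point of the sketch is that the mean and standard deviation are estimated from the training data and then reused, unchanged, for the test data.

```python
import numpy as np

# Toy feature matrices (values invented for illustration)
train = np.array([[600.0, 40.0], [700.0, 50.0], [800.0, 60.0]])
test = np.array([[700.0, 45.0]])

# Statistics are estimated on the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics

print(train_scaled.mean(axis=0))  # approximately 0 for each column
print(test_scaled)
```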

## The Artificial Neural Network - ANN

An Artificial Neural Network is a technique that tries to reproduce brain functions. To build an ANN, we start from a set of input values that describe something. Each input is treated as a neuron, and together these neurons form the input layer. In the brain, neurons communicate with one another through synapses. In the same way, the neurons of the input layer communicate with another set of neurons in a hidden layer. This communication transforms the initial information, and the neurons of the hidden layer then communicate with the neurons of the output layer, which gives the response.

For example, in a classification model with five independent variables, the input layer has five neurons. These neurons communicate with the hidden layer (whose number of neurons must be chosen), and the hidden layer communicates with the output layer, which provides the final response: 0 or 1 if we have two classes.

**How do synapses work in an ANN?**

In an ANN, the role of the synapse is played by the weights and an activation function. Each input is multiplied by an associated weight (which can be interpreted as the degree of importance of that input), and the activation function transforms the result. Some common activation functions are:
- Threshold function
- Sigmoid function
- Rectifier function
- Hyperbolic tangent function

In this notebook we are interested in the rectifier function and the sigmoid function.

The rectifier function is defined as $\phi(x) = \max(x,0)$: if the input is less than 0 the function returns 0, otherwise it returns the input unchanged. The value passed to the activation function is the weighted sum $\sum_{i=1}^{m}w_{i}x_{i}$, where $m$ is the number of input variables. We use the rectifier function between the input layer and the hidden layers, and the sigmoid function between the hidden layer and the output layer. The sigmoid function returns the probability that an instance belongs to a certain class, and it is defined as $\phi(x) = \frac{1}{1 + e^{-x}}$.
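
The two activation functions and the weighted sum can be written in a few lines of NumPy. This is only a sketch for a single neuron. The weights and inputs are invented for illustration, and the function names `relu` and `sigmoid` are chosen here for readability.

```python
import numpy as np

def relu(x):
    """Rectifier: max(x, 0), applied element-wise."""
    return np.maximum(x, 0.0)

def sigmoid(x):
    """Sigmoid: 1 / (1 + exp(-x)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

# One neuron with m = 3 inputs (weights and inputs invented for illustration)
x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 0.25])

z = np.dot(w, x)   # weighted sum: 0.5 - 2.0 + 0.75 = -0.75
print(relu(z))     # 0.0, because the weighted sum is negative
print(sigmoid(z))  # a value between 0 and 1
```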

**How does an ANN learn?**

The learning process of an ANN starts with the input values in the input layer, which communicates with the hidden layer through synapses. Next, the hidden layer communicates with the output layer to obtain a response. At this stage the learning process is not finished. After the response, the ANN calculates the loss function to measure the error of the prediction. Once the loss has been calculated, backpropagation takes place: this process tries to find the minimum of the loss function by changing the weights associated with each neuron. After this step, the ANN repeats the forward pass through the layers with the updated weights. Each full pass over the training data is called an epoch.
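
The loop of forward pass, loss, gradient and weight update can be sketched for the simplest possible case: a single sigmoid neuron trained with plain gradient descent. This is a simplification assumed for illustration (no hidden layer, the whole toy dataset used at every step, a hand-picked learning rate). It is not what Keras does internally.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: one feature, class 1 when the feature is positive (invented for illustration)
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w = np.zeros(1)  # weight of the single input
b = 0.0          # bias
learning_rate = 0.5
losses = []

for epoch in range(100):
    # Forward pass: weighted sum followed by the activation function
    p = sigmoid(X @ w + b)
    # Loss: binary cross-entropy averaged over the instances
    losses.append(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))
    # Backward pass: gradient of the loss with respect to the weight and the bias
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    # Update step: move the parameters against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(losses[0], losses[-1])  # the loss decreases as the weights are adjusted
```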

**About loss function**

One of the most widely used loss functions is the cross-entropy. For more details, see [this article](https://towardsdatascience.com/cross-entropy-loss-function-f38c4ec8643e).
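
For a binary response, the cross-entropy averages $-\left[y\log(p) + (1-y)\log(1-p)\right]$ over the instances, where $y$ is the true label and $p$ the predicted probability of class 1. A minimal NumPy sketch follows. The function name `binary_cross_entropy` and the clipping constant `eps` are assumptions of this example.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between labels (0/1) and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0); the value of eps is an assumption of this sketch
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)))

# Confident and correct predictions give a small loss
print(binary_cross_entropy([1, 0], [0.9, 0.1]))
# Confident but wrong predictions give a large loss
print(binary_cross_entropy([1, 0], [0.1, 0.9]))
```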

**About backpropagation**

Here, we consider stochastic gradient descent. This method tries to minimize the loss function by changing the weights of each neuron. The weights are updated one batch at a time: for example, with 128 instances and a batch size of 32, we compute the responses for the first 32 instances, update the weights, and then repeat the same process for the next 32 instances until all instances have been used. For more details, see [Stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam).
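
The batching in the example above (128 instances, batch size 32) can be made concrete with a short sketch. Shuffling the instances before splitting them is a common practice and is assumed here.

```python
import numpy as np

n_instances = 128
batch_size = 32

rng = np.random.default_rng(0)
indices = rng.permutation(n_instances)  # shuffle the instances once per epoch

# Split the shuffled indices into consecutive batches
batches = [indices[start:start + batch_size]
           for start in range(0, n_instances, batch_size)]

print(len(batches))               # 4 weight updates per epoch
print([len(b) for b in batches])  # [32, 32, 32, 32]
```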

**Summarizing the ANN process**

 1. Initializing the ANN
 2. Build the input layer
 3. Build the hidden layers
 4. Build the output layer
 5. Training the ANN
 
   1. Compile
   2. Train.

## Initializing the ANN

The first step is to create an object that holds a sequence of layers: the input layer (the values we feed to the ANN), the hidden layers and the output layer. To do this, we use the Keras library and its `Sequential` class.

In [11]:
classifier = Sequential() # To initiate the ANN

## Input layer

In this step, we create the input layer, that is, we feed in all the independent variables (each variable is treated as an input neuron). To connect the input layer to the first hidden layer, we must choose an activation function. Here we choose the rectifier function (the initial weights of each input are chosen randomly by Keras). `input_dim` is the number of input neurons, `units` is the number of neurons in the layer that receives them, and `relu` is the activation function.

In [None]:
classifier.add(Dense(units=X_train.shape[1], activation='relu',
                    kernel_initializer= 'uniform',
                    input_dim=X_train.shape[1])) # Input layer connected to the first hidden layer

## Hidden layers

Here, we set the hidden layers. We can add as many as we want. For this problem, we add just one more hidden layer. The class used to create it is the same as for the input layer, and its parameters can be changed according to the problem. Here the parameters are the same as those of the input layer.

In [None]:
classifier.add(layers.Dropout(0.2)) # Drops 20% of the previous layer's outputs during training, to reduce overfitting
classifier.add(Dense(units=X_train.shape[1], activation='relu',
                    kernel_initializer= 'uniform')) # Creating the hidden layer

## Output layer

The last layer is the output layer. We use the same class that we used to build the preceding layers, but with some changes in the parameters. Since this problem has a binary response (yes or no), the output layer has 1 neuron. If the response had more than two possible values (0, 1, 2, for example), the number of neurons would have to match the number of classes. The second change is the activation function. Here we use the sigmoid activation function for one simple reason: it gives us a probability, which is interpreted as 0 or 1 according to its value. A probability less than $0.5$ is considered as 0, otherwise 1.

In [14]:
classifier.add(layers.Dropout(0.2)) # Dropout before the output layer, to reduce overfitting
classifier.add(Dense(units=1, activation='sigmoid',
                    kernel_initializer= 'uniform')) # Creating the output layer
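
To get a feel for the size of the network built above, the sketch below counts the trainable parameters of its fully connected layers. A `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The value of 12 input features is an assumption (nine original columns plus three one-hot Geography columns). In the notebook the actual value is `X_train.shape[1]`. The helper name `dense_params` is hypothetical.

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

n_features = 12  # assumed; use X_train.shape[1] in the notebook

layer_sizes = [
    (n_features, n_features),  # first hidden layer
    (n_features, n_features),  # second hidden layer
    (n_features, 1),           # output layer
]
counts = [dense_params(n_in, n_out) for n_in, n_out in layer_sizes]
print(counts, sum(counts))  # [156, 156, 13] 325
```

Dropout layers add no trainable parameters. Calling `classifier.summary()` reports the same kind of count for the actual model.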

## Learning process of the ANN

### Compiling the ANN

Compiling the ANN is one of the most important steps. We select a method to optimize our ANN: Adam, a variant of stochastic gradient descent. The loss function is also very important, because it is the quantity that the optimizer minimizes, which in turn improves the accuracy, precision and other relevant metrics. The loss function that we use is binary cross-entropy (because we have a binary response). Finally, the main metric we choose is accuracy. Besides this metric, there are other important metrics such as F1, precision and recall.

In [15]:
classifier.compile(optimizer='adam', loss='binary_crossentropy', 
                   metrics=['accuracy', metrics.Recall(), metrics.Precision()])

### Training the ANN

Now we train the ANN. Here we have two important hyperparameters. The batch size determines how many instances stochastic gradient descent uses for each weight update. The number of epochs is the number of full passes over the training set.

In [16]:
classifier.fit(X_train, y_train, batch_size=32, epochs=100)

Epoch 1/100
...
Epoch 100/100


<tensorflow.python.keras.callbacks.History at 0x7f03b0700080>

## Making a single prediction

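Here, we make a single prediction. Remember that Geography and Gender were encoded and that the features were scaled. The input must take this into account: its values must follow the same column order as the training data.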
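
One way to avoid mistakes in the column order is to describe the customer by name and let the code arrange the values. The sketch below is only an illustration: the column list is an assumption about the result of the preprocessing above (in the notebook, `list(X_train.columns)` gives the real order), and the customer values are hypothetical.

```python
# Column order assumed for illustration; in the notebook use list(X_train.columns)
columns = ['CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts',
           'HasCrCard', 'IsActiveMember', 'EstimatedSalary',
           'Geography_France', 'Geography_Germany', 'Geography_Spain']

# Hypothetical customer, with Gender already encoded and Geography one-hot encoded
customer = {
    'Geography_France': 1, 'Geography_Germany': 0, 'Geography_Spain': 0,
    'CreditScore': 600, 'Gender': 1, 'Age': 40, 'Tenure': 3,
    'Balance': 60000, 'NumOfProducts': 2, 'HasCrCard': 1,
    'IsActiveMember': 1, 'EstimatedSalary': 50000,
}

# Arrange the values in the expected column order, whatever order they were written in
row = [[customer[name] for name in columns]]
print(row)
```

A row built this way can be passed to `sc_X.transform` and then to `classifier.predict`.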

In [1]:
print(classifier.predict(sc_X.transform([[1, 0, 0, 600, 1, 1, 40, 60000, 2, 1, 1, 50000]])) > 0.5)


## Predicting the test results

In [17]:
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5) # The model outputs probabilities; thresholding at 0.5 turns them into a binary response
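
The effect of the threshold can be checked on a few made-up probabilities. The array below mimics the column shape returned by a one-unit sigmoid output layer, and its values are invented for illustration.

```python
import numpy as np

# Probabilities in the shape returned by a one-unit sigmoid output layer (values invented)
probabilities = np.array([[0.12], [0.50], [0.73], [0.49]])

# Strictly greater than 0.5 counts as class 1, so exactly 0.5 maps to class 0
labels = (probabilities > 0.5).astype(int).ravel()
print(labels)  # [0 0 1 0]
```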

## Making the confusion matrix and metrics scores

In [18]:
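from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
plt.figure()
sns.heatmap(cm, annot=True, fmt='d') # fmt='d' shows the counts as integers
plt.show()
print(cm)
print(accuracy_score(y_test, y_pred))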

[[1539   56]
 [ 255  150]]


0.8445
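
The accuracy, recall and precision can be recomputed by hand from the confusion matrix printed above. In scikit-learn's convention the rows are the true classes and the columns are the predicted classes, so the four cells are, in order, the true negatives, false positives, false negatives and true positives.

```python
import numpy as np

# Confusion matrix reported above: true classes in rows, predicted classes in columns
cm = np.array([[1539, 56],
               [255, 150]])

tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
recall = tp / (tp + fn)      # share of leaving customers that the model finds
precision = tp / (tp + fp)   # share of predicted leavers that really left

print('Accuracy {:.2f}%'.format(accuracy * 100))    # Accuracy 84.45%
print('Recall {:.2f}%'.format(recall * 100))        # Recall 37.04%
print('Precision {:.2f}%'.format(precision * 100))  # Precision 72.82%
```

The accuracy matches the value of 0.8445 shown above. The much lower recall shows that many leaving customers are missed, which accuracy alone hides because about $79.6\%$ of the customers stay.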

In [None]:
# Here, we consider the metrics class from TensorFlow library.
m1 = metrics.Accuracy() # Object
m2 = metrics.Recall()
m3 = metrics.Precision()

m1.update_state(y_test, y_pred) # Calculating the metric
m2.update_state(y_test, y_pred)
m3.update_state(y_test, y_pred)

print('Metric results for the test set\n')

print('Accuracy {:.2f}%'.format(m1.result().numpy()*100))
print('Recall {:.2f}%'.format(m2.result().numpy()*100))
print('Precision {:.2f}%'.format(m3.result().numpy()*100))

In [None]:
score_train = classifier.evaluate(X_train, y_train)
print('\n')
print('Metric results for the training set\n')

print('Accuracy {:.2f}%'.format(score_train[1]*100))
print('Recall {:.2f}%'.format(score_train[2]*100))
print('Precision {:.2f}%'.format(score_train[3]*100))

## Metrics with cross-validation

In [None]:
def build_classifier(optimizer='adam'):
    n_features = X_train.shape[1] # Number of input variables
    classifier = Sequential()
    classifier.add(Dense(units = n_features, kernel_initializer = 'uniform', activation = 'relu', input_dim = n_features))
    classifier.add(Dense(units = n_features, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid')) # One output neuron for the binary response
    classifier.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier

In [None]:
scoring = ['accuracy', 'recall', 'precision'] # List of metrics

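# Note: recent Keras releases no longer include this wrapper; the separate scikeras package offers a replacement
from keras.wrappers.scikit_learn import KerasClassifier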
from sklearn.model_selection import cross_validate
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 32, epochs = 100)
accuracies = cross_validate(estimator = classifier, X = X_train, y = y_train, 
                             cv = 10, n_jobs = -1, scoring = scoring)

In [None]:
print('Metric results with cross validate cv=10')
print('\n')
print("Accuracy: {:.2f} %".format(accuracies['test_accuracy'].mean()*100))
print("Recall: {:.2f} %".format(accuracies['test_recall'].mean()*100))
print("Precision: {:.2f} %".format(accuracies['test_precision'].mean()*100))
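
The idea behind `cv = 10` can be sketched with NumPy: the training rows are divided into ten folds, and each fold is used once for validation while the other nine are used for training. A plain split without shuffling or stratification is assumed here for simplicity. scikit-learn may use a stratified variant for classifiers.

```python
import numpy as np

n_samples = 8000  # 80% of the 10000 rows, as produced by the split above
n_folds = 10

# Split the row indices into 10 folds of (almost) equal size
folds = np.array_split(np.arange(n_samples), n_folds)

splits = []
for k, validation_idx in enumerate(folds):
    # Train on the other nine folds, validate on the held-out fold
    training_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
    splits.append((training_idx, validation_idx))

print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 7200 800
```

The reported metrics are the averages of the ten validation scores, which gives a more stable estimate than a single train/test split.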

## Tuning the model with GridSearchCV

In [26]:
scoring = {'ACC' : 'accuracy', 'REC' : 'recall', 'PC' : 'precision'} # Dictionary of metrics
from sklearn.model_selection import GridSearchCV

classifier = KerasClassifier(build_fn = build_classifier)

parameters = {'batch_size': [8, 16, 32],
              'epochs': [50, 100, 500],
              'optimizer': ['adam', 'rmsprop']}

grid_search = GridSearchCV(estimator = classifier,
                           param_grid = parameters,
                           scoring = scoring,
                           refit = 'ACC',
                           cv = 10, n_jobs = -1)
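
Before starting the search it helps to estimate its cost. The sketch below counts the candidate configurations in the grid above and the number of model trainings that 10-fold cross-validation implies.

```python
from itertools import product

# Same grid as in the cell above
parameters = {'batch_size': [8, 16, 32],
              'epochs': [50, 100, 500],
              'optimizer': ['adam', 'rmsprop']}
cv = 10

# Every combination of the listed values is one candidate configuration
combinations = list(product(*parameters.values()))
total_fits = len(combinations) * cv

print(len(combinations))  # 18 candidate configurations
print(total_fits)         # 180 model trainings, plus one final refit
```

Since some of these trainings run for 500 epochs, the search can take a long time.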



Epoch 1/500
...

In [None]:
grid_search = grid_search.fit(X_train, y_train)

In [None]:
best_parameters = grid_search.best_params_
best_accuracy = grid_search.best_score_

# Conclusion

In this notebook, we showed a simple example of how to build an ANN. The objective of building a good classification model was achieved: the model presents an accuracy of $85.9 \%$. Artificial Neural Networks have wide applicability, and they can also be used to build regression models. Classification models like this one can be applied to many problems, for example image recognition. Methods like this can be a helpful tool for decision-making about client policies.