# ANN Implementation using Churn Modelling Dataset, from Kaggle


#### Owner: https://www.mltut.com/implementation-of-artificial-neural-network-in-python/

An artificial neural network (ANN) can be used for both classification and regression. Here we use an ANN for classification: predicting the `Exited` column of the Churn Modelling dataset.

# 1. Data Preprocessing
## 1.1 Import the Libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

NumPy is an open-source Python library for numerical computing. It provides an n-dimensional array type along with functions for linear algebra, Fourier transforms, and matrix operations.

Matplotlib is a plotting library used to create figures, add plotting areas to them, draw lines in those areas, and decorate plots with labels and other annotations.

Pandas is a library for data wrangling and analysis.

With the required libraries imported, the next step is to load the dataset.

## 1.2 Load the Dataset

In [5]:
df = pd.read_csv('Churn_Modelling.csv')
df.head()

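| | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |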


In [10]:
df.shape # rows x columns

(10000, 14)

In [23]:
dt = df.to_dict()
# dt

## 1.3 Split Dataset into X and Y

In [8]:
# All rows, columns 3 through 12 (the end index 13 is excluded).
# Wrapping .values in a new DataFrame resets the column labels to 0..9.
X = pd.DataFrame(df.iloc[:, 3:13].values)
X

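| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 |
| 1 | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 |
| 2 | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 |
| 3 | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 |
| 4 | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 771 | France | Male | 39 | 5 | 0.0 | 2 | 1 | 0 | 96270.64 |
| 9996 | 516 | France | Male | 35 | 10 | 57369.61 | 1 | 1 | 1 | 101699.77 |
| 9997 | 709 | France | Female | 36 | 7 | 0.0 | 1 | 0 | 1 | 42085.58 |
| 9998 | 772 | Germany | Male | 42 | 3 | 75075.31 | 2 | 1 | 0 | 92888.52 |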
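
To make the slicing rule concrete, here is a minimal sketch on a toy frame. The column names `c0` to `c5` are made up for illustration. `iloc` uses half-open ranges, so `1:4` selects positions 1, 2, and 3. Selecting without `.values` is one way to keep the original column names, should that be preferred.

```python
import pandas as pd

# Toy frame with hypothetical column names c0..c5
toy = pd.DataFrame([[0, 1, 2, 3, 4, 5]], columns=[f"c{i}" for i in range(6)])

# iloc ranges are half-open: 1:4 selects positions 1, 2 and 3
subset = toy.iloc[:, 1:4]
print(list(subset.columns))      # ['c1', 'c2', 'c3']

# Wrapping .values in a new DataFrame resets the labels to 0, 1, 2
relabelled = pd.DataFrame(toy.iloc[:, 1:4].values)
print(list(relabelled.columns))  # [0, 1, 2]
```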


In [12]:
y = pd.DataFrame(df.iloc[:, 13].values)
y

| | 0 |
|---|---|
| 0 | 1 |
| 1 | 0 |
| 2 | 1 |
| 3 | 0 |
| 4 | 0 |
| ... | ... |
| 9995 | 0 |
| 9996 | 0 |
| 9997 | 1 |
| 9998 | 1 |


## 1.4 Encode Categorical Data
### Label Encoding for the Gender Variable

In [29]:
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
labelEncoder_X_2 = LabelEncoder()
X.loc[:, 2] = labelEncoder_X_2.fit_transform(X.iloc[:, 2])
X

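| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | 0 | 0 | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 |
| 1 | 608 | 2 | 0 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 |
| 2 | 502 | 0 | 0 | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 |
| 3 | 699 | 0 | 0 | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 |
| 4 | 850 | 2 | 0 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 771 | 0 | 1 | 39 | 5 | 0.0 | 2 | 1 | 0 | 96270.64 |
| 9996 | 516 | 0 | 1 | 35 | 10 | 57369.61 | 1 | 1 | 1 | 101699.77 |
| 9997 | 709 | 0 | 0 | 36 | 7 | 0.0 | 1 | 0 | 1 | 42085.58 |
| 9998 | 772 | 1 | 1 | 42 | 3 | 75075.31 | 2 | 1 | 0 | 92888.52 |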


### Label Encoding for the Geography Variable

In [30]:
labelEncoder_X_1 = LabelEncoder()
X.loc[:, 1] = labelEncoder_X_1.fit_transform(X.iloc[:, 1])
X

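| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | 0 | 0 | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 |
| 1 | 608 | 2 | 0 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 |
| 2 | 502 | 0 | 0 | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 |
| 3 | 699 | 0 | 0 | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 |
| 4 | 850 | 2 | 0 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 771 | 0 | 1 | 39 | 5 | 0.0 | 2 | 1 | 0 | 96270.64 |
| 9996 | 516 | 0 | 1 | 35 | 10 | 57369.61 | 1 | 1 | 1 | 101699.77 |
| 9997 | 709 | 0 | 0 | 36 | 7 | 0.0 | 1 | 0 | 1 | 42085.58 |
| 9998 | 772 | 1 | 1 | 42 | 3 | 75075.31 | 2 | 1 | 0 | 92888.52 |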
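
As a small illustration of how these integer codes come about, the sketch below runs `LabelEncoder` on a handful of country names. The codes follow the sorted order of the labels, which matches the table above (France is 0, Germany is 1, Spain is 2). The variable names `encoder` and `codes` are illustrative only.

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
codes = encoder.fit_transform(["France", "Spain", "France", "Germany"])

# Classes are stored in sorted order; each label maps to its position
print(encoder.classes_.tolist())                   # ['France', 'Germany', 'Spain']
print(codes.tolist())                              # [0, 2, 0, 1]

# The mapping can be reversed to recover the original labels
print(encoder.inverse_transform([2, 1]).tolist())  # ['Spain', 'Germany']
```

Because the codes imply an ordering (France < Germany < Spain) that has no real meaning, a nominal column such as Geography is usually one-hot encoded afterwards.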


### One-Hot Encoding for the Geography Variable

In [102]:
onehotencoder = ColumnTransformer([('Geography', OneHotEncoder(), [1])], remainder = 'passthrough')
onehotencoder

ColumnTransformer(remainder='passthrough',
                  transformers=[('Geography', OneHotEncoder(), [1])])

In [103]:
# One-hot encode the Geography column (index 1).
# The new dummy columns come first, followed by the untouched columns.
X = onehotencoder.fit_transform(X)
X

array([[1.0, 0.0, 1.0, ..., 1, 1, 101348.88],
       [1.0, 0.0, 1.0, ..., 0, 1, 112542.58],
       [1.0, 0.0, 1.0, ..., 1, 0, 113931.57],
       ...,
       [1.0, 0.0, 1.0, ..., 0, 1, 42085.58],
       [0.0, 1.0, 0.0, ..., 1, 0, 92888.52],
       [1.0, 0.0, 1.0, ..., 1, 0, 38190.78]], dtype=object)
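
The column layout after one-hot encoding can be easier to see on a tiny frame. The sketch below is an illustration only: the three rows are toy values, and `drop="first"` is an optional setting (not used in the cell above) that removes one redundant dummy column. `sparse_threshold=0` asks `ColumnTransformer` to return a dense array.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Toy rows; column names borrowed from the dataset for readability
toy = pd.DataFrame({
    "CreditScore": [619, 608, 772],
    "Geography": ["France", "Spain", "Germany"],
    "Age": [42, 41, 42],
})

ct = ColumnTransformer(
    [("geo", OneHotEncoder(drop="first"), ["Geography"])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
encoded = ct.fit_transform(toy)

# Output columns: Germany, Spain, CreditScore, Age (France is the dropped baseline)
print(encoded.shape)  # (3, 4)
print(encoded.tolist())
```

Without `drop="first"` the result would have three dummy columns instead of two, so the number of input features seen by the network depends on this choice.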

## 1.5 Split the X and Y Dataset into the Training set and Test set

In [105]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
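
For a quick check of what `test_size=0.2` produces, here is a minimal sketch on ten made-up samples. The `stratify` argument is optional and is not used in the cell above; it is shown as one way to keep the class ratio similar in both parts, which may help when the classes are imbalanced.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X_demo = np.arange(20).reshape(10, 2)              # 10 samples, 2 features
y_demo = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])  # made-up labels

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

# 80% of the rows go to training, 20% to testing
print(X_tr.shape, X_te.shape)  # (8, 2) (2, 2)
```

With the 10,000 rows of this dataset, the same ratio gives 8,000 training rows and 2,000 test rows.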

## 1.6 Perform Feature Scaling

In [110]:
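from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)  # learn mean and standard deviation from the training set only
X_test = sc.transform(X_test)        # reuse the training-set statistics on the test set
X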

array([[1.0, 0.0, 1.0, ..., 1, 1, 101348.88],
       [1.0, 0.0, 1.0, ..., 0, 1, 112542.58],
       [1.0, 0.0, 1.0, ..., 1, 0, 113931.57],
       ...,
       [1.0, 0.0, 1.0, ..., 0, 1, 42085.58],
       [0.0, 1.0, 0.0, ..., 1, 0, 92888.52],
       [1.0, 0.0, 1.0, ..., 1, 0, 38190.78]], dtype=object)
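
A common convention is to fit the scaler on the training data only and then apply the same mean and standard deviation to the test data, so that no information from the test set leaks into preprocessing. The sketch below shows this on made-up numbers; the names `train`, `test`, and `scaler` are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

train = np.array([[1.0], [2.0], [3.0]])  # made-up training feature
test = np.array([[2.0], [4.0]])          # made-up test feature

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train
test_scaled = scaler.transform(test)        # applies the training mean and std

print(scaler.mean_)            # [2.]
print(train_scaled.ravel())    # zero mean, unit variance
print(test_scaled.ravel())     # scaled with the training statistics
```

Note that the scaled test values need not have zero mean or unit variance; they are simply expressed in the units learned from the training set.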

# 2. Build Artificial Neural Network
## 2.1 Import the Keras libraries and packages

In [113]:
import keras
from keras.models import Sequential
from keras.layers import Dense

## 2.2 Initialize the Artificial Neural Network

In [115]:
classifier = Sequential()

## 2.3 Add the input layer and the first hidden layer

In [116]:
# Keras 2 renamed output_dim to units and init to kernel_initializer.
# input_dim must equal the number of feature columns in X_train.
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu', input_dim=X_train.shape[1]))

## 2.4 Add the second hidden layer

In [117]:
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu'))

## 2.5 Add the output layer

In [118]:
# A single sigmoid unit outputs the predicted probability of the positive class
classifier.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
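
To show what the three layers compute, here is a rough NumPy sketch of a forward pass through two ReLU hidden layers of 6 units and one sigmoid output unit. It assumes 11 input features (use whatever column count `X_train` actually has), small uniform random weights in an assumed range of -0.05 to 0.05, and zero biases. The names `forward`, `sizes`, `weights`, and `biases` are illustrative, and the weights are untrained, so the outputs carry no meaning beyond their shape and range.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes: 11 inputs -> 6 hidden -> 6 hidden -> 1 output (11 is an assumption)
sizes = [11, 6, 6, 1]
weights = [rng.uniform(-0.05, 0.05, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    h = relu(x @ weights[0] + biases[0])         # first hidden layer
    h = relu(h @ weights[1] + biases[1])         # second hidden layer
    return sigmoid(h @ weights[2] + biases[2])   # output probability

batch = rng.normal(size=(4, 11))                 # 4 made-up, already scaled rows
probs = forward(batch)
n_params = sum(w.size + b.size for w, b in zip(weights, biases))
print(probs.shape, n_params)  # (4, 1) 121
```

The parameter count breaks down as (11·6 + 6) + (6·6 + 6) + (6·1 + 1) = 72 + 42 + 7 = 121.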

# 3. Train the ANN
Training takes two steps: **compile the ANN**, then **fit the ANN** to the training set. We start with compiling.

## 3.1 Compile the ANN

In [119]:
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
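
For intuition about the `binary_crossentropy` loss, the sketch below computes it by hand with NumPy as the mean of -(y·log p + (1 - y)·log(1 - p)). The helper name `binary_crossentropy`, the clipping constant `eps`, and the sample values are assumptions for illustration, not the Keras implementation.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip probabilities so that log(0) never occurs
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

y_true = np.array([1, 0, 1, 0])                # made-up labels
confident = np.array([0.9, 0.1, 0.8, 0.2])     # mostly correct predictions
unsure = np.array([0.5, 0.5, 0.5, 0.5])        # always predicts 0.5

loss_confident = binary_crossentropy(y_true, confident)
loss_unsure = binary_crossentropy(y_true, unsure)
print(loss_confident)  # lower loss for confident, correct predictions
print(loss_unsure)     # equals ln(2), about 0.693
```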

## 3.2 Fit the ANN to the Training set

In [120]:
# Keras 2 renamed nb_epoch to epochs
classifier.fit(X_train, y_train, batch_size=10, epochs=100)
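
As a back-of-the-envelope check on what these settings mean, the arithmetic below counts the weight updates, assuming 8,000 training rows (80% of the 10,000 rows in the dataset).

```python
import math

n_train = 8000   # 80% of the 10,000 rows
batch_size = 10
epochs = 100

# One weight update per batch; the last batch may be smaller, hence ceil
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 800 80000
```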

# 4. Predict the Test Set Results

In [None]:
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
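
The thresholding step can be sketched on a few made-up probabilities. The values in `probs` and `labels` are invented for illustration. Note that the comparison is strict, so a probability of exactly 0.5 is treated as the negative class.

```python
import numpy as np

probs = np.array([[0.91], [0.12], [0.50], [0.73]])  # made-up sigmoid outputs
labels = np.array([1, 0, 1, 0])                     # made-up ground truth

# Turn probabilities into boolean class predictions
pred = (probs > 0.5).ravel()
print(pred.tolist())  # [True, False, False, True]

# Fraction of predictions that match the labels
accuracy = float(np.mean(pred == labels.astype(bool)))
print(accuracy)       # 0.5
```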

# 5. Make the Confusion Matrix

In [None]:
from 