#### Deep Learning is the most exciting and powerful branch of Machine Learning. Deep Learning models can be used for a variety of complex tasks:

1. Artificial Neural Networks for Regression and Classification
2. Convolutional Neural Networks for Computer Vision
3. Recurrent Neural Networks for Time Series Analysis
4. Self Organizing Maps for Feature Extraction
5. Deep Boltzmann Machines for Recommendation Systems
6. Auto Encoders for Recommendation Systems

##### In this part, you will understand and learn how to implement the following Deep Learning models:

1. Artificial Neural Networks for a Business Problem
2. Convolutional Neural Networks for a Computer Vision task

##### Training the ANN with Stochastic Gradient Descent
1. Randomly initialize the weights to small numbers close to 0 (but not 0). Training then adjusts these weights until the cost function is minimized.
2. Input the first observation of your dataset into the input layer, one feature per input node.
3. Forward propagation: from left to right, the neurons are activated, and the weights determine how much each neuron's activation contributes to the next layer.
4. Compare the predicted result to the actual result and measure the generated error.
5. Back propagation: from right to left, the error is propagated back through the network. Update the weights according to how much each one is responsible for the error. The learning rate decides by how much the weights are updated.
6. Repeat steps 2 to 5 and update the weights after each observation (stochastic, or online, learning), or repeat steps 2 to 5 but update the weights only after a batch of observations (batch learning).
7. When the whole training set has passed through the ANN, that makes one epoch. Run more epochs to keep improving the network.
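
The steps above can be sketched in plain NumPy. This is only an illustration under simplifying assumptions: the "network" is a single sigmoid neuron with no hidden layer, and the data, the names `X`, `y`, `w`, `b` and the learning rate are made up for the example, not taken from the bank dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 2 features, label is 1 when their sum is positive
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_loss(w, b):
    # Binary cross-entropy averaged over the whole toy dataset
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Step 1: small random weights close to 0
w = rng.normal(scale=0.01, size=2)
b = 0.0
learning_rate = 0.1
loss_before = mean_loss(w, b)

for epoch in range(20):                      # Step 7: one epoch = one full pass
    for i in rng.permutation(len(X)):        # Step 6: update after each observation
        p = sigmoid(X[i] @ w + b)            # Steps 2-3: forward propagation
        error = p - y[i]                     # Step 4: predicted minus actual
        w -= learning_rate * error * X[i]    # Step 5: move weights against the error
        b -= learning_rate * error

loss_after = mean_loss(w, b)
accuracy = float(np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1)))
print(loss_before, loss_after, accuracy)
```

For a sigmoid output with cross-entropy loss, the gradient with respect to the pre-activation is simply `p - y`, which is why the update rule is so short here. A real ANN applies the same idea layer by layer.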


###### Business problem: We have data from a bank whose customers are leaving at a high rate. The bank wants to assess the situation and understand why.

* Deep learning is used in many areas:
1. Computer vision
2. Medicine
3. Prediction and classification for business problems
4. Image recognition
5. Recommendation engines (Deep Boltzmann machines)

##### In our business problem, we want to predict which customers are leaving the bank.

##### Basically, we are doing classification, but with deep learning.

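##### Theano: An open source numerical computing library that runs on CPU and GPU.

##### TensorFlow: Another open source numerical computing library that runs on CPU and GPU.

* TensorFlow and Theano are used to build deep learning models from scratch.

##### Keras: A high-level library for building deep learning models with little code.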


In [3]:
# Part 1: Data preprocessing
import numpy as np
import pandas as pd

In [4]:
dataset = pd.read_csv('Churn_Modelling.csv')
dataset

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.00,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.80,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.00,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.10,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,9996,15606229,Obijiaku,771,France,Male,39,5,0.00,2,1,0,96270.64,0
9996,9997,15569892,Johnstone,516,France,Male,35,10,57369.61,1,1,1,101699.77,0
9997,9998,15584532,Liu,709,France,Female,36,7,0.00,1,0,1,42085.58,1
9998,9999,15682355,Sabbatini,772,Germany,Male,42,3,75075.31,2,1,0,92888.52,1


In [62]:
# Expected impact of each column on whether a customer leaves:
# 1. RowNumber: no impact
# 2. CustomerId: no impact
# 3. Surname: no impact
# 4. CreditScore: impact
# 5. Geography: impact
# 6. Gender: impact
# 7. Age: impact
# 8. Tenure: impact
# 9. Balance: impact
# 10. NumOfProducts: impact
# 11. HasCrCard: impact
# 12. IsActiveMember: impact
# 13. EstimatedSalary: impact
# We don't know in advance which independent variables matter most.
# The neural network works that out during training by giving larger
# weights to the variables that have more impact.
# Features: CreditScore through EstimatedSalary; target: Exited (last column)
x = dataset.iloc[:, 3:-1].values
y = dataset.iloc[:,-1].values
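
To see which columns the two slices pick, the same positions can be applied to a plain list of the column names from the preview above. This is just a sketch; `columns`, `feature_columns` and `target_column` are names introduced for the illustration.

```python
# Column names as shown in the dataset preview
columns = ['RowNumber', 'CustomerId', 'Surname', 'CreditScore', 'Geography',
           'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard',
           'IsActiveMember', 'EstimatedSalary', 'Exited']

feature_columns = columns[3:-1]   # same positions as dataset.iloc[:, 3:-1]
target_column = columns[-1]       # same position as dataset.iloc[:, -1]

print(feature_columns)
print(len(feature_columns), target_column)
```

The slice skips the three identifier columns and stops just before the target, which leaves 10 feature columns.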

In [63]:
x

array([[619, 'France', 'Female', ..., 1, 1, 101348.88],
       [608, 'Spain', 'Female', ..., 0, 1, 112542.58],
       [502, 'France', 'Female', ..., 1, 0, 113931.57],
       ...,
       [709, 'France', 'Female', ..., 0, 1, 42085.58],
       [772, 'Germany', 'Male', ..., 1, 0, 92888.52],
       [792, 'France', 'Female', ..., 1, 0, 38190.78]], dtype=object)

In [64]:
y

array([1, 0, 1, ..., 1, 1, 0])

In [66]:
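# Encode the categorical columns (Geography and Gender)
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
# Geography (column 1): map each country name to an integer label
labelencoder_x1 = LabelEncoder()
x[:,1] = labelencoder_x1.fit_transform(x[:,1])
# Gender (column 2): map Female/Male to 0/1
labelencoder_x2 = LabelEncoder()
x[:,2] = labelencoder_x2.fit_transform(x[:,2])
# One-hot encode Geography into 3 dummy columns, placed first; the other
# 9 columns pass through unchanged, so x ends up with 12 columns
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
x = np.array(ct.fit_transform(x))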
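
What one-hot encoding does to the Geography column can be sketched by hand in NumPy. This is an illustration, not the scikit-learn implementation; the four sample values and the names `geography`, `categories` and `one_hot` are assumptions made for the example.

```python
import numpy as np

# Hypothetical sample of the Geography column
geography = np.array(['France', 'Spain', 'France', 'Germany'])

# Sorted unique categories: France, Germany, Spain
categories = np.unique(geography)

# One column per category, 1.0 where the row belongs to that category
one_hot = (geography[:, None] == categories[None, :]).astype(float)
print(categories)
print(one_hot)

# Replacing 1 column by 3 dummy columns turns 10 features into 12
n_features_before = 10
n_features_after = n_features_before - 1 + len(categories)
print(n_features_after)
```

The dummy columns follow the sorted category order, so a French customer becomes `[1, 0, 0]` and a Spanish one `[0, 0, 1]`, which matches the first rows of the encoded `x`.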

In [67]:
y

array([1, 0, 1, ..., 1, 1, 0])

In [68]:
x

array([[1.0, 0.0, 0.0, ..., 1, 1, 101348.88],
       [0.0, 0.0, 1.0, ..., 0, 1, 112542.58],
       [1.0, 0.0, 0.0, ..., 1, 0, 113931.57],
       ...,
       [1.0, 0.0, 0.0, ..., 0, 1, 42085.58],
       [0.0, 1.0, 0.0, ..., 1, 0, 92888.52],
       [1.0, 0.0, 0.0, ..., 1, 0, 38190.78]], dtype=object)

In [69]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.2, random_state=0)
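
The idea behind the split can be sketched with a shuffled index array. This is a minimal sketch, not a reproduction of `train_test_split` (it will not select the same rows), and `split_indices` is a hypothetical helper. The row count of 10,000 is an assumption about the dataset size.

```python
import numpy as np

def split_indices(n_rows, test_size=0.2, seed=0):
    """Return (train_idx, test_idx) as disjoint shuffled row indices."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_size))
    return shuffled[n_test:], shuffled[:n_test]

train_idx, test_idx = split_indices(10000, test_size=0.2, seed=0)
print(len(train_idx), len(test_idx))
```

Fixing the seed plays the same role as `random_state=0`: the split is random but repeatable.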

In [70]:
x_train

array([[0.0, 0.0, 1.0, ..., 1, 0, 163830.64],
       [0.0, 1.0, 0.0, ..., 1, 1, 57098.0],
       [1.0, 0.0, 0.0, ..., 1, 0, 185630.76],
       ...,
       [1.0, 0.0, 0.0, ..., 1, 0, 181429.87],
       [0.0, 0.0, 1.0, ..., 1, 1, 148750.16],
       [0.0, 1.0, 0.0, ..., 1, 0, 118855.26]], dtype=object)

In [71]:
x_test

array([[0.0, 1.0, 0.0, ..., 1, 1, 192852.67],
       [1.0, 0.0, 0.0, ..., 1, 0, 128702.1],
       [0.0, 0.0, 1.0, ..., 1, 1, 75732.25],
       ...,
       [0.0, 0.0, 1.0, ..., 1, 0, 141533.19],
       [0.0, 1.0, 0.0, ..., 1, 1, 11276.48],
       [0.0, 1.0, 0.0, ..., 1, 0, 192950.6]], dtype=object)

In [72]:
y_test

array([0, 1, 0, ..., 0, 0, 0])

In [73]:
y_train

array([0, 0, 0, ..., 0, 0, 1])

In [74]:
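# Feature scaling is essential for neural networks
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Fit the scaler on the training set only, then apply the same
# mean and standard deviation to the test set
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)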
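
Standardization is normally fitted on the training set and then reused for the test set, so both are scaled with the same mean and standard deviation. A small NumPy sketch of that idea, with made-up arrays `train` and `test`:

```python
import numpy as np

# Hypothetical data: 2 features on very different scales
train = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
test = np.array([[2.0, 400.0], [4.0, 800.0]])

# Statistics come from the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)   # population standard deviation (ddof=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std   # reuse the training statistics

print(train_scaled)
print(test_scaled)
```

Computing separate statistics on the test set would put the two sets on slightly different scales and leak information about the test data into preprocessing.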

In [75]:
x_train

array([[-1.01460667, -0.5698444 ,  1.74309049, ...,  0.64259497,
        -1.03227043,  1.10643166],
       [-1.01460667,  1.75486502, -0.57369368, ...,  0.64259497,
         0.9687384 , -0.74866447],
       [ 0.98560362, -0.5698444 , -0.57369368, ...,  0.64259497,
        -1.03227043,  1.48533467],
       ...,
       [ 0.98560362, -0.5698444 , -0.57369368, ...,  0.64259497,
        -1.03227043,  1.41231994],
       [-1.01460667, -0.5698444 ,  1.74309049, ...,  0.64259497,
         0.9687384 ,  0.84432121],
       [-1.01460667,  1.75486502, -0.57369368, ...,  0.64259497,
        -1.03227043,  0.32472465]])

In [76]:
x_test

array([[-0.95692675,  1.62776996, -0.57427105, ...,  0.66011376,
         0.97628121,  1.62185911],
       [ 1.04501206, -0.61433742, -0.57427105, ...,  0.66011376,
        -1.02429504,  0.504204  ],
       [-0.95692675, -0.61433742,  1.74133801, ...,  0.66011376,
         0.97628121, -0.41865644],
       ...,
       [-0.95692675, -0.61433742,  1.74133801, ...,  0.66011376,
        -1.02429504,  0.72775202],
       [-0.95692675,  1.62776996, -0.57427105, ...,  0.66011376,
         0.97628121, -1.54162886],
       [-0.95692675,  1.62776996, -0.57427105, ...,  0.66011376,
        -1.02429504,  1.62356528]])

In [77]:
# Now let's build the ANN using Keras

In [78]:
import keras

In [79]:
# Modules
# 1. Sequential: initializes the neural network
# 2. Dense: builds the layers of the ANN
from keras.models import Sequential
from keras.layers import Dense

In [80]:
# Initialize the ANN
# A model can be defined in two ways:
# 1. As a sequence of layers
# 2. As a graph
# Here we use a sequence of successive layers.
# This model is a classifier.
classifier = Sequential()

In [81]:
classifier

<keras.engine.sequential.Sequential at 0x7f1c0c799898>

In [82]:
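# Add the input and hidden layers
# Building and training the ANN follows the 7 steps listed earlier:
# Dense: a fully connected layer; it holds the weights
# Input layer: one node per independent variable
# Forward propagation: the higher a neuron's activation, the more signal it passes on
# Activation function for the hidden layers: rectifier (ReLU)
# Activation function for the output layer: sigmoid, which gives a probability
# Compare predicted and actual results to measure the error
# Back propagate the error
# Update the weights
# Repeat for each observation or batch
# One pass of the whole training set through the ANN is an epoch; repeat for more epochs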

In [83]:
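# Add the input layer and the first hidden layer
# First argument (units): number of nodes in this hidden layer
# There is no fixed rule for choosing it; a practical rule of thumb is
# (number of input nodes + number of output nodes) / 2
# input_dim: number of nodes in the input layer, i.e. the number of
# features; the encoded x has 12 columns (3 Geography dummies + 9 others)
classifier.add(Dense(3, kernel_initializer = 'uniform', activation = 'relu', input_dim=12))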

In [84]:
# Add second hidden layer
classifier.add(Dense(3, kernel_initializer = 'uniform', activation = 'relu'))

In [85]:
# Add the output layer: a single sigmoid unit that outputs the
# probability that the customer leaves (binary target)
classifier.add(Dense(1, kernel_initializer = 'uniform', activation = 'sigmoid'))
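
Forward propagation through a small stack of dense layers can be sketched in NumPy. This is an illustration rather than what Keras does internally. It assumes 12 input features, two hidden layers of 3 ReLU units and a single sigmoid output, and the weight range and the names `layer_sizes`, `weights`, `biases` and `forward` are chosen for the example.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
layer_sizes = [12, 3, 3, 1]   # assumed: inputs, hidden, hidden, output

# Small uniform random weights and zero biases for each dense layer
weights = [rng.uniform(-0.05, 0.05, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(x):
    """Propagate a batch of rows through the hidden layers and the output."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)                           # hidden layers: ReLU
    return sigmoid(a @ weights[-1] + biases[-1])      # output: probability

batch = rng.normal(size=(5, 12))   # 5 made-up, already scaled observations
probs = forward(batch)
n_params = sum(W.size + b.size for W, b in zip(weights, biases))
print(probs.ravel(), n_params)
```

Each dense layer is a matrix product plus a bias followed by an activation, so the number of trainable parameters per layer is `inputs * units + units`.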

In [86]:
# Compile the whole ANN
# optimizer: the algorithm used to update the weights ("adam" is a
# variant of stochastic gradient descent)
# loss: "binary_crossentropy" for two classes, "categorical_crossentropy" for more
# metrics: list of metrics evaluated by the model during training
classifier.compile(optimizer="adam",loss="binary_crossentropy", metrics=["accuracy"])
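
The loss and metric named in the compile step can be written out by hand. The following is a plain-Python sketch of binary cross-entropy and accuracy, not the Keras implementation; the helper names, the clipping constant `eps` and the 0.5 threshold are assumptions for the example.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all observations."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)   # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Share of observations whose thresholded prediction matches the label."""
    hits = sum(1 for t, p in zip(y_true, y_prob) if (p > threshold) == (t == 1))
    return hits / len(y_true)

unsure = binary_crossentropy([1, 0], [0.5, 0.5])      # equals ln(2)
confident = binary_crossentropy([1, 0], [0.9, 0.1])   # lower loss
acc = accuracy([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.1])
print(unsure, confident, acc)
```

The loss rewards confident correct probabilities, while accuracy only looks at which side of the threshold each prediction falls on.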

In [91]:
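# Fit the ANN to the training set
# batch_size: number of observations after which the weights are updated
# epochs: number of passes over the whole training set
# The ValueError below was raised with input_dim=11 in the first layer:
# the encoded x_train has 12 columns, so the first layer needs input_dim=12
classifier.fit(x_train, y_train, batch_size=10, epochs=100)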

ValueError: Error when checking input: expected dense_20_input to have shape (11,) but got array with shape (12,)
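
How `batch_size` and `epochs` translate into weight updates is simple arithmetic. The sketch below assumes 8,000 training rows (10,000 rows with `test_size=0.2`); `iter_batches` is a hypothetical helper, and it omits the shuffling that a training loop would normally apply each epoch.

```python
import math

n_train = 8000     # assumption: 10,000 rows with test_size=0.2
batch_size = 10
epochs = 100

# One weight update per batch
updates_per_epoch = math.ceil(n_train / batch_size)
total_updates = updates_per_epoch * epochs

def iter_batches(n_rows, batch_size):
    """Yield (start, stop) index pairs that cover n_rows in order."""
    for start in range(0, n_rows, batch_size):
        yield start, min(start + batch_size, n_rows)

print(updates_per_epoch, total_updates)
print(list(iter_batches(25, 10)))
```

A smaller batch size means more, noisier updates per epoch; a larger one means fewer, smoother updates.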
