# ARTIFICIAL NEURAL NETWORKS

### TRAINING ANN WITH STOCHASTIC GRADIENT DESCENT

Steps:

1. Randomly initialise the weights to small numbers close to 0.
2. Feed the first observation of the dataset into the input layer, one feature per input node.
3. Forward propagation: from left to right, the neurons are activated, with each neuron's contribution to the next layer scaled by its weights. Propagate the activations until you obtain the predicted result y.
4. Compare the predicted result to the actual result and measure the resulting error.
5. Backpropagation: from right to left, the error is propagated backwards. Update the weights according to how much each one contributed to the error. The learning rate determines the size of each update.
6. Repeat steps 2 to 5 and update the weights after each observation (stochastic, or online, learning), or repeat steps 2 to 5 and update the weights only after a batch of observations (batch learning).
7. One full pass of the training set through the ANN is an epoch. Repeat for more epochs.
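
The steps above can be sketched in plain Python. This is only a minimal illustration, not the Keras model built later: it assumes a single sigmoid neuron instead of a multi-layer network, a toy OR dataset, and hypothetical helper names (`train_sgd`, `predict_proba`). The weights are initialised once and then updated after every observation.

```python
import math
import random


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Forward pass of one sigmoid neuron."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(z)


def train_sgd(X, y, learning_rate=0.5, epochs=1000, seed=0):
    rng = random.Random(seed)
    # Step 1: initialise the weights once, to small values close to 0.
    weights = [rng.uniform(-0.01, 0.01) for _ in X[0]]
    bias = 0.0
    for _ in range(epochs):  # Step 7: one full pass over the data is one epoch.
        for features, target in zip(X, y):  # Step 2: one observation at a time.
            # Step 3: forward propagation.
            prediction = predict_proba(weights, bias, features)
            # Step 4: for a sigmoid output with cross-entropy loss, the gradient
            # of the loss with respect to z is (prediction - target).
            error = prediction - target
            # Step 5: move each weight against its share of the error.
            weights = [w - learning_rate * error * f
                       for w, f in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Toy data: logical OR (an assumption, chosen for illustration only).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 1]
weights, bias = train_sgd(X, y)
predictions = [round(predict_proba(weights, bias, row)) for row in X]
print(predictions)
```

Batch learning would differ only in step 5: the per-observation gradients would be averaged over a batch before a single update is applied.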


### Part 1 - Data Preprocessing

Importing Dataset

In [43]:
import tensorflow as tf 
import pandas as pd 
import numpy as np

In [66]:
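dataset=pd.read_csv("Churn_Modelling.csv")
# Features: every column from index 3 up to (but not including) the last one
X=dataset.iloc[:,3:-1].values
# Target: the last column, as a NumPy array
Y=dataset.iloc[:,-1].values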

In [67]:
X

array([[619, 'France', 'Female', ..., 1, 1, 101348.88],
       [608, 'Spain', 'Female', ..., 0, 1, 112542.58],
       [502, 'France', 'Female', ..., 1, 0, 113931.57],
       ...,
       [709, 'France', 'Female', ..., 0, 1, 42085.58],
       [772, 'Germany', 'Male', ..., 1, 0, 92888.52],
       [792, 'France', 'Female', ..., 1, 0, 38190.78]], dtype=object)

In [83]:
X.shape

(10000, 10)

In [110]:
Y

array([1, 0, 1, ..., 1, 1, 0], dtype=int64)

In [84]:
Y.shape

(10000,)

#### Encoding Categorical Data

Label Encoding the "Gender" column

In [69]:
from sklearn.preprocessing import LabelEncoder
le=LabelEncoder()
X[:,2]=le.fit_transform(X[:,2])
X

array([[619, 'France', 0, ..., 1, 1, 101348.88],
       [608, 'Spain', 0, ..., 0, 1, 112542.58],
       [502, 'France', 0, ..., 1, 0, 113931.57],
       ...,
       [709, 'France', 0, ..., 0, 1, 42085.58],
       [772, 'Germany', 1, ..., 1, 0, 92888.52],
       [792, 'France', 0, ..., 1, 0, 38190.78]], dtype=object)

In [87]:
X.shape

(10000, 10)
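
As a rough illustration of what label encoding does here (a plain-Python sketch, not scikit-learn's implementation): the distinct values are sorted and each one is replaced by its position in that order, which is why 'Female' becomes 0 and 'Male' becomes 1 in the array above. The helper name `label_encode` is hypothetical.

```python
def label_encode(values):
    """Map each distinct value to its index in sorted order."""
    classes = sorted(set(values))
    index = {label: i for i, label in enumerate(classes)}
    return [index[v] for v in values], classes


# Made-up sample of the "Gender" column.
encoded, classes = label_encode(["Female", "Female", "Male", "Female"])
print(classes)
print(encoded)
```

An integer code is acceptable for a two-valued column; for columns with more than two categories it would imply an ordering that does not exist, which is why "Geography" is handled differently.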

One Hot Encoding the "Geography" Column

In [70]:
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
ct=ColumnTransformer(transformers=[('encoder',OneHotEncoder(),[1])],remainder='passthrough')
X=ct.fit_transform(X)
X

array([[1.0, 0.0, 0.0, ..., 1, 1, 101348.88],
       [0.0, 0.0, 1.0, ..., 0, 1, 112542.58],
       [1.0, 0.0, 0.0, ..., 1, 0, 113931.57],
       ...,
       [1.0, 0.0, 0.0, ..., 0, 1, 42085.58],
       [0.0, 1.0, 0.0, ..., 1, 0, 92888.52],
       [1.0, 0.0, 0.0, ..., 1, 0, 38190.78]], dtype=object)

In [88]:
X.shape

(10000, 12)
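
The effect of one-hot encoding can be sketched without scikit-learn. This is a simplified illustration with a hypothetical helper `one_hot_encode`: each category gets its own 0/1 indicator column, in sorted category order, so the single text column is replaced by three numeric ones. With `remainder='passthrough'`, the `ColumnTransformer` places these encoded columns first, followed by the untouched columns.

```python
def one_hot_encode(values):
    """Return one indicator row per value, plus the sorted category list."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return rows, categories


# Made-up sample of the "Geography" column.
rows, categories = one_hot_encode(["France", "Spain", "Germany", "France"])
print(categories)
for row in rows:
    print(row)
```

The patterns match the first three values of each row in the array above: France is `1, 0, 0`, Germany is `0, 1, 0`, and Spain is `0, 0, 1`.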

#### Splitting Dataset into Training and Test Sets

In [89]:
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test = train_test_split(X, Y , test_size=0.2,random_state=0)

In [90]:
X_train.shape

(8000, 12)

In [91]:
y_train.shape

(8000,)

In [111]:
y_test=np.array(y_test)
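
To make the 80/20 split concrete, here is a simplified sketch of a shuffled split over row indices. It does not reproduce the exact shuffle that `train_test_split` performs; the helper `shuffle_split` and its `seed` argument are hypothetical stand-ins for the idea behind `test_size` and `random_state`.

```python
import random


def shuffle_split(n_rows, test_size=0.2, seed=0):
    """Shuffle row indices, then cut off a test portion."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # a fixed seed makes the split repeatable
    n_test = round(n_rows * test_size)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = shuffle_split(10000)
print(len(train_idx), len(test_idx))
```

Every row lands in exactly one of the two sets, and rerunning with the same seed gives the same partition.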

#### Feature Scaling

In [92]:
from sklearn.preprocessing import StandardScaler

ss=StandardScaler()

# Scale every feature column; fit on the training set only, then reuse its statistics on the test set

X_train=ss.fit_transform(X_train)
X_test=ss.transform(X_test)

In [93]:
X_train.shape

(8000, 12)

In [94]:
y_train.shape

(8000,)
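
The reason for calling `fit_transform` on the training set but only `transform` on the test set can be shown with one column. This is a plain-Python sketch with made-up values and hypothetical helpers (`fit_scaler`, `scale`); it uses the population standard deviation, the same convention as `StandardScaler`.

```python
import statistics


def fit_scaler(column):
    """Learn the mean and population standard deviation of one column."""
    return statistics.fmean(column), statistics.pstdev(column)


def scale(column, mean, std):
    """Standardise values with previously learned statistics."""
    return [(v - mean) / std for v in column]


train_ages = [20.0, 30.0, 40.0, 50.0]  # made-up training values
test_ages = [35.0, 60.0]               # made-up test values

mean, std = fit_scaler(train_ages)           # statistics come from training data only
scaled_train = scale(train_ages, mean, std)
scaled_test = scale(test_ages, mean, std)    # test data reuses the training statistics
print(scaled_train)
print(scaled_test)
```

The scaled training column has mean 0 and standard deviation 1. The test column generally does not, because no information from the test set is used to compute the statistics.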

### Part 2 - Building ANN

Initializing ANN

In [95]:
ann=tf.keras.models.Sequential()  # empty Sequential model; layers are added one at a time

Adding the input layer and first hidden Layer

In [97]:
ann.add(tf.keras.layers.Dense(units=6,activation='relu')) #relu :- rectifier activation function

Adding the second hidden layer

In [98]:
ann.add(tf.keras.layers.Dense(units=6,activation='relu')) 

Adding Output Layer

In [99]:
ann.add(tf.keras.layers.Dense(units=1,activation='sigmoid')) #units=1 for binary output
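
A quick way to check the size of this architecture is to count its trainable parameters by hand. The sketch below assumes the 12 input features produced by the preprocessing above (the model itself does not declare an input width, so Keras infers it from the data). The helper `dense_param_count` is hypothetical: a dense layer has one weight per input-unit pair plus one bias per unit.

```python
def dense_param_count(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units


layer_units = [6, 6, 1]  # two hidden layers and the output layer
n_inputs = 12            # assumed number of input features

total = 0
for units in layer_units:
    total += dense_param_count(n_inputs, units)
    n_inputs = units     # the next layer receives this layer's outputs
print(total)
```

That is 78 parameters for the first hidden layer, 42 for the second, and 7 for the output layer.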

### Part 3 - Training ANN

Compiling ANN

1. Loss Function: The loss function measures how well the model's predictions match the actual labels. It calculates the error between the predicted outputs and the actual outputs. The goal of training is to minimize this loss.

2. Optimizer: This determines how the model updates its weights during training to minimize the loss function. `adam` is an adaptive variant of stochastic gradient descent.

3. Metrics: Metrics are used to evaluate the performance of the model. Unlike the loss function, which is minimized during training, metrics provide additional insights into how well the model is performing (e.g., accuracy, precision).

In [100]:
ann.compile(optimizer='adam' , loss= 'binary_crossentropy' , metrics=['accuracy'])

#binary_crossentropy is the loss function used for binary classification problems, where the output is either 0 or 1
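
To see what this loss rewards and punishes, binary cross-entropy can be computed by hand as the mean of -(y·log(p) + (1-y)·log(1-p)). The sketch below is an illustration only, not Keras's implementation; the function name and the clipping constant `eps` are assumptions.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all observations."""
    total = 0.0
    for target, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so log(0) never occurs
        total += -(target * math.log(p) + (1 - target) * math.log(1.0 - p))
    return total / len(y_true)


confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

Confident correct predictions give a loss near 0, while confident wrong predictions are penalised heavily.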

Training ANN on the Training Set

In [101]:
ann.fit(X_train,y_train,batch_size=32,epochs=100)

Epoch 1/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 7s 1ms/step - accuracy: 0.5799 - loss: 0.6752
Epoch 2/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7916 - loss: 0.4962
Epoch 3/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7970 - loss: 0.4563
Epoch 4/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8057 - loss: 0.4360
Epoch 5/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8081 - loss: 0.4281
Epoch 6/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7989 - loss: 0.4369
Epoch 7/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8227 - loss: 0.4066
Epoch 8/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8187 - loss: 0.4201
Epoch 9/100
250/250 ...

<keras.src.callbacks.history.History at 0x1a68c4ca910>
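
The `250/250` counter in the training log counts batches, not observations. A small sketch of the arithmetic, using a hypothetical helper `steps_per_epoch`:

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


train_steps = steps_per_epoch(8000, 32)  # 8000 training rows, batch_size=32
total_updates = train_steps * 100        # one weight update per batch, 100 epochs
print(train_steps, total_updates)
```

With 8000 training rows and `batch_size=32`, each epoch performs 250 weight updates, so 100 epochs amount to 25000 updates in total.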

### Part 4 - Making the Predictions and Evaluating the Model

Predicting the result of a single observation

In [105]:
ann.predict(ss.transform([[1,0,0,600,1,40,3,60000,2,1,1,50000]]))>0.5

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 32ms/step


array([[False]])
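
The 12 numbers passed to `predict` must follow the column order produced by the preprocessing: the three one-hot Geography columns first, then the remaining features. The sketch below shows one way to build such a row from named fields. The feature names and their order are assumptions based on the usual layout of `Churn_Modelling.csv` (several columns are elided in the outputs above), and `to_feature_row` is a hypothetical helper.

```python
# Encodings as produced by the preprocessing steps above.
GEOGRAPHY_ONE_HOT = {"France": [1, 0, 0], "Germany": [0, 1, 0], "Spain": [0, 0, 1]}
GENDER_LABEL = {"Female": 0, "Male": 1}

# Assumed order of the pass-through columns (not confirmed by the source).
REMAINING_COLUMNS = ["CreditScore", "Gender", "Age", "Tenure", "Balance",
                     "NumOfProducts", "HasCrCard", "IsActiveMember",
                     "EstimatedSalary"]


def to_feature_row(customer):
    """Build a model input row: one-hot Geography first, then the rest."""
    row = list(GEOGRAPHY_ONE_HOT[customer["Geography"]])
    for name in REMAINING_COLUMNS:
        value = customer[name]
        row.append(GENDER_LABEL[value] if name == "Gender" else value)
    return row


customer = {
    "Geography": "France", "CreditScore": 600, "Gender": "Male", "Age": 40,
    "Tenure": 3, "Balance": 60000, "NumOfProducts": 2, "HasCrCard": 1,
    "IsActiveMember": 1, "EstimatedSalary": 50000,
}
row = to_feature_row(customer)
print(row)
```

The row still has to go through `ss.transform` before prediction, because the network was trained on scaled inputs.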

Predicting the Test Results

In [112]:
y_pred=ann.predict(X_test)
y_pred=(y_pred>0.5)
print(np.concatenate((y_pred.reshape(len(y_pred),1),y_test.reshape(len(y_test),1)),1))

[1m63/63[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
[[0 0]
 [0 1]
 [0 0]
 ...
 [0 0]
 [0 0]
 [0 0]]


Making the Confusion Matrix

In [113]:
from sklearn.metrics import confusion_matrix

cm=confusion_matrix(y_test,y_pred)  # rows: actual class, columns: predicted class

cm

array([[1522,   73],
       [ 214,  191]], dtype=int64)

In [114]:
from sklearn.metrics import accuracy_score
accuracy_score(y_test,y_pred)

0.8565
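
Accuracy alone hides how the errors are distributed between the two classes. As a short sketch, the counts from the confusion matrix above can be read as 1522 true negatives, 191 true positives, 73 false positives, and 214 false negatives, and turned into precision and recall for the churn class. The variable names are illustrative only.

```python
# Counts taken from the confusion matrix above.
tn, fp, fn, tp = 1522, 73, 214, 191

total = tn + fp + fn + tp
accuracy = (tp + tn) / total   # share of all predictions that are correct
precision = tp / (tp + fp)     # share of predicted churners who actually churned
recall = tp / (tp + fn)        # share of actual churners the model caught

print(round(accuracy, 4), round(precision, 4), round(recall, 4))
```

The accuracy matches the 0.8565 reported above, but recall is below 0.5: the model misses more than half of the customers who actually left.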