# Artificial Neural Network

### Importing the libraries

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
import keras.layers as layers

In [2]:
tf.__version__

'2.17.0'

## Part 1 - Data Preprocessing

### Importing the dataset

In [3]:
dataset = pd.read_csv('Churn_Modelling.csv')
# Skip the first three columns (row number, customer ID, surname): they only identify the
# customer and carry no predictive signal. The last column is the target, so it is excluded too.
X = dataset.iloc[:, 3:-1].values
# The remaining columns are either already numeric or categorical and get encoded below.
y = dataset.iloc[:, -1].values

In [4]:
print(X)

[[619 'France' 'Female' ... 1 1 101348.88]
 [608 'Spain' 'Female' ... 0 1 112542.58]
 [502 'France' 'Female' ... 1 0 113931.57]
 ...
 [709 'France' 'Female' ... 0 1 42085.58]
 [772 'Germany' 'Male' ... 1 0 92888.52]
 [792 'France' 'Female' ... 1 0 38190.78]]


In [5]:
print(y)

[1 0 1 ... 1 1 0]


### Encoding categorical data

Label Encoding the "Gender" column

In [6]:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
X[:,2]=le.fit_transform(X[:,2])

In [7]:
print(X)

[[619 'France' 0 ... 1 1 101348.88]
 [608 'Spain' 0 ... 0 1 112542.58]
 [502 'France' 0 ... 1 0 113931.57]
 ...
 [709 'France' 0 ... 0 1 42085.58]
 [772 'Germany' 1 ... 1 0 92888.52]
 [792 'France' 0 ... 1 0 38190.78]]
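
To check which integer each category received, the fitted encoder's `classes_` attribute can be inspected. The sketch below uses a made-up `genders` array in place of `X[:, 2]`; the names `genders`, `le_demo` and `mapping` are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy column standing in for X[:, 2]
genders = np.array(["Female", "Male", "Male", "Female"], dtype=object)

le_demo = LabelEncoder()
encoded = le_demo.fit_transform(genders)

# classes_ is sorted, and a label's position in it is its integer code
mapping = {str(label): code for code, label in enumerate(le_demo.classes_)}
print(encoded)
print(mapping)
```

Because the classes are sorted alphabetically, "Female" maps to 0 and "Male" to 1, which matches the encoded rows printed above.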


One Hot Encoding the "Geography" column

In [8]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
transformed=ct.fit_transform(X)
X=pd.DataFrame(transformed,columns=ct.get_feature_names_out())

In [9]:
print(X)

     encoder__x1_France encoder__x1_Germany encoder__x1_Spain remainder__x0  \
0                   1.0                 0.0               0.0           619   
1                   0.0                 0.0               1.0           608   
2                   1.0                 0.0               0.0           502   
3                   1.0                 0.0               0.0           699   
4                   0.0                 0.0               1.0           850   
...                 ...                 ...               ...           ...   
9995                1.0                 0.0               0.0           771   
9996                1.0                 0.0               0.0           516   
9997                1.0                 0.0               0.0           709   
9998                0.0                 1.0               0.0           772   
9999                1.0                 0.0               0.0           792   

     remainder__x2 remainder__x3 remainder__x4 rema

### Splitting the dataset into the Training set and Test set

In [10]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,random_state=42,test_size=0.2)

### Feature Scaling

In [11]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
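
The scaler is fitted on the training set only and then reused on the test set, so no information from the test data leaks into training. A minimal sketch with made-up numbers (the `train` and `test` arrays are hypothetical) shows that `transform` simply applies the mean and standard deviation learned during `fit`:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical values: two features, three training rows, one test row
train = np.array([[600.0, 30.0], [700.0, 40.0], [800.0, 50.0]])
test = np.array([[700.0, 60.0]])

sc_demo = StandardScaler()
train_scaled = sc_demo.fit_transform(train)  # learns mean_ and scale_ from train only
test_scaled = sc_demo.transform(test)        # reuses the training statistics

# The same result computed by hand from the stored statistics
manual = (test - sc_demo.mean_) / sc_demo.scale_
print(test_scaled)
print(manual)
```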

## Part 2 - Building the ANN

### Initializing the ANN

In [12]:
ann = keras.models.Sequential()

### Adding the input layer and the first hidden layer

In [13]:
input_shape=X.shape[1]
ann.add(keras.Input(shape=(input_shape,)))
ann.add(layers.Dense(128,activation="relu"))

### Adding the second hidden layer

In [14]:
ann.add(layers.Dense(64,activation="relu"))

### Adding the output layer

In [15]:
ann.add(layers.Dense(1,activation="sigmoid"))
ann.summary()
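
The parameter counts of the three `Dense` layers can be worked out by hand: each layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes 12 input features, which is the length of the single observation used in Part 4; `dense_params` is a helper written for this illustration, not a Keras function.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 12            # assumed: 3 one-hot Geography columns + 9 other features
layer_sizes = [128, 64, 1]  # units of the three Dense layers above

counts = []
prev = n_features
for units in layer_sizes:
    counts.append(dense_params(prev, units))
    prev = units  # the next layer's inputs are this layer's outputs

print(counts, sum(counts))  # [1664, 8256, 65] 9985
```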

## Part 3 - Training the ANN

### Compiling the ANN

In [17]:
from keras.optimizers import Adam
ann.compile(loss="binary_crossentropy",optimizer=Adam(learning_rate=3e-4),metrics=["accuracy"])
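
Binary cross-entropy penalises confident wrong probabilities much more than confident right ones. A small NumPy sketch of the formula follows; the function name and the `eps` clipping value are choices made for this example, not taken from Keras.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)]; eps avoids log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

# Same labels, once with confident correct and once with confident wrong probabilities
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```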

### Training the ANN on the Training set

In [18]:
history=ann.fit(X_train,y_train,epochs=50,verbose=2,validation_split=0.2)

Epoch 1/50
200/200 - 0s - 2ms/step - accuracy: 0.7934 - loss: 0.4838 - val_accuracy: 0.8225 - val_loss: 0.4189
Epoch 2/50
200/200 - 0s - 455us/step - accuracy: 0.8216 - loss: 0.4182 - val_accuracy: 0.8438 - val_loss: 0.3954
Epoch 3/50
200/200 - 0s - 458us/step - accuracy: 0.8384 - loss: 0.3941 - val_accuracy: 0.8519 - val_loss: 0.3791
Epoch 4/50
200/200 - 0s - 458us/step - accuracy: 0.8494 - loss: 0.3733 - val_accuracy: 0.8556 - val_loss: 0.3667
Epoch 5/50
200/200 - 0s - 465us/step - accuracy: 0.8545 - loss: 0.3584 - val_accuracy: 0.8587 - val_loss: 0.3590
Epoch 6/50
200/200 - 0s - 460us/step - accuracy: 0.8591 - loss: 0.3498 - val_accuracy: 0.8562 - val_loss: 0.3543
Epoch 7/50
200/200 - 0s - 458us/step - accuracy: 0.8631 - loss: 0.3437 - val_accuracy: 0.8575 - val_loss: 0.3524
Epoch 8/50
200/200 - 0s - 460us/step - accuracy: 0.8628 - loss: 0.3397 - val_accuracy: 0.8550 - val_loss: 0.3494
Epoch 9/50
200/200 - 0s - 459us/step - accuracy: 0.8625 - loss: 0.3364 - val_accuracy: 0.8587 - va
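
The object returned by `fit` keeps these per-epoch values in `history.history`, a dict of lists keyed by metric name. The sketch below uses the `val_loss` values of the first eight epochs, copied by hand from the log above, to find the best epoch so far and the improvement per epoch.

```python
# val_loss of epochs 1-8, copied from the log above; in the notebook the
# full list is available as history.history["val_loss"]
val_loss = [0.4189, 0.3954, 0.3791, 0.3667, 0.3590, 0.3543, 0.3524, 0.3494]

# Index of the smallest validation loss, converted to a 1-based epoch number
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Drop in validation loss from each epoch to the next
improvements = [round(prev - cur, 4) for prev, cur in zip(val_loss, val_loss[1:])]
print(best_epoch)
print(improvements)
```

The improvements shrink quickly over these epochs, which is one way to judge whether all 50 epochs are needed.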

## Part 4 - Making the predictions and evaluating the model

### Predicting the result of a single observation

**Extra**

Use our ANN model to predict whether the customer with the following information will leave the bank:

Geography: France

Credit Score: 600

Gender: Male

Age: 40 years old

Tenure: 3 years

Balance: \$ 60000

Number of Products: 2

Does this customer have a credit card? Yes

Is this customer an Active Member: Yes

Estimated Salary: \$ 50000

So, should we say goodbye to that customer?

**Solution**

In [21]:
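# One-hot Geography (France, Germany, Spain) comes first, then the remaining features in dataset order
single_customer = pd.DataFrame([[1, 0, 0, 600, 1, 40, 3, 60000, 2, 1, 1, 50000]], columns=X.columns)
# The network was trained on scaled inputs, so apply the scaler fitted on the training set
m = ann.predict(sc.transform(single_customer), verbose=0)
n = (m > 0.5).astype(int)
print(n)
# A prediction of 1 means the customer is expected to leave the bank
if n[0][0] == 1:
    print("Goodbye")
else:
    print("Welcome")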


### Predicting the Test set results

In [20]:
a=ann.predict(X_test,verbose=0)
b=(a>0.5).astype(int)
print(b)

[[0]
 [0]
 [0]
 ...
 [1]
 [0]
 [0]]


### Making the Confusion Matrix

In [22]:
from sklearn.metrics import confusion_matrix, accuracy_score
print(confusion_matrix(y_test,b))
print(f"accuracy = {accuracy_score(y_test,b)*100} %")

[[1523   84]
 [ 200  193]]
accuracy = 85.8 %


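Accuracy alone is misleading here because the classes are imbalanced: 1607 of the 2000 test customers (about 80%) stayed, so a model that always predicts "stays" would already reach about 80% accuracy without finding a single customer who leaves. Precision, recall and F1-score look at the positive class (customers who left) instead.

Precision = TP / (TP + FP) = 193 / 277 ≈ 0.70: of the customers predicted to leave, about 70% actually left.

Recall = TP / (TP + FN) = 193 / 393 ≈ 0.49: the model catches only about half of the customers who actually left.

F1-score = 2 · Precision · Recall / (Precision + Recall) ≈ 0.58: the harmonic mean of the two, which is high only when both are high.

So despite 85.8 % accuracy, the model misses roughly half of the churners. Other metrics worth looking at are ROC AUC and the precision-recall curve (both independent of the 0.5 threshold), balanced accuracy, and the Matthews correlation coefficient.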
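
As a check, precision, recall and F1-score for the "left the bank" class can be computed directly from the confusion matrix printed above. This is a sketch with the four counts typed in by hand; in the notebook, `sklearn.metrics.classification_report(y_test, b)` should report the same per-class values.

```python
import numpy as np

# Confusion matrix from the output above: rows = true class, columns = predicted class
cm = np.array([[1523, 84],
               [200, 193]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)  # how many predicted leavers really left
recall = tp / (tp + fn)     # how many real leavers were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# accuracy=0.858 precision=0.697 recall=0.491 f1=0.576
```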