# Practicum 4 - Classification with an ANN

### Description

In this practicum, you will build an ANN model that classifies whether a customer is likely to leave your company. This phenomenon is commonly called 'churn'. A high churn rate is bad for the company.

## Attention!

This practicum uses Google's TensorFlow library, so you must install TensorFlow first.

You also need to match your TensorFlow installation to the hardware of your local machine, depending on whether computation runs on:

  * CPU

  * GPU (a GPU that supports CUDA)

  * Apple Silicon (M1/M2)



In [29]:
pip install tensorflow
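
After installing, it can be useful to confirm which of the setups listed above is actually in effect. The following is a minimal sketch, assuming TensorFlow 2.x; the helper name `tensorflow_available` is made up for illustration and is not part of TensorFlow.

```python
import importlib.util


def tensorflow_available() -> bool:
    """Return True if the tensorflow package can be found (hypothetical helper)."""
    return importlib.util.find_spec("tensorflow") is not None


if tensorflow_available():
    import tensorflow as tf

    print("TensorFlow version:", tf.__version__)
    # An empty list here means TensorFlow will run on the CPU.
    print("GPUs visible:", tf.config.list_physical_devices("GPU"))
else:
    print("TensorFlow is not installed; run the install cell first.")
```

If the GPU list is empty on a machine that has a GPU, the installed TensorFlow build probably does not match the hardware.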



## Data Preprocessing

* Step 1 - Import Libraries

In [30]:
import numpy as np
import pandas as pd
import tensorflow as tf

* Step 2 - Load the Data

In [31]:
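dataset = pd.read_csv('/content/drive/MyDrive/Machine_Learning/Jobsheet9-ANN/Churn_Modelling.csv')
print(dataset.head())
# Features: column 3 up to (not including) the last column, i.e. CreditScore ... EstimatedSalary
X = dataset.iloc[:, 3:-1].values
# Target: the last column (Exited)
y = dataset.iloc[:, -1].values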

   RowNumber  CustomerId   Surname  CreditScore Geography  Gender  Age  \
0          1    15634602  Hargrave          619    France  Female   42   
1          2    15647311      Hill          608     Spain  Female   41   
2          3    15619304      Onio          502    France  Female   42   
3          4    15701354      Boni          699    France  Female   39   
4          5    15737888  Mitchell          850     Spain  Female   43   

   Tenure    Balance  NumOfProducts  HasCrCard  IsActiveMember  \
0       2       0.00              1          1               1   
1       1   83807.86              1          0               1   
2       8  159660.80              3          1               0   
3       1       0.00              2          0               0   
4       2  125510.82              1          1               1   

   EstimatedSalary  Exited  
0        101348.88       1  
1        112542.58       0  
2        113931.57       1  
3         93826.63       0  
4         790

In [32]:
print("X: ", X)
print("y: ", y)

X:  [[619 'France' 'Female' ... 1 1 101348.88]
 [608 'Spain' 'Female' ... 0 1 112542.58]
 [502 'France' 'Female' ... 1 0 113931.57]
 ...
 [709 'France' 'Female' ... 0 1 42085.58]
 [772 'Germany' 'Male' ... 1 0 92888.52]
 [792 'France' 'Female' ... 1 0 38190.78]]
y:  [1 0 1 ... 1 1 0]
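
The slices `3:-1` and `-1` used above decide which columns become features and which becomes the target. As a small illustration, the same slices can be applied to the column names shown in the `head()` output (a plain-Python sketch, no pandas needed):

```python
# Column names as printed by dataset.head() above.
columns = [
    "RowNumber", "CustomerId", "Surname", "CreditScore", "Geography",
    "Gender", "Age", "Tenure", "Balance", "NumOfProducts", "HasCrCard",
    "IsActiveMember", "EstimatedSalary", "Exited",
]

feature_columns = columns[3:-1]  # same slice as dataset.iloc[:, 3:-1]
target_column = columns[-1]      # same position as dataset.iloc[:, -1]

print(feature_columns)
print(target_column)
```

The identifier columns `RowNumber`, `CustomerId`, and `Surname` are dropped, which leaves 10 feature columns and `Exited` as the label.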


* Step 3 - Encode Categorical Data

In [33]:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
# Encode the Gender column (index 2 of X) as integers: Female -> 0, Male -> 1
X[:, 2] = le.fit_transform(X[:, 2])

In [34]:
print("X: ", X)

X:  [[619 'France' 0 ... 1 1 101348.88]
 [608 'Spain' 0 ... 0 1 112542.58]
 [502 'France' 0 ... 1 0 113931.57]
 ...
 [709 'France' 0 ... 0 1 42085.58]
 [772 'Germany' 1 ... 1 0 92888.52]
 [792 'France' 0 ... 1 0 38190.78]]
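
To make the label-encoding step less of a black box, here is a plain-Python sketch of the same idea: the distinct values are sorted and each is replaced by its position. The function name `label_encode` is hypothetical; it is not the scikit-learn implementation, only an illustration of the mapping it produces.

```python
def label_encode(values):
    """Map each distinct value to an integer, using sorted class order."""
    classes = sorted(set(values))
    mapping = {label: index for index, label in enumerate(classes)}
    return [mapping[v] for v in values], mapping


genders = ["Female", "Female", "Male", "Female"]
encoded, mapping = label_encode(genders)
print(mapping)  # {'Female': 0, 'Male': 1}
print(encoded)  # [0, 0, 1, 0]
```

This matches the output above, where `'Female'` became `0` and `'Male'` became `1`.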


* Step 4 - Encode the "Geography" Column with One-Hot Encoding

In [35]:
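from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# One-hot encode Geography (index 1); the encoded columns come first,
# and the remaining columns pass through unchanged after them
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
X = np.array(ct.fit_transform(X))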

In [36]:
print("X: ", X)

X:  [[1.0 0.0 0.0 ... 1 1 101348.88]
 [0.0 0.0 1.0 ... 0 1 112542.58]
 [1.0 0.0 0.0 ... 1 0 113931.57]
 ...
 [1.0 0.0 0.0 ... 0 1 42085.58]
 [0.0 1.0 0.0 ... 1 0 92888.52]
 [1.0 0.0 0.0 ... 1 0 38190.78]]
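
The three leading columns in the output above are the one-hot encoding of Geography. A minimal plain-Python sketch of the idea follows; the function name `one_hot` is made up for illustration, and the categories are taken in sorted order.

```python
def one_hot(values):
    """Return one indicator column per distinct value, in sorted category order."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return rows, categories


geography = ["France", "Spain", "Germany"]
rows, categories = one_hot(geography)
print(categories)  # ['France', 'Germany', 'Spain']
for name, row in zip(geography, rows):
    print(name, row)
```

With this ordering France is `[1, 0, 0]`, Germany is `[0, 1, 0]`, and Spain is `[0, 0, 1]`, which agrees with the rows printed above.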


* Step 5 - Split the Data

In [37]:
from sklearn.model_selection import train_test_split
# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
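
Conceptually, the split shuffles the row indices and sets a fraction aside for testing. The sketch below shows that idea in plain Python. The function `simple_train_test_split` is hypothetical and will not reproduce the exact rows that scikit-learn selects, since the shuffling algorithms differ.

```python
import random


def simple_train_test_split(rows, test_size=0.2, seed=0):
    """Shuffle row indices with a fixed seed and hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


rows = list(range(10))
train, test = simple_train_test_split(rows)
print(len(train), len(test))  # 8 2
```

Fixing the seed plays the same role as `random_state`: repeated runs give the same partition.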

* Step 6 - Feature Scaling

In [38]:
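from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Fit the scaler on the training data only, then reuse the same
# mean and standard deviation to scale the test data
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)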
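
Standard scaling subtracts the training mean and divides by the training standard deviation, column by column. The plain-Python sketch below shows this for a single column. The age values are illustrative only, and the helper names `fit_scaler` and `transform` are made up for this example.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Learn the mean and population standard deviation from training values."""
    return fmean(column), pstdev(column)


def transform(column, mean, std):
    """Standardize values using statistics learned elsewhere."""
    return [(x - mean) / std for x in column]


train_age = [42, 41, 42, 39, 43]  # illustrative training values
test_age = [40]                   # illustrative test value

mean, std = fit_scaler(train_age)
train_scaled = transform(train_age, mean, std)
test_scaled = transform(test_age, mean, std)  # reuses the training statistics
print(train_scaled)
print(test_scaled)
```

The test value is scaled with the training statistics rather than its own, which is why the notebook calls `fit_transform` on `X_train` but only `transform` on `X_test`.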

## Building the ANN Model

* Step 1 - Initialize the ANN Model

In [39]:
ann = tf.keras.models.Sequential()

* Step 2 - Add the Input Layer and the First Hidden Layer

In [40]:
# No input size is given here; Keras infers it the first time the model sees data
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))

* Step 3 - Add the Second Hidden Layer

In [41]:
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))

* Step 4 - Add the Output Layer

In [42]:
ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
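
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes 12 input features (3 one-hot Geography columns plus the 9 remaining features) and that every dense layer has one bias per unit; the helper `dense_params` is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


n_features = 12          # 3 one-hot Geography columns + 9 other features
layer_units = [6, 6, 1]  # two hidden layers and the output layer

per_layer = []
n_inputs = n_features
for units in layer_units:
    per_layer.append(dense_params(n_inputs, units))
    n_inputs = units  # the next layer takes this layer's outputs as inputs

print(per_layer)       # [78, 42, 7]
print(sum(per_layer))  # 127
```

These counts can be compared against `ann.summary()` once the model has been trained or otherwise built.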

## Training the Model

* Step 1 - Compile the ANN Model (Assemble the Architecture)

In [43]:
ann.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
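
The `binary_crossentropy` loss penalizes confident wrong predictions much more heavily than uncertain ones. A plain-Python sketch of the formula follows. The clipping constant `eps` is an assumption added here to avoid `log(0)`, and the function is an illustration rather than the Keras implementation.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep p away from exactly 0 or 1
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


print(binary_crossentropy([1], [0.9]))  # confident and correct: small loss
print(binary_crossentropy([1], [0.5]))  # undecided: log(2)
print(binary_crossentropy([1], [0.1]))  # confident and wrong: large loss
```

This is why the loss pairs naturally with a sigmoid output layer, whose output can be read as the probability of class 1.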

* Step 2 - Fit the Model

In [44]:
ann.fit(X_train, y_train, batch_size = 32, epochs = 100)

Epoch 1/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.7597 - loss: 0.5825
Epoch 2/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.7966 - loss: 0.4728
Epoch 3/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.7960 - loss: 0.4447
Epoch 4/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.7933 - loss: 0.4445
Epoch 5/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 9ms/step - accuracy: 0.7950 - loss: 0.4375
Epoch 6/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 4s 15ms/step - accuracy: 0.8026 - loss: 0.4209
Epoch 7/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - accuracy: 0.8003 - loss: 0.4295
Epoch 8/100
250/250 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.7980 - loss: 0.4301
Epoch 9/100
250/250

<keras.src.callbacks.history.History at 0x7dc3692f0eb0>
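
The `250/250` counter in the training log is the number of batches per epoch. The arithmetic can be checked as follows, assuming 8000 training rows; that figure is inferred from the 20% test split together with the 2000 test rows that appear in the confusion matrix at the end of this notebook.

```python
import math

n_test = 1523 + 72 + 199 + 206  # total of the confusion matrix entries
n_train = n_test * 4            # an 80/20 split gives four times as many training rows
batch_size = 32

steps_per_epoch = math.ceil(n_train / batch_size)
predict_steps = math.ceil(n_test / batch_size)

print(n_train, steps_per_epoch)  # 8000 250
print(n_test, predict_steps)     # 2000 63
```

The second figure matches the `63/63` counter shown when predicting on the test set, since `predict` also uses a batch size of 32 unless told otherwise.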

## Making a Prediction

Given the following customer information:

  * Geography: France

  * Credit Score: 600

  * Gender: Male

  * Age: 40 years old

  * Tenure: 3 years

  * Balance: $ 60000

  * Number of Products: 2

  * Does this customer have a credit card? Yes

  * Is this customer an Active Member: Yes

  * Estimated Salary: $ 50000

Does the company need to take action to retain this customer?

## Model the New Data and Make a Prediction



In [45]:
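# predict expects a 2D array, hence the double brackets; the new row is
# scaled with the scaler that was fitted on the training data
print(ann.predict(sc.transform([[1, 0, 0, 600, 1, 40, 3, 60000, 2, 1, 1, 50000]])) > 0.5)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 125ms/step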
[[False]]


* Is the result False?
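
The 12 numbers passed to `predict` must follow the same column order as the preprocessed training data: the one-hot Geography columns first, then the remaining features in their original order. The sketch below builds that row from named fields. The helper `build_row` and the lookup tables are made up for illustration, using the encodings seen in the preprocessing output (France/Germany/Spain and Female = 0, Male = 1).

```python
GEOGRAPHIES = ["France", "Germany", "Spain"]  # one-hot column order
GENDERS = {"Female": 0, "Male": 1}            # label-encoded values


def build_row(customer):
    """Arrange a customer's fields in the order the model was trained on."""
    geo = [1 if customer["Geography"] == g else 0 for g in GEOGRAPHIES]
    return geo + [
        customer["CreditScore"],
        GENDERS[customer["Gender"]],
        customer["Age"],
        customer["Tenure"],
        customer["Balance"],
        customer["NumOfProducts"],
        customer["HasCrCard"],
        customer["IsActiveMember"],
        customer["EstimatedSalary"],
    ]


customer = {
    "Geography": "France", "CreditScore": 600, "Gender": "Male", "Age": 40,
    "Tenure": 3, "Balance": 60000, "NumOfProducts": 2, "HasCrCard": 1,
    "IsActiveMember": 1, "EstimatedSalary": 50000,
}
row = build_row(customer)
print(row)  # [1, 0, 0, 600, 1, 40, 3, 60000, 2, 1, 1, 50000]
```

A result of `False` means the predicted probability of leaving is at most 0.5, so the model does not flag this customer as likely to churn.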

## Predicting on the Test Data

In [46]:
y_pred = ann.predict(X_test)
# Turn predicted probabilities into boolean class labels
y_pred = (y_pred > 0.5)
# Print predictions (left column) next to the true labels (right column)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
[[0 0]
 [0 1]
 [0 0]
 ...
 [0 0]
 [0 0]
 [0 0]]
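
The comparison `y_pred > 0.5` is what converts the sigmoid outputs into class labels. A tiny plain-Python sketch of that thresholding step follows. The probabilities are made up, and `to_labels` is a hypothetical helper.

```python
def to_labels(probabilities, threshold=0.5):
    """Label a sample 1 only when its probability is strictly above the threshold."""
    return [int(p > threshold) for p in probabilities]


probs = [0.12, 0.48, 0.50, 0.91]  # made-up sigmoid outputs
labels = to_labels(probs)
print(labels)  # [0, 0, 0, 1]
```

Lowering the threshold would flag more customers as likely to churn, trading more false alarms for fewer missed churners.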


## Check the Accuracy and Confusion Matrix

In [47]:
from sklearn.metrics import confusion_matrix, accuracy_score
# Rows are the true classes, columns are the predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[1523   72]
 [ 199  206]]


0.8645
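
Accuracy alone can hide how the model treats the minority class. The sketch below recomputes the accuracy from the confusion matrix printed above and also derives precision and recall for the churn class, reading the matrix with rows as true classes and columns as predicted classes.

```python
cm = [[1523, 72],
      [199, 206]]

tn, fp = cm[0]  # true class 0: predicted 0, predicted 1
fn, tp = cm[1]  # true class 1: predicted 0, predicted 1

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision = tp / (tp + fp)  # of customers flagged as churn, how many left
recall = tp / (tp + fn)     # of customers who left, how many were flagged

print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.4f}")
print(f"recall:    {recall:.4f}")
```

The accuracy agrees with the 0.8645 reported above, but the recall shows that only about half of the customers who actually left were caught by the model.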