# Studi Kasus Backpropagation (Advanced Model Tuning)

Anda sedang membuat sistem untuk memprediksi gaji seseorang menggunakan backpropagation. Saat anda mencoba mengembangkan model, model anda belum dapat mencapai akurasi yang optimal, oleh karena itu anda berniat mencoba melakukan tuning model lebih mendalam dengan menentukan fungsi aktivasi yang digunakan.

# Data Loading & Inspection

In [84]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
np.set_printoptions(threshold=np.inf)
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler


In [85]:
df = pd.read_csv('adult.csv')

df.head()

Unnamed: 0,age,workclass,fnlwgt,education,education.num,marital.status,occupation,relationship,race,sex,capital.gain,capital.loss,hours.per.week,native.country,income
0,90,?,77053,HS-grad,9,Widowed,?,Not-in-family,White,Female,0,4356,40,United-States,<=50K
1,82,Private,132870,HS-grad,9,Widowed,Exec-managerial,Not-in-family,White,Female,0,4356,18,United-States,<=50K
2,66,?,186061,Some-college,10,Widowed,?,Unmarried,Black,Female,0,4356,40,United-States,<=50K
3,54,Private,140359,7th-8th,4,Divorced,Machine-op-inspct,Unmarried,White,Female,0,3900,40,United-States,<=50K
4,41,Private,264663,Some-college,10,Separated,Prof-specialty,Own-child,White,Female,0,3900,40,United-States,<=50K


In [86]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32561 entries, 0 to 32560
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   age             32561 non-null  int64 
 1   workclass       32561 non-null  object
 2   fnlwgt          32561 non-null  int64 
 3   education       32561 non-null  object
 4   education.num   32561 non-null  int64 
 5   marital.status  32561 non-null  object
 6   occupation      32561 non-null  object
 7   relationship    32561 non-null  object
 8   race            32561 non-null  object
 9   sex             32561 non-null  object
 10  capital.gain    32561 non-null  int64 
 11  capital.loss    32561 non-null  int64 
 12  hours.per.week  32561 non-null  int64 
 13  native.country  32561 non-null  object
 14  income          32561 non-null  object
dtypes: int64(6), object(9)
memory usage: 3.7+ MB


In [87]:
for i in df.columns:
  print(i, df[i].unique())

age [90 82 66 54 41 34 38 74 68 45 52 32 51 46 57 22 37 29 61 21 33 49 23 59
 60 63 53 44 43 71 48 73 67 40 50 42 39 55 47 31 58 62 36 72 78 83 26 70
 27 35 81 65 25 28 56 69 20 30 24 64 75 19 77 80 18 17 76 79 88 84 85 86
 87]
workclass ['?' 'Private' 'State-gov' 'Federal-gov' 'Self-emp-not-inc' 'Self-emp-inc'
 'Local-gov' 'Without-pay' 'Never-worked']
fnlwgt [  77053  132870  186061  140359  264663  216864  150601   88638  422013
   70037  172274  164526  129177  136204  172175   45363  172822  317847
  119592  203034  188774   77009   29059  153870  135285   34310  228696
  122066  107164  175360   44064  107287  198863  123011  205246   39181
  149650  197163  137527  161691  326232  115806  115066  289669  100820
  121253  110380  233882  192052  174995  335549  237729   68898  107276
  141584  207668  313243  147372  237608  194901  155106  121441  162028
  160724  132222  226355  329980  124137  187702  199029  145290  297248
  227856  179731  154374   27187  326857  160369  396

# Data Pre-Processing

Data yang anda gunakan merupakan gabungan dari data numerik dan kategorikal. Lakukan encoding sesuai dengan tipe data. (Jika perlu konteks lebih lanjut dapat mencari informasi mengenai adult dataset)

In [None]:
df['income'] = df['income'].map({'<=50K': 0, '>50K': 1}) 
df['sex'] = df['sex'].map({'Male': 1, 'Female': 0})  


label_encoder = LabelEncoder()
for col in ['workclass', 'education', 'marital.status', 'occupation', 'relationship', 'race', 'native.country']:
    df[col] = label_encoder.fit_transform(df[col])

# Cek hasil akhir
print(df.head())

   age  workclass  fnlwgt  education  education.num  marital.status  \
0   90          0   77053         11              9               6   
1   82          4  132870         11              9               6   
2   66          0  186061         15             10               6   
3   54          4  140359          5              4               0   
4   41          4  264663         15             10               5   

   occupation  relationship  race  sex  capital.gain  capital.loss  \
0           0             1     4    0             0          4356   
1           4             1     4    0             0          4356   
2           0             4     2    0             0          4356   
3           7             4     4    0             0          3900   
4          10             3     4    0             0          3900   

   hours.per.week  native.country  income  
0              40              39       0  
1              18              39       0  
2              40   

In [89]:
df.head(5)

Unnamed: 0,age,workclass,fnlwgt,education,education.num,marital.status,occupation,relationship,race,sex,capital.gain,capital.loss,hours.per.week,native.country,income
0,90,0,77053,11,9,6,0,1,4,0,0,4356,40,39,0
1,82,4,132870,11,9,6,4,1,4,0,0,4356,18,39,0
2,66,0,186061,15,10,6,0,4,2,0,0,4356,40,39,0
3,54,4,140359,5,4,0,7,4,4,0,0,3900,40,39,0
4,41,4,264663,15,10,5,10,3,4,0,0,3900,40,39,0


In [90]:
for i in df.columns:
  print(i, df[i].unique())

age [90 82 66 54 41 34 38 74 68 45 52 32 51 46 57 22 37 29 61 21 33 49 23 59
 60 63 53 44 43 71 48 73 67 40 50 42 39 55 47 31 58 62 36 72 78 83 26 70
 27 35 81 65 25 28 56 69 20 30 24 64 75 19 77 80 18 17 76 79 88 84 85 86
 87]
workclass [0 4 7 1 6 5 2 8 3]
fnlwgt [  77053  132870  186061  140359  264663  216864  150601   88638  422013
   70037  172274  164526  129177  136204  172175   45363  172822  317847
  119592  203034  188774   77009   29059  153870  135285   34310  228696
  122066  107164  175360   44064  107287  198863  123011  205246   39181
  149650  197163  137527  161691  326232  115806  115066  289669  100820
  121253  110380  233882  192052  174995  335549  237729   68898  107276
  141584  207668  313243  147372  237608  194901  155106  121441  162028
  160724  132222  226355  329980  124137  187702  199029  145290  297248
  227856  179731  154374   27187  326857  160369  396745  151089  336188
  279015   43221   30529  201742  218490  156996  298449  191712  198654
  102

In [91]:
from sklearn.model_selection import train_test_split

X = df.drop('income', axis = 1).to_numpy()
y = df['income'].to_numpy()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=101)

scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Implementasi MLP

Dari data diatas, buatlah sebuah model MLP untuk mengklasifikasikan dataset Adult. Ujikan setiap fungsi aktivasi yang diberikan. Untuk pengujian MLP dapat menggunakan dari studi kasus sebelumnya.

In [92]:
# Sigmoid
def sig(x):
    return 1 / (1 + np.exp(-x))

def sigd(x):
    s = sig(x)
    return s * (1 - s)

# Rectified Linear Unit
def relu(x):
    return np.maximum(0, x)

def relud(x):
    return np.where(x > 0, 1, 0)

# Hyperbolic Tangent
def tanh(x):
    return np.tanh(x)

def tanhd(x):
    return 1 - np.tanh(x)**2

In [None]:
def bp_fit(X, target, layer_conf, max_epoch, max_error=0.1, learn_rate=0.1, print_per_epoch=100, activation='sigmoid'):

    if activation == 'sigmoid':
        act, act_deriv = sig, sigd
    elif activation == 'relu':
        act, act_deriv = relu, relud
    elif activation == 'tanh':
        act, act_deriv = tanh, tanhd
    else:
        raise ValueError("Unknown activation function. Choose from 'sigmoid', 'relu', or 'tanh'.")

    np.random.seed(1)
    nin = [np.empty(i) for i in layer_conf]
    n = [np.empty(j + 1) if i < len(layer_conf) - 1 else np.empty(j) for i, j in enumerate(layer_conf)]
    w = [np.random.rand(layer_conf[i] + 1, layer_conf[i + 1]) for i in range(len(layer_conf) - 1)]
    dw = [np.empty((layer_conf[i] + 1, layer_conf[i + 1])) for i in range(len(layer_conf) - 1)]
    d = [np.empty(s) for s in layer_conf[1:]]
    din = [np.empty(s) for s in layer_conf[1:-1]]
    epoch = 0
    mse = 1

    
    for i in range(0, len(n) - 1):
        n[i][-1] = 1

    while (max_epoch == -1 or epoch < max_epoch) and mse > max_error:
        epoch += 1
        mse = 0

        for r in range(len(X)):

            n[0][:-1] = X[r]

            for L in range(1, len(layer_conf)):
                nin[L] = np.dot(n[L-1], w[L-1])
                n[L][:len(nin[L])] = act(nin[L]) 

      
            e = target[r] - n[-1]
            mse += sum(e ** 2)
            d[-1] = e * act_deriv(nin[-1]) 
            dw[-1] = learn_rate * d[-1] * n[-2].reshape((-1, 1))

            for L in range(len(layer_conf) - 1, 1, -1):
                din[L-2] = np.dot(d[L-1], np.transpose(w[L-1][:-1]))
                d[L-2] = din[L-2] * act_deriv(nin[L-1])
                dw[L-2] = learn_rate * d[L-2] * n[L-2].reshape((-1, 1))

            for i in range(len(w)):
                w[i] += dw[i]

        mse /= len(X)

        if print_per_epoch > -1 and epoch % print_per_epoch == 0:
            print(f'Epoch: {epoch}, MSE: {mse}')

    return w, epoch, mse

In [94]:
def bp_predict(X, w):
  n = [np.empty(len(i)) for i in w]
  nin = [np.empty(len(i[0])) for i in w]
  predict = []

  n.append(np.empty(len(w[-1][0])))

  for x in X:
    n[0][:-1] = x

    for L in range(0, len(w)):
      nin[L] = np.dot(n[L], w[L])
      n[L + 1][:len(nin[L])] = sig(nin[L])

    predict.append((n[-1]).copy())

  return predict

In [95]:
w, ep, mse = bp_fit(X_train, y_train, layer_conf = (14, 25, 2), learn_rate = .1, max_epoch = 100, max_error = .1, print_per_epoch = 25, activation= 'sigmoid')
print(f'Epoch: {ep}, MSE: {mse}')

predict = bp_predict(X_test, w)
predict_binary = [1 if p[1] > 0.5 else 0 for p in predict]
accuracy = accuracy_score(y_test, predict_binary)

print(f'Akurasi:', accuracy)

Epoch: 25, MSE: 0.222068612729628
Epoch: 50, MSE: 0.21417708031242533
Epoch: 75, MSE: 0.21039142518180984
Epoch: 100, MSE: 0.20794000339962126
Epoch: 100, MSE: 0.20794000339962126
Akurasi: 0.8453861507753724


In [96]:
w, ep, mse = bp_fit(X_train, y_train, layer_conf = (14, 25, 2), learn_rate = .1, max_epoch = 100, max_error = .1, print_per_epoch = 25, activation= 'relu')
print(f'Epoch: {ep}, MSE: {mse}')

predict = bp_predict(X_test, w)
predict_binary = [1 if p[1] > 0.5 else 0 for p in predict]
accuracy = accuracy_score(y_test, predict_binary)

print(f'Akurasi:', accuracy)

Epoch: 25, MSE: 0.4804207616707617
Epoch: 50, MSE: 0.4804207616707617
Epoch: 75, MSE: 0.4804207616707617
Epoch: 100, MSE: 0.4804207616707617
Epoch: 100, MSE: 0.4804207616707617
Akurasi: 0.7567941040994933


In [97]:
w, ep, mse = bp_fit(X_train, y_train, layer_conf = (14, 25, 2), learn_rate = .1, max_epoch = 100, max_error = .1, print_per_epoch = 25, activation= 'tanh')
print(f'Epoch: {ep}, MSE: {mse}')

predict = bp_predict(X_test, w)
predict_binary = [1 if p[1] > 0.5 else 0 for p in predict]
accuracy = accuracy_score(y_test, predict_binary)

print(f'Akurasi:', accuracy)

Epoch: 25, MSE: 1.519579238061817
Epoch: 50, MSE: 1.519579238061662
Epoch: 75, MSE: 1.5195792380615103
Epoch: 100, MSE: 1.5195792380613573
Epoch: 100, MSE: 1.5195792380613573
Akurasi: 0.24320589590050667


## Analisis

1. Jelaskan apa saja jenis encoding yang anda lakukan pada setiap fitur pada dataset!
2. Apa peran dari fungsi aktivasi dan turunannya dalam pelatihan Backpropagation?
3. Dari ketiga fungsi aktivasi yang digunakan mana yang mendapatkan hasil terbaik? (Ujikan model dalam hyperparameter yang berbeda-beda)
4. Dari ketiga fungsi aktivasi yang diberikan, jelaskan secara singkat dan berikan kelebihan dan kekurangannya! (Jika ada)

JAWABAN
1. Untuk kelas target 'income' dan kelas 'sex', saya menggunakan binary encoding karena hanya ada 2 kategori lalu sisanya saya menggunakan label encoding.
2. Fungsi aktivasi digunakan untuk membuat keputusan pada neuron output. Lebih jelasnya lagi fungsi aktivasi memperkenalkan non-linearitas dalam jaringan, memungkinkan jaringan untuk mempelajari hubungan yang kompleks dan tidak linear antar data. Sedangkan untuk fungsi turunan digunakan untuk menghitung gradien error terhadap bobot, membantu menentukan seberapa besar perubahan bobot yang diperlukan untuk mengurangi error.
3. Fungsi aktivasi yang menghasilkan hasil terbaik adalah fungsi Sigmoid dan untuk Hyperparameter, tidak ada perubahan yang signifikan meskipun menggunakan hyperparameter yang berbeda-beda
4.
Sigmoid: 
Kelebihan: Cocok untuk output biner dan stabil dalam propagasi gradien.
Kekurangan: Rentan terhadap vanishing gradient dan overflow pada input ekstrem.
Hasil: MSE menurun stabil dengan akurasi mencapai 84.5%, menunjukkan performa terbaik.

ReLu:
Kelebihan: Menghindari vanishing gradient pada nilai positif dan sederhana untuk dihitung.
Kekurangan: Menghasilkan dead neurons untuk input negatif, dan sensitif terhadap pengaturan bobot.
Hasil: MSE tetap tinggi dengan akurasi 75.7%, mengindikasikan performa lebih rendah dibandingkan sigmoid.

Tanh:
Kelebihan: Output terpusat di sekitar nol (-1, 1), yang bisa stabil dalam beberapa kasus.
Kekurangan: Rentan terhadap vanishing gradient dan lebih lambat dibanding ReLU.
Hasil: MSE tinggi dengan akurasi 24.3%, menunjukkan performa yang sangat buruk.
