##**Overfitting**



Definition: a phenomenon in which a model fits the training data so closely that it loses the ability to make accurate predictions on new data (the test data).<br>
<br>

Example: training results are good, but test results are poor.

Key characteristics of overfitting: <br>
1. Performance is high on the training data but low on the test data
2. The model is overly complex
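
The first characteristic can be checked numerically by comparing accuracy on the two splits. The following is a minimal sketch in plain Python; the helper names `accuracy` and `generalization_gap` and the toy labels are hypothetical and do not come from this notebook.

```python
def accuracy(predicted_labels, true_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


def generalization_gap(train_accuracy, test_accuracy):
    """Return training accuracy minus test accuracy; a large positive
    value is a typical symptom of overfitting."""
    return train_accuracy - test_accuracy


# Hypothetical toy labels, for illustration only.
train_true = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
train_pred = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # every training example is correct
test_true = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
test_pred = [0, 1, 2, 3, 4, 5, 6, 6, 6, 6]   # the last three are wrong

train_acc = accuracy(train_pred, train_true)
test_acc = accuracy(test_pred, test_true)
gap = generalization_gap(train_acc, test_acc)

print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
print(f"gap:            {gap:.2f}")
```

How large a gap counts as overfitting depends on the task, so the number is best read as a warning sign rather than a fixed threshold.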


Causes of overfitting: <br>
1. Too little training data
2. The data is not diverse enough
3. The model is too complex
4. Training runs for too long
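
One common response to the last cause is to stop training once the loss on held-out validation data stops improving. The sketch below illustrates that idea in plain Python; the function name `early_stopping_epoch`, the `patience` default, and the loss values are assumptions made for illustration. Keras offers a built-in equivalent in `tf.keras.callbacks.EarlyStopping`, which this notebook does not use.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch) as 0-based indices.

    Training stops at the first epoch where the validation loss has
    failed to improve for `patience` consecutive epochs. If that never
    happens, stop_epoch is the last epoch.
    """
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: they improve, then start rising.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=2)
print(f"stop at epoch {stop_epoch}, best epoch was {best_epoch}")
```

In practice the weights from the best epoch are usually restored after stopping, since the final epochs are the ones where the model started to fit noise.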


Ways to mitigate overfitting: <br>
1. Provide more training data
2. Use Dropout
3. Use cross-validation
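
Cross-validation splits the data into k folds and lets each fold serve once as validation data while the others are used for training. The following is a minimal sketch of how the index splits could be generated in plain Python; the function name `k_fold_indices` is hypothetical, and the sketch does not shuffle, which real data would normally need first.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, val_indices) pairs, one per fold.

    Folds are contiguous and differ in size by at most one sample.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes stay balanced.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, val_idx))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=3)
for fold_number, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {fold_number}: train={train_idx} val={val_idx}")
```

A model would be trained from scratch on each `train_idx` and scored on the matching `val_idx`; averaging the k scores gives a steadier estimate of performance on unseen data than a single split.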

In [85]:
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

In [86]:
print("Train image shape", train_images.shape)
print("Train label shape", train_labels.shape)

print("Test image shape", test_images.shape)
print("Test label shape", test_labels.shape)

Train image shape (60000, 28, 28)
Train label shape (60000,)
Test image shape (10000, 28, 28)
Test label shape (10000,)


In [87]:
print(set(train_labels))

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}


In [88]:
train_images = train_images.reshape((60000,28*28))
train_images = train_images.astype("float32") / 255

test_images = test_images.reshape((10000,28*28))
test_images = test_images.astype("float32") / 255

In [89]:
# Model architecture

import tensorflow as tf
import numpy as np

model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation = 'relu'),
    tf.keras.layers.Dense(10, activation = 'softmax')  # one probability per class, summing to 1
])

In [90]:
model.compile(optimizer ='rmsprop', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

In [94]:
sampel_train_images = train_images[:100]
sampel_train_labels = train_labels[:100]

In [95]:
sampel_train_images.shape

(100, 784)

In [96]:
model.fit(sampel_train_images, sampel_train_labels, epochs=10, batch_size=128)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7f33b3b06650>

In [97]:
prediksi = model(test_images)
prediksi = prediksi.numpy()  # convert the tensor to a NumPy array
prediksi_label = np.argmax(prediksi, axis=1)  # predicted class = highest score
prediksi_betul = prediksi_label == test_labels
print(f"Test accuracy: {prediksi_betul.mean():.2f}")


Test accuracy: 0.64


##Building a Deeper Model

In [98]:
# Model architecture

import tensorflow as tf
import numpy as np

model_kedua = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation = 'relu'),
    tf.keras.layers.Dropout(0.5),  # drop 50% of the units during training
    tf.keras.layers.Dense(256, activation = 'relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dense(64, activation = 'relu'),
    tf.keras.layers.Dense(32, activation = 'relu'),
    tf.keras.layers.Dense(10, activation = 'softmax')  # one probability per class, summing to 1
])

In [99]:
model_kedua.compile(optimizer ='rmsprop', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

In [100]:
model_kedua.fit(train_images,train_labels, epochs=10, batch_size=256)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7f33b3babd90>

In [101]:
model_kedua.summary()

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_24 (Dense)            (None, 512)               401920    
                                                                 
 dropout_2 (Dropout)         (None, 512)               0         
                                                                 
 dense_25 (Dense)            (None, 256)               131328    
                                                                 
 dropout_3 (Dropout)         (None, 256)               0         
                                                                 
 dense_26 (Dense)            (None, 128)               32896     
                                                                 
 dense_27 (Dense)            (None, 64)                8256      
                                                                 
 dense_28 (Dense)            (None, 32)               

In [106]:
prediksi = model_kedua(test_images)
prediksi = prediksi.numpy()  # convert the tensor to a NumPy array
prediksi_label = np.argmax(prediksi, axis=1)  # predicted class = highest score
prediksi_betul = prediksi_label == test_labels
print(f"Test accuracy: {prediksi_betul.mean():.2f}")


Test accuracy: 0.87


### => The purpose of ***Dropout*** is to prevent overfitting
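
To make the mechanism concrete, here is a minimal sketch of "inverted" dropout in plain Python: during training each value is zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)`, while at inference the input passes through unchanged. The function name `dropout` and the sample activations are hypothetical; this illustrates the idea and is not the Keras implementation.

```python
import random


def dropout(values, rate, training=True, rng=None):
    """Apply inverted dropout to a list of floats.

    rate     -- probability of zeroing each value, in [0, 1)
    training -- when False, the values are returned unchanged
    rng      -- optional random.Random instance for reproducibility
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - rate)  # keeps the expected sum unchanged
    return [v * keep_scale if rng.random() >= rate else 0.0 for v in values]


activations = [1.0, 2.0, 3.0, 4.0]  # hypothetical layer outputs
train_out = dropout(activations, rate=0.5, rng=random.Random(0))
eval_out = dropout(activations, rate=0.5, training=False)

print("training: ", train_out)
print("inference:", eval_out)
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit, which tends to discourage memorizing the training set.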

##Pretrained Model

In [107]:
model_kedua.save('model_Tia_Agustiani.h5')