cnn-trash-classifier

In [1]:
pip install tensorflow numpy pandas matplotlib scikit-learn huggingface-hub datasets

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


Load dataset from https://huggingface.co/datasets/garythung/trashnet

In [1]:
from datasets import load_dataset

dataset = load_dataset("garythung/trashnet")



In [2]:
import os
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
from tensorflow.keras.optimizers import Nadam
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

Function to prepare the data for training by organizing it into directories

In [8]:
def prepare_data(dataset):
    """Write every image to ./trashnet_data/<split>/<label>/<index>.jpg."""
    base_path = "./trashnet_data"
    os.makedirs(base_path, exist_ok=True)

    for split in dataset.keys():
        split_path = os.path.join(base_path, split)
        os.makedirs(split_path, exist_ok=True)

        for i, data in enumerate(dataset[split]):
            label = data['label']
            image = data['image']
            # One sub-directory per class label, as flow_from_directory expects.
            label_path = os.path.join(split_path, str(label))
            os.makedirs(label_path, exist_ok=True)

            image.save(os.path.join(label_path, f"{i}.jpg"))

prepare_data(dataset)
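
A quick sanity check after writing the files is to count how many images landed in each class folder. The helper below is a minimal sketch, not part of the original notebook; the name `count_images_per_class` is hypothetical, and it assumes the `./trashnet_data/<split>/<label>/` layout produced by `prepare_data`.

```python
import os

def count_images_per_class(split_dir):
    """Return {class_folder_name: number_of_files} for one split directory."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if os.path.isdir(class_dir):
            counts[class_name] = len(os.listdir(class_dir))
    return counts

# Only runs if prepare_data(dataset) has already created the directory.
train_dir = "./trashnet_data/train"
if os.path.isdir(train_dir):
    print(count_images_per_class(train_dir))
```

A strongly uneven count would suggest looking at per-class results later rather than accuracy alone.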

Set up data augmentation and preprocessing, then create the training and validation data generators

Little augmentation or preprocessing is applied in this section apart from rotation_range.
The data itself is already fairly clean; the main variation is the angle at which the photos were taken.
The remaining values were mostly chosen by trial and error.

In [37]:
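image_size = 128   # images are resized to image_size x image_size
batch_size = 32
epoch = 20         # number of training epochs
lr = 0.001         # learning rate passed to the optimizer

# Rescale pixels to [0, 1], apply light augmentation, and hold out 20% for validation.
train_datagen = ImageDataGenerator(
    rescale=1.0/255,
    rotation_range=45,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    validation_split=0.2
)

# Training subset (about 80% of ./trashnet_data/train).
train_generator = train_datagen.flow_from_directory(
    './trashnet_data/train',
    target_size=(image_size, image_size),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training'
)

# Validation subset (the remaining 20%). It reuses train_datagen, so these
# images are randomly augmented as well.
val_generator = train_datagen.flow_from_directory(
    './trashnet_data/train',
    target_size=(image_size, image_size),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation'
)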

# for i in range(5):
#     images, labels = next(train_generator)
#     plt.subplot(1, 5, i+1)
#     plt.imshow(images[0])
#     plt.axis('off')
# plt.show()


Found 4046 images belonging to 6 classes.
Found 1008 images belonging to 6 classes.
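
The image counts above determine how many batches each generator yields per epoch. As a small worked example (the helper name `steps_per_epoch` is only illustrative), dividing each count by the batch size of 32 and rounding up gives the step totals that appear in the training and evaluation progress bars:

```python
import math

def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)

train_steps = steps_per_epoch(4046, 32)
val_steps = steps_per_epoch(1008, 32)
print(train_steps, val_steps)  # 127 32
```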


Define the CNN model architecture (layers)

The convolutional layers go up to 256 filters so that the network can capture more features from the images.

In [38]:
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(image_size, image_size, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(256, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(train_generator.num_classes, activation='softmax') 
])
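
To see what this stack does to the tensor shapes, the arithmetic can be traced by hand. The sketch below is plain Python and does not call Keras; `trace_cnn` is a hypothetical helper, and it assumes the Keras defaults used above (3x3 convolutions with 'valid' padding and stride 1, 2x2 max pooling) and the 6 classes reported by the generators.

```python
def trace_cnn(image_size=128, channels=3, filters=(32, 64, 128, 256),
              dense_units=128, num_classes=6):
    """Return (final feature-map size, flattened length, trainable parameters)."""
    size, in_ch, params = image_size, channels, 0
    for out_ch in filters:
        params += (3 * 3 * in_ch + 1) * out_ch  # 3x3 kernel weights plus one bias per filter
        size = size - 2                         # 'valid' 3x3 convolution trims 2 pixels
        size = size // 2                        # 2x2 max pooling halves the size (floor)
        in_ch = out_ch
    flat = size * size * in_ch                  # length of the Flatten output
    params += (flat + 1) * dense_units          # Dense(128)
    params += (dense_units + 1) * num_classes   # Dense(num_classes) softmax layer
    return size, flat, params

print(trace_cnn())  # (6, 9216, 1568966)
```

Under these assumptions the spatial size goes 128, 63, 30, 14, 6, and most of the parameters sit in the first dense layer rather than in the convolutions. `model.summary()` can be used to compare against what Keras reports.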


Compile the model with the Nadam optimizer and categorical cross-entropy loss

The optimizer was also chosen by trial and error among several optimizers suited to CNN image classification.
Using Nadam gave better performance than Adam.

In [39]:
model.compile(
    optimizer=Nadam(learning_rate=lr),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
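
For reference when reading the loss values, it helps to know what loss a model that guesses uniformly would get. The following is a minimal plain-Python sketch of categorical cross-entropy for a single sample, not the Keras implementation; the 6 classes match the generator output above.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

num_classes = 6
one_hot = [1.0] + [0.0] * (num_classes - 1)    # true class is index 0
uniform = [1.0 / num_classes] * num_classes    # model with no preference
chance_loss = categorical_crossentropy(one_hot, uniform)
print(round(chance_loss, 4))  # 1.7918
```

This value is ln(6), so a training loss close to 1.79 means the model is barely better than guessing.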

Train the model

In [40]:
history = model.fit(
    train_generator,
    epochs=epoch,
    validation_data=val_generator
)

Epoch 1/20
127/127 ━━━━━━━━━━━━━━━━━━━━ 108s 839ms/step - accuracy: 0.2746 - loss: 1.6797 - val_accuracy: 0.4008 - val_loss: 1.3958
Epoch 2/20
127/127 ━━━━━━━━━━━━━━━━━━━━ 104s 823ms/step - accuracy: 0.4127 - loss: 1.4088 - val_accuracy: 0.4702 - val_loss: 1.3577
Epoch 3/20
127/127 ━━━━━━━━━━━━━━━━━━━━ 105s 825ms/step - accuracy: 0.4857 - loss: 1.2974 - val_accuracy: 0.5337 - val_loss: 1.1561
Epoch 4/20
127/127 ━━━━━━━━━━━━━━━━━━━━ 105s 828ms/step - accuracy: 0.5522 - loss: 1.1808 - val_accuracy: 0.5694 - val_loss: 1.1154
Epoch 5/20
127/127 ━━━━━━━━━━━━━━━━━━━━ 105s 825ms/step - accuracy: 0.5767 - loss: 1.1394 - val_accuracy: 0.6121 - val_loss: 1.0382
Epoch 6/20
127/127 ━━━━━━━━━━━━━━━━━━━━ 105s 826ms/step - accuracy: 0.5969 - loss: 1.0698 - val_accuracy: 0.6161 - val_loss: 1.0213
Epoc
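
One simple way to read a training log like this is to find the epoch with the lowest validation loss. The sketch below uses only the six val_loss values visible in the log above; `best_epoch` is a hypothetical helper, and in the notebook the full list would presumably come from `history.history["val_loss"]`.

```python
def best_epoch(val_losses):
    """Return (1-based epoch number, loss) for the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1, val_losses[best_index]

# val_loss for the epochs shown in the log above.
logged_val_loss = [1.3958, 1.3577, 1.1561, 1.1154, 1.0382, 1.0213]
print(best_epoch(logged_val_loss))  # (6, 1.0213)
```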

Evaluate the model on the validation data

In [41]:
loss, accuracy = model.evaluate(val_generator)
print(f"Validation Accuracy: {accuracy:.2f}")

32/32 ━━━━━━━━━━━━━━━━━━━━ 30s 934ms/step - accuracy: 0.7998 - loss: 0.5822
Validation Accuracy: 0.80
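
A single accuracy figure does not show which classes are confused with each other. As a minimal sketch in plain Python (toy labels, not results from this model), a confusion matrix and per-class recall can be computed as follows; the function names are illustrative only.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """matrix[t][p] counts samples of true class t that were predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

# Toy example with 3 classes.
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)                    # [[1, 1, 0], [0, 2, 0], [1, 0, 0]]
print(per_class_recall(cm))  # [0.5, 1.0, 0.0]
```

In the notebook, the true labels would likely come from `val_generator.classes` and the predictions from the argmax of `model.predict(val_generator)`. This only lines up if the generator is created with `shuffle=False`, which is an assumption here since the cells above use the default. scikit-learn, installed in the first cell, provides `sklearn.metrics.confusion_matrix` for the same purpose.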


In [49]:
model.export("model/cnn_trash_classifier")

INFO:tensorflow:Assets written to: model/cnn_trash_classifier\assets


Saved artifact at 'model/cnn_trash_classifier'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name='keras_tensor_26')
Output Type:
  TensorSpec(shape=(None, 6), dtype=tf.float32, name=None)
Captures:
  1398280055376: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398280050576: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285116496: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285116112: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285118416: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285118224: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285119184: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285114000: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285117264: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285117072: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1398285120336: Tensor

In [50]:
model.save("model/cnn_trash_classifier.keras")
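
Once the model is saved, its 6-value softmax output still has to be mapped back to a class folder. The sketch below shows that step in plain Python; `decode_prediction` is a hypothetical helper, the probabilities are made up, and the mapping assumes the folder names "0" to "5" written by `prepare_data` (in the notebook it would be `train_generator.class_indices`).

```python
def decode_prediction(probabilities, class_indices):
    """Return (class folder name, probability) for the most likely class."""
    index_to_name = {index: name for name, index in class_indices.items()}
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return index_to_name[best], probabilities[best]

class_indices = {str(i): i for i in range(6)}   # assumed folder names "0".."5"
probs = [0.05, 0.10, 0.60, 0.10, 0.10, 0.05]    # made-up softmax output
print(decode_prediction(probs, class_indices))  # ('2', 0.6)
```

With the real model, `probs` would be one row of `model.predict(...)` on an image resized to 128x128 and rescaled by 1/255, matching the preprocessing used for training.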