# FINAL EXAM: ARTIFICIAL NEURAL NETWORKS
GROUP 4:
*   Renti Epana Sari
*   Wahyu Dwi Prasetio
*   Fedryanto Dartiko
*   Aisyah Amalia Alfitri
*   Wahyu Syahputra




In [1]:
# Import TensorFlow
import tensorflow as tf

from keras import datasets
from keras import layers
from keras import models
from keras import applications
from keras import preprocessing
from keras import losses

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import os

## **CNN**
Using a labeled secondary dataset available on the internet, build at least three CNN architectures to perform classification.


### **Data Loading**

The dataset used is the **Vegetable Image Dataset**, available on [Kaggle](https://www.kaggle.com/datasets/misrakahmed/vegetable-image-dataset).

The dataset consists of 21,000 images that are **already split** into train, test, and validation sets.


In [2]:
TRAIN_DIR = '../input/vegetable-image-dataset/Vegetable Images/train'
TEST_DIR = '../input/vegetable-image-dataset/Vegetable Images/test'
VAL_DIR = '../input/vegetable-image-dataset/Vegetable Images/validation'

In [3]:
image_formats = (".png", ".jpg")

def show_images(image_files):
    # Create a single figure; the grid holds up to 4*5 (20) images
    fig = plt.figure(figsize=(10, 10))
    fig.patch.set_facecolor('xkcd:white')

    for i, path in enumerate(image_files[:20]):
        plt.subplot(4, 5, i + 1)
        img = mpimg.imread(path)
        plt.imshow(img)
        plt.axis('off')
        # Title: vegetable name (parent folder) and image size as rows x columns
        label = os.path.basename(os.path.dirname(path))
        plt.title("{}\n{}x{}".format(label, img.shape[0], img.shape[1]))

    plt.tight_layout()
    plt.show()

def list_files(directory):
    # Collect the first image found in each folder (one sample per class)
    arr = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(image_formats):
                arr.append(os.path.join(root, name))
                break
    return arr

The dataset consists of 15 classes: bean, bitter gourd, bottle gourd, brinjal, broccoli, cabbage, capsicum, carrot, cauliflower, cucumber, papaya, potato, pumpkin, radish, and tomato.

The images in this dataset are uniform in size (224x224 pixels) and are stored in `.jpg` format.

In [4]:
image_list = list_files(TRAIN_DIR)
show_images(image_list)

The dataset is split as follows.
- Training data: 15,000 images (1,000 per class)
- Test data: 3,000 images (200 per class)
- Validation data: 3,000 images (200 per class)
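
The split sizes listed above can be checked directly from the folder structure. The following is a minimal sketch, assuming each split directory contains one subfolder per class; `count_per_class` and `is_balanced` are hypothetical helpers, not part of the original notebook.

```python
import os

def count_per_class(split_dir, extensions=(".jpg", ".png")):
    """Return {class_name: image_count} for one split directory.

    Assumes the layout split_dir/<class_name>/<image files>.
    """
    counts = {}
    for entry in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, entry)
        if os.path.isdir(class_dir):
            counts[entry] = sum(
                1 for name in os.listdir(class_dir)
                if name.lower().endswith(extensions)
            )
    return counts

def is_balanced(counts):
    """True when every class has the same number of images."""
    return len(set(counts.values())) <= 1
```

For example, `count_per_class(TRAIN_DIR)` should report 1,000 images for each of the 15 classes if the split matches the figures above.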


In [5]:
def count_files(directory):
    # Number of images in each folder that contains at least one image
    arr = []
    for root, dirs, files in os.walk(directory):
        count = 0
        for name in files:
            if name.endswith(".jpg") or name.endswith(".png"):
                count = count + 1
        if count > 0:
            arr.append(count)
    return arr

def get_all_veg_names(directory):
    # Class names are the top-level subfolder names (returned as a one-element list)
    arr = []
    for root, dirs, files in os.walk(directory):
        arr.append(dirs)
        break
    return arr

image_count = count_files(TRAIN_DIR)
print(len(image_count))
chars = get_all_veg_names(TRAIN_DIR)
# print(chars)

fig = plt.figure()
ax = fig.add_axes([0,0,3,1])
ax.bar(chars[0], image_count)
plt.title("Training data distribution")
plt.show()

In [6]:
image_count = count_files(VAL_DIR)
print(len(image_count))
chars = get_all_veg_names(VAL_DIR)
# print(chars)

fig = plt.figure()
ax = fig.add_axes([0,0,3,1])
ax.bar(chars[0], image_count)
plt.title("Validation data distribution")
plt.show()

In [7]:
image_count = count_files(TEST_DIR)
print(len(image_count))
chars = get_all_veg_names(TEST_DIR)
# print(chars)

fig = plt.figure()
ax = fig.add_axes([0,0,3,1])
ax.bar(chars[0], image_count)
plt.title("Test data distribution")
plt.show()

### **Data Preprocessing**

We preprocess the data with ImageDataGenerator so that it is ready for training. The datagen objects transform the images as they are loaded. In addition, flow_from_directory labels each image according to the name of the folder in which it is stored.
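
To make the folder-based labeling concrete: Keras infers the classes from the subfolder names and assigns label indices in alphanumeric order, and `class_mode='categorical'` yields one-hot label vectors. The sketch below reproduces that mapping in plain Python. `infer_class_indices` and `one_hot` are hypothetical helpers for illustration, not Keras functions.

```python
import os

def infer_class_indices(directory):
    """Map each subfolder name to an index, in sorted order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}

def one_hot(index, num_classes):
    """One-hot vector, the label format used with class_mode='categorical'."""
    vector = [0.0] * num_classes
    vector[index] = 1.0
    return vector
```

In the notebook itself, the mapping actually used by a generator is available as `train_generator.class_indices`.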

In [8]:
IMAGE_SIZE = (224,224)

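# Training images: rescale pixels to [0, 1] and apply random augmentation
train_datagen = preprocessing.image.ImageDataGenerator(rescale = 1./255,
                                         shear_range = 0.2,
                                         zoom_range = 0.2,
                                         horizontal_flip = True)

# Validation and test images: rescale only, no augmentation
val_datagen = preprocessing.image.ImageDataGenerator(rescale = 1./255.)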

train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    batch_size = 64,
                                                    class_mode = 'categorical', 
                                                    target_size = IMAGE_SIZE) 

val_generator = val_datagen.flow_from_directory(VAL_DIR,
                                                batch_size = 64,
                                                class_mode = 'categorical', 
                                                target_size = IMAGE_SIZE) 

test_generator = val_datagen.flow_from_directory(TEST_DIR,
                                                batch_size = 64,
                                                class_mode = 'categorical', 
                                                target_size = IMAGE_SIZE) 


### **CNN Model Development**

When developing a CNN model, we build on the basic CNN building blocks. A CNN has three types of layers:
- Convolutional layers (followed by a ReLU activation).
- Pooling layers.
- Fully connected layers.
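
As a worked example of how the first two layer types change the spatial size of a feature map, the sketch below applies the standard output-size formulas for a stride-1 convolution and a non-overlapping max pooling layer. The function names are hypothetical; the formulas assume square inputs and the Keras defaults (stride 1 for `Conv2D`, stride equal to the pool size and `'valid'` padding for `MaxPooling2D`).

```python
def conv_output_size(size, kernel, padding="valid"):
    """Spatial size after a stride-1 convolution."""
    if padding == "same":
        return size              # zero-padding keeps the size unchanged
    return size - kernel + 1     # 'valid': no padding

def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

# A 224x224 input through one 3x3 'valid' convolution and one 2x2 pooling:
after_conv = conv_output_size(224, 3)       # 222
after_pool = pool_output_size(after_conv)   # 111
```

The fully connected layers then operate on the flattened result, so the final spatial size times the number of filters determines how many inputs the first dense layer receives.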

We use three CNN models with different architectures, as follows.

#### Model 1

In [9]:
"""
Hyperparameter Settings
"""
EPOCHS = 10
BATCH_SIZE = 64

In [10]:
"""
CNN Model 1
3 convolutional layers
"""
# Create the convolutional base. 
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Add Dense layers on top
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(15, activation='softmax'))

# display the architecture of our model
model.summary()

"""
Compile Model
"""
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
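
To illustrate what the `softmax` output layer and the `categorical_crossentropy` loss compute, here is a minimal plain-Python sketch for a single sample. It is not the Keras implementation, which works on batches of tensors; the function names and the `eps` guard are assumptions made for this example.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# An untrained model that spreads probability evenly over the 15 classes
num_classes = 15
uniform = softmax([0.0] * num_classes)
label = [1.0] + [0.0] * (num_classes - 1)
baseline_loss = categorical_crossentropy(label, uniform)  # equals ln(15)
```

This gives a useful reference point when reading the training logs: a loss near ln(15), about 2.71, means the model is doing no better than guessing uniformly among the 15 classes.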


In [11]:
"""
Training Model
"""
history = model.fit(
          train_generator,
          epochs = EPOCHS,
          steps_per_epoch = int(0.2*len(train_generator)),
          validation_data = val_generator,
          validation_steps = len(val_generator),
          verbose = 1)
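
The `steps_per_epoch` setting above means each epoch draws only about a fifth of the training batches. The arithmetic can be sketched as follows, assuming the 15,000 training images and the batch size of 64 stated earlier; `len(train_generator)` is the number of batches, that is, the image count divided by the batch size, rounded up.

```python
import math

num_train_images = 15000   # from the dataset description
batch_size = 64            # as passed to flow_from_directory

batches_total = math.ceil(num_train_images / batch_size)   # len(train_generator)
steps_per_epoch = int(0.2 * batches_total)                  # value used in model.fit
images_per_epoch = steps_per_epoch * batch_size

print(batches_total, steps_per_epoch, images_per_epoch)     # 235 47 3008
```

Under these assumptions, each epoch trains on roughly 3,000 augmented images rather than the full 15,000, which shortens training time at the cost of seeing less data per epoch.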


In [12]:
"""
Model Evaluation
"""
# evaluate the model accuracy
plt.plot(history.history['accuracy'], label='acc')
plt.plot(history.history['val_accuracy'], label = 'val_acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title("Model CNN 1 Accuracy")
plt.legend(loc='lower right')

In [13]:
# evaluate the model losses
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['val_loss'], label = 'val_loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title("Model CNN 1 Loss")
plt.legend(loc='upper right')

In [14]:
val_loss,val_acc = model.evaluate(val_generator)
test_loss,test_acc = model.evaluate(test_generator)

print("Val loss: {} \nVal acc: {}".format(val_loss, val_acc))
print("Test loss: {} \nTest acc: {}".format(test_loss, test_acc))

#### Model 2

In [15]:
train_generator.reset()
val_generator.reset()
test_generator.reset()
"""
CNN Model 2
3 convolutional layers with 5x5 kernels
"""
# Create the convolutional base. 
model2 = models.Sequential()
model2.add(layers.Conv2D(32, (5, 5), activation='relu', input_shape=(224, 224, 3)))
model2.add(layers.MaxPooling2D((2, 2)))
model2.add(layers.Conv2D(64, (5, 5), activation='relu'))
model2.add(layers.MaxPooling2D((2, 2)))
model2.add(layers.Conv2D(128, (5, 5), activation='relu'))
model2.add(layers.MaxPooling2D((2, 2)))

# Add Dense layers on top
model2.add(layers.Flatten())
model2.add(layers.Dense(64, activation='relu'))
model2.add(layers.Dropout(0.2))
model2.add(layers.Dense(15, activation='softmax'))

# display the architecture of our model
model2.summary()

"""
Compile Model
"""
model2.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
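
Model 2 adds a `Dropout(0.2)` layer before the output. As a rough illustration of what dropout does during training, the sketch below zeroes each input with probability `rate` and scales the surviving values by `1 / (1 - rate)` so that the expected sum is unchanged; at inference time the values pass through untouched. This is a plain-Python sketch with a hypothetical function name and a `seed` parameter added for reproducibility, not the Keras implementation.

```python
import random

def dropout(values, rate, training=True, seed=None):
    """Inverted dropout on a list of activations."""
    if not training or rate == 0.0:
        return list(values)               # inference: identity
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * scale for v in values]

activations = [1.0] * 10000
dropped = dropout(activations, rate=0.2, seed=0)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
```

With `rate=0.2`, about 80% of the activations survive, each scaled to about 1.25, so the mean activation stays close to 1.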


In [16]:
"""
Training Model
"""
history2 = model2.fit(
          train_generator,
          epochs = EPOCHS,
          steps_per_epoch = int(0.2*len(train_generator)),
          validation_data = val_generator,
          validation_steps = len(val_generator),
          verbose = 1)


In [17]:
"""
Model Evaluation
"""
# evaluate the model accuracy
plt.plot(history2.history['accuracy'], label='acc')
plt.plot(history2.history['val_accuracy'], label = 'val_acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title("Model CNN 2 Accuracy")
plt.legend(loc='lower right')

In [18]:
# evaluate the model losses
plt.plot(history2.history['loss'], label='loss')
plt.plot(history2.history['val_loss'], label = 'val_loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title("Model CNN 2 Loss")
plt.legend(loc='upper right')

In [19]:
val_loss,val_acc = model2.evaluate(val_generator)
test_loss,test_acc = model2.evaluate(test_generator)

print("Val loss: {} \nVal acc: {}".format(val_loss, val_acc))
print("Test loss: {} \nTest acc: {}".format(test_loss, test_acc))

#### Model 3

In [20]:
train_generator.reset()
val_generator.reset()
test_generator.reset()
"""
CNN Model 3
2 convolutional layers with padding='same'
"""
# Create the convolutional base. 
model3 = models.Sequential()
model3.add(layers.Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(224, 224, 3)))
model3.add(layers.MaxPooling2D((2, 2)))
model3.add(layers.Conv2D(196, (3, 3), padding='same', activation='relu'))
model3.add(layers.MaxPooling2D((2, 2)))
model3.add(layers.Dropout(0.2))

# Add Dense layers on top
model3.add(layers.Flatten())
model3.add(layers.Dense(1024, activation='relu'))
model3.add(layers.Dropout(0.3))
model3.add(layers.Dense(15, activation='softmax'))

# display the architecture of our model
model3.summary()

"""
Compile Model
"""
model3.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])


In [21]:
"""
Training Model
"""
history3 = model3.fit(
          train_generator,
          epochs = EPOCHS,
          steps_per_epoch = int(0.2*len(train_generator)),
          validation_data = val_generator,
          validation_steps = len(val_generator),
          verbose = 1)


In [22]:
"""
Model Evaluation
"""
# evaluate the model accuracy
plt.plot(history3.history['accuracy'], label='acc')
plt.plot(history3.history['val_accuracy'], label = 'val_acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title("Model CNN 3 Accuracy")
plt.legend(loc='lower right')

In [23]:
# evaluate the model losses
plt.plot(history3.history['loss'], label='loss')
plt.plot(history3.history['val_loss'], label = 'val_loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title("Model CNN 3 Loss")
plt.legend(loc='upper right')

In [24]:
val_loss,val_acc = model3.evaluate(val_generator)
test_loss,test_acc = model3.evaluate(test_generator)

print("Val loss: {} \nVal acc: {}".format(val_loss, val_acc))
print("Test loss: {} \nTest acc: {}".format(test_loss, test_acc))

#### **Conclusion**: the best-performing model is Model 3, with an accuracy of 0.917 on training and 0.915 on testing. Its architecture is as follows.

Convolutional Structure
- input shape (224, 224, 3)
- Conv2D 32 (3,3) padding='same', relu
- MaxPooling2D (2,2)
- Conv2D 196 (3,3) padding='same', relu
- MaxPooling2D (2,2)
- Dropout 0.2

Fully Connected Layer
- Flatten
- Dense 1024 relu
- Dropout 0.3
- Dense 15 softmax
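
To put this architecture in perspective, the sketch below estimates its trainable parameter count by hand, using the layer sizes from the Model 3 code cell (32 and 196 convolution filters, a 1024-unit dense layer, 15 outputs) and the standard formulas for `Conv2D` and `Dense` layers with biases. The helper names are hypothetical, and `model3.summary()` remains the authoritative source for these figures.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

# With padding='same' only the two 2x2 poolings shrink the map: 224 -> 112 -> 56
side = 224 // 2 // 2
flattened = side * side * 196            # inputs to the first Dense layer

params = {
    "conv1": conv2d_params(3, 3, 32),
    "conv2": conv2d_params(3, 32, 196),
    "dense1": dense_params(flattened, 1024),
    "output": dense_params(1024, 15),
}
total = sum(params.values())
```

Under these assumptions, the first dense layer accounts for almost all of the roughly 629 million parameters, because it is fed a flattened 56x56x196 feature map.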

