# Traditional Machine Learning versus Convolutional Neural Networks
This notebook serves as a proof of concept for the research done during the development of the Moodboard Generator project. In this research, traditional machine learning and convolutional neural networks (CNNs) are compared on how useful they are for smaller datasets. The dataset consists of pictures of Pokemon, separated into a liked and a disliked class.

## Downloading the images and preparing them for use

To divide the images, 50 images were chosen for the liked class and 50 for the disliked class. These were picked at random at first, but random labels leave no pattern to learn, so both models ended up about 50% accurate, which is no better than guessing. To fix this, 50 images were then chosen manually per class: winged Pokemon for the liked folder and Pokemon without wings for the disliked folder. This makes the scenario more realistic and gives the models a visual feature to go on. Do not execute the next code block: it performs the original random selection and copies files into the dataset folders.

In [1]:
import os
import random
import shutil

files_list = []

for root, dirs, files in os.walk("./MLvsCNNimages/images"):
    for file in files:
        if file.endswith(".jpg") or file.endswith(".png") or file.endswith(".jpeg"):
            files_list.append(os.path.join(root, file))

file_count = len(files_list)
print(file_count)

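# Draw 100 distinct images in one go and split them, so that no image
# ends up in both classes (two independent samples could overlap).
filesToCopy = random.sample(files_list, 100)
filesToCopyLiked = filesToCopy[:50]
filesToCopyDisliked = filesToCopy[50:]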

destPathLiked = "./MLvsCNNimages/dataset/liked"
destPathDisliked = "./MLvsCNNimages/dataset/disliked"


# Create the class folders if they do not exist yet.
os.makedirs(destPathLiked, exist_ok=True)
os.makedirs(destPathDisliked, exist_ok=True)

for file in filesToCopyLiked:
    shutil.copy(file, destPathLiked)

for file in filesToCopyDisliked:
    shutil.copy(file, destPathDisliked)

809


Now check that everything is in order by counting the images in each folder.

In [2]:
import os

IMAGE_EXTENSIONS = (".jpg", ".png", ".jpeg")

def count_images(folder):
    """Count the image files in `folder`, including its subfolders."""
    return sum(
        file.endswith(IMAGE_EXTENSIONS)
        for root, dirs, files in os.walk(folder)
        for file in files
    )

print("Disliked images: " + str(count_images("./MLvsCNNimages/dataset/disliked")))
print("Liked images: " + str(count_images("./MLvsCNNimages/dataset/liked")))

Disliked images: 94
Liked images: 98
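
These counts also give a reference point for the accuracies reported in this notebook. A minimal sketch, assuming the two counts printed above: a model that always predicts the larger class would already be right about half of the time, so only accuracies clearly above that level suggest that something was learned.

```python
# Image counts as printed by the cell above.
counts = {"disliked": 94, "liked": 98}

total = sum(counts.values())
majority_class = max(counts, key=counts.get)

# Accuracy of a "model" that always predicts the larger class.
baseline_accuracy = counts[majority_class] / total

print(f"Total images: {total}")
print(f"Always predicting '{majority_class}': {baseline_accuracy:.1%} accuracy")
```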


Import the images as a dataset

In [3]:
import tensorflow as tf

data_dir = "./MLvsCNNimages/dataset"

batch_size = 10
img_height = 180
img_width = 180

train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)


class_names = train_ds.class_names
print(class_names)




Inspecting some images

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")


Configure the dataset for performance

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

## Traditional Machine learning model

In [None]:
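model = tf.keras.Sequential([
    # Scale pixel values from [0, 255] to [0, 1].
    tf.keras.layers.Rescaling(1./255),
    # Flatten each image into a single feature vector; the input shape is
    # inferred from the dataset, which holds 180x180 images, not 28x28.
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    # One output logit per class (disliked, liked) instead of a fixed 10.
    tf.keras.layers.Dense(len(class_names))
])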

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(train_ds, validation_data=val_ds, epochs=25)
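
To get a feeling for the size of this model, the weights of the first `Dense` layer can be counted by hand. The sketch below assumes that the images reach the `Flatten` layer at the 180x180 size set when loading the dataset, with 3 colour channels (RGB is the default `color_mode` of `image_dataset_from_directory`). `dense_layer_params` is a helper name made up for this illustration.

```python
def dense_layer_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

img_height, img_width, channels = 180, 180, 3  # channels=3 assumes RGB input

# Flatten turns every image into one long vector of pixel values.
flattened_size = img_height * img_width * channels

first_dense = dense_layer_params(flattened_size, 128)

print(f"Inputs after Flatten: {flattened_size}")
print(f"Parameters in Dense(128): {first_dense:,}")
```

Under these assumptions the first `Dense` layer alone holds roughly 12.4 million parameters, which have to be fitted on a dataset of at most a couple of hundred images.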

Evaluate the accuracy

In [None]:
# val_ds doubles as the test set here; no separate test split was made.
test_loss, test_acc = model.evaluate(val_ds, verbose=2)
print('\nTest accuracy:', str(test_acc * 100) + "%")

## Convolutional Neural Network

In [None]:
cnn_model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1./255),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(len(class_names))
])

cnn_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

cnn_model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=10
)
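
The layer stack above can be traced by hand to see what reaches the `Dense` layers. This is a sketch under the Keras defaults, which the code above does not override: `Conv2D` with a 3x3 kernel, stride 1 and `'valid'` padding shrinks each side by 2, and `MaxPooling2D` with a 2x2 window halves each side, rounding down. The input is assumed to be 180x180 RGB, and the helper names are made up for this illustration.

```python
def conv_out(size, kernel=3):
    """Output side length of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output side length of 2x2 max pooling with stride 2."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Weights plus biases of a Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters

size, channels = 180, 3  # assumes 180x180 RGB input
total_conv_params = 0
for filters in (32, 32, 32):
    total_conv_params += conv_params(channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"After conv + pool: {size}x{size}x{channels}")

# Flatten feeds size * size * channels values into Dense(128).
flattened_size = size * size * channels
dense_params = flattened_size * 128 + 128

print(f"Inputs after Flatten: {flattened_size}")
print(f"Conv parameters: {total_conv_params:,}")
print(f"Dense(128) parameters: {dense_params:,}")
```

The three pooling steps shrink the input of the first `Dense` layer to 12,800 values, compared with the 97,200 raw pixel values of a 180x180 RGB image.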

Evaluate the accuracy

In [None]:
# val_ds doubles as the test set here; no separate test split was made.
test_loss, test_acc = cnn_model.evaluate(val_ds, verbose=2)
print('\nTest accuracy:', str(test_acc * 100) + "%")

## Results
Accuracy on the validation set, where ML is the traditional machine learning model:

- With random images (3-25 epochs): ML = 50%, CNN = 50%
- With picked images (5 epochs): ML = 75%, CNN = 50%
- With picked images (10 epochs): ML = 90%, CNN = 80%
- With picked images (25 epochs): ML = 90%, CNN = 85%
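
When reading these numbers, it helps to keep in mind how few validation images are behind them. The sketch below is a rough estimate and not part of the original experiment: it assumes the 100 picked images (50 per class) with `validation_split=0.2`, which would leave 20 images for validation, and it uses the binomial standard error as a simple measure of uncertainty.

```python
import math

n_images = 100           # assumed: 50 liked + 50 disliked
validation_split = 0.2   # as passed to image_dataset_from_directory
n_val = round(n_images * validation_split)  # assumed validation set size

def standard_error(accuracy, n):
    """Binomial standard error of an accuracy measured on n images."""
    return math.sqrt(accuracy * (1 - accuracy) / n)

# One validation image more or less changes the accuracy by this much.
step = 1 / n_val

for name, accuracy in (("ML", 0.90), ("CNN", 0.85)):
    se = standard_error(accuracy, n_val)
    print(f"{name}: {accuracy:.0%} +/- {se:.1%} (one image = {step:.0%})")
```

Under these assumptions, the 5-point gap between 90% and 85% corresponds to a single validation image and is smaller than the standard error of either estimate, so the ranking of the two models should be read with caution.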