# Element 1 Test - Image Classifier

In [216]:
import os
import tensorflow as tf
import keras
import numpy as np

## Image Pre-Processing

In [217]:
num_skipped = 0
num_changed = 0
image_extensions = (".jpg", ".jpeg", ".png")
for root, dirs, files in os.walk("./data"):
  for index, name in enumerate(files):
    stem, extension = os.path.splitext(name)
    if extension not in image_extensions:
      continue
    filepath = os.path.join(root, name)

    # Rename files whose names are not ASCII, as these caused loading errors
    if not stem.isascii():
      num_changed += 1
      new_filepath = os.path.join(root, "image" + str(index) + extension)
      os.rename(filepath, new_filepath)
      filepath = new_filepath

    # Look for a JFIF or PNG marker in the first bytes of the file
    with open(filepath, "rb") as fobj:
      header = fobj.peek(10)
    is_good = b"JFIF" in header or b"PNG" in header

    if not is_good:
      num_skipped += 1
      # Delete corrupted image
      os.remove(filepath)

print("Changed name of %d images" % num_changed)
print("Deleted %d images" % num_skipped)

Changed name of 0 images
Deleted 0 images


This is taken from week 10 Data Processing with some changes. It walks through every image in our dataset and removes any file that does not fit the parameters. Here, this means that jpg, jpeg and png images that don't have a JFIF or PNG header will be removed. Keras also supports bmp and gif, but there doesn't seem to be much point in looking for these.
Some file names were also causing an error while trying to load the data because they were not valid UTF-8, so I added a few lines that check whether each file name is ASCII and rename the file if not.

The first run removed **249** images.
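
To make the header test concrete, here is a minimal sketch of the same check applied to raw bytes instead of files. The three sample headers are hypothetical values written out for illustration (they follow the usual JFIF, PNG and Exif layouts), and `has_known_header` is a name made up for this example. It also shows a possible side effect: a JPEG that starts with an Exif segment rather than a JFIF one would fail the check, so some of the removed files may have been decodable images.

```python
def has_known_header(header: bytes) -> bool:
    """Accept the file if 'JFIF' or 'PNG' appears in its first bytes."""
    return b"JFIF" in header or b"PNG" in header

# Hypothetical header samples, for illustration only
jfif_jpeg = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
png_file = b"\x89PNG\r\n\x1a\n\x00\x00"
exif_jpeg = b"\xff\xd8\xff\xe1\x00\x18Exif\x00\x00"

for label, header in [("JFIF JPEG", jfif_jpeg), ("PNG", png_file), ("Exif JPEG", exif_jpeg)]:
    print(label, "kept" if has_known_header(header) else "removed")
```

If keeping Exif JPEGs matters, one option would be to test for the JPEG start-of-image bytes (`b"\xff\xd8\xff"`) instead, at the cost of a looser check.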

## Loading Data

In [218]:
images_path = "./data"
batch_size=16
shape=(300,300,3)

training_dataset = keras.preprocessing.image_dataset_from_directory(
    images_path,
    labels='inferred',
    label_mode='categorical',
    batch_size=batch_size,
    image_size=shape[:2],
    subset='training',
    validation_split=0.1,
    seed=42,
    pad_to_aspect_ratio=True
)

test_dataset = keras.preprocessing.image_dataset_from_directory(
    images_path,
    labels='inferred',
    label_mode='categorical',
    batch_size=batch_size,
    image_size=shape[:2],
    subset='validation',
    validation_split=0.1,
    seed=42,
    pad_to_aspect_ratio=True
)

Found 1273 files belonging to 4 classes.
Using 1146 files for training.
Found 1273 files belonging to 4 classes.
Using 127 files for validation.


For this model there are 4 categories: bicycle, car, deer and mountain. This loads our data from the directory and splits it into two datasets: 90% of the images for training and the remaining 10% held out. The held-out set is used both as validation data during training and as the test set for evaluation.
- I've made sure to set the label mode to categorical, as the task is to categorise the images.
- The batch size offers a reasonable training time and good accuracy. Making it a little smaller than the default (32) could help validation accuracy by adding some noise to the training. Increasing it would improve training times.
- I have made the image size 300 by 300 as this helps to reduce training time whilst still keeping important features of the images. I could possibly increase this for a boost in accuracy at the cost of speed, or vice versa. Note that many of the photos (like panoramas) would get quite squished, which might also affect performance. I've added pad_to_aspect_ratio to help with this, as we're looking for objects in an entire image and cropping them to fit the frame would not be wise.
- The validation split is set so that in training I can see how well the model is generalizing, or if it is starting to overfit.
- There is also a default colour mode of rgb, meaning that any black and white images will be converted to 3 channels to fit the input shape of the model.
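
As a quick sanity check on the numbers in the output above, the split and the number of batches per epoch can be worked out by hand. This is a small sketch, assuming the validation count is the split fraction rounded down and that the last, smaller batch is kept.

```python
import math

total_files = 1273        # "Found 1273 files" in the output above
validation_split = 0.1
batch_size = 16

# Assumption: the validation count is the split fraction rounded down
num_validation = int(total_files * validation_split)
num_training = total_files - num_validation

# The last batch may be smaller than the rest, so round up
train_steps = math.ceil(num_training / batch_size)
validation_steps = math.ceil(num_validation / batch_size)

print(num_training, num_validation)    # 1146 127
print(train_steps, validation_steps)   # 72 8
```

These match the file counts reported when loading and the 72 and 8 steps shown in the training and evaluation progress bars.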

## Creating the Model

In [219]:
model = keras.Sequential([
    keras.Input(shape=shape),
    keras.layers.RandomFlip(mode="horizontal"),
    keras.layers.RandomRotation(0.1),
    keras.layers.RandomBrightness(0.2),
    keras.layers.RandomContrast(0.2),
    
    keras.layers.Conv2D(8, kernel_size=(3, 3), activation="relu"),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    
    keras.layers.Conv2D(16, kernel_size=(3, 3), activation="relu"),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    
    keras.layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    
    keras.layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    
    keras.layers.Conv2D(128, kernel_size=(3, 3), activation="relu"),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.SpatialDropout2D(0.1),
    
    keras.layers.Conv2D(256, kernel_size=(3, 3), activation="relu"),
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(4, activation="softmax")
])

model.summary()


After some extensive testing and looking at other smaller models such as MobileNetV2, I've found that these layers give reasonable accuracy. The convolutional layers seem to get better results with ReLU instead of sigmoid, which looks to be fairly standard for other CNNs.
Keras documentation also recommends the use of softmax for categorical tasks with more than two categories; otherwise I would use sigmoid. The difference between sigmoid and softmax here is quite small. There are also spatial dropout and dropout layers to help prevent overfitting. Finally, I have a mix of pooling layers, as each helps with finding different aspects of images. Max pooling is most prominent because in most of the images we are looking for details within the image, like deer, bikes or cars. Global average pooling looks at the overall image, which helps with finding images of mountains. Whilst these might not make a huge difference, they might still help accuracy a little.
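
To see how the image shrinks as it passes through the convolution and pooling layers, the feature map sizes can be traced by hand. This is a rough sketch, assuming the Keras defaults of no padding and stride 1 for `Conv2D` and non-overlapping 2x2 windows for `MaxPooling2D`; the helper names are made up for this example. The augmentation and dropout layers are left out because they do not change the shape.

```python
def conv_valid(size, kernel=3):
    """Output width/height of a convolution with stride 1 and no padding."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Output width/height of non-overlapping max pooling (rounds down)."""
    return size // pool

size = 300
filters = [8, 16, 32, 64, 128, 256]
shapes = []
for i, f in enumerate(filters):
    size = conv_valid(size)
    shapes.append(("conv", size, size, f))
    # Every convolution except the last is followed by 2x2 max pooling
    if i < len(filters) - 1:
        size = max_pool(size)
        shapes.append(("pool", size, size, f))

for shape in shapes:
    print(shape)
```

Under these assumptions the last convolution outputs a 5 by 5 map with 256 channels. Global average pooling then reduces each channel to a single number, so the Flatten layer that follows receives an already flat vector of 256 values and leaves it unchanged.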

In [220]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])

Loss is set to categorical crossentropy, as we are categorising images into four categories. I've also set the metric to categorical accuracy for the same reason. I have left the learning rate at the default (0.001) as it seems to be working well, and the Adam optimizer adapts the step size for each parameter during training. Other optimizers like SGD might work too, but would require a lot more fine-tuning.
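
For intuition about what the loss values in the training log mean, here is a small pure-Python sketch of softmax followed by categorical crossentropy for a single image. The raw scores are hypothetical, and this ignores details such as the clipping Keras applies for numerical safety.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs):
    """Loss for one example: minus the log of the probability given to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

# Hypothetical raw scores for the classes bicycle, car, deer, mountain
logits = [0.5, 2.0, 0.1, -1.0]
label = [0, 1, 0, 0]   # one-hot label for "car"

probs = softmax(logits)
loss = categorical_crossentropy(label, probs)
print(probs, loss)

# A model guessing 25% for every class scores ln(4), about 1.386
print(categorical_crossentropy(label, [0.25] * 4))
```

A loss well below 1.386 therefore means the model is doing noticeably better than uniform guessing across the four classes.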

In [225]:
model.fit(
  training_dataset,
  validation_data=test_dataset,
  epochs=10
)

Epoch 1/10
72/72 ━━━━━━━━━━━━━━━━━━━━ 34s 463ms/step - categorical_accuracy: 0.7811 - loss: 0.5812 - val_categorical_accuracy: 0.7795 - val_loss: 0.5724
Epoch 2/10
72/72 ━━━━━━━━━━━━━━━━━━━━ 34s 465ms/step - categorical_accuracy: 0.7640 - loss: 0.5466 - val_categorical_accuracy: 0.8031 - val_loss: 0.5433
Epoch 3/10
72/72 ━━━━━━━━━━━━━━━━━━━━ 35s 482ms/step - categorical_accuracy: 0.7836 - loss: 0.5456 - val_categorical_accuracy: 0.8031 - val_loss: 0.5095
Epoch 4/10
72/72 ━━━━━━━━━━━━━━━━━━━━ 34s 466ms/step - categorical_accuracy: 0.8001 - loss: 0.5099 - val_categorical_accuracy: 0.7559 - val_loss: 0.7007
Epoch 5/10
72/72 ━━━━━━━━━━━━━━━━━━━━ 34s 467ms/step - categorical_accuracy: 0.7767 - loss: 0.5408 - val_categorical_accuracy: 0.7559 - val_loss: 0.6355
Epoch 6/10
72/72 ━━━━━━━━━━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x1fb1a092590>

In [228]:
model.evaluate(test_dataset)

8/8 ━━━━━━━━━━━━━━━━━━━━ 2s 267ms/step - categorical_accuracy: 0.8072 - loss: 0.6281


[0.7045027017593384, 0.7795275449752808]

After 20 epochs the highest test accuracy I have been able to achieve is around 72%. However, only a few epochs in, the model's loss continues to decrease but validation loss stagnates. This could be a sign of overfitting, or it could be because I am using a smaller batch size, which makes each gradient step noisier and means I need more epochs. This requires more testing. <br>
To try and fix this I've added some data augmentation [preprocessing layers](https://keras.io/api/layers/preprocessing_layers/image_augmentation/) to effectively increase the size of the dataset.
I've also added random contrast and random brightness to the model to further increase the variety of the data. This seems to have improved validation loss as well as the testing accuracy, but only by 1-2%. <br>
<br>
I could further reduce the batch size, increase the image size and add more convolutional layers, look into normalization and regularization, or gather more data to get a better model. But all of these increase training time and the need for computational power. One other thing I could do is filter the image dataset a bit more. Most of the images are valid, but some don't correspond with their labels at all. I would probably have to do this manually.
<br>
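
One common way to handle validation loss that stops improving is early stopping: keep the weights from the best epoch and stop once the loss has not improved for a few epochs in a row. Keras provides this as the `keras.callbacks.EarlyStopping` callback, which I have not used here. The sketch below only illustrates the idea in plain Python, using the val_loss values from epochs 1 to 5 of the training log above; the function name and the patience of 2 are assumptions for this example.

```python
def best_epoch_with_patience(val_losses, patience=2):
    """Return (best_epoch, stop_epoch), both counted from 1.

    Training stops once validation loss has failed to improve
    for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)

# val_loss values for epochs 1 to 5 of the training log above
val_losses = [0.5724, 0.5433, 0.5095, 0.7007, 0.6355]
print(best_epoch_with_patience(val_losses))  # (3, 5)
```

With these values the best epoch is the third, and training would stop after the fifth.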

edit: I added some dense layers at the end of the model for fun and it looks to have improved the consistency of testing accuracy. This is also where I added pad_to_aspect_ratio, which all together gives testing results between 73% and 80% after some longer training (50 epochs). However, this is with 127 images for testing, which might not be enough to give a good metric on accuracy. A bigger dataset would help here, and also in training.
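
To put a rough number on how uncertain an accuracy measured on 127 images is, the standard error of a proportion can be used. This is a back-of-the-envelope sketch using the normal approximation, which is only approximate at this sample size, and it assumes the test images are independent.

```python
import math

n = 127                         # images in the test set
accuracy = 0.7795               # accuracy returned by model.evaluate above
correct = round(accuracy * n)   # number of images classified correctly

p = correct / n
standard_error = math.sqrt(p * (1 - p) / n)
margin = 1.96 * standard_error  # approximate 95% interval (normal approximation)

print(f"{correct}/{n} correct: {p:.3f} +/- {margin:.3f}")
```

The margin comes out at about 7 percentage points either way, so results anywhere between 73% and 80% are consistent with the same underlying accuracy.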

## Saving the model

In [229]:
model.save("./Element1_CNN.keras")

In [169]:
# model = keras.models.load_model("./Element1_CNN.keras")
# model.summary()

Finally, the model has been saved. To use it, go to ImageClassifier.ipynb.
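
When using the saved model, each prediction is a list of four probabilities that needs mapping back to a class name. The sketch below shows one way this could be done. It is not taken from ImageClassifier.ipynb; it assumes the class order follows the alphabetical order of the folder names, and the example output is made up.

```python
# Assumption: class names follow the alphabetical order of the sub-folder names
class_names = ["bicycle", "car", "deer", "mountain"]

def decode_prediction(probabilities):
    """Return the most likely class name and its probability."""
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[index], probabilities[index]

# Hypothetical softmax output for one image
example_output = [0.05, 0.10, 0.80, 0.05]
print(decode_prediction(example_output))  # ('deer', 0.8)
```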