## Flower Classification Convolutional Neural Network Project

### Dataset

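The dataset contains 4,317 flower images across five classes (the per-class counts below sum to this total, which also matches the number of images found by the data generators). It was obtained from Kaggle at https://www.kaggle.com/datasets/alxmamaev/flowers-recognition.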

The dataset contains the following (class / number of images):

- daisy / 764
- dandelion / 1,052
- rose / 784
- sunflower / 733
- tulip / 984
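
As a quick check on class balance, the counts above can be summed and turned into proportions. This is a minimal sketch in plain Python; the `class_counts` dictionary is simply the list above typed in by hand, not something the notebook defines.

```python
# Class counts copied from the list above (hypothetical variable name)
class_counts = {
    'daisy': 764,
    'dandelion': 1052,
    'rose': 784,
    'sunflower': 733,
    'tulip': 984,
}

total = sum(class_counts.values())
print(f'Total images: {total}')

# Share of the dataset taken up by each class, largest first
for name, count in sorted(class_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f'{name:<10} {count:>5} {count / total:.1%}')
```

The classes are moderately imbalanced: dandelion has roughly 1.4 times as many images as sunflower.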

#### Preprocessing

All images are resized to 224 x 224 pixels in this notebook.

### Findings

The first convolutional neural network trained to classify the flowers is not accurate enough to be relied upon. Its validation accuracy of about 0.63 means it classifies a flower correctly more often than not, and well above the 0.20 expected from guessing among five classes, but it still misclassifies roughly 37% of the validation images.
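
To put the 0.63 figure in context, it helps to compare it with two trivial baselines: guessing uniformly among the five classes, and always predicting the most common class. The sketch below uses the class counts from the Dataset section and the validation accuracy printed later in the notebook; the variable names are illustrative.

```python
# Class counts from the Dataset section (hypothetical variable name)
class_counts = {'daisy': 764, 'dandelion': 1052, 'rose': 784, 'sunflower': 733, 'tulip': 984}
validation_accuracy = 0.6255813837051392  # value printed by model.evaluate in this notebook

total = sum(class_counts.values())
chance_accuracy = 1 / len(class_counts)                   # uniform guessing over five classes
majority_accuracy = max(class_counts.values()) / total    # always predicting the largest class

print(f'Uniform guessing:      {chance_accuracy:.3f}')
print(f'Always "dandelion":    {majority_accuracy:.3f}')
print(f'Model (validation):    {validation_accuracy:.3f}')
print(f'Validation error rate: {1 - validation_accuracy:.3f}')
```

The majority-class figure here is computed over the whole dataset, so it is only an approximation of what that baseline would score on the validation subset.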

## Changelog

#### Version 1

- Set root directory for image locations.
- Read in images.
- Resized all images to the same size.
- Created ImageDataGenerator.
- Split data into training and validation sets.
- Created and compiled model.
- Trained model on the data.
- Analysed model performance using accuracy and loss.

In [None]:
# Import all necessary libraries

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import cv2
import os
from tensorflow.keras import layers, models
import numpy as np
from tensorflow.keras.preprocessing import image

In [None]:
# Set root directory for class folders and target size for images

root_directory = 'flowers/'
target_size = (224, 224)

In [None]:
# Resize images in dataset to target_size (the original files are overwritten in place)

for class_folder in os.listdir(root_directory):
    class_path = os.path.join(root_directory, class_folder)
    if os.path.isdir(class_path):  # Skip anything that is not a class folder
        # Loop through all images in the class folder
        for filename in os.listdir(class_path):
            if filename.endswith(('.jpg', '.png')):
                img_path = os.path.join(class_path, filename)
                img = cv2.imread(img_path)
                if img is None:  # cv2.imread returns None for files it cannot read
                    continue
                img_resized = cv2.resize(img, target_size)
                cv2.imwrite(img_path, img_resized)  # Overwrite the original image

In [None]:
# Set number of training samples used in one iteration of training
# Set number of passes through dataset

batch_size = 32
epochs = 10

In [None]:
# Create ImageDataGenerator. It can apply real-time data augmentation, which helps the model generalise and reduces overfitting,
# but no augmentation options are set here: this generator only rescales and splits the data
# Rescale pixel values from 0-255 to 0-1 to improve convergence during training
# Reserve 20% of the dataset for validation

train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)

In [None]:
# Load training data

train_generator = train_datagen.flow_from_directory(
    root_directory,
    target_size=target_size,
    batch_size=batch_size,
    class_mode='categorical',
    subset='training'
)

Found 3457 images belonging to 5 classes.


In [None]:
# Check class index values

print(train_generator.class_indices)

{'daisy': 0, 'dandelion': 1, 'rose': 2, 'sunflower': 3, 'tulip': 4}


In [None]:
# Load validation data

validation_generator = train_datagen.flow_from_directory(
    root_directory,
    target_size=target_size,
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation'
)

Found 860 images belonging to 5 classes.
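
The 3,457 / 860 split reported by the two generators is not exactly 80% / 20% of 4,317. The figures are consistent with the split being taken per class and the validation count rounded down, which the sketch below reproduces from the class counts in the Dataset section. Treat the rounding rule as an assumption inferred from these numbers rather than a statement of the Keras implementation.

```python
# Class counts from the Dataset section (hypothetical variable name)
class_counts = {'daisy': 764, 'dandelion': 1052, 'rose': 784, 'sunflower': 733, 'tulip': 984}
validation_split = 0.2

# Assumption: each class contributes int(validation_split * n) validation images
val_per_class = {name: int(validation_split * n) for name, n in class_counts.items()}
train_per_class = {name: n - val_per_class[name] for name, n in class_counts.items()}

for name in class_counts:
    print(f'{name:<10} train {train_per_class[name]:>4}  validation {val_per_class[name]:>4}')

print(f'Validation images: {sum(val_per_class.values())}')
print(f'Training images: {sum(train_per_class.values())}')
```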


In [None]:
# Three 2D convolution layers with 3x3 filters; the input is a 224x224 RGB image (3 channels)
# Max pooling halves the height and width of the feature maps to reduce the number of parameters and computation in the network

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),   # Flatten 3D output to 1D vector for the following layers
    layers.Dense(128, activation='relu'),   # Learn high-level features by combining features learned in the convolutional layers
    layers.Dense(len(train_generator.class_indices), activation='softmax')  # Same number of neurons as classes, softmax outputs a probability for each class
])
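
To see where the model's capacity sits, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used above (stride 1 and 'valid' padding for `Conv2D`, non-overlapping 2x2 pooling); the helper names are illustrative only.

```python
def conv_out(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling (floor division)."""
    return size // pool

size, channels = 224, 3
total_params = 0
for filters in (32, 64, 128):
    # Each filter has 3*3*channels weights plus one bias
    params = (3 * 3 * channels + 1) * filters
    total_params += params
    size = pool_out(conv_out(size))
    channels = filters
    print(f'Conv2D({filters}) + MaxPooling2D -> {size}x{size}x{channels}, {params:,} parameters')

flat = size * size * channels
dense_params = (flat + 1) * 128      # Dense(128): weights plus biases
output_params = (128 + 1) * 5        # Dense(5): one output per class
total_params += dense_params + output_params

print(f'Flatten -> {flat:,} values')
print(f'Dense(128) -> {dense_params:,} parameters')
print(f'Total -> {total_params:,} parameters')
```

Almost all of the parameters (about 11.1 million of 11.2 million) sit in the first dense layer, which may contribute to overfitting on a dataset of this size.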

In [None]:
# Adam (Adaptive Moment Estimation) optimiser because it is memory-efficient and adapts the learning rate for each parameter
# Categorical cross-entropy loss to match the one-hot labels from class_mode='categorical'; accuracy is tracked during training

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
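
For intuition about the loss values reported during training, categorical cross-entropy for a one-hot label reduces to the negative log of the probability assigned to the true class. The sketch below uses plain Python with made-up probability vectors; it is an illustration, not the Keras implementation.

```python
import math

def categorical_crossentropy(true_index, probabilities):
    """Cross-entropy for a one-hot label: -log(p) of the true class."""
    return -math.log(probabilities[true_index])

# A model that guesses uniformly over five classes
uniform = [0.2] * 5
print(f'Uniform guess: {categorical_crossentropy(3, uniform):.4f}')

# Hypothetical confident predictions, right and wrong, for true class 3
confident_right = [0.02, 0.03, 0.05, 0.85, 0.05]
confident_wrong = [0.02, 0.03, 0.05, 0.05, 0.85]
print(f'Confident and right: {categorical_crossentropy(3, confident_right):.4f}')
print(f'Confident and wrong: {categorical_crossentropy(3, confident_wrong):.4f}')
```

A uniform guess over five classes gives a loss of ln 5, about 1.61, which is a useful reference point when reading the loss values in the training output.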

In [None]:
# Train model

history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // batch_size,
    epochs=epochs
)

Epoch 1/10
108/108 ━━━━━━━━━━━━━━━━━━━━ 31s 281ms/step - accuracy: 0.3605 - loss: 1.7730 - val_accuracy: 0.5529 - val_loss: 1.0848
Epoch 2/10
108/108 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - accuracy: 0.4062 - loss: 1.1524 - val_accuracy: 0.5505 - val_loss: 1.0748
Epoch 3/10
108/108 ━━━━━━━━━━━━━━━━━━━━ 19s 173ms/step - accuracy: 0.6257 - loss: 0.9940 - val_accuracy: 0.6382 - val_loss: 0.9523
Epoch 4/10
108/108 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.5625 - loss: 0.9802 - val_accuracy: 0.6118 - val_loss: 0.9867
Epoch 5/10
108/108 ━━━━━━━━━━━━━━━━━━━━ 18s 170ms/step - accuracy: 0.6842 - loss: 0.7996 - val_accuracy: 0.6310 - val_loss: 0.9934
Epoch 6/10
108/108 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.7188 - loss: 0.5296 - val_accuracy: 0.6358 - val_loss: 0.9976
Epoch 7/10
108/108 ━━━━━━━━━━━━━━━━━━━━ 18s 168ms/step - accuracy: 0.8223 - loss: 0.4854 - val_accuracy: 0.6142 - val_loss: 1.0347
Epoch 8/10
108/108

In [None]:
# Analyse loss and accuracy measurements

loss, accuracy = model.evaluate(validation_generator)
print(f'Validation loss: {loss}')
print(f'Validation accuracy: {accuracy}')

27/27 ━━━━━━━━━━━━━━━━━━━━ 1s 44ms/step - accuracy: 0.6370 - loss: 1.3624
Validation loss: 1.367043137550354
Validation accuracy: 0.6255813837051392
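
The training output shows 108 steps per epoch, while `model.evaluate` ran 27 steps over the validation data. The step counts follow from dividing the sample counts by the batch size, rounding down in one case and up in the other, as this small sketch shows using the sample counts reported by the generators.

```python
import math

batch_size = 32
train_samples = 3457       # "Found 3457 images" from the training generator
validation_samples = 860   # "Found 860 images" from the validation generator

# Floor division, as used for steps_per_epoch and validation_steps in model.fit
steps_per_epoch = train_samples // batch_size
validation_steps = validation_samples // batch_size

# Ceiling division covers every image, including the final partial batch
full_validation_steps = math.ceil(validation_samples / batch_size)
skipped = validation_samples - validation_steps * batch_size

print(f'steps_per_epoch: {steps_per_epoch}')
print(f'validation_steps in fit: {validation_steps}')
print(f'steps needed to cover all validation images: {full_validation_steps}')
print(f'validation images not covered with floor division: {skipped}')
```

With floor division, the final partial batch of validation images may be left out of each validation pass during training, which is one reason the per-epoch `val_accuracy` and the figure from `model.evaluate` need not agree exactly.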


#### Results

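A validation accuracy of about 0.63 (0.6256) tells us that the model is correct more often than not, but it is not accurate enough to be relied upon consistently. During training, accuracy on the training data climbed to 0.82 by epoch 7 while validation accuracy stayed between roughly 0.61 and 0.64 and validation loss rose, which points to overfitting.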
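
The per-epoch figures in the training output give a fuller picture than the single accuracy value. The sketch below copies the metrics from the seven complete log lines by hand (the log is cut off at epoch 8) and reports the gap between training and validation accuracy. It also checks whether the training accuracy on the 1-second, even-numbered epochs matches a whole number of correct images in a single batch of 32.

```python
# (epoch, accuracy, val_accuracy, val_loss) copied from the training output
log = [
    (1, 0.3605, 0.5529, 1.0848),
    (2, 0.4062, 0.5505, 1.0748),
    (3, 0.6257, 0.6382, 0.9523),
    (4, 0.5625, 0.6118, 0.9867),
    (5, 0.6842, 0.6310, 0.9934),
    (6, 0.7188, 0.6358, 0.9976),
    (7, 0.8223, 0.6142, 1.0347),
]
batch_size = 32

for epoch, acc, val_acc, val_loss in log:
    print(f'Epoch {epoch}: train-val accuracy gap {acc - val_acc:+.4f}, val_loss {val_loss:.4f}')

# Epoch with the lowest validation loss among the logged epochs
best_epoch = min(log, key=lambda row: row[3])[0]
print(f'Lowest val_loss at epoch {best_epoch}')

# Even-numbered epochs: is the accuracy k/32 for a whole number k (to 4 decimal places)?
single_batch_matches = []
for epoch, acc, _, _ in log:
    if epoch % 2 == 0:
        correct = round(acc * batch_size)
        matches = abs(correct / batch_size - acc) < 1e-4
        single_batch_matches.append(matches)
        print(f'Epoch {epoch}: {acc} is about {correct}/{batch_size}: {matches}')
```

Among the logged epochs, validation loss is lowest at epoch 3 and rises afterwards while training accuracy keeps climbing, a pattern usually read as overfitting. The even-numbered epochs each match a single batch of 32, which suggests they may have processed only one batch; if so, one thing worth trying is omitting `steps_per_epoch` and `validation_steps` so that Keras infers them from the generators.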

In [80]:
# Test model on image of sunflower

img_path = 'flowers/sunflower/6953297_8576bf4ea3.jpg'
img = image.load_img(img_path, target_size=target_size)
img_array = image.img_to_array(img) / 255.0
img_array = np.expand_dims(img_array, axis=0)
predictions = model.predict(img_array)
predicted_class = np.argmax(predictions, axis=1)
print(f'Predicted class: {predicted_class}')

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 26ms/step
Predicted class: [3]


In [81]:
# Test model on image of daisy

img_path = 'flowers/daisy/144603918_b9de002f60_m.jpg'
img = image.load_img(img_path, target_size=target_size)
img_array = image.img_to_array(img) / 255.0
img_array = np.expand_dims(img_array, axis=0)
predictions = model.predict(img_array)
predicted_class = np.argmax(predictions, axis=1)
print(f'Predicted class: {predicted_class}')

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step
Predicted class: [4]
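
The predictions above are class indices. They can be mapped back to class names by inverting the `class_indices` dictionary printed earlier. The sketch below hard-codes that dictionary and the two predicted indices so that it runs without the model; in the notebook, `train_generator.class_indices` and `predicted_class[0]` would take their place.

```python
# Mapping printed by train_generator.class_indices earlier in the notebook
class_indices = {'daisy': 0, 'dandelion': 1, 'rose': 2, 'sunflower': 3, 'tulip': 4}

# Invert the mapping so an index looks up its class name
index_to_class = {index: name for name, index in class_indices.items()}

# (true class, predicted index) for the two test images above
results = [('sunflower', 3), ('daisy', 4)]

for true_class, predicted_index in results:
    predicted_name = index_to_class[predicted_index]
    outcome = 'correct' if predicted_name == true_class else 'incorrect'
    print(f'True: {true_class}, predicted: {predicted_name} ({outcome})')
```

Read this way, the sunflower image is classified correctly and the daisy image is misclassified as a tulip. Both images come from the dataset folders, so they may have been part of the training data rather than unseen examples.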
