Next, we explore how a Convolutional Neural Network (CNN) performs in classifying images into 25 categories.

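First, we use `tf.keras.utils.image_dataset_from_directory` to build a `tf.data.Dataset` that reads the images from disk in batches and infers each label from its subdirectory name. Since the dataset consists of only 150 images, we apply **k-fold cross-validation** to evaluate classification performance instead of relying on a single train/test split.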

In [1]:
from pathlib import Path
import tensorflow as tf

data_dir = Path("./images_grouped")

whole_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    labels='inferred',
    label_mode='int',
    seed=42,
    image_size=(200, 180),  # (height, width)
    batch_size=32)

Found 150 files belonging to 25 classes.


However, `KFold` from the scikit-learn library splits data by array index and does not work directly on a `tf.data.Dataset`, so we convert the data to NumPy arrays. Additionally, we rescale the pixel values from the [0, 255] range to the [0, 1] interval for better training stability.

In [2]:
import numpy as np

X = []
y = []

# Iterate over the batched dataset and collect each batch as NumPy arrays
for images, labels in whole_ds:
    X.append(images.numpy())
    y.append(labels.numpy())

X = np.concatenate(X)
y = np.concatenate(y)

X = X / 255.0
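
The conversion step can be checked without the image files. The following is a minimal sketch that assumes synthetic stand-in batches (random pixels and labels, not the real dataset) with the same shapes as the batches yielded by `whole_ds`. The names `batches`, `X_demo`, and `y_demo` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the batched dataset: 150 images in batches of 32
# (the last batch is smaller), with pixel values in [0, 255].
batch_sizes = [32, 32, 32, 32, 22]
batches = [
    (rng.integers(0, 256, size=(n, 200, 180, 3), dtype=np.uint8).astype(np.float32),
     rng.integers(0, 25, size=n))
    for n in batch_sizes
]

# Stack the per-batch arrays along the sample axis, then rescale to [0, 1]
X_demo = np.concatenate([images for images, _ in batches])
y_demo = np.concatenate([labels for _, labels in batches])
X_demo = X_demo / 255.0

print(X_demo.shape, y_demo.shape)                 # (150, 200, 180, 3) (150,)
print(X_demo.min() >= 0.0, X_demo.max() <= 1.0)   # True True
```

Concatenating along the first axis is what turns the five batches back into a single array of 150 samples that can be indexed by fold.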

To build the **architecture of the CNN**:

- We define the input layer with a shape that matches our image dimensions: height = 200, width = 180, color channels = 3.

- We add three `tf.keras.layers.Conv2D` layers with the ReLU activation function to extract relevant features from the images.

- To reduce spatial dimensions and computation, we include `tf.keras.layers.MaxPooling2D` layers for downsampling.

- The output from the final convolutional block is flattened using `tf.keras.layers.Flatten` and passed to a fully connected (dense) layer.

- Finally, we use a `tf.keras.layers.Dense` layer with a softmax activation function, suitable for multi-class classification, and set the number of output units to 25, which corresponds to the number of classes.
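
To make the role of the softmax output concrete, here is a minimal NumPy sketch of softmax together with the sparse categorical cross-entropy loss that is used for training later in this section. It is an illustration, not the Keras implementation. The logits and labels are hypothetical, chosen so that all 25 classes receive equal probability.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sparse_categorical_crossentropy(probs, labels):
    # Mean negative log-probability assigned to the true integer label
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked).mean()

num_classes = 25
logits = np.zeros((4, num_classes))   # hypothetical: all logits equal
labels = np.array([0, 7, 13, 24])     # hypothetical integer labels

probs = softmax(logits)
loss = sparse_categorical_crossentropy(probs, labels)

print(probs.sum(axis=1))              # each row sums to 1
print(loss, np.log(num_classes))      # both are about 3.22
```

A model that spreads its probability uniformly over 25 classes therefore has a loss of ln(25), about 3.22, which is a useful reference point when reading the loss values in the training log.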

In [3]:
def create_model():
    
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(200, 180, 3)),
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(25, activation='softmax')])
    
    return model

In [4]:
create_model().summary()
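
The shapes and parameter counts that the summary reports can also be derived by hand. The sketch below assumes the Keras defaults for the layers above: `Conv2D` with `'valid'` padding and stride 1, and `MaxPooling2D` with a stride equal to its pool size. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    # 'valid' padding with stride 1 trims kernel - 1 pixels
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling; an odd trailing row or column is dropped
    return size // pool

h, w, c = 200, 180, 3
params = 0
layers = [("conv", 32), ("pool", None), ("conv", 64), ("pool", None), ("conv", 64)]

for kind, filters in layers:
    if kind == "conv":
        # 3x3 kernel weights per input channel per filter, plus one bias per filter
        params += 3 * 3 * c * filters + filters
        h, w, c = conv_out(h), conv_out(w), filters
    else:
        h, w = pool_out(h), pool_out(w)
    print(kind, (h, w, c))

flat = h * w * c
params += flat * 64 + 64   # Dense(64): weights + biases
params += 64 * 25 + 25     # Dense(25): weights + biases

print("flattened features:", flat)    # 120704
print("total parameters:", params)    # 7783065
```

Almost all of the parameters (7,725,120 of 7,783,065) sit in the first dense layer, because the final convolutional block is flattened into 120,704 features.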

Next, we initialize 5-fold cross-validation with shuffling, using `KFold` from the `scikit-learn` library. For each split, we build and train a fresh copy of the previously defined model and store the **validation accuracy** from the final epoch in a dictionary.

In [5]:
from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=42)

val_accuracy = {}

for fold, (train_idx, val_idx) in enumerate(kf.split(X)):
    
    print(f'Fold {fold+1}')

    model = create_model()
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    epochs = 10
    
    # fit() returns a History object holding the per-epoch metrics
    history = model.fit(
        X[train_idx], y[train_idx],
        validation_data=(X[val_idx], y[val_idx]),
        epochs=epochs,
        batch_size=32)

    # Keep the validation accuracy from the final epoch
    val_accuracy[f'Fold {fold+1}'] = history.history['val_accuracy'][-1]

Fold 1
Epoch 1/10
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 760ms/step - accuracy: 0.0644 - loss: 3.9451 - val_accuracy: 0.3000 - val_loss: 3.0872
Epoch 2/10
4/4 ━━━━━━━━━━━━━━━━━━━━ 2s 594ms/step - accuracy: 0.2631 - loss: 2.9661 - val_accuracy: 0.3000 - val_loss: 2.7665
Epoch 3/10
4/4 ━━━━━━━━━━━━━━━━━━━━ 2s 604ms/step - accuracy: 0.5160 - loss: 2.3974 - val_accuracy: 0.4667 - val_loss: 1.9543
Epoch 4/10
4/4 ━━━━━━━━━━━━━━━━━━━━ 2s 595ms/step - accuracy: 0.6019 - loss: 1.4551 - val_accuracy: 0.6667 - val_loss: 1.2423
Epoch 5/10
4/4 ━━━━━━━━━━━━━━━━━━━━ 2s 618ms/step - accuracy: 0.8406 - loss: 0.6667 - val_accuracy: 0.8000 - val_loss: 0.6276
Epoch 6/10
4/4 ━━━━━━━━━━━━━━━━━━━━ 2s 627ms/step - accuracy: 0.9690 - loss: 0.2531 - val_accuracy: 0.9000 - val_loss: 0.3545
Epoch 7/10
...

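During training, the accuracy on both the training and validation sets increases steadily, indicating that the model is learning effectively. In Fold 1, for example, training accuracy rises from 0.06 to 0.97 and validation accuracy from 0.30 to 0.90 over the first six epochs, while the validation loss falls from 3.09 to 0.35.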
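
The fold structure behind these runs can be illustrated with NumPy alone. The sketch below is a simplified stand-in for a shuffled k-fold split, not scikit-learn's implementation, so its indices will differ from those produced by `KFold` with `random_state=42`. The function name `kfold_indices` is hypothetical.

```python
import math
import numpy as np

def kfold_indices(n_samples, n_splits, seed):
    # Shuffle all sample indices once, then cut them into n_splits folds
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_splits)
    for i in range(n_splits):
        val_idx = folds[i]
        # Every fold except the i-th is used for training
        train_idx = np.concatenate([folds[j] for j in range(n_splits) if j != i])
        yield train_idx, val_idx

splits = list(kfold_indices(n_samples=150, n_splits=5, seed=42))
for fold, (train_idx, val_idx) in enumerate(splits):
    print(f"Fold {fold + 1}: train={len(train_idx)}, val={len(val_idx)}")

# 120 training images at batch size 32 give 4 steps per epoch
steps_per_epoch = math.ceil(120 / 32)
print("steps per epoch:", steps_per_epoch)   # 4
```

With 150 images and 5 folds, each model trains on 120 images and is validated on the remaining 30. This matches the `4/4` step counter in the training log.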

In [6]:
val_accuracy

{'Fold 1': 0.9666666388511658,
 'Fold 2': 0.9333333373069763,
 'Fold 3': 0.9666666388511658,
 'Fold 4': 0.9666666388511658,
 'Fold 5': 1.0}

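The mean cross-validation accuracy across the 5 folds is 0.967, indicating good generalization on this small dataset. Since each fold holds only 30 validation images, a single misclassification shifts a fold's accuracy by about 3.3 percentage points, so the per-fold values between 0.933 and 1.0 correspond to at most two errors per fold.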

In [7]:
print(f'Mean 5-Fold CV accuracy: {np.array(list(val_accuracy.values())).mean()}')

Mean 5-Fold CV accuracy: 0.9666666507720947
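
To put the fold accuracies in perspective, they can be converted back into counts of misclassified images. This sketch takes the reported accuracies and assumes 30 validation images per fold (150 images split into 5 folds). The variable names are hypothetical.

```python
import numpy as np

# Final-epoch validation accuracies reported above
fold_acc = np.array([0.9666666388511658, 0.9333333373069763,
                     0.9666666388511658, 0.9666666388511658, 1.0])
val_size = 30  # assumption: 150 images / 5 folds

# Number of misclassified validation images in each fold
errors = np.rint((1.0 - fold_acc) * val_size).astype(int)
print("errors per fold:", errors.tolist())   # [1, 2, 1, 1, 0]

# Accuracy pooled over all 150 validation predictions
pooled = 1 - errors.sum() / (val_size * len(fold_acc))
print("pooled accuracy:", pooled)

# Spread of the per-fold accuracies
print("mean:", fold_acc.mean(), "std:", fold_acc.std())
```

In total, 5 of the 150 validation predictions are wrong (145/150 ≈ 0.967), and the standard deviation across folds is about 0.02.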


This result is consistent with the test accuracy obtained using the PCA-FDA-based classifier, suggesting that the simpler, projection-based approach can achieve performance comparable to more complex deep learning models under the given conditions.