# **Fine-Grained Classification**
### Elsun Nabatov

### **Introduction**

This project develops a neural network that identifies specific airplane models from a dataset covering 100 aircraft variants. It combines transfer learning with a pre-trained EfficientNetB7 backbone, data augmentation, and fine-tuning to tackle fine-grained classification, where categories differ only in subtle visual details. The emphasis is on fine-tuning and optimization: finding out how far a standard pre-trained architecture can be pushed on nearly indistinguishable classes.

### **Data Preparation**

The first step was to set up the environment and load the dataset from Google Drive. The dataset contains airplane images split into training, validation, and test sets: the model is trained on the training set, hyperparameters are tuned on the validation set, and final performance is measured on the test set.

In [50]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [51]:
import pandas as pd
import os

# Directory paths
dataset_dir = '/content/drive/My Drive/dataset'
data_dir = '/content/drive/My Drive/data'
images_dir = os.path.join(data_dir, 'images')  # The directory where images are stored

In [52]:
# Load .csv files
train_csv = pd.read_csv(os.path.join(dataset_dir, 'train.csv'))
val_csv = pd.read_csv(os.path.join(dataset_dir, 'val.csv'))
test_csv = pd.read_csv(os.path.join(dataset_dir, 'test.csv'))

In [53]:
# Load class information
families = pd.read_csv(os.path.join(data_dir, 'families.txt'), header=None)
variants = pd.read_csv(os.path.join(data_dir, 'variants.txt'), header=None)
manufacturers = pd.read_csv(os.path.join(data_dir, 'manufacturers.txt'), header=None)

### **Data Preprocessing and Augmentation**

To enhance the model's ability to generalize from the training data and improve its performance on unseen data, image data augmentation techniques were applied. This process included random transformations such as rotation, width and height shifts, shear, zoom, and horizontal flipping. These transformations introduce variability in the training data without changing the labels, helping the model learn more robust features.

In [54]:
import pandas as pd
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.efficientnet import preprocess_input, EfficientNetB7

# Preprocessing and data augmentation
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

val_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

In [68]:
# Creating data generators
train_generator = train_datagen.flow_from_dataframe(
    dataframe=train_csv,
    directory=images_dir,
    x_col='filename',
    y_col='Classes',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)

val_generator = val_datagen.flow_from_dataframe(
    dataframe=val_csv,
    directory=images_dir,
    x_col='filename',
    y_col='Classes',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)


test_generator = test_datagen.flow_from_dataframe(
    dataframe=test_csv,
    directory=images_dir,
    x_col='filename',
    y_col='Classes',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    shuffle=False   # Important for test set to not shuffle data
)

Found 3334 validated image filenames belonging to 100 classes.
Found 3333 validated image filenames belonging to 100 classes.
Found 3333 validated image filenames belonging to 100 classes.


#### **Data Integrity Check**

To ensure the integrity of the dataset and avoid issues during training, a check was performed to confirm that all filenames listed in the CSV files correspond to actual image files in the dataset directories.

In [56]:
import os
import pandas as pd

dataset_dir = '/content/drive/My Drive/dataset'  # dataset directory path
data_dir = '/content/drive/My Drive/data'  # data directory path
images_dir = os.path.join(data_dir, 'images')  # The directory where images are stored

# Load the CSV files
train_csv = pd.read_csv(os.path.join(dataset_dir, 'train.csv'))
val_csv = pd.read_csv(os.path.join(dataset_dir, 'val.csv'))
test_csv = pd.read_csv(os.path.join(dataset_dir, 'test.csv'))

# Function to check for non-matching files
def find_non_matching_filenames(df, images_dir):
    non_matching_files = []
    for index, row in df.iterrows():
        image_path = os.path.join(images_dir, row['filename'])
        if not os.path.isfile(image_path):
            non_matching_files.append(row['filename'])
    return non_matching_files

# Find non-matching filenames in each set
non_matching_train = find_non_matching_filenames(train_csv, images_dir)
non_matching_val = find_non_matching_filenames(val_csv, images_dir)
non_matching_test = find_non_matching_filenames(test_csv, images_dir)

# Print out the non-matching filenames
print(f"Non-matching training filenames: {non_matching_train}")
print(f"Non-matching validation filenames: {non_matching_val}")
print(f"Non-matching test filenames: {non_matching_test}")

Non-matching training filenames: []
Non-matching validation filenames: []
Non-matching test filenames: []
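
A complementary check, not part of the original notebook, is to confirm that no filename appears in more than one split, since any overlap would inflate the validation and test scores. The sketch below is a minimal illustration in plain Python; `split_overlaps` is a hypothetical helper name, and the toy filenames are made up.

```python
def split_overlaps(train_files, val_files, test_files):
    """Return the filenames shared between each pair of splits."""
    train, val, test = set(train_files), set(val_files), set(test_files)
    return {
        "train_val": train & val,
        "train_test": train & test,
        "val_test": val & test,
    }

# Toy example with made-up filenames; one file leaks from train into test
overlaps = split_overlaps(
    ["a.jpg", "b.jpg", "c.jpg"],
    ["d.jpg", "e.jpg"],
    ["f.jpg", "a.jpg"],
)
for pair, shared in overlaps.items():
    print(pair, sorted(shared))
```

In the notebook, the same function could be called with `train_csv['filename']`, `val_csv['filename']`, and `test_csv['filename']`.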


### **Model Development and Training**

With the data prepared and preprocessed, the next step was to build the classification model. I chose the EfficientNetB7 architecture for its strong accuracy relative to its size. Keras provides it with weights pre-trained on the ImageNet dataset, which makes it a convenient starting point for transfer learning.

In [57]:
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input, # EfficientNet's own preprocess input function
    rotation_range=30,   # Degree range for random rotations
    width_shift_range=0.2,  # Range (as a fraction of total width) for random horizontal shifts
    height_shift_range=0.2,  # Range (as a fraction of total height) for random vertical shifts
    shear_range=0.2,  # Shear intensity (shear angle in degrees)
    zoom_range=0.2,  # Range for random zoom
    horizontal_flip=True,  # Randomly flip inputs horizontally
    fill_mode='nearest'  # Strategy to fill newly created pixels
)

By using transfer learning, the model starts from the pre-trained EfficientNetB7 weights, which greatly reduces the time and resources needed compared with training from scratch. I added custom layers on top: a Dense layer with 1024 units, a Dropout layer to reduce overfitting, and a final output layer with a softmax activation, suited to multi-class classification.

In [58]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB7
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import CategoricalCrossentropy

# Load the ImageNet-pretrained backbone without its classification head.
# pooling='max' applies global max pooling, so the backbone outputs one feature vector per image.
base_model = EfficientNetB7(include_top=False, weights='imagenet', input_shape=(224, 224, 3), pooling='max')

# Add a custom classification head on top of the backbone
model = keras.Sequential([
    base_model,
    keras.layers.Dense(1024, activation='relu'),
    keras.layers.Dropout(0.5),  # randomly drop half of the activations during training
    keras.layers.Dense(len(train_generator.class_indices), activation='softmax')  # one output per class
])

model.compile(optimizer=Adam(learning_rate=0.001),
              loss=CategoricalCrossentropy(),
              metrics=['accuracy'])
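
As a rough sanity check on the size of the custom head, the parameter counts of its two Dense layers can be worked out by hand. This sketch assumes that the pooled EfficientNetB7 output has 2560 features and that there are 100 classes, as reported by the data generators; `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units

backbone_features = 2560   # assumed width of the pooled EfficientNetB7 output
hidden_units = 1024
num_classes = 100

hidden_layer = dense_params(backbone_features, hidden_units)
output_layer = dense_params(hidden_units, num_classes)

print(f"Dense(1024): {hidden_layer:,} parameters")
print(f"Dense(100):  {output_layer:,} parameters")
print(f"Head total:  {hidden_layer + output_layer:,} parameters")
```

The Dropout layer adds no parameters, so the head contributes about 2.7 million trainable parameters on top of the backbone.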


### **Training Process**

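The model was trained for 10 epochs with ModelCheckpoint and EarlyStopping callbacks, both monitoring validation loss. Because the early-stopping patience (10) equals the number of epochs, early stopping could not end this run early; the checkpoint callback is what saved the best model.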

In [59]:
from tensorflow.keras.models import load_model

# Load the previously saved best checkpoint from Drive
model = load_model('/content/drive/My Drive/best_model.h5')

During training, the model reached an accuracy of up to 88.09% on the training set and 69.82% on the validation set. The model was clearly learning from the training data, but the gap of roughly 18 percentage points between the two figures also points to overfitting.

In [62]:
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

checkpoint = ModelCheckpoint('/content/drive/My Drive/best_model.h5', save_best_only=True, monitor='val_loss', mode='min')
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

history = model.fit(
    train_generator,
    validation_data=val_generator,
    epochs=10,
    callbacks=[checkpoint, early_stop]
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
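
The interaction between `patience` and the number of epochs can be illustrated with a small simulation of patience-based early stopping. This is a simplified sketch of the idea rather than Keras's own implementation, and the loss values are made up for illustration; `early_stop_epoch` is a hypothetical helper.

```python
def early_stop_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop early, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Worst case for a 10-epoch run: the loss never improves after epoch 1
never_improves = [1.0] + [2.0] * 9
print(early_stop_epoch(never_improves, patience=10))  # no early stop within 10 epochs

# A shorter patience does trigger on a made-up loss curve
print(early_stop_epoch([1.0, 0.9, 0.95, 0.97, 0.99, 0.8], patience=3))
```

With a patience of 10, at least 11 epochs are needed before a stop can occur, so a smaller patience would be required for early stopping to matter in a 10-epoch run.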


In [69]:
# Evaluate on validation set
val_loss, val_accuracy = model.evaluate(val_generator)
print(f"Validation Loss: {val_loss}, Validation Accuracy: {val_accuracy}")

# Evaluate on test set
test_generator.reset()
test_loss, test_accuracy = model.evaluate(test_generator)
print(f"Test Loss: {test_loss}, Test Accuracy: {test_accuracy}")

Validation Loss: 1.5105085372924805, Validation Accuracy: 0.6369637250900269
Test Loss: 1.473116397857666, Test Accuracy: 0.6483648419380188


### **Hyperparameter Tuning and Model Optimization**

After evaluating my model on both the validation and test sets, I observed a validation accuracy of 63.70% and a test accuracy of 64.84%. Although these results were promising, they indicated a need for further improvement to meet the project's accuracy benchmarks. Here’s a detailed account of my approach to hyperparameter tuning and model optimization, aiming for higher performance.

Data Augmentation: I strengthened the augmentation pipeline by increasing the rotation range from 30 to 40 degrees and adding random brightness adjustment (a factor between 0.8 and 1.2), while keeping the existing shift, shear, zoom, and horizontal-flip settings, to make the model more robust to varied image conditions.

Learning Rate Scheduler: I implemented a learning rate scheduler, starting with an initial rate of 0.0001 and applying a decay factor of 0.5 every 10 epochs, to enable more precise model weight adjustments and improve convergence.

Batch Size Adjustment: I reduced the batch size from 32 to 16. Smaller batches give roughly twice as many weight updates per epoch, with noisier gradient estimates that can help generalization.

Model Compilation: The model was recompiled with the Adam optimizer and an adjusted learning rate, using categorical crossentropy as the loss function, to optimize performance for our multi-class classification task.

Fine-Tuning: I unfroze the last 20 layers of the EfficientNetB7 model for fine-tuning, aiming to enhance the model's ability to discern the subtle differences between airplane models and improve overall accuracy.

I then trained the model for up to 50 additional epochs, using ModelCheckpoint to save the best model and EarlyStopping to halt training if the validation loss did not improve for 10 consecutive epochs. Accuracy **improved to 77.89%** by the 13th epoch. The lower, scheduled learning rate likely contributed to smaller, more gradual weight updates during fine-tuning.
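
To make the learning rate schedule concrete, the sketch below re-implements the step decay described above in plain Python and evaluates it at a few epochs. Epoch indices are zero-based, as passed by Keras's `LearningRateScheduler`. Because the formula uses `1 + epoch`, the first halving takes effect at index 9, i.e. the 10th epoch.

```python
import math

def step_decay(epoch, initial_lr=0.0001, drop=0.5, epochs_drop=10):
    """Halve the learning rate every `epochs_drop` epochs (zero-based epoch index)."""
    return initial_lr * drop ** math.floor((1 + epoch) / epochs_drop)

# Print the learning rate at a few zero-based epoch indices
for epoch in (0, 8, 9, 12, 19, 29):
    print(f"epoch index {epoch:2d}: lr = {step_decay(epoch):.2e}")
```

At the 13th epoch (index 12), where the training log ends, the learning rate under this schedule is 0.00005.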

In [115]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.efficientnet import preprocess_input
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, LearningRateScheduler
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import CategoricalCrossentropy

# Load the pre-trained model
model_path = '/content/drive/My Drive/best_model.h5'  # path to where model is stored
model = load_model(model_path)

# Data Augmentation
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest',
    brightness_range=[0.8, 1.2]
)



val_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

# Using flow_from_dataframe
train_generator = train_datagen.flow_from_dataframe(
    dataframe=train_csv,
    directory=images_dir, 
    x_col='filename',  
    y_col='Classes', 
    target_size=(224, 224),
    batch_size=16,
    class_mode='categorical'
)

val_generator = val_datagen.flow_from_dataframe(
    dataframe=val_csv,
    directory=images_dir,
    x_col='filename',
    y_col='Classes',
    target_size=(224, 224),
    batch_size=16,
    class_mode='categorical'
)


# Learning Rate Scheduler
def step_decay(epoch):
    initial_lr = 0.0001
    drop = 0.5
    epochs_drop = 10.0
    lr = initial_lr * np.power(drop, np.floor((1+epoch)/epochs_drop))
    return lr

lr_scheduler = LearningRateScheduler(step_decay)

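# Unfreeze the last 20 layers of the EfficientNetB7 backbone for fine-tuning.
# The loaded model is a Sequential wrapper, so model.layers[-20:] would only cover
# its four top-level layers; the backbone is assumed to be the first of them.
base_model = model.layers[0]
for layer in base_model.layers[-20:]:
    # Leave BatchNormalization layers untouched
    if not isinstance(layer, keras.layers.BatchNormalization):
        layer.trainable = True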

model.compile(optimizer=Adam(learning_rate=0.0001),
              loss=CategoricalCrossentropy(),
              metrics=['accuracy'])

# Callbacks
checkpoint = ModelCheckpoint('/content/drive/My Drive/best_model_updated.h5', save_best_only=True, monitor='val_loss', mode='min')
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# Training
history = model.fit(
    train_generator,
    epochs=50,  # epoch counts
    validation_data=val_generator,
    callbacks=[checkpoint, early_stop, lr_scheduler]
)


Found 3334 validated image filenames belonging to 100 classes.
Found 3333 validated image filenames belonging to 100 classes.
Epoch 1/50

Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
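
The effect of the smaller batch size on the number of weight updates can be checked with simple arithmetic. The sketch uses the 3334 training images reported by the generator; `steps_per_epoch` is a hypothetical helper.

```python
def steps_per_epoch(num_samples, batch_size):
    """Number of batches per epoch, counting a final partial batch."""
    return -(-num_samples // batch_size)  # ceiling division

train_images = 3334  # training images reported by flow_from_dataframe

steps_32 = steps_per_epoch(train_images, 32)
steps_16 = steps_per_epoch(train_images, 16)
print(f"batch size 32: {steps_32} updates per epoch")
print(f"batch size 16: {steps_16} updates per epoch")
```

Halving the batch size therefore roughly doubles the number of optimizer steps taken in each epoch.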


### **Final Evaluation on Test Set**

After strengthening the data augmentation to better reflect real-world variability, I retrained the model with a learning rate scheduler that starts at 0.0001 and halves the learning rate every 10 epochs, so that weight updates become progressively finer over time. For the final evaluation, the test generator is rebuilt with the same preprocessing and a batch size of 16.

In [116]:
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

test_generator = test_datagen.flow_from_dataframe(
    dataframe=test_csv,
    directory=images_dir,
    x_col='filename',
    y_col='Classes',  # column holding the class labels
    target_size=(224, 224),
    batch_size=16,
    class_mode='categorical',  # one-hot encoded labels
    shuffle=False  # keep file order so predictions align with test_generator.classes
)


Found 3333 validated image filenames belonging to 100 classes.


By reducing the batch size to 16 and fine-tuning the last 20 layers of the EfficientNetB7 model, I aimed for finer-grained optimization and better sensitivity to the subtle visual differences between variants. The final evaluation on the test set gave a test accuracy of 77.62%, up from 64.84% before tuning.

### **Model Performance Analysis**

I then evaluated the tuned model on the test set. This section analyzes its performance in terms of accuracy, precision, recall, and the confusion matrix, followed by a per-class classification report that shows how well it separates closely related airplane models.

In [117]:
# Evaluate the model
test_loss, test_accuracy = model.evaluate(test_generator)

print(f"Test Loss: {test_loss}, Test Accuracy: {test_accuracy}")

Test Loss: 1.1010876893997192, Test Accuracy: 0.7761776447296143


### **Accuracy, Precision, Recall and Confusion Matrix**

The model achieved an accuracy of 77.62% on the test set, the fraction of test images assigned the correct airplane model. Weighted-average precision was 78.90%, which reflects how often a predicted class was correct, and weighted-average recall was 77.62%, which reflects how many images of each class were found. Weighted-average recall is mathematically equal to accuracy, which is why the two values coincide. With precision and recall this close, the model is not systematically trading one for the other overall, although per-class results vary considerably.

In [120]:
predictions = model.predict(test_generator)
predicted_classes = np.argmax(predictions, axis=1)



In [121]:
true_classes = test_generator.classes

In [122]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, confusion_matrix

# Accuracy
accuracy = accuracy_score(true_classes, predicted_classes)
print(f'Accuracy: {accuracy}')

# Precision
precision = precision_score(true_classes, predicted_classes, average='weighted')  # per-class precision weighted by support; use 'macro' for an unweighted mean
print(f'Precision: {precision}')

# Recall
recall = recall_score(true_classes, predicted_classes, average='weighted')  # per-class recall weighted by support; use 'macro' for an unweighted mean
print(f'Recall: {recall}')

# Confusion Matrix
conf_matrix = confusion_matrix(true_classes, predicted_classes)
print(f'Confusion Matrix:\n{conf_matrix}')

Accuracy: 0.7761776177617762
Precision: 0.7889895954133672
Recall: 0.7761776177617762
Confusion Matrix:
[[27  0  0 ...  0  0  0]
 [ 0 25  0 ...  0  3  0]
 [ 1  0 28 ...  0  0  0]
 ...
 [ 0  1  0 ... 29  0  0]
 [ 0  0  0 ...  0 32  0]
 [ 0  0  0 ...  0  3 27]]
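
The identical accuracy and recall values in the output above are expected: when per-class recall is averaged with weights proportional to class support, the supports cancel and the result is the overall fraction of correct predictions. The sketch below shows this on a made-up toy example in plain Python; `weighted_scores` is a hypothetical helper, not the scikit-learn implementation.

```python
from collections import Counter

def weighted_scores(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall)."""
    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    # Classes that are never predicted contribute a precision of 0
    precision = sum(
        (support[c] / n) * (correct[c] / predicted[c])
        for c in support if predicted[c] > 0
    )
    recall = sum((support[c] / n) * (correct[c] / support[c]) for c in support)
    return accuracy, precision, recall

# Made-up labels for three classes
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]
accuracy, precision, recall = weighted_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

Macro averaging, which weights every class equally, would not have this property when class supports differ.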


### **Classification Report**

The classification report gives a per-class breakdown of precision, recall, and F1-score for each airplane model. In the portion of the report shown in this section, the model is strongest on classes such as the 737-200 (precision 0.97, recall 0.82) and the 737-900 (precision 0.83, recall 0.76), and weakest on closely related variants within a family, such as the 747-100 (recall 0.24) and the 747-300 (recall 0.33).

Precision and recall are not always balanced. The 747-400, for example, combines high recall (0.79) with low precision (0.54), which is consistent with other variants being misclassified as 747-400. Models such as the "Airbus A320" and "Airbus A340-300" were also classified well, while lower scores for others like "DC-3" and "MD-87" point to difficulties in capturing the fine-grained distinctions needed for accurate identification.
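
To see which classes are most often mistaken for each other, one option is to rank the off-diagonal entries of the confusion matrix. The sketch below is a minimal illustration on a made-up 3x3 matrix; `top_confusions` is a hypothetical helper, and rows are assumed to be true classes and columns predicted classes, as in scikit-learn's `confusion_matrix`.

```python
def top_confusions(conf_matrix, class_labels, k=3):
    """Return the k largest off-diagonal entries as (count, true, predicted)."""
    pairs = []
    for i, row in enumerate(conf_matrix):
        for j, count in enumerate(row):
            if i != j and count > 0:
                pairs.append((count, class_labels[i], class_labels[j]))
    # Largest counts first; ties keep their original order
    pairs.sort(key=lambda item: item[0], reverse=True)
    return pairs[:k]

# Made-up confusion matrix for three placeholder classes
toy_matrix = [
    [8, 2, 0],
    [5, 4, 1],
    [0, 3, 7],
]
toy_labels = ["A", "B", "C"]

for count, true_label, predicted_label in top_confusions(toy_matrix, toy_labels):
    print(f"{true_label} predicted as {predicted_label}: {count} times")
```

In the notebook, the same function could be applied to `conf_matrix` with the class labels taken from `test_generator.class_indices`.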

In [123]:
# Make predictions
predictions = model.predict(test_generator)
predicted_classes = np.argmax(predictions, axis=1)

# true labels for comparison
true_classes = test_generator.classes
class_labels = list(test_generator.class_indices.keys())  # Getting class labels from the generator

# Print the per-class classification report
from sklearn.metrics import classification_report

print(classification_report(true_classes, predicted_classes, target_names=class_labels))

                     precision    recall  f1-score   support

            707-320       0.66      0.82      0.73        33
            727-200       0.78      0.76      0.77        33
            737-200       0.97      0.82      0.89        34
            737-300       0.62      0.45      0.53        33
            737-400       0.74      0.76      0.75        33
            737-500       0.68      0.68      0.68        34
            737-600       0.64      0.82      0.72        33
            737-700       0.63      0.73      0.68        33
            737-800       0.70      0.62      0.66        34
            737-900       0.83      0.76      0.79        33
            747-100       0.57      0.24      0.34        33
            747-200       0.61      0.65      0.63        34
            747-300       0.55      0.33      0.42        33
            747-400       0.54      0.79      0.64        33
            757-200       0.65      0.76      0.70        34
            757-300    

### **Demo**

In [None]:
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import load_model
from tensorflow.keras.applications.efficientnet import preprocess_input
import numpy as np

# Load the pre-trained model
model_path = '/content/drive/My Drive/best_model_updated.h5'
model = load_model(model_path)

class_indices = train_generator.class_indices
# Invert the dictionary to map indices to class names
idx_to_class = {v: k for k, v in class_indices.items()}

def predict_single_image(image_path, model, idx_to_class):
    # Load and preprocess the image
    img = image.load_img(image_path, target_size=(224, 224))  # Adjusting target_size to match model's expected input
    img_array = image.img_to_array(img)
    img_array = np.expand_dims(img_array, axis=0)
    img_preprocessed = preprocess_input(img_array)

    # Predict
    predictions = model.predict(img_preprocessed)
    predicted_class_idx = np.argmax(predictions, axis=1)[0]
    predicted_class = idx_to_class[predicted_class_idx]
    confidence = np.max(predictions)

    return predicted_class, confidence

# Path to test image
image_path = '/content/drive/My Drive/test/image.jpg'
predicted_class, confidence = predict_single_image(image_path, model, idx_to_class)

print(f'This image is a {predicted_class} with confidence {confidence:.2f}')
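
For fine-grained classes, the top few candidates are often more informative than a single label, since sibling variants tend to receive similar probabilities. The sketch below ranks a probability vector in plain Python; `top_k_predictions` is a hypothetical helper, and the probabilities and class names are made up.

```python
def top_k_predictions(probabilities, idx_to_class, k=3):
    """Return the k most probable classes as (class_name, probability) pairs."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [(idx_to_class[i], probabilities[i]) for i in ranked[:k]]

# Made-up softmax output for three placeholder classes
toy_probabilities = [0.1, 0.6, 0.3]
toy_idx_to_class = {0: "class_a", 1: "class_b", 2: "class_c"}

for name, prob in top_k_predictions(toy_probabilities, toy_idx_to_class, k=2):
    print(f"{name}: {prob:.2f}")
```

With the demo above, the input would be `predictions[0]` from `model.predict` together with the `idx_to_class` mapping.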
