<h1 style="text-align: center;"> Image Classification using Deep and Hand-Crafted Feature Modeling </h1>
<h1 style="text-align: center;"> Computer Vision Laboratory Assignment 9 </h1>
<h2 style="text-align: center;"> 122CS0067 Amiya Chowdhury 27/03/2025 </h2>

<h3>1. Extract deep features from the pre-trained VGG-16 model and extract hand-crafted Histogram of Oriented Gradients (HOG) features for the MNIST dataset. Stack the deep features with the HOG features and model them using a random forest classifier to classify the MNIST dataset. Run the hybrid model 5 times and compute the mean accuracy. </h3>

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.datasets import mnist
from skimage.feature import hog
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score






In [2]:
# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess MNIST images for VGG16 (convert to RGB and resize)
x_train_vgg = np.array([tf.image.resize_with_pad(np.stack([img]*3, axis=-1), 32, 32).numpy() for img in x_train])
x_test_vgg = np.array([tf.image.resize_with_pad(np.stack([img]*3, axis=-1), 32, 32).numpy() for img in x_test])

# Scale pixel values to [0, 1] (note: VGG16's own preprocess_input uses BGR mean subtraction instead)
x_train_vgg = x_train_vgg / 255.0
x_test_vgg = x_test_vgg / 255.0

# Load pre-trained VGG16 model without the top classification layer
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))
feature_extractor = Model(inputs=base_model.input, outputs=tf.keras.layers.Flatten()(base_model.output))

# Extract deep features
train_features_vgg = feature_extractor.predict(x_train_vgg)
test_features_vgg = feature_extractor.predict(x_test_vgg)
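
As a sanity check on the size of the deep feature vector, the sketch below works out the flattened output length by hand. It assumes the standard VGG16 layout of five 2x2 max-pooling stages with 512 channels in the last convolutional block; `vgg16_flat_features` is a hypothetical helper written for this note, not part of Keras.

```python
def vgg16_flat_features(height, width, n_pools=5, last_channels=512):
    """Length of the flattened VGG16 (include_top=False) output for a given input size.

    Assumes five stride-2 pooling stages and 512 final channels (standard VGG16).
    """
    for _ in range(n_pools):
        # Each 2x2 max-pool with stride 2 halves the spatial size (floor division)
        height //= 2
        width //= 2
    return height * width * last_channels


print(vgg16_flat_features(32, 32))  # 512
```

Under these assumptions a 32x32 input collapses to a 1x1x512 map, so each image should contribute 512 deep features to the stacked vector.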





In [3]:
# Extract HOG features
def extract_hog_features(images):
    return np.array([hog(img, pixels_per_cell=(4, 4), cells_per_block=(2, 2), feature_vector=True) for img in images])

In [4]:
train_features_hog = extract_hog_features(x_train)
test_features_hog = extract_hog_features(x_test)

# Stack VGG16 and HOG features
x_train_combined = np.hstack((train_features_vgg, train_features_hog))
x_test_combined = np.hstack((test_features_vgg, test_features_hog))

# Run Random Forest model 5 times
accuracies = []
for _ in range(5):
    clf = RandomForestClassifier(n_estimators=100, random_state=None)
    clf.fit(x_train_combined, y_train)
    y_pred = clf.predict(x_test_combined)
    accuracies.append(accuracy_score(y_test, y_pred))

# Compute mean accuracy
mean_accuracy = np.mean(accuracies)
print(f"Mean Accuracy over 5 runs: {mean_accuracy:.4f}")

Mean Accuracy over 5 runs: 0.9786
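
Because `random_state=None` makes every run differ, it can help to report the spread alongside the mean. The following is a minimal sketch; `summarize_runs` is a hypothetical helper, and the accuracy values are placeholders for illustration only, not results from this notebook.

```python
import numpy as np


def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of a list of per-run accuracies."""
    accs = np.asarray(accuracies, dtype=float)
    # ddof=1 gives the sample standard deviation, suited to a handful of runs
    return accs.mean(), accs.std(ddof=1)


# Placeholder values for illustration only (not measured in this notebook)
example_accuracies = [0.90, 0.92, 0.91, 0.93, 0.89]
mean_acc, std_acc = summarize_runs(example_accuracies)
print(f"Mean accuracy: {mean_acc:.4f} +/- {std_acc:.4f} over {len(example_accuracies)} runs")
```

In the notebook itself, the `accuracies` list from the loop above could be passed in directly.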


In [13]:
train_features_hog.shape

(60000, 1296)
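
The 1296 columns follow from the HOG parameters used above. The sketch below reproduces the arithmetic, assuming the `skimage.feature.hog` default of 9 orientation bins and a block stride of one cell; `hog_feature_length` is a hypothetical helper, not a scikit-image function.

```python
def hog_feature_length(image_shape, pixels_per_cell, cells_per_block, orientations=9):
    """Expected length of a flattened HOG descriptor (block stride of one cell)."""
    # Number of whole cells along each axis
    cells_y = image_shape[0] // pixels_per_cell[0]
    cells_x = image_shape[1] // pixels_per_cell[1]
    # Blocks slide one cell at a time, so there are (cells - block_size + 1) positions
    blocks_y = cells_y - cells_per_block[0] + 1
    blocks_x = cells_x - cells_per_block[1] + 1
    values_per_block = cells_per_block[0] * cells_per_block[1] * orientations
    return blocks_y * blocks_x * values_per_block


print(hog_feature_length((28, 28), (4, 4), (2, 2)))  # 1296
```

For a 28x28 image this gives 7x7 cells, 6x6 block positions, and 36 values per block, matching the shape printed above.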

<h3>2. Extract deep features from the pre-trained VGG-16 model and extract hand-crafted Scale Invariant Feature Transform (SIFT) features for the MNIST dataset. Stack the deep features with the SIFT features and model them using a random forest classifier to classify the MNIST dataset. Run the hybrid model 5 times and compute the mean accuracy. </h3>

In [10]:
from skimage.feature import SIFT
from skimage.color import rgb2gray

In [27]:
import cv2

sift = cv2.SIFT_create()

def extract_sift_features(images):
    """Return one 128-D vector per image: the mean of its SIFT descriptors."""
    features = []
    for img in images:
        # MNIST images are already single-channel uint8, so no grayscale conversion is needed
        keypoints, descriptors = sift.detectAndCompute(img, None)
        if descriptors is None or len(descriptors) == 0:
            # No keypoints detected: fall back to a zero vector
            features.append(np.zeros(128, dtype=np.float32))
        else:
            # Average the variable number of descriptors into a fixed-length vector
            features.append(np.mean(descriptors, axis=0))
    return np.array(features)

sift_features_train = extract_sift_features(x_train)
print("Shape of sift_features:", sift_features_train.shape)

sift_features_test = extract_sift_features(x_test)
print("Shape of sift_features:", sift_features_test.shape)



Shape of sift_features: (60000, 128)
Shape of sift_features: (10000, 128)
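
SIFT returns a different number of descriptors for each image, which is why they are averaged into one 128-dimensional vector before stacking. The sketch below isolates that pooling step on synthetic arrays so it runs without OpenCV; `pool_descriptors` and the random "descriptors" are illustrative stand-ins, not output from `cv2.SIFT_create()`.

```python
import numpy as np


def pool_descriptors(descriptors, dim=128):
    """Mean-pool an (n_keypoints, dim) array into a single dim-length vector."""
    if descriptors is None or len(descriptors) == 0:
        # No keypoints: use a zero vector so every image still has a feature row
        return np.zeros(dim, dtype=np.float32)
    return np.mean(descriptors, axis=0).astype(np.float32)


rng = np.random.default_rng(0)
# Synthetic stand-ins: 5 keypoints, no keypoints, 2 keypoints
fake_descriptors = [
    rng.random((5, 128), dtype=np.float32),
    None,
    rng.random((2, 128), dtype=np.float32),
]
pooled = np.array([pool_descriptors(d) for d in fake_descriptors])
print(pooled.shape)  # (3, 128)
```

Mean pooling discards where each keypoint was found, which may partly explain why SIFT adds less than HOG on centred digit images; that interpretation is a guess rather than something measured here.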


In [28]:
train_features_sift = sift_features_train
test_features_sift = sift_features_test

# Stack VGG16 and SIFT features
x_train_combined2 = np.hstack((train_features_vgg, train_features_sift))
x_test_combined2 = np.hstack((test_features_vgg, test_features_sift))
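
`np.hstack` concatenates feature blocks column-wise, so every block must have one row per image. The following sketch adds a row-count check around it; `stack_features` is a hypothetical wrapper, and the 512-column deep block is an assumption based on VGG16's 1x1x512 output for 32x32 inputs.

```python
import numpy as np


def stack_features(*blocks):
    """Column-wise stack of feature blocks, checking that row counts agree."""
    row_counts = {block.shape[0] for block in blocks}
    if len(row_counts) != 1:
        raise ValueError(f"Feature blocks have mismatched row counts: {sorted(row_counts)}")
    return np.hstack(blocks)


# Dummy blocks standing in for deep (assumed 512-D) and SIFT (128-D) features
deep_block = np.zeros((4, 512), dtype=np.float32)
sift_block = np.ones((4, 128), dtype=np.float32)
combined = stack_features(deep_block, sift_block)
print(combined.shape)  # (4, 640)
```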

In [30]:
train_features_sift

array([[19.8       ,  0.        ,  0.        , ..., 30.        ,
         1.8       ,  1.2       ],
       [ 7.6       , 30.5       , 25.4       , ..., 26.2       ,
        18.8       , 13.6       ],
       [13.333333  ,  3.6666667 ,  0.33333334, ..., 28.5       ,
         4.        ,  0.33333334],
       ...,
       [28.8       , 36.4       ,  1.        , ..., 41.        ,
        17.4       ,  1.        ],
       [16.428572  ,  1.        ,  0.14285715, ..., 12.857142  ,
         1.4285715 ,  2.        ],
       [66.6       , 18.6       ,  4.6       , ..., 23.4       ,
         2.8       ,  0.2       ]], dtype=float32)

In [29]:
# Run Random Forest model 3 times (to reduce time taken)
accuracies2 = []
for _ in range(3):
    clf = RandomForestClassifier(n_estimators=100, random_state=None)
    clf.fit(x_train_combined2, y_train)
    y_pred = clf.predict(x_test_combined2)
    accuracies2.append(accuracy_score(y_test, y_pred))

# Compute mean accuracy
mean_accuracy2 = np.mean(accuracies2)
print(f"Mean Accuracy over 3 runs: {mean_accuracy2:.4f}")

Mean Accuracy over 3 runs: 0.9456


<h3>3. Extract deep features from the pre-trained VGG-16 model and extract hand-crafted SIFT and HOG features for the MNIST dataset. Stack the deep features with the HOG and SIFT features and model them using a random forest classifier to classify the MNIST dataset. Run the hybrid model 5 times and compute the mean accuracy. </h3>

In [32]:
# Deep, HOG, SIFT features combined
x_train_combined3 = np.hstack((train_features_vgg, train_features_sift, train_features_hog))
x_test_combined3 = np.hstack((test_features_vgg, test_features_sift, test_features_hog))

# Run Random Forest model 3 times
accuracies3 = []
for _ in range(3):
    clf = RandomForestClassifier(n_estimators=100, random_state=None)
    clf.fit(x_train_combined3, y_train)
    y_pred = clf.predict(x_test_combined3)
    accuracies3.append(accuracy_score(y_test, y_pred))

# Compute mean accuracy
mean_accuracy3 = np.mean(accuracies3)
print(f"Mean Accuracy over 3 runs: {mean_accuracy3:.4f}")


Mean Accuracy over 3 runs: 0.9789


<h3>4. Extract deep features from the pre-trained VGG-16 model and extract hand-crafted SIFT and HOG features for the MNIST dataset. Stack the deep features with the HOG and SIFT features and use PCA to transform them and reduce the dimension (test using different component values). Model the transformed features using a random forest classifier to classify the MNIST dataset. Run the hybrid model 5 times and compute the mean accuracy.</h3>

In [33]:
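from sklearn.decomposition import PCA
# PCA with 12, 10, and 8 components.
# Caveat: each call below fits an independent PCA, so the test set is projected
# onto axes learned from the test data rather than the axes the classifier was
# trained on. Reusing the training-set fit (fit on train, transform on test)
# would keep both sets in the same feature space.
x_train_reduced12 = PCA(n_components=12).fit_transform(x_train_combined3)
x_test_reduced12 = PCA(n_components=12).fit_transform(x_test_combined3)

x_train_reduced10 = PCA(n_components=10).fit_transform(x_train_combined3)
x_test_reduced10 = PCA(n_components=10).fit_transform(x_test_combined3)

x_train_reduced8 = PCA(n_components=8).fit_transform(x_train_combined3)
x_test_reduced8 = PCA(n_components=8).fit_transform(x_test_combined3)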


# Run Random Forest model 3 times
accuracies8 = []
for _ in range(3):
    clf = RandomForestClassifier(n_estimators=100, random_state=None)
    clf.fit(x_train_reduced8, y_train)
    y_pred = clf.predict(x_test_reduced8)
    accuracies8.append(accuracy_score(y_test, y_pred))

# Compute mean accuracy
mean_accuracy8 = np.mean(accuracies8)
print(f"Mean Accuracy over 3 runs (PCA 8): {mean_accuracy8:.4f}")

# Run Random Forest model 3 times
accuracies10 = []
for _ in range(3):
    clf = RandomForestClassifier(n_estimators=100, random_state=None)
    clf.fit(x_train_reduced10, y_train)
    y_pred = clf.predict(x_test_reduced10)
    accuracies10.append(accuracy_score(y_test, y_pred))

# Compute mean accuracy
mean_accuracy10 = np.mean(accuracies10)
print(f"Mean Accuracy over 3 runs (PCA 10): {mean_accuracy10:.4f}")

# Run Random Forest model 3 times
accuracies12 = []
for _ in range(3):
    clf = RandomForestClassifier(n_estimators=100, random_state=None)
    clf.fit(x_train_reduced12, y_train)
    y_pred = clf.predict(x_test_reduced12)
    accuracies12.append(accuracy_score(y_test, y_pred))

# Compute mean accuracy
mean_accuracy12 = np.mean(accuracies12)
print(f"Mean Accuracy over 3 runs (PCA 12): {mean_accuracy12:.4f}")

Mean Accuracy over 3 runs (PCA 8): 0.3132
Mean Accuracy over 3 runs (PCA 10): 0.3057
Mean Accuracy over 3 runs (PCA 12): 0.2969
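
These PCA accuracies are far below the unreduced models. One likely contributor (an assumption, not verified by re-running the notebook) is that the train and test sets are each passed through their own `fit_transform`, so they end up in different projected spaces. The sketch below shows the usual pattern of learning the projection on the training data and reusing it on the test data. It uses a hand-rolled NumPy PCA on random data so it runs without scikit-learn; `fit_pca` and `apply_pca` are hypothetical helpers.

```python
import numpy as np


def fit_pca(x_train, n_components):
    """Learn the PCA mean and principal directions from the training data only."""
    mean = x_train.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(x, mean, components):
    """Project any data set onto directions learned by fit_pca."""
    return (x - mean) @ components.T


rng = np.random.default_rng(0)
x_train_demo = rng.normal(size=(200, 20))  # synthetic stand-in for stacked features
x_test_demo = rng.normal(size=(50, 20))

mean, components = fit_pca(x_train_demo, n_components=8)
train_reduced = apply_pca(x_train_demo, mean, components)
test_reduced = apply_pca(x_test_demo, mean, components)  # same projection as training
print(train_reduced.shape, test_reduced.shape)  # (200, 8) (50, 8)
```

With scikit-learn the same idea is `pca = PCA(n_components=k).fit(x_train)` followed by `pca.transform(x_train)` and `pca.transform(x_test)`. Whether this alone restores the accuracy would need to be checked by re-running the experiment.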


<h3>5. Draw conclusions on the best model among the above four models for classifying the MNIST dataset.</h3>

In [37]:
print("Accuracies were as follows:")
print(f"RF over VGG16,HOG features = {mean_accuracy:.4f}") #5 runs
print(f"RF over VGG16,SIFT features = {mean_accuracy2:.4f}") #3 runs
print(f"RF over VGG16,HOG,SIFT features = {mean_accuracy3:.4f}") #3 runs
print(f"RF over PCA 8 reduced VGG16,HOG,SIFT features = {mean_accuracy8:.4f}") #3 runs
print(f"RF over PCA 10 reduced VGG16,HOG,SIFT features = {mean_accuracy10:.4f}") #3 runs
print(f"RF over PCA 12 reduced VGG16,HOG,SIFT features = {mean_accuracy12:.4f}") #3 runs

Accuracies were as follows:
RF over VGG16,HOG features = 0.9786
RF over VGG16,SIFT features = 0.9456
RF over VGG16,HOG,SIFT features = 0.9789
RF over PCA 8 reduced VGG16,HOG,SIFT features = 0.3132
RF over PCA 10 reduced VGG16,HOG,SIFT features = 0.3057
RF over PCA 12 reduced VGG16,HOG,SIFT features = 0.2969


In [38]:
print(f"The random forest model over the VGG16,HOG,SIFT features performed the best(accuracy={mean_accuracy3:.4f})")

The random forest model over the VGG16,HOG,SIFT features performed the best(accuracy=0.9789)


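<h3>The random forest trained on the stacked VGG16, HOG, and SIFT features (the third model) achieved the highest mean accuracy (0.9789), only marginally ahead of VGG16 + HOG (0.9786). VGG16 + SIFT reached 0.9456, and the PCA-reduced variants (8 to 12 components) fell to about 0.30. Note that the first model was averaged over 5 runs and the others over 3.</h3>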