Aim:

To develop an image classification system using the Bag of Visual Words (BoVW) approach on the CIFAR-10 dataset by extracting combined HOG (Histogram of Oriented Gradients) and LBP (Local Binary Patterns) features, clustering them using K-Means, and classifying the images using a Support Vector Machine (SVM) classifier.

Objectives:
1. To load and preprocess the CIFAR-10 dataset, which contains 60,000 32×32 color images across 10 distinct classes.
2. To convert the RGB images to grayscale, reducing computational cost while preserving structural features.
3. To extract feature descriptors using HOG and LBP, capturing both shape and texture characteristics of the images.
4. To apply K-Means clustering to the extracted features, creating a visual vocabulary with which images can be represented as histograms of visual words.
5. To build a Bag of Visual Words (BoVW) representation by mapping descriptors to visual words and forming a fixed-length feature vector.
6. To train a Support Vector Machine (SVM) classifier using the BoVW histograms as input features.
7. To evaluate the trained model using accuracy on the test data.





# Theory:
Image classification is a core problem in computer vision, where the goal is to categorize images into predefined labels. Traditional methods involve extracting handcrafted features from images and using machine learning classifiers to perform categorization.

The CIFAR-10 dataset is a well-known benchmark containing 60,000 images in 10 categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. It is split into 50,000 training images and 10,000 test images. Each image is a small (32×32 pixel) color image; the limited resolution and the visual similarity between some classes make classification challenging.

Feature Extraction:
Two key feature extraction methods are used in this project:

HOG (Histogram of Oriented Gradients):
HOG is a feature descriptor that captures the distribution of gradient orientations in localized portions of an image. It is particularly effective at describing object shapes and contours. It works by dividing the image into small regions (cells), computing a histogram of gradient directions in each cell, and normalizing the histograms over groups of neighboring cells (blocks), which makes the descriptor more robust to changes in illumination and contrast.
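
To make the per-cell step concrete, the following is a minimal NumPy sketch of an orientation histogram for a single cell. It is an illustration only, not the `skimage.feature.hog` implementation used later: the function name `cell_orientation_histogram` is hypothetical, and normalization is applied to one cell rather than to blocks of cells.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    cell = cell.astype(float)
    gy, gx = np.gradient(cell)            # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, 180) degrees
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0), weights=magnitude)
    # L2 normalization (simplification: real HOG normalizes over blocks of cells)
    return hist / (np.linalg.norm(hist) + 1e-7)

# A vertical edge: left half dark, right half bright
cell = np.zeros((8, 8))
cell[:, 4:] = 255.0
hist = cell_orientation_histogram(cell)

# Halving the brightness leaves the normalized histogram essentially unchanged
hist_dim = cell_orientation_histogram(0.5 * cell)
```

For this vertical edge all gradient energy falls into the first orientation bin, and scaling the intensities does not change the normalized result, which is the sense in which normalization helps with illumination changes.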

LBP (Local Binary Patterns):
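LBP is a texture descriptor that labels each pixel by comparing its surrounding pixels with the pixel's own value: each neighbor contributes a 1 if it is at least as bright as the center and a 0 otherwise, and the resulting bits form a binary code. The code describes the local texture pattern, which is useful for recognizing repetitive textures and micro-patterns. The LBP histogram summarizes the frequency of these patterns across the image.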
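
As a small worked example, the sketch below computes the basic 8-neighbor LBP code for the center of a 3×3 patch. The helper `lbp_code_3x3` is hypothetical, and the neighbor ordering is just one possible convention. The notebook itself uses the `'uniform'` variant from scikit-image, which maps these codes to `P + 2` labels for `P` sampling points.

```python
import numpy as np

def lbp_code_3x3(patch):
    """Basic 8-neighbor LBP code for the center pixel of a 3x3 patch."""
    center = patch[1, 1]
    # Neighbors visited clockwise starting at the top-left corner (a convention)
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    # Threshold each neighbor against the center pixel
    bits = [1 if n >= center else 0 for n in neighbours]
    # Pack the bits into an integer, first neighbor = least significant bit
    return sum(bit << i for i, bit in enumerate(bits))

patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 8, 3]])
code = lbp_code_3x3(patch)  # bits [0, 1, 0, 1, 0, 1, 0, 0] -> 2 + 8 + 32 = 42
```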

Bag of Visual Words (BoVW):
The BoVW model is inspired by the Bag of Words model in natural language processing. In BoVW:

Feature descriptors (HOG + LBP in this case) are extracted from the images; the cluster centers learned from them act as the "visual words."

K-Means clustering is used to form a visual vocabulary by grouping similar feature descriptors.

Each image is represented as a histogram of visual word occurrences, capturing the distribution of patterns.
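
The mapping from descriptors to a histogram can be sketched in a few lines of NumPy. This is an illustration under stated assumptions: the function `bovw_histogram`, the toy 2-D vocabulary, and the four descriptors are all made up for the example. In practice the vocabulary would be the `cluster_centers_` of a fitted K-Means model.

```python
import numpy as np

def bovw_histogram(descriptors, vocabulary):
    """Map each descriptor to its nearest visual word and count occurrences."""
    # Squared Euclidean distance between every descriptor and every word
    dists = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)          # index of the nearest word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()              # normalize counts to frequencies

# Hypothetical vocabulary of 3 visual words in a 2-D descriptor space
vocabulary = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Four descriptors belonging to one image
descriptors = np.array([[0.1, 0.0], [0.9, 0.1], [1.1, -0.1], [0.0, 0.8]])
hist = bovw_histogram(descriptors, vocabulary)  # -> [0.25, 0.5, 0.25]
```

The histogram length depends only on the vocabulary size, so images with different numbers of descriptors still yield feature vectors of the same length.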

Classification:
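A Support Vector Machine (SVM) is used as the classifier because it handles high-dimensional feature vectors well. A linear SVM looks for the hyperplane that separates two classes with the largest margin; multi-class problems such as CIFAR-10 are handled by combining several binary classifiers (scikit-learn's SVC uses a one-vs-one scheme).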
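
For reference, the standard soft-margin objective of a binary linear SVM can be written as below. The symbols are notation introduced here, not taken from the notebook: $\mathbf{x}_i$ is a feature vector, $y_i$ its label, $\mathbf{w}$ and $b$ the hyperplane parameters, and $n$ the number of training samples. $C$ plays the role of the `C=1.0` argument passed to `SVC` later.

```latex
\min_{\mathbf{w},\, b}\; \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
  + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i(\mathbf{w}^{\top}\mathbf{x}_i + b)\bigr),
\qquad y_i \in \{-1, +1\}
```

The first term favors a wide margin, while the second penalizes samples that fall inside the margin or on the wrong side of the hyperplane. A larger $C$ puts more weight on fitting the training data.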

Evaluation:
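Model performance is assessed using accuracy, the fraction of test images that are assigned the correct label. Because the CIFAR-10 test set contains the same number of images for each of the 10 classes, random guessing yields about 10% accuracy, which serves as the baseline for judging the result.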
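
A minimal sketch of the metric, using made-up labels rather than model output, is shown below. The helpers `accuracy` and `per_class_recall` are hypothetical names. The per-class breakdown is not part of the notebook, but it is a cheap way to see which classes drive the overall score.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def per_class_recall(y_true, y_pred, n_classes):
    """Accuracy restricted to each true class (assumes every class occurs at least once)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return [float(np.mean(y_pred[y_true == c] == c)) for c in range(n_classes)]

# Toy labels for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
acc = accuracy(y_true, y_pred)                  # 4 of 6 correct
recalls = per_class_recall(y_true, y_pred, 3)   # [0.5, 1.0, 0.5]
```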

In [None]:
# Install required libraries (tensorflow, used below to load CIFAR-10, is assumed to be installed already)
!pip install opencv-python scikit-learn numpy scikit-image matplotlib tqdm




In [None]:
import numpy as np
import cv2
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from tqdm import tqdm
from skimage.feature import local_binary_pattern
from skimage.feature import hog
from tensorflow.keras.datasets import cifar10
from sklearn.model_selection import train_test_split

In [None]:
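# Parameters
num_clusters = 100  # Number of K-Means clusters (size of the visual vocabulary)
lbp_radius = 1  # Radius of the circular LBP neighborhood, in pixels
lbp_n_points = 8 * lbp_radius  # Number of sampling points on that circle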

In [None]:
# Load CIFAR-10 Dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
y_train = y_train.flatten()
y_test = y_test.flatten()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step


In [None]:
# Convert images to grayscale
x_train_gray = np.array([cv2.cvtColor(img, cv2.COLOR_RGB2GRAY) for img in x_train])
x_test_gray = np.array([cv2.cvtColor(img, cv2.COLOR_RGB2GRAY) for img in x_test])
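
For intuition, the conversion performed by `cv2.COLOR_RGB2GRAY` is a weighted sum of the three channels, Y = 0.299 R + 0.587 G + 0.114 B. The NumPy sketch below reproduces that formula. The function `rgb_to_gray` is a hypothetical helper, and its results may differ from OpenCV's by one gray level because of rounding details.

```python
import numpy as np

def rgb_to_gray(img):
    """Convert an (H, W, 3) uint8 RGB image to (H, W) uint8 grayscale."""
    weights = np.array([0.299, 0.587, 0.114])   # luma weights for R, G, B
    gray = img.astype(np.float64) @ weights     # weighted sum over the channel axis
    return np.round(gray).astype(np.uint8)

# 2x2 test image: white, red, green, blue
img = np.array([[[255, 255, 255], [255, 0, 0]],
                [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
gray = rgb_to_gray(img)
```

Green contributes most to the perceived brightness and blue least, which is why the pure-blue pixel ends up darkest.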

In [None]:
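# Feature Extraction: HOG + LBP
def extract_features(image):
    """Return one global descriptor for a grayscale image: HOG followed by an LBP histogram."""
    # HOG over the whole image (shape information)
    hog_features = hog(image, pixels_per_cell=(8, 8), cells_per_block=(2, 2), feature_vector=True)

    # Uniform LBP produces labels 0 .. lbp_n_points + 1 (texture information)
    lbp = local_binary_pattern(image, lbp_n_points, lbp_radius, method='uniform')
    # One bin per label; `range` is not needed when explicit bin edges are given
    lbp_hist, _ = np.histogram(lbp.ravel(), bins=np.arange(0, lbp_n_points + 3))
    lbp_hist = lbp_hist.astype("float")
    lbp_hist /= (lbp_hist.sum() + 1e-7)  # Normalize counts to frequencies

    # Concatenate shape (HOG) and texture (LBP) features
    return np.hstack([hog_features, lbp_hist])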
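
The length of the combined descriptor follows from the parameters above. The arithmetic below is a sketch that assumes scikit-image's defaults for `hog` (9 orientation bins, blocks sliding by one cell) and the `P + 2` labels of uniform LBP. The helper names are hypothetical.

```python
def hog_length(image_size=32, cell=8, block=2, orientations=9):
    """Number of HOG values for a square image with blocks sliding one cell at a time."""
    cells_per_side = image_size // cell              # 32 // 8 = 4
    blocks_per_side = cells_per_side - block + 1     # 4 - 2 + 1 = 3
    return blocks_per_side ** 2 * block ** 2 * orientations

def uniform_lbp_bins(n_points=8):
    """Uniform LBP with P sampling points yields P + 2 distinct labels."""
    return n_points + 2

hog_dim = hog_length()           # 3 * 3 * 2 * 2 * 9 = 324
lbp_dim = uniform_lbp_bins()     # 8 + 2 = 10
total_dim = hog_dim + lbp_dim    # 334 values per image
```

The HOG part dominates the vector, so distances computed during clustering are driven mostly by shape rather than texture.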

In [None]:
# Extract features from training and test data
print("Extracting features from images...")
train_features = [extract_features(img) for img in tqdm(x_train_gray)]
test_features = [extract_features(img) for img in tqdm(x_test_gray)]

train_features = np.array(train_features)
test_features = np.array(test_features)

Extracting features from images...


100%|██████████| 50000/50000 [00:44<00:00, 1124.03it/s]
100%|██████████| 10000/10000 [00:08<00:00, 1113.15it/s]


In [None]:
# K-Means Clustering for BoVW
print("Training K-Means model for BoVW...")
kmeans = KMeans(n_clusters=num_clusters, random_state=42, n_init=10)
kmeans.fit(train_features)

Training K-Means model for BoVW...


In [None]:
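# Create BoVW Histograms
def create_bow_features(features, kmeans_model):
    """Return a normalized histogram of visual-word counts for one image's descriptors."""
    cluster_labels = kmeans_model.predict(features)  # Nearest visual word for each descriptor row
    histogram = np.bincount(cluster_labels, minlength=kmeans_model.n_clusters)
    return histogram / np.sum(histogram)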

In [None]:
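# Construct BoVW Feature Vectors
# Note: each image contributes a single global descriptor here, so every
# histogram has exactly one non-zero bin (the cluster the image falls into).
print("Constructing BoVW feature representations...")
train_bow = np.array([create_bow_features(f.reshape(1, -1), kmeans) for f in tqdm(train_features)])
test_bow = np.array([create_bow_features(f.reshape(1, -1), kmeans) for f in tqdm(test_features)])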

Constructing BoVW feature representations...


100%|██████████| 50000/50000 [00:34<00:00, 1445.88it/s]
100%|██████████| 10000/10000 [00:03<00:00, 2681.89it/s]
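
It is worth checking what these histograms look like. Since `create_bow_features` receives a single descriptor row per image, `predict` returns one label and the normalized histogram is a one-hot vector. The sketch below reproduces that step with a made-up label and a small hypothetical vocabulary size instead of the fitted model.

```python
import numpy as np

n_words = 5                      # small hypothetical vocabulary, for readability
cluster_labels = np.array([3])   # what predict() returns for one (1, D) descriptor
histogram = np.bincount(cluster_labels, minlength=n_words)
histogram = histogram / np.sum(histogram)   # -> [0., 0., 0., 1., 0.]
```

In other words, the classifier only sees which cluster each image belongs to. A richer representation would likely require several local descriptors per image (for example, one per image patch), so that the histogram records how often each visual word occurs.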


In [None]:
# Train SVM Classifier
print("Training Support Vector Machine (SVM)...")
svm = SVC(kernel='linear', C=1.0)
svm.fit(train_bow, y_train)

Training Support Vector Machine (SVM)...


In [None]:
# Predict and Evaluate
print("Evaluating the model...")
y_pred = svm.predict(test_bow)
accuracy = accuracy_score(y_test, y_pred)
print(f"Classification Accuracy: {accuracy * 100:.2f}%")

Evaluating the model...
Classification Accuracy: 41.20%
