# 🎯 SVM Object Recognition
## Bachelor's Thesis - Data Analytics Final Project

---

**Author:** Data Analytics Student  
**Date:** January 2026  
**Topic:** Support Vector Machine for Image Classification

---

## Table of Contents

1. [Introduction & Theory](#1-introduction)
2. [Data Loading & Exploration](#2-data-loading)
3. [Feature Engineering (HOG)](#3-feature-engineering)
4. [Model Training](#4-model-training)
5. [Hyperparameter Tuning](#5-hyperparameter-tuning)
6. [Results & Analysis](#6-results)
7. [Conclusion](#7-conclusion)

---

## 1. Introduction & Theory <a id='1-introduction'></a>

### 1.1 Project Overview

This project implements a **Support Vector Machine (SVM)** classifier for object recognition using the **CIFAR-10** dataset. SVM is a supervised learning algorithm that finds the optimal hyperplane to separate different classes.

### 1.2 Support Vector Machine (SVM)

An SVM finds the hyperplane that maximizes the margin between classes. The main variants are:

- **Linear SVM**: For linearly separable data
- **Kernel SVM**: Uses the "kernel trick" for non-linear boundaries
- **RBF Kernel**: `K(x, y) = exp(-γ||x-y||²)` - creates flexible decision boundaries
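
As a minimal sketch of the RBF formula above, the snippet below evaluates `K(x, y)` with NumPy. The function name `rbf_kernel` and the sample vectors are illustrative assumptions, not part of the project code.

```python
import numpy as np

def rbf_kernel(x, y, gamma):
    """Return exp(-gamma * ||x - y||^2) for two 1-D vectors (illustrative helper)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# Hypothetical sample points, chosen only for illustration
a = np.array([0.0, 0.0])
b = np.array([1.0, 1.0])

k_same = rbf_kernel(a, a, gamma=0.5)   # identical points give similarity 1.0
k_near = rbf_kernel(a, b, gamma=0.5)   # squared distance 2, so exp(-1)
k_sharp = rbf_kernel(a, b, gamma=5.0)  # larger gamma makes similarity decay faster
```

A larger `gamma` narrows the region each training point influences, which is why it is usually tuned together with `C`.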

### 1.3 HOG Feature Extraction

**Histogram of Oriented Gradients (HOG)** captures edge and gradient structure:

1. Compute gradients in x and y directions
2. Create histograms of gradient orientations in cells
3. Normalize histograms across blocks
4. Concatenate to form feature vector
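
The four steps above can be sketched in plain NumPy. This is a simplified illustration (hard bin assignment, no vote interpolation), not the project's `HOGFeatureExtractor`; the function `simple_hog` and its default parameters are assumptions made for this example.

```python
import numpy as np

def simple_hog(image, orientations=9, cell=4, block=2, eps=1e-6):
    """Simplified HOG for a 2-D grayscale array (illustrative, no vote interpolation)."""
    image = np.asarray(image, dtype=float)

    # Step 1: gradients along rows (y) and columns (x)
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into [0, 180)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0

    # Step 2: magnitude-weighted orientation histogram for each cell
    n_rows, n_cols = image.shape[0] // cell, image.shape[1] // cell
    bin_width = 180.0 / orientations
    bins = np.minimum((angle // bin_width).astype(int), orientations - 1)
    hist = np.zeros((n_rows, n_cols, orientations))
    for r in range(n_rows):
        for c in range(n_cols):
            rows = slice(r * cell, (r + 1) * cell)
            cols = slice(c * cell, (c + 1) * cell)
            hist[r, c] = np.bincount(
                bins[rows, cols].ravel(),
                weights=magnitude[rows, cols].ravel(),
                minlength=orientations,
            )

    # Step 3: L2-normalize overlapping blocks of cells (stride of one cell)
    blocks = []
    for r in range(n_rows - block + 1):
        for c in range(n_cols - block + 1):
            v = hist[r:r + block, c:c + block].ravel()
            blocks.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))

    # Step 4: concatenate all block vectors into one feature vector
    return np.concatenate(blocks)

# Hypothetical input: a random 32x32 array standing in for a grayscale CIFAR-10 image
rng = np.random.default_rng(0)
features = simple_hog(rng.random((32, 32)))
print(features.shape)
```

Block normalization is what makes the descriptor robust to local changes in brightness and contrast, since each histogram is rescaled relative to its neighbours.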

### 1.4 CIFAR-10 Dataset

| Property | Value |
|----------|-------|
| Total Images | 60,000 |
| Training Set | 50,000 |
| Test Set | 10,000 |
| Image Size | 32×32 RGB |
| Classes | 10 |

---

## Setup & Imports

In [None]:
# Standard libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
import os
import sys

# Add src to path
sys.path.insert(0, '../src')

# Import project modules
from data_loader import load_cifar10, preprocess_images, CLASS_NAMES, get_class_distribution
from feature_extraction import HOGFeatureExtractor, visualize_hog_features
from svm_classifier import SVMClassifier
from visualization import (
    plot_sample_images, plot_class_distribution, plot_confusion_matrix,
    plot_roc_curves, plot_prediction_samples, plot_metrics_comparison,
    plot_per_class_accuracy, plot_hog_visualization
)

# Settings
warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette('husl')

# For reproducibility
np.random.seed(42)

print('✓ All imports successful!')
print(f'✓ Working directory: {os.getcwd()}')

---

## 2. Data Loading & Exploration <a id='2-data-loading'></a>

### 2.1 Load CIFAR-10 Dataset

In [None]:
# Load dataset (using subset for faster execution)
# Change subset_size to None for full dataset
SUBSET_SIZE = 10000  # Use 10,000 samples for demonstration

data = load_cifar10(data_dir='../data', subset_size=SUBSET_SIZE)

# Extract data
X_train = data['X_train']
y_train = data['y_train']
X_test = data['X_test']
y_test = data['y_test']

print(f'\n📊 Dataset Summary:')
print(f'   Training samples: {len(X_train):,}')
print(f'   Test samples: {len(X_test):,}')
print(f'   Image shape: {X_train.shape[1:]}')
print(f'   Number of classes: {len(CLASS_NAMES)}')

### 2.2 Visualize Sample Images

In [None]:
# Display sample images from the dataset
fig, axes = plt.subplots(2, 5, figsize=(15, 6))
fig.suptitle('CIFAR-10 Sample Images (One per Class)', fontsize=14, fontweight='bold')

for i, ax in enumerate(axes.flat):
    # Find first image of class i
    idx = np.where(y_train == i)[0][0]
    ax.imshow(X_train[idx])
    ax.set_title(f'{CLASS_NAMES[i].capitalize()}', fontsize=12)
    ax.axis('off')

plt.tight_layout()
plt.show()

### 2.3 Class Distribution Analysis

In [None]:
# Analyze class distribution
distribution = get_class_distribution(y_train)

# Create DataFrame for display
df_dist = pd.DataFrame({
    'Class': list(distribution.keys()),
    'Count': list(distribution.values())
})
df_dist['Percentage'] = (df_dist['Count'] / df_dist['Count'].sum() * 100).round(2)

print('📊 Training Set Class Distribution:')
print(df_dist.to_string(index=False))

# Visualize
fig, ax = plt.subplots(figsize=(12, 5))
colors = plt.cm.viridis(np.linspace(0.2, 0.8, len(df_dist)))
bars = ax.bar(df_dist['Class'], df_dist['Count'], color=colors, edgecolor='white', linewidth=1.5)

for bar, count in zip(bars, df_dist['Count']):
    ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 20,
            f'{count}', ha='center', va='bottom', fontsize=10, fontweight='bold')

ax.set_xlabel('Object Class', fontsize=12)
ax.set_ylabel('Number of Images', fontsize=12)
ax.set_title('Class Distribution in Training Set', fontsize=14, fontweight='bold')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

---

## 3. Feature Engineering (HOG) <a id='3-feature-engineering'></a>

### 3.1 Image Preprocessing

In [None]:
# Preprocess images: convert to grayscale and normalize
print('Preprocessing images...')

X_train_processed = preprocess_images(X_train, grayscale=True, normalize=True)
X_test_processed = preprocess_images(X_test, grayscale=True, normalize=True)

print(f'✓ Training images preprocessed: {X_train_processed.shape}')
print(f'✓ Test images preprocessed: {X_test_processed.shape}')

# Visualize preprocessing
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

idx = np.random.randint(0, len(X_train))

axes[0].imshow(X_train[idx])
axes[0].set_title(f'Original RGB - {CLASS_NAMES[y_train[idx]]}', fontsize=12)
axes[0].axis('off')

axes[1].imshow(X_train_processed[idx], cmap='gray')
axes[1].set_title('Preprocessed (Grayscale, Normalized)', fontsize=12)
axes[1].axis('off')

plt.suptitle('Image Preprocessing', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

### 3.2 HOG Feature Extraction

In [None]:
# Initialize HOG Feature Extractor
extractor = HOGFeatureExtractor(
    orientations=9,
    pixels_per_cell=(4, 4),
    cells_per_block=(2, 2)
)

print('HOG Parameters:')
print(f'   Orientations: {extractor.orientations}')
print(f'   Pixels per cell: {extractor.pixels_per_cell}')
print(f'   Cells per block: {extractor.cells_per_block}')

In [None]:
# Extract HOG features from training set
X_train_features = extractor.fit_transform(X_train_processed, apply_pca=False)

print(f'\n📊 Feature Extraction Results:')
print(f'   Input image shape: {X_train_processed[0].shape}')
print(f'   Features per image: {X_train_features.shape[1]}')
print(f'   Total training features: {X_train_features.shape}')
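
The `Features per image` value printed above can be cross-checked by hand. The arithmetic below assumes 32×32 grayscale inputs and the standard HOG layout in which blocks slide one cell at a time (as in scikit-image's `hog`). Whether `HOGFeatureExtractor` follows this layout is an assumption, and the helper `hog_feature_length` is hypothetical.

```python
def hog_feature_length(image_size, orientations, pixels_per_cell, cells_per_block):
    """Length of a HOG vector when blocks slide one cell at a time (hypothetical helper)."""
    cells_y = image_size[0] // pixels_per_cell[0]
    cells_x = image_size[1] // pixels_per_cell[1]
    blocks_y = cells_y - cells_per_block[0] + 1
    blocks_x = cells_x - cells_per_block[1] + 1
    per_block = cells_per_block[0] * cells_per_block[1] * orientations
    return blocks_y * blocks_x * per_block

# Parameters used in this notebook: 9 orientations, 4x4-pixel cells, 2x2-cell blocks
n_features = hog_feature_length((32, 32), 9, (4, 4), (2, 2))
print(n_features)  # 7 * 7 blocks * 36 values per block = 1764
```

If the printed feature count differs from this figure, the extractor likely uses a different block stride or applies an extra step such as dimensionality reduction.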

In [None]:
# Extract features from test set
X_test_features = extractor.transform(X_test_processed)

print(f'✓ Test features: {X_test_features.shape}')

### 3.3 Visualize HOG Features

In [None]:
# Visualize HOG features for sample images
fig, axes = plt.subplots(3, 4, figsize=(16, 12))

for i in range(3):
    idx = np.where(y_train == i)[0][0]
    
    # Original
    axes[i, 0].imshow(X_train[idx])
    axes[i, 0].set_title(f'Original: {CLASS_NAMES[y_train[idx]]}', fontsize=11)
    axes[i, 0].axis('off')
    
    # Grayscale
    axes[i, 1].imshow(X_train_processed[idx], cmap='gray')
    axes[i, 1].set_title('Grayscale', fontsize=11)
    axes[i, 1].axis('off')
    
    # HOG
    _, hog_image = visualize_hog_features(X_train_processed[idx], extractor)
    axes[i, 2].imshow(hog_image, cmap='gray')
    axes[i, 2].set_title('HOG Features', fontsize=11)
    axes[i, 2].axis('off')
    
    # Feature histogram
    features = X_train_features[idx]
    axes[i, 3].hist(features, bins=50, color='steelblue', edgecolor='white', alpha=0.7)
    axes[i, 3].set_title(f'Feature Distribution ({len(features)} dims)', fontsize=11)
    axes[i, 3].set_xlabel('Feature Value')
    axes[i, 3].set_ylabel('Frequency')

plt.suptitle('HOG Feature Extraction Pipeline', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

---

## 4. Model Training <a id='4-model-training'></a>

### 4.1 Train SVM Classifier

In [None]:
# Initialize SVM classifier
classifier = SVMClassifier(
    kernel='rbf',
    C=10.0,
    gamma='scale'
)

# Train
classifier.train(X_train_features, y_train)

### 4.2 Initial Evaluation

In [None]:
# Evaluate on test set
results = classifier.evaluate(X_test_features, y_test, class_names=CLASS_NAMES)

---

## 5. Hyperparameter Tuning <a id='5-hyperparameter-tuning'></a>

### 5.1 Grid Search with Cross-Validation

In [None]:
# Define parameter grid (simplified for speed)
param_grid = {
    'C': [1, 10],
    'gamma': ['scale', 0.01],
    'kernel': ['rbf']
}

# Initialize new classifier for tuning
tuned_classifier = SVMClassifier()

# Perform grid search
best_params = tuned_classifier.tune_hyperparameters(
    X_train_features, 
    y_train,
    param_grid=param_grid,
    cv=3
)

In [None]:
# Evaluate tuned model
tuned_results = tuned_classifier.evaluate(X_test_features, y_test, class_names=CLASS_NAMES)

---

## 6. Results & Analysis <a id='6-results'></a>

### 6.1 Results Summary

In [None]:
# Use tuned results
final_results = tuned_results

# Create summary DataFrame
summary = pd.DataFrame({
    'Metric': ['Accuracy', 'Precision', 'Recall', 'F1-Score'],
    'Score': [
        f"{final_results['accuracy']*100:.2f}%",
        f"{final_results['precision']*100:.2f}%",
        f"{final_results['recall']*100:.2f}%",
        f"{final_results['f1_score']*100:.2f}%"
    ]
})

print('\n' + '='*50)
print('📊 FINAL MODEL PERFORMANCE')
print('='*50)
print(summary.to_string(index=False))
print('='*50)

### 6.2 Confusion Matrix

In [None]:
# Plot confusion matrix
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, final_results['y_pred'])
cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

fig, axes = plt.subplots(1, 2, figsize=(16, 7))

# Raw counts
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=CLASS_NAMES, yticklabels=CLASS_NAMES, ax=axes[0])
axes[0].set_xlabel('Predicted')
axes[0].set_ylabel('True')
axes[0].set_title('Confusion Matrix (Counts)', fontweight='bold')

# Normalized
sns.heatmap(cm_normalized, annot=True, fmt='.2%', cmap='Blues',
            xticklabels=CLASS_NAMES, yticklabels=CLASS_NAMES, ax=axes[1])
axes[1].set_xlabel('Predicted')
axes[1].set_ylabel('True')
axes[1].set_title('Confusion Matrix (Normalized)', fontweight='bold')

plt.tight_layout()
plt.show()

### 6.3 ROC Curves

In [None]:
# Plot ROC curves
fig, ax = plt.subplots(figsize=(12, 8))

colors = plt.cm.tab10(np.linspace(0, 1, len(CLASS_NAMES)))

# Loop variable is named `roc` so the `data` dict from Section 2 is not overwritten
for i, (class_idx, roc) in enumerate(final_results['roc_data'].items()):
    ax.plot(roc['fpr'], roc['tpr'], color=colors[i], lw=2,
            label=f"{CLASS_NAMES[class_idx]} (AUC = {roc['auc']:.3f})")

ax.plot([0, 1], [0, 1], 'k--', lw=1.5, label='Random Classifier')
ax.set_xlim([0.0, 1.0])
ax.set_ylim([0.0, 1.05])
ax.set_xlabel('False Positive Rate', fontsize=12)
ax.set_ylabel('True Positive Rate', fontsize=12)
ax.set_title('ROC Curves (One-vs-Rest)', fontsize=14, fontweight='bold')
ax.legend(loc='lower right', fontsize=9)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

### 6.4 Per-Class Accuracy

In [None]:
# Calculate per-class accuracy
class_accuracies = []
for i in range(len(CLASS_NAMES)):
    mask = y_test == i
    if mask.sum() > 0:
        acc = (final_results['y_pred'][mask] == i).sum() / mask.sum()
    else:
        acc = 0
    class_accuracies.append(acc * 100)

# Create DataFrame
df_class_acc = pd.DataFrame({
    'Class': CLASS_NAMES,
    'Accuracy (%)': [f'{acc:.1f}' for acc in class_accuracies]
})

print('\n📊 Per-Class Accuracy:')
print(df_class_acc.to_string(index=False))

# Visualize
fig, ax = plt.subplots(figsize=(12, 6))
colors = plt.cm.RdYlGn(np.array(class_accuracies) / 100)
bars = ax.bar(CLASS_NAMES, class_accuracies, color=colors, edgecolor='white', linewidth=1.5)

for bar, acc in zip(bars, class_accuracies):
    ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1,
            f'{acc:.1f}%', ha='center', va='bottom', fontsize=10, fontweight='bold')

ax.set_ylim(0, 100)
ax.set_xlabel('Object Class', fontsize=12)
ax.set_ylabel('Accuracy (%)', fontsize=12)
ax.set_title('Per-Class Classification Accuracy', fontsize=14, fontweight='bold')
ax.axhline(y=np.mean(class_accuracies), color='blue', linestyle='--', 
           alpha=0.7, label=f'Mean: {np.mean(class_accuracies):.1f}%')
ax.legend()

plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
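
The per-class accuracy computed above is the same quantity as per-class recall, i.e. the diagonal of the row-normalized confusion matrix. A minimal NumPy sketch with made-up labels (`y_true`, `y_hat`, and `cm_demo` are illustrative names and values, not project results):

```python
import numpy as np

# Made-up labels for three classes, for illustration only
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
y_hat = np.array([0, 0, 1, 1, 1, 2, 0, 2, 2])
n_classes = 3

# Confusion matrix: rows are true classes, columns are predicted classes
cm_demo = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm_demo, (y_true, y_hat), 1)

# Diagonal over row sums: fraction of each true class that was predicted correctly
recall_from_cm = np.diag(cm_demo) / cm_demo.sum(axis=1)

# Same quantity computed with a boolean mask per class, as in the loop above
recall_from_mask = np.array(
    [(y_hat[y_true == k] == k).mean() for k in range(n_classes)]
)
```

Both arrays hold the same values, so the bar chart and the normalized confusion matrix diagonal should agree class by class.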

### 6.5 Sample Predictions

In [None]:
# Display sample predictions
n_samples = 15
indices = np.random.choice(len(X_test), n_samples, replace=False)

fig, axes = plt.subplots(3, 5, figsize=(15, 9))
fig.suptitle('Model Predictions on Test Images', fontsize=14, fontweight='bold')

for idx, ax in zip(indices, axes.flat):
    ax.imshow(X_test[idx])
    
    true_label = CLASS_NAMES[y_test[idx]]
    pred_label = CLASS_NAMES[final_results['y_pred'][idx]]
    is_correct = y_test[idx] == final_results['y_pred'][idx]
    
    color = 'green' if is_correct else 'red'
    symbol = '✓' if is_correct else '✗'
    
    ax.set_title(f'{symbol} Pred: {pred_label}\nTrue: {true_label}',
                fontsize=9, color=color, fontweight='bold')
    ax.axis('off')

plt.tight_layout()
plt.show()

---

## 7. Conclusion <a id='7-conclusion'></a>

### 7.1 Summary

This project successfully implemented an **SVM-based object recognition system** using the CIFAR-10 dataset.

### Key Findings:

1. **HOG features** effectively capture edge and gradient information for object recognition
2. **RBF kernel SVM** provides flexible non-linear decision boundaries
3. **Hyperparameter tuning** (grid search over `C` and `gamma` with 3-fold cross-validation) can improve model performance
4. Some classes (e.g., ships, trucks) are easier to classify than others (cats, dogs)

### 7.2 Comparison to Other Methods

| Method | Expected Accuracy on CIFAR-10 |
|--------|------------------------------|
| Random Baseline | 10% |
| **HOG + SVM (this project)** | **55-65%** |
| CNN (AlexNet) | ~82% |
| CNN (ResNet) | ~93% |
| State-of-the-art | 99%+ |

### 7.3 Future Work

- Try different feature extraction methods (SIFT, SURF)
- Experiment with feature combination
- Implement ensemble methods
- Compare with CNN-based approaches

In [None]:
# Save model
import os
os.makedirs('../outputs/models', exist_ok=True)

tuned_classifier.save('../outputs/models/svm_classifier.joblib')
extractor.save('../outputs/models/feature_extractor.joblib')

print('\n✓ Models saved successfully!')
print('\n🎓 Project Complete!')