# Task 3: Implement a support vector machine (SVM) to classify images of cats and dogs 

## Cat vs Dog Image Classifier - Testing / Inference Script

This script **loads the trained SVM model** and uses it to predict labels for a test set of unlabeled images.

### It performs the following steps:

1. **Loads the trained model** (`svm_model_hog_pca.pkl`) along with its scaler and PCA transformer.
2. **Reads and preprocesses images** from the test folder.
3. **Extracts HOG features** from each image.
4. **Applies the fitted PCA, then the fitted scaler**, so the features are transformed exactly as they were during training.
5. **Uses the trained SVM model** to predict whether each image is a cat or a dog.
6. **Creates a submission CSV file** with filenames and their predicted labels (`cat` or `dog`), sorted numerically by filename.

**Output:** A CSV file named **`submission.csv`** containing predictions for all test images.


## 1. Importing the Libraries

In [1]:
import os  # For file and directory operations
import numpy as np  # For numerical operations and arrays
import pickle  # For loading the trained SVM model
import pandas as pd  # For creating the submission CSV
from skimage.io import imread  # For reading images
from skimage.transform import resize  # For resizing images to a fixed size
from skimage.color import rgb2gray  # To convert RGB images to grayscale
from skimage.feature import hog  # For extracting HOG features
import re  # For extracting numerical part of filenames (to sort)

## 2. Set Paths and Configuration

In [2]:
test_img_dir = "test1/test1"  # Folder containing test images
model_path = "svm_model_hog_pca.pkl"  # Path to the trained model (SVM + Scaler + PCA)
output_csv = "submission.csv"  # Output CSV file where predictions will be saved
img_size = (128, 128)  # Resize all images to this size before feature extraction

## 3. Load the Trained SVM Model, Scaler and PCA

In [3]:
with open(model_path, 'rb') as file:
    model, scaler, pca = pickle.load(file)
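
The unpacking above only works if the pickle holds exactly three objects in the order `(model, scaler, pca)`. A minimal sketch of a loader that checks this before unpacking is shown below. `load_pipeline` is a hypothetical helper name that is not part of the original script, and the 3-tuple layout is assumed from the cell above.

```python
import pickle


def load_pipeline(path):
    """Load a pickled (model, scaler, pca) bundle and validate its layout.

    Hypothetical helper: the 3-object layout is an assumption taken
    from the unpacking in the loading cell.
    """
    with open(path, "rb") as file:
        bundle = pickle.load(file)

    # Fail early with a readable message instead of an unpacking error
    if not isinstance(bundle, (tuple, list)) or len(bundle) != 3:
        raise ValueError(
            f"Expected 3 objects (model, scaler, pca), got {type(bundle).__name__}"
        )

    model, scaler, pca = bundle
    return model, scaler, pca
```

Only unpickle files from a trusted source, since `pickle.load` can execute arbitrary code while deserializing.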

## 4. Helper function to sort filenames numerically

In [4]:
def sort_key(filename):
    match = re.search(r'(\d+)', filename)
    return int(match.group(1)) if match else float('inf')
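
To see why a numeric key is needed, the short example below compares plain string sorting with `sort_key`. The filenames are made up for illustration.

```python
import re


def sort_key(filename):
    # Use the first run of digits as the sort key; names without digits go last
    match = re.search(r'(\d+)', filename)
    return int(match.group(1)) if match else float('inf')


names = ["10.jpg", "2.jpg", "1.jpg", "notes.txt"]

# Plain string sorting puts "10.jpg" before "2.jpg"
print(sorted(names))                # ['1.jpg', '10.jpg', '2.jpg', 'notes.txt']

# Numeric sorting keeps the natural order
print(sorted(names, key=sort_key))  # ['1.jpg', '2.jpg', '10.jpg', 'notes.txt']
```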

## 5. Function to extract HOG features from all test images

In [5]:
def extract_hog_features_from_folder(folder, img_size=(128, 128)):
    features = []
    image_ids = []

    # Sort filenames numerically so output CSV is in order
    filenames = sorted(os.listdir(folder), key=sort_key)

    for filename in filenames:
        if filename.lower().endswith(('.jpg', '.png')):  # Ensure only image files are processed
            file_path = os.path.join(folder, filename)
            img = imread(file_path)  # Load image

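            # If image has alpha channel (4 channels), remove the alpha channel
            if img.ndim == 3 and img.shape[-1] == 4:
                img = img[:, :, :3]

            # Resize and convert to grayscale for HOG
            img_resized = resize(img, img_size, anti_aliasing=True, preserve_range=True)
            # rgb2gray needs a channel axis; images that are already 2-D grayscale are used as-is
            img_gray = rgb2gray(img_resized) if img_resized.ndim == 3 else img_resized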

            # Extract HOG features
            hog_feature = hog(
                img_gray,
                orientations=9,
                pixels_per_cell=(8, 8),
                cells_per_block=(2, 2),
                visualize=False,
                channel_axis=None
            )
            features.append(hog_feature)
            image_ids.append(filename)  # Keep track of filenames for CSV output

    return np.array(features), image_ids
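
As a sanity check on the settings above, the length of each HOG vector can be worked out by hand. The sketch below assumes the block window slides one cell at a time, which is how `skimage.feature.hog` lays out its blocks. `hog_feature_length` is a hypothetical helper, not part of the original script.

```python
def hog_feature_length(img_size, orientations=9,
                       pixels_per_cell=(8, 8), cells_per_block=(2, 2)):
    """Expected length of a flattened HOG descriptor (hypothetical helper)."""
    # Number of whole cells along each axis
    n_cells_r = img_size[0] // pixels_per_cell[0]
    n_cells_c = img_size[1] // pixels_per_cell[1]

    # Blocks overlap and advance by one cell per step
    n_blocks_r = n_cells_r - cells_per_block[0] + 1
    n_blocks_c = n_cells_c - cells_per_block[1] + 1

    # Each block contributes one histogram per cell it contains
    per_block = cells_per_block[0] * cells_per_block[1] * orientations
    return n_blocks_r * n_blocks_c * per_block


# 128x128 image: 16x16 cells, 15x15 blocks, 36 values per block
print(hog_feature_length((128, 128)))  # 8100
```

With these settings each row of the returned feature array should therefore have 8100 columns before PCA is applied.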

## 6. Extract HOG features from test images 

In [6]:
# Extract features
X_test_raw, image_ids = extract_hog_features_from_folder(test_img_dir, img_size=img_size)

## 7. Apply PCA and Scaling (same as done during training)

In [7]:
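# Project the raw HOG features onto the PCA components fitted during training
X_test_pca = pca.transform(X_test_raw)
# Scale the PCA output with the scaler fitted during training
X_test_scaled = scaler.transform(X_test_pca)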

## 8. Perform prediction using trained SVM model

In [8]:
y_pred = model.predict(X_test_scaled)  # Predict class labels (0 for cat, 1 for dog)

## 9. Map numeric predictions to string labels

In [9]:
label_map = {0: "cat", 1: "dog"}
predicted_labels = [label_map[label] for label in y_pred]
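
The mapping above assumes the model was trained with integer labels `0` and `1`. A slightly more defensive variant, sketched below, names the offending value when an unexpected label appears. `map_labels` is a hypothetical helper, not part of the original script.

```python
def map_labels(y_pred, label_map):
    """Translate numeric predictions to strings, failing loudly on unknowns."""
    labels = []
    for label in y_pred:
        key = int(label)  # works for Python ints and NumPy integer scalars
        if key not in label_map:
            raise KeyError(f"Unexpected class label: {label!r}")
        labels.append(label_map[key])
    return labels


print(map_labels([0, 1, 1], {0: "cat", 1: "dog"}))  # ['cat', 'dog', 'dog']
```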

## 10. Create submission DataFrame and save to CSV

In [10]:
submission_df = pd.DataFrame({
    "filename": image_ids,  # Sorted filenames
    "label": predicted_labels  # Corresponding predicted labels
})
submission_df.to_csv(output_csv, index=False)

print(f"Predictions saved to: {output_csv}")

Predictions saved to: submission.csv
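
After writing the file, it can help to read it back and confirm the row count and label balance. The sketch below uses only the standard library and assumes the `filename` and `label` columns written above. `summarize_submission` is a hypothetical helper name.

```python
import csv
from collections import Counter


def summarize_submission(csv_path):
    """Return (row count, per-label counts) for a submission CSV."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))

    # Count how many images were assigned to each label
    counts = Counter(row["label"] for row in rows)
    return len(rows), dict(counts)
```

Calling `summarize_submission("submission.csv")` should report one row per test image. A count dominated by a single label can hint at a preprocessing mismatch between training and inference.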
