# Generate HOG Features for SVM Training (48x48)

This notebook extracts HOG (Histogram of Oriented Gradients) features from all images and saves them to disk.

**Purpose**: Run this notebook ONCE to generate HOG features, then use the saved features in `svm hog.ipynb` for faster training.

**Image Size**: 48x48 (a medium-small resolution, tested here as a trade-off between image detail and processing speed)

In [1]:
# Import required libraries
import cv2
import numpy as np
import os

## Setup Paths

In [2]:
# Define dataset paths
BASE_PATH = '/home/ubuntu/Desktop/AIML project/AlphaNum3'
TRAIN_PATH = os.path.join(BASE_PATH, "train")
VALIDATION_PATH = os.path.join(BASE_PATH, "validation")
TEST_PATH = os.path.join(BASE_PATH, "test")

# Define output directory for HOG features
HOG_OUTPUT_PATH = '/home/ubuntu/Desktop/AIML project/AlphaNum/hog_features'
os.makedirs(HOG_OUTPUT_PATH, exist_ok=True)

# Print paths for verification
print(f"Train path: {TRAIN_PATH}")
print(f"Validation path: {VALIDATION_PATH}")
print(f"Test path: {TEST_PATH}")
print(f"HOG features will be saved to: {HOG_OUTPUT_PATH}")

Train path: /home/ubuntu/Desktop/AIML project/AlphaNum3/train
Validation path: /home/ubuntu/Desktop/AIML project/AlphaNum3/validation
Test path: /home/ubuntu/Desktop/AIML project/AlphaNum3/test
HOG features will be saved to: /home/ubuntu/Desktop/AIML project/AlphaNum/hog_features


## HOG Feature Extraction Function

In [3]:
def extract_hog_features(directory_path):
    """
    Extract HOG features from all images in a directory.
    
    Args:
        directory_path: Path to directory containing image subdirectories (classes)
        
    Returns:
        features: NumPy array of HOG features (shape: [num_images, num_features])
        labels: NumPy array of string labels (class names)
    """
    features = []
    labels = []
    
    # Initialize HOG descriptor with fixed parameters for 48x48 images
    # These MUST match the parameters used in prediction
    hog = cv2.HOGDescriptor(
        _winSize=(48, 48),      # Window size: the full 48x48 image
        _blockSize=(16, 16),    # Block size
        _blockStride=(8, 8),    # Step size for blocks
        _cellSize=(8, 8),       # Cell size
        _nbins=9                # Number of orientation bins
    )
    
    total_images = 0
    
    # Loop through each class folder (sorted, since os.listdir order is arbitrary)
    for class_name in sorted(os.listdir(directory_path)):
        class_path = os.path.join(directory_path, class_name)
        if not os.path.isdir(class_path):
            continue
        
        # Loop through each image in the class folder (sorted for reproducible ordering)
        for image_name in sorted(os.listdir(class_path)):
            image_path = os.path.join(class_path, image_name)
            
            # Load image in grayscale
            image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
            if image is not None:
                # Resize to 48x48 pixels
                resized_image = cv2.resize(image, (48, 48))
                
                # Compute HOG features
                hog_features = hog.compute(resized_image)
                
                if hog_features is not None:
                    # Flatten and store features
                    features.append(hog_features.flatten())
                    labels.append(class_name)
                    total_images += 1
    
    print(f"✓ Processed {total_images} images from {os.path.basename(directory_path)}")
    return np.array(features), np.array(labels)

## Extract HOG Features for Training Data

In [4]:
# Extract HOG features from training images
print("Extracting HOG features from training images...")
train_features, train_labels = extract_hog_features(TRAIN_PATH)
print(f"Training features shape: {train_features.shape}")
print(f"Training labels shape: {train_labels.shape}")

Extracting HOG features from training images...
✓ Processed 530 images from train
Training features shape: (530, 900)
Training labels shape: (530,)
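
The 900 columns in the shape above follow directly from the HOG parameters set in `extract_hog_features`. As a rough check, the descriptor length can be worked out by hand. The helper name `hog_feature_length` below is hypothetical and is not part of OpenCV; it only reproduces the arithmetic.

```python
def hog_feature_length(win_size, block_size, block_stride, cell_size, nbins):
    """Return the HOG descriptor length for a single detection window."""
    # Number of block positions along each axis of the window
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Number of cells inside one block
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    # Each cell contributes one histogram of `nbins` orientation bins
    return blocks_x * blocks_y * cells_per_block * nbins


# Parameters copied from the cv2.HOGDescriptor call in this notebook
length = hog_feature_length((48, 48), (16, 16), (8, 8), (8, 8), 9)
print(length)  # 5 * 5 blocks * 4 cells * 9 bins = 900
```

Changing the window size or any other parameter changes the feature length, so features saved here are only compatible with a model trained and used with the same settings. OpenCV can report the same number through `hog.getDescriptorSize()`.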


## Extract HOG Features for Validation Data

In [5]:
# Extract HOG features from validation images
print("\nExtracting HOG features from validation images...")
validation_features, validation_labels = extract_hog_features(VALIDATION_PATH)
print(f"Validation features shape: {validation_features.shape}")
print(f"Validation labels shape: {validation_labels.shape}")


Extracting HOG features from validation images...
✓ Processed 530 images from validation
Validation features shape: (530, 900)
Validation labels shape: (530,)


## Extract HOG Features for Test Data

In [6]:
# Extract HOG features from test images
print("\nExtracting HOG features from test images...")
test_features, test_labels = extract_hog_features(TEST_PATH)
print(f"Test features shape: {test_features.shape}")
print(f"Test labels shape: {test_labels.shape}")


Extracting HOG features from test images...
✓ Processed 530 images from test
Test features shape: (530, 900)
Test labels shape: (530,)


## Save HOG Features to Disk

In [7]:
# Save all HOG features and labels to disk

# Save training data
np.save(os.path.join(HOG_OUTPUT_PATH, 'train_hog_features_48x48.npy'), train_features)
np.save(os.path.join(HOG_OUTPUT_PATH, 'train_labels_48x48.npy'), train_labels)
print("✓ Saved training HOG features (48x48) and labels")

# Save validation data
np.save(os.path.join(HOG_OUTPUT_PATH, 'validation_hog_features_48x48.npy'), validation_features)
np.save(os.path.join(HOG_OUTPUT_PATH, 'validation_labels_48x48.npy'), validation_labels)
print("✓ Saved validation HOG features (48x48) and labels")

# Save test data
np.save(os.path.join(HOG_OUTPUT_PATH, 'test_hog_features_48x48.npy'), test_features)
np.save(os.path.join(HOG_OUTPUT_PATH, 'test_labels_48x48.npy'), test_labels)
print("✓ Saved test HOG features (48x48) and labels")

print(f"\n✅ All 48x48 HOG features saved successfully to: {HOG_OUTPUT_PATH}")
print("📊 Using 48x48 resolution for testing medium-small size")

✓ Saved training HOG features (48x48) and labels
✓ Saved validation HOG features (48x48) and labels
✓ Saved test HOG features (48x48) and labels

✅ All 48x48 HOG features saved successfully to: /home/ubuntu/Desktop/AIML project/AlphaNum/hog_features
📊 Using 48x48 resolution for testing medium-small size


## Summary

HOG features have been extracted and saved for 48x48 images:
- **Training data**: `train_hog_features_48x48.npy` and `train_labels_48x48.npy`
- **Validation data**: `validation_hog_features_48x48.npy` and `validation_labels_48x48.npy`
- **Test data**: `test_hog_features_48x48.npy` and `test_labels_48x48.npy`

**Why 48x48?**
48x48 is a medium-small resolution under test:
- Smaller than 64x64, so feature extraction and training are faster
- Larger than 24x24, so more image detail is preserved
- Expected to balance accuracy against computational cost
- Results will be compared with those from other resolutions

These files can now be loaded in the main SVM notebook to train the model faster without re-extracting HOG features every time.
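
As a minimal sketch of that loading step, assuming the file naming used in the save cell above, the arrays for one split can be read back with `np.load`. The helper `load_hog_split` and its `size_tag` parameter are hypothetical names introduced for illustration. The demo writes small dummy arrays to a temporary directory so that it runs without the real dataset.

```python
import os
import tempfile

import numpy as np


def load_hog_split(output_dir, split, size_tag="48x48"):
    """Load the saved HOG features and labels for one split (hypothetical helper)."""
    features = np.load(os.path.join(output_dir, f"{split}_hog_features_{size_tag}.npy"))
    labels = np.load(os.path.join(output_dir, f"{split}_labels_{size_tag}.npy"))
    # Each feature row must have exactly one label
    if len(features) != len(labels):
        raise ValueError(f"{split}: {len(features)} feature rows but {len(labels)} labels")
    return features, labels


# Demo with dummy data in a temporary directory (not the real dataset)
with tempfile.TemporaryDirectory() as tmp_dir:
    np.save(os.path.join(tmp_dir, "train_hog_features_48x48.npy"),
            np.zeros((4, 900), dtype=np.float32))
    np.save(os.path.join(tmp_dir, "train_labels_48x48.npy"),
            np.array(["A", "B", "A", "7"]))
    demo_features, demo_labels = load_hog_split(tmp_dir, "train")

print(demo_features.shape, demo_labels.shape)
```

In the SVM notebook, `output_dir` would be the same directory as `HOG_OUTPUT_PATH` here, and `split` one of `train`, `validation`, or `test`.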