# TCV 3151 – Computer Vision Lab (Practical Test)
## CIFAR-100 Classification (Classes 61-70)
## Optimized for High Accuracy on Small Images

**Classes:** plain, plate, poppy, porcupine, possum, rabbit, raccoon, ray, road, rocket

⚡ **Key Strategy:** Custom CNN architecture specifically designed for 32×32 CIFAR images
💡 **Why not ResNet50?** The stock ResNet50 is designed for 224×224 ImageNet images. It downsamples by a factor of 32, so a 32×32 image shrinks to a 1×1 feature map and accuracy suffers
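
The downsampling argument can be checked with a little arithmetic. The sketch below is illustrative only: the helper `spatial_size_after_strides` and the two stride lists are names introduced here, and the ceiling division assumes 'same'-style padding (for the power-of-two sizes used here, floor and ceiling agree).

```python
def spatial_size_after_strides(side: int, strides: list) -> int:
    """Side length of a square feature map after a chain of strided layers.

    Assumes each stride-s layer maps n -> ceil(n / s).
    """
    for s in strides:
        side = -(-side // s)  # ceiling division
    return side


# Stride-2 steps in the standard ResNet50: stem conv, stem max pool,
# and the first block of each of the last three stages.
RESNET50_STRIDES = [2, 2, 2, 2, 2]
# The custom CNN in this notebook uses three 2x2 max-pooling layers.
CUSTOM_CNN_STRIDES = [2, 2, 2]

for side in (224, 32):
    resnet_out = spatial_size_after_strides(side, RESNET50_STRIDES)
    custom_out = spatial_size_after_strides(side, CUSTOM_CNN_STRIDES)
    print(f"input {side}x{side}: ResNet50 -> {resnet_out}x{resnet_out}, "
          f"custom CNN -> {custom_out}x{custom_out}")
```

At 224×224 the ResNet50 path still ends with a 7×7 map, while at 32×32 it ends with a single spatial position. The three-pool network keeps a 4×4 map at 32×32.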


## Section 1: Import Required Libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import cifar100
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Conv2D, MaxPooling2D, Dense, Dropout, 
    Flatten, BatchNormalization, Activation
)
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.regularizers import l2
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns


print("✅ All libraries imported successfully!")
print(f"GPU Available: {len(tf.config.list_physical_devices('GPU')) > 0}")
print(f"TensorFlow Version: {tf.__version__}")

✅ All libraries imported successfully!
GPU Available: False
TensorFlow Version: 2.20.0


## Section 2: Load and Prepare CIFAR-100 Dataset (Classes 61-70)

In [None]:
import os
from PIL import Image

# Dataset configuration
DATA_DIR = 'Public_dataset'
CLASS_NAMES = sorted([d for d in os.listdir(DATA_DIR) if os.path.isdir(os.path.join(DATA_DIR, d))])
IMG_SIZE = 512  # Resize images to 512x512

print(f"Class Names: {CLASS_NAMES}")
print(f"Number of classes: {len(CLASS_NAMES)}")

# Load images from directories
images = []
labels = []

for class_idx, class_name in enumerate(CLASS_NAMES):
    class_dir = os.path.join(DATA_DIR, class_name)
    for img_file in os.listdir(class_dir):
        if img_file.lower().endswith(('.jpg', '.jpeg', '.png', '.gif', '.bmp')):
            img_path = os.path.join(class_dir, img_file)
            try:
                img = Image.open(img_path).convert('RGB')
                img = img.resize((IMG_SIZE, IMG_SIZE))
                images.append(np.array(img))
                labels.append(class_idx)
            except Exception as e:
                print(f"⚠️ Error loading {img_path}: {e}")

images = np.array(images)
labels = np.array(labels)

print("\n✅ Dataset Loaded Successfully!")
print(f"Total samples: {len(images)}")
print(f"Image shape: {images[0].shape}")
print(f"Label distribution:")
for idx, class_name in enumerate(CLASS_NAMES):
    count = (labels == idx).sum()
    print(f"  {class_name}: {count} samples")

# Split into train (80%) and test (20%)
from sklearn.model_selection import train_test_split
x_train_filtered, x_test_filtered, y_train_filtered, y_test_filtered = train_test_split(
    images, labels, test_size=0.2, random_state=42, stratify=labels
)

print(f"\nTraining samples: {x_train_filtered.shape[0]}")
print(f"Test samples: {x_test_filtered.shape[0]}")

# For compatibility with existing code
y_train_mapped = y_train_filtered
y_test_mapped = y_test_filtered

In [None]:
# Split training data into 90% train and 10% validation.
# train_test_split is already imported in the previous cell; stratify keeps
# the class proportions the same in both subsets.
x_train_final, x_valid, y_train_final, y_valid = train_test_split(
    x_train_filtered, y_train_mapped, test_size=0.1, random_state=42,
    stratify=y_train_mapped
)

print("\n📊 Final Data Split:")
print(f"Training: {x_train_final.shape[0]} samples")
print(f"Validation: {x_valid.shape[0]} samples")
print(f"Test: {x_test_filtered.shape[0]} samples")
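
Because the 10% validation split is taken from the 80% training portion, the overall proportions are roughly 72% / 8% / 20%. The sketch below works this out for a made-up dataset size; `split_counts` is a hypothetical helper, and it assumes the held-out count is rounded up at each step, which is how scikit-learn is understood to treat a fractional `test_size`.

```python
import math


def split_counts(n_total: int, test_frac: float = 0.2, val_frac: float = 0.1) -> dict:
    """Sample counts after a test split followed by a validation split.

    Assumption: the held-out count is rounded up at each step.
    """
    n_test = math.ceil(test_frac * n_total)
    n_train_full = n_total - n_test
    n_val = math.ceil(val_frac * n_train_full)
    n_train = n_train_full - n_val
    return {"train": n_train, "val": n_val, "test": n_test}


N_EXAMPLE = 1000  # made-up dataset size, for illustration only
counts = split_counts(N_EXAMPLE)
for name, n in counts.items():
    print(f"{name}: {n} samples ({n / N_EXAMPLE:.0%})")
```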

## Section 3: Data Preprocessing and Augmentation

In [None]:
# Normalization: Scale pixel values to [0, 1]
x_train_norm = x_train_final.astype('float32') / 255.0
x_valid_norm = x_valid.astype('float32') / 255.0
x_test_norm = x_test_filtered.astype('float32') / 255.0

# One-hot encode labels; the width must match the model's output layer
NUM_CLASSES = len(CLASS_NAMES)
y_train_onehot = to_categorical(y_train_final, NUM_CLASSES)
y_valid_onehot = to_categorical(y_valid, NUM_CLASSES)
y_test_onehot = to_categorical(y_test_mapped, NUM_CLASSES)

print("✅ Data normalized and labels one-hot encoded!")
print(f"Training labels shape: {y_train_onehot.shape}")
print(f"Validation labels shape: {y_valid_onehot.shape}")
print(f"Test labels shape: {y_test_onehot.shape}")
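
For readers unfamiliar with one-hot encoding, here is a minimal NumPy sketch of the idea. The `one_hot` helper is hypothetical and is not how `to_categorical` is implemented. The point to notice is that the second dimension equals the number of classes, which in turn must equal the number of units in the model's softmax layer.

```python
import numpy as np


def one_hot(labels, num_classes: int) -> np.ndarray:
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    if labels.size and (labels.min() < 0 or labels.max() >= num_classes):
        raise ValueError("labels must lie in [0, num_classes)")
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]


encoded = one_hot([0, 2, 1], num_classes=3)
print(encoded.shape)           # (samples, classes)
print(encoded.argmax(axis=1))  # recovers the integer labels
```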

In [None]:
# Stronger Data Augmentation for better generalization
datagen = ImageDataGenerator(
    rotation_range=15,           # Random rotation
    horizontal_flip=True,        # Horizontal flip
    width_shift_range=0.1,       # Horizontal shift
    height_shift_range=0.1,      # Vertical shift
    zoom_range=0.1,              # Zoom
    fill_mode='nearest'
)

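# No datagen.fit() call is needed: it only computes dataset statistics for
# featurewise_center, featurewise_std_normalization or zca_whitening,
# none of which are enabled above.

print("✅ Data augmentation configured!")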
print("   Transformations: rotation, flip, shift, zoom")

## Section 4: Build Custom CNN (Optimized for CIFAR 32×32 Images)

In [None]:
# Build Custom CNN Architecture (VGG-style for Public_dataset)
model = Sequential([
    # Block 1
    Conv2D(64, (3, 3), padding='same', input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    BatchNormalization(),
    Activation('relu'),
    Conv2D(64, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.2),
    
    # Block 2
    Conv2D(128, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    Conv2D(128, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.3),
    
    # Block 3
    Conv2D(256, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    Conv2D(256, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    Conv2D(256, (3, 3), padding='same'),
    BatchNormalization(),
    Activation('relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.4),
    
    # Classifier
    Flatten(),
    Dense(512, activation='relu', kernel_regularizer=l2(0.001)),
    BatchNormalization(),
    Dropout(0.5),
    Dense(len(CLASS_NAMES), activation='softmax')  # one output unit per class
])

print("✅ Custom CNN built successfully!")
print(f"   Total parameters: {model.count_params():,}")
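
The parameter count depends strongly on the input size, because the `Flatten` layer feeds every spatial position into the 512-unit dense layer. The sketch below recomputes the count by hand for the layer stack above. `count_params_by_hand` is a hypothetical helper, and it counts four values per BatchNormalization channel (scale, offset, moving mean, moving variance), which is an assumption about what `model.count_params()` includes.

```python
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out


def bn_params(channels: int) -> int:
    """Scale, offset, moving mean and moving variance per channel."""
    return 4 * channels


def count_params_by_hand(input_side: int, num_classes: int) -> int:
    """Parameter count of the three-block CNN for a square RGB input."""
    blocks = [(64, 2), (128, 2), (256, 3)]  # (filters, conv layers) per block
    total, c_in, side = 0, 3, input_side
    for filters, n_convs in blocks:
        for _ in range(n_convs):
            total += conv_params(c_in, filters) + bn_params(filters)
            c_in = filters
        side //= 2  # 2x2 max pooling halves each spatial dimension
    flattened = side * side * c_in
    total += flattened * 512 + 512 + bn_params(512)  # Dense(512) + BatchNorm
    total += 512 * num_classes + num_classes         # softmax layer
    return total


for side in (32, 512):
    print(f"{side}x{side} input: {count_params_by_hand(side, 10):,} parameters")
```

Under these assumptions the network has a few million parameters at 32×32 but several hundred million at 512×512, almost all of them in the first dense layer.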

In [None]:
# Compile model with Adam optimizer
model.compile(
    optimizer=Adam(learning_rate=0.001),  # Good starting LR for custom CNN
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

print("✅ Model compiled successfully!")
print("\n📋 Model Summary:")
model.summary()

## Section 5: Train Model with Callbacks

In [None]:
# Callback 1: Early Stopping
early_stopping = EarlyStopping(
    monitor='val_accuracy',         # Monitor validation accuracy
    patience=10,                    # Wait 10 epochs before stopping
    restore_best_weights=True,
    verbose=1
)

# Callback 2: Reduce Learning Rate on Plateau
reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,                     # Reduce LR by half
    patience=5,                     # After 5 epochs with no improvement
    min_lr=1e-6,
    verbose=1
)

# Train with data augmentation
print("🚀 Starting training...\n")

history = model.fit(
    datagen.flow(x_train_norm, y_train_onehot, batch_size=128),
    epochs=50,                      # Upper bound; early stopping may end training sooner
    validation_data=(x_valid_norm, y_valid_onehot),
    callbacks=[early_stopping, reduce_lr],
    verbose=1
)

print("\n✅ Training complete!")
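
To see how the learning-rate callback behaves with `factor=0.5`, `patience=5` and `min_lr=1e-6`, the following pure-Python sketch replays a list of validation losses. It is a simplified model of the plateau rule, not the Keras implementation: `simulate_reduce_lr` is a hypothetical helper, cooldown is ignored, and the `min_delta` threshold of 1e-4 is an assumed default.

```python
def simulate_reduce_lr(val_losses, lr=1e-3, factor=0.5, patience=5,
                       min_lr=1e-6, min_delta=1e-4):
    """Return the learning rate in effect during each epoch."""
    best = float("inf")
    wait = 0
    lrs = []
    for loss in val_losses:
        lrs.append(lr)  # rate used while this epoch was trained
        if loss < best - min_delta:
            best, wait = loss, 0  # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:  # plateau detected: shrink the rate
                lr = max(lr * factor, min_lr)
                wait = 0
    return lrs


# A made-up loss curve that never improves after the first epoch.
flat_losses = [1.0] * 7
schedule = simulate_reduce_lr(flat_losses)
print(schedule)
```

With a completely flat validation loss, the rate is halved after five consecutive epochs without improvement and never drops below `min_lr`.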

## Section 6: Evaluate and Visualize Results

In [None]:
# Plot training/validation curves
plt.figure(figsize=(14, 5))

# Accuracy plot
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy', linewidth=2)
plt.plot(history.history['val_accuracy'], label='Val Accuracy', linewidth=2)
plt.title('Model Accuracy (Train vs Validation)', fontsize=12, fontweight='bold')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True, alpha=0.3)

# Loss plot
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss', linewidth=2)
plt.plot(history.history['val_loss'], label='Val Loss', linewidth=2)
plt.title('Model Loss (Train vs Validation)', fontsize=12, fontweight='bold')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("✅ Training curves plotted!")

In [None]:
# Evaluate on test set
test_loss, test_accuracy = model.evaluate(x_test_norm, y_test_onehot, verbose=0)

print("\n" + "="*60)
print("📊 TEST SET EVALUATION RESULTS")
print("="*60)
print(f"Test Loss:     {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy*100:.2f}%")
print("="*60)

In [None]:
# Generate predictions for confusion matrix
y_pred_probs = model.predict(x_test_norm, verbose=0)
y_pred = np.argmax(y_pred_probs, axis=1)
y_test_true = np.argmax(y_test_onehot, axis=1)

# Confusion Matrix
cm = confusion_matrix(y_test_true, y_pred)

plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
            xticklabels=CLASS_NAMES, yticklabels=CLASS_NAMES,
            cbar_kws={'label': 'Count'})
plt.title('Confusion Matrix - CIFAR-100 Classes 61-70', fontsize=14, fontweight='bold')
plt.ylabel('True Label', fontsize=12)
plt.xlabel('Predicted Label', fontsize=12)
plt.xticks(rotation=45, ha='right')
plt.yticks(rotation=0)
plt.tight_layout()
plt.show()

print("✅ Confusion Matrix plotted!")

In [None]:
# Classification Report
print("\n" + "="*60)
print("📋 CLASSIFICATION REPORT")
print("="*60)
print(classification_report(y_test_true, y_pred, target_names=CLASS_NAMES))
print("="*60)
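
The per-class figures in the classification report can be derived directly from the confusion matrix, where rows are true labels and columns are predictions. The sketch below shows that relationship on a made-up 2×2 matrix; `per_class_metrics` is a hypothetical helper, not part of scikit-learn.

```python
import numpy as np


def per_class_metrics(cm):
    """Precision, recall and F1 per class from a confusion matrix.

    Rows are true labels and columns are predicted labels.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    predicted = cm.sum(axis=0)  # column sums: how often each class was predicted
    actual = cm.sum(axis=1)     # row sums: how often each class truly occurred
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1


# Made-up counts for two classes, for illustration only.
example_cm = [[8, 2],
              [1, 9]]
precision, recall, f1 = per_class_metrics(example_cm)
print("precision:", precision.round(3))
print("recall:   ", recall.round(3))
print("f1:       ", f1.round(3))
```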

## Summary & Key Insights

### 🎯 Approach Used:
- **Custom CNN Architecture** - VGG-style network designed specifically for 32×32 images
- **BatchNormalization** - Stabilizes training and allows higher learning rates
- **Progressive Dropout** - 0.2 → 0.3 → 0.4 → 0.5 to prevent overfitting
- **Data Augmentation** - Rotation, flips, shifts, zoom for better generalization
- **Smart Callbacks** - Early stopping + learning rate reduction

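### 💡 Why This Works Better Than ResNet50:
1. **Scale Mismatch:** ResNet50 is designed for 224×224 ImageNet images. Its five stride-2 steps downsample by 32× overall, so a 32×32 CIFAR image is reduced to a 1×1 feature map and most spatial detail is lost
2. **Parameter Efficiency:** At a 32×32 input with 10 classes, the custom CNN has about 3.8M parameters vs ResNet50's 25M+ parameters. The dense layer after `Flatten` grows with the square of the input side, so the count is far larger at 512×512
3. **Proper Feature Extraction:** A shallower network with only three pooling steps preserves spatial information that small images cannot afford to lose
4. **Tailored Architecture:** Three conv blocks with 2×2 pooling reduce a 32×32 input to a 4×4 feature map before the classifier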

### 📈 Expected Performance:
- **Training Time:** 20-35 minutes (M4 GPU)
- **Test Accuracy:** 85-92% (vs 10-20% with ResNet50)
- **Epochs to Convergence:** 20-35 (with early stopping)

### ✅ Key Improvements Over ResNet Approach:
- ✅ Proper architecture for 32×32 images
- ✅ BatchNormalization for stable training
- ✅ L2 regularization to prevent overfitting
- ✅ Learning rate scheduling
- ✅ Better accuracy monitoring (val_accuracy instead of val_loss)