# Image Classification with Pre-trained ResNet-34

This notebook evaluates a pre-trained ResNet-34 model on a custom image dataset. We measure how well the model classifies test images from 100 ImageNet classes (indices 401-500), which are among the 1,000 classes the model was trained to recognize.

In [1]:
import torch
import torchvision
from torchvision import transforms
import numpy as np
import json
import os
from tqdm import tqdm
from PIL import Image

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cuda


## Model Setup

Here we load a ResNet-34 model with weights pre-trained on ImageNet. The model is moved to the appropriate device (GPU if available, otherwise CPU) and set to evaluation mode, so that batch normalization layers use their stored running statistics instead of updating them.

In [2]:
# Load the pre-trained ResNet-34 model
pretrained_model = torchvision.models.resnet34(weights='IMAGENET1K_V1')
pretrained_model = pretrained_model.to(device)
pretrained_model.eval()

Downloading: "https://download.pytorch.org/models/resnet34-b627a593.pth" to /root/.cache/torch/hub/checkpoints/resnet34-b627a593.pth
100%|██████████| 83.3M/83.3M [00:00<00:00, 203MB/s]


ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

## Data Preprocessing

To process our images consistently with how the model was trained, we apply the same per-channel normalization parameters used for ImageNet training. No resizing or cropping is applied, so images reach the model at their stored resolution.

The transformation pipeline:
1. Converts images to PyTorch tensors
2. Normalizes pixel values using ImageNet mean and standard deviation
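
As a minimal sketch of the arithmetic behind these two steps, the snippet below reproduces them in plain Python for a single RGB pixel. The helper `normalize_pixel` is a hypothetical name used for illustration and is not part of torchvision. It assumes an 8-bit input, which `ToTensor` scales to the range [0, 1] before `Normalize` computes `(x - mean) / std` per channel.

```python
# ImageNet per-channel statistics, the same values used in this notebook
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Hypothetical helper: mimic ToTensor + Normalize for one 8-bit RGB pixel."""
    scaled = [c / 255.0 for c in rgb]  # ToTensor step: [0, 255] -> [0.0, 1.0]
    return [(x - m) / s for x, m, s in zip(scaled, MEAN, STD)]  # Normalize step

# A white pixel gives the largest value each channel can take after normalization
print(normalize_pixel((255, 255, 255)))
# A black pixel gives the smallest value each channel can take
print(normalize_pixel((0, 0, 0)))
```

The normalized values are no longer confined to [0, 1]; they are centered near zero, which matches the input distribution the pre-trained weights expect.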

In [3]:
# Set up normalization parameters as specified in the project instructions
mean_norms = np.array([0.485, 0.456, 0.406])
std_norms = np.array([0.229, 0.224, 0.225])

# Create the transform pipeline
plain_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=mean_norms, std=std_norms)
])

# Define dataset path
dataset_path = "/kaggle/input/testdata/TestDataSet"

## Custom Dataset Implementation

We implement a custom dataset class that manually loads images from the file system. This approach:

- Scans through class directories to find all valid image files
- Ensures proper RGB conversion for all images (some may be grayscale or have alpha channels)
- Assigns each class folder an integer label based on its position in the sorted folder list
- Applies the necessary transformations to prepare images for the model

The dataset class follows PyTorch's Dataset interface with `__len__` and `__getitem__` methods.
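
The scanning logic can be tried in isolation on a throwaway directory tree. The following is a minimal sketch under stated assumptions: `scan_image_folders` is a hypothetical helper, the folder and file names are made up for the demo, and file names are sorted here for reproducibility (the dataset class in this notebook does not sort them).

```python
import os
import tempfile

IMAGE_EXTS = ('.jpg', '.jpeg', '.png', '.bmp', '.tiff')

def scan_image_folders(root):
    """Hypothetical helper: collect (path, label) pairs from class subfolders."""
    # Sorted, non-hidden subdirectories define the class order
    classes = sorted(
        d for d in os.listdir(root)
        if os.path.isdir(os.path.join(root, d)) and not d.startswith('.')
    )
    samples = []
    for idx, class_dir in enumerate(classes):
        dir_path = os.path.join(root, class_dir)
        for name in sorted(os.listdir(dir_path)):
            # Case-insensitive extension check, as in the dataset class
            if name.lower().endswith(IMAGE_EXTS):
                samples.append((os.path.join(dir_path, name), idx))
    return classes, samples

# Build a tiny temporary tree; all names below are invented for the demo
with tempfile.TemporaryDirectory() as root:
    layout = {'class_b': ['1.JPG', 'notes.txt'],
              'class_a': ['2.png'],
              '.hidden': ['3.jpg']}
    for folder, files in layout.items():
        os.makedirs(os.path.join(root, folder))
        for name in files:
            open(os.path.join(root, folder, name), 'w').close()
    classes, samples = scan_image_folders(root)
    labels = [label for _, label in samples]

print(classes)  # ['class_a', 'class_b']
print(labels)   # [0, 1]
```

The hidden folder and the text file are skipped, and the label of each image is simply the index of its folder in the sorted class list.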

In [4]:
# Load the dataset manually with RGB conversion
class ManualImageDataset(torch.utils.data.Dataset):
    def __init__(self, root, transform=None):
        self.transform = transform
        self.samples = []
        self.classes = []
        self.class_to_idx = {}
        
        # Get all valid directories (excluding hidden folders)
        class_dirs = [d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d)) and not d.startswith('.')]
        class_dirs.sort()  # Ensure consistent ordering
        
        # For each directory, find all images
        for i, class_dir in enumerate(class_dirs):
            self.classes.append(class_dir)
            self.class_to_idx[class_dir] = i
            
            dir_path = os.path.join(root, class_dir)
            for img_file in os.listdir(dir_path):
                if img_file.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp', '.tiff')):
                    img_path = os.path.join(dir_path, img_file)
                    self.samples.append((img_path, i))
        
        print(f"Loaded {len(self.samples)} images across {len(self.classes)} classes")
    
    def __len__(self):
        return len(self.samples)
    
    def __getitem__(self, idx):
        img_path, label = self.samples[idx]
        
        # Open image with PIL and convert to RGB
        img = Image.open(img_path).convert('RGB')
        
        if self.transform:
            img = self.transform(img)
        
        return img, label

## Dataset Loading

Now we instantiate our custom dataset and create a DataLoader with the following settings:

- The transformations defined earlier
- A batch size of 1, which the evaluation function relies on when it calls `labels.item()`
- Shuffling disabled, since we're only evaluating, not training

In [5]:
# Load the dataset
dataset = ManualImageDataset(dataset_path, transform=plain_transforms)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False)

Loaded 500 images across 100 classes


## Class Label Mapping

To interpret the model's predictions meaningfully, we need to map between:
1. Local folder indices (0, 1, 2...)
2. Actual ImageNet class indices (401-500)
3. Human-readable class names

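We load the class labels from a JSON file and create the necessary mappings. The folder-to-class mapping assumes that the sorted folder names (WordNet IDs) appear in the same order as ImageNet indices 401-500.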
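
The two mappings can be sketched on a three-entry sample. The entries and folder names below are copied from this notebook's printed output; `parse_label_list` is a hypothetical helper name, and the `401 + i` offset is the same assumption the notebook makes about folder order.

```python
def parse_label_list(entries):
    """Hypothetical helper: turn 'index: name' strings into {index: name}."""
    names = {}
    for item in entries:
        if isinstance(item, str) and ': ' in item:
            idx, name = item.split(': ', 1)  # split only on the first separator
            names[int(idx)] = name
    return names

# Sample entries and folder names taken from the notebook output
label_list = ['401: accordion', '402: acoustic guitar', '403: aircraft carrier']
folders = ['n02687172', 'n02672831', 'n02676566']  # deliberately unsorted

class_names = parse_label_list(label_list)

# Assumption: the i-th folder in sorted order corresponds to ImageNet index 401 + i
folder_to_class = {folder: 401 + i for i, folder in enumerate(sorted(folders))}

for folder, idx in folder_to_class.items():
    print(folder, idx, class_names[idx])
```

Sorting the folder names before enumerating is what keeps the offset stable regardless of the order in which the file system lists them.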

In [6]:
# Load labels_list.json for class names (401-500)
with open(os.path.join(dataset_path, 'labels_list.json'), 'r') as f:
    label_list = json.load(f)
print("Loaded label_list.json")
    
# Print a sample of the label list
if isinstance(label_list, list) and len(label_list) > 0:
    print(f"First few entries: {label_list[:3]}")


# Create class name mapping (index 401-500 -> name)
class_names = {}
if isinstance(label_list, list):
    for item in label_list:
        if isinstance(item, str) and ': ' in item:
            idx, name = item.split(': ', 1)
            class_names[int(idx)] = name

# Map folder indices to classes (401-500)
folder_to_class = {}
for i, folder in enumerate(dataset.classes):
    class_idx = 401 + i
    folder_to_class[folder] = class_idx
    name = class_names.get(class_idx, f"Unknown_{class_idx}")
    print(f"Folder {i} ({folder}) -> Class {class_idx} ({name})")

Loaded label_list.json
First few entries: ['401: accordion', '402: acoustic guitar', '403: aircraft carrier']
Folder 0 (n02672831) -> Class 401 (accordion)
Folder 1 (n02676566) -> Class 402 (acoustic guitar)
Folder 2 (n02687172) -> Class 403 (aircraft carrier)
Folder 3 (n02690373) -> Class 404 (airliner)
Folder 4 (n02692877) -> Class 405 (airship)
Folder 5 (n02699494) -> Class 406 (altar)
Folder 6 (n02701002) -> Class 407 (ambulance)
Folder 7 (n02704792) -> Class 408 (amphibian)
Folder 8 (n02708093) -> Class 409 (analog clock)
Folder 9 (n02727426) -> Class 410 (apiary)
Folder 10 (n02730930) -> Class 411 (apron)
Folder 11 (n02747177) -> Class 412 (ashcan)
Folder 12 (n02749479) -> Class 413 (assault rifle)
Folder 13 (n02769748) -> Class 414 (backpack)
Folder 14 (n02776631) -> Class 415 (bakery)
Folder 15 (n02777292) -> Class 416 (balance beam)
Folder 16 (n02782093) -> Class 417 (balloon)
Folder 17 (n02783161) -> Class 418 (ballpoint)
Folder 18 (n02786058) -> Class 419 (Band Aid)
Folder 1

## Model Evaluation Function

This function evaluates the model's performance on our dataset:

1. For each image, obtains model predictions
2. Maps the local label to the correct ImageNet class index
3. Checks if the true class appears in the model's top-k predictions
4. Calculates accuracy metrics for different k values (1 and 5)

Top-1 accuracy measures how often the model's best guess is correct.
Top-5 accuracy measures how often the correct class appears in the model's top 5 guesses.
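
The top-k check itself is easy to illustrate without a model. This is a minimal sketch with made-up scores for a six-class toy problem; `topk_hits` is a hypothetical helper, not the notebook's evaluation function.

```python
def topk_hits(scores, true_class, k_values=(1, 5)):
    """Hypothetical helper: for each k, is true_class among the k highest scores?"""
    # Class indices ordered from highest to lowest score
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {k: true_class in ranked[:k] for k in k_values}

# Invented scores: class 1 ranks first, class 2 ranks second, class 3 ranks last
scores = [0.1, 2.5, 1.9, -0.3, 0.7, 0.0]
hits = topk_hits(scores, true_class=2)
print(hits)  # {1: False, 5: True}
```

Here the true class is the model's second choice, so it counts as a miss for top-1 and a hit for top-5. Averaging these hits over all images gives the two accuracy figures.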

In [7]:
# Function to evaluate model and compute top-k accuracy
def evaluate_model(model, dataloader, folder_to_class, k_values=(1, 5)):
    correct = {k: 0 for k in k_values}
    total = 0
    # Use the class list of the dataset behind this loader, not a global variable
    classes = dataloader.dataset.classes
    
    with torch.no_grad():
        for images, labels in tqdm(dataloader, desc="Evaluating model"):
            images = images.to(device)
            folder_idx = labels.item()  # requires batch_size=1
            folder_name = classes[folder_idx]
            
            # Get the correct class index for this folder (401-500)
            true_class = folder_to_class[folder_name]
            
            # Forward pass
            outputs = model(images)
            
            # Get top-k predictions
            _, top_indices = outputs.topk(max(k_values), dim=1)
            top_indices = top_indices.cpu().numpy()[0]
            
            # Check if true class is in top-k predictions
            for k in k_values:
                if true_class in top_indices[:k]:
                    correct[k] += 1
            
            total += 1
    
    # Calculate accuracy percentages
    accuracy = {k: 100 * correct[k] / total for k in k_values}
    return accuracy, total

## Running the Evaluation

Now we run the evaluation using our previously defined function and the pre-trained ResNet-34 model.
We'll measure both top-1 and top-5 accuracy on our test dataset.

In [8]:
# Evaluate the model
print("\nEvaluating ResNet-34 on the test dataset...")
accuracy, total_images = evaluate_model(pretrained_model, dataloader, folder_to_class)


Evaluating ResNet-34 on the test dataset...


Evaluating model: 100%|██████████| 500/500 [00:06<00:00, 83.00it/s] 


## Results and Saving

Finally, we display the evaluation results and save them to a JSON file for future reference.
The JSON file will contain both the top-1 and top-5 accuracy percentages, as well as the total number of images evaluated.

In [9]:
# Print results
print("\nResNet-34 Evaluation Results:")
print(f"Top-1 Accuracy: {accuracy[1]:.2f}%")
print(f"Top-5 Accuracy: {accuracy[5]:.2f}%")
print(f"Total images evaluated: {total_images}")

# Save results to file
with open('task1_results.json', 'w') as f:
    json.dump({
        'top1_accuracy': float(accuracy[1]),
        'top5_accuracy': float(accuracy[5]),
        'total_images': total_images
    }, f, indent=4)

print("\nTask 1 completed successfully!")


ResNet-34 Evaluation Results:
Top-1 Accuracy: 76.00%
Top-5 Accuracy: 94.20%
Total images evaluated: 500

Task 1 completed successfully!
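
To reuse the saved metrics later, the file can be read back with `json.load`. The sketch below is illustrative only: it writes the values printed above to a temporary directory (so no real file is touched) and recovers the number of correct top-1 predictions from the stored percentage.

```python
import json
import os
import tempfile

# Values copied from the results printed above
results = {'top1_accuracy': 76.0, 'top5_accuracy': 94.2, 'total_images': 500}

# Round-trip through a temporary file instead of the real results file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'task1_results.json')
    with open(path, 'w') as f:
        json.dump(results, f, indent=4)
    with open(path) as f:
        reloaded = json.load(f)

# Convert the stored percentage back into a count of correct predictions
top1_correct = round(reloaded['top1_accuracy'] / 100 * reloaded['total_images'])
print(reloaded)
print(top1_correct)  # 380
```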
