<a href="https://www.kaggle.com/code/benjamindup/94-gender-recognition-resnet18?scriptVersionId=189052575" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

## **Disclaimer**
**This is a personal notebook. The comments are AI-generated (the code is mine); do not rely on them.**

# Image Classification with PyTorch: Gender Recognition

## 1. Introduction

Welcome to this Kaggle notebook on image classification using PyTorch! In this tutorial, we'll be tackling the task of gender classification using facial images. Our objective is to build, train, and evaluate a Convolutional Neural Network (CNN) that can accurately predict whether an image contains a male or female face.

This notebook is designed to guide you through the entire process of creating an image classification model, from data preparation to model evaluation. We'll cover important concepts in deep learning and computer vision, making this an excellent resource for students and practitioners alike.

Let's get started!

## 2. Setup

First, we need to import the necessary libraries and set up our environment. Each library we're using serves a specific purpose in our machine learning pipeline.

In [None]:
# Importing necessary libraries
import os
import numpy as np
import pandas as pd
from pathlib import Path
from PIL import Image
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as transforms
import torchvision.models as models
import torch.nn.functional as F
from tqdm.notebook import tqdm
from collections import Counter
import matplotlib.pyplot as plt
import seaborn as sns
from torchvision.models import resnet18

# Setting the random seed for reproducibility
torch.manual_seed(42)
np.random.seed(42)

# Setting the paths
TRAIN_PATH = Path("/kaggle/input/gender-recognizer/dataset")
IMAGE_PATH_LIST = list(TRAIN_PATH.glob("*/*.jpg"))

print(f"Total number of images: {len(IMAGE_PATH_LIST)}")
print(f"Sample image path: {IMAGE_PATH_LIST[0]}")
print(f"Sample image path 2: {IMAGE_PATH_LIST[-1]}")

Let's break down the purpose of each imported library:

- `os`, `pathlib`: For handling file and directory paths
- `numpy`, `pandas`: For numerical operations and data manipulation
- `PIL`: For opening and processing images
- `sklearn`: For data splitting and evaluation metrics
- `torch`, `torch.nn`, `torch.optim`, `torch.utils.data`: Core PyTorch modules for building and training neural networks and for loading data in batches
- `torchvision.transforms`, `torchvision.models`: For image transformations and pre-trained models
- `tqdm`: For progress bars during training
- `matplotlib`, `seaborn`: For creating visualizations

## 3. Data Preparation

### 3.1 Custom Dataset Class

We'll create a custom `GenderDataset` class that inherits from `torch.utils.data.Dataset`. This class will handle loading and preprocessing our images.

In [None]:
class GenderDataset(Dataset):
    def __init__(self, image_paths, transform=None):
        self.image_paths = image_paths
        self.transform = transform
        self.labels = [0 if 'woman' in str(path).lower() else 1 for path in image_paths]
        
    def __len__(self):
        return len(self.image_paths)
    
    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        image = Image.open(img_path).convert("RGB")
        label = self.labels[idx]
        
        if self.transform:
            image = self.transform(image)
            
        return image, label

# Let's check the distribution of our dataset
label_counts = Counter(0 if 'woman' in str(path).lower() else 1 for path in IMAGE_PATH_LIST)
print("Class distribution:")
print(f"Female (0): {label_counts[0]}")
print(f"Male (1): {label_counts[1]}")

plt.figure(figsize=(8, 6))
plt.bar(['Female', 'Male'], [label_counts[0], label_counts[1]])
plt.title('Class Distribution')
plt.ylabel('Number of Images')
plt.show()

The `GenderDataset` class does the following:
1. Stores the list of image paths and the transformation pipeline
2. Derives a label from each image path: 0 (female) if the path contains `woman`, otherwise 1 (male)
3. Implements `__len__` to return the total number of images
4. Implements `__getitem__` to load and preprocess an image when indexed
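
As a minimal sketch of the labeling rule, the snippet below applies the same substring check to two hypothetical paths (the folder and file names are assumptions, not taken from the dataset). It also illustrates why the check looks for `woman` rather than `man`: any path containing `woman` also contains `man`.

```python
from pathlib import Path

def label_from_path(path):
    """Return 0 (female) if 'woman' appears in the path, else 1 (male)."""
    return 0 if "woman" in str(path).lower() else 1

# Hypothetical example paths; the real dataset layout may differ.
example_paths = [
    Path("dataset/WOMAN/img_001.jpg"),
    Path("dataset/MAN/img_002.jpg"),
]

labels = [label_from_path(p) for p in example_paths]
print(labels)

# A naive check for 'man' cannot separate the classes, because
# 'woman' contains 'man' as a substring.
naive_matches = ["man" in str(p).lower() for p in example_paths]
print(naive_matches)
```

The rule assigns label 1 to every path that does not contain `woman`, so it assumes the dataset holds only the two expected classes.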

### 3.2 Data Augmentation and Normalization

Data augmentation helps our model generalize better by exposing it to various transformations of the input data. Normalization ensures that our input features are on a similar scale, which can help with model convergence.

In [None]:
# Data augmentation and normalization
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Changed to 224x224 for ResNet
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Splitting the data into train and validation sets
train_paths, val_paths = train_test_split(IMAGE_PATH_LIST, test_size=0.2, random_state=42)
train_dataset = GenderDataset(train_paths, transform=transform)
val_dataset = GenderDataset(val_paths, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

print(f"Number of training samples: {len(train_dataset)}")
print(f"Number of validation samples: {len(val_dataset)}")

Let's break down our transformation pipeline:
1. `Resize`: Ensures all images are the same size (224x224 pixels)
2. `RandomHorizontalFlip`: Randomly flips images horizontally, increasing dataset variety
3. `ToTensor`: Converts images to PyTorch tensors
4. `Normalize`: Normalizes the image using mean and std dev values typical for ImageNet
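
To make the last two steps concrete, here is a small pure-Python sketch of the arithmetic for a single pixel: scaling 8-bit values to the 0 to 1 range (as `ToTensor` does for RGB PIL images) and then applying `(value - mean) / std` per channel. The helper names and the sample pixel are assumptions made for illustration.

```python
# ImageNet channel statistics, the same values passed to transforms.Normalize above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def to_unit_range(rgb_uint8):
    """Mimic the scaling step of ToTensor: map 0..255 integers to 0..1 floats."""
    return [v / 255.0 for v in rgb_uint8]

def normalize_pixel(rgb):
    """Apply (value - mean) / std to one RGB pixel already scaled to [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]

# Hypothetical mid-gray pixel, chosen only for illustration.
pixel = to_unit_range([128, 128, 128])
normalized = normalize_pixel(pixel)
print([round(v, 3) for v in normalized])
```

A pixel equal to the channel means maps to zero in every channel, so the normalized inputs are roughly centered around zero, which matches the input statistics the pre-trained network saw during its original training.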

## 4. Model Definition

### 4.1 Using Pre-trained ResNet

Instead of building a CNN from scratch, we'll use a pre-trained ResNet model from `torchvision.models` and fine-tune it for our classification task. Pre-trained models have been trained on large datasets like ImageNet and can be adapted to specific tasks with transfer learning.

In [None]:
# Initialize model architecture without pre-trained weights
resnet = resnet18(weights=None)

# Load the model weights from the provided file
model_weights_path = '/kaggle/input/resnet-18/pytorch/resnet18/1/resnet18-f37072fd.pth'
resnet.load_state_dict(torch.load(model_weights_path))

# Freeze the convolutional base
for param in resnet.parameters():
    param.requires_grad = False

# Modify the final layer for binary classification
num_ftrs = resnet.fc.in_features
resnet.fc = nn.Linear(num_ftrs, 1)  # Binary classification (male vs. female)

# Move the model to the GPU (if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
resnet = resnet.to(device)

print("Ready to be used")
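
The effect of freezing and then replacing the head can be checked by counting parameters. The sketch below uses a tiny stand-in network instead of ResNet-18 (the layer sizes are arbitrary assumptions), but the pattern is the same: parameters frozen before the swap stay frozen, and the newly created layer is trainable by default.

```python
import torch.nn as nn

# Small stand-in for a pre-trained network; a hypothetical toy model,
# not the ResNet-18 used in this notebook.
model = nn.Sequential(
    nn.Linear(8, 4),   # plays the role of the frozen feature extractor
    nn.ReLU(),
    nn.Linear(4, 3),   # plays the role of the original classification head
)

# Freeze every existing parameter.
for param in model.parameters():
    param.requires_grad = False

# Replace the head; parameters of a newly created layer have requires_grad=True.
model[2] = nn.Linear(model[2].in_features, 1)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
print(f"Trainable parameters: {trainable}, frozen parameters: {frozen}")
```

Running the same two sums on `resnet` after the cell above is a quick sanity check that only the final layer will receive gradient updates.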

### 4.2 Loss Function and Optimizer

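We'll use `nn.BCEWithLogitsLoss`, which applies a sigmoid to the model's raw output (the logit) and then computes the binary cross-entropy, together with the Adam optimizer. Only the parameters of the new final layer are passed to the optimizer.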

In [None]:
criterion = nn.BCEWithLogitsLoss()  # Sigmoid + binary cross-entropy, computed on raw logits
optimizer = optim.Adam(resnet.fc.parameters(), lr=0.001)  # Only train the final layer
print('Optimizer and criterion set up.')
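
For intuition, the loss for a single sample can be written out by hand. The sketch below compares a two-step version (sigmoid, then cross-entropy) with a fused formula that works directly on the logit. It is a pure-Python illustration with made-up logits, not PyTorch's actual implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_probability(p, y):
    """Binary cross-entropy for one sample, given a probability p and target y."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_with_logits(z, y):
    """Same loss computed directly from the logit z, avoiding log(0) for large |z|."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# Hypothetical logit/target pairs, for illustration only.
for z, y in [(2.0, 1.0), (-1.5, 1.0), (0.3, 0.0)]:
    two_step = bce_from_probability(sigmoid(z), y)
    fused = bce_with_logits(z, y)
    print(f"logit={z:+.1f} target={y:.0f} two-step={two_step:.6f} fused={fused:.6f}")
```

Both versions agree on moderate logits. The fused form is the reason the model's final layer outputs a raw score with no sigmoid: the sigmoid is folded into the loss during training and applied explicitly only when predictions are needed.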

## 5. Training the Model

### 5.1 Training Loop

We'll implement the training loop to train our model for a specified number of epochs. During each epoch, we'll iterate over the training DataLoader, compute the loss, and update the model parameters.

In [None]:
def train_model(model, criterion, optimizer, train_loader, val_loader, num_epochs=10):
    train_loss_history = []
    val_loss_history = []
    val_accuracy_history = []
    best_accuracy = 0.0
    
    for epoch in range(num_epochs):
        model.train()
        running_loss = 0.0
        
        for images, labels in tqdm(train_loader, desc=f"Training Epoch {epoch+1}/{num_epochs}"):
            images, labels = images.to(device), labels.to(device).float().unsqueeze(1)
            
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item() * images.size(0)
        
        epoch_loss = running_loss / len(train_loader.dataset)
        train_loss_history.append(epoch_loss)
        
        # Validation loss and accuracy calculation
        model.eval()
        val_running_loss = 0.0
        all_labels = []
        all_preds = []
        
        with torch.no_grad():
            for images, labels in tqdm(val_loader, desc="Validation"):
                images, labels = images.to(device), labels.to(device).float().unsqueeze(1)
                outputs = model(images)
                loss = criterion(outputs, labels)
                val_running_loss += loss.item() * images.size(0)
                
                preds = torch.round(torch.sigmoid(outputs))  # Apply sigmoid and round to get binary predictions
                all_labels.extend(labels.cpu().numpy())
                all_preds.extend(preds.cpu().numpy())
        
        val_loss = val_running_loss / len(val_loader.dataset)
        val_loss_history.append(val_loss)
        
        # Calculate accuracy
        accuracy = accuracy_score(all_labels, all_preds)
        val_accuracy_history.append(accuracy)
        
        print(f"Epoch [{epoch+1}/{num_epochs}], Training Loss: {epoch_loss:.4f}, Validation Loss: {val_loss:.4f}, Validation Accuracy: {accuracy:.4f}")
        
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            torch.save(model.state_dict(), 'best_model.pth')
            print("Best model saved!")
    
    return train_loss_history, val_loss_history, val_accuracy_history

# Train the model
num_epochs = 20
train_loss_history, val_loss_history, val_accuracy_history = train_model(resnet, criterion, optimizer, train_loader, val_loader, num_epochs=num_epochs)
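
One detail of the loop worth spelling out is why each batch loss is multiplied by `images.size(0)` before being summed. With the default `reduction='mean'`, `loss.item()` is the mean over the batch, so weighting by batch size and dividing by the dataset size gives a true per-sample average even when the last batch is smaller. The numbers below are hypothetical.

```python
# Hypothetical per-batch mean losses and batch sizes; the last batch is smaller,
# as happens when the dataset size is not a multiple of the batch size.
batch_mean_losses = [0.70, 0.60, 0.20]
batch_sizes = [32, 32, 8]

# Same accumulation as in train_model: batch mean loss times number of samples.
running_loss = 0.0
for mean_loss, size in zip(batch_mean_losses, batch_sizes):
    running_loss += mean_loss * size

epoch_loss = running_loss / sum(batch_sizes)                   # per-sample average
naive_loss = sum(batch_mean_losses) / len(batch_mean_losses)   # average of batch means

print(f"Per-sample epoch loss: {epoch_loss:.4f}")
print(f"Mean of batch means:   {naive_loss:.4f}")
```

The two values differ because the small final batch would otherwise count as much as a full one.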


### 5.2 Plotting the Loss Curves

Visualizing the loss curves helps us understand the training process and identify potential issues like overfitting.

In [None]:
# Plotting the training and validation loss curves
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.plot(train_loss_history, label='Training Loss')
plt.plot(val_loss_history, label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss Curves')
plt.legend()

# Plotting the validation accuracy
plt.subplot(1, 2, 2)
plt.plot(val_accuracy_history, label='Validation Accuracy', color='green')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Validation Accuracy Curve')
plt.legend()

plt.show()
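
Besides eyeballing the plot, a rough numeric check can help: find the epoch with the lowest validation loss and see how far training has continued past it. The histories below are invented for illustration, and this heuristic is only one possible way to flag overfitting.

```python
# Hypothetical loss histories, one value per epoch (not results from this notebook).
train_loss = [0.60, 0.45, 0.38, 0.33, 0.29, 0.26]
val_loss = [0.58, 0.47, 0.42, 0.41, 0.43, 0.46]

# Epoch (0-based) at which the validation loss was lowest.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
epochs_since_best = len(val_loss) - 1 - best_epoch

# Gap between validation and training loss at the final epoch.
final_gap = val_loss[-1] - train_loss[-1]

print(f"Lowest validation loss at epoch {best_epoch + 1}")
print(f"Epochs trained since then: {epochs_since_best}")
print(f"Final validation/training gap: {final_gap:.2f}")
```

A validation loss that keeps rising after its minimum while the training loss keeps falling is the typical overfitting signature.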

## 6. Evaluating the Model

### 6.1 Loading the Best Model

We'll load the best model saved during training (the checkpoint with the highest validation accuracy) and evaluate it on the validation set.

In [None]:
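# Load the best model: rebuild the architecture, then restore the saved weights
best_model = resnet18(weights=None)  # No pre-trained weights needed; the checkpoint provides them
best_model.fc = nn.Linear(best_model.fc.in_features, 1)  # Same single-output head as during training
best_model.load_state_dict(torch.load('best_model.pth', map_location=device))
best_model = best_model.to(device)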

# Evaluation function
def evaluate_model(model, data_loader):
    model.eval()
    all_labels = []
    all_preds = []
    
    with torch.no_grad():
        for images, labels in data_loader:
            images, labels = images.to(device), labels.to(device).float().unsqueeze(1)
            outputs = model(images)
            preds = torch.round(torch.sigmoid(outputs))  # Apply sigmoid and round to get binary predictions
            all_labels.extend(labels.cpu().numpy())
            all_preds.extend(preds.cpu().numpy())
    
    return np.array(all_labels), np.array(all_preds)

# Evaluate the best model on the validation set
val_labels, val_preds = evaluate_model(best_model, val_loader)

# Calculate accuracy
accuracy = accuracy_score(val_labels, val_preds)
print(f"Validation Accuracy: {accuracy:.4f}")

# Print classification report
print("\nClassification Report:")
print(classification_report(val_labels, val_preds, target_names=['Female', 'Male']))

# Plot confusion matrix
cm = confusion_matrix(val_labels, val_preds)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", xticklabels=['Female', 'Male'], yticklabels=['Female', 'Male'])
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.title('Confusion Matrix')
plt.show()
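
The numbers in the classification report can be derived directly from the confusion matrix. The sketch below does this by hand for the `Male` class, using a made-up matrix in the layout scikit-learn uses (rows are true labels, columns are predicted labels). The counts are assumptions, not results from this notebook.

```python
# Hypothetical confusion matrix: rows = true labels, columns = predicted labels,
# class order [Female (0), Male (1)].
example_cm = [[45, 5],
              [10, 40]]

tn, fp = example_cm[0]  # true Female: predicted Female / predicted Male
fn, tp = example_cm[1]  # true Male:   predicted Female / predicted Male

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision_male = tp / (tp + fp)   # of all "Male" predictions, how many were correct
recall_male = tp / (tp + fn)      # of all true Male images, how many were found
f1_male = 2 * precision_male * recall_male / (precision_male + recall_male)

print(f"Accuracy:         {accuracy:.4f}")
print(f"Precision (Male): {precision_male:.4f}")
print(f"Recall (Male):    {recall_male:.4f}")
print(f"F1 (Male):        {f1_male:.4f}")
```

Comparing precision and recall per class shows whether errors lean toward one class, which accuracy alone hides.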

## 7. Conclusion

In this project, we adapted a pre-trained ResNet-18 to classify facial images as male or female using PyTorch, keeping the pre-trained layers frozen and training only a new final layer. We went through the entire machine learning workflow, from data preprocessing to model training and evaluation. The validation accuracy, classification report, and confusion matrix above show how well this transfer learning setup performs on the held-out images.

Further improvements could be made by:
- Experimenting with different architectures or fine-tuning more layers of the ResNet model.
- Using a larger and more diverse dataset.
- Applying additional data augmentation techniques (beyond horizontal flips) to increase the variability of the training data.
- Tuning hyperparameters such as learning rate, batch size, and the number of epochs.

This concludes the gender classification project using a pre-trained ResNet model with PyTorch.

In [None]:
import requests
from PIL import Image
from io import BytesIO

# Function to predict gender from an image URL
def predict_gender_from_url(model, url):
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Fail early on HTTP errors
    image = Image.open(BytesIO(response.content)).convert("RGB")
    transformed_image = transform(image).unsqueeze(0).to(device)

    # Inference only: switch to eval mode and disable gradient tracking
    model.eval()
    with torch.no_grad():
        output = model(transformed_image)
    probability = torch.sigmoid(output).item()  # Probability of class 1 (male)
    gender = "Male" if probability > 0.5 else "Female"

    # Display the image
    plt.imshow(image)
    plt.axis('off')
    plt.show()

    # Print the predicted gender and probabilities
    print(f"Predicted gender: {gender}")
    print(f"Male Probability: {probability:.4f}")
    print(f"Female Probability: {1-probability:.4f}")
    return gender


# Example usage
image_url = "https://www.byrdie.com/thmb/vwVnoCain1FIr5KwvSZorEZLZ9g=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/facialhair-26314088e12548ba9fc8af98d8500e8a.png"  # Replace with your image URL
predicted_gender = predict_gender_from_url(best_model, image_url)
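
The step from the model's single raw output to a label and two probabilities can be sketched without PyTorch. The helper name and the example logits below are assumptions for illustration. The point is that a 0.5 probability threshold is the same as checking whether the logit is positive.

```python
import math

def predict_from_logit(logit, threshold=0.5):
    """Turn one raw model output into (label, p_male, p_female)."""
    p_male = 1.0 / (1.0 + math.exp(-logit))  # sigmoid: probability of class 1 (male)
    label = "Male" if p_male > threshold else "Female"
    return label, p_male, 1.0 - p_male

# Hypothetical logits, for illustration only.
for logit in [-2.0, 0.0, 1.2]:
    label, p_male, p_female = predict_from_logit(logit)
    print(f"logit={logit:+.1f} -> {label} (male={p_male:.4f}, female={p_female:.4f})")
```

Raising or lowering `threshold` trades errors on one class against errors on the other, which can be useful if the confusion matrix shows a clear imbalance.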