# 🔥 Train ResNet-50 on Real Data Only (D-Fire)

This notebook implements the baseline fire classification model using only real-world images from the D-Fire dataset. We fine-tune a pretrained ResNet-50 model to distinguish between "fire" and "no fire" images in a binary classification task.

This baseline will later be compared to models trained on different combinations of synthetic and real data. The model's performance will be evaluated using accuracy, precision, recall, and F1-score on a held-out test set of real images.

📌 **Objectives:**
- Load and split the real D-Fire dataset
- Train ResNet-50 for binary fire classification
- Log training metrics and save the best-performing model
- Prepare for comparison with future synthetic+real experiments


## 📦 Notebook Setup: Mount Drive & Clone GitHub Repo

This cell ensures the notebook is reproducible in any new Colab session by:

- Mounting your Google Drive (to access datasets, secrets, and checkpoints)
- Loading your GitHub token from Drive
- Cloning the fire-detection-dissertation repository
- Navigating into the correct folder
- Setting Git identity for future commits

⚠️ **Note:** This cell must be run every time you open this notebook in a new Colab session.


In [1]:
# 🔧 Minimal Colab setup for any working notebook

# 1. Mount Google Drive
import os
from google.colab import drive
if not os.path.ismount("/content/drive"):
    drive.mount("/content/drive")

# 2. Load GitHub token securely from Drive
token_path = "/content/drive/MyDrive/fire-detection-dissertation/secrets/github_token.txt"
with open(token_path, "r") as f:
    token = f.read().strip()

# 3. Clone the GitHub repo (force fresh clone for safety)
username = "Misharasapu"
repo = "fire-detection-dissertation"
clone_url = f"https://{token}@github.com/{username}/{repo}.git"
repo_path = f"/content/{repo}"

# Optional: Remove old clone (safe to rerun)
!rm -rf {repo_path}

# Clone fresh and move into the repo
%cd /content
!git clone {clone_url}
%cd {repo}

# 4. Set Git identity (required in Colab sessions)
!git config --global user.name "Misharasapu"
!git config --global user.email "misharasapu@gmail.com"


Mounted at /content/drive
/content
Cloning into 'fire-detection-dissertation'...
remote: Enumerating objects: 30, done.
remote: Counting objects: 100% (30/30), done.
remote: Compressing objects: 100% (22/22), done.
remote: Total 30 (delta 9), reused 23 (delta 6), pack-reused 0 (from 0)
Receiving objects: 100% (30/30), 176.77 KiB | 16.07 MiB/s, done.
Resolving deltas: 100% (9/9), done.
/content/fire-detection-dissertation


## 🔹 Step 1: Load & Split the Real Dataset (Train/Val Only)

In this step, we load the real-world D-Fire dataset using the `FireClassificationDataset` class and divide it into training and validation subsets. The separate test set remains untouched, as it will be reserved for final model evaluation after all experiments are complete.

Each image in the dataset is paired with a YOLO-style `.txt` label file:
- `class_id == 1` indicates fire (labelled as `1`)
- `class_id == 0` indicates smoke only (labelled as `0`)
- Missing or empty `.txt` files also result in label `0`
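The labelling rules above can be sketched as a small function. This is only an illustration, not the actual code inside `FireClassificationDataset`: the helper name `binary_label_from_yolo` is hypothetical, and it assumes that an image whose label file contains both smoke and fire boxes counts as fire.

```python
import tempfile
from pathlib import Path


def binary_label_from_yolo(label_path):
    """Hypothetical helper: 1 if any annotation has class_id 1 (fire), else 0."""
    path = Path(label_path)
    if not path.exists():
        return 0  # missing label file -> no fire
    for line in path.read_text().splitlines():
        parts = line.split()
        # The first field of a YOLO annotation line is the class id
        if parts and int(parts[0]) == 1:
            return 1
    return 0  # empty file or smoke-only annotations


# Demo on throwaway label files
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "fire.txt").write_text("0 0.5 0.5 0.2 0.2\n1 0.4 0.6 0.1 0.1\n")
    (tmp / "smoke.txt").write_text("0 0.5 0.5 0.2 0.2\n")
    (tmp / "empty.txt").write_text("")
    demo_labels = {
        name: binary_label_from_yolo(tmp / f"{name}.txt")
        for name in ("fire", "smoke", "empty", "missing")
    }

print(demo_labels)
```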

These labels are processed automatically inside the dataset class. After loading, we use an 80/20 split to create training and validation sets, followed by wrapping each in a `DataLoader` for efficient batching during training.

This step ensures the model learns from real-world examples while maintaining a consistent and reproducible split strategy.
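The split arithmetic can be checked by hand. The sketch below mirrors the `int(train_ratio * len(full_dataset))` logic used in the next cell; the function name `split_sizes` is illustrative, and the batch counts assume the `DataLoader` default of keeping the last partial batch.

```python
import math


def split_sizes(n_samples, train_ratio=0.8):
    """Return (train_size, val_size); the training size is rounded down."""
    train_size = int(train_ratio * n_samples)
    return train_size, n_samples - train_size


train_size, val_size = split_sizes(17222)
print(train_size, val_size)

# Batches per epoch with batch_size=32 (last partial batch kept)
batch_size = 32
train_batches = math.ceil(train_size / batch_size)
val_batches = math.ceil(val_size / batch_size)
print(train_batches, val_batches)
```

Because the training size is rounded down, the validation subset receives the leftover sample when the total is not an exact multiple.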


In [None]:
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
from utils.fire_classification_dataset import FireClassificationDataset

# Define image transforms: resize to 224x224 and convert to tensor
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Set paths to D-Fire training images and YOLO label files
image_dir = "/content/drive/MyDrive/fire-detection-dissertation/data/raw/real/D-Fire/train/images"
label_dir = "/content/drive/MyDrive/fire-detection-dissertation/data/raw/real/D-Fire/train/labels"

# Load full dataset using the custom Dataset class
full_dataset = FireClassificationDataset(image_dir=image_dir, label_dir=label_dir, transform=transform)

# Split dataset into training and validation subsets (80/20 split)
train_ratio = 0.8
train_size = int(train_ratio * len(full_dataset))
val_size = len(full_dataset) - train_size

# Use a fixed seed to ensure reproducible splits
generator = torch.Generator().manual_seed(42)
train_dataset, val_dataset = random_split(full_dataset, [train_size, val_size], generator=generator)

# Wrap subsets in DataLoaders for batching and shuffling
batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=2)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=2)


# 🔍 Confirm dataset sizes
print(f"Total samples in full dataset: {len(full_dataset)}")
print(f"Training samples: {len(train_dataset)}")
print(f"Validation samples: {len(val_dataset)}")

# 🔍 Preview one batch from the train_loader
train_batch = next(iter(train_loader))
images, labels = train_batch

print(f"\nBatch shape (images): {images.shape}")   # Expected: (32, 3, 224, 224)
print(f"Batch shape (labels): {labels.shape}")     # Expected: (32,)
print(f"Sample labels: {labels.tolist()}")         # Quick look at the labels in one batch



Total samples in full dataset: 17222
Training samples: 13777
Validation samples: 3445

Batch shape (images): torch.Size([32, 3, 224, 224])
Batch shape (labels): torch.Size([32])
Sample labels: [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


## 🔹 Step 2: Load and Modify ResNet-50 for Binary Classification

In this step, we load a pretrained ResNet-50 model from PyTorch's `torchvision.models` library and modify its output layer to suit our binary classification task (fire vs. no fire).

By default, ResNet-50 is trained on ImageNet with 1000 output classes. We will:
- Load the pretrained model with its existing weights
- Freeze the convolutional layers to retain pretrained features (optional, can be unfrozen later)
- Replace the final fully connected (FC) layer with a new linear layer for 2-class output

This approach lets us benefit from transfer learning: we reuse the feature representations learned from large-scale natural image data and adapt only the final classification head to our fire detection task. With the backbone frozen, this is feature extraction rather than full fine-tuning.


In [None]:
import torch.nn as nn
from torchvision import models

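# Load ResNet-50 with ImageNet weights
# (the `weights` argument replaces the deprecated `pretrained=True`, which loaded these same V1 weights)
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)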

# Freeze all pretrained layers so the backbone weights are not updated during training
# (optional: unfreeze later for full fine-tuning). The new head added below stays trainable.
for param in resnet.parameters():
    param.requires_grad = False

# Get the number of input features to the original final layer
in_features = resnet.fc.in_features  # Expected: 2048

# Replace the original 1000-class FC layer with a new 2-class layer
resnet.fc = nn.Linear(in_features, 2)

# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
resnet = resnet.to(device)

# Print model summary (optional)
print(resnet)

# Optional: Print detailed model summary using torchinfo
!pip install torchinfo
from torchinfo import summary

# Show summary for input batch size of 32 (3 channels, 224x224 resolution)
summary(resnet, input_size=(32, 3, 224, 224))



ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

Layer (type:depth-idx)                   Output Shape              Param #
ResNet                                   [32, 2]                   --
├─Conv2d: 1-1                            [32, 64, 112, 112]        (9,408)
├─BatchNorm2d: 1-2                       [32, 64, 112, 112]        (128)
├─ReLU: 1-3                              [32, 64, 112, 112]        --
├─MaxPool2d: 1-4                         [32, 64, 56, 56]          --
├─Sequential: 1-5                        [32, 256, 56, 56]         --
│    └─Bottleneck: 2-1                   [32, 256, 56, 56]         --
│    │    └─Conv2d: 3-1                  [32, 64, 56, 56]          (4,096)
│    │    └─BatchNorm2d: 3-2             [32, 64, 56, 56]          (128)
│    │    └─ReLU: 3-3                    [32, 64, 56, 56]          --
│    │    └─Conv2d: 3-4                  [32, 64, 56, 56]          (36,864)
│    │    └─BatchNorm2d: 3-5             [32, 64, 56, 56]          (128)
│    │    └─ReLU: 3-6                    [32, 64, 56, 56]   

## 🔹 Step 3: Define Loss Function, Optimizer, and Metrics

In this step, we define the core components needed to train the model:

- **Loss Function:** Measures the difference between the model's predictions and the true labels. Since this is a binary classification task (fire vs. no fire) with logits (unnormalized outputs), we use `CrossEntropyLoss`, which expects two output logits per image.
  
- **Optimizer:** Updates model parameters to reduce the loss. We use Adam, which adapts the learning rate for each parameter and typically converges faster than standard SGD.

- **Metrics:** During training and validation, we want to track:
  - Accuracy: Fraction of all predictions that are correct
  - Precision: Fraction of predicted "fire" images that are actually fire
  - Recall: Fraction of actual "fire" images that the model detects
  - F1 Score: Harmonic mean of precision and recall

These metrics help us evaluate how well the model is learning and whether it's favoring one class over the other (e.g. predicting too many false positives or false negatives).
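A worked example with made-up confusion-matrix counts may help make these definitions concrete. The counts below are invented for illustration and are not results from this notebook; the zero fallbacks mirror the `zero_division=0` setting used with scikit-learn.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented counts: 8 fires found, 2 false alarms, 4 fires missed, 86 correct negatives
example = binary_metrics(tp=8, fp=2, fn=4, tn=86)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Here accuracy is high (0.94) even though a third of the fires are missed (recall 0.67), which is why accuracy alone can be misleading on an imbalanced dataset.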


In [None]:
import torch.nn as nn
import torch.optim as optim
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Define loss function
# CrossEntropyLoss is appropriate for 2-class classification with logits
criterion = nn.CrossEntropyLoss()

# Define optimizer: pass only the new head's parameters, since the backbone is frozen
optimizer = optim.Adam(resnet.fc.parameters(), lr=1e-4)

# Metrics are computed once per epoch over all validation predictions
# (this helper is called from the validation loop in Step 4)
def calculate_metrics(y_true, y_pred):
    """
    Computes accuracy, precision, recall, and F1 score.
    Args:
        y_true (Tensor): Ground truth labels (0 or 1)
        y_pred (Tensor): Predicted class indices (0 or 1)
    Returns:
        dict: metric_name → value
    """
    y_true = y_true.cpu().numpy()
    y_pred = y_pred.cpu().numpy()

    return {
        'accuracy': accuracy_score(y_true, y_pred),
        'precision': precision_score(y_true, y_pred, zero_division=0),
        'recall': recall_score(y_true, y_pred, zero_division=0),
        'f1': f1_score(y_true, y_pred, zero_division=0)
    }
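For a single sample, `nn.CrossEntropyLoss` is the negative log-softmax of the logit at the true class (by default it then averages over the batch). The pure-Python sketch below reproduces that per-sample value for a two-logit output; the function name is illustrative.

```python
import math


def cross_entropy_from_logits(logits, target):
    """Per-sample cross-entropy: log-sum-exp of the logits minus the target logit."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Equal logits mean a 50/50 prediction, so the loss is log(2)
uncertain = cross_entropy_from_logits([0.0, 0.0], target=1)

# A confident, correct prediction gives a loss close to zero
confident = cross_entropy_from_logits([-2.0, 2.0], target=1)

print(f"{uncertain:.4f} {confident:.4f}")
```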


## 🔹 Step 4: Training and Validation Loop

In this step, we implement the main training loop, which trains only the final fully connected layer of the ResNet-50 model while the pretrained backbone stays frozen. We train for multiple epochs, and in each epoch we:

- Loop over batches from the training DataLoader
- Forward pass: compute model predictions
- Compute loss using `CrossEntropyLoss`
- Backward pass: compute gradients for the final layer only
- Update parameters using the Adam optimizer

During validation, we:
- Disable gradient tracking (for efficiency)
- Collect predictions and ground-truth labels across all batches
- Compute accuracy, precision, recall, and F1 score using our custom `calculate_metrics` function

We also track the average training and validation loss per epoch, print the validation metrics, and save the model weights whenever the validation F1 score improves.
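One subtlety to be aware of: calling `.train()` on the model also puts its BatchNorm layers in training mode, so their running statistics keep updating during training even when their weights are frozen. Whether this matters for the baseline is an empirical question. The sketch below shows one possible way to keep frozen BatchNorm layers fixed, using a toy stand-in model; the helper `train_mode_with_frozen_bn` is hypothetical and is not used in the loop that follows.

```python
import torch
import torch.nn as nn


def train_mode_with_frozen_bn(model):
    """Set train mode, but keep BatchNorm layers with frozen weights in eval mode."""
    model.train()
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            params = list(module.parameters())
            if params and not any(p.requires_grad for p in params):
                module.eval()  # running mean/var are no longer updated
    return model


# Toy stand-in for a frozen backbone plus a trainable head
backbone = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3), nn.BatchNorm2d(4), nn.ReLU())
for p in backbone.parameters():
    p.requires_grad = False
toy = nn.Sequential(backbone, nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(4, 2))

train_mode_with_frozen_bn(toy)
bn = backbone[1]
mean_before = bn.running_mean.clone()
toy(torch.randn(8, 3, 16, 16))  # forward pass in "train" mode
stats_unchanged = torch.equal(mean_before, bn.running_mean)
print(stats_unchanged)
```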


In [None]:
import time
from tqdm import tqdm

# ✅ Confirm model is on GPU before starting
print("🔍 Model device:", next(resnet.parameters()).device)

# Training settings
num_epochs = 5
print_every = 1
print_batch_loss = False

# For tracking training progress
train_losses, val_losses = [], []

# ✅ Best model tracking
best_f1 = 0.0
best_model_path = "/content/drive/MyDrive/fire-detection-dissertation/models/resnet_real_best.pt"

# Main training loop
for epoch in range(num_epochs):
    start_time = time.time()
    print(f"\n🔁 Epoch {epoch + 1}/{num_epochs}")

    # ========== Training ==========
    resnet.train()
    running_loss = 0.0
    train_loop = tqdm(enumerate(train_loader), total=len(train_loader), desc="🚂 Training", leave=False)

    for batch_idx, (images, labels) in train_loop:
        images = images.to(device)
        labels = labels.to(device)

        outputs = resnet(images)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        if print_batch_loss and (batch_idx + 1) % 10 == 0:
            print(f"  Batch {batch_idx + 1}/{len(train_loader)} - Loss: {loss.item():.4f}")

    avg_train_loss = running_loss / len(train_loader)
    train_losses.append(avg_train_loss)

    # ========== Validation ==========
    resnet.eval()
    val_loss = 0.0
    all_preds = []
    all_labels = []

    val_loop = tqdm(val_loader, total=len(val_loader), desc="🔎 Validating", leave=False)

    with torch.no_grad():
        for images, labels in val_loop:
            images = images.to(device)
            labels = labels.to(device)

            outputs = resnet(images)
            loss = criterion(outputs, labels)
            val_loss += loss.item()

            preds = torch.argmax(outputs, dim=1)
            all_preds.append(preds)
            all_labels.append(labels)

    avg_val_loss = val_loss / len(val_loader)
    val_losses.append(avg_val_loss)

    all_preds = torch.cat(all_preds)
    all_labels = torch.cat(all_labels)
    metrics = calculate_metrics(all_labels, all_preds)

    # Print epoch summary
    if (epoch + 1) % print_every == 0:
        print(f"✅ Epoch [{epoch + 1}/{num_epochs}] | "
              f"Train Loss: {avg_train_loss:.4f} | "
              f"Val Loss: {avg_val_loss:.4f} | "
              f"Acc: {metrics['accuracy']:.4f} | "
              f"Precision: {metrics['precision']:.4f} | "
              f"Recall: {metrics['recall']:.4f} | "
              f"F1: {metrics['f1']:.4f} | "
              f"Time: {time.time() - start_time:.1f}s")

    # ✅ Save best model based on F1 score
    if metrics['f1'] > best_f1:
        best_f1 = metrics['f1']
        torch.save(resnet.state_dict(), best_model_path)
        print(f"💾 New best model saved (F1: {best_f1:.4f}) → {best_model_path}")


🔍 Model device: cuda:0

🔁 Epoch 1/5

✅ Epoch [1/5] | Train Loss: 0.3275 | Val Loss: 0.2780 | Acc: 0.9019 | Precision: 0.8784 | Recall: 0.7384 | F1: 0.8023 | Time: 8001.0s
💾 New best model saved (F1: 0.8023) → /content/drive/MyDrive/fire-detection-dissertation/models/resnet_real_best.pt

🔁 Epoch 2/5

✅ Epoch [2/5] | Train Loss: 0.2634 | Val Loss: 0.2488 | Acc: 0.9152 | Precision: 0.8760 | Recall: 0.7987 | F1: 0.8356 | Time: 210.2s
💾 New best model saved (F1: 0.8356) → /content/drive/MyDrive/fire-detection-dissertation/models/resnet_real_best.pt

🔁 Epoch 3/5

✅ Epoch [3/5] | Train Loss: 0.2446 | Val Loss: 0.2448 | Acc: 0.9109 | Precision: 0.9191 | Recall: 0.7341 | F1: 0.8163 | Time: 211.2s

🔁 Epoch 4/5

✅ Epoch [4/5] | Train Loss: 0.2333 | Val Loss: 0.2263 | Acc: 0.9196 | Precision: 0.8671 | Recall: 0.8288 | F1: 0.8476 | Time: 211.2s
💾 New best model saved (F1: 0.8476) → /content/drive/MyDrive/fire-detection-dissertation/models/resnet_real_best.pt

🔁 Epoch 5/5

✅ Epoch [5/5] | Train Loss: 0.2257 | Val Loss: 0.2214 | Acc: 0.9222 | Precision: 0.8939 | Recall: 0.8073 | F1: 0.8484 | Time: 212.3s
💾 New best model saved (F1: 0.8484) → /content/drive/MyDrive/fire-detection-dissertation/models/resnet_real_best.pt
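The checkpoint rule can be replayed on the validation F1 scores logged above to see why no model was saved after epoch 3. This is a standalone sketch of the `metrics['f1'] > best_f1` comparison, with the F1 values copied from the log.

```python
# Validation F1 per epoch, copied from the training log above
val_f1_by_epoch = [0.8023, 0.8356, 0.8163, 0.8476, 0.8484]

best_f1 = 0.0
saved_epochs = []
for epoch, f1 in enumerate(val_f1_by_epoch, start=1):
    if f1 > best_f1:  # strict improvement triggers a save
        best_f1 = f1
        saved_epochs.append(epoch)

print(saved_epochs, best_f1)
```

Epoch 3 had the highest precision of the run but lower recall, so its F1 fell below the epoch 2 value and the earlier checkpoint was kept.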


In [2]:
%cd /content/fire-detection-dissertation
!git add notebooks/03_train_resnet_real_only.ipynb
!git commit -m "Add full training pipeline for real D-Fire dataset using ResNet-50 with feature extraction and best model saving"
!git push


/content/fire-detection-dissertation
fatal: pathspec 'notebooks/03_train_resnet_real_only.ipynb' did not match any files
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean
Everything up-to-date
