<a href="https://colab.research.google.com/github/Kamohelo99/C0S711_Assignment_3/blob/Ndumiso/supervised_Only_on_human_labeled_data.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Supervised Training Template

This notebook outlines the steps required to train a convolutional neural network on the labelled MGCLS dataset.  It is not a complete solution; instead it provides guidance and placeholders for you to implement your own logic.  Follow the comments in each cell and fill in the `TODO` sections to build your own training pipeline.


## 1. Set up environment

Import the required libraries.  You may need to install some packages via pip if they are not already available in your environment.  Ensure you are using a GPU runtime if available.

In [None]:
# Install required libraries for this notebook
!pip install -q iterative-stratification torchmetrics astropy

# Standard Library Imports
import os
import re
import warnings
from pathlib import Path

# Core Data Science and ML Imports
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, models
from PIL import Image
from tqdm.notebook import tqdm

from astropy.coordinates import SkyCoord
from astropy import units as u

from sklearn.preprocessing import MultiLabelBinarizer
from iterstrat.ml_stratifiers import MultilabelStratifiedShuffleSplit

# Torchmetrics for Evaluation
from torchmetrics import MetricCollection
from torchmetrics.classification import MultilabelF1Score

# Global Settings
warnings.filterwarnings("ignore")
SEED = 42
torch.manual_seed(SEED)
np.random.seed(SEED)

print("All required libraries are imported.")


All required libraries are imported.


## 2. Define file paths

Specify the locations of your extracted data and labels.  Update these variables to point to the directories on your own system or Colab environment.

In [None]:
# Mount Google Drive to access your data
from google.colab import drive
drive.mount('/content/drive')

# TODO: set these paths appropriately, I save mine in MyDrive
DRIVE_PATH = Path("/content/drive/MyDrive/assignmentdata")
DATA_DIR = DRIVE_PATH / "data"
LABELS_FILE = DRIVE_PATH / "labels.csv"

# directories for generated files
CHECKPOINT_DIR = DRIVE_PATH / "checkpoints"
SPLIT_DIR = DRIVE_PATH / "splits"
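# parents=True avoids a FileNotFoundError if DRIVE_PATH itself does not exist yet
DATA_DIR.mkdir(parents=True, exist_ok=True)
CHECKPOINT_DIR.mkdir(parents=True, exist_ok=True)
SPLIT_DIR.mkdir(parents=True, exist_ok=True)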

# Note: later cells read labels.csv and the typ/ and exo/ folders from DRIVE_PATH, not DATA_DIR.
# num_classes is set dynamically after inspecting the data.
num_classes = None

print(f"Data root set to: {DATA_DIR}")
print("Please ensure your data (labels.csv, typ/, exo/) is in this directory.")


Mounted at /content/drive
Data root set to: /content/drive/MyDrive/assignmentdata/data
Please ensure your data (labels.csv, typ/, exo/) is in this directory.
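Before moving on, it can help to confirm that the expected layout is actually present. The sketch below uses a hypothetical helper, `missing_entries` (not part of the template), that reports which of `labels.csv`, `typ/` and `exo/` are absent under a given root. It is demonstrated on a temporary directory rather than on Drive.

```python
import tempfile
from pathlib import Path

def missing_entries(root, expected=("labels.csv", "typ", "exo")):
    """Return the names in `expected` that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]

# Demonstration on a throwaway directory (stands in for the Drive folder).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "typ").mkdir()
    (root / "labels.csv").write_text("10.0,-20.0,FR II,,,\n")
    demo_missing = missing_entries(root)

print(demo_missing)  # the 'exo' folder was never created in this demo
```

In the notebook, the same check could be run against whichever folder holds your data before any matching is attempted.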


## 3. Load and inspect the labels

Use pandas to read `labels.csv` and explore its columns.  Identify the coordinate columns (e.g. `ra`, `dec`) and the label columns (e.g. `label1`, `label2`, ...).  You will need this information when implementing the coordinate matching function.

In [None]:
# Read the labels CSV, providing column names as the file has no header
labels_df = pd.read_csv(LABELS_FILE, names=['RA', 'DEC', 'L1', 'L2', 'L3', 'L4'])

# Consolidate all label columns into a single 'labels' string
label_cols = ['L1', 'L2', 'L3', 'L4']
labels_df['labels'] = labels_df[label_cols].apply(lambda row: ', '.join(row.dropna().astype(str)), axis=1)

print("Labels DataFrame:")
labels_df.head()


Labels DataFrame:


| | RA | DEC | L1 | L2 | L3 | L4 | labels |
|---|---|---|---|---|---|---|---|
| 0 | 10.328221 | -20.476357 | FR II | | | | FR II |
| 1 | 92.109802 | -49.431413 | typical | | | | typical |
| 2 | 88.916825 | -59.431868 | Point Source | | | | Point Source |
| 3 | 5.457981 | -25.589637 | FR II | | | | FR II |
| 4 | 119.417608 | -53.396711 | FR II | | | | FR II |
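To make the consolidation step concrete, here is a minimal sketch of the same idea using only the standard library: read headerless rows, drop the empty label cells, and join the rest into one comma-separated string. The first two rows are copied from the sample above; the third row, with two labels, is hypothetical and included only to show the multi-label case.

```python
import csv
import io

# First two rows copied from the sample above; the third row is a
# hypothetical multi-label entry added only for illustration.
raw = (
    "10.328221,-20.476357,FR II,,,\n"
    "92.109802,-49.431413,typical,,,\n"
    "1.0,-2.0,FR I,Bent,,\n"
)

records = []
for ra, dec, *label_cells in csv.reader(io.StringIO(raw)):
    # Keep only the non-empty label cells, in their original order.
    labels = [cell.strip() for cell in label_cells if cell.strip()]
    records.append({"RA": float(ra), "DEC": float(dec), "labels": ", ".join(labels)})

for rec in records:
    print(rec)
```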


## 4. Implement coordinate parsing and label matching

The dataset module in `src/dataset.py` provides skeleton functions `parse_coords_from_filename()` and `match_labels()`, which are meant to extract coordinates from image filenames and find the nearest label entry. The cell below implements the same two steps inline as `extract_coords_from_filename()` and `perform_matching()`. Test your implementation here: pick a few filenames from the `typ` directory and check that the returned labels and match distances make sense.
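As a quick illustration of the parsing step, the sketch below assumes filenames of the form `<RA> <DEC>_<suffix>.png` with coordinates in decimal degrees. The filenames and the helper name `parse_coords` are hypothetical, chosen only to show which inputs match and which do not.

```python
import re

# Assumed layout: "<RA> <DEC>_<anything>.png", RA/Dec in decimal degrees.
# The optional sign applies to both decimal and integer values.
COORD_RE = re.compile(r"([-+]?(?:\d*\.\d+|\d+))\s+([-+]?(?:\d*\.\d+|\d+))_")

def parse_coords(fname):
    """Return (ra, dec) as floats, or None if the name does not match."""
    m = COORD_RE.search(fname)
    return (float(m.group(1)), float(m.group(2))) if m else None

# Hypothetical filenames, not taken from the dataset.
print(parse_coords("202.83 -32.16_cutout.png"))
print(parse_coords("120 -53_cutout.png"))
print(parse_coords("no_coordinates_here.png"))
```

Returning `None` for unparseable names makes it easy to count and inspect the files that were skipped.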

In [None]:
# combine the logic for coordinate parsing and matching.
def extract_coords_from_filename(fname: str):
    """Extract (RA, Dec) from a filename of the form '<RA> <DEC>_...'."""
    # The optional sign must apply to integers as well as decimals; otherwise
    # a value such as "-53" is rejected or silently loses its sign.
    pattern = r"([-+]?(?:\d*\.\d+|\d+))\s+([-+]?(?:\d*\.\d+|\d+))_"
    m = re.search(pattern, fname)
    if m:
        return float(m.group(1)), float(m.group(2))
    return None, None

def perform_matching(data_dir, labels_df):
    """Scans image folders, extracts coordinates, and matches them to labels using Astropy."""
    print("--- Starting Coordinate Matching ---")

    # Create an Astropy SkyCoord object for the catalog labels for fast matching
    catalog = SkyCoord(ra=labels_df["RA"].values*u.deg, dec=labels_df["DEC"].values*u.deg, frame='icrs')

    # Scan image folders ('typ' and 'exo')
    imgs = []
    for folder in ["typ/typ_PNG", "exo/exo_PNG"]:
        folder_path = data_dir / folder
        if not folder_path.exists(): continue
        for fpath in folder_path.glob("*.png"):
            ra, dec = extract_coords_from_filename(fpath.name)
            if ra is not None:
                imgs.append({"image_path": str(fpath), "RA_img": ra, "DEC_img": dec})

    images_df = pd.DataFrame(imgs)
    print(f"Found {len(images_df)} PNG files with valid coordinates.")

    # Use Astropy to find the nearest neighbor in the catalog for each image
    image_coords = SkyCoord(ra=images_df["RA_img"].values*u.deg, dec=images_df["DEC_img"].values*u.deg, frame='icrs')
    idx, sep2d, _ = image_coords.match_to_catalog_sky(catalog)

    # Combine the matched data into a single DataFrame
    matched_labels = labels_df.iloc[idx].reset_index(drop=True)
    combined_df = pd.concat([images_df, matched_labels], axis=1)
    combined_df['distance_arcsec'] = sep2d.arcsec

    print("Matching complete.")
    return combined_df

# Execute the matching process
combined_data_df = perform_matching(DRIVE_PATH, labels_df)

print("\n--- Matched Data Sample ---")
combined_data_df.head()


--- Starting Coordinate Matching ---
Found 2107 PNG files with valid coordinates.
Matching complete.

--- Matched Data Sample ---


| | image_path | RA_img | DEC_img | RA | DEC | L1 | L2 | L3 | L4 | labels | distance_arcsec |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | /content/drive/MyDrive/assignmentdata/typ/typ_... | 202.83 | -32.16 | 202.829848 | -32.160019 | FR II | | | | FR II | 0.467545 |
| 1 | /content/drive/MyDrive/assignmentdata/typ/typ_... | 119.632 | -52.429 | 119.631969 | -52.428629 | FR I | | | | FR I | 1.336699 |
| 2 | /content/drive/MyDrive/assignmentdata/typ/typ_... | 2.138 | -24.612 | 2.172402 | -24.781252 | FR I | | | | FR I | 619.611134 |
| 3 | /content/drive/MyDrive/assignmentdata/typ/typ_... | 341.92 | -15.09 | 341.919885 | -15.090129 | FR II | | | | FR II | 0.613246 |
| 4 | /content/drive/MyDrive/assignmentdata/typ/typ_... | 53.824 | -40.614 | 53.824299 | -40.614103 | FR I | | | | FR I | 0.898608 |
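Nearest-neighbour matching always returns a catalogue row, however far away it is. Row 2 in the sample above sits roughly 620 arcsec from its "match", which suggests the label may not belong to that image. The sketch below recomputes the separation for rows 0 and 2 with a plain haversine formula and applies a cut-off. The 5 arcsec tolerance is an assumption for illustration, not a value from the assignment.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine form); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0

MAX_SEP_ARCSEC = 5.0  # assumed tolerance; tune for your data

# (RA_img, DEC_img, RA, DEC) for rows 0 and 2 of the sample above
rows = [
    (202.83, -32.16, 202.829848, -32.160019),
    (2.138, -24.612, 2.172402, -24.781252),
]
seps = [angular_sep_arcsec(*r) for r in rows]
keep = [s <= MAX_SEP_ARCSEC for s in seps]
print([round(s, 1) for s in seps], keep)
```

In the notebook, the equivalent would be a boolean mask on the `distance_arcsec` column. Note that dropping rows this way changes the dataset size and therefore the split sizes reported later.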


In [None]:
# def match_labels(coords, labels_df, tol_arcsec=1.0):
#     """
#     Match image coords (RA, Dec) to a row in labels_df using a tolerance in arcseconds.
#     Returns: list of class label strings
#     """
#     import numpy as np

#     ra, dec = coords
#     # Convert to numpy for vectorized diff
#     ra_diff = np.abs(labels_df['ra'].values - ra)
#     dec_diff = np.abs(labels_df['dec'].values - dec)

#     # Angular tolerance in degrees
#     tol_deg = tol_arcsec / 3600.0

#     matches = (ra_diff < tol_deg) & (dec_diff < tol_deg)
#     if not matches.any():
#         return []

#     matched_row = labels_df[matches].iloc[0]
#     label_str = matched_row['labels']  # assumed comma-separated string
#     return [l.strip() for l in label_str.split(',') if l.strip()]


In [None]:
# # Load labels.csv
# labels_df = pd.read_csv('data/labels.csv')

# # Test with a file from 'typ/'
# filename = os.listdir('data/typ')[0]
# coords = parse_coords_from_filename(filename)
# labels = match_labels(coords, labels_df)

# print("Filename:", filename)
# print("Coordinates:", coords)
# print("Labels:", labels)

## 5. Create the dataset and dataloaders

Here we instantiate the `RadioDataset` for the labelled images.  You may choose to combine the typical and exotic datasets or create separate datasets and use `ConcatDataset`.  Apply appropriate transformations (e.g. resizing, normalisation, augmentation).

In [None]:
# Define Transformations
IMG_SIZE = 128 # Using the size from Kamo
train_tf = transforms.Compose([
    transforms.RandomRotation(360),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomResizedCrop(IMG_SIZE, scale=(0.8, 1.0)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5]),
])
val_tf = transforms.Compose([
    transforms.Resize((IMG_SIZE, IMG_SIZE)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5]),
])

# Create Stratified Splits
print("--- Creating Multilabel Stratified Splits ---")
combined_data_df["labels_list"] = combined_data_df["labels"].astype(str).apply(
    lambda s: [lbl.strip() for lbl in s.split(",") if lbl.strip()]
)
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(combined_data_df["labels_list"])
CLASSES = mlb.classes_.tolist()
num_classes = len(CLASSES)
print(f"Found {num_classes} unique classes: {CLASSES}")

# 70/30 split for train and validation
msss = MultilabelStratifiedShuffleSplit(n_splits=1, test_size=0.3, random_state=SEED)
train_idx, val_idx = next(msss.split(combined_data_df, y))

train_df = combined_data_df.iloc[train_idx]
val_df = combined_data_df.iloc[val_idx]
print(f"Split complete. Train size: {len(train_df)}, Validation size: {len(val_df)}")

# Define the Dataset Class
class RadioDataset(Dataset):
    """Grayscale image dataset returning (image tensor, multi-hot label tensor)."""
    def __init__(self, df, classes, transform=None):
        self.df = df.reset_index(drop=True)
        self.transform = transform
        # Fit once with a fixed class order instead of refitting on every item
        self.mlb = MultiLabelBinarizer(classes=classes).fit([classes])

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        img = Image.open(row['image_path']).convert('L')  # force 1-channel
        if self.transform:
            img = self.transform(img)
        labels_one_hot = self.mlb.transform([row['labels_list']])[0]
        labels = torch.tensor(labels_one_hot, dtype=torch.float32)
        return img, labels

# Instantiate Datasets and DataLoaders
train_dataset = RadioDataset(train_df, classes=CLASSES, transform=train_tf)
val_dataset = RadioDataset(val_df, classes=CLASSES, transform=val_tf)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=2)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False, num_workers=2)

print("Datasets and DataLoaders are ready.")

--- Creating Multilabel Stratified Splits ---
Found 9 unique classes: ['Bent', 'Exotic', 'FR I', 'FR II', 'Point Source', 'S/Z shaped', 'Should be discarded', 'X-Shaped', 'typical']
Split complete. Train size: 1480, Validation size: 627
Datasets and DataLoaders are ready.
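To show what the dataset hands to the loss function, here is a minimal pure-Python sketch of multi-hot encoding against the class order printed above. The helper name `to_multi_hot` and the two-label example string are hypothetical.

```python
# Class order copied from the output above.
CLASSES = ['Bent', 'Exotic', 'FR I', 'FR II', 'Point Source',
           'S/Z shaped', 'Should be discarded', 'X-Shaped', 'typical']
CLASS_TO_IDX = {name: i for i, name in enumerate(CLASSES)}

def to_multi_hot(label_string):
    """Turn a string such as 'FR I, Bent' into a 0/1 vector aligned with CLASSES."""
    vec = [0.0] * len(CLASSES)
    for name in (part.strip() for part in label_string.split(",")):
        if name:
            vec[CLASS_TO_IDX[name]] = 1.0  # raises KeyError for unknown labels
    return vec

print(to_multi_hot("FR II"))
print(to_multi_hot("FR I, Bent"))  # hypothetical two-label string
```

Each position in the vector corresponds to one class, so an image can carry several labels at once.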


## 6. Build the model

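The template suggests a ResNet-based classifier built with the helper `build_model()` in `src/model.py`. The cell below instead defines `build_adapted_model()`, which loads an ImageNet-pretrained EfficientNet-B0 from torchvision, replaces the first convolution with a 1-channel version initialised from the channel-averaged pretrained weights, and swaps the classifier head for a dropout layer followed by a linear layer. Pass `num_classes` equal to the total number of labels and move the model to the GPU if one is available.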

In [None]:
def build_adapted_model(model_name="efficientnet_b0", num_classes=num_classes):
    """Adapts a pre-trained model for 1-channel input and our classification task."""
    model = models.get_model(model_name, weights='IMAGENET1K_V1')

    # Adapt first conv layer for 1-channel input
    conv_layer = model.features[0][0]
    new_conv = nn.Conv2d(1, conv_layer.out_channels,
                         kernel_size=conv_layer.kernel_size, stride=conv_layer.stride,
                         padding=conv_layer.padding, bias=conv_layer.bias is not None)
    new_conv.weight.data = conv_layer.weight.data.mean(dim=1, keepdim=True)
    model.features[0][0] = new_conv

    # Adapt final classifier
    in_features = model.classifier[1].in_features
    model.classifier = nn.Sequential(nn.Dropout(p=0.3), nn.Linear(in_features, num_classes))

    print(f"Adapted {model_name} for 1-channel input and {num_classes} classes.")
    return model

# Instantiate the model
model = build_adapted_model(num_classes=num_classes)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
print(f"Model moved to device: {device}")

# Define loss function and optimiser
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)


Downloading: "https://download.pytorch.org/models/efficientnet_b0_rwightman-7f5810bc.pth" to /root/.cache/torch/hub/checkpoints/efficientnet_b0_rwightman-7f5810bc.pth


100%|██████████| 20.5M/20.5M [00:00<00:00, 196MB/s]


Adapted efficientnet_b0 for 1-channel input and 9 classes.
Model moved to device: cuda
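The channel averaging used in `build_adapted_model()` has a consequence that is easy to overlook. The NumPy sketch below uses random stand-in weights rather than the real EfficientNet filters. It shows that an averaged filter responds to a grayscale patch with one third of the strength the original RGB filter would give if the patch were replicated across three channels, whereas summing the channels reproduces that response exactly. Whether the difference matters after fine-tuning is an open question here; this only illustrates the arithmetic.

```python
import numpy as np

rng = np.random.default_rng(0)
w_rgb = rng.normal(size=(3, 3, 3))  # stand-in for one pretrained filter: (in_channels, kH, kW)
patch = rng.normal(size=(3, 3))     # one grayscale image patch

# Response of the original filter to the patch replicated across R, G, B.
rgb_response = float(np.sum(w_rgb * np.stack([patch] * 3)))

# Single-channel filters obtained by averaging or summing over the channel axis.
w_mean = w_rgb.mean(axis=0)
w_sum = w_rgb.sum(axis=0)
mean_response = float(np.sum(w_mean * patch))
sum_response = float(np.sum(w_sum * patch))

print(np.isclose(mean_response, rgb_response / 3))  # averaging scales the response by 1/3
print(np.isclose(sum_response, rgb_response))       # summing reproduces it
```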


## 7. Training loop

Implement the training loop. The dataset already returns multi-hot label tensors, so each batch only needs a forward pass, the loss computation, backpropagation and an optimiser step. At the end of each epoch, evaluate the model on the validation set. The template suggests precision, recall, F1 and mAP via `src/utils.py`; the cell below tracks only macro F1 using `torchmetrics` and saves a checkpoint whenever that score improves.
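The loss defined earlier, `BCEWithLogitsLoss`, treats every class as an independent yes/no decision: it applies a sigmoid to each logit and, with the default reduction, averages the binary cross-entropy over all labels. The pure-Python sketch below reproduces that calculation for one hypothetical image with three labels and cross-checks the numerically stable form against the textbook definition.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logits for one image with three labels; targets are multi-hot.
logits = [2.0, -1.0, 0.0]
targets = [1.0, 0.0, 1.0]
loss = bce_with_logits(logits, targets)

# Cross-check against the textbook definition using an explicit sigmoid.
naive = -sum(y * math.log(sigmoid(z)) + (1 - y) * math.log(1 - sigmoid(z))
             for z, y in zip(logits, targets)) / len(logits)
print(round(loss, 4), round(naive, 4))
```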

In [None]:
# Define the evaluation loop logic
def evaluate_epoch(model, loader, device, metrics_collection):
    model.eval()
    metrics_collection.reset()
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            metrics_collection.update(outputs, labels.int())
    return metrics_collection.compute()

# Main Training Loop
num_epochs = 50
best_f1 = 0.0
metrics = MetricCollection({'MacroF1': MultilabelF1Score(num_labels=num_classes, average='macro')}).to(device)

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    pbar = tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs}")

    for step, (images, labels) in enumerate(pbar, start=1):
        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        # Report the mean loss over the batches seen so far, not over the whole epoch
        pbar.set_postfix(loss=running_loss / step)

    # Validation
    val_metrics = evaluate_epoch(model, val_loader, device, metrics)
    val_f1 = val_metrics['MacroF1'].item()

    print(f"Epoch {epoch+1} Summary: Train Loss: {running_loss/len(train_loader):.4f}, Val MacroF1: {val_f1:.4f}")

    # Save best model
    if val_f1 > best_f1:
        best_f1 = val_f1
        torch.save(model.state_dict(), CHECKPOINT_DIR / 'supervised_best_model.pth')
        print(f'New best model saved with F1-score: {best_f1:.4f}')

Epoch 1 Summary: Train Loss: 0.5557, Val MacroF1: 0.0660
New best model saved with F1-score: 0.0660
Epoch 2 Summary: Train Loss: 0.3486, Val MacroF1: 0.1042
New best model saved with F1-score: 0.1042
Epoch 3 Summary: Train Loss: 0.2720, Val MacroF1: 0.1391
New best model saved with F1-score: 0.1391
Epoch 4 Summary: Train Loss: 0.2479, Val MacroF1: 0.1509
New best model saved with F1-score: 0.1509
Epoch 5 Summary: Train Loss: 0.2337, Val MacroF1: 0.2064
New best model saved with F1-score: 0.2064
Epoch 6 Summary: Train Loss: 0.2258, Val MacroF1: 0.2560
New best model saved with F1-score: 0.2560
Epoch 7 Summary: Train Loss: 0.2196, Val MacroF1: 0.2122
Epoch 8 Summary: Train Loss: 0.2144, Val MacroF1: 0.2049
Epoch 9 Summary: Train Loss: 0.2072, Val MacroF1: 0.2657
New best model saved with F1-score: 0.2657
Epoch 10 Summary: Train Loss: 0.2046, Val MacroF1: 0.2339
Epoch 11 Summary: Train Loss: 0.2052, Val MacroF1: 0.2395
Epoch 12 Summary: Train Loss: 0.1961, Val MacroF1: 0.2417
Epoch 13 Summary: Train Loss: 0.1941, Val MacroF1: 0.2747
New best model saved with F1-score: 0.2747
Epoch 14 Summary: Train Loss: 0.1951, Val MacroF1: 0.2622
Epoch 15 Summary: Train Loss: 0.1880, Val MacroF1: 0.2558
Epoch 16 Summary: Train Loss: 0.1877, Val MacroF1: 0.2847
New best model saved with F1-score: 0.2847
Epoch 17 Summary: Train Loss: 0.1866, Val MacroF1: 0.2788
Epoch 18 Summary: Train Loss: 0.1852, Val MacroF1: 0.2767
Epoch 19 Summary: Train Loss: 0.1823, Val MacroF1: 0.3181
New best model saved with F1-score: 0.3181
Epoch 20 Summary: Train Loss: 0.1773, Val MacroF1: 0.2992
Epoch 21 Summary: Train Loss: 0.1762, Val MacroF1: 0.2913
Epoch 22 Summary: Train Loss: 0.1754, Val MacroF1: 0.2763
Epoch 23 Summary: Train Loss: 0.1711, Val MacroF1: 0.3332
New best model saved with F1-score: 0.3332
Epoch 24 Summary: Train Loss: 0.1692, Val MacroF1: 0.2872
Epoch 25 Summary: Train Loss: 0.1663, Val MacroF1: 0.3013
Epoch 26 Summary: Train Loss: 0.1630, Val MacroF1: 0.3060
Epoch 27 Summary: Train Loss: 0.1641, Val MacroF1: 0.2856
Epoch 28 Summary: Train Loss: 0.1627, Val MacroF1: 0.3109
Epoch 29 Summary: Train Loss: 0.1615, Val MacroF1: 0.2857
Epoch 30 Summary: Train Loss: 0.1620, Val MacroF1: 0.3164
Epoch 31 Summary: Train Loss: 0.1571, Val MacroF1: 0.3030
Epoch 32 Summary: Train Loss: 0.1514, Val MacroF1: 0.2964
Epoch 33 Summary: Train Loss: 0.1480, Val MacroF1: 0.3093
Epoch 34 Summary: Train Loss: 0.1530, Val MacroF1: 0.2833
Epoch 35 Summary: Train Loss: 0.1475, Val MacroF1: 0.3054
Epoch 36 Summary: Train Loss: 0.1464, Val MacroF1: 0.2960
Epoch 37 Summary: Train Loss: 0.1446, Val MacroF1: 0.3192
Epoch 38 Summary: Train Loss: 0.1426, Val MacroF1: 0.3214
Epoch 39 Summary: Train Loss: 0.1412, Val MacroF1: 0.3553
New best model saved with F1-score: 0.3553
Epoch 40 Summary: Train Loss: 0.1431, Val MacroF1: 0.2960
Epoch 41 Summary: Train Loss: 0.1408, Val MacroF1: 0.2828
Epoch 42 Summary: Train Loss: 0.1347, Val MacroF1: 0.3136
Epoch 43 Summary: Train Loss: 0.1321, Val MacroF1: 0.3196
Epoch 44 Summary: Train Loss: 0.1279, Val MacroF1: 0.3203
Epoch 45 Summary: Train Loss: 0.1302, Val MacroF1: 0.3037
Epoch 46 Summary: Train Loss: 0.1329, Val MacroF1: 0.2882
Epoch 47 Summary: Train Loss: 0.1311, Val MacroF1: 0.2864
Epoch 48 Summary: Train Loss: 0.1270, Val MacroF1: 0.3711
New best model saved with F1-score: 0.3711
Epoch 49 Summary: Train Loss: 0.1240, Val MacroF1: 0.3434
Epoch 50 Summary: Train Loss: 0.1204, Val MacroF1: 0.2772
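Macro F1 averages the per-class F1 scores without weighting by class frequency, so a rare class that is never predicted pulls the score down as much as a common one would. This may be one reason the validation score stays low and noisy while the training loss keeps falling, although the log alone cannot confirm it. The sketch below is a plain-Python illustration on hypothetical toy data. It assumes a class with no true positives, false positives or false negatives scores 0, which may differ from the convention in your `torchmetrics` version.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 for multi-hot rows; empty classes score 0."""
    n_classes = len(y_true[0])
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes

# Hypothetical 3-class toy data: the last class is rare and never predicted.
y_true = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [1, 0, 0]]
toy_score = macro_f1(y_true, y_pred)
print(round(toy_score, 3))
```

Three of the four toy predictions are exactly right, yet the single missed rare class caps the macro score well below that. Inspecting per-class F1 alongside the macro average makes this kind of effect visible.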


## 8. Save the model

After training, save your model checkpoint to disk.  You can use this checkpoint in the semi‑supervised phase.

In [None]:
# The training loop already saves the BEST model.
# This cell will save the FINAL model after the last epoch, for comparison.
final_model_path = CHECKPOINT_DIR / 'supervised_final_model.pth'
torch.save(model.state_dict(), final_model_path)
print(f"Training finished. Final model state saved to: {final_model_path}")
print(f"The best performing model was saved to: {CHECKPOINT_DIR / 'supervised_best_model.pth'}")


Training finished. Final model state saved to: /content/drive/MyDrive/assignmentdata/checkpoints/supervised_final_model.pth
The best performing model was saved to: /content/drive/MyDrive/assignmentdata/checkpoints/supervised_best_model.pth
