# Preparation of milestone three

Today we will start preparing the third milestone. The third milestone is to train an object detector to recognize cells. To successfully complete the milestone, you will have to complete the following sub-tasks:
- Initialize a PyTorch object detector. I suggest choosing a RetinaNet or FCOS detection model. More information can be found [here](https://pytorch.org/vision/stable/models.html#object-detection-instance-segmentation-and-person-keypoint-detection). Since we do not have endless compute power available, we will use a pre-trained backbone and only train the detection and classification heads of our object detector. So you will have to find a way to **freeze** the backbone of your detector.
- You will have to write a [training and validation/test](https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html) loop to train your detector. Make sure you measure the convergence of the training by monitoring a detection metric like the [mAP](https://torchmetrics.readthedocs.io/en/stable/detection/mean_average_precision.html). Also, you will have to find a way to select the best model during training based on some metric. The detection models of torchvision already return a dictionary of losses when they are in training mode and receive the targets, so you don't need to configure your own loss. In evaluation mode they return predictions instead.
- Train your model for a few epochs. If you do not have a GPU available, I suggest using Colab, as training the model on the CPU is extremely slow. You will also need to pass your dataset to a dataloader to take advantage of parallel data loading and automatic batching.
- At the end, you will have to save the **state_dict** of your trained object detector, to be able to reuse it later.
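
The freezing and saving steps from the list above can be sketched with a small stand-in module. `TinyDetector`, its `backbone` and `head` attributes, and the in-memory buffer are assumptions made for illustration. The torchvision detectors expose a `backbone` attribute in a similar way, but check the model you choose.

```python
import io
import torch
from torch import nn

class TinyDetector(nn.Module):
    """Hypothetical stand-in with a backbone and a head (not a real detector)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.head = nn.Conv2d(8, 2, kernel_size=1)

    def forward(self, x):
        return self.head(torch.relu(self.backbone(x)))

model = TinyDetector()

# Freeze the backbone: its parameters no longer receive gradients
for param in model.backbone.parameters():
    param.requires_grad = False

# Hand only the trainable parameters to the optimizer
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.005, momentum=0.9)

# Save the state_dict (to an in-memory buffer here; a file path works the same way)
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)

# Reload it into a fresh instance of the same architecture
buffer.seek(0)
restored = TinyDetector()
restored.load_state_dict(torch.load(buffer))
```

Saving only the `state_dict` keeps the file independent of the class definition's module path, at the cost of having to rebuild the architecture before loading.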

Please use a Jupyter notebook for coding your training/testing pipeline. In the end, you will have to submit that Jupyter notebook on Moodle. Please do not try to upload any model weights.

# If you run the notebook in Colab, you have to mount the Google Drive folder with the images. Proceed as follows:

- **First**: Open the following **[link](https://drive.google.com/drive/folders/18P74V8kli6qDZtGBLN-tPrJFu3O2NPEK?usp=sharing)** in a new tab.
- **Second**: Add a link to the folder to your Google Drive. Example: [Link](https://drive.google.com/file/d/1IcFGGIoktPkDj9-4j5IQ3evInn0c2aq-/view?usp=sharing)
- **Third**: Run the line of code below
- **Fourth**: Grant Google access to your Drive

In [6]:
from google.colab import drive

# Shared folder: https://drive.google.com/drive/folders/18P74V8kli6qDZtGBLN-tPrJFu3O2NPEK?usp=sharing
# Path to the link you created in your Drive
path_to_slides = '/content/gdrive/MyDrive/AgNORs/'
# Mount the data
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [1]:
import os
import numpy as np
import pandas as pd
from PIL import Image
import torch
import albumentations as A
from albumentations.pytorch import ToTensorV2
from torch.utils.data import Dataset, DataLoader, random_split
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
import torch.optim as optim
# !pip install torchmetrics
import torchmetrics






# 1. Create a Custom Dataset Class for training your object detection model

In the last milestone, you learned about custom datasets in PyTorch. In this milestone, we are going to train an object detection model, for which we also need a custom dataset. Last time, the dataset returned only one cell each time the `__getitem__` method was called. This time, we need to return a larger portion of the image to feed the object detector. Since there may be more than one cell in the image, you will also need to return the labels and bounding box information for each cell in the respective crop. Think about how to deal with cells that are at the edges of the crop and therefore not completely visible. Also consider what to do if there is no annotation on the image.
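
One possible way to handle cells at the crop border is to clip each box to the crop and keep it only if enough of it remains visible. The sketch below assumes boxes in `[x_min, y_min, x_max, y_max]` pixel coordinates. The function name `clip_box_to_crop` and the `min_visibility` threshold of 0.5 are illustrative choices, not part of the assignment.

```python
def clip_box_to_crop(box, crop_x, crop_y, crop_w, crop_h, min_visibility=0.5):
    """Return the box in crop coordinates, or None if too little of it is visible."""
    x_min, y_min, x_max, y_max = box
    full_area = (x_max - x_min) * (y_max - y_min)
    if full_area <= 0:
        return None  # degenerate annotation

    # Intersect the box with the crop window
    cx_min = max(x_min, crop_x)
    cy_min = max(y_min, crop_y)
    cx_max = min(x_max, crop_x + crop_w)
    cy_max = min(y_max, crop_y + crop_h)
    if cx_max <= cx_min or cy_max <= cy_min:
        return None  # no overlap with the crop

    # Fraction of the original box area that lies inside the crop
    visible = (cx_max - cx_min) * (cy_max - cy_min) / full_area
    if visible < min_visibility:
        return None
    return [cx_min - crop_x, cy_min - crop_y, cx_max - crop_x, cy_max - crop_y]

# Crop window starting at (100, 100) with a size of 256 x 256
inside = clip_box_to_crop([110, 120, 150, 160], 100, 100, 256, 256)
partly = clip_box_to_crop([90, 120, 130, 160], 100, 100, 256, 256)
mostly_outside = clip_box_to_crop([70, 120, 110, 160], 100, 100, 256, 256)
```

Returning `None` for mostly hidden cells avoids training on slivers that no longer look like a cell. The `BboxParams` class of albumentations has a `min_visibility` argument that serves a similar purpose for boxes cut by its own transforms.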

Your dataset must have the following characteristics:

1. Since you are going to randomly sample the crops from the images, you need a parameter that defines how many crops will be sampled in an epoch (num_samples).
2. You need to define the size of the patches that will be sampled from the images during training (crop_size).
3. Since we do not have many images available, we will need image augmentation. Geometric augmentations are a bit tricky for object detection, because you always have to transform the bounding boxes as well. The [albumentations](https://albumentations.ai/docs/getting_started/bounding_boxes_augmentation/) module solves this quite nicely, so I suggest using it for the augmentations in this task.
4. Have a look at PyTorch and torchmetrics to see in which format the metrics as well as the models expect the images and targets.
5. Although we do not actively detect a background class, the label "0" is always reserved for the background class. Therefore, your dataset must return the label "1" for each cell.
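
As a rough illustration of item 4: the torchvision detection models and the torchmetrics mAP metric both work with one dictionary per image. The sketch below builds such targets from plain tensors, including a crop without any cell. The helper name `make_target` is hypothetical, and the exact requirements should be checked against the documentation of the versions you use.

```python
import torch

def make_target(boxes, label=1):
    """Build a per-image target dict; boxes are [x_min, y_min, x_max, y_max] in pixels."""
    boxes = torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4)
    labels = torch.full((boxes.shape[0],), label, dtype=torch.int64)
    return {"boxes": boxes, "labels": labels}

# Float image in C x H x W layout with values in [0, 1]
image = torch.rand(3, 256, 256)

target = make_target([[10, 20, 50, 60], [100, 120, 140, 170]])
empty_target = make_target([])  # crop without annotations: zero boxes, zero labels

# Predictions passed to the metric additionally carry one confidence score per box
prediction = {
    "boxes": torch.tensor([[12.0, 18.0, 49.0, 61.0]]),
    "scores": torch.tensor([0.9]),
    "labels": torch.tensor([1]),
}
```

Representing an empty crop by zero boxes, rather than by a dummy box with label 0, avoids degenerate boxes with zero width and height.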

In [2]:
import os
import pandas as pd
import numpy as np
import torch
from torch.utils.data import Dataset
from PIL import Image

class CustomDatasetClass(Dataset):
    def __init__(self, annotations_file, img_dir, num_samples, crop_size, transform=None):
        """
        Custom Dataset Class for Object Detection Tasks.

        Args:
        - annotations_file (str): Path to the CSV file containing annotations.
        - img_dir (str): Directory containing images.
        - num_samples (int): Number of random crops sampled per epoch.
        - crop_size (tuple): Size of the cropped image (height, width).
        - transform (callable, optional): Albumentations pipeline created with
          bbox_params (format='pascal_voc', label_fields=['labels']) and without
          ToTensorV2, since the tensor conversion happens in __getitem__.
        """
        self.img_labels = pd.read_csv(annotations_file)
        self.img_dir = img_dir
        self.num_samples = num_samples
        self.crop_size = crop_size
        self.transform = transform

    def __len__(self):
        return self.num_samples

    def __getitem__(self, sample_index):
        # Pick an annotation row; its first column holds the image file name
        img_idx = sample_index % len(self.img_labels)
        filename = self.img_labels.iloc[img_idx, 0]
        # A missing file raises here; returning None would break batching in the DataLoader
        img = Image.open(os.path.join(self.img_dir, filename)).convert("RGB")

        # All annotations that belong to the same image
        annotations = self.img_labels[self.img_labels.iloc[:, 0] == filename]

        # Random crop position (assumes the image is at least as large as crop_size)
        crop_h, crop_w = self.crop_size
        width, height = img.size
        crop_x = torch.randint(0, width - crop_w + 1, (1,)).item()
        crop_y = torch.randint(0, height - crop_h + 1, (1,)).item()
        img = np.array(img.crop((crop_x, crop_y, crop_x + crop_w, crop_y + crop_h)))

        # Keep only boxes that lie completely inside the crop, in crop coordinates
        bboxes = []
        labels = []
        for _, row in annotations.iterrows():
            # Column order kept from the original code: columns 3/4 hold the minima, 1/2 the maxima
            x_min, y_min, x_max, y_max = row.iloc[3], row.iloc[4], row.iloc[1], row.iloc[2]
            if x_min >= crop_x and x_max <= crop_x + crop_w and y_min >= crop_y and y_max <= crop_y + crop_h:
                bboxes.append([x_min - crop_x, y_min - crop_y, x_max - crop_x, y_max - crop_y])
                labels.append(1)  # Label 1 for cells

        if self.transform:
            # Albumentations transforms the image and its bounding boxes together
            out = self.transform(image=img, bboxes=bboxes, labels=labels)
            img, bboxes, labels = out['image'], out['bboxes'], out['labels']

        # H x W x C uint8 array -> C x H x W float tensor in [0, 1]
        img = torch.from_numpy(np.ascontiguousarray(img)).permute(2, 0, 1).float() / 255.0
        target = {
            # A crop without cells yields boxes of shape (0, 4) and labels of shape (0,)
            'boxes': torch.as_tensor(np.array(bboxes), dtype=torch.float32).reshape(-1, 4),
            'labels': torch.as_tensor(np.array(labels), dtype=torch.int64)
        }

        return img, target


# 2. Initializing the model

Initialize a pre-trained object detector from torchvision. Since you need to detect cells, you need two classes (a background class and a "cell" class). Some detection models like Faster R-CNN or RetinaNet use anchor boxes. If you choose to work with one of these models, you will need to select anchor boxes with a size that matches the size of the detection targets.

In [7]:
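import pandas as pd
import albumentations as A
import torchvision
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torch.utils.data import DataLoader

# Faster R-CNN with a ResNet-50 FPN backbone, pre-trained on COCO
weights = torchvision.models.detection.FasterRCNN_ResNet50_FPN_Weights.COCO_V1
model = fasterrcnn_resnet50_fpn(weights=weights)

# Replace the box predictor so that it outputs two classes
num_classes = 2  # background + cell
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Freeze the pre-trained backbone so that only the detection heads are trained
for param in model.backbone.parameters():
    param.requires_grad = False

# Dataset
annotations_file = 'annotation_frame.csv'
img_dir = ''

num_samples = len(pd.read_csv(annotations_file))
crop_size = (256, 256)
# Geometric augmentations that transform the bounding boxes together with the image
transform = A.Compose(
    [A.HorizontalFlip(p=0.5), A.VerticalFlip(p=0.5)],
    bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']),
)

dataset = CustomDatasetClass(annotations_file, img_dir, num_samples, crop_size, transform=transform)

# DataLoader
# The number of boxes differs per image, so batches are kept as tuples instead of stacked tensors
batch_size = 32
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True,
                        collate_fn=lambda batch: tuple(zip(*batch)))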


# 3. Setting up an optimizer, a detection metric and the train and validation dataloaders

To train the object detector, it is necessary to select an appropriate optimizer. Additionally, the torchmetrics class needs to be instantiated before it can be used for evaluation or tracking metrics during training.
Additionally, initialize a training and a validation dataloader for your dataset. For more information on how to set up your dataloaders have a look [here](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html).
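
A small sketch of splitting a dataset and batching detection samples. `ToyDetectionDataset` is a hypothetical stand-in that produces random images with a varying number of boxes. The point is the `collate_fn`, which keeps images and targets as tuples because the default collation cannot stack targets of different lengths.

```python
import torch
from torch.utils.data import Dataset, DataLoader, random_split

class ToyDetectionDataset(Dataset):
    """Hypothetical stand-in that returns random images with i % 3 boxes each."""
    def __len__(self):
        return 10

    def __getitem__(self, i):
        num_boxes = i % 3
        image = torch.rand(3, 64, 64)
        target = {
            "boxes": torch.tensor([[4.0, 4.0, 20.0, 20.0]] * num_boxes).reshape(-1, 4),
            "labels": torch.ones(num_boxes, dtype=torch.int64),
        }
        return image, target

def collate_fn(batch):
    # Turn [(img, tgt), (img, tgt), ...] into ((img, img, ...), (tgt, tgt, ...))
    return tuple(zip(*batch))

dataset = ToyDetectionDataset()
# random_split only draws indices, it does not load any sample
train_set, val_set = random_split(
    dataset, [8, 2], generator=torch.Generator().manual_seed(42)
)
train_loader = DataLoader(train_set, batch_size=4, shuffle=True, collate_fn=collate_fn)

images, targets = next(iter(train_loader))
```

Each batch is then a tuple of image tensors and a tuple of target dictionaries, which matches the list-style input the torchvision detection models accept.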


In [8]:
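import torch
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from torchmetrics.detection.mean_ap import MeanAveragePrecision

# Only optimize parameters that are not frozen
params = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)

detection_metric = MeanAveragePrecision()

# Split into training and validation sets (80/20) without loading any image
num_val = int(0.2 * len(dataset))
train_dataset, val_dataset = random_split(
    dataset, [len(dataset) - num_val, num_val],
    generator=torch.Generator().manual_seed(42),
)

def collate_fn(batch):
    # Keep images and targets as tuples because the number of boxes differs per image
    return tuple(zip(*batch))

# Training DataLoader
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)

# Validation DataLoader
val_dataloader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, collate_fn=collate_fn)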






# 4. Train and validation loop

Please write two functions, one for training and one for evaluating your object detector. The functionality should be very similar to what you had to do for the last milestone. Use these functions to train the detector for a few epochs. During training, track both the training losses and the validation metrics to monitor the performance of the model. Save the best detector as observed by the validation metric.
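
Keeping the best model during training can be sketched as follows. The `nn.Linear` stand-in and the validation scores are made up for illustration. The relevant detail is that `state_dict()` returns references to the live parameters, so the snapshot has to be copied.

```python
import copy
import torch
from torch import nn

model = nn.Linear(2, 1)              # stand-in for the detector
val_scores = [0.10, 0.35, 0.30]      # made-up validation mAP values per epoch

best_score = float("-inf")
best_state = None
for epoch, score in enumerate(val_scores):
    # Stand-in for one epoch of training: change the weights in place
    with torch.no_grad():
        model.weight.fill_(float(epoch))

    if score > best_score:
        best_score = score
        # state_dict() references the live tensors, so take a deep copy
        best_state = copy.deepcopy(model.state_dict())

# The snapshot keeps the weights of the best epoch, not those of the last one
print(best_score, best_state["weight"])
```

Without the deep copy, later optimizer steps would silently overwrite the stored "best" weights, and the saved model would simply be the one from the last epoch.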

In [9]:
import copy

def train_one_epoch(model, dataloader, optimizer, device):
    model.train()
    train_loss = 0.0

    for images, targets in dataloader:
        images = [image.to(device) for image in images]
        targets = [{k: v.to(device) for k, v in target.items()} for target in targets]

        optimizer.zero_grad()
        # In training mode the model returns a dictionary of losses
        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())
        losses.backward()
        optimizer.step()

        train_loss += losses.item()

    return train_loss / len(dataloader)


def evaluate(model, dataloader, device, metric):
    model.eval()
    metric.reset()

    with torch.no_grad():
        for images, targets in dataloader:
            images = [image.to(device) for image in images]
            targets = [{k: v.to(device) for k, v in target.items()} for target in targets]

            # In evaluation mode the model returns predictions (boxes, scores, labels)
            outputs = model(images)
            metric.update(outputs, targets)

    return metric.compute()


def train_and_validate(model, train_dataloader, val_dataloader, optimizer, metric, device, num_epochs):
    best_metric = float('-inf')
    best_model = None

    for epoch in range(num_epochs):
        # Training
        train_loss = train_one_epoch(model, train_dataloader, optimizer, device)
        print(f"Epoch [{epoch+1}/{num_epochs}], Train Loss: {train_loss:.4f}")

        # Validation (torchmetrics reports the mAP under the key 'map')
        val_map = evaluate(model, val_dataloader, device, metric)['map'].item()
        print(f"Epoch [{epoch+1}/{num_epochs}], Validation mAP: {val_map:.4f}")

        # Save the best model based on validation metric
        if val_map > best_metric:
            best_metric = val_map
            # Copy the weights, since state_dict() only references the live tensors
            best_model = copy.deepcopy(model.state_dict())

    return best_model


In [10]:
device = torch.device('cpu')
# device = torch.device('mps')
model.to(device)
print(device)

cpu


In [11]:
best_state = train_and_validate(model, train_dataloader, val_dataloader, optimizer, detection_metric, device, num_epochs=10)
# Save only the state_dict of the best detector (the file name is an arbitrary choice)
torch.save(best_state, 'best_detector.pth')