# Experiment Pipeline

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim

from hubmap.experiments.load_data import make_annotated_loader, make_expert_loader
from hubmap.experiments.training import Trainer
from hubmap.dataset import transforms as T

from torchmetrics.detection import MeanAveragePrecision

First, we load the data needed for the experiment: the training data and the validation (test) data used to evaluate the model during training.

Depending on the experiment, we apply different transformations to the data. The following transformations are a minimal example. Further transformations should be added for more sophisticated experiments.

In [2]:
train_transformations = T.Compose(
    [T.ToTensor(), T.Resize((512, 512)), T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
)

test_transformations = T.Compose(
    [T.ToTensor(), T.Resize((512, 512)), T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
)
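To make the effect of the normalization step concrete, here is a small sketch in plain PyTorch. It assumes that `T.Normalize` behaves like the usual channel-wise normalization `(x - mean) / std` and that `T.ToTensor` produces values in `[0, 1]`; the `image` tensor is a made-up stand-in, not real data.

```python
import torch

# Hypothetical image tensor with values in [0, 1] and shape (channels, height, width),
# standing in for the output of a ToTensor-style transform.
image = torch.tensor([0.0, 0.5, 1.0]).view(1, 1, 3).repeat(3, 1, 1)

# Per-channel mean and standard deviation, as passed to the normalization above.
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# Channel-wise normalization: (x - mean) / std.
normalized = (image - mean) / std

print(normalized[0])  # first channel after normalization
```

With a mean and standard deviation of 0.5, pixel values in `[0, 1]` are mapped to `[-1, 1]`.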

Depending on the experiment we may want to load all annotated images or just the ones that are annotated by experts.

Here we create a function to load all the images that are annotated (not only the ones by experts).
The created function can then be used to create the data loaders with a specific batch size.

In [3]:
# The train, test split ratio is set to 0.8 by default.
# Meaning 80% of the data is used for training and 20% for testing.
load_annotated_data = make_annotated_loader(train_transformations, test_transformations)
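The internals of `make_annotated_loader` are not shown here. As a rough sketch of the same idea, the following builds an 80/20 split and returns a function that creates loaders for a given batch size. The synthetic dataset and the name `load_data` are assumptions made for illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for the annotated images: 100 samples with 8 features each.
dataset = TensorDataset(torch.randn(100, 8), torch.randint(0, 2, (100,)))

# 80% of the samples go to training, the remaining 20% to testing.
train_size = int(0.8 * len(dataset))
test_size = len(dataset) - train_size
generator = torch.Generator().manual_seed(0)  # fixed seed so the split is repeatable
train_set, test_set = random_split(dataset, [train_size, test_size], generator=generator)


def load_data(batch_size):
    """Hypothetical counterpart of the loader function: builds both loaders."""
    train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
    return train_loader, test_loader


train_loader, test_loader = load_data(16)
print(len(train_set), len(test_set))
```

Deferring the batch size to a later call keeps the split fixed while allowing different batch sizes per experiment.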

In the following, we determine the device we want to train on.
If a GPU is available, we use it; otherwise we fall back to the CPU.
For reproducibility, a random seed should also be set before training.

In [4]:
device = "cuda" if torch.cuda.is_available() else "cpu"
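The cell above only selects the device. A minimal sketch of seeding could look as follows; the helper name `set_seed` and the seed value `42` are assumptions, not part of the `hubmap` package.

```python
import random

import torch


def set_seed(seed):
    """Seed Python's and PyTorch's random number generators (hypothetical helper)."""
    random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        # Seed all GPUs explicitly when CUDA is present.
        torch.cuda.manual_seed_all(seed)


set_seed(42)  # 42 is an arbitrary example value
first = torch.rand(3)

set_seed(42)
second = torch.rand(3)

# Re-seeding with the same value reproduces the same random numbers.
print(torch.equal(first, second))
```

Seeding does not by itself guarantee bit-identical results on a GPU, since some CUDA operations are non-deterministic.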

Next, we need to load the model we want to train.

In [5]:
from hubmap.models import FCT, init_weights

model = FCT().to(device)
_ = model.apply(init_weights)
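`model.apply(fn)` calls `fn` on every submodule of the model. The body of `init_weights` is not shown here, so the following is only a sketch of what such a function might look like, assuming Kaiming initialization for convolutional and linear layers. The name `init_weights_sketch` and the toy model are hypothetical. Note that `torch.nn.init.kaiming_normal` (without the trailing underscore) is deprecated in favor of `torch.nn.init.kaiming_normal_`.

```python
import torch.nn as nn


def init_weights_sketch(m):
    """Hypothetical initializer; the real hubmap.models.init_weights may differ."""
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        # In-place Kaiming (He) initialization of the weights.
        nn.init.kaiming_normal_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)


# Toy model standing in for FCT.
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
)

# apply() visits every submodule and returns the model itself.
_ = toy_model.apply(init_weights_sketch)
```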


Next, we create the other modules needed for training: the optimizer, the loss function, and the evaluation metric.

In [6]:
optimizer = optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.BCELoss()
# `iou_thresholds` must be `None` or a list of floats; passing a bare float
# such as 0.6 raises a ValueError.
metric = MeanAveragePrecision(iou_thresholds=[0.6])
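One detail worth keeping in mind about the loss: `nn.BCELoss` expects probabilities in `[0, 1]`, so the model output must already have passed through a sigmoid. Whether `FCT` does this is not shown here. The sketch below uses made-up tensors to illustrate that `nn.BCELoss` on sigmoid outputs matches `nn.BCEWithLogitsLoss` on raw logits.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Made-up raw model outputs (logits) and binary target masks.
logits = torch.randn(2, 1, 4, 4)
targets = torch.randint(0, 2, (2, 1, 4, 4)).float()

# BCELoss needs probabilities, so the sigmoid is applied explicitly.
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), targets)

# BCEWithLogitsLoss applies the sigmoid internally and is numerically more stable.
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)

print(loss_from_probs.item(), loss_from_logits.item())
```

If a model returns raw logits, `nn.BCEWithLogitsLoss` is usually the safer choice.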

Next, we create the data loaders and initialize the trainer. The trainer is responsible for running the training loop, saving checkpoints, and logging metrics.

In [None]:
BATCH_SIZE = 32

train_loader, test_loader = load_annotated_data(BATCH_SIZE)

In [None]:
trainer = Trainer(
    model=model,
    optimizer=optimizer,
    criterion=criterion,
    train_loader=train_loader,
    test_loader=test_loader,
    device=device,
    metric=metric,
)
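The `Trainer` class hides the training loop, and its methods are not shown here. As a rough idea of what one epoch of such a loop typically involves, here is a generic sketch on a toy model with synthetic data. The function `train_one_epoch` and all `toy_*` names are hypothetical and not part of `hubmap`.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic binary classification data standing in for the real loaders.
data = TensorDataset(torch.randn(64, 4), torch.randint(0, 2, (64, 1)).float())
loader = DataLoader(data, batch_size=16, shuffle=True)

toy_model = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
toy_optimizer = optim.Adam(toy_model.parameters(), lr=1e-2)
toy_criterion = nn.BCELoss()


def train_one_epoch(model, loader, optimizer, criterion, device="cpu"):
    """Run one pass over the loader and return the mean loss per sample."""
    model.train()
    total_loss = 0.0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()                     # reset gradients from the previous step
        loss = criterion(model(inputs), targets)  # forward pass and loss
        loss.backward()                           # backpropagation
        optimizer.step()                          # parameter update
        total_loss += loss.item() * inputs.size(0)
    return total_loss / len(loader.dataset)


history = [
    train_one_epoch(toy_model, loader, toy_optimizer, toy_criterion)
    for _ in range(3)
]
print(history)
```

A full trainer would add evaluation on the test loader, metric updates, checkpointing, and logging around this loop.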