<a href="https://colab.research.google.com/github/wandb/edu/blob/main/mlops-001/lesson1/03_Baseline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
<!--- @wandbcode{course-lesson1} -->

# Baseline solution

In this notebook we will create a baseline solution to our semantic segmentation problem. A notebook is handy for iterating quickly; later we will refactor this code into a script so that we can run hyperparameter sweeps.

In [1]:
import wandb
import pandas as pd
from fastai.vision.all import *
from fastai.callback.wandb import WandbCallback

import params
from utils import (get_predictions, create_iou_table, MIOU, BackgroundIOU,
                   RoadIOU, TrafficLightIOU, TrafficSignIOU, PersonIOU, VehicleIOU, BicycleIOU)

Again, we import some global configuration parameters from the `params.py` file. We have also defined some helper functions in `utils.py`, for example the metrics we will track during our experiments.

Let's now create a `train_config` that we'll pass to W&B `run` to control training hyperparameters. 

In [2]:
train_config = SimpleNamespace(
    framework="fastai",
    img_size=(180, 320),
    batch_size=8,
    augment=True, # use data augmentation
    epochs=10, 
    lr=2e-3,
    pretrained=True,  # whether to use pretrained encoder
    seed=42,  # for reproducibility
)
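
As a small aside, here is a sketch of how a `SimpleNamespace` behaves, using only the standard library. In the notebook the name presumably arrives through the `fastai` star import; the explicit import and the shortened set of fields below are assumptions made to keep the example self-contained.

```python
from types import SimpleNamespace

# A reduced version of the config above, for illustration only
train_config = SimpleNamespace(batch_size=8, epochs=10, lr=2e-3)

# Values are read as attributes...
print(train_config.lr)

# ...and vars() exposes the same values as a plain dict,
# which is convenient when an API expects a mapping
as_dict = vars(train_config)
print(as_dict)
```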

We set the seed for reproducibility, using the `set_seed` function provided by `fastai`.

In [3]:
set_seed(train_config.seed, reproducible=True)
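
To illustrate what seeding buys us, here is a minimal sketch with a hypothetical helper, `seed_everything`. It is not `fastai`'s implementation; it only assumes that a seeding helper seeds every random number generator in use (Python's `random`, and NumPy and PyTorch when they are installed).

```python
import random


def seed_everything(seed: int) -> None:
    """Hypothetical helper: seed the random number generators that are available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed, nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed, nothing to seed


# Seeding twice with the same value replays the same random sequence
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```

With the same seed, shuffling and augmentation draw the same random numbers, so two runs with identical settings are easier to compare.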

## Start run and get artifacts

Initialize the `wandb` run, passing the training configuration we defined above.

In [4]:
run = wandb.init(project=params.WANDB_PROJECT, entity=params.ENTITY, job_type="training", config=train_config)

[34m[1mwandb[0m: Currently logged in as: [33merinaldi[0m ([33merinaldi-team[0m). Use [1m`wandb login --relogin`[0m to force relogin


As usual, we will use W&B Artifacts to track the lineage of our models. 

In [5]:
processed_data_at = run.use_artifact(f'{params.PROCESSED_DATA_AT}:latest')
processed_dataset_dir = Path(processed_data_at.download())
df = pd.read_csv(processed_dataset_dir / 'data_split.csv')

[34m[1mwandb[0m: Downloading large artifact bdd_simple_1k_split:latest, 846.07MB. 4010 files... 
[34m[1mwandb[0m:   4010 of 4010 files downloaded.  
Done. 0:0:1.2


We will not use the hold-out test split at this stage, so we drop it. The `is_valid` column tells our trainer how to split the remaining data between training and validation.

In [6]:
df = df[df.Stage != 'test'].reset_index(drop=True)  # drop the files corresponding to the test split
df['is_valid'] = df.Stage == 'valid'
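
The same filtering logic can be sketched in plain Python to make each step explicit. The rows below are made up and the file names are hypothetical; only the `File_Name` and `Stage` column names come from the dataframe used in this notebook.

```python
import csv
import io
from collections import Counter

# Made-up rows that mimic two columns of data_split.csv
raw = """File_Name,Stage
img_a.jpg,train
img_b.jpg,valid
img_c.jpg,test
img_d.jpg,train
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Drop the hold-out test rows, as df[df.Stage != 'test'] does
kept = [r for r in rows if r["Stage"] != "test"]

# Flag validation rows, as df.Stage == 'valid' does
for r in kept:
    r["is_valid"] = r["Stage"] == "valid"

# Count training (False) and validation (True) rows as a sanity check
counts = Counter(r["is_valid"] for r in kept)
print(counts)
```

Checking the counts after the split is a cheap way to confirm that no test rows leak into training.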

## Prepare data for training

A helper function to get the mask file path from the image file path:

In [7]:
def label_func(fname: Path) -> Path:
    """Generate the path to the mask file corresponding
    to the image file in input

    Args:
        fname (Path): The path to an image file

    Returns:
        Path: The path to the corresponding mask file
    """
    return (fname.parent.parent/"labels")/f"{fname.stem}_mask.png"
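
As a quick usage check, the function can be applied to one of the image paths in this dataset. The sketch repeats the function body so that it runs on its own, and it assumes the layout suggested by those paths: sibling `images/` and `labels/` folders inside the artifact directory.

```python
from pathlib import Path


def label_func(fname: Path) -> Path:
    # Same logic as above: go up from images/ and down into labels/
    return (fname.parent.parent / "labels") / f"{fname.stem}_mask.png"


image = Path("artifacts/bdd_simple_1k_split:v0/images/a59131a5-00000000.jpg")
mask = label_func(image)

print(mask.parent.name)  # folder that holds the masks
print(mask.name)         # mask file name derived from the image stem
```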

We will use `fastai`'s `DataBlock` API to feed data into model training and validation. 

In [8]:
# assign paths to images and labels
df["image_fname"] = [processed_dataset_dir/f'images/{f}' for f in df.File_Name.values]
df["label_fname"] = [label_func(f) for f in df.image_fname.values]

In [10]:
df.head()

| Unnamed: 0 | File_Name | Stage | is_valid | image_fname | label_fname |
|---|---|---|---|---|---|
| 0 | a59131a5-00000000.jpg | train | False | artifacts/bdd_simple_1k_split:v0/images/a59131a5-00000000.jpg | artifacts/bdd_simple_1k_split:v0/labels/a59131a5-00000000_mask.png |
| 1 | 6886b3d9-6ab2b28d.jpg | train | False | artifacts/bdd_simple_1k_split:v0/images/6886b3d9-6ab2b28d.jpg | artifacts/bdd_simple_1k_split:v0/labels/6886b3d9-6ab2b28d_mask.png |
| 2 | 115e4aff-00000000.jpg | train | False | artifacts/bdd_simple_1k_split:v0/images/115e4aff-00000000.jpg | artifacts/bdd_simple_1k_split:v0/labels/115e4aff-00000000_mask.png |
| 3 | b803d91d-671b8cff.jpg | train | False | artifacts/bdd_simple_1k_split:v0/images/b803d91d-671b8cff.jpg | artifacts/bdd_simple_1k_split:v0/labels/b803d91d-671b8cff_mask.png |
| 4 | c665137e-6fffaf45.jpg | train | False | artifacts/bdd_simple_1k_split:v0/images/c665137e-6fffaf45.jpg | artifacts/bdd_simple_1k_split:v0/labels/c665137e-6fffaf45_mask.png |


In [9]:
def get_data(
    df: pd.DataFrame, bs: int = 4, img_size: tuple = (180, 320), augment: bool = True
) -> DataLoaders:
    """Create the data loaders for images using the fastai DataBlock API
    from a dataframe that contains the paths to images and labels.

    Args:
        df (pd.DataFrame): The dataframe with the paths to the files to use as data
        bs (int, optional): The batch size. Defaults to 4.
        img_size (tuple, optional): The image size. Defaults to (180, 320).
        augment (bool, optional): If using augmentation transforms. Defaults to True.

    Returns:
        DataLoaders: The dataloaders
    """
    block = DataBlock(
        blocks=(ImageBlock, MaskBlock(codes=params.BDD_CLASSES)),
        get_x=ColReader("image_fname"),
        get_y=ColReader("label_fname"),
        splitter=ColSplitter(),
        item_tfms=Resize(img_size),
        batch_tfms=aug_transforms() if augment else None,
    )
    return block.dataloaders(df, bs=bs)


We are using `wandb.config` to track our training hyperparameters. 

In [11]:
config = wandb.config

Create our dataloaders

In [12]:
dls = get_data(df, bs=config.batch_size, img_size=config.img_size, augment=config.augment)

In [15]:
dls

<fastai.data.core.DataLoaders at 0x176ab8700>

## Define and train UNet model based on `ResNet18`

We will use *intersection over union* (IoU) metrics: the mean across all classes (MIOU) and the IoU for each class separately. Our model will be a `unet` based on a pretrained `resnet18` backbone.
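
The metric classes themselves live in `utils.py`, which is not shown here. The following is only a minimal sketch of the underlying idea, not that implementation: the helper names `per_class_iou` and `mean_iou` are hypothetical, masks are flattened lists of class ids, and a class absent from both masks is assumed to score 0.

```python
from typing import List, Sequence


def per_class_iou(pred: Sequence[int], target: Sequence[int], n_classes: int) -> List[float]:
    """IoU for each class: |pred AND target| / |pred OR target| over pixels."""
    ious = []
    for c in range(n_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        # Assumption: a class missing from both masks gets an IoU of 0
        ious.append(inter / union if union else 0.0)
    return ious


def mean_iou(pred: Sequence[int], target: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of the per-class IoU values."""
    ious = per_class_iou(pred, target, n_classes)
    return sum(ious) / len(ious)


# Six pixels, three classes; one pixel of class 1 is predicted as class 2
target = [0, 0, 1, 1, 2, 2]
pred = [0, 0, 1, 2, 2, 2]

ious = per_class_iou(pred, target, 3)
print(ious)
print(mean_iou(pred, target, 3))
```

Because the mean is unweighted, a rare class that the model never predicts pulls the mean down as much as a frequent one.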

In [16]:
metrics = [MIOU(), BackgroundIOU(), RoadIOU(), TrafficLightIOU(),
           TrafficSignIOU(), PersonIOU(), VehicleIOU(), BicycleIOU()]

learn = unet_learner(dls, arch=resnet18, pretrained=config.pretrained, metrics=metrics)

Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /Users/enrythebest/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth



`fastai` already has a callback that integrates tightly with W&B: we only need to pass the `WandbCallback` to the learner and we are ready to go. The callback logs the useful variables for us; for example, every metric we pass to the learner is tracked by the callback. The `WandbCallback` also accepts a string to identify the model name (useful when trying many models) and the dataset name. With `log_model=True`, the model is saved as an artifact.

In [17]:
callbacks = [
    SaveModelCallback(monitor='miou'),
    WandbCallback(log_preds=False, log_model=True)  # we do not log the predictions automatically. We create a table later
]

Let's train our model!

In [18]:
learn.fit_one_cycle(config.epochs, config.lr, cbs=callbacks)

| epoch | train_loss | valid_loss | miou | background_iou | road_iou | traffic_light_iou | traffic_sign_iou | person_iou | vehicle_iou | bicycle_iou | time |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.491318 | 0.421165 | 0.251009 | 0.832042 | 0.713021 | 0.0 | 0.0 | 0.0 | 0.211997 | 0.0 | 04:49 |
| 1 | 0.452435 | 0.449401 | 0.247709 | 0.837002 | 0.602234 | 0.0 | 0.0 | 0.0 | 0.294728 | 0.0 | 04:48 |
| 2 | 0.334857 | 0.29119 | 0.336765 | 0.893784 | 0.792019 | 0.0 | 0.0 | 0.0 | 0.67155 | 0.0 | 04:50 |
| 3 | 0.329983 | 0.273721 | 0.342748 | 0.899133 | 0.812429 | 0.0 | 0.0 | 0.0 | 0.687673 | 0.0 | 04:44 |
| 4 | 0.279289 | 0.281466 | 0.346482 | 0.899616 | 0.802815 | 0.0 | 0.0 | 0.0 | 0.722946 | 0.0 | 04:45 |
| 5 | 0.262704 | 0.272615 | 0.344906 | 0.906533 | 0.829176 | 0.0 | 0.0 | 0.0 | 0.678633 | 0.0 | 04:45 |
| 6 | 0.22506 | 0.237343 | 0.35679 | 0.915882 | 0.831107 | 0.005341 | 0.0 | 0.0 | 0.745202 | 0.0 | 04:44 |
| 7 | 0.208304 | 0.235644 | 0.358016 | 0.916856 | 0.835043 | 0.0 | 0.0 | 0.0 | 0.754211 | 0.0 | 04:48 |
| 8 | 0.19153 | 0.231745 | 0.373429 | 0.921806 | 0.843662 | 0.090874 | 0.0 | 0.0 | 0.757663 | 0.0 | 04:41 |
| 9 | 0.18044 | 0.225847 | 0.371104 | 0.922678 | 0.843015 | 0.066249 | 0.0 | 0.0 | 0.765784 | 0.0 | 04:39 |
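
The `miou` column is consistent with an unweighted mean of the seven per-class columns. The short check below recomputes it for epoch 9, using only values copied from the results above.

```python
# Per-class IoU values copied from the epoch 9 row of the training results
epoch_9 = {
    "background_iou": 0.922678,
    "road_iou": 0.843015,
    "traffic_light_iou": 0.066249,
    "traffic_sign_iou": 0.0,
    "person_iou": 0.0,
    "vehicle_iou": 0.765784,
    "bicycle_iou": 0.0,
}

# Unweighted mean across the seven classes
miou = sum(epoch_9.values()) / len(epoch_9)
print(round(miou, 6))  # 0.371104, the miou reported for epoch 9
```

This also explains why the mean stays below 0.4 even though background and road score above 0.8: traffic sign, person and bicycle remain at 0 in every epoch, and traffic light barely rises above it.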


We will log a table with model predictions and ground truth to W&B, so that we can do error analysis in the W&B dashboard. 

In [19]:
samples, outputs, predictions = get_predictions(learn)
table = create_iou_table(samples, outputs, predictions, params.BDD_CLASSES)
wandb.log({"pred_table":table})

At the end of training, the callbacks reload the model from the best checkpoint and save it. To make sure we track the final metrics correctly, we validate the model again and save the final loss and metrics to `wandb.summary`.

In [20]:
scores = learn.validate()
metric_names = ['final_loss'] + [f'final_{x.name}' for x in metrics]
final_results = dict(zip(metric_names, scores))  # pair each metric name with its score
for k, v in final_results.items():
    wandb.summary[k] = v
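
The final values come from the best checkpoint rather than from the last epoch. The selection that `SaveModelCallback(monitor='miou')` performs can be pictured with the simplified sketch below; it is not `fastai`'s code, and it only assumes that a higher `miou` counts as better. The numbers are the `miou` values from the training results above.

```python
# miou per epoch, copied from the training results (epochs 0 to 9)
miou_per_epoch = [0.251009, 0.247709, 0.336765, 0.342748, 0.346482,
                  0.344906, 0.356790, 0.358016, 0.373429, 0.371104]

best_epoch, best_miou = None, float("-inf")
for epoch, miou in enumerate(miou_per_epoch):
    # Keep a checkpoint only when the monitored metric strictly improves
    if miou > best_miou:
        best_epoch, best_miou = epoch, miou

print(best_epoch, best_miou)
```

The best epoch is 8, not the last one, so validating the reloaded model should reproduce the epoch 8 numbers. That agrees with the `final_loss` and `final_miou` values in the run summary.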

In [21]:
wandb.finish()


Run history:
background_iou,▁▁▆▆▆▇▇███
bicycle_iou,▁▁▁▁▁▁▁▁▁▁
epoch,▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
eps_0,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
eps_1,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
eps_2,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr_0,▁▂▂▃▄▅▆▇███████▇▇▇▇▆▆▆▅▅▅▄▄▄▃▃▃▂▂▂▁▁▁▁▁▁
lr_1,▁▂▂▃▄▅▆▇███████▇▇▇▇▆▆▆▅▅▅▄▄▄▃▃▃▂▂▂▁▁▁▁▁▁
lr_2,▁▂▂▃▄▅▆▇███████▇▇▇▇▆▆▆▅▅▅▄▄▄▃▃▃▂▂▂▁▁▁▁▁▁
miou,▁▁▆▆▆▆▇▇██

Run summary:
background_iou,0.92268
bicycle_iou,0.0
epoch,10.0
eps_0,1e-05
eps_1,1e-05
eps_2,1e-05
final_background_iou,0.92181
final_bicycle_iou,0.0
final_loss,0.23175
final_miou,0.37343
