# Baseline solution

In this notebook we create a baseline solution to our image classification problem. A notebook is handy for fast iteration; we will later refactor this code into a script so that we can run hyperparameter sweeps.

In [1]:
# autoreload modules after editing
# without the need of restarting the kernel
%load_ext autoreload
%autoreload 2

# import from file in the parent directory
import sys
sys.path.append('../')

import wandb
import pandas as pd
from fastai.vision.all import *
from fastai.callback.wandb import WandbCallback
import timm

import params

To get all available models from the `timm` library run the following command:
```python
import timm
models_to_benchmark = timm.list_models(pretrained=True)
```
Promising candidates were selected from previous experiments, and `Inception` baselines were added as well. We will load the names of those models from a file.

In [2]:
# load list from the file
with open("../models_to_benchmark.txt", "r") as f:
    models_to_benchmark = f.read().splitlines()
models_to_benchmark

['res2net101d.in1k',
 'rexnetr_300.sw_in12k_ft_in1k',
 'seresnextaa101d_32x8d.sw_in12k_ft_in1k_288',
 'coatnet_0_rw_224.sw_in1k',
 'vit_large_r50_s32_224.augreg_in21k',
 'resnext101_32x4d.fb_swsl_ig1b_ft_in1k',
 'vit_base_r50_s16_224.orig_in21k',
 'coatnet_rmlp_1_rw2_224.sw_in12k',
 'inception_v3.tf_in1k',
 'inception_resnet_v2.tf_ens_adv_in1k']
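A hand-edited text file can pick up blank lines, stray whitespace, or repeated entries. The following is a minimal sketch of a slightly more defensive loader; `load_model_names` is a hypothetical helper, not part of this project, and it assumes one model name per line.

```python
from pathlib import Path


def load_model_names(path):
    """Read one model name per line, skipping blanks and duplicates.

    Hypothetical helper for illustration; the order of first
    appearance is preserved.
    """
    names = []
    for line in Path(path).read_text().splitlines():
        name = line.strip()
        if name and name not in names:
            names.append(name)
    return names
```

For a clean file this returns the same list as the `read().splitlines()` call above.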

Let's now create a `train_config` that we'll pass to W&B `run` to control training hyperparameters.

In [3]:
train_config = SimpleNamespace(
    framework="fastai",
    img_size=(224, 224),
    batch_size=8,
    augment=True, # use data augmentation
    epochs=10, 
    lr=None, # select learning rate automatically
    arch="res2net101d.in1k",
    pretrained=True,  # whether to use pretrained encoder
    seed=42,
)
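Since the plan is to refactor this notebook into a script for hyperparameter sweeps, it helps to see how a `SimpleNamespace` of defaults can be overridden from the command line. The sketch below uses only the standard library; `parse_args` and the choice of exposed flags are assumptions made for illustration, not the project's actual script.

```python
import argparse
from types import SimpleNamespace

# A subset of the defaults used in this notebook.
default_config = SimpleNamespace(
    batch_size=8,
    epochs=10,
    lr=None,  # None means "pick the learning rate automatically"
    arch="res2net101d.in1k",
    seed=42,
)


def parse_args(argv=None, defaults=default_config):
    """Return a copy of `defaults` with command-line overrides applied."""
    parser = argparse.ArgumentParser(description="Override training hyperparameters")
    parser.add_argument("--batch_size", type=int, default=defaults.batch_size)
    parser.add_argument("--epochs", type=int, default=defaults.epochs)
    parser.add_argument("--lr", type=float, default=defaults.lr)
    parser.add_argument("--arch", type=str, default=defaults.arch)
    parser.add_argument("--seed", type=int, default=defaults.seed)
    args = parser.parse_args(argv)

    merged = vars(defaults).copy()  # leave the defaults object untouched
    merged.update(vars(args))
    return SimpleNamespace(**merged)
```

A sweep agent could then launch the script with flags such as `--epochs 3 --lr 0.001`, while running it without flags reproduces the defaults.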

We set the seed for reproducibility.

In [4]:
set_seed(train_config.seed, reproducible=True)
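The idea behind seeding can be illustrated with the standard library alone: re-seeding a generator with the same value replays the same sequence of pseudo-random numbers. This is only a sketch of the principle; fastai's `set_seed` applies it to the generators that training actually uses, and `sample` below is a hypothetical helper.

```python
import random


def sample(seed, n=5):
    """Seed Python's generator and draw `n` pseudo-random floats."""
    random.seed(seed)
    return [random.random() for _ in range(n)]


# The same seed replays the same sequence.
same_seed_matches = sample(42) == sample(42)
```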

In [5]:
run = wandb.init(project=params.WANDB_PROJECT, entity=params.ENTITY, job_type="training", config=train_config)

wandb: Currently logged in as: apopov (ijc-amp). Use `wandb login --relogin` to force relogin


As usual, we will use W&B Artifacts to track the lineage of our models. 

In [6]:
processed_data_at = run.use_artifact(f'{params.PROCESSED_DATA_AT}:latest')
processed_dataset_dir = Path(processed_data_at.download())
df = pd.read_csv(processed_dataset_dir / 'data_split.csv')

wandb: Downloading large artifact TCGA-COAD-split:latest, 341.94MB. 2834 files...
wandb:   2834 of 2834 files downloaded.
Done. 0:0:29.9


We will not use the hold-out test set at this stage. The `valid_col` column will tell our trainer how to split the data between training and validation.

In [15]:
df = df[df.Split != 'test'].reset_index(drop=True)
df['valid_col'] = df.Split == 'valid'

In [22]:
df.head(1)

Unnamed: 0,Fname,Split,valid_col
0,TCGA-A6-6141_544b2a2e-17e4-4fde-b6ca-696d6dde973e_TCGA-A6-6141-01Z-00-DX1.34b5db5c-74df-47d9-bb89-beec93ded868_CMS3_672_224.png,train,False


Code from the previous experiments:
```python
# create dataloader
dls = ImageDataLoaders.from_name_func(
    path=".", fnames=list(PATH_PATCHES_10_percent.iterdir()),
    valid_pct=0.2, seed=42,
    label_func=fname2label, item_tfms=Resize(PATCH_SIZE))
```

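Create the dataloaders. Note that the `ImageDataLoaders` call below still draws a random 80/20 split with `valid_pct=0.2`, as in the previous experiments, rather than using `valid_col` from the dataframe: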

In [23]:
processed_dataset_dir

Path('artifacts/TCGA-COAD-split:v0')

In [31]:
def fname2label(fname):
    """Extract class of the patch from absolute path to it."""
    if isinstance(fname, Path):
        fname = str(fname)
    return fname.split("_")[-3]
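As a worked example, here is the same function applied to the file name shown in `df.head(1)`. Splitting on `_` and counting from the end means the label is found whether or not the directory part of the path is included. The `patches` sub-directory in the second call is taken from the dataloader cell in this notebook.

```python
from pathlib import Path


def fname2label(fname):
    """Extract class of the patch from absolute path to it."""
    if isinstance(fname, Path):
        fname = str(fname)
    return fname.split("_")[-3]


# File name copied from the first row of the dataframe.
example = (
    "TCGA-A6-6141_544b2a2e-17e4-4fde-b6ca-696d6dde973e_"
    "TCGA-A6-6141-01Z-00-DX1.34b5db5c-74df-47d9-bb89-beec93ded868_"
    "CMS3_672_224.png"
)

label_from_name = fname2label(example)
label_from_path = fname2label(Path("artifacts/TCGA-COAD-split:v0/patches") / example)
```

Both calls give `CMS3`, the third field from the end.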

In [34]:
PATCH_SIZE = (224, 224)

In [36]:
dls = ImageDataLoaders.from_name_func(
    path=".", fnames=list((processed_dataset_dir/"patches").iterdir()),
    valid_pct=0.2, seed=42,
    label_func=fname2label, item_tfms=Resize(PATCH_SIZE))
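To train on the split stored in the dataframe instead of a random one, `ImageDataLoaders.from_df` accepts a `valid_col` argument. The sketch below is one way this could look, under several assumptions: `make_dls` is a hypothetical helper, the `Fname` column holds bare file names located in the `patches` sub-directory, and a `label` column is derived with a function such as `fname2label`. It has not been run against this dataset.

```python
def make_dls(df, dataset_dir, label_func, patch_size=(224, 224),
             batch_size=8, valid_col="valid_col"):
    """Build dataloaders that honour the train/valid split in `df`.

    Hypothetical helper: `df` is expected to be a pandas DataFrame with
    an `Fname` column and a boolean `valid_col` column.
    """
    # Imported lazily so that defining the helper does not require fastai.
    from fastai.vision.all import ImageDataLoaders, Resize

    # Derive the class label from each file name.
    df = df.assign(label=df["Fname"].map(label_func))

    return ImageDataLoaders.from_df(
        df,
        path=dataset_dir,     # e.g. processed_dataset_dir
        folder="patches",     # assumed location of the image files
        fn_col="Fname",
        label_col="label",
        valid_col=valid_col,  # rows with True go to the validation set
        item_tfms=Resize(patch_size),
        bs=batch_size,
    )
```

In this notebook the call would be `make_dls(df, processed_dataset_dir, fname2label)`.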



Create learner:

In [38]:
model_name = 'res2net101d.in1k'

In [66]:
metrics = [accuracy, F1Score(average='macro')]
metrics_names = ['accuracy', 'f1']

In [39]:
learn = vision_learner(dls, model_name, metrics=metrics)

Downloading model.safetensors:   0%|          | 0.00/181M [00:00<?, ?B/s]

Find appropriate learning rate:

In [40]:
suggested_lrs = learn.lr_find(suggest_funcs=(minimum, steep, valley, slide), show_plot=False)
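`lr_find` returns a named tuple with one field per suggestion function, which is why `suggested_lrs.valley` is available later. The sketch below shows one way to combine this with the `lr=None` convention from `train_config`: use the configured value if there is one, otherwise fall back to a suggestion. `pick_lr` is a hypothetical helper, and the numbers in the stand-in tuple are made up for illustration only.

```python
from collections import namedtuple
from types import SimpleNamespace

# Stand-in for the object returned by lr_find; the values are illustrative.
SuggestedLRs = namedtuple("SuggestedLRs", ["minimum", "steep", "valley", "slide"])
example_suggestions = SuggestedLRs(minimum=1e-2, steep=1e-3, valley=5e-4, slide=2e-3)


def pick_lr(config, suggestions, strategy="valley"):
    """Use `config.lr` when set, otherwise the chosen lr_find suggestion."""
    if config.lr is not None:
        return config.lr
    return getattr(suggestions, strategy)


auto_lr = pick_lr(SimpleNamespace(lr=None), example_suggestions)
manual_lr = pick_lr(SimpleNamespace(lr=3e-4), example_suggestions)
```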

Add required callbacks:

In [46]:
callbacks = [
    SaveModelCallback(monitor='accuracy'),
    WandbCallback(log_preds=False, log_model=True)
]

Train:

In [50]:
learn.fine_tune(5, base_lr=suggested_lrs.valley, cbs=callbacks)

epoch,train_loss,valid_loss,accuracy,f1_score,time
0,0.287388,2.85424,0.339223,0.288212,00:06


Better model found at epoch 0 with accuracy value: 0.33922260999679565.


epoch,train_loss,valid_loss,accuracy,f1_score,time
0,0.223899,2.580197,0.39576,0.334238,00:07
1,0.210413,3.01089,0.434629,0.325807,00:07
2,0.216966,3.326365,0.416961,0.317821,00:07
3,0.190332,3.009998,0.40636,0.360009,00:07
4,0.153114,2.880428,0.427562,0.369724,00:07


Better model found at epoch 0 with accuracy value: 0.3957597315311432.
Better model found at epoch 1 with accuracy value: 0.434628963470459.
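The "Better model found" messages come from tracking the best value of the monitored metric. The following is a plain-Python sketch of that idea, not fastai's implementation, applied to the accuracy values from the second training phase in the log above. `track_best` is a hypothetical helper.

```python
# Accuracy per epoch, copied from the second table in the training log.
accuracy_per_epoch = [0.39576, 0.434629, 0.416961, 0.40636, 0.427562]


def track_best(values):
    """Return (best_epoch, best_value, epochs_where_the_best_improved)."""
    best_value = float("-inf")
    best_epoch = None
    improvements = []
    for epoch, value in enumerate(values):
        if value > best_value:  # higher is better for accuracy
            best_value, best_epoch = value, epoch
            improvements.append(epoch)
    return best_epoch, best_value, improvements


best_epoch, best_value, improvements = track_best(accuracy_per_epoch)
```

The improvements happen at epochs 0 and 1, matching the two messages in the log, and epoch 1 remains the best even though training continues to epoch 4.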


We will log a table with model predictions and ground truth to W&B, so that we can do error analysis in the W&B dashboard.
```python
samples, outputs, predictions = get_predictions(learn)
table = create_iou_table(samples, outputs, predictions, params.BDD_CLASSES)
wandb.log({"pred_table":table})
```

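The model is reloaded from the best checkpoint at the end of training (this is what `SaveModelCallback` does by default) and saved. To make sure we track the final metrics correctly, we validate the model again and save the final loss and metrics to `wandb.summary`. In this run the final values match epoch 1 of the second training phase, which was the best epoch by accuracy.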

In [67]:
scores = learn.validate()
metric_names = ['final_loss'] + [f'final_{x}' for x in metrics_names]
final_results = {metric_names[i] : scores[i] for i in range(len(scores))}
for k,v in final_results.items(): 
    wandb.summary[k] = v
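The cell above relies on `learn.validate()` returning the loss followed by the metrics in the order they were passed to the learner. A small sketch of the same mapping with an explicit length check is shown below; `name_final_scores` is a hypothetical helper, and the example scores are the final values reported in this run's summary.

```python
def name_final_scores(scores, metric_names):
    """Pair validation scores with `final_*` keys, loss first."""
    names = ["final_loss"] + [f"final_{name}" for name in metric_names]
    if len(names) != len(scores):
        raise ValueError(
            f"expected {len(names)} scores (loss + metrics), got {len(scores)}"
        )
    return dict(zip(names, scores))


# Final loss, accuracy and F1 as reported in the run summary.
final_results = name_final_scores([3.01089, 0.43463, 0.32581], ["accuracy", "f1"])
```

The check guards against a silent mismatch if the list of metrics changes but the list of names does not.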

In [68]:
wandb.finish()

Run history:
accuracy,▁▅█▇▆▇
epoch,▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
eps_0,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
eps_1,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
f1_score,▁▅▄▄▇█
lr_0,▁▂▃▄▆▇█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
lr_1,▁▂▃▄▆▇█▂▂▂▂▃▃▄▄▄▅▅▅▄▄▄▄▄▄▄▃▃▃▃▂▂▂▂▁▁▁▁▁▁
mom_0,██▇▅▃▂▁██▇▆▅▄▃▂▂▁▁▁▁▁▂▂▂▃▃▄▄▄▅▅▆▆▇▇▇████
mom_1,██▇▅▃▂▁██▇▆▅▄▃▂▂▁▁▁▁▁▂▂▂▃▃▄▄▄▅▅▆▆▇▇▇████
raw_loss,▂▄▃▅▅▆█▅▆▆▂▄▄▄▁▁▅▃▄▃▃▆▄▆▄▃▆▂▅▅▁▄▄▂▃▂▂▂▂▃

Run summary:
accuracy,0.42756
epoch,6.0
eps_0,1e-05
eps_1,1e-05
f1_score,0.36972
final_accuracy,0.43463
final_f1,0.32581
final_loss,3.01089
lr_0,0.0
lr_1,0.0
