# Part 3: Model training

##### A CPU is needed to run this notebook!

In this notebook, a pretrained convolutional network is loaded through the fastai API. The network architecture is a ResNet18, that is, a residual network with 18 layers, which is used here as the backbone of a U-Net.

Although fastai abstracts and simplifies large parts of the model building process, some configuration has to be done before the model can be fed with the training data:

- Strip the file extension from the labels and define "codes" for correct interpretation
- Define image augmentation methods so that the model can also train on slightly altered images
- Set up model parameters and architecture
- Define the evaluation metric and find a learning rate

The applied knowledge in this notebook is based on the Practical Deep Learning Course from fastai: https://course.fast.ai/

In [1]:
from fastai.vision.all import *

In [2]:
TRAINING_SET_PATH = Path('./data/training_set/')
TRAINING_IMAGES_PATH = TRAINING_SET_PATH/'images'
TRAINING_MASKS_PATH = TRAINING_SET_PATH/'masks'
# Verify that all paths exist
all(p.exists() for p in (TRAINING_SET_PATH, TRAINING_IMAGES_PATH, TRAINING_MASKS_PATH))

True

#### Mask preparation

As a first step, the model needs to be told how to interpret the labels. A label function maps each image to its mask file by taking the image file name without its extension and looking up the `.png` of the same name in the masks folder. Furthermore, codes are assigned to the mask values for correct labeling and easier interpretation of the results.

In [3]:
# Map an image file to its mask: same file stem, with a .png extension, in the masks folder
def label_func(fn):
    return TRAINING_MASKS_PATH / f"{fn.stem}.png"

# Assigned codes: indices 0-254 are "not_island", index 255 is "island"
codes = 255 * ["not_island"]
codes.append('island')
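
The mapping and the code list can be checked in isolation. The sketch below is a minimal, fastai-free illustration: it assumes that island pixels are stored in the masks with the value 255 (as the metric further down suggests), so that the pixel value can serve as an index into `codes`. The file name `tile_001.jpg` and the variable `masks_path` are made up for the example.

```python
from pathlib import Path

# Hypothetical folder, mirroring TRAINING_MASKS_PATH from the notebook
masks_path = Path("data/training_set/masks")

def label_func(fn):
    # Same rule as in the notebook: keep the file stem, force a .png extension
    return masks_path / f"{fn.stem}.png"

# 255 background entries followed by one 'island' entry: 256 codes in total,
# so a mask pixel value of 255 indexes the 'island' entry
codes = 255 * ["not_island"] + ["island"]

example_image = Path("data/training_set/images/tile_001.jpg")  # made-up file name
print(label_func(example_image))
print(len(codes), codes[0], codes[255])
```

If the masks used a different pixel value for islands, the list would need to be built so that this value is the index of the `'island'` entry.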

To use the full potential of the given dataset, slight variations of the input images are created during training (data augmentation). The model can therefore train on more varied data and ultimately deliver better results.
> In this example, the images are flipped and rotated, the brightness is changed, and the images are slightly zoomed and warped.

In [4]:
# Define image augmentation options for model
batch_tfms = aug_transforms(flip_vert=True, max_lighting=0.1, max_zoom=1.05, max_warp=0.2)
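
In a segmentation task, a geometric change such as a flip has to be applied to the image and to its mask in the same way, otherwise the labels no longer line up with the pixels. fastai takes care of this internally; the sketch below only illustrates the idea with plain nested lists. The helper `flip_vertical` and the tiny example grids are hypothetical.

```python
def flip_vertical(grid):
    """Return a copy of a 2D grid (list of rows) flipped top to bottom."""
    return [row[:] for row in reversed(grid)]

# A 3x2 "image" and its mask (255 marks island pixels)
image = [[10, 20], [30, 40], [50, 60]]
mask = [[0, 0], [0, 255], [255, 255]]

# The same transform is applied to both, so pixel and label stay aligned
flipped_image = flip_vertical(image)
flipped_mask = flip_vertical(mask)

print(flipped_image)  # [[50, 60], [30, 40], [10, 20]]
print(flipped_mask)   # [[255, 255], [0, 255], [0, 0]]
```

Brightness changes, in contrast, only make sense for the image and would leave the mask untouched.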

### Setup DataLoader

#### Data block
A data block is a generic container to quickly build Datasets and DataLoaders. Here, all the components required for the calculations are assigned: the input and target types, the function that collects the images, the label function, the resizing, and the augmentation methods.

#### Data loader
Here, the input data is assigned, along with the number of samples (images) loaded per batch (`bs`). A random seed is passed so that the results are reproducible: anyone who re-runs the code with the same seed should get the same outputs.

In [5]:
# Datablock
crossDB = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
                    get_items=get_image_files,
                    get_y=label_func,
                    item_tfms=Resize(394),
                    batch_tfms=batch_tfms)

# Dataloader
dls = crossDB.dataloaders(TRAINING_IMAGES_PATH, bs=5, seed=47)

# Interpretation of the labels
dls.vocab = codes

# Optimizer
opt = ranger

Due to IPython and Windows limitation, python multiprocessing isn't available now.
So `number_workers` is changed to 0 to avoid getting stuck
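
Why a fixed seed helps with reproducibility can be shown with a plain-Python sketch of a random train/validation split. This is not fastai's implementation, and how exactly the `seed` argument is used inside the `dataloaders` call is not shown in this notebook; the function `random_split` and the 20% validation share are assumptions made for illustration.

```python
import random

def random_split(n_items, valid_pct=0.2, seed=None):
    """Shuffle the indices 0..n_items-1 and hold out a share for validation."""
    idxs = list(range(n_items))
    # A dedicated generator: the same seed always yields the same shuffle
    random.Random(seed).shuffle(idxs)
    cut = int(valid_pct * n_items)
    return sorted(idxs[cut:]), sorted(idxs[:cut])

train_a, valid_a = random_split(20, seed=47)
train_b, valid_b = random_split(20, seed=47)

print(valid_a == valid_b)  # True: same seed, same validation set
print(len(train_a), len(valid_a))  # 16 4
```

Without a seed, every run would hold out different images, and the validation numbers of two runs would not be directly comparable.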


### Define Learner strategy

Custom metrics for evaluating each epoch can be assigned (function below). Here, accuracy is measured by comparing the output mask of the model with the target mask. Note that the function only looks at the pixels labelled as island (value 255) in the target, so it reports the fraction of island pixels that the model predicts correctly. The result is printed after each epoch.

In [6]:
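# Metric
def calc_accuracy(inp, targ):
    # Drop the channel dimension of the target if it has size 1
    targ = targ.squeeze(1)
    # Only evaluate pixels that are labelled as island (value 255) in the target
    mask = targ == 255
    # Predicted class per pixel, compared with the target on the masked pixels
    return (inp.argmax(dim=1)[mask] == targ[mask]).float().mean()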
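
To make the behaviour of this metric concrete, here is a small plain-Python sketch of the same logic on nested lists instead of tensors. It assumes the per-pixel class has already been chosen (the `argmax` step), and the function name `masked_pixel_accuracy` is hypothetical.

```python
def masked_pixel_accuracy(pred_classes, target, island_code=255):
    """Fraction of island pixels in `target` that `pred_classes` labels correctly."""
    hits = total = 0
    for pred_row, targ_row in zip(pred_classes, target):
        for p, t in zip(pred_row, targ_row):
            if t == island_code:  # pixels outside the island mask are ignored
                total += 1
                hits += (p == t)
    return hits / total if total else float("nan")

target = [[0, 255, 255],
          [0,   0, 255]]
pred   = [[0, 255,   0],
          [255, 0, 255]]

# 3 island pixels in the target, 2 of them predicted as island
print(masked_pixel_accuracy(pred, target))
```

The wrongly predicted island in the lower-left corner does not lower the score, because that pixel is not an island in the target. The metric therefore says nothing about false positives.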

Now all defined parameters have to be assigned to the learner. The architecture is also defined here by loading the pretrained resnet18.

In [7]:
learn = unet_learner(dls, resnet18, metrics=calc_accuracy, self_attention=True, act_cls=Mish, opt_func=opt)

fastai can suggest a suitable learning rate for the previously built learner using the `lr_find()` function:

In [None]:
lr_min, lr_steep = learn.lr_find(suggest_funcs=(minimum, steep))

Now the actual training process begins. It starts with 4 training epochs to test whether everything works properly. During these epochs the pretrained layers are frozen: freezing prevents the weights of a neural network layer from being modified. To continue training with all layers, the model must be unfrozen.

In [6]:
learn.fit_one_cycle(4, 5e-3)
learn.unfreeze()
learn.fit_one_cycle(25, lr_max=slice(1e-6, 1e-4))

| epoch | train_loss | valid_loss | calc_accuracy | time |
|---|---|---|---|---|
| 0 | 14.808592 | 0.019832 | 0.089812 | 02:13 |
| 1 | 0.015428 | 0.01072 | 0.516481 | 02:13 |
| 2 | 0.010131 | 0.008949 | 0.577732 | 02:16 |
| 3 | 0.00722 | 0.007035 | 0.728739 | 02:16 |


| epoch | train_loss | valid_loss | calc_accuracy | time |
|---|---|---|---|---|
| 0 | 0.007357 | 0.006987 | 0.7402 | 02:18 |
| 1 | 0.006406 | 0.006883 | 0.734939 | 02:19 |
| 2 | 0.00775 | 0.006877 | 0.779731 | 02:19 |
| 3 | 0.005983 | 0.006639 | 0.75245 | 02:19 |
| 4 | 0.006459 | 0.006419 | 0.760159 | 02:19 |
| 5 | 0.006173 | 0.006343 | 0.780071 | 02:20 |
| 6 | 0.006517 | 0.006055 | 0.775187 | 02:20 |
| 7 | 0.005605 | 0.006128 | 0.777016 | 02:18 |
| 8 | 0.00666 | 0.005798 | 0.792072 | 02:18 |
| 9 | 0.005963 | 0.005835 | 0.768255 | 02:18 |
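
Two ideas behind the training calls can be sketched without fastai: the one-cycle policy first raises the learning rate and then lowers it again over the course of training, and a `slice` of learning rates is spread over the layer groups so that early layers are updated more cautiously than late ones. The code below is a simplified illustration, not fastai's implementation. The function names, the default values (`div`, `div_final`, `pct_start`) and the number of layer groups (3) are assumptions and should be checked against the fastai documentation.

```python
import math

def cos_anneal(start, end, pos):
    """Cosine interpolation from `start` (pos=0) to `end` (pos=1)."""
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def one_cycle_lr(pct, lr_max, div=25.0, div_final=1e5, pct_start=0.25):
    """Learning rate at training progress `pct` in [0, 1] (assumed defaults)."""
    if pct < pct_start:
        # Warm-up phase: rise from lr_max/div to lr_max
        return cos_anneal(lr_max / div, lr_max, pct / pct_start)
    # Annealing phase: fall from lr_max to lr_max/div_final
    return cos_anneal(lr_max, lr_max / div_final, (pct - pct_start) / (1 - pct_start))

def spread_lrs(lr_slice, n_groups):
    """Geometrically spaced rates from lr_slice.start to lr_slice.stop."""
    if n_groups == 1:
        return [lr_slice.stop]
    step = (lr_slice.stop / lr_slice.start) ** (1 / (n_groups - 1))
    return [lr_slice.start * step ** i for i in range(n_groups)]

# Schedule for the frozen phase: start, peak and end of the cycle
for pct in (0.0, 0.25, 1.0):
    print(pct, one_cycle_lr(pct, lr_max=5e-3))

# Per-group peak rates for the unfrozen phase, assuming 3 layer groups
print(spread_lrs(slice(1e-6, 1e-4), 3))
```

Under these assumptions, the first layer group is trained with a peak rate of about 1e-6 and the last one with about 1e-4, which keeps the pretrained early layers close to their original weights.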


In [7]:
# Export the trained learner (written to export.pkl)
learn.export()

In [8]:
# Verify that the export was successful
path = Path()
path.ls(file_exts='.pkl')

(#1) [Path('export.pkl')]