In [None]:
import os
from efficientv2_unet.model.efficientv2_unet import create_and_train

# Train a model (and evaluate)

A simple example of how to train an EfficientV2-UNet model.

Provide a folder with RGB images, a folder with masks (where background has a value of 0 and foreground a value of 1), and the type of EfficientV2-Net model to build your model from.

Images and their corresponding masks must be placed in separate folders and must have identical file names, e.g.:

```
path/to/images:
                - image01.tif
                - image02.tif
                - ...
path/to/masks:
                - image01.tif
                - image02.tif
                - ...
```
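Because images and masks are paired by file name, it can help to check the two folders before training. The following is a minimal sketch using only the standard library; `find_unpaired_files` is a hypothetical helper written for this notebook, not part of `efficientv2_unet`.

```python
from pathlib import Path


def find_unpaired_files(image_dir, mask_dir, file_ext=".tif"):
    """Return (images without a mask, masks without an image) as sorted name lists.

    Hypothetical helper: it only compares file names, not image contents.
    """
    image_names = {p.name for p in Path(image_dir).glob(f"*{file_ext}")}
    mask_names = {p.name for p in Path(mask_dir).glob(f"*{file_ext}")}
    missing_masks = sorted(image_names - mask_names)
    missing_images = sorted(mask_names - image_names)
    return missing_masks, missing_images
```

Calling `find_unpaired_files('path/to/images', 'path/to/masks')` should return two empty lists when every image has a mask with the same name.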

Before training, the input images and masks are randomly split into 70% training images, 15% validation images and 15% test images.
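To give a feeling for what a 70/15/15 random split looks like, here is a small standard-library sketch. It is an illustration only: `split_names`, its rounding rule and its `seed` argument are assumptions, and the package's own splitting code may behave differently (for example in how it rounds or shuffles).

```python
import random


def split_names(names, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle file names and split them into train / val / test lists.

    Hypothetical helper; the remainder after train and val goes to test.
    """
    shuffled = sorted(names)                 # sort first so the result is reproducible
    random.Random(seed).shuffle(shuffled)    # seeded shuffle, independent of global state
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


names = [f"image{i:02d}.tif" for i in range(1, 21)]
train, val, test = split_names(names)
print(len(train), len(val), len(test))
```

The same list of names would be applied to both the image and the mask folder, so that each image stays paired with its mask.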

For custom splitting of your training data, check the `1-2_split_data_and_train` notebook.

The splitting will move the images into sub-folders, giving you:

```
├── path/to/images
    ├── train
    ├── val
    └── test
└── path/to/masks
    ├── train
    ├── val
    └── test
```

Your input images will be cut into smaller patches, which are saved to file in corresponding 'crop' folders. Patching is done at the native resolution, and also at half and 1/3 resolution. Hence, you can train with rather large images.
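As a rough guide to how many patches an image produces at each resolution, the sketch below counts the tiles needed to cover it. It assumes non-overlapping tiles, integer downscaling, and the default patch size of 256 described under `img_size`; the package's actual patching (overlap, padding, resizing method) may differ, and `patch_grid` is a hypothetical name.

```python
import math


def patch_grid(height, width, patch_size=256, scale_divisor=1):
    """Return (rows, cols) of patches needed to cover an image after downscaling.

    Assumes non-overlapping patches; partial patches at the border count as one.
    """
    scaled_h = height // scale_divisor
    scaled_w = width // scale_divisor
    rows = max(1, math.ceil(scaled_h / patch_size))
    cols = max(1, math.ceil(scaled_w / patch_size))
    return rows, cols


# Full, half and 1/3 resolution for a 2048 x 3072 image
for divisor in (1, 2, 3):
    print(divisor, patch_grid(2048, 3072, scale_divisor=divisor))
```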

The training includes batch image augmentation with the following transforms:
- HorizontalFlip
- RandomRotate90
- RandomGamma
- RandomBrightnessContrast
- ElasticTransform
- GridDistortion
- OpticalDistortion
- RandomSizedCrop

Monitored metrics include:
- BinaryAccuracy (at a threshold of 0.5)
- BinaryIoU (at a threshold of 0.5)
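To make the two metrics concrete, here is a NumPy sketch of how they can be computed from a predicted probability map at a threshold of 0.5. This is a hand-written illustration, not the code used during training. It assumes predictions strictly above the threshold count as foreground, and it reports the IoU of the foreground class only; a framework metric of the same name may instead average the IoU over background and foreground, depending on its configuration.

```python
import numpy as np


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of pixels whose thresholded prediction matches the mask."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_prob) > threshold
    return float(np.mean(y_pred == y_true))


def foreground_iou(y_true, y_prob, threshold=0.5):
    """Intersection over union of the foreground class (mask value 1)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_prob) > threshold
    union = np.logical_or(y_true, y_pred).sum()
    if union == 0:
        return 1.0  # assumption: both empty counts as a perfect match
    return float(np.logical_and(y_true, y_pred).sum() / union)


mask = [[0, 1], [1, 1]]                  # toy 2x2 mask
probs = [[0.2, 0.9], [0.4, 0.7]]         # toy predicted probabilities
print(binary_accuracy(mask, probs), foreground_iou(mask, probs))
```

In this toy example three of four pixels are classified correctly, and the prediction overlaps two of the three foreground pixels.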

Training a model always starts by loading the ImageNet weights.

At the end of the training, the final-epoch model and the best-checkpoint model are saved to file.

Your models will also be evaluated on the test dataset at full, half and 1/3 resolution. The console will print the best threshold, together with the best resolution to use.

Finally, the model folder will contain a JSON file with all the training and evaluation metadata, along with some graphs showing
the models' performance for different thresholds on the test images at different resolutions.

In [None]:
# Variables
image_folder = 'path/to/images'     # folder with the images
mask_folder = 'path/to/masks'       # folder with the corresponding masks [0=background, 1=foreground]

efficientnet_basemodel = 'b0'       # any of ['b0', 'b1', 'b2', 'b3', 's', 'm', 'l']


After executing the next cells, you can also monitor the training in TensorBoard.
To do so, open another terminal in your environment, then:
- cd to your basedir
- start TensorBoard with `tensorboard --logdir=.`
- (or `tensorboard --logdir={basedir}` if you did not cd to the basedir)
- access TensorBoard in a browser at http://localhost:6006/

In [None]:
model = create_and_train(
    name=None,                              # if not specified it is named 'myEfficientUNet_<efficientnet_basemodel>'
    basedir='path/to/saving_location',      # if not specified it will be placed in the current working directory
    train_img_dir=os.path.abspath(image_folder),
    train_mask_dir=os.path.abspath(mask_folder),
    efficientnet=efficientnet_basemodel,
    epochs=100,                             # default
    batch_size=64,                          # default
    file_ext='.tif'                         # default
)

### Parameters explained

**name**: you can choose any name for the model (all training metadata will be saved as a readable JSON file in the model folder). After the training, your model will be saved as, e.g.:
```
basedir / models / name :
                          - name.h5  # EfficientV2UNet model file
                          - name_best-ckp.h5  # Best EfficientV2UNet model file according to validation metrics
                          - name.json  # model training and evaluation metadata
                          - name*.png  # several graphs showing the models' performance at different thresholds
```
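The layout above can be turned into paths with `pathlib`, which is handy for locating the results after training. This is a sketch based only on the folder layout shown here; `expected_outputs` is a hypothetical helper and does not check whether the files actually exist.

```python
from pathlib import Path


def expected_outputs(basedir, name):
    """Build the output paths described above for a given basedir and model name."""
    model_dir = Path(basedir) / "models" / name
    return {
        "final_model": model_dir / f"{name}.h5",
        "best_checkpoint": model_dir / f"{name}_best-ckp.h5",
        "metadata": model_dir / f"{name}.json",
    }


outputs = expected_outputs("path/to/saving_location", "myEfficientUNet_b0")
for label, path in outputs.items():
    print(label, path)
```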

**epochs**: number of passes over the training/validation data.

**batch_size**: batch size, i.e. how many images are processed together in one batch. You can decrease this parameter if you run into memory issues, but it must be a multiple of 16.

**file_ext**: image file extension. Tifffile is used to read the images, so I strongly suggest using '.tif' (PNG was not tested).


### Other parameters

String inputs for the validation image and mask folders and the test image and mask folders: you can supply these if you have already split your images. *The sum of validation and test images must be below 80% of all images.*

**early_stopping**: False by default. When enabled, it stops the training if there is no major change in the validation BinaryIoU metric.

**img_size**: default is 256. This is the size of the image patches the model will be trained on.



