# Segmentation Task

## Installation and Setup
For installation and repo setup instructions, refer to the [Installation Notebook](000_install.ipynb).

In [1]:
import os
from utils.io import setup_repo

# Setup repo and checkout to the branch with the tutorials
setup_repo(
    git_url="https://github.com/openvinotoolkit/training_extensions.git",
    branch='tutorials/cvpr24',
)
os.getcwd()

'/home/sakcay/projects/training_extensions'

The above code sets up the repo and changes the working directory to the repo root, so we have access to all the files and folders in the repo.

## Prepare the Data

The first step is to prepare the dataset. If you haven't downloaded it yet, you can do so with the following cell:

In [3]:
from notebooks.utils.download import download_dataset

download_dataset(
    url=(
        "https://github.com/openvinotoolkit/training_extensions/releases/download"
        "/fruits_and_vegetables_dataset/fruits_and_vegetables.zip"
    ),
    extract_to="data/fruits_and_vegetables"
)

The dataset is already available in data/fruits_and_vegetables


In [4]:
data_root = "./data/fruits_and_vegetables"
work_dir = "./otx-workspace-seg"
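Before training, it can help to confirm that the dataset was actually extracted into `data_root`. The snippet below is a minimal sketch that counts files by extension; the helper name `count_files_by_suffix` is hypothetical and not part of OTX, and it makes no assumption about the folder layout inside the archive.

```python
from collections import Counter
from pathlib import Path


def count_files_by_suffix(root: str) -> Counter:
    """Count all files under root (recursively), grouped by lowercase file extension."""
    return Counter(p.suffix.lower() for p in Path(root).rglob("*") if p.is_file())


# Same value as data_root in the cell above.
dataset_dir = "./data/fruits_and_vegetables"
if Path(dataset_dir).is_dir():
    for suffix, count in sorted(count_files_by_suffix(dataset_dir).items()):
        print(f"{suffix or '<no extension>'}: {count}")
else:
    print(f"{dataset_dir} not found; run the download cell first.")
```

An empty result or a missing directory usually means the download or extraction step has not run yet.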

## Training with OTX Recipes
With the data in place, the next step is to train a model using OTX recipes, which are available in the `src/otx/recipe` folder. The recipes are `.yaml` files that can be used to train a model with the `otx` library.

These recipes are predefined by OTX and have been validated and tested across many different use cases.
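Since recipes are plain `.yaml` files on disk, they can also be inspected directly. The following is a minimal sketch that lists the recipe files in a folder; the helper `list_recipe_files` is hypothetical, and the directory is assumed from the recipe path used later in this notebook.

```python
from pathlib import Path


def list_recipe_files(recipe_dir: str) -> list[str]:
    """Return the sorted names of the .yaml files directly inside recipe_dir."""
    return sorted(p.name for p in Path(recipe_dir).glob("*.yaml"))


# Assumed location, taken from the recipe path used later in this notebook.
recipe_dir = "src/otx/recipe/semantic_segmentation"
if Path(recipe_dir).is_dir():
    for name in list_recipe_files(recipe_dir):
        print(name)
```

This only looks at file names; it does not validate the recipe contents.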

Let's list the available recipes for the `SEMANTIC_SEGMENTATION` task.

In [5]:
from otx.engine.utils.api import list_models

available_models = list_models(task="SEMANTIC_SEGMENTATION", print_table=True)

As we can see from the output of the above cell, there are 8 recipes available for the `SEMANTIC_SEGMENTATION` task. We can use any of these recipes to train a model. In this example, we will use the `litehrnet_18.yaml` recipe to quickly train a model.

In [6]:
from otx.engine import Engine

recipe = "src/otx/recipe/semantic_segmentation/litehrnet_18.yaml"
override_dataset_format = {"data.config.data_format": "datumaro"}

engine = Engine.from_config(
    config_path=recipe,
    data_root=data_root,
    work_dir=work_dir,
    **override_dataset_format,
)
engine.train(max_epochs=30)

  dataset = pre_filtering(dataset, self.config.data_format, self.config.unannotated_items_ratio)
Downloading: "https://storage.openvinotoolkit.org/repositories/openvino_training_extensions/models/custom_semantic_segmentation/litehrnet18_imagenet1k_rsc.pth" to /home/sakcay/.cache/torch/hub/checkpoints/litehrnet18_imagenet1k_rsc.pth

unexpected key in source state_dict: increase_modules.0.conv1.weight, increase_modules.0.bn1.weight, increase_modules.0.bn1.bias, increase_modules.0.bn1.running_mean, increase_modules.0.bn1.running_var, increase_modules.0.bn1.num_batches_tracked, increase_modules.0.conv2.weight, increase_modules.0.bn2.weight, increase_modules.0.bn2.bias, increase_modules.0.bn2.running_mean, increase_modules.0.bn2.running_var, increase_modules.0.bn2.num_batches_tracked, increase_modules.0.conv3.weight, increase_modules.0.bn3.weight, increase_modules.0.bn3.bias, increase_modules.0.bn3.running_mean, increase_modules.0.bn3.running_var, increase_modules.0.bn3.num_batches_tracked,

init weight - https://storage.openvinotoolkit.org/repositories/openvino_training_extensions/models/custom_semantic_segmentation/litehrnet18_imagenet1k_rsc.pth


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Missing logger folder: /home/sakcay/projects/training_extensions/otx-workspace-seg/csv/
/home/sakcay/.pyenv/versions/3.11.9/envs/otx/lib/python3.11/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:639: Checkpoint directory /home/sakcay/projects/training_extensions/otx-workspace-seg exists and is not empty.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]


Output()

/home/sakcay/.pyenv/versions/3.11.9/envs/otx/lib/python3.11/site-packages/lightning/pytorch/loops/fit_loop.py:293: The number of training batches (13) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


`Trainer.fit` stopped: `max_epochs=30` reached.


{'lr-Adam': tensor(0.0010),
 'lr-Adam-momentum': tensor(0.9000),
 'lr-Adam-1': tensor(0.0010),
 'lr-Adam-1-momentum': tensor(0.9000),
 'train/loss_ce_ignore': tensor(0.3017),
 'train/loss': tensor(0.3017),
 'train/data_time': tensor(0.0063),
 'train/iter_time': tensor(0.1498),
 'validation/data_time': tensor(0.0057),
 'validation/iter_time': tensor(0.0554),
 'val/Dice': tensor(0.8242),
 'val/mIoU': tensor(0.7747)}
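The metrics above are displayed as tensors. For logging or comparing runs, plain Python floats are often easier to work with. The sketch below assumes each value is a single-element tensor (or a number), which `float()` can convert; the helper name `metrics_to_floats` is hypothetical, and the example values are copied from the output above.

```python
def metrics_to_floats(metrics: dict) -> dict[str, float]:
    """Convert scalar-like metric values (e.g. single-element tensors) to plain floats."""
    return {name: float(value) for name, value in metrics.items()}


# Validation values copied from the training output above, written as plain numbers.
example = {"val/Dice": 0.8242, "val/mIoU": 0.7747}
for name, value in metrics_to_floats(example).items():
    print(f"{name}: {value:.4f}")
```

In the notebook, the same helper could be applied to the dictionary displayed after the training cell.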

## Evaluate Torch Model
Now that training is complete, we can evaluate the model on the test set.

In [7]:
engine.test()

Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]


init weight - https://storage.openvinotoolkit.org/repositories/openvino_training_extensions/models/custom_semantic_segmentation/litehrnet18_imagenet1k_rsc.pth


Output()



{'test/data_time': tensor(0.0083),
 'test/iter_time': tensor(0.0828),
 'test/Dice': tensor(0.8822),
 'test/mIoU': tensor(0.8092)}

## Export to IR Model
Once we are satisfied with the model's predictions and ready to deploy it, the next step is to export it to the OpenVINO IR (Intermediate Representation) format. This is particularly useful for improving inference speed on edge devices.

In [8]:
exported_ir_model_path = engine.export()
exported_ir_model_path

Trainer will use only 1 of 4 GPUs because it is running inside an interactive / notebook environment. You may try to set `Trainer(devices=4)` but please note that multi-GPU inside interactive / notebook environments is considered experimental and unstable. Your mileage may vary.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


init weight - https://storage.openvinotoolkit.org/repositories/openvino_training_extensions/models/custom_semantic_segmentation/litehrnet18_imagenet1k_rsc.pth


  min_size = [int(_) for _ in x[-1].size()[-2:]]
  if fuse_y.size()[-2:] != y.size()[-2:]:


PosixPath('/home/sakcay/projects/training_extensions/otx-workspace-seg/exported_model.xml')
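An OpenVINO IR model consists of an `.xml` file describing the network and a `.bin` file holding the weights. The sketch below derives the expected weights path from the returned XML path, assuming the weights sit next to the XML file with the same stem; the helper name `ir_file_pair` is hypothetical.

```python
from pathlib import Path


def ir_file_pair(xml_path) -> tuple[Path, Path]:
    """Return the (.xml, .bin) paths of an IR model, assuming both share one stem."""
    xml_path = Path(xml_path)
    return xml_path, xml_path.with_suffix(".bin")


# Relative form of the path returned by the export cell above.
xml_file, bin_file = ir_file_pair("otx-workspace-seg/exported_model.xml")
print(bin_file.name)  # exported_model.bin
for path in (xml_file, bin_file):
    print(path, "exists" if path.exists() else "missing")
```

Both files are needed when copying the exported model to another machine.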

## Evaluate IR Model
After exporting the model, we want to confirm that accuracy has not dropped significantly. Let's test the exported model.

In [6]:
engine.test(checkpoint=exported_ir_model_path)

            You can specify the value in config.
  warn(msg, stacklevel=1)
	 transforms: [{'class_path': 'torchvision.transforms.v2.ToImage'}] 
	 transform_lib_type: TORCHVISION 
	 batch_size: 64 
	 image_color_channel: RGB 
And the tiler is disabled.
  warn(msg, stacklevel=1)
Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (cuda), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/home/harimkan/workspace/repo/otx-regression/venv/lib/python3.10/site-packages/lightning/pytorch/trainer/setup.py:187: GPU available but not used. You can set it by doing `Trainer(accelerator='gpu')`.


{'test/Dice': tensor(0.8524), 'test/mIoU': tensor(0.8035)}
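To quantify the change, the test metrics of the Torch model and the IR model can be compared directly. The sketch below uses the values copied from the two test outputs in this notebook; the helper name `metric_drop` is hypothetical.

```python
# Test metrics copied from the outputs above.
torch_metrics = {"test/Dice": 0.8822, "test/mIoU": 0.8092}
ir_metrics = {"test/Dice": 0.8524, "test/mIoU": 0.8035}


def metric_drop(reference: dict, candidate: dict) -> dict[str, float]:
    """Return reference minus candidate for every metric present in both dicts."""
    return {
        name: round(reference[name] - candidate[name], 4)
        for name in reference
        if name in candidate
    }


drops = metric_drop(torch_metrics, ir_metrics)
print(drops)  # {'test/Dice': 0.0298, 'test/mIoU': 0.0057}
```

With these numbers, Dice drops by about 0.03 and mIoU by less than 0.01 after export.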