# Instance Segmentation Task

## Installation and Setup
For installation and setting up the repo, please refer to the [Installation Notebook](000_install.ipynb). 

In [1]:
import os
from utils.io import setup_repo

# Set up the repo and check out the branch with the tutorials
setup_repo(
    git_url="https://github.com/openvinotoolkit/training_extensions.git",
    branch='tutorials/cvpr24',
)
os.getcwd()

'/home/harimkan/workspace/repo/otx-regression'

The code above sets up the repo and changes the working directory to the repo root, so we have access to all the files and folders in the repo.

## Prepare the Data

The first step is to prepare the dataset. If you haven't downloaded it yet, you can do so with the following cell:

In [2]:
from notebooks.utils.download import download_dataset

download_dataset(
    url=(
        "https://github.com/openvinotoolkit/training_extensions/releases/download"
        "/fruits_and_vegetables_dataset/fruits_and_vegetables.zip"
    ),
    extract_to="data/fruits_and_vegetables",
)

The dataset is already available in data/fruits_and_vegetables


In [3]:
data_root = "./data/fruits_and_vegetables"
work_dir = "./otx-workspace-ins_seg"

## Training with OTX Recipes
The first step in this task is to train a model using OTX recipes, which are available in the `src/otx/recipe` folder. Each recipe is a `.yaml` file that the `otx` library can use to train a model.

These recipes are predefined by OTX and have been validated and tested on many different use cases.

Let's see the available recipes for `INSTANCE_SEGMENTATION` task.

In [4]:
from otx.engine.utils.api import list_models

list_models(task="INSTANCE_SEGMENTATION", print_table=True)


['openvino_model',
 'maskrcnn_swint',
 'maskrcnn_efficientnetb2b',
 'maskrcnn_efficientnetb2b_tile',
 'maskrcnn_r50_tile',
 'rtmdet_inst_tiny_tile',
 'maskrcnn_r50',
 'rtmdet_inst_tiny',
 'maskrcnn_swint_tile']

As the output of the cell above shows, there are 9 recipes available for the `INSTANCE_SEGMENTATION` task, and any of them can be used to train a model. In this example, we will use the `maskrcnn_efficientnetb2b.yaml` recipe.
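The names in that list can be grouped and turned into recipe paths. The following is a minimal sketch, assuming each name maps to `src/otx/recipe/instance_segmentation/<name>.yaml` (the pattern used for `maskrcnn_efficientnetb2b` in this notebook; it may not hold for every entry, for example `openvino_model`). The helpers `split_by_tiling` and `recipe_path` are hypothetical and not part of OTX.

```python
from pathlib import PurePosixPath

# Model names copied from the list_models output shown above.
MODEL_NAMES = [
    "openvino_model",
    "maskrcnn_swint",
    "maskrcnn_efficientnetb2b",
    "maskrcnn_efficientnetb2b_tile",
    "maskrcnn_r50_tile",
    "rtmdet_inst_tiny_tile",
    "maskrcnn_r50",
    "rtmdet_inst_tiny",
    "maskrcnn_swint_tile",
]

# Assumption: recipes live in this folder, as in the path used in this notebook.
RECIPE_DIR = PurePosixPath("src/otx/recipe/instance_segmentation")


def split_by_tiling(names):
    """Separate names ending in '_tile' from the rest (hypothetical helper)."""
    tiled = sorted(n for n in names if n.endswith("_tile"))
    plain = sorted(n for n in names if not n.endswith("_tile"))
    return plain, tiled


def recipe_path(name):
    """Build a candidate recipe path for a model name (hypothetical helper)."""
    return str(RECIPE_DIR / f"{name}.yaml")


plain_recipes, tiled_recipes = split_by_tiling(MODEL_NAMES)
print("plain:", plain_recipes)
print("tiled:", tiled_recipes)
print(recipe_path("maskrcnn_efficientnetb2b"))
```

Check that the resulting path exists in your checkout before passing it to `Engine.from_config`.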

In [5]:
from otx.engine import Engine

recipe = "src/otx/recipe/instance_segmentation/maskrcnn_efficientnetb2b.yaml"
override_dataset_format = {"data.config.data_format": "datumaro"}

engine = Engine.from_config(config_path=recipe, data_root=data_root, work_dir=work_dir, **override_dataset_format)
engine.train(max_epochs=20)


Init model efficientnet_b2b, pretrained=True, models cache /home/harimkan/.torch/models
Init model efficientnet_b2b, pretrained=True, models cache /home/harimkan/.torch/models
Init model efficientnet_b2b, pretrained=True, models cache /home/harimkan/.torch/models



size mismatch for roi_head.bbox_head.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([11, 1024]).
size mismatch for roi_head.bbox_head.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([11]).
size mismatch for roi_head.bbox_head.fc_reg.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([40, 1024]).
size mismatch for roi_head.bbox_head.fc_reg.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([40]).
size mismatch for roi_head.mask_head.conv_logits.weight: copying a param with shape torch.Size([80, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([10, 80, 1, 1]).
size mismatch for roi_head.mask_head.conv_logits.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is t



/home/harimkan/workspace/repo/otx-regression/venv/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:298: The number of training batches (23) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


`Trainer.fit` stopped: `max_epochs=20` reached.


{'lr-SGD': tensor(0.0070),
 'lr-SGD-momentum': tensor(0.9000),
 'lr-SGD-1': tensor(0.0070),
 'lr-SGD-1-momentum': tensor(0.9000),
 'train/loss_rpn_cls': tensor(0.0009),
 'train/loss_rpn_bbox': tensor(0.0054),
 'train/loss_cls': tensor(0.0154),
 'train/loss_bbox': tensor(0.0755),
 'train/loss_mask': tensor(0.0336),
 'train/loss': tensor(0.1308),
 'train/data_time': tensor(0.0095),
 'train/iter_time': tensor(0.2578),
 'validation/data_time': tensor(0.0020),
 'validation/iter_time': tensor(0.0360),
 'val/map': tensor(0.9325),
 'val/map_50': tensor(1.),
 'val/map_75': tensor(0.9770),
 'val/map_small': tensor(-1.),
 'val/map_medium': tensor(0.9744),
 'val/map_large': tensor(0.9224),
 'val/mar_1': tensor(0.5688),
 'val/mar_10': tensor(0.9538),
 'val/mar_100': tensor(0.9548),
 'val/mar_small': tensor(-1.),
 'val/mar_medium': tensor(0.9833),
 'val/mar_large': tensor(0.9470),
 'val/map_per_class': tensor(-1.),
 'val/mar_100_per_class': tensor(-1.),
 'val/f1-score': tensor(0.9984)}

The output shows that the recipe was loaded successfully and that training ran for the full 20 epochs. The model is saved in the directory given by `work_dir`, which is `./otx-workspace-ins_seg` in this case. You can browse that directory to see the saved model and other files.
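One way to browse the workspace programmatically is to search it for checkpoint files. This is a minimal sketch, assuming checkpoints use the `.ckpt` extension (the Lightning default) and may sit in nested run folders; the exact layout of the OTX workspace is not shown in this notebook, and `find_checkpoints` is a hypothetical helper.

```python
from pathlib import Path


def find_checkpoints(work_dir, pattern="*.ckpt"):
    """Return files matching `pattern` anywhere under work_dir, newest first."""
    root = Path(work_dir)
    if not root.is_dir():
        # Nothing has been trained into this directory yet.
        return []
    return sorted(root.rglob(pattern), key=lambda p: p.stat().st_mtime, reverse=True)


# Same value as the work_dir variable defined earlier in this notebook.
for ckpt in find_checkpoints("./otx-workspace-ins_seg"):
    print(ckpt)
```

If nothing is printed, either training has not run yet or the checkpoints use a different extension than assumed here.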

## Evaluate Torch Model
Now that the model is trained, we can measure its performance with the `test` entrypoint of `Engine`, which evaluates the model on the test dataset and returns the metrics.

In [6]:
engine.test()

Init model efficientnet_b2b, pretrained=True, models cache /home/harimkan/.torch/models
Init model efficientnet_b2b, pretrained=True, models cache /home/harimkan/.torch/models
Init model efficientnet_b2b, pretrained=True, models cache /home/harimkan/.torch/models


Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]


{'test/data_time': tensor(0.0020),
 'test/iter_time': tensor(0.0344),
 'test/map': tensor(0.8623),
 'test/map_50': tensor(0.9528),
 'test/map_75': tensor(0.9277),
 'test/map_small': tensor(0.),
 'test/map_medium': tensor(0.9689),
 'test/map_large': tensor(0.8329),
 'test/mar_1': tensor(0.4997),
 'test/mar_10': tensor(0.8911),
 'test/mar_100': tensor(0.8926),
 'test/mar_small': tensor(0.),
 'test/mar_medium': tensor(0.9766),
 'test/mar_large': tensor(0.8707),
 'test/map_per_class': tensor(-1.),
 'test/mar_100_per_class': tensor(-1.),
 'test/f1-score': tensor(0.9845)}
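To see how far the test metrics are from the validation metrics reported at the end of training, the two dictionaries can be compared key by key. This is a minimal sketch with hypothetical helpers (`to_floats`, `metric_gaps`); it uses values copied from the outputs above as plain floats, and relies on `float()` also accepting single-element tensors such as those returned by `engine.train` and `engine.test`.

```python
def to_floats(metrics):
    """Convert metric values to floats and drop the 'val/' or 'test/' prefix."""
    return {key.split("/", 1)[-1]: float(value) for key, value in metrics.items()}


def metric_gaps(val_metrics, test_metrics, keys=("map", "map_50", "map_75", "f1-score")):
    """Return test minus validation for each key present in both dictionaries."""
    val, test = to_floats(val_metrics), to_floats(test_metrics)
    return {k: round(test[k] - val[k], 4) for k in keys if k in val and k in test}


# Values copied from the training and test outputs in this notebook.
val_metrics = {"val/map": 0.9325, "val/map_50": 1.0, "val/map_75": 0.9770, "val/f1-score": 0.9984}
test_metrics = {"test/map": 0.8623, "test/map_50": 0.9528, "test/map_75": 0.9277, "test/f1-score": 0.9845}

for name, gap in metric_gaps(val_metrics, test_metrics).items():
    print(f"{name}: {gap:+.4f}")
```

Negative gaps mean the model scores lower on the test split than on the validation split, which is the case for all four metrics here.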

## Explain Torch Model

In [None]:
from otx.core.config.explain import ExplainConfig

engine.explain(explain_config=ExplainConfig(postprocess=True), dump=True)

In [None]:
from PIL import Image
from IPython.display import display

origin_img = Image.open(f'{data_root}/images/test/my_photo-1 - Copy - Copy - Copy.jpg')
saliency_map_img = Image.open(f'{work_dir}/saliency_map/my_photo_1___Copy___Copy___Copy_class_1_saliency_map.png')
overlay_img = Image.open(f'{work_dir}/saliency_map/my_photo_1___Copy___Copy___Copy_class_1_overlay.png')

display(origin_img)
display(saliency_map_img)
display(overlay_img)
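The saliency map file names in the cell above differ from the source image name: spaces and hyphens appear as underscores, followed by a class index and a suffix. The sketch below reproduces that naming so the paths do not have to be typed by hand. The rule is inferred only from the file names in this notebook, not from the OTX source, and `saliency_paths` is a hypothetical helper.

```python
import re
from pathlib import Path


def saliency_paths(image_path, work_dir, class_id):
    """Guess the saliency map and overlay paths for one image and class.

    Assumption: every character in the image stem other than letters,
    digits, and underscores is replaced with an underscore.
    """
    stem = re.sub(r"[^A-Za-z0-9_]", "_", Path(image_path).stem)
    out_dir = Path(work_dir) / "saliency_map"
    saliency = out_dir / f"{stem}_class_{class_id}_saliency_map.png"
    overlay = out_dir / f"{stem}_class_{class_id}_overlay.png"
    return saliency, overlay


saliency, overlay = saliency_paths(
    "./data/fruits_and_vegetables/images/test/my_photo-1 - Copy - Copy - Copy.jpg",
    "./otx-workspace-ins_seg",
    class_id=1,
)
print(saliency)
print(overlay)
```

If a guessed path does not exist, list the `saliency_map` folder to see the names that were actually written.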

## Export to IR Model

In [None]:
exported_ir_model_path = engine.export()
exported_ir_model_path

## Evaluate IR Model

In [None]:
engine.test(checkpoint=exported_ir_model_path)

## Explain IR Model

In [None]:
from otx.core.config.explain import ExplainConfig

engine.explain(
    checkpoint=exported_ir_model_path,
    explain_config=ExplainConfig(postprocess=True),
    dump=True,
)

In [None]:
from PIL import Image
from IPython.display import display

origin_img = Image.open(f'{data_root}/images/test/my_photo-1 - Copy - Copy - Copy.jpg')
saliency_map_img = Image.open(f'{work_dir}/saliency_map/my_photo_1___Copy___Copy___Copy_class_1_saliency_map.png')
overlay_img = Image.open(f'{work_dir}/saliency_map/my_photo_1___Copy___Copy___Copy_class_1_overlay.png')

display(origin_img)
display(saliency_map_img)
display(overlay_img)