# OTX API Usage for Advanced API Users

### Requirements
R3. Advanced-level Python API users provide a dataset and a model type (`str`) as input, and OTX provides the workflow.


In [1]:
data_root = "../../tests/assets/classification_dataset"

In [2]:
from otx.engine import Engine

engine = Engine(data_root=data_root)



In [3]:
engine.list_models()

['efficientnet_v2_light',
 'efficientnet_b0_light',
 'otx_mobilenet_v3_large',
 'otx_deit_tiny',
 'otx_efficientnet_v2',
 'mobilenet_v3_large_light',
 'otx_efficientnet_b0',
 'otx_dino_v2',
 'otx_dino_v2_linear_probe']

In [4]:
engine.model = "efficientnet_b0_light"

init weight - https://github.com/osmr/imgclsmob/releases/download/v0.0.364/efficientnet_b0-0752-0e386130.pth.zip
01/03 15:16:33 - mmengine - INFO -
backbone.features.init_block.conv.conv.weight - torch.Size([32, 3, 3, 3]):
The value is the same before and after calling `init_weights` of ImageClassifier

01/03 15:16:33 - mmengine - INFO -
backbone.features.init_block.conv.bn.weight - torch.Size([32]):
The value is the same before and after calling `init_weights` of ImageClassifier

01/03 15:16:33 - mmengine - INFO -
backbone.features.init_block.conv.bn.bias - torch.Size([32]):
The value is the same before and after calling `init_weights` of ImageClassifier

01/03 15:16:33 - mmengine - INFO -
backbone.features.stage1.unit1.dw_conv.conv.weight - torch.Size([32, 1, 3, 3]):
The value is the same before and after calling `init_weights` of ImageClassifier

01/03 15:16:33 - mmengine - INFO -
backbone.features

In [5]:
engine.train()

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1)` was configured so validation will run after every batch.
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name        | Type                      | Params
----------------------------------------------------------
0 | model       | MMPretrainCompatibleModel | 5.3 M 
1 | val_metric  | MulticlassAccuracy        | 0     
2 | test_metric | MulticlassAccuracy        | 0     
----------------------------------------------------------
5.3 M     Trainable 

                                                                           

/home/harimkan/workspace/repo/otx-fork/venv/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:293: The number of training batches (1) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


Epoch 9: 100%|██████████| 1/1 [00:00<00:00,  9.39it/s, v_num=8, train/loss=0.0176, val/accuracy=0.560]

`Trainer.fit` stopped: `max_epochs=10` reached.


Epoch 9: 100%|██████████| 1/1 [00:00<00:00,  5.71it/s, v_num=8, train/loss=0.0176, val/accuracy=0.560]


{'train/loss': tensor(0.0176), 'val/accuracy': tensor(0.5600)}


R4. Advanced-level Python API users provide a dataset and a model (`nn.Module`) as input, and OTX provides the workflow.

In [6]:
import torch
from torch import nn
from torchvision.models.resnet import ResNet50_Weights, resnet50


class ResNet50WithLossComputation(nn.Module):
    def __init__(self, num_classes: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        net = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
        net.fc = nn.Linear(
            in_features=net.fc.in_features, out_features=self.num_classes
        )
        self.net = net
        self.softmax = nn.Softmax(dim=-1)
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, images: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        logits = self.net(images)

        if self.training:
            return self.criterion(logits, labels)

        return self.softmax(logits)
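
As a worked example of the two return paths in `forward`, the following sketch reproduces by hand what `nn.Softmax` and `nn.CrossEntropyLoss` compute for a single sample. The logits and label are made-up values for illustration, not numbers from this notebook, and the helper names are hypothetical.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Turn raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: list[float], label: int) -> float:
    """Negative log-probability of the true class for one sample."""
    return -math.log(softmax(logits)[label])


logits = [2.0, 0.0]  # hypothetical two-class logits
label = 0            # hypothetical ground-truth class index

probs = softmax(logits)              # what the evaluation branch would return
loss = cross_entropy(logits, label)  # what the training branch would return

print(probs, loss)
```

The training branch yields a single scalar to minimize, while the evaluation branch yields one probability per class, which is why the wrapper in the next cell has to handle the two output shapes separately.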


In [7]:
from typing import Any

import torch
from lightning.pytorch.cli import ReduceLROnPlateau
from otx.core.data.entity.base import OTXBatchLossEntity
from otx.core.data.entity.classification import (
    MulticlassClsBatchDataEntity,
    MulticlassClsBatchPredEntity,
)
from otx.core.model.entity.classification import OTXClassificationModel
from otx.core.model.module.classification import OTXClassificationLitModule
from torch import nn


class OTXResNet50(OTXClassificationModel):
    def __init__(self, num_classes: int) -> None:
        # Set before super().__init__() because _create_model() reads it.
        self.num_classes = num_classes
        super().__init__()
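        # ImageNet channel mean and std on the 0-255 pixel scale, shaped (3, 1, 1)
        # so they broadcast over (N, 3, H, W) batches; not saved in the state dict.
        self.register_buffer(
            "mean",
            torch.FloatTensor([123.675, 116.28, 103.53]).view(-1, 1, 1),
            persistent=False,
        )
        self.register_buffer(
            "std",
            torch.FloatTensor([58.395, 57.12, 57.375]).view(-1, 1, 1),
            persistent=False,
        )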

    def _create_model(self) -> nn.Module:
        # ResNet50_Weights.IMAGENET1K_V2 is a strong pretrained checkpoint produced with a modern training recipe:
        # ImageNet-1K acc@1: 80.858, acc@5: 95.434.
        return ResNet50WithLossComputation(num_classes=self.num_classes)

    def _customize_inputs(self, inputs: MulticlassClsBatchDataEntity) -> dict[str, Any]:
        images = torch.stack(inputs.images, dim=0).to(dtype=torch.float32)
        images = (images - self.mean) / self.std
        return {
            "images": images,
            "labels": torch.cat(inputs.labels, dim=0),
        }

    def _customize_outputs(
        self, outputs: Any, inputs: MulticlassClsBatchDataEntity
    ) -> MulticlassClsBatchPredEntity | OTXBatchLossEntity:
        if self.training:
            return {"loss": outputs}

        # To list, batch-wise
        scores = torch.unbind(outputs, 0)

        return MulticlassClsBatchPredEntity(
            batch_size=inputs.batch_size,
            images=inputs.images,
            imgs_info=inputs.imgs_info,
            scores=scores,
            labels=inputs.labels,
        )

lightning_module = OTXClassificationLitModule(
    otx_model=OTXResNet50(num_classes=2),
    torch_compile=False,
    optimizer=lambda p: torch.optim.SGD(p, lr=0.0049, momentum=0.9, weight_decay=0.0001),
    scheduler=lambda o: ReduceLROnPlateau(o, patience=1, factor=0.5, monitor="train/loss"),
)
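
The `mean` and `std` buffers in `OTXResNet50` are the commonly used ImageNet channel statistics (0.485, 0.456, 0.406 and 0.229, 0.224, 0.225) scaled from the [0, 1] range to the [0, 255] pixel range. The following pure-Python sketch shows the per-channel arithmetic that `_customize_inputs` applies to every pixel. The pixel value and the helper name `normalize_pixel` are hypothetical.

```python
MEAN = [123.675, 116.28, 103.53]  # per-channel mean on the 0-255 scale
STD = [58.395, 57.12, 57.375]     # per-channel std on the 0-255 scale


def normalize_pixel(rgb: list[float]) -> list[float]:
    """Apply (value - mean) / std independently to each channel."""
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]


# The same statistics expressed on the 0-1 scale.
mean_unit = [round(m / 255, 3) for m in MEAN]
std_unit = [round(s / 255, 3) for s in STD]

pixel = [255.0, 116.28, 0.0]  # hypothetical RGB pixel
normalized = normalize_pixel(pixel)

print(mean_unit, std_unit, normalized)
```

A channel value equal to its mean maps to zero, brighter values map to positive numbers, and darker values map to negative numbers.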

In [8]:
from otx.engine import Engine

engine = Engine(
    data_root=data_root,
    work_dir="./otx-workspace",
    device="gpu",
)

In [9]:
engine.train(
    model=lightning_module,
)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1)` was configured so validation will run after every batch.
/home/harimkan/workspace/repo/otx-fork/venv/lib/python3.10/site-packages/lightning/pytorch/utilities/parsing.py:43: attribute 'optimizer' removed from hparams because it cannot be pickled
/home/harimkan/workspace/repo/otx-fork/venv/lib/python3.10/site-packages/lightning/pytorch/utilities/parsing.py:43: attribute 'scheduler' removed from hparams because it cannot be pickled
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name        | Type               | Params
---------------------------------------------------
0 | model       | OTXResNet50        | 23.5 M
1 | val_metric  | MulticlassAccuracy | 0     
2 | test_metric | MulticlassAccuracy | 0     
---------------------------------------------------
23.5 M    Trainable params
0         Non-trainable

Epoch 9: 100%|██████████| 1/1 [00:00<00:00,  7.99it/s, v_num=9, train/loss=0.360, val/accuracy=1.000]

`Trainer.fit` stopped: `max_epochs=10` reached.


Epoch 9: 100%|██████████| 1/1 [00:00<00:00,  2.82it/s, v_num=9, train/loss=0.360, val/accuracy=1.000]


{'train/loss': tensor(0.3597), 'val/accuracy': tensor(1.)}
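
The scheduler passed to `OTXClassificationLitModule` uses `patience=1` and `factor=0.5`, so it halves the learning rate when `train/loss` stops improving. The sketch below is a simplified pure-Python re-implementation of the plateau rule in `torch.optim.lr_scheduler.ReduceLROnPlateau` (minimization mode, default relative threshold of 1e-4, no cooldown and no minimum learning rate). It is not Lightning's or OTX's actual code, the class name `PlateauHalver` is hypothetical, and the loss sequence is invented for illustration.

```python
class PlateauHalver:
    """Simplified ReduceLROnPlateau: min mode, relative threshold, no cooldown."""

    def __init__(self, lr: float, factor: float = 0.5, patience: int = 1,
                 threshold: float = 1e-4) -> None:
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.threshold = threshold
        self.best = float("inf")
        self.num_bad_steps = 0

    def step(self, metric: float) -> float:
        """Record one monitored value and return the (possibly reduced) lr."""
        if metric < self.best * (1.0 - self.threshold):
            self.best = metric
            self.num_bad_steps = 0
        else:
            self.num_bad_steps += 1
        # Reduce only after more than `patience` consecutive non-improving steps.
        if self.num_bad_steps > self.patience:
            self.lr *= self.factor
            self.num_bad_steps = 0
        return self.lr


scheduler = PlateauHalver(lr=0.0049, factor=0.5, patience=1)
losses = [1.0, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7]  # hypothetical train/loss values
history = [scheduler.step(loss) for loss in losses]
print(history)
```

With these made-up losses, the learning rate stays at 0.0049 for three steps, drops to 0.00245 after the second consecutive step without improvement, and drops again to 0.001225 at the end.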

The user provides both the datamodule and the model.

In [10]:
from otx.core.config.data import DataModuleConfig, SubsetConfig
from otx.core.data.module import OTXDataModule

task = "MULTI_CLASS_CLS"

train_transform = [
    {"type": "LoadImageFromFile"},
    {"type": "RandomResizedCrop", "scale": 224, "backend": "cv2"},
    {"type": "PackInputs"},
]
val_transform = [
    {"type": "LoadImageFromFile"},
    {"type": "ResizeEdge", "scale": 256, "edge": "short", "backend": "cv2"},
    {"type": "PackInputs"},
]

datamodule = OTXDataModule(
    task=task,
    config=DataModuleConfig(
        data_format="imagenet_with_subset_dirs",
        data_root=data_root,
        train_subset=SubsetConfig(
            batch_size=2,
            subset_name="train",
            transform_lib_type="MMPRETRAIN",
            transforms=train_transform,
        ),
        val_subset=SubsetConfig(
            batch_size=1,
            subset_name="val",
            transform_lib_type="MMPRETRAIN",
            transforms=val_transform,
        ),
        test_subset=SubsetConfig(
            batch_size=1,
            subset_name="test",
            transform_lib_type="MMPRETRAIN",
            transforms=val_transform,
        ),
    ),
)


In [11]:
engine.train(
    model=lightning_module,
    datamodule=datamodule,
)

/home/harimkan/workspace/repo/otx-fork/venv/lib/python3.10/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:639: Checkpoint directory ./otx-workspace/lightning_logs/version_9/checkpoints exists and is not empty.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name        | Type               | Params
---------------------------------------------------
0 | model       | OTXResNet50        | 23.5 M
1 | val_metric  | MulticlassAccuracy | 0     
2 | test_metric | MulticlassAccuracy | 0     
---------------------------------------------------
23.5 M    Trainable params
0         Non-trainable params
23.5 M    Total params
94.049    Total estimated model params size (MB)


                                                                           

/home/harimkan/workspace/repo/otx-fork/venv/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:293: The number of training batches (13) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
`Trainer.fit` stopped: `max_epochs=10` reached.


{}
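
The fit-loop warnings report 1 training batch in the earlier runs and 13 with this datamodule, whose `train_subset` uses `batch_size=2`. Assuming the dataloader keeps the last partial batch (PyTorch's `drop_last=False` default), the number of batches per epoch is the ceiling of the sample count divided by the batch size. The sketch below works backwards from 13 batches; the actual number of training images is not stated in this notebook, and `num_batches` is a hypothetical helper.

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Batches per epoch for a dataset of `num_samples` items."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


# Which training-set sizes yield 13 batches at batch_size=2?
candidates = [n for n in range(1, 101) if num_batches(n, batch_size=2) == 13]
print(candidates)
```

Under that assumption the training subset holds 25 or 26 images. Either way, 13 batches per epoch is still below the default `log_every_n_steps=50`, which is what triggers the warning.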

R5. Advanced-level Python API users can customize training with any of the trainer parameters available in Lightning.

In [12]:
engine.train(
    model=lightning_module,
    datamodule=datamodule,
    max_epochs=3,
    precision="16",
)

/home/harimkan/workspace/repo/otx-fork/venv/lib/python3.10/site-packages/lightning/fabric/connector.py:565: `precision=16` is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1)` was configured so validation will run after every batch.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name        | Type               | Params
---------------------------------------------------
0 | model       | OTXResNet50        | 23.5 M
1 | val_metric  | MulticlassAccuracy | 0     
2 | test_metric | MulticlassAccuracy | 0     
---------------------------------------------------
23.5 M    Trainable params
0         Non-trainable params
23.5 M    Total params
94.049    Total estimated model params size (MB)


Epoch 2: 100%|██████████| 13/13 [00:02<00:00,  5.26it/s, v_num=10, train/loss=0.284, val/accuracy=1.000] 

`Trainer.fit` stopped: `max_epochs=3` reached.


Epoch 2: 100%|██████████| 13/13 [00:02<00:00,  4.81it/s, v_num=10, train/loss=0.284, val/accuracy=1.000]


{'train/loss': tensor(0.2844), 'val/accuracy': tensor(1.)}