# OTX API DEMO (MMPretrain Example)

## Customization Training API

Select a framework and import its adapter modules.

This demo uses MMPretrain.

Environment:

- mmpretrain-1.0.0rc8
- mmcv-2.0.1
- mmengine-0.7.4
- mmdeploy-1.2.0

## Prepare Dataset & DataLoader
1. Prepare a dataset and pass its paths to `Dataset`.

    - OTX uses Datumaro to read the data at each path and convert it to OTX's DatasetEntity and LabelSchema (path -> Datumaro -> OTX DatasetEntity & LabelSchema).

In [3]:
from otx.v2.adapters.torch.mmengine.mmpretrain import Dataset
dataset = Dataset(
    train_data_roots="/home/harimkan/workspace/repo/otx-fork-3/tests/assets/classification_dataset_class_incremental",
    val_data_roots="/home/harimkan/workspace/repo/otx-fork-3/tests/assets/classification_dataset_class_incremental",
    test_data_roots="/home/harimkan/workspace/repo/otx-fork-3/tests/assets/classification_dataset_class_incremental",
)

2-1. Build a Torch dataset from an MMCV config (file path or dict) -> torch.utils.data.Dataset

    - Users can build the dataset from a config file or a dictionary.

In [4]:
train_dataloader = dataset.train_dataloader()
print(f"DataLoader type: {type(train_dataloader)}")
print(f"Length of DataLoader: {len(train_dataloader)}")
print(f"Dataset size: {len(train_dataloader.dataset)}")
print(f"Number of classes: {dataset.num_classes}")

[*] Detected dataset format: imagenet
[*] Detected task type: CLASSIFICATION
2023-07-06 14:53:00,567 | INFO : Try to create a 0 size memory pool.
DataLoader type: <class 'torch.utils.data.dataloader.DataLoader'>
Length of DataLoader: 32
Dataset size: 32
Number of classes: 3
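
The log above reports the `imagenet` dataset format. In Datumaro, that name refers to a folder-per-class layout, where each sub-directory of the data root holds the images of one label. The sketch below is a standalone illustration of that layout and is not Datumaro or OTX code: the helpers `list_classes` and `count_images`, the class names, and the file names are all hypothetical.

```python
import tempfile
from pathlib import Path


def list_classes(data_root: Path) -> list:
    """Return sorted class names, one per sub-directory of data_root."""
    return sorted(p.name for p in data_root.iterdir() if p.is_dir())


def count_images(data_root: Path, suffixes=(".jpg", ".jpeg", ".png")) -> dict:
    """Count image files inside each class directory."""
    return {
        name: sum(
            1 for f in (data_root / name).iterdir() if f.suffix.lower() in suffixes
        )
        for name in list_classes(data_root)
    }


# Build a throwaway folder-per-class tree (hypothetical class names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for class_name, n_images in {"class_a": 2, "class_b": 3, "class_c": 1}.items():
        (root / class_name).mkdir()
        for i in range(n_images):
            (root / class_name / f"{i}.jpg").touch()

    classes = list_classes(root)
    counts = count_images(root)

print(classes)
print(counts)
```

With a layout like this, the number of class directories is what a loader would report as the number of classes.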


In [5]:
# Customize batch_size
train_dataloader = dataset.train_dataloader(batch_size=2)
print(f"DataLoader type: {type(train_dataloader)}")
print(f"Length of DataLoader: {len(train_dataloader)}")
print(f"Dataset size: {len(train_dataloader.dataset)}")
print(f"Number of classes: {train_dataloader.dataset.num_classes}")

DataLoader type: <class 'torch.utils.data.dataloader.DataLoader'>
Length of DataLoader: 16
Dataset size: 32
Number of classes: 3
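
The two outputs above fit the usual PyTorch rule for map-style datasets: the length of a `DataLoader` is the dataset size divided by the batch size, rounded up unless `drop_last` is set. Under that rule, a length of 32 for 32 samples corresponds to an effective batch size of 1, and `batch_size=2` halves it to 16. The helper `num_batches` below is a hypothetical name used only to show the arithmetic; it assumes the default batch sampler.

```python
import math


def num_batches(dataset_size: int, batch_size: int, drop_last: bool = False) -> int:
    """Batches per epoch for a map-style dataset with the default batch sampler."""
    if drop_last:
        # Incomplete final batch is discarded.
        return dataset_size // batch_size
    # Incomplete final batch is kept.
    return math.ceil(dataset_size / batch_size)


print(num_batches(32, 1))                  # 32
print(num_batches(32, 2))                  # 16
print(num_batches(33, 2))                  # 17
print(num_batches(33, 2, drop_last=True))  # 16
```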


## Prepare Model
### Model provided by OTX

In [6]:
# OTX custom model
from otx.v2.adapters.torch.mmengine.mmpretrain import build_model_from_config
otx_model = build_model_from_config(
    config="/home/harimkan/workspace/repo/otx-fork-3/src/otx/v2/configs/classification/otx_efficientnet_b0.yaml",
    num_classes=dataset.num_classes
)
print(f"Model type: {type(otx_model)}")

2023-07-06 14:53:17,450 | INFO : init weight - https://github.com/osmr/imgclsmob/releases/download/v0.0.364/efficientnet_b0-0752-0e386130.pth.zip
2023-07-06 14:53:17,474 | INFO : 'in_channels' config in model.head is updated from -1 to 1280
2023-07-06 14:53:17,538 | INFO : init weight - https://github.com/osmr/imgclsmob/releases/download/v0.0.364/efficientnet_b0-0752-0e386130.pth.zip
Model type: <class 'otx.v2.adapters.torch.mmengine.mmpretrain.modules.models.classifiers.sam_classifier.SAMImageClassifier'>
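
The log line about `in_channels` being updated from -1 to 1280 suggests a placeholder pattern: the config leaves a head parameter unset, and the builder fills it in once the backbone's output width is known. The sketch below illustrates that pattern with plain dictionaries. It is an assumption about the general idea, not OTX's implementation: `resolve_head`, `PLACEHOLDER`, and the config keys are hypothetical, and only the values -1, 1280, and 3 come from the outputs above.

```python
from copy import deepcopy

PLACEHOLDER = -1  # sentinel meaning "fill this in from the backbone"


def resolve_head(config: dict, backbone_out_channels: int, num_classes: int) -> dict:
    """Return a copy of config with the head's placeholders resolved."""
    resolved = deepcopy(config)  # leave the caller's config untouched
    head = resolved["model"]["head"]
    if head.get("in_channels", PLACEHOLDER) == PLACEHOLDER:
        head["in_channels"] = backbone_out_channels
    head["num_classes"] = num_classes  # override with the dataset's class count
    return resolved


# Hypothetical config fragment; the key names are illustrative only.
config = {
    "model": {
        "head": {"type": "LinearClsHead", "in_channels": -1, "num_classes": 1000}
    }
}
resolved = resolve_head(config, backbone_out_channels=1280, num_classes=3)
print(resolved["model"]["head"])
```

Copying the config before editing keeps the original reusable for another dataset with a different class count.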


### Model provided by mmpretrain

In [7]:
# mmpretrain's pre-defined model
from mmpretrain import get_model
mmpretrain_model = get_model("resnet18_8xb32_in1k")
print(f"Model type: {type(mmpretrain_model)}")

Model type: <class 'mmpretrain.models.classifiers.image.ImageClassifier'>


## Training

OTX exposes each framework's training through an Engine class.

- The engine requires a model and DataLoaders built for that framework.
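
To make that contract concrete, here is a framework-free sketch of a `train(model, train_dataloader, max_epochs)` call: the caller supplies a model and an iterable of batches, and the loop visits every batch once per epoch. This is a toy under stated assumptions and does not show how `MMPTEngine` is implemented; `MeanEstimator`, `train_step`, and the data are hypothetical.

```python
class MeanEstimator:
    """Toy 'model': one parameter moved toward the data mean by SGD."""

    def __init__(self, lr: float = 0.1) -> None:
        self.w = 0.0
        self.lr = lr

    def train_step(self, batch):
        # Mean squared error between w and each sample in the batch.
        loss = sum((self.w - x) ** 2 for x in batch) / len(batch)
        # Gradient of that loss with respect to w.
        grad = sum(2 * (self.w - x) for x in batch) / len(batch)
        self.w -= self.lr * grad
        return loss


def train(model, train_dataloader, max_epochs: int):
    """Run one optimization step per batch; return the mean loss of each epoch."""
    history = []
    for _ in range(max_epochs):
        losses = [model.train_step(batch) for batch in train_dataloader]
        history.append(sum(losses) / len(losses))
    return history


# A list of batches stands in for a DataLoader (batch size 2, 3 batches).
batches = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
model = MeanEstimator()
history = train(model, batches, max_epochs=5)
print(len(history), history[-1] < history[0])
```

The loop only relies on the data being iterable and the model exposing a step method, which is why the same call shape can accept either the OTX model or the MMPretrain model.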

In [8]:
from otx.v2.adapters.torch.mmengine.mmpretrain.engine import MMPTEngine

# OTX Model Training
engine = MMPTEngine()
output = engine.train(
    model=otx_model,
    train_dataloader=train_dataloader,
    work_dir="/tmp/otx-test",
    max_epochs=5
)

print(f"Output type: {type(output)}")

07/06 14:53:30 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0]
    CUDA available: True
    numpy_random_seed: 901436827
    GPU 0,1: NVIDIA GeForce RTX 3090
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.7, V11.7.64
    GCC: gcc (Ubuntu 9.5.0-1ubuntu1~22.04) 9.5.0
    PyTorch: 1.13.1+cu117
    PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=comp

In [9]:
# MMPretrain Model Training
engine = MMPTEngine()
output = engine.train(
    model=mmpretrain_model,
    train_dataloader=train_dataloader,
    work_dir="/tmp/otx-test",
    max_epochs=5
)

print(f"Output type: {type(output)}")

07/06 14:54:01 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0]
    CUDA available: True
    numpy_random_seed: 50978344
    GPU 0,1: NVIDIA GeForce RTX 3090
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.7, V11.7.64
    GCC: gcc (Ubuntu 9.5.0-1ubuntu1~22.04) 9.5.0
    PyTorch: 1.13.1+cu117
    PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compu