# Dataset Adapter

With diverse dataset structures available, ensuring compatibility with SuperGradients (SG) can be challenging. This is where the DataloaderAdapter plays a pivotal role. This tutorial takes you through the importance, implementation, and advantages of using the DataloaderAdapter with SG.

### Why Dataset Adapter?

Datasets come in a myriad of structures. However, SG requires data in a specific format.

For instance, consider the Object Detection Format:

- Image format should be: (BS, H, W, C), i.e., channel last.
- Targets should be in the format: (BS, 6), where 6 represents (sample_id, class_id, cx, cy, w, h).

The overhead of adjusting each dataset manually can be cumbersome. This is where the DataloaderAdapter comes in: it is designed to automatically understand your dataset structure and mold it for SG compatibility.

In [8]:
!pip install -q super-gradients==3.5.0


### Why Do We Need the Dataset Adapter?

While datasets come in various structures and formats, SG expects data in a specific format.


> Example: Object Detection Format
> - Image format: (BS, H, W, C) i.e. channel last
> - Targets format: (BS, 6) where 6 represents (sample_id, class_id, cx, cy, w, h).


This means that you should either use one of SuperGradients' built-in Dataset classes if it supports your dataset structure or, if your dataset is too custom for that, inherit from an SG dataset and implement all the required changes.

While this is all right in most cases, it can be cumbersome when you just want to quickly experiment with a new dataset.
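
To give a sense of that manual work, here is a minimal sketch of converting per-image COCO-style annotations into flat `(sample_id, class_id, cx, cy, w, h)` rows by hand. The helper names (`xywh_to_cxcywh`, `build_targets`) and the toy batch are hypothetical and only illustrate the idea; this is not how SuperGradients implements it internally.

```python
from typing import Dict, List, Tuple

# One target row: (sample_id, class_id, cx, cy, w, h)
Row = Tuple[int, int, float, float, float, float]


def xywh_to_cxcywh(x: float, y: float, w: float, h: float) -> Tuple[float, float, float, float]:
    """Convert (x-left, y-top, width, height) into (center-x, center-y, width, height)."""
    return x + w / 2, y + h / 2, w, h


def build_targets(batch_annotations: List[List[Dict]]) -> List[Row]:
    """Flatten per-image annotations into one list of rows, tagging each box with its image index."""
    rows: List[Row] = []
    for sample_id, annotations in enumerate(batch_annotations):
        for annotation in annotations:
            cx, cy, w, h = xywh_to_cxcywh(*annotation["bbox"])
            rows.append((sample_id, annotation["category_id"], cx, cy, w, h))
    return rows


# Toy batch of two images (hypothetical values): one box in the first image, two in the second.
batch = [
    [{"bbox": [10.0, 20.0, 30.0, 40.0], "category_id": 3}],
    [
        {"bbox": [0.0, 0.0, 8.0, 4.0], "category_id": 1},
        {"bbox": [2.0, 2.0, 2.0, 2.0], "category_id": 7},
    ],
]
targets = build_targets(batch)
print(targets)
```

Every dataset with a different annotation layout would need its own variant of this code.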

To reduce this overhead, SuperGradients introduced the concept of `DataloaderAdapter`. Instead of requiring you to write all the transformations required to use SG, the `DataloaderAdapter` will infer anything possible directly from your data. Whenever something cannot be inferred with 100% confidence, you will be asked a question with all the required context for you to properly answer.

Let's see this in practice, starting with the `SBDataset` dataset.

## Example I - Segmentation Adapter on `SBDataset` Dataset

In this section, we'll walk through the process of preparing the `SBDataset` dataset for use in SuperGradients. We'll highlight the challenges and demonstrate how the Adapter can simplify the process.


1. Preparing the Dataset without Adapter

In [3]:
from torchvision.datasets import SBDataset

try:
  # There is a bug with `torchvision.datasets.SBDataset` that raises RuntimeError after downloading, so we just ignore it
  SBDataset(root="data", mode='segmentation', download=True)
except RuntimeError:
  pass

Downloading https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz to data/benchmark.tgz


100%|██████████| 1419539633/1419539633 [00:17<00:00, 79796528.12it/s] 


Extracting data/benchmark.tgz to data
Downloading https://www.cs.cornell.edu/~bharathh/ to data/train_noval.txt


21334it [00:00, 299478.84it/s]


In [4]:
from torchvision.transforms import Compose, ToTensor, Resize, InterpolationMode


transforms = Compose([ToTensor(), Resize((512, 512), InterpolationMode.NEAREST)])
def sample_transform(image, mask):
  return transforms(image), transforms(mask)

train_set = SBDataset(root="data", mode='segmentation', download=False, transforms=sample_transform)

Now let's see what we get when instantiating a `DataLoader`:

In [5]:
from torch.utils.data import DataLoader

train_loader = DataLoader(train_set, batch_size=20, shuffle=True)
_images, labels = next(iter(train_loader))

labels.unique()

tensor([0.0000, 0.0157, 0.0275, 0.0314, 0.0353, 0.0431, 0.0471, 0.0510, 0.0549,
        0.0588, 0.0627, 0.0667, 0.0745])

As you can see, the labels are normalized (0-1). This is all right, but it is not the format expected by SuperGradients.
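
The normalization comes from `ToTensor`, which scales 8-bit inputs into the [0, 1] range by dividing by 255. As a rough sketch of the manual fix, assuming the masks were 8-bit images, the values printed above can be mapped back to integer class ids by multiplying by 255 and rounding:

```python
# Unique label values printed by `labels.unique()` above (rounded to 4 decimals).
normalized_values = [0.0000, 0.0157, 0.0275, 0.0314, 0.0353, 0.0431, 0.0471,
                     0.0510, 0.0549, 0.0588, 0.0627, 0.0667, 0.0745]

# Undo the division by 255; rounding absorbs the 4-decimal print precision.
class_ids = [round(value * 255) for value in normalized_values]
print(class_ids)
```

All recovered ids fall below 21, which is consistent with a 21-class segmentation dataset.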

Let's now see how the Adapter helps.

2. Introducing Adapter

The Adapter lets us skip manual data preparation and dive right into creating a dataloader in the format SuperGradients expects.

In [11]:
from super_gradients.training.dataloaders.adapters import SegmentationDataloaderAdapterFactory

train_loader = SegmentationDataloaderAdapterFactory.from_dataset(dataset=train_set, batch_size=20, shuffle=True, config_path='cache_file.json')

_images, labels = next(iter(train_loader))
labels.unique()

[2023-11-13 13:56:24] INFO - data_config.py - Cache deactivated for `SegmentationDataConfig`.



--------------------------------------------------------------------------------
How many classes does your dataset include?
--------------------------------------------------------------------------------

Enter your response >>> 21
Great! You chose: `21`

--------------------------------------------------------------------------------
Does your dataset provide a batch or a single sample?
--------------------------------------------------------------------------------
Image shape: torch.Size([3, 512, 512])
Mask shape: torch.Size([1, 512, 512])
Options:
[0] | Batch of Samples (e.g. torch Dataloader)
[1] | Single Sample (e.g. torch Dataset)

Your selection (Enter the corresponding number) >>> 1
Great! You chose: `Single Sample (e.g. torch Dataset)`

--------------------------------------------------------------------------------
In which format are your images loaded ?
-------------------

tensor([  0,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,  14,
         15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,  28,  29,
         30,  31,  32,  33,  35,  36,  37,  38,  39,  40,  41,  42,  43,  44,
         45,  46,  47,  48,  49,  50,  51,  52,  53,  54,  55,  56,  57,  58,
         59,  60,  61,  62,  63,  64,  65,  66,  67,  68,  69,  70,  71,  72,
         73,  74,  75,  76,  77,  78,  79,  80,  81,  82,  83,  84,  85,  86,
         87,  88,  89,  90,  91,  92,  93,  94,  95,  96,  97,  98,  99, 100,
        101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114,
        115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128,
        129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
        143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156,
        157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170,
        171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 1

You can see that the mask is now encoded as `int`, which is the representation used in SuperGradients.

It's important to note that the dataset adapter also supports other mask formats, such as one-hot encoding, ensuring that the output (`labels` here) is in the right format for use within SuperGradients.
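
To make the one-hot case concrete, here is a small pure-Python sketch of what converting a one-hot mask of shape (C, H, W) into an integer-encoded mask of shape (H, W) means. The function name `one_hot_to_class_ids` is hypothetical, and this only illustrates the idea; it is not the adapter's actual implementation.

```python
from typing import List


def one_hot_to_class_ids(one_hot: List[List[List[int]]]) -> List[List[int]]:
    """For every pixel, return the index of the channel with the highest value (argmax over C)."""
    n_classes = len(one_hot)
    height = len(one_hot[0])
    width = len(one_hot[0][0])
    return [
        [max(range(n_classes), key=lambda c: one_hot[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy 3-class mask on a 2x2 image: one (H, W) plane per class.
one_hot_mask = [
    [[1, 0], [0, 0]],  # class 0
    [[0, 1], [0, 1]],  # class 1
    [[0, 0], [1, 0]],  # class 2
]
int_mask = one_hot_to_class_ids(one_hot_mask)
print(int_mask)
```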

## Example II - Detection Adapter on a Dictionary-based Dataset

Some datasets return a more complex data structure than the previous example.

For instance, the `COCO` dataset implementation from `torchvision` (`CocoDetection`) returns a list of dictionaries representing the labels.

Let's have a look:


In [12]:
# Download the zip file
!wget https://deci-pretrained-models.s3.amazonaws.com/coco2017_small.zip

# Unzip the downloaded file
!unzip coco2017_small.zip > /dev/null

--2023-11-13 13:56:46--  https://deci-pretrained-models.s3.amazonaws.com/coco2017_small.zip
Resolving deci-pretrained-models.s3.amazonaws.com (deci-pretrained-models.s3.amazonaws.com)... 52.216.211.169, 52.216.246.28, 3.5.2.158, ...
Connecting to deci-pretrained-models.s3.amazonaws.com (deci-pretrained-models.s3.amazonaws.com)|52.216.211.169|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 246116231 (235M) [application/zip]
Saving to: ‘coco2017_small.zip’


2023-11-13 13:56:53 (37.0 MB/s) - ‘coco2017_small.zip’ saved [246116231/246116231]



In [13]:
from torchvision.datasets import CocoDetection
from torchvision.transforms import Compose, ToTensor, Resize


image_transform = Compose([ToTensor(), Resize((512, 512))])

train_set = CocoDetection(root='coco2017_small/images/train2017', annFile='coco2017_small/annotations/instances_train2017.json', transform=image_transform)
val_set = CocoDetection(root='coco2017_small/images/val2017', annFile='coco2017_small/annotations/instances_val2017.json', transform=image_transform)
image, targets = next(iter(train_set))

loading annotations into memory...
Done (t=0.13s)
creating index...
index created!
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!


In [14]:
print(f"Number of targets: {len(targets)}, First target structure: {targets[0]}")

Notice the nested dictionary structure of the dataset output. This complicates things for the Dataset Adapter, as it cannot know which fields describe the bounding boxes.

To solve this, we utilize an extractor function.

#### The Extractor's Role

Simply put, the extractor translates your dataset's output into a format the Adapter understands. For our dataset, it will take the image and annotations, then return the bounding box data, including the label and coordinates.

Worried about bounding box format like `xyxy_label` or `label_xywh`? Don't be. The Adapter is designed to recognize them.
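
For intuition, the two formats named above differ only in column order and in how the box is parameterized. The sketch below converts a single `xyxy_label` row into a `label_xywh` row; the helper name is hypothetical and the code is only an illustration of the kind of rearrangement involved, not the Adapter's internals.

```python
from typing import Tuple


def xyxy_label_to_label_xywh(row: Tuple[float, float, float, float, int]) -> Tuple[int, float, float, float, float]:
    """Move the label to the front and replace the bottom-right corner by width and height."""
    x1, y1, x2, y2, label = row
    return (label, x1, y1, x2 - x1, y2 - y1)


# Toy box (hypothetical values): corners (1, 2) and (11, 22), class 51.
converted = xyxy_label_to_label_xywh((1.0, 2.0, 11.0, 22.0, 51))
print(converted)
```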

> For further guidance on extractor functions, see the [official documentation](https://github.com/Deci-AI/data-gradients/blob/master/documentation/dataset_extractors.md).

In [15]:
import torch

def coco_labels_extractor(sample) -> torch.Tensor:
    _, annotations = sample # annotations = [{"bbox": [1.08, 187.69, 611.59, 285.84], "category_id": 51}, ...]
    labels = []
    for annotation in annotations:
        class_id = annotation["category_id"]
        bbox = annotation["bbox"]
        labels.append((class_id, *bbox))
    return torch.Tensor(labels) # torch.Tensor([[51, 1.08, 187.69, 611.59, 285.84], ...])

coco_labels_extractor(sample=next(iter(train_set)))

tensor([[ 51.0000,   1.0800, 187.6900, 611.5900, 285.8400],
        [ 51.0000, 311.7300,   4.3100, 319.2800, 228.6800],
        [ 56.0000, 249.6000, 229.2700, 316.2400, 245.0800],
        [ 51.0000,   0.0000,  13.5100, 434.4800, 375.1200],
        [ 55.0000, 376.2000,  40.3600,  75.5500,  46.5300],
        [ 55.0000, 465.7800,  38.9700,  58.0700,  46.6700],
        [ 55.0000, 385.7000,  73.6600,  84.0200,  70.5100],
        [ 55.0000, 364.0500,   2.4900,  94.7600,  71.0700]])

This output is all you need to get started. Now we can use the Dataloader Adapters!

In [18]:
from super_gradients.training.dataloaders.adapters import DetectionDataloaderAdapterFactory
from data_gradients.dataset_adapters.config.data_config import DetectionDataConfig


adapter_config = DetectionDataConfig(labels_extractor=coco_labels_extractor, cache_path="coco_adapter_cache.json")
train_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)
val_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=val_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)

[2023-11-13 13:57:52] INFO - data_config.py - Cache deactivated for `DetectionDataConfig`.



--------------------------------------------------------------------------------
How many classes does your dataset include?
--------------------------------------------------------------------------------

Enter your response >>> 80


[2023-11-13 13:57:55] INFO - detection_adapter_collate_fn.py - You are using Detection Adapter. Please note that it was designed specifically for YOLONAS, YOLOX and PPYOLOE.


Great! You chose: `80`

--------------------------------------------------------------------------------
In which format are your images loaded ?
--------------------------------------------------------------------------------

Options:
[0] | RGB
[1] | BGR
[2] | LAB
[3] | Other

Your selection (Enter the corresponding number) >>> 0
Great! You chose: `RGB`

--------------------------------------------------------------------------------
Which comes first in your annotations, the class id or the bounding box?
--------------------------------------------------------------------------------
Here's a sample of how your labels look like:
Each line corresponds to a bounding box.
tensor([[ 51.0000,   1.0800, 187.6900, 611.5900, 285.8400],
        [ 51.0000, 311.7300,   4.3100, 319.2800, 228.6800],
        [ 56.0000, 249.6000, 229.2700, 316.2400, 245.0800],
        [ 51.0000,

[2023-11-13 13:58:01] INFO - detection_adapter_collate_fn.py - You are using Detection Adapter. Please note that it was designed specifically for YOLONAS, YOLOX and PPYOLOE.


Great! You chose: `xywh: x-left, y-top, width, height (COCO format)`


In [19]:
_image, targets = next(iter(train_loader))
targets.shape # [N, 6] format with 6 representing (sample_id, class_id, cx, cy, w, h)

torch.Size([14, 6])

## III. Use your Adapted Dataloader to train a model

Now that we have an adapter for a detection dataset, let's use it to train YoloNAS!

This is of course just for the sake of the example, since YoloNAS was originally trained using the SuperGradients implementation of the COCO dataset. You can replace the `COCO` dataset with any of your own datasets.

In [19]:
from omegaconf import OmegaConf
from hydra.utils import instantiate

from super_gradients import Trainer
from super_gradients.training import models
from super_gradients.common.object_names import Models
from super_gradients.training import training_hyperparams
from super_gradients.common.environment.cfg_utils import load_recipe


trainer = Trainer(experiment_name="yolonas_training_with_adapter", ckpt_root_dir="./")
model = models.get(model_name=Models.YOLO_NAS_S, num_classes=adapter_config.n_classes, pretrained_weights="coco")

yolonas_recipe = load_recipe(config_name="coco2017_yolo_nas_s", overrides=[f"arch_params.num_classes={adapter_config.n_classes}", "training_hyperparams.max_epochs=1", "training_hyperparams.mixed_precision=False"])
yolonas_recipe = OmegaConf.to_container(instantiate(yolonas_recipe))
training_params = yolonas_recipe['training_hyperparams']

trainer.train(model=model, training_params=training_params, train_loader=train_loader, valid_loader=val_loader)

[2023-11-13 08:29:34] INFO - checkpoint_utils.py - License Notification: YOLO-NAS pre-trained weights are subjected to the specific license terms and conditions detailed in 
https://github.com/Deci-AI/super-gradients/blob/master/LICENSE.YOLONAS.md
By downloading the pre-trained weight files you agree to comply with these terms.
Downloading: "https://sghub.deci.ai/models/yolo_nas_s_coco.pth" to /root/.cache/torch/hub/checkpoints/yolo_nas_s_coco.pth
100%|██████████| 73.1M/73.1M [00:02<00:00, 27.5MB/s]
[2023-11-13 08:29:37] INFO - checkpoint_utils.py - Successfully loaded pretrained weights for architecture yolo_nas_s
[2023-11-13 08:29:38] INFO - sg_trainer.py - Starting a new run with `run_id=RUN_20231113_082938_239280`
[2023-11-13 08:29:38] INFO - sg_trainer.py - Checkpoints directory: ./yolonas_training_with_adapter/RUN_20231113_082938_239280
[2023-11-13 08:29:38] INFO - sg_trainer.py - Using EMA with params {'decay': 0.9997, 'decay_type': 'threshold', 'beta': 15}


The console stream is now moved to ./yolonas_training_with_adapter/RUN_20231113_082938_239280/console_Nov13_08_29_38.txt


[2023-11-13 08:29:38] INFO - sg_trainer_utils.py - TRAINING PARAMETERS:
    - Mode:                         Single GPU
    - Number of GPUs:               0          (0 available on the machine)
    - Full dataset size:            500        (len(train_set))
    - Batch size per GPU:           5          (batch_size)
    - Batch Accumulate:             1          (batch_accumulate)
    - Total batch size:             5          (num_gpus * batch_size)
    - Effective Batch size:         5          (num_gpus * batch_size * batch_accumulate)
    - Iterations per epoch:         100        (len(train_loader))
    - Gradient updates per epoch:   100        (len(train_loader) / batch_accumulate)

[2023-11-13 08:29:38] INFO - sg_trainer.py - Started training for 1 epochs (0/0)

Train epoch 0:   1%|          | 1/100 [00:16<26:48, 16.25s/it, PPYoloELoss/loss=5.34, PPYoloELoss/loss_cls=2.84, PPYoloELoss/loss_dfl=1.29, PPYoloELoss/loss_iou=1.21, gpu_mem=0]

## IV. Dig deeper into the Adapter

By default, any parameter that could not be confidently inferred will trigger a question.

However, you can set these parameters in advance through the config object. In the previous example we had to set `labels_extractor` explicitly. Now let's set all the parameters.

In [20]:
from super_gradients.training.dataloaders.adapters import DetectionDataloaderAdapterFactory
from data_gradients.dataset_adapters.config.data_config import DetectionDataConfig
from data_gradients.utils.data_classes.image_channels import ImageChannels
class_names = [category['name'] for category in train_set.coco.loadCats(train_set.coco.getCatIds())]

adapter_config = DetectionDataConfig(
    labels_extractor=coco_labels_extractor,
    is_label_first=True,
    class_names=class_names,
    image_channels=ImageChannels.from_str("RGB"),
    xyxy_converter='xywh',
    cache_path="coco_adapter_cache_with_default.json"
)

[2023-11-13 13:58:08] INFO - data_config.py - Cache deactivated for `DetectionDataConfig`.


This config can now be used without answering any questions.

In [21]:
train_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)
val_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=val_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)

_image, targets = next(iter(train_loader))
targets.shape # [N, 6] format with 6 representing (sample_id, class_id, cx, cy, w, h)

[2023-11-13 13:58:11] INFO - detection_adapter_collate_fn.py - You are using Detection Adapter. Please note that it was designed specifically for YOLONAS, YOLOX and PPYOLOE.
[2023-11-13 13:58:11] INFO - detection_adapter_collate_fn.py - You are using Detection Adapter. Please note that it was designed specifically for YOLONAS, YOLOX and PPYOLOE.


torch.Size([14, 6])

### Load from existing cache

You can reuse the cache of an adapter you already used in the past. This allows you to skip the questions that were already answered in a previous run.
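
The idea behind such a cache can be sketched in a few lines: look up the answer in a JSON file first, and only ask (then store) when it is missing. The function `resolve_parameter`, the `fake_user` callback, and the file layout below are hypothetical and only illustrate the concept; they do not reflect the actual cache format used by the adapter.

```python
import json
import os
import tempfile
from typing import Callable, List


def resolve_parameter(name: str, cache_path: str, ask: Callable[[str], str]) -> str:
    """Return the cached answer for `name` if there is one, otherwise ask and store the answer."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    if name in cache:
        return cache[name]
    cache[name] = ask(name)
    with open(cache_path, "w") as f:
        json.dump(cache, f)
    return cache[name]


questions_asked: List[str] = []


def fake_user(name: str) -> str:
    """Stand-in for the interactive prompt: records the question and always answers 'RGB'."""
    questions_asked.append(name)
    return "RGB"


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "adapter_cache.json")
    first = resolve_parameter("image_channels", cache_path, fake_user)   # question is asked
    second = resolve_parameter("image_channels", cache_path, fake_user)  # answer read from the cache

print(first, second, questions_asked)
```

The second call returns the stored answer without invoking the prompt again, which mirrors the behavior shown in the cell that follows.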

In [22]:
# The new config will load the answers to the questions asked in the previous run.
adapter_config = DetectionDataConfig(
    labels_extractor=coco_labels_extractor,
    cache_path="coco_adapter_cache_with_default.json" # Name of the previous cache
)

train_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)
val_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=val_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)

_image, targets = next(iter(train_loader))

[2023-11-13 13:58:14] INFO - detection_adapter_collate_fn.py - You are using Detection Adapter. Please note that it was designed specifically for YOLONAS, YOLOX and PPYOLOE.
[2023-11-13 13:58:14] INFO - detection_adapter_collate_fn.py - You are using Detection Adapter. Please note that it was designed specifically for YOLONAS, YOLOX and PPYOLOE.


In [23]:
targets.shape # [N, 6] format with 6 representing (sample_id, class_id, cx, cy, w, h)

torch.Size([14, 6])

As you can see, no question was asked and we still get the targets adapted into the SuperGradients format.