# Dataset Adapter

With diverse dataset structures available, ensuring compatibility with SuperGradients (SG) can be challenging. This is where the DataloaderAdapter plays a pivotal role. This tutorial takes you through the importance, implementation, and advantages of using the DataloaderAdapter with SG.

### Why Dataset Adapter?

Datasets come in a myriad of structures. However, SG requires data in a specific format.

For instance, consider the Object Detection Format:

- Image format should be: (BS, H, W, C), i.e., channel last.
- Targets should be in the format: (BS, 6), where 6 represents (sample_id, class_id, cx, cy, w, h).

Adjusting each dataset manually is cumbersome. The DataloaderAdapter is designed to automatically understand your dataset structure and mold it for SG compatibility.
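
To make the target layout concrete, here is a minimal sketch in plain Python (no SuperGradients calls) of how per-image boxes could be flattened into rows of `(sample_id, class_id, cx, cy, w, h)`. The helper name `flatten_targets` and the toy boxes are illustrative assumptions, not part of the library.

```python
from typing import List, Tuple

Box = Tuple[int, float, float, float, float]  # (class_id, cx, cy, w, h)
Row = Tuple[int, int, float, float, float, float]  # (sample_id, class_id, cx, cy, w, h)


def flatten_targets(batch: List[List[Box]]) -> List[Row]:
    """Flatten per-image boxes into one list of rows, tagging each row with its image index."""
    rows = []
    for sample_id, boxes in enumerate(batch):
        for class_id, cx, cy, w, h in boxes:
            rows.append((sample_id, class_id, cx, cy, w, h))
    return rows


# Toy batch: image 0 has two boxes, image 1 has none, image 2 has one.
batch = [
    [(3, 50.0, 60.0, 20.0, 10.0), (7, 10.0, 10.0, 4.0, 4.0)],
    [],
    [(1, 30.0, 40.0, 8.0, 16.0)],
]
targets = flatten_targets(batch)
print(len(targets), len(targets[0]))
```

In this sketch there is one row per box, and the `sample_id` column is what ties each row back to its image within the batch.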

In [8]:
!pip install -q super-gradients==3.7.1

### Why Do We Need the Dataset Adapter?

While datasets come in various structures and formats, SG expects data in a specific format.


> Example: Object Detection Format
> - Image format: (BS, H, W, C) i.e. channel last
> - Targets format: (BS, 6) where 6 represents (sample_id, class_id, cx, cy, w, h).


This means that you should either use one of SuperGradients' built-in Dataset classes if it supports your dataset structure or, if your dataset is too custom for that, inherit from an SG dataset and make all the required changes.

While this is all right in most cases, it can be cumbersome when you just want to quickly experiment with a new dataset.

To reduce this overhead, SuperGradients introduced the concept of `DataloaderAdapter`. Instead of requiring you to write all the transformations required to use SG, the `DataloaderAdapter` will infer anything possible directly from your data. Whenever something cannot be inferred with 100% confidence, you will be asked a question with all the required context for you to properly answer.

Let's see this in practice, starting with the `SBDataset` dataset.

# Example I - Segmentation Adapter on the `SBDataset` Dataset

In this section, we'll walk through the process of preparing the `SBDataset` dataset for use in SuperGradients. We'll highlight the challenges and demonstrate how the Adapter can simplify the process.


1. Preparing the Dataset without the Adapter

In [3]:
from torchvision.datasets import SBDataset

try:
  # There is a bug with `torchvision.datasets.SBDataset` that raises RuntimeError after downloading, so we just ignore it
  SBDataset(root="data", mode='segmentation', download=True)
except RuntimeError:
  pass

In [4]:
from torchvision.transforms import Compose, ToTensor, Resize, InterpolationMode


transforms = Compose([ToTensor(), Resize((512, 512), InterpolationMode.NEAREST)])
def sample_transform(image, mask):
  return transforms(image), transforms(mask)

train_set = SBDataset(root="data", mode='segmentation', download=False, transforms=sample_transform)

Now let's see what we get when instantiating a `DataLoader`:

In [5]:
from torch.utils.data import DataLoader

train_loader = DataLoader(train_set, batch_size=20, shuffle=True)
_images, labels = next(iter(train_loader))

labels.unique()

As you can see, the labels are normalized (0-1). This is all right, but it is not the format expected by SuperGradients.
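
The fractional values most likely come from `ToTensor`, which rescales 8-bit pixel values by dividing them by 255, so a class index `k` stored in the mask becomes `k / 255`. Below is a minimal sketch of that arithmetic. It uses plain Python lists instead of tensors so that it runs without any dependency, and the helper names are illustrative.

```python
def scale_like_to_tensor(mask):
    """Mimic the 1/255 scaling that ToTensor applies to 8-bit images."""
    return [[value / 255 for value in row] for row in mask]


def restore_class_ids(scaled_mask):
    """Undo the scaling and round back to integer class indices."""
    return [[round(value * 255) for value in row] for row in scaled_mask]


mask = [[0, 1, 15], [20, 255, 3]]  # toy 8-bit class-index mask
scaled = scale_like_to_tensor(mask)
restored = restore_class_ids(scaled)
print(restored == mask)
```

With tensors, the same idea would be `(labels * 255).round().long()`. This is exactly the kind of manual step that is easy to forget when preparing a dataset by hand.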

Let's now see how the Adapter helps.

2. Introducing the Adapter

The Adapter lets us skip manual data preparation and directly create the dataloader that SuperGradients expects.

In [11]:
from super_gradients.training.dataloaders.adapters import SegmentationDataloaderAdapterFactory

train_loader = SegmentationDataloaderAdapterFactory.from_dataset(dataset=train_set, batch_size=20, shuffle=True, config_path='cache_file.json')

_images, labels = next(iter(train_loader))
labels.unique()

You can see that the mask is now encoded as `int`, which is the representation used in SuperGradients.

Note that the dataset adapter also supports other dataset formats, such as one-hot encoded masks, ensuring that the output (`labels` here) is in the right format for SuperGradients.
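
As a rough illustration of what handling a one-hot mask involves, the sketch below collapses a `(num_classes, H, W)` one-hot mask into an `(H, W)` mask of class indices. It is a conceptual equivalent written with plain lists, not the adapter's actual implementation, and the function name is an assumption.

```python
def one_hot_to_class_ids(one_hot_mask):
    """Convert a (num_classes, H, W) one-hot mask into an (H, W) mask of class indices."""
    num_classes = len(one_hot_mask)
    height, width = len(one_hot_mask[0]), len(one_hot_mask[0][0])
    return [
        # For each pixel, pick the class channel holding the largest value.
        [max(range(num_classes), key=lambda c: one_hot_mask[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


one_hot = [
    [[1, 0], [0, 0]],  # channel for class 0
    [[0, 1], [0, 0]],  # channel for class 1
    [[0, 0], [1, 1]],  # channel for class 2
]
class_ids = one_hot_to_class_ids(one_hot)
print(class_ids)
```

With a tensor of the same layout, `one_hot.argmax(dim=0)` would express the same operation.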

# Example II - Detection Adapter on a Dictionary-based Dataset

Some datasets return a more complex data structure than the previous example.

For instance, the `COCO` dataset implementation from `torchvision` returns a list of dictionaries representing the labels.

Let's have a look:


In [12]:
# Download the zip file
!wget https://deci-pretrained-models.s3.amazonaws.com/coco2017_small.zip

# Unzip the downloaded file
!unzip coco2017_small.zip > /dev/null

In [13]:
from torchvision.datasets import CocoDetection
from torchvision.transforms import Compose, ToTensor, Resize


image_transform = Compose([ToTensor(), Resize((512, 512))])

train_set = CocoDetection(root='coco2017_small/images/train2017', annFile='coco2017_small/annotations/instances_train2017.json', transform=image_transform)
val_set = CocoDetection(root='coco2017_small/images/val2017', annFile='coco2017_small/annotations/instances_val2017.json', transform=image_transform)
image, targets = next(iter(train_set))

In [14]:
print(f"Number of targets: {len(targets)}, First target structure: {targets[0]}")

Notice the nested dictionary structure of the dataset output. This complicates things for the Dataset Adapter, because it cannot tell which fields describe the bounding boxes.

To solve this, we utilize an extractor function.

#### The Extractor's Role

Simply put, the extractor translates your dataset's output into a format the Adapter understands. For our dataset, it will take the image and annotations, then return the bounding box data, including the label and coordinates.

Worried about bounding box format like `xyxy_label` or `label_xywh`? Don't be. The Adapter is designed to recognize them.
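
These formats are just different ways of writing down the same rectangle. The sketch below shows the arithmetic for two common cases: COCO-style `xywh` (top-left corner plus size) and corner-style `xyxy`, both converted to the center-based `cxcywh` layout. The function names are illustrative and this is not the adapter's own code.

```python
def xywh_to_cxcywh(box):
    """(x_top_left, y_top_left, width, height) -> (center_x, center_y, width, height)."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2, w, h)


def xyxy_to_cxcywh(box):
    """(x_min, y_min, x_max, y_max) -> (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


coco_box = (10.0, 20.0, 30.0, 40.0)    # top-left (10, 20), width 30, height 40
corner_box = (10.0, 20.0, 40.0, 60.0)  # the same rectangle written as two corners

from_xywh = xywh_to_cxcywh(coco_box)
from_xyxy = xyxy_to_cxcywh(corner_box)
print(from_xywh, from_xyxy)
```

Both calls describe the same box, so they yield the same center-based tuple.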

> For further guidance on extractor functions, see the [official documentation](https://github.com/Deci-AI/data-gradients/blob/master/documentation/dataset_extractors.md).

In [15]:
import torch

def coco_labels_extractor(sample) -> torch.Tensor:
    _, annotations = sample # annotations = [{"bbox": [1.08, 187.69, 611.59, 285.84], "category_id": 51}, ...]
    labels = []
    for annotation in annotations:
        class_id = annotation["category_id"]
        bbox = annotation["bbox"]
        labels.append((class_id, *bbox))
    return torch.Tensor(labels) # torch.Tensor([[51, 1.08, 187.69, 611.59, 285.84], ...])

coco_labels_extractor(sample=next(iter(train_set)))

This output is all you need to get started. Now we can use the Dataloader Adapters!

In [18]:
from super_gradients.training.dataloaders.adapters import DetectionDataloaderAdapterFactory
from data_gradients.dataset_adapters.config.data_config import DetectionDataConfig


adapter_config = DetectionDataConfig(labels_extractor=coco_labels_extractor, cache_path="coco_adapter_cache.json")
train_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)
val_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=val_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)

In [19]:
_image, targets = next(iter(train_loader))
targets.shape # [N, 6] format with 6 representing (sample_id, class_id, cx, cy, w, h)
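
Because all boxes of a batch share one flat `[N, 6]` structure, the `sample_id` column is what tells you which image each row belongs to. Here is a minimal sketch of how the rows could be regrouped per image. The helper name and the toy rows are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_sample(rows):
    """Group (sample_id, class_id, cx, cy, w, h) rows into a dict keyed by sample_id."""
    grouped = defaultdict(list)
    for sample_id, class_id, cx, cy, w, h in rows:
        grouped[int(sample_id)].append((int(class_id), cx, cy, w, h))
    return dict(grouped)


# Toy rows: two boxes for image 0, none for image 1, one for image 2.
rows = [
    (0, 51, 306.0, 330.0, 611.0, 285.0),
    (0, 3, 100.0, 120.0, 40.0, 30.0),
    (2, 17, 256.0, 256.0, 64.0, 64.0),
]
per_image = group_by_sample(rows)
print(sorted(per_image))
```

With the real tensor, `targets.tolist()` would give rows in a shape this helper can consume. Images without any box simply do not appear as a key.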

# III. Use your Adapted Dataloader to train a model

Now that we have an adapter for a detection dataset, let's use it to train YoloNAS!

This is only for the sake of the example, since YoloNAS was originally trained using the SuperGradients implementation of the COCO dataset. You can replace the `COCO` dataset with any of your own datasets.

In [19]:
from omegaconf import OmegaConf
from hydra.utils import instantiate

from super_gradients import Trainer
from super_gradients.training import models
from super_gradients.common.object_names import Models
from super_gradients.training import training_hyperparams
from super_gradients.common.environment.cfg_utils import load_recipe


trainer = Trainer(experiment_name="yolonas_training_with_adapter", ckpt_root_dir="./")
model = models.get(model_name=Models.YOLO_NAS_S, num_classes=adapter_config.n_classes, pretrained_weights="coco")

yolonas_recipe = load_recipe(config_name="coco2017_yolo_nas_s", overrides=[f"arch_params.num_classes={adapter_config.n_classes}", "training_hyperparams.max_epochs=1", "training_hyperparams.mixed_precision=False"])
yolonas_recipe = OmegaConf.to_container(instantiate(yolonas_recipe))
training_params = yolonas_recipe['training_hyperparams']

trainer.train(model=model, training_params=training_params, train_loader=train_loader, valid_loader=val_loader)

# IV. Dig deeper into the Adapter

By default, any parameter that cannot be confidently inferred will trigger a question.

However, you can set these parameters in advance through the config object. In the previous example we had to set `labels_extractor` explicitly. Now let's set all the parameters.

In [20]:
from super_gradients.training.dataloaders.adapters import DetectionDataloaderAdapterFactory
from data_gradients.dataset_adapters.config.data_config import DetectionDataConfig
from data_gradients.utils.data_classes.image_channels import ImageChannels
class_names = [category['name'] for category in train_set.coco.loadCats(train_set.coco.getCatIds())]

adapter_config = DetectionDataConfig(
    labels_extractor=coco_labels_extractor,
    is_label_first=True,
    class_names=class_names,
    image_channels=ImageChannels.from_str("RGB"),
    xyxy_converter='xywh',
    cache_path="coco_adapter_cache_with_default.json"
)

This config can now be used without answering any questions.

In [21]:
train_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)
val_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=val_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)

_image, targets = next(iter(train_loader))
targets.shape # [N, 6] format with 6 representing (sample_id, class_id, cx, cy, w, h)

### Load from existing cache

You can reuse the cache of an adapter you have already used in the past. This allows you to skip the questions that were already answered in a previous run.
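
The general idea of an answer cache can be sketched as follows. This is a conceptual illustration only: the function `resolve_parameter`, the JSON layout and the parameter name `bbox_format` are all hypothetical and do not reflect how the library actually stores its cache.

```python
import json
import os
import tempfile


def resolve_parameter(name, cache_path, ask):
    """Return the cached answer for `name` if present; otherwise call `ask` and cache the result."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    if name not in cache:
        cache[name] = ask(name)
        with open(cache_path, "w") as f:
            json.dump(cache, f, indent=2)
    return cache[name]


questions_asked = []


def fake_ask(name):
    """Stand-in for an interactive question; records that it was called."""
    questions_asked.append(name)
    return "xywh"


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "adapter_cache.json")
    first = resolve_parameter("bbox_format", cache_path, fake_ask)   # asks, then caches
    second = resolve_parameter("bbox_format", cache_path, fake_ask)  # served from the cache

print(first, second, questions_asked)
```

The second call returns the stored answer without asking again, which mirrors the behavior described above.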

In [22]:
# The new config will load the answers to the questions asked in the previous run.
adapter_config = DetectionDataConfig(
    labels_extractor=coco_labels_extractor,
    cache_path="coco_adapter_cache_with_default.json" # Name of the previous cache
)

train_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)
val_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=val_set,
    config=adapter_config,
    batch_size=5,
    drop_last=True,
)

_image, targets = next(iter(train_loader))

In [23]:
targets.shape # [N, 6] format with 6 representing (sample_id, class_id, cx, cy, w, h)

As you can see, no question was asked and we still get the targets adapted into the SuperGradients format.