@ChristofferEdlund ChristofferEdlund commented Sep 22, 2023

Problem

We want to support Albumentations transforms with darwin-py torch datasets.

Solution

This PR introduces an AlbumentationsTransform class in darwin.torch.transforms that can be used in the following manner:

from pathlib import Path

from darwin.torch.dataset import (
    ClassificationDataset,
    InstanceSegmentationDataset,
    ObjectDetectionDataset,
    SemanticSegmentationDataset,
)

from darwin.torch.transforms import AlbumentationsTransform

dataset_path = Path(r"/path/to/local/darwin/dataset")
transform = AlbumentationsTransform.from_path('/tmp/transform.json')

inst_dataset = InstanceSegmentationDataset(dataset_path=dataset_path, transform=transform)

One can initialize the AlbumentationsTransform in three ways:

  • AlbumentationsTransform(albumentation_transform) <- needs an albumentations transformation
  • AlbumentationsTransform.from_path(path) <- a path pointing to a .json or .yaml file defining the transformation
  • AlbumentationsTransform.from_dict(dict) <- a dictionary defining the transform

To read more about the dictionary and file formats supported, we refer to the albumentations documentation.
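As a rough illustration of the file-based route, the sketch below writes a small JSON config using only the standard library. The layout (a top-level "transform" entry whose nodes carry "__class_fullname__") is an assumption based on the serialization format of albumentations 1.x releases; the file name, the helper name, and the chosen transforms are placeholders. Generating the file from a real pipeline with the albumentations save function is the safer way to match your installed version. Bounding-box parameters are left out of this sketch.

```python
import json
import tempfile
from pathlib import Path

# Assumed layout: mirrors what the albumentations 1.x serializer emits.
# Transform names and probabilities are illustrative placeholders.
transform_config = {
    "transform": {
        "__class_fullname__": "Compose",
        "p": 1.0,
        "transforms": [
            {"__class_fullname__": "HorizontalFlip", "p": 0.5},
            {"__class_fullname__": "RandomBrightnessContrast", "p": 0.2},
        ],
    }
}


def write_transform_config(config: dict, path: Path) -> Path:
    """Write the config as JSON and return the path (hypothetical helper)."""
    path.write_text(json.dumps(config, indent=2))
    return path


config_path = write_transform_config(
    transform_config,
    Path(tempfile.gettempdir()) / "darwin_transform_example.json",
)
```

The resulting path could then be passed to AlbumentationsTransform.from_path, and the dictionary itself to AlbumentationsTransform.from_dict.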

Further:

  • The instance segmentation dataset now outputs bounding boxes in coco format (X, Y, W, H), consistent with darwin-json annotations and the ObjectDetection torch dataset. [BREAKING CHANGE]
  • Clamping is introduced for ObjectDetection bboxes that fall outside of the image, for more robust data loading.
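For downstream code that still expects pascal_voc boxes, the format change can be bridged with a small conversion. A minimal sketch in plain Python; the helper names are hypothetical and not part of darwin-py:

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def voc_to_coco(box: Box) -> Box:
    """Convert (x_min, y_min, x_max, y_max) to (x, y, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)


def coco_to_voc(box: Box) -> Box:
    """Convert (x, y, width, height) to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)
```

For example, the pascal_voc box (10, 20, 50, 80) corresponds to the coco box (10, 20, 40, 60), and converting back recovers the original corners.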

Changelog

  • Introduced albumentations transform support for darwin torch datasets
  • [BREAKING CHANGE] darwin.torch.dataset.InstanceSegmentationDataset bbox coordinates changed from pascal_voc to coco format (X, Y, W, H)
  • darwin.torch.dataset.ObjectDetectionDataset clamps out-of-bounds bbox coordinates.

linear bot commented Sep 22, 2023

AI-1190 Implement and test augmentations

This task is about implementing and benchmarking augmentations for model training. We will first test it for object detection on HF models and benchmark against not using augmentations.

If this improves model performance, let's build a more general solution where any integration can import and use the augmentations.

@owencjones owencjones changed the title from "Ai 1190 implement and test augmentations" to "[AI-1190] Implement and test augmentations" Sep 26, 2023
@owencjones

Updated title so that it's easier for me on deployment 😉

@almazan almazan left a comment

All changes look good to me.

Edit: noticed something in a second pass. See below.

@almazan almazan left a comment

Some comments need to be addressed before approval

boxes = torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4)
boxes[:, 2:] += boxes[:, :2]
boxes[:, 0::2].clamp_(min=0, max=w)
boxes[:, 1::2].clamp_(min=0, max=h)

This no longer clamps the bboxes, since indexes 2 and 3 are now width and height instead of x2 and y2. Boxes could now extend outside of the image.
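With (X, Y, W, H) boxes, clamping columns 2 and 3 to the image size bounds the width and height but not the right and bottom edges. One way to address this, sketched in plain Python rather than the torch code under review, is to clamp the corners and then recompute the size. The function name is hypothetical, and this is not necessarily the fix that was applied in the PR:

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def clamp_coco_box(box: Box, img_w: float, img_h: float) -> Box:
    """Clamp an (x, y, width, height) box so it lies inside the image."""
    x, y, w, h = box
    # Work on corners so that both edges are checked against the image bounds.
    x1 = min(max(x, 0.0), img_w)
    y1 = min(max(y, 0.0), img_h)
    x2 = min(max(x + w, 0.0), img_w)
    y2 = min(max(y + h, 0.0), img_h)
    # Convert back to (x, y, width, height) with the clamped corners.
    return (x1, y1, x2 - x1, y2 - y1)
```

For a 100 x 60 image, the box (90, 40, 30, 30) has a right edge at 120 and a bottom edge at 70. Clamping each column independently would leave it unchanged, while the corner-based version shrinks it to (90, 40, 10, 20).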

@ChristofferEdlund ChristofferEdlund merged commit e89c93f into master Sep 28, 2023