Feature/sg 729 abstract cocodataset #777

Louis-Dupont · 2023-03-13T12:18:32Z

Abstracting the COCO dataset.
Changes between the COCO dataset and its abstraction:

init parameters have different names (they also represent different things)

dagshub · 2023-03-13T12:18:36Z

Join the discussion on DagsHub!

src/super_gradients/training/datasets/detection_datasets/coco_format_detection.py

BloodAxe

LGTM.

But I also wanted to leave my 5 cents here:

I think at some point we will need something like a DatasetReader class that holds the logic to parse a dataset in whatever format into a list of samples, and then use those samples to instantiate the dataset classes. What I mean is this:

from typing import List

class COCODetectionDatasetReader:
    def __init__(self, images_dir, annotation_file, with_crowd, ...):
        ...

    def load_samples(self) -> List[DetectionSample]:
        ...

reader = COCODetectionDatasetReader(...)
samples = reader.load_samples()
dataset = DetectionDataset(samples, transforms)

So the idea is that DetectionDataset has zero knowledge about where the data is coming from. I.e., instead of subclassing we use composition, which is more flexible because any of these readers can feed the same dataset class:

reader = COCODetectionDatasetReader(...)
reader = PascalVOCDetectionDatasetReader(...)
reader = DetectionDatasetReader(...)

And DetectionSample would provide all the necessary information for a single image sample:

from dataclasses import dataclass
import numpy as np

@dataclass
class DetectionSample:
    image_path: str
    bboxes: np.ndarray  # [N, 4] in XYXY format
    labels: np.ndarray  # [N]
    is_crowd: np.ndarray  # [N]

Note: let's say that for some customer we need an additional label attached to each bbox. No problem:

@dataclass
class CustomerSpecificDetectionSample(DetectionSample):
    additional_property: np.ndarray  # [N]


class CustomerSpecificDetectionDatasetReader:
    def load_samples(self) -> List[CustomerSpecificDetectionSample]:
        ...

I realize this is a lot of work and definitely goes beyond the scope of this PR.
So this is more to raise awareness and suggest it for future sprints.
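
For reference, a minimal runnable sketch of the reader/composition idea described above. It is only an illustration under stated assumptions, not the code in this PR: the `DetectionDatasetReader` protocol, the stand-in `DetectionDataset`, and the `demo` helper are hypothetical, plain Python lists replace the `np.ndarray` fields to keep the example dependency-free, and `with_crowd=False` is assumed to mean "skip crowd annotations". The only format knowledge used is the standard COCO JSON layout (`images`, `annotations`, `bbox` as XYWH, `iscrowd`).

```python
import json
import os
import tempfile
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass
class DetectionSample:
    image_path: str
    bboxes: List[Tuple[float, float, float, float]]  # XYXY (lists used here instead of np.ndarray)
    labels: List[int]
    is_crowd: List[bool]


class DetectionDatasetReader(Protocol):
    """Hypothetical interface: anything with load_samples() can feed the dataset."""

    def load_samples(self) -> List[DetectionSample]: ...


class COCODetectionDatasetReader:
    def __init__(self, images_dir: str, annotation_file: str, with_crowd: bool = True):
        self.images_dir = images_dir
        self.annotation_file = annotation_file
        self.with_crowd = with_crowd  # assumption: False means crowd boxes are dropped

    def load_samples(self) -> List[DetectionSample]:
        with open(self.annotation_file) as f:
            coco = json.load(f)
        # One (initially empty) sample per image, keyed by COCO image id.
        samples: Dict[int, DetectionSample] = {
            img["id"]: DetectionSample(os.path.join(self.images_dir, img["file_name"]), [], [], [])
            for img in coco["images"]
        }
        for ann in coco["annotations"]:
            crowd = bool(ann.get("iscrowd", 0))
            if crowd and not self.with_crowd:
                continue
            x, y, w, h = ann["bbox"]  # COCO stores boxes as XYWH
            sample = samples[ann["image_id"]]
            sample.bboxes.append((x, y, x + w, y + h))  # convert to XYXY
            sample.labels.append(ann["category_id"])
            sample.is_crowd.append(crowd)
        return list(samples.values())


class DetectionDataset:
    """Stand-in dataset: it only consumes samples and knows nothing about file formats."""

    def __init__(self, samples, transforms=()):
        self.samples = list(samples)
        self.transforms = list(transforms)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        sample = self.samples[index]
        for transform in self.transforms:
            sample = transform(sample)
        return sample


def demo() -> DetectionDataset:
    """Write a tiny COCO-style annotation file and load it through the reader."""
    coco = {
        "images": [{"id": 1, "file_name": "a.jpg"}],
        "annotations": [
            {"image_id": 1, "bbox": [10, 20, 30, 40], "category_id": 3, "iscrowd": 0},
            {"image_id": 1, "bbox": [0, 0, 5, 5], "category_id": 7, "iscrowd": 1},
        ],
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "instances.json")
        with open(path, "w") as f:
            json.dump(coco, f)
        reader = COCODetectionDatasetReader("images", path, with_crowd=False)
        return DetectionDataset(reader.load_samples())
```

Supporting another format in this layout would mean writing another reader with the same `load_samples()` method; the dataset class itself would stay untouched.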

Louis-Dupont · 2023-03-16T13:30:03Z

It sounds like a great idea. I guess we could also make load_samples a generator, which would give us the option not to cache labels.
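
A rough sketch of what the generator variant might look like, assuming nothing about the eventual implementation: `LazyDetectionDatasetReader`, `iter_samples`, the `records` list, and the `parsed` counter are all hypothetical names used only to show that samples get parsed on demand rather than cached up front.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class DetectionSample:
    image_path: str
    labels: List[int]


class LazyDetectionDatasetReader:
    """Hypothetical reader whose samples are produced on demand."""

    def __init__(self, records: List[Dict]):
        self.records = records
        self.parsed = 0  # counts how many records were actually parsed

    def iter_samples(self) -> Iterator[DetectionSample]:
        # Generator: each record is parsed only when the consumer asks for it.
        for record in self.records:
            self.parsed += 1
            yield DetectionSample(record["image_path"], list(record["labels"]))

    def load_samples(self) -> List[DetectionSample]:
        # Eager variant: materialises (caches) every sample in memory.
        return list(self.iter_samples())


records = [{"image_path": f"img_{i}.jpg", "labels": [i]} for i in range(1000)]
reader = LazyDetectionDatasetReader(records)

# Only the first two records are parsed here; the other 998 are never touched.
first_two = list(itertools.islice(reader.iter_samples(), 2))
```

One trade-off worth keeping in mind: a generator gives no length and no random access, so a dataset that is indexed by position would still have to materialise the samples (or keep an index and parse each sample on access).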

Louis-Dupont added 19 commits March 9, 2023 11:44

first draft - wip

0454c2e

rollback

a1cf5b0

wip

e2e3f82

adding robflow and robflow100 - robflow100 still need to be fixed to …

8bfc065

…work with different classes

add comment

feb903c

update

17bd194

Merge branch 'master' into feature/SG-729-add-roboflow100

50f34fb

wip

782156c

wip

43da2fc

add category

9f739b2

Merge branch 'master' into feature/SG-729-add-roboflow100

31fdb18

remove sampler

18728f1

rollback minor change

5a35765

imprve doc

32db3dd

wip

3a617e6

wip

82db585

add change in all_classes

afb043d

add verbose

2b166bd

abstract coco

4f56386

Louis-Dupont added 2 commits March 13, 2023 14:25

wip

5e6d741

formating

ee54603

Louis-Dupont marked this pull request as ready for review March 13, 2023 12:47

Louis-Dupont requested review from shaydeci, ofrimasad and BloodAxe as code owners March 13, 2023 12:47

Louis-Dupont and others added 4 commits March 13, 2023 14:48

remove unused line

531d996

rename

addb225

remove offset

bc5193f

Merge branch 'master' into feature/SG-729-abstract_cocodataset

03f0c96

Merge branch 'master' into feature/SG-729-abstract_cocodataset

4b36f48

BloodAxe reviewed Mar 16, 2023

BloodAxe previously approved these changes Mar 16, 2023

BloodAxe and others added 3 commits March 16, 2023 12:36

Merge branch 'master' into feature/SG-729-abstract_cocodataset

cce7812

Merge branch 'master' into feature/SG-729-abstract_cocodataset

dab9b0f

Merge branch 'master' into feature/SG-729-abstract_cocodataset

0c9ec14

Louis-Dupont dismissed BloodAxe’s stale review via 0c9ec14 March 16, 2023 13:24

fix doc

484f1e1

BloodAxe approved these changes Mar 16, 2023

Merge branch 'master' into feature/SG-729-abstract_cocodataset

e6883f8

Louis-Dupont merged commit c986887 into master Mar 16, 2023

Louis-Dupont deleted the feature/SG-729-abstract_cocodataset branch March 16, 2023 14:26
