
Implement AutoAugment for Detection #6609

Open

ain-soph wants to merge 45 commits into main
Conversation

ain-soph (Contributor)

Implements #6224: Learning Data Augmentation Strategies for Object Detection.
Refers to: #3817

Motivation, pitch

It would be good to have this augmentation in torchvision.

cc @vfdev-5 @datumbox

ain-soph (Contributor, Author) commented Sep 19, 2022

@vfdev-5

  • I'm using bboxes.data as a torch.Tensor to avoid type-checking issues. I'm not sure whether that's a good convention.
            elif transform_id == "TranslateX":
                bboxes.data = F.affine_bounding_box(
                    bboxes.data,
                    bboxes.format,
                    bboxes.image_size,
                    angle=0.0,
                    translate=[int(magnitude), 0],
                    scale=1.0,
                    shear=[0.0, 0.0],
                )
  • There are several ops that I couldn't find in our current library (rough sketches of the first two follow below):
    • SolarizeAdd
    • Cutout, Cutout_Only_BBoxes
    • BBox_Cutout
    • Flip_Only_BBoxes (this is actually just hflip, but we need to route it through _AutoAugmentBase._apply_image_transform)
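
For reference, here is a minimal sketch of what SolarizeAdd and Cutout do, following the semantics of the TensorFlow AutoAugment reference implementation. The function names and the uint8 assumption are mine, not existing torchvision API:

    import torch

    def solarize_add(image: torch.Tensor, addition: int, threshold: int = 128) -> torch.Tensor:
        # Assumes a uint8 image in [0, 255]: pixels below `threshold` get
        # `addition` added (clamped back to [0, 255]); the rest are untouched.
        added = (image.to(torch.int16) + addition).clamp(0, 255).to(torch.uint8)
        return torch.where(image < threshold, added, image)

    def cutout(image: torch.Tensor, pad_size: int, replace: int = 128) -> torch.Tensor:
        # Fills a square of side 2 * pad_size, centered at a uniformly random
        # location and clipped at the image border, with the constant `replace`.
        h, w = image.shape[-2:]
        cy, cx = int(torch.randint(h, ())), int(torch.randint(w, ()))
        y1, y2 = max(0, cy - pad_size), min(h, cy + pad_size)
        x1, x2 = max(0, cx - pad_size), min(w, cx + pad_size)
        out = image.clone()
        out[..., y1:y2, x1:x2] = replace
        return out

The *_Only_BBoxes and BBox_Cutout variants apply the same kind of operation restricted to the bounding-box regions.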

datumbox linked an issue on Sep 20, 2022 that may be closed by this pull request.
vfdev-5 (Collaborator) commented Sep 22, 2022

@ain-soph thanks for the PR! Let's remove these stray files:

  • torchvision/image.pyd
  • torchvision/_C.pyd

ain-soph (Contributor, Author)

@ain-soph thanks for the PR! Let's remove these stray files:

  • torchvision/image.pyd
  • torchvision/_C.pyd

Oops! Sorry for missing that... I'll change it soon.

vfdev-5 (Collaborator) commented Sep 22, 2022

@ain-soph do you have any working test code to run it and visually check the output? If so, please share it here; it would be helpful. Thanks!

ain-soph (Contributor, Author)

@vfdev-5 Sorry for the delay; I was busy at ICLR for the past week. I'll add some test code tomorrow.

ain-soph (Contributor, Author)

The recent AutoAugment API modification forces me to redesign the API for detection. I'm still working on this now and need to understand how the new framework works.
I'm also a bit confused: is there any real use case for "video detection" compatibility?

datumbox (Contributor) commented Nov 4, 2022

@ain-soph The API should be more stable now. Which specific AA modification broke your code? Perhaps we can help you.

ain-soph (Contributor, Author) commented Nov 4, 2022

@datumbox Sorry for dragging this out so long. The main confusion is how to extract the bounding box from the inputs and, after augmentation, how to insert the modified bounding box back into them.

The current image classification code doesn't have this part, and I can't find anything to use as a reference.
I'm trying to overload the existing _flatten_and_extract_image_or_video and add bounding box support, but I don't know whether that's recommended. I'll implement an ugly-but-working demo first for review. Hopefully it will be finished today.

datumbox (Contributor) commented Nov 4, 2022

@ain-soph Apologies for the rough edges. You are quite brave for digging into the developer part of the API so early. The public part should be fairly stable, and we will continue polishing the internal part to make the experience smoother.

The method you probably want for extracting the bounding boxes is _utils.query_bounding_box(). I think temporarily overriding _flatten_and_extract_image_or_video is fine; an alternative would be to have a similar separate method for bboxes. At this point I would recommend just doing what you need to do, without worrying about the recommended way to implement the transform. We can help you refactor it once you have it in working condition.

@vfdev-5 / @pmeier, any recommendations on your side at this point?
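
(For context, the extract-and-reinsert pattern under discussion looks roughly like the sketch below. It assumes the prototype transforms internals of the time, pytree flattening plus datapoints type checks; the helper names and the import path are illustrative, not actual torchvision API.)

    import torch.utils._pytree as pytree
    from torchvision.prototype import datapoints  # import path assumed

    def flatten_and_extract(sample):
        # Flatten the arbitrary (nested) sample into a flat list plus a spec
        # that lets us rebuild the original structure later.
        flat_inputs, spec = pytree.tree_flatten(sample)
        image_idx = bbox_idx = None
        for idx, inpt in enumerate(flat_inputs):
            if isinstance(inpt, (datapoints.Image, datapoints.Video)) and image_idx is None:
                image_idx = idx
            elif isinstance(inpt, datapoints.BoundingBox):
                bbox_idx = idx
        return flat_inputs, spec, image_idx, bbox_idx

    def unflatten_and_insert(flat_inputs, spec, image, bboxes, image_idx, bbox_idx):
        # Put the augmented image and boxes back into their original slots,
        # then rebuild the original input structure.
        flat_inputs = list(flat_inputs)
        flat_inputs[image_idx] = image
        flat_inputs[bbox_idx] = bboxes
        return pytree.tree_unflatten(flat_inputs, spec)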

ain-soph (Contributor, Author) commented Nov 5, 2022

I've uploaded a version that should work, along with a very simple unit test. Please take a look; I'd appreciate some guidance on how to improve it. Thanks!

There are still a few to-do items on the plate, especially some augmentation ops not yet implemented in the current transforms library: SolarizeAdd, Cutout, BBox_Cutout, Flip_Only_BBoxes, Cutout_Only_BBoxes.

ain-soph changed the title from "[DRAFT] Implement AutoAugment for Detection" to "Implement AutoAugment for Detection" on Mar 9, 2023.
ain-soph (Contributor, Author)

@vfdev-5 @datumbox This should be ready to merge once all the comments I posted above have been reviewed.

ain-soph (Contributor, Author) commented Apr 6, 2023

@vfdev-5 This PR should be ready to merge; please take a final review. I have annotated the code with detailed comments.

Summary:

  • Implement solarize_add and cutout transforms.
  • Add 4 new transform_id values in apply_transform: "Flip", "SolarizeAdd", "Cutout", "BBox_Cutout".
  • In _transform_image_or_video_in_bboxes, convert the bbox format to XYXY.
  • Strictly follow the tensorflow repo settings (a small sketch of the resulting bins follows this list):
    • Use 11 levels from 0 to 10; in the previous AA code, levels run from 0 to 9.
      To implement this, num_bins is replaced by num_bins + 1 to keep the intervals correct.
    • Follow their magnitude settings:
      • "Solarize": slightly different. It now runs from 256/255 down to 0; in the previous AA code, it runs from 1.0 down to 0.
        (Shall we consider still using 1.0 to 0 for consistency?)
      • "Brightness", "Color", "Contrast", "Sharpness": the meaning of the level differs considerably from the previous AA code.
        In the previous AA code, it's [0, 0.9] in 10 pieces with random negation (so level 0 means 0 and level 9 means either 0.9 or -0.9).
        In the current AAdet code, it's [-0.9, 0.9] in 11 pieces (so level 0 means -0.9 and level 10 means 0.9).
      • "TranslateX", "TranslateY": the magnitude ranges differ from the AA settings.
  • Add a second idx in flat_inputs_with_spec to save the index of the bboxes tensor.
  • Use _apply_image_or_video_and_bboxes_transform to process transforms.
    • For "*_Only_BBoxes" transforms and "BBox_Cutout", call _transform_image_or_video_in_bboxes to apply the transform to the sub-images/videos within the bboxes.
    • For normal transforms, apply the transform to both the image/video and the bboxes.
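
To make the level-to-magnitude mapping above concrete, here is a small sketch of the described bins (the variable names are illustrative, not the in-tree definitions):

    import torch

    num_bins = 10  # levels 0..10, i.e. num_bins + 1 sample points

    # "Brightness"/"Color"/"Contrast"/"Sharpness": signed range in 11 pieces,
    # so level 0 -> -0.9, level 5 -> 0.0, level 10 -> 0.9.
    enhance_magnitudes = torch.linspace(-0.9, 0.9, num_bins + 1)

    # "Solarize": the threshold decreases with the level, from 256/255 down to 0.
    solarize_magnitudes = torch.linspace(256.0 / 255.0, 0.0, num_bins + 1)

    print(enhance_magnitudes[0].item(), enhance_magnitudes[10].item())    # -0.9, 0.9
    print(solarize_magnitudes[0].item(), solarize_magnitudes[10].item())  # ~1.004, 0.0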

Several small concerns remaining:

  • Test file
    • The current tests (for both AA and AAdetection) only use the first bbox and image/video in the test dict, which makes the test cases limited:

      def auto_augment_adapter(transform, input, device):
          adapted_input = {}
          image_or_video_found = False
          for key, value in input.items():
              if isinstance(value, (datapoints.BoundingBox, datapoints.Mask)):
                  # AA transforms don't support bounding boxes or masks
                  continue
              elif check_type(value, (datapoints.Image, datapoints.Video, is_simple_tensor, PIL.Image.Image)):
                  if image_or_video_found:
                      # AA transforms only support a single image or video
                      continue
                  image_or_video_found = True
              adapted_input[key] = value
          return adapted_input

    • I'm currently testing all 4 policies, which might be unnecessary.
  • AutoAugmentDetection
    • There are unnecessary type conversions in solarize_add and _apply_image_or_video_and_bboxes_transform.
    • _apply_image_or_video_transform simply calls _apply_transform. Could we merge them?
    • In _apply_transform, I have an isinstance(inpt, datapoints.BoundingBox) condition in the middle, which might make the code inconsistent with the other conditions and difficult to understand (see comments for details).
    • Need to check that the usage of wrap_like is correct (a short note on its semantics follows this list):
      image.wrap_like(image, result)
      chosen_bbox = bboxes.wrap_like(bboxes, bboxes[random_index].unsqueeze(0))
    • flat_inputs_with_spec records 2 idxs in AAdet, which differs from the AA code.
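
For reviewers unfamiliar with wrap_like: it re-attaches datapoint metadata (format, spatial size) to a plain tensor result. A short illustration of the second call above, assuming the prototype datapoints API of the time:

    # `bboxes` is a datapoints.BoundingBox; plain tensor ops on its data lose
    # the metadata, so wrap_like(reference, data) rewraps `data` with the
    # reference's format and spatial size.
    chosen_bbox = bboxes.wrap_like(bboxes, bboxes[random_index].unsqueeze(0))
    # `chosen_bbox` is again a BoundingBox with bboxes.format preserved, so
    # downstream bbox kernels keep working on it.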

ain-soph (Contributor, Author) commented May 8, 2023

@vfdev-5 @datumbox Any feedback from the maintainers? I think all the code work is finished and this PR is ready for review. Please refer to #6609 (comment) for a summary of the work.

vfdev-5 (Collaborator) commented May 9, 2023

@ain-soph thanks for the ping, and sorry for the delay. I'll check the code this week and see how we can finally get it done.

vfdev-5 (Collaborator) commented May 16, 2023

_apply_image_or_video_transform simply calls _apply_transform. Could we merge them?

OK, let's remove _apply_image_or_video_transform and keep only _apply_transform. _apply_image_or_video_transform is a sort of alias indicating that it works only on images/videos.

In _apply_transform, I have an isinstance(inpt, datapoints.BoundingBox) condition in the middle, which might make the code inconsistent with the other conditions and difficult to understand (see comments for details).

Commented in the code. Let's remove all AADet-related stuff from AABase.

flat_inputs_with_spec records 2 idxs in AAdet, which differs from the AA code.

This is probably ok.

Need to check that the usage of wrap_like is correct.

Seems ok.

@ain-soph thanks for the work on this transformation! It is rather nontrivial, especially since we ultimately need to train a model with this transform to see the real benefit of using it.

Let's fix the reported bugs, merge it, and continue with follow-up PRs.

vfdev-5 (Collaborator) left a comment

LGTM, thanks a lot for the PR @ain-soph, and apologies for such a long review.

I pushed the main class AutoAugmentDetection as _AutoAugmentDetection temporarily, until we validate everything and write proper non-regression tests.

I'll comment on the next steps in this PR.

@@ -621,3 +678,309 @@ def forward(self, *inputs: Any) -> Any:
            mix = F.to_image_pil(mix)

        return self._unflatten_and_insert_image_or_video(flat_inputs_with_spec, mix)


class _AutoAugmentDetectionBase(_AutoAugmentBase):
Member:

What's the reason for having _AutoAugmentDetectionBase? It seems to be used by only one child class.

Collaborator:

Later we could simply add other AA techniques for detection.

Member:

Do we plan to?

Collaborator:

There are no clear plans at the moment. Do you want to simplify it and re-add this later?

ain-soph (Contributor, Author)

@vfdev-5 @NicolasHug Feel free to ping me when the maintainers have a plan for the follow-up work. I can continue working on this.


Successfully merging this pull request may close this issue: Implement AutoAugment for Detection.
5 participants