Added MixUp augmentation #1409

Closed
wants to merge 9 commits into from


mikel-brostrom

@mikel-brostrom mikel-brostrom commented Feb 23, 2023

In this PR, I implemented MixUp (https://arxiv.org/pdf/1710.09412v2.pdf).
I appreciate any comments and suggestions.

Usage:

import cv2
import numpy as np
import albumentations as A
from matplotlib import pyplot as plt


imgsz=640

# helper func
def draw_bboxes_on_img(img, bboxes):
    for bbox in bboxes:
        # top left
        x1 = bbox[0]
        y1 = bbox[1]
        # bottom right
        x2 = bbox[0] + bbox[2]
        y2 = bbox[1] + bbox[3]
        c1 = (int(x1), int(y1))
        c2 = (int(x2), int(y2))
        cv2.rectangle(img, c1, c2, (0, 0, 255), 1)

# images have to be of the same size, hence the resizing
image0 = cv2.imread('./images/train2017/000000000139.jpg')
image1 = cv2.imread('./images/train2017/000000000285.jpg')

# define some bogus bboxes
bboxes0 = [[0, 40, 80, 80, '0'], [0, 80, 160, 160, '1']]
bboxes1 = [[0, 160, 320, 320, '2']]

# PIPELINE FOR GENERATING EQUALLY SIZED IMAGES
transform1 = A.Compose(
    [
        # https://albumentations.ai/docs/api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.LongestMaxSize
        A.geometric.resize.LongestMaxSize(imgsz),
        # https://albumentations.ai/docs/api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.PadIfNeeded
        A.geometric.transforms.PadIfNeeded(imgsz, imgsz, border_mode=0, value=(114, 114, 114)),
    ],
    bbox_params=A.BboxParams(format='coco', min_area=20),
)

# MIXUP-ONLY AUGMENTATION
transform2 = A.Compose(
    [
        MixUp(
            alpha=32,
            beta=32
        )
    ],
    bbox_params=A.BboxParams(format='coco', min_area=20),
)

# Get equally sized images without breaking the aspect ratio
transformed = transform1(
    image=image0,
    bboxes=bboxes0,
)

image0_transformed = transformed['image']
bboxe0_transformed = transformed['bboxes']

transformed = transform1(
    image=image1,
    bboxes=bboxes1,
)

image1_transformed = transformed['image']
bboxe1_transformed = transformed['bboxes']

draw_bboxes_on_img(image0_transformed, bboxe0_transformed)
draw_bboxes_on_img(image1_transformed, bboxe1_transformed)
cv2.imwrite('image0_transformed.jpg', image0_transformed)
cv2.imwrite('image1_transformed.jpg', image1_transformed)

# Input the results for the two images into MixUp
transformed = transform2(
    image=image0_transformed,
    image1=image1_transformed,
    bboxes=bboxe0_transformed,
    bboxes1=bboxe1_transformed,
)

image2_transformed = transformed['image']
bboxe2_transformed = transformed['bboxes']

draw_bboxes_on_img(image2_transformed, bboxe2_transformed)
cv2.imwrite('image_transformed.jpg', image2_transformed)

Input images:

Results:

Notes

Note that both images must be the same size for this augmentation to work, which is why the helper pipeline with LongestMaxSize and PadIfNeeded is needed. MixUp works fine together with mosaic. The image size is asserted within the MixUp augmentation, which raises a TypeError if the images aren't the same size.
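
A minimal sketch of the blending step described in these notes, written with plain NumPy. The function name mixup and the choice to concatenate the two box lists are assumptions for illustration and are not taken from the PR's code; only the same-size requirement, the TypeError, and the alpha/beta values come from the text above.

```python
import numpy as np


def mixup(image0, bboxes0, image1, bboxes1, lam):
    """Blend two equally sized images and keep the boxes of both.

    Hypothetical helper for illustration; not the code from this PR.
    """
    if image0.shape != image1.shape:
        # The notes say a TypeError is raised on a size mismatch.
        raise TypeError(f"shape mismatch: {image0.shape} vs {image1.shape}")
    mixed = lam * image0.astype(np.float32) + (1.0 - lam) * image1.astype(np.float32)
    # Assumption: boxes are not interpolated, the two lists are simply joined.
    return mixed.round().astype(image0.dtype), list(bboxes0) + list(bboxes1)


rng = np.random.default_rng(0)
lam = rng.beta(32, 32)  # alpha=32, beta=32 as in the usage example; concentrated around 0.5
img_a = np.full((640, 640, 3), 200, dtype=np.uint8)
img_b = np.full((640, 640, 3), 100, dtype=np.uint8)
mixed, boxes = mixup(img_a, [[0, 40, 80, 80, "0"]], img_b, [[0, 160, 320, 320, "2"]], lam)
```

With a large, symmetric alpha and beta the mixing weight rarely strays far from 0.5, so both images stay clearly visible in the result.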

@mikel-brostrom mikel-brostrom changed the title added MixUp transform added MixUp augmentation Feb 23, 2023
@mikel-brostrom
Author

mikel-brostrom commented Feb 27, 2023

Mixup gives consistent boosts on image classification when using the loss presented in the paper. It also helps on COCO for object detection in the case of large models that tend to overfit. For smaller models this augmentation tends to be detrimental and should be avoided.
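
For classification, the paper mixes the labels with the same coefficient as the images. The sketch below uses made-up probabilities and a fixed lam to show that cross-entropy against the mixed label equals the lam-weighted sum of the two per-label losses. All names and numbers here are illustrative assumptions.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy between predicted probabilities and a (possibly soft) target."""
    return -sum(t * math.log(p) for p, t in zip(probs, target))


lam = 0.7                  # mixing coefficient, normally drawn from a Beta distribution
y_a = [1.0, 0.0, 0.0]      # one-hot label of the first image
y_b = [0.0, 0.0, 1.0]      # one-hot label of the second image
probs = [0.6, 0.1, 0.3]    # made-up model output for the blended image

# Mixed (soft) label: lam * y_a + (1 - lam) * y_b
soft_target = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]

loss_soft = cross_entropy(probs, soft_target)
loss_split = lam * cross_entropy(probs, y_a) + (1.0 - lam) * cross_entropy(probs, y_b)
```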

@i-aki-y
Contributor

i-aki-y commented Mar 2, 2023

@mikel-brostrom I advise making the apply_* functions deterministic. Existing transforms put stochastic operations like np.random.* into get_params or get_params_dependent_on_targets. This practice makes results reproducible and makes debugging and testing easier.
And why not expose alpha (the parameter of the Beta distribution) as an argument instead of hardcoding 32?
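
The split suggested here can be sketched without the library. MixUpSketch below is a hypothetical stand-in, not an albumentations base class and not the PR's implementation; it only shows randomness confined to get_params and a deterministic apply, with alpha and beta as constructor arguments.

```python
import numpy as np


class MixUpSketch:
    """Hypothetical stand-in that mirrors the get_params / apply split."""

    def __init__(self, alpha=32.0, beta=32.0, seed=None):
        self.alpha = alpha
        self.beta = beta
        self.rng = np.random.default_rng(seed)

    def get_params(self):
        # The only place where randomness is drawn.
        return {"lam": float(self.rng.beta(self.alpha, self.beta))}

    def apply(self, image, other, lam):
        # Deterministic: the same inputs and lam always give the same output.
        return lam * image + (1.0 - lam) * other

    def __call__(self, image, other):
        params = self.get_params()
        return self.apply(image, other, **params)


transform = MixUpSketch(alpha=32.0, beta=32.0, seed=0)
a = np.ones((2, 2))
b = np.zeros((2, 2))
first = transform.apply(a, b, lam=0.25)
second = transform.apply(a, b, lam=0.25)
```

Because apply takes lam as an input, a test can pin it to a fixed value and compare outputs exactly.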

@mikel-brostrom
Author

mikel-brostrom commented Mar 2, 2023

Thanks for the feedback @i-aki-y! Branch updated based on your comments:

  • stochastic operations moved to get_params
  • apply is now deterministic
  • alpha and beta defining the distribution are now input arguments

Any suggestions on how to avoid calling apply twice when having multiple targets @i-aki-y?

@mikel-brostrom mikel-brostrom changed the title added MixUp augmentation WIP: added MixUp augmentation Mar 2, 2023
@i-aki-y
Contributor

i-aki-y commented Mar 2, 2023

Albumentations keeps a mapping from each argument key to its associated function in the targets variable:

targets = {
    "image": apply_image,
    "bboxes": apply_bboxes,
    ...
}

The functions specified in targets are executed one by one when you call transform(image=image, bboxes=bboxes).

You have added new entries to the targets variable by using additional_targets:

targets = {
    "image": apply_image,
    "bboxes": apply_bboxes,
    ...
    "image1": apply_image,
    ...
}

This means apply_image will be called twice: once for "image" and once for "image1".
I think this is the expected behavior when additional_targets is used.
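
The dispatch described above can be modelled in a few lines. This is a simplified sketch, not the library's actual internals; run and the calls list are hypothetical names used only to count invocations.

```python
calls = []


def apply_image(value):
    calls.append("apply_image")
    return value


def apply_bboxes(value):
    calls.append("apply_bboxes")
    return value


def run(targets, **data):
    """Call the function registered for each key that was passed in."""
    return {key: targets[key](value) for key, value in data.items()}


targets = {"image": apply_image, "bboxes": apply_bboxes}
# Registering "image1" as an additional target routes it through apply_image too.
targets_with_extra = {**targets, "image1": apply_image}

result = run(targets_with_extra, image="img0", bboxes=[], image1="img1")
```

After this call, calls records apply_image twice, once per registered image key.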

What will happen if you remove the "additional_targets"?

@mikel-brostrom
Author

mikel-brostrom commented Mar 2, 2023

What will happen if you remove the "additional_targets"?

Thanks @i-aki-y! That simple change solved it! Updated the usage example

@mikel-brostrom
Author

mikel-brostrom commented Mar 2, 2023

I consider this ready for review.

Don't want to steal the spotlight here @i-aki-y but should I put a PR up for Mosaic as well? I implemented it using the same approach as in this. Will you update yours? 😄

@mikel-brostrom mikel-brostrom changed the title WIP: added MixUp augmentation Added MixUp augmentation Mar 2, 2023
@i-aki-y
Contributor

i-aki-y commented Mar 3, 2023

I consider this ready for review.

I think you need to consider the case when the input is grayscale, i.e. len(image.shape) == 2.
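
A small sketch of why the grayscale case needs attention, assuming the transform reads the image size from image.shape. The helper image_size is a hypothetical name; whether the PR unpacked three dimensions is not stated in this thread.

```python
import numpy as np


def image_size(image):
    """Return (height, width) for both (H, W) and (H, W, C) arrays."""
    height, width = image.shape[:2]
    return height, width


gray = np.full((480, 640), 100, dtype=np.uint8)
color = np.full((480, 640, 3), 100, dtype=np.uint8)

# Three-way unpacking works for colour images but fails for grayscale ones.
try:
    h, w, c = gray.shape
    unpack_ok = True
except ValueError:
    unpack_ok = False

# Element-wise blending itself does not care about the number of dimensions.
lam = 0.5
mixed_gray = (lam * gray + (1.0 - lam) * np.zeros_like(gray)).astype(np.uint8)
```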

Don't want to steal the spotlight here @i-aki-y but should I put a PR up for Mosaic as well? I implemented it using the same approach as in this. Will you update yours? 😄

Sure, you can make your PR.
But now I do not think introducing auxiliary targets like 'image_cache' and 'image1' is a good approach. Making a new Compose class that handles multi and single-image targets seems more flexible, and we can make the API more straightforward.

@mikel-brostrom
Author

Making a new Compose class that handles multi and single-image targets seems more flexible

Yup, I agree here

@mikel-brostrom
Author

But now I do not think introducing auxiliary targets like 'image_cache' and 'image1' is a good approach

Yes, I read your comment in your MR, which is why I thought I could upload mine. But is a multi-image Compose really needed? You can simply have a single target image and several others as input to complete the mosaic, right? What could we gain from a multi-image Compose, @i-aki-y?

@i-aki-y
Contributor

i-aki-y commented Mar 5, 2023

@mikel-brostrom My mosaic augmentation PR has some difficulties; here are two of them:

  1. We cannot define the whole transform as a single Compose.
    As in your example above, we need to define multiple transforms: one for preprocessing and a second for the mosaic.
    This weakens the advantage of a declarative definition. Ideally, I would like to define the pipeline like this:
Compose([
    A.Normalize(),
    A.Resize(),
    A.Mosaic(),
    A.RandomCrop(),
    A.MixUp(),
    ...
])

But this is difficult because Compose does not know how to handle the additional targets required by Mosaic (and MixUp).

  2. The situation becomes more complicated if the transform introduces an additional bboxes target, as Mosaic did.
    Compose internally applies some pre- and post-processing, but the auxiliary targets introduced by an individual transform bypass these operations.
    So the author of such a transform needs to re-implement the same pre- and post-processing for the additional targets. This is bad practice because such code duplication reduces maintainability.
    Even worse, I have no idea how to implement some Compose features, such as label_fields, because it is difficult to access the parameter information from an individual transform. The same problem exists for KeypointParams.
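
To illustrate the second point, here is a simplified sketch of one such preprocessing step: converting COCO boxes to normalized corner coordinates and back. The helper names, the known_keys set, and the dictionary layout are assumptions for illustration and do not reproduce the library's code. An auxiliary key such as "bboxes1" that the pipeline does not know about is left in pixel units.

```python
def coco_to_normalized(bbox, rows, cols):
    """[x_min, y_min, width, height, *labels] -> normalized [x_min, y_min, x_max, y_max, *labels]."""
    x, y, w, h, *labels = bbox
    return [x / cols, y / rows, (x + w) / cols, (y + h) / rows, *labels]


def normalized_to_coco(bbox, rows, cols):
    """Inverse of coco_to_normalized."""
    x_min, y_min, x_max, y_max, *labels = bbox
    return [x_min * cols, y_min * rows, (x_max - x_min) * cols, (y_max - y_min) * rows, *labels]


rows, cols = 640, 640
data = {"bboxes": [[0, 40, 80, 80, "0"]], "bboxes1": [[0, 160, 320, 320, "2"]]}

# A pipeline that only knows about "bboxes" converts that key and nothing else.
known_keys = {"bboxes"}
preprocessed = {
    key: [coco_to_normalized(b, rows, cols) for b in boxes] if key in known_keys else boxes
    for key, boxes in data.items()
}

# The transform author would have to repeat the conversion for "bboxes1" by hand.
restored = [normalized_to_coco(b, rows, cols) for b in preprocessed["bboxes"]]
```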

There are other minor problems as well. In any case, I think I need to extend Compose to fix issues like those described above.
I have been thinking of this issue for the past few days and have just started writing a PoC. I will let you know when it is ready.

@mikel-brostrom
Author

I see, yes, makes sense to have Compose for multi-input / single-output images.

I have been thinking of this issue for the past few days and have just started writing a PoC. I will let you know when it is ready.

I can try it out when it is done 😄

@thiagoribeirodamotta

I see, yes, makes sense to have Compose for multi-input / single-output images.

I have been thinking of this issue for the past few days and have just started writing a PoC. I will let you know when it is ready.

I can try it out when it is done 😄

Did you manage to make this PR work with the implementation in https://github.com/albumentations-team/albumentations/pull/1420?

Any plans to merge this?

@mikel-brostrom
Author

I will pick this up again if the multi-input / single-output PR gets merged. It is not worth investing time in adapting this if that is not happening.

@octavflorescu

Cool PR, thank you! I am using it with albumentations==0.5.2

@ternaus
Collaborator

ternaus commented Mar 5, 2024

Added in #1549

@ternaus ternaus closed this Mar 5, 2024
6 participants