Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JPEG augmentation #8316

Merged
merged 17 commits into from
Mar 18, 2024
Merged

Add JPEG augmentation #8316

merged 17 commits into from
Mar 18, 2024

Conversation

gau-nernst
Copy link
Contributor

@gau-nernst gau-nernst commented Mar 13, 2024

Fixes #8290

cc @vfdev-5

Copy link

pytorch-bot bot commented Mar 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/8316

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ae595be with merge base 2ba586d (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@gau-nernst
Copy link
Contributor Author

@NicolasHug For PIL backend, it's actually very straight-forward to also support other image formats by simply passing format="WEBP" for example. Do you want to make this more generic? Something like

# `format` can be JPEG, WEBP, or whatever PIL support
def compression(image, format, quality):
    ...

# `format` can be a list of string, though the same `quality` value does not mean the same quality across formats.
class Compression(nn.Module)
    def __init__(self, format, quality):
        ...

For PyTorch backend, only JPEG can be supported natively for now, since torchvision only supports JPEG. We can say this is the limitation for PyTorch backend in the docs? (of course we can also try going through PIL for non-JPEG formats)

@gau-nernst
Copy link
Contributor Author

Some questions:

  • Should jpeg() checks if the tensor is on CPU and is uint8? If we don't check, encode_jpeg() will raise error, so that is also fine I think.
  • Do I need to handle torchscript somehow? The tests on my local machine don't fail, so it seems like encode_jpeg() and decode_jpeg() is torchscript-compatible?

@gau-nernst gau-nernst marked this pull request as ready for review March 15, 2024 13:09
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the excellent PR with such short notice @gau-nernst ! We really appreciate the high quality contribution.

I have a few comments below but it looks great already. If you're not able to address those before the release branch cut on Monday then no worries, just let me know and I'll do it myself.

To answer a few of your questions above:

For PIL backend, it's actually very straight-forward to also support other image formats by simply passing format="WEBP" for example. Do you want to make this more generic?

We try to provide the same offering for PIL images and tensors, so I think it's preferable to just stick to JPEG here (note that we could also have png, since we have png decoding for tensors). That would make the transition from PIL to tensor backend smoother, and for users who are relying on PIL only, it's not too hard to write a custom transform to achieve that anyway.

Should jpeg() checks if the tensor is on CPU and is uint8? If we don't check, encode_jpeg() will raise error, so that is also fine I think.

Great point - it's fine to rely on encode_jpeg() for the error, but we should clarify these expectations in the docstring

Do I need to handle torchscript somehow? The tests on my local machine don't fail, so it seems like encode_jpeg() and decode_jpeg() is torchscript-compatible?

Yes we need the functional to be torchscript-compatbile. encode_jpeg and decode_jpeg already support torchscript and the tests are passing, so there's nothing more you need to do here :)

torchvision/transforms/v2/_augment.py Outdated Show resolved Hide resolved
@@ -317,3 +317,42 @@ def _transform(self, inpt: Any, params: Dict[str, Any]) -> Any:
return output
else:
return inpt


def _setup_quality(quality: Union[int, Sequence[int]]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is fairly short and only used in a single place, so let's just inline it in the class'__init__?

torchvision/transforms/v2/_augment.py Outdated Show resolved Hide resolved
torchvision/transforms/v2/_augment.py Outdated Show resolved Hide resolved
torchvision/transforms/v2/_augment.py Outdated Show resolved Hide resolved
test/test_transforms_v2.py Outdated Show resolved Hide resolved

@pytest.mark.parametrize("quality", [5, 75])
@pytest.mark.parametrize("color_space", ["RGB", "GRAY"])
def test_functional_image_correctness(self, quality, color_space):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't hurt to keep this, but it seems mostly a subset of the correctness test in test_transform_image_correctness so maybe we can get rid of it? I'll let you decide on that!

test/test_transforms_v2.py Outdated Show resolved Hide resolved

@_register_kernel_internal(jpeg, torch.Tensor)
@_register_kernel_internal(jpeg, tv_tensors.Image)
@_register_kernel_internal(jpeg, tv_tensors.Video)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with the other kernels we should expose a jpeg_video kernel that calls into jpeg_image().

torchvision/transforms/v2/functional/_augment.py Outdated Show resolved Hide resolved
gau-nernst and others added 2 commits March 16, 2024 00:23
@gau-nernst
Copy link
Contributor Author

@NicolasHug I fixed the comments you left. If there are still issues, I can work on this tomorrow (midnight in my country now). If you want to merge this PR by today, you can finish it.

We try to provide the same offering for PIL images and tensors, so I think it's preferable to just stick to JPEG here (note that we could also have png, since we have png decoding for tensors).

Understand, I totally agree. From what I understand, PNG encoding is lossless (unless you use limited color palette), so there is no point doing PNG compression as an augmentation. Video-based image encoding, like WebP, might be useful to simulate low bitrate video frames (low bitrate JPEG does not quite look the same), but it will be for future PRs :).

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I'll merge once the CI is green. Thank you so much for the great PR @gau-nernst !

@NicolasHug NicolasHug merged commit 924b162 into pytorch:main Mar 18, 2024
81 checks passed
@gau-nernst gau-nernst deleted the jpeg_augmentation branch March 18, 2024 12:35
facebook-github-bot pushed a commit that referenced this pull request Mar 21, 2024
Reviewed By: vmoens

Differential Revision: D55062770

fbshipit-source-id: 926a1eea4f55cb0b3c1a4f379088c1505ec70479

Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Co-authored-by: Nicolas Hug <nh.nicolas.hug@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: done
Development

Successfully merging this pull request may close these issues.

Add Jpeg Compression augmentation
3 participants