
Use libjpeg-turbo in CI instead of libjpeg #5941

Closed · wants to merge 9 commits

Conversation

NicolasHug (Member) commented May 4, 2022

This PR makes our CI rely on libjpeg-turbo instead of libjpeg. The main benefit of libjpeg-turbo is decoding speed, but it will also allow us to clean up our tests and make them more robust.

Note: This PR only concerns the CI tests, not the packaging for conda or PyPI. I'll look into that in a separate PR.

We had numerous issues and headaches in the past because of differences between PIL and torchvision and their underlying implementation (libjpeg vs libjpeg-turbo), e.g. #3913, #5910 or #5162.

PIL has shipped with libjpeg-turbo on Windows for a while (python-pillow/Pillow#3833 (comment)), and since PIL 9 the Linux and macOS wheels are also linked against turbo.

Shipping with libjpeg-turbo instead of libjpeg will allow us to be fully aligned with PIL and to greatly simplify and clean up our tests, which had become a bit messy (see the amount of removed code in this PR). Note that internally, we already rely on turbo.
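As a side note, you can check at runtime which JPEG backend a given Pillow install reports. A minimal sketch (the helper name is ours; it assumes `PIL.features.check_feature("libjpeg_turbo")`, which recent Pillow versions expose, and degrades gracefully otherwise):

```python
def pillow_jpeg_backend():
    """Report which JPEG library Pillow appears to be linked against.

    Returns "libjpeg-turbo", "libjpeg", or "unknown" (e.g. Pillow is
    missing, or too old to expose the libjpeg_turbo feature flag).
    """
    try:
        from PIL import features
        turbo = features.check_feature("libjpeg_turbo")
    except Exception:
        return "unknown"
    if turbo is None:  # feature flag not known to this Pillow version
        return "unknown"
    return "libjpeg-turbo" if turbo else "libjpeg"
```

On a PIL 9 wheel this is expected to report "libjpeg-turbo".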

Closes #5184
Closes #5162
Closes #3913
Fixes #5910

@NicolasHug NicolasHug marked this pull request as draft May 4, 2022 13:56
pmeier (Collaborator) commented May 4, 2022

If this works, we can probably close #5184, correct?

NicolasHug (Member, Author)

yup

NicolasHug (Member, Author) commented May 4, 2022

CI looks happy so far (it looks red, but the relevant tests passed). @andfoy @fmassa, do you remember by any chance why we didn't rely on libjpeg-turbo from the start? Perhaps because at the time PIL was not linked against libjpeg-turbo?

andfoy (Contributor) commented May 4, 2022

Yes indeed, that was the reason

pmeier (Collaborator) left a comment

Since CI seems to be happy, I think this is good to go. Thanks Nicolas!

Review thread on test/test_image.py (outdated, resolved)
@NicolasHug NicolasHug changed the title [WIP] Use libjpeg-turbo instead of libjpeg Use libjpeg-turbo instead of libjpeg May 5, 2022
@NicolasHug NicolasHug changed the title Use libjpeg-turbo instead of libjpeg Use libjpeg-turbo in CI instead of libjpeg May 5, 2022
@NicolasHug NicolasHug marked this pull request as ready for review May 5, 2022 08:59
datumbox (Contributor) previously approved these changes May 9, 2022

LGTM, thanks!

@@ -1,13 +1,14 @@
channels:
- pytorch
- defaults
- conda-forge
Contributor comment on the diff:

cc @malfet as previously you've tried to phase out conda-forge from TorchVision.

malfet (Contributor):

There are two issues I'm slightly concerned about:

  • Would it mean that we are testing something different than what we ship?
  • This might unintentionally pull in dependencies that we think exist in conda but do not.

If this is only needed for testing, we can build and host libjpeg-turbo for all the platforms we care about in the pytorch or pytorch-nightly channels.

NicolasHug (Member, Author) replied:

Thanks for your feedback Nikita

Would it mean that we are testing something different than what we ship?

Yes, but only until #5951 is merged. #5951's goal is to ship torchvision with libjpeg-turbo.

It's also not as bad as it sounds: it's fair at this point to assume that libjpeg-turbo's libjpeg.h is compatible with that of libjpeg. In terms of decoding results, both are JPEG-compliant as well.

If this is only needed for testing, we can build and host libjpeg-turbo for all the platforms we care about in pytorch or pytorch-nightly channels

Ultimately we don't want this just for testing: we also want to ship torchvision with libjpeg-turbo in #5951. But if we're happy to host libjpeg-turbo in the pytorch channel, that might make all of this much easier. WDYT @malfet?

Comment on lines +100 to +102
if mode == ImageReadMode.GRAY:
abs_mean_diff = (img_ljpeg.type(torch.float32) - img_pil).abs().mean().item()
assert abs_mean_diff < 1
NicolasHug (Member, Author) commented May 9, 2022:
Not too sure why this happens, TBH. Maybe there are some slight differences in our decoding C++ code? But this is not a regression; in fact, we're now making this check much stricter than it previously was.
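For intuition, the check above boils down to an average absolute pixel difference between the two decoders' outputs. A pure-Python sketch of the same logic (illustrative only; the real test operates on torch tensors via `.abs().mean()`):

```python
def mean_abs_diff(img_a, img_b):
    """Average absolute difference between two images given as
    nested lists of pixel values (same shape assumed)."""
    flat_a = [p for row in img_a for p in row]
    flat_b = [p for row in img_b for p in row]
    assert len(flat_a) == len(flat_b), "images must have the same shape"
    return sum(abs(a - b) for a, b in zip(flat_a, flat_b)) / len(flat_a)

# Two decoders may disagree slightly on individual pixels; the test
# accepts the result as long as the mean difference stays below 1.
decoded_ours = [[12, 40], [200, 255]]
decoded_pil = [[13, 40], [199, 255]]
assert mean_abs_diff(decoded_ours, decoded_pil) < 1
```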

datumbox (Contributor) commented May 9, 2022

If we merge this, we should verify that it has no significant effect on model accuracies. We don't need to test all the models; a sample of 2-3 cases will do.

NicolasHug (Member, Author) commented May 9, 2022

@datumbox I think you meant to comment on #5951? This PR should have zero impact on user-facing features

datumbox (Contributor) commented May 9, 2022

@NicolasHug Yes, I meant to comment on the other PR. :( Sorry for the confusion. Should I post it again on the other one, or are we good?

@datumbox datumbox dismissed their stale review September 26, 2022 10:41

Minor changes detected in accuracy; needs more discussion.

datumbox (Contributor) commented:

@NicolasHug I ran a few benchmark checks to see what the effect of this change would be. I do factor in the argument that PIL changed its backend in 9.x, but I think we should carefully measure the effect on existing pre-trained models, the speed improvements we get by switching, and potential alternative approaches. This is one of those changes that is trivial to make on the code side but may have effects that need to be investigated.

Accuracy Benchmarks

First, some good news: the effect on the models is detectable but very small. Here is how the model accuracies change between JPEG (libjpeg) and JPEG-turbo (libjpeg-turbo):

Model                                     | JPEG (Acc@1 / Acc@5) | JPEG-turbo (Acc@1 / Acc@5)
ResNet50_Weights.IMAGENET1K_V1            | 76.130 / 92.864      | 76.148 / 92.876
ResNet50_Weights.IMAGENET1K_V2            | 80.854 / 95.434      | 80.844 / 95.436
MobileNet_V3_Large_Weights.IMAGENET1K_V1  | 74.044 / 91.322      | 74.056 / 91.318

The above were executed on a single GPU with batch=1 to minimize variations. From the tests your PR removes, we know there are differences between the two implementations (noted also in the PIL release notes you reference), but in practice it's mostly noise. It would be worth checking other model families such as object detection, semantic segmentation, and optical flow to see whether any of them is more sensitive to the change. If the differences remain this small and the speed gains are significant, I think there is a compelling case for introducing this minor BC-breaking change. Another argument for doing so is the stability we would gain across platforms in our unit tests, and the ability to compare PIL and TorchVision image-decoding results more easily.
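For reference, Acc@1 / Acc@5 above are top-1 / top-5 accuracies. A minimal plain-Python sketch of how a single prediction is scored (not the actual benchmark code):

```python
def topk_correct(scores, true_label, k):
    """Return True if true_label is among the k highest-scoring classes.

    scores: per-class scores for one image (index = class id).
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_label in ranked[:k]

# Example: class 2 has the second-highest score, so the prediction
# counts toward Acc@5 but not toward Acc@1.
scores = [0.05, 0.60, 0.25, 0.10]
assert not topk_correct(scores, 2, 1)
assert topk_correct(scores, 2, 5)
```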

Alternative approaches

There are alternative routes we can take, some of which are temporary. None of them is "free", and we should consider them only if there is a massive effect on the accuracy of the existing models:

  1. We can temporarily patch the pil_loader() method in our datasets (until Datasets V2 is stable) to remove PIL's JPEG decoding from the equation and use TorchVision's read_image(), which relies on libjpeg. Here is a quick-and-dirty patch for this that I tested; it works as expected, giving identical accuracy.
index 40d5e26d..982dd2ba 100644
--- a/torchvision/datasets/folder.py
+++ b/torchvision/datasets/folder.py
@@ -243,9 +243,15 @@ IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif", ".tif
 
 def pil_loader(path: str) -> Image.Image:
     # open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835)
+    """
     with open(path, "rb") as f:
         img = Image.open(f)
         return img.convert("RGB")
+    """
+    from torchvision.io.image import read_image
+    from torchvision.transforms.functional import to_pil_image
+    img = read_image(path)
+    return to_pil_image(img).convert("RGB")

It's worth noting that this method is not used by most of the datasets right now (they opt for calling Image.open(f).convert("RGB") directly), but in this scenario we could easily change that without BC-breaking problems. This only partially mitigates the issue, as users who read images directly with PIL (outside of our datasets) would still be affected. At least it gives us an option to ensure that TorchVision doesn't break BC in the components it controls.

  2. Pin PIL temporarily to an 8.x version, reach out to the PIL maintainers, and discuss options (potentially offering a way to switch backends). We can investigate options to address the dependency issues this would cause.

  3. Do nothing to mitigate the issue on the PIL side, flag the problem to our users, and push them to use the Tensor transforms backend while ensuring the accuracy is identical if they switch to it. This might mean aggressively aligning Tensor transforms with PIL on every difference (for example antialias=True), which is an issue of its own and needs to be discussed.

I hope that none of the above will be necessary and that the difference in accuracy will remain minor, allowing us to get away with this minor BC breakage. I would also be OK with merging a PR like this immediately after the upcoming release, to give users plenty of time to detect and flag potential issues. It's highly likely we would need to rerun inference jobs for all models to correct the metadata and documentation, so this needs to be factored into the amount of work this switch entails. It might be worth starting a project doc or an RFC on GitHub to record all of this (instead of having these discussions spread across PRs and issues) and keep track of what we need to do. Happy to chat more about this.

NicolasHug (Member, Author):

Thanks a lot for the benchmarks @datumbox . Just sharing initial thoughts below:

Here is how the model accuracies change with JPEG and JPEG-turbo

Could you share the exact setup you used to compare libjpeg and libjpeg-turbo? Did you rely on decode_jpeg(), or did you instead compare PIL 8 vs PIL 9? I'm wondering whether PIL 8 vs PIL 9 may flag differences that come e.g. from a difference in transform results, on top of the different decoders. I would guess the transforms always give the same results across PIL versions, but who knows?

We can temporarily patch the pil_loader() method on our Datasets (until Datasets V2 is stable), to remove PIL's jpeg-decoding from the equation and use TorchVision's read_image() which relies on LibJPEG

Unfortunately, our read_image() (and particularly decode_jpeg()) isn't as complete as PIL's decoders. For example we currently don't support decoding CMYK -> RGB jpegs, which makes it impossible to decode some of the ImageNet samples #6538 with read_image()

datumbox (Contributor):

@NicolasHug

Could you share the exact setup you used to compare libjpeg and libjpeg-turbo?

I tried to isolate the effects of PIL versions as follows:

  1. I used the patch to completely circumvent PIL and linked against different JPEG backends. So differences on PIL versions shouldn't factor in here.
  2. I compared models (MobileNetV3) for which I know the full training history, which was done prior to the PIL 9 changes. The accuracy reported in our release notes and metadata matches what I get. This way I know the old PIL backend aligns extremely closely with our IO. Worth noting that we had (flaky) tests checking this, so historically, prior to the backend change, we ensured we were aligned.

From the above, I'm fairly confident that our IO read_image() closely aligns with PIL v8 and that my benchmarks isolate only the effect of JPEG vs JPEG-turbo. If you spot anything weird in my logic let me know.

don't support decoding CMYK -> RGB jpegs

That's a good callout. There are probably no CMYK images in the validation split, which is why I didn't get an error during inference. I think CMYK is something we should look into supporting eventually; I added a TODO when I worked on the read_image() API with some references, and it didn't look too hard at the time, but we didn't have the time to do it. This might mean that if we get a substantial accuracy difference in any of the models and have to go with the first alternative option proposed above, we would have more work to do.


Successfully merging this pull request may close these issues.

- fix JPEG reference tests
- Fix jpeg encoding tests on windows
6 participants