Image Optimization #5548
*Total -- 5,465.43kb -> 4,908.52kb (10.19%)
/skimage/data/palette_gray.png -- 0.91kb -> 0.26kb (71.55%)
/skimage/data/horse.png -- 16.24kb -> 6.93kb (57.33%)
/skimage/data/astronaut.png -- 773.00kb -> 412.79kb (46.6%)
/skimage/data/palette_color.png -- 1.00kb -> 0.60kb (39.74%)
/skimage/data/bw_text.png -- 8.15kb -> 5.20kb (36.22%)
/skimage/data/chessboard_RGB.png -- 1.10kb -> 0.73kb (33.9%)
/doc/source/gitwash/pull_button.png -- 12.59kb -> 8.70kb (30.92%)
/doc/source/gitwash/forking_button.png -- 12.79kb -> 8.91kb (30.29%)
/skimage/data/phantom.png -- 3.31kb -> 2.34kb (29.33%)
/doc/source/skips/_static/skip-flowchart.png -- 12.62kb -> 9.00kb (28.7%)
/skimage/data/clock_motion.png -- 57.41kb -> 43.30kb (24.57%)
/skimage/data/moon.png -- 49.00kb -> 42.77kb (12.71%)
/skimage/data/logo.png -- 175.51kb -> 156.38kb (10.9%)
/skimage/data/page.png -- 46.56kb -> 41.62kb (10.62%)
/skimage/data/microaneurysms.png -- 4.83kb -> 4.38kb (9.47%)
/skimage/data/chelsea.png -- 234.88kb -> 214.01kb (8.88%)
/skimage/data/no_time_for_that_tiny.gif -- 4.33kb -> 3.97kb (8.43%)
/skimage/data/green_palette.png -- 1.67kb -> 1.55kb (6.74%)
/doc/source/gitwash/branch_dropdown.png -- 15.93kb -> 14.99kb (5.9%)
/skimage/data/hubble_deep_field.jpg -- 515.57kb -> 487.74kb (5.4%)
/skimage/data/coffee.png -- 455.77kb -> 431.74kb (5.27%)
/doc/source/user_guide/data/elevation_map.jpg -- 73.92kb -> 70.04kb (5.25%)
/skimage/data/retina.jpg -- 263.25kb -> 251.98kb (4.28%)
/skimage/data/rocket.jpg -- 109.89kb -> 106.36kb (3.21%)
/skimage/data/ihc.png -- 466.71kb -> 456.50kb (2.19%)
/doc/source/user_guide/data/denoise_viewer_window.png -- 89.45kb -> 88.20kb (1.4%)
/skimage/registration/tests/data/OriginalX75Y75.png -- 60.83kb -> 60.03kb (1.32%)
/skimage/registration/tests/data/OriginalX-130Y130.png -- 40.00kb -> 39.48kb (1.32%)
/skimage/data/grass.png -- 212.79kb -> 210.01kb (1.3%)
/doc/source/themes/scikit-image/static/img/logo.png -- 44.33kb -> 43.76kb (1.28%)
/skimage/registration/tests/data/OriginalX130Y130.png -- 40.81kb -> 40.30kb (1.23%)
/skimage/registration/tests/data/TransformedX75Y75.png -- 58.84kb -> 58.13kb (1.2%)
/skimage/data/motorcycle_left.png -- 629.59kb -> 622.69kb (1.1%)
/skimage/data/motorcycle_right.png -- 625.36kb -> 618.77kb (1.05%)
/skimage/registration/tests/data/TransformedX-130Y130.png -- 43.19kb -> 42.74kb (1.04%)
/skimage/data/coins.png -- 74.05kb -> 73.31kb (1%)
/skimage/registration/tests/data/TransformedX130Y130.png -- 39.58kb -> 39.32kb (0.64%)
/skimage/data/gravel.png -- 189.69kb -> 188.99kb (0.37%)
Signed-off-by: ImgBotApp <ImgBotHelp@gmail.com>
[ImgBot] Optimize images
Thanks @vardaan-raj. We should not modify any images under the The majority of the image size in this project, in terms of the website, comes from the images that are generated for the gallery examples at runtime during the documentation builds. We already run image compression (via optipng) there (see #4800). |
I was surprised this passes tests, but it is because the checks here: scikit-image/skimage/data/__init__.py Line 210 in 4b1986e
and here: scikit-image/skimage/data/__init__.py Line 220 in 4b1986e
do not raise an error if the file is found but the SHA256 does not match. It will instead just move on to the next source, eventually pulling the file from the |
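The fallback behaviour described in this comment can be sketched as follows. This is a minimal illustration, not the code in skimage/data/__init__.py: the function names `sha256_of` and `first_matching_source` and the idea of a plain list of source folders are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA256 digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def first_matching_source(filename, expected_hash, sources):
    """Return the first file in `sources` whose hash matches `expected_hash`.

    A file that exists but has the wrong hash is skipped without an error,
    which mirrors the behaviour described above. Returns None when no
    source holds a matching file. (Hypothetical helper for illustration.)
    """
    for directory in sources:
        candidate = Path(directory) / filename
        if not candidate.is_file():
            continue  # not present in this source
        if sha256_of(candidate) != expected_hash:
            continue  # present but modified: no error, try the next source
        return candidate
    return None
```

With this shape, a recompressed file in the first source is silently ignored as long as a later source still holds a copy with the registered hash, which would explain why the tests passed.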
I agree with @grlee77 about JPEG, which is inherently lossy, so any re-compression or transcoding will produce artifacts or subtle changes. There are a few .jpg files in here which should be reverted. However, the majority of these are PNG, an inherently lossless format that uses lossless compression under the hood (like 7zip or FLAC, not like MP3). For PNGs it is safe to experiment with compression. Though it would require changing the checksums, swapping in a more optimally compressed PNG file yields the exact same data, and this should be simple to verify by loading them in pairs. @vardaan-raj, for just the PNG files here, could you sum the pre/post sizes so we can understand the potential space saved? A brief description of the PNG compression technique used would also be appreciated. |
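The lossless point can be illustrated with the standard library alone, since PNG stores its pixel data as a DEFLATE (zlib) stream. The sketch below is an illustration under stated assumptions: `pixels` is a made-up byte string standing in for raw image data, not a real PNG.

```python
import zlib

# Hypothetical stand-in for raw pixel data.
pixels = bytes(range(256)) * 64

# Compress the same bytes with a fast and with a thorough setting.
fast = zlib.compress(pixels, level=1)
best = zlib.compress(pixels, level=9)

# The compressed sizes may differ, but both streams decode to the same bytes.
assert zlib.decompress(fast) == pixels
assert zlib.decompress(best) == pixels
print(len(pixels), len(fast), len(best))
```

This covers only the compression layer. A PNG optimizer may also change the bit depth, palette, or colour type of a file, so comparing the decoded images in pairs remains the actual check.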
If you go to the above link, you can see that the total image file size has been reduced by 10%. To check the size reduction of each individual image, click on details to see a chart showing the initial sizes alongside the new compressed sizes. |
The total image size was initially 5,465.43kb; after compression it came down to 4,908.52kb, a 10.19% reduction. |
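The per-format totals requested earlier in the thread could be computed from the ImgBot report with a short script. The sketch below is an assumption-laden example: `REPORT` holds only four lines copied from the report above, and the helper names `totals_by_extension` and `percent_saved` are made up for illustration.

```python
import re

# A few lines copied from the ImgBot report above (not the full list).
REPORT = """\
/skimage/data/palette_gray.png -- 0.91kb -> 0.26kb (71.55%)
/skimage/data/horse.png -- 16.24kb -> 6.93kb (57.33%)
/skimage/data/hubble_deep_field.jpg -- 515.57kb -> 487.74kb (5.4%)
/skimage/data/retina.jpg -- 263.25kb -> 251.98kb (4.28%)
"""

# path.ext -- <before>kb -> <after>kb
LINE = re.compile(r"^(\S+\.(\w+)) -- ([\d,.]+)kb -> ([\d,.]+)kb")


def totals_by_extension(report):
    """Sum the before/after sizes (in kb) per file extension."""
    totals = {}
    for line in report.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # skips the "*Total" line and anything unparseable
        _, ext, before, after = match.groups()
        before = float(before.replace(",", ""))
        after = float(after.replace(",", ""))
        old_before, old_after = totals.get(ext, (0.0, 0.0))
        totals[ext] = (old_before + before, old_after + after)
    return totals


def percent_saved(before, after):
    """Relative size reduction as a percentage."""
    return 100.0 * (before - after) / before
```

Applied to the overall figures, `percent_saved(5465.43, 4908.52)` reproduces the reported 10.19% after rounding to two decimals; running `totals_by_extension` on the full report would give the PNG-only sum.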
In one case a PNG shrank by > 70% and a number of others by > 30%, which is not negligible. @JDWarner's suggestion sounds good to me: Let me know if the above is not clear or you need help with either of these suggestions. |
Only PNG files have been optimised by img-bot |
How do I update the checksums? |
How do I update the checksums? On Linux I went to the
and for the files in
You would need to change the existing dictionary entries in this variable to match the new values: scikit-image/skimage/data/_registry.py Line 43 in 4b1986e
If there were hundreds of values to update it would probably be worth writing a custom script to help with this, but it is probably easiest in this case to just paste the values from the terminal into the file individually. |
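For reference, a custom script of the kind mentioned here might look like the sketch below. It is only an illustration: the helper names and the "data/" key prefix are assumptions about how the registry dictionary is laid out, so check them against skimage/data/_registry.py before pasting anything.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def registry_entries(folder, pattern="*.png", prefix="data/"):
    """Map `prefix + filename` to the SHA256 of each matching file.

    The "data/" key prefix is an assumption about the registry layout.
    """
    return {
        prefix + path.name: file_sha256(path)
        for path in sorted(Path(folder).glob(pattern))
    }
```

Looping over `registry_entries("skimage/data").items()` and printing each pair as a quoted `"key": "value",` line would give text that can be compared with, or pasted into, the existing dictionary.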
Hello @vardaan-raj! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2021-08-31 05:21:42 UTC |
I have just updated the registry to include the new hash values for the modified PNG images. Can you help me understand why the checks are failing? |
I have updated each individual modified png file in the registry with the new checksum value |
Can someone please update me on what to do next? |
Many lines containing SHA256 values exceed 80 characters. It will be less readable if we break up the lines.
See https://pillow.readthedocs.io/en/stable/handbook/concepts.html#modes for the mode definitions. skimage/io/test_pil.py's test_palette_is_gray requires this file to have mode P, not L.
There is a mixture of real failures here and ones that are due to the fetcher comparing hashes in this PR to data that was fetched from main. Those SHA256 errors will resolve themselves if we merge. There is one file,
However, there are also a couple of more subtle errors here! Specifically, the "mode" of several of the PNG files (see Pillow docs on mode) has changed. Two of these changes cause test suite failures: Those are the only two mode changes that cause a test suite error, but there are others that got changed from RGB to type L or 1 as well: measured mode changes
I am not sure we should be making any of those changes even if they are not causing a test suite error. Modes were determined by navigating to the data folder and running the following script:
from glob import glob
from PIL import Image
# List every PNG and GIF in the current folder together with its Pillow mode.
img_files = sorted(glob('*.png') + glob('*.gif'))
for f in img_files:
    print(f, Image.open(f).mode)
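When Pillow is not at hand, the colour type of a PNG can also be read directly from its IHDR chunk, which sits at a fixed offset right after the 8-byte file signature. The sketch below is an approximation: `png_mode` is a hypothetical helper, and its mapping to Pillow-style mode names is a rough assumption (it ignores, for example, 16-bit grayscale, which Pillow reports differently).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# PNG colour type -> rough Pillow-style mode name (approximate mapping).
COLOR_TYPES = {0: "L", 2: "RGB", 3: "P", 4: "LA", 6: "RGBA"}


def png_mode(header):
    """Guess a Pillow-style mode from the first 26 bytes of a PNG file."""
    if len(header) < 26 or header[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG header")
    if header[12:16] != b"IHDR":
        raise ValueError("IHDR chunk not found where expected")
    # IHDR data starts at byte 16: width, height, bit depth, colour type.
    width, height, bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    if color_type == 0 and bit_depth == 1:
        return "1"  # 1-bit grayscale
    return COLOR_TYPES[color_type]
```

Reading the first 26 bytes of a file opened in binary mode and passing them to `png_mode` gives a quick check of whether a palette image (colour type 3, mode P) was rewritten as grayscale (colour type 0, mode L or 1).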
Fix test suite failures after image compression
I have updated the SHA256 for no_time_for_that_tiny.gif in the registry. |
*Total -- 176.42kb -> 156.64kb (11.21%)
/skimage/data/palette_gray.png -- 0.91kb -> 0.26kb (71.55%)
/skimage/data/logo.png -- 175.51kb -> 156.38kb (10.9%)
Signed-off-by: ImgBotApp <ImgBotHelp@gmail.com>
[ImgBot] Optimize images
What's the next step? |
Revert "[ImgBot] Optimize images"
I think this is okay now. My only question is if we should be even more conservative and not change any of the files under skimage/data.
This one will not pass the CI tests, given how the data gets pulled from main, so the SHA256 is going to mismatch until we merge this.
I have approved it now. Let's see if we can get a review from another maintainer before merging. |
I don't think we can make this change, unless we keep the old copies in the repository as well (otherwise older versions of scikit-image won't be able to fetch their data). So, as much as I appreciate the idea behind this PR, I think we'll have to close. @grlee77 Happy to discuss if you think this is too conservative. |
Hmm, fetch points to the version-specific branch. Those release branches would not be modified by this, so it should still work after the change. I guess I did not consider what the behavior was prior to the introduction of pooch, though? Is that what you are referring to? Overall, agree that this may not be worth the risk of breaking things. |
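The version-specific fetching described here can be sketched as below. Everything in it is an assumption made for illustration: the rule that development versions read from main while releases read from a ref named after the version, the helper names, and the placeholder URL template are not taken from the scikit-image source.

```python
def data_ref(version):
    """Pick the git ref a given package version would fetch data from.

    Assumption: development versions read from "main" and releases
    read from a ref named "v<version>".
    """
    return "main" if "dev" in version else "v" + version


# Hypothetical URL template; the real base URL may differ.
TEMPLATE = "https://example.invalid/raw/{ref}/skimage/{filename}"


def data_url(version, filename, template=TEMPLATE):
    """Build the download URL for `filename` as seen by `version`."""
    return template.format(ref=data_ref(version), filename=filename)
```

Under these assumptions, a released version keeps resolving to the files stored at its own ref, so recompressing images on main would only affect development installs and future releases.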
@grlee77 If you think the change is worth it, that it won't cause problems (I was thinking of the registry itself, but sounds like that's not a real concern), and you'd like to shepherd it through, please go for it. |
I have simply optimized the images by compressing them, without compromising their quality, to make them more space efficient.