Panic on some real-world images at mcu.rs:354:48 #11
Comments
Thanks for the quick fix! |
Welcome. Also, while at it: there are some other quality issues I have seen crop up. While running tests, could you try reading the files, writing them back, and opening some of them to see if there are any defects? Currently I'm using a function like this to spot defects:

```rust
use std::fs::OpenOptions;
use std::io::Write;

use zune_jpeg::{ColorSpace, Decoder};

/// Decode `in_file` with zune-jpeg, re-encode it with mozjpeg, and write the
/// result to `out_file` so it can be inspected for visual defects.
fn write(in_file: &str, out_file: &str) {
    std::panic::catch_unwind(|| {
        let mut d = Decoder::new();
        d.set_num_threads(1).unwrap();
        d.set_output_colorspace(ColorSpace::RGBX);
        let pixels = d.decode_file(in_file).unwrap();

        let mut comp = mozjpeg::Compress::new(mozjpeg::ColorSpace::JCS_EXT_RGBX);
        comp.set_size(d.width() as usize, d.height() as usize);
        comp.set_mem_dest();
        comp.start_compress();
        assert!(comp.write_scanlines(&pixels));
        comp.finish_compress();

        let jpeg_bytes = comp.data_to_vec().unwrap();
        let mut v = OpenOptions::new()
            .write(true)
            .create(true)
            .open(out_file)
            .unwrap();
        v.write_all(&jpeg_bytes).unwrap();
    })
    .unwrap();
}
```

which uses mozjpeg for the re-encoding. Since you have a larger corpus, you might be able to identify such defects more quickly than I do. But if it's not possible, that's fine; I still appreciate the bug reports |
Ah, looking for decoding differences is a great idea! It's something I wanted to dabble in. I think the simplest way to do that would be decoding the same file with both libraries and comparing the results. If you provide me with a snippet that decodes the input JPEG file and writes it to a lightly compressed PNG or to BMP, I'll handle the rest. |
Also, I don't think this is actually fixed; I am still seeing a lot of panics, just at a different location |
One thing to note is that libjpeg-turbo needs to be bit-identical to libjpeg, hence whatever optimizations it does should uphold that. JPEG was probably not designed with multithreading in mind; even though I have it working here, it's mainly magic and sacrifices. Specifically, this affects the output by +2/-2 relative to libjpeg-turbo, with no way of reducing that without the library becoming single-threaded |
Indeed, I am not looking for them to match up perfectly. This is never going to happen with lossy encoding formats anyway. The way I've done this with |
Looking into it. Seems to be an issue with images with odd-numbered dimensions (these usually require padding bytes) |
I have rigged a comparison of image backed by jpeg-decoder against imagemagick using the following code. First, I have a converter from JPEG to PNG using image:

```rust
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    use image::io::Reader as ImageReader;
    let input = std::env::args().nth(1).unwrap();
    let output = std::env::args().nth(2).unwrap();
    let img = ImageReader::open(input)?.decode()?;
    img.save(output)?;
    Ok(())
}
```

I invoke it from a Linux shell script to convert from whatever the input format is to PNG (BMP would be faster but would lose transparency) and then run the imagemagick `compare` command against the original:

```shell
#!/bin/sh
set -e

input="$1"
output="$(mktemp --tmpdir result_XXXXXXXXXXXXX.png)"
trap 'rm -f "$output"' EXIT

target/release/image-convert "$input" "$output" || echo "Failed to decode $input" 1>&2
similarity=$(compare -quiet -metric RMSE "$input[0]" "$output" /dev/null 2>&1) || true
echo "$similarity $input"
```

Then run it in parallel to speed things up and capture the output:

```shell
fd '\.jpe?g' | nice ionice parallel ./image-compare.sh > ~/similarities.txt 2>~/errors.txt
```

A simple `sort -n` will then show the most diverging images.

I have only run it on a subset of my corpus so far, but it seems to be holding up surprisingly well. I have tested it for decoding errors quite extensively before, but not for incorrect decoding, and I'm happy to see that it's not happening. |
While looking at the jpeg-decoder code: it sticks close to libjpeg-turbo; even the SIMD implementations were done by someone who has probably looked into libjpeg-turbo (which is good work, in my opinion). So I'd expect that if it's decoding, it's decoding correctly.

Another thing you can try is different image configurations, like using the cjpeg tool to test for similarities. I.e. start with a BMP file, change the sampling factors on the command line using `cjpeg -sample (value)` (valid values are 1x1, 2x2 and 4x2; not so sure, off the top of my head), then decode those as you do now and check the similarity score. That should test the up-sampling algorithms.
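That suggestion could be scripted along these lines (a dry-run sketch; the file names and quality setting are made up, and the leading `echo` only prints the commands so nothing runs until you remove it):

```shell
# Dry-run sketch: print the cjpeg invocations that would re-encode input.bmp
# with different chroma sampling factors, for feeding into the comparison rig.
# Remove the leading "echo" to actually run cjpeg (from libjpeg-turbo).
for sample in 1x1 2x1 2x2; do
    echo cjpeg -quality 90 -sample "$sample" -outfile "test_${sample}.jpg" input.bmp
done
```

Each generated JPEG can then be decoded and scored with the same similarity pipeline as before.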
|
This is a good idea. I should probably be doing the same and try to make it a CI script. The only issue is that I'm really tied up at the moment, hence fixes and improvements happen over the weekend
|
An update: we multi-thread image decoding, meaning we internally chunk the image (primitively). The issue is that the chunk can sometimes be wrong, especially when the image is small and has dimensions not divisible by 8 (or 32, depending on the sampling factors). The solution I currently have is to over-allocate the output buffer.
The issue with this is that it becomes an overhead for everyone else: allocating more means we have a larger memory footprint, which bothers everyone (me especially). So I'm still trying to think of a good solution to this |
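The padding scheme described above can be sketched as follows (the names here are illustrative, not zune-jpeg's actual API):

```rust
// Sketch: round the image dimensions up to the MCU grid so that worker
// threads can always write whole blocks, then crop the extra rows and
// columns away after decoding. Names are illustrative, not zune-jpeg's API.
fn padded_dimensions(width: usize, height: usize, mcu: usize) -> (usize, usize) {
    let round_up = |v: usize| (v + mcu - 1) / mcu * mcu;
    (round_up(width), round_up(height))
}

fn main() {
    // A 67x45 image with 2x2 subsampling has 16x16-pixel MCUs, so the
    // decode buffer would be 80x48; the last 13 columns and 3 rows are
    // discarded at the end.
    assert_eq!(padded_dimensions(67, 45, 16), (80, 48));
}
```

The trade-off discussed in this thread is exactly the extra `(padded - actual)` rows and columns that every image would carry during decoding.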
How big will the memory overhead be in practice? Over-allocating up to an extra 32 pixels in each dimension doesn't sound too bad. |
It depends. We want to over-allocate a whole row strip with a height of 8-32 pixels and a width equal to the image's width. |
So in bytes that would be, at worst, 32 pixels * 32 bits per pixel * image width. Let's say the image is small, just 64x64 pixels, so we over-allocate it by 50%. That's ~16 kB for the base image and an extra ~8 kB we've over-allocated. I don't think anyone's going to notice an extra 8 kB of memory usage, and that's at a resolution where the over-allocation overhead is at its largest (50%). For larger images it's going to be more like 5%, and even that occurs only on a handful of files! Besides, if that extra memory is never written to, the OS will not even provision the memory pages for it. So I'm really not convinced that this kind of overhead is a problem at all. |
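A quick back-of-the-envelope check of those numbers (assuming RGBX output at 4 bytes per pixel and up to 32 extra rows; the figures are illustrative, not measurements):

```rust
// Worst-case over-allocation: up to 32 extra rows of RGBX (4-byte) pixels.
fn extra_bytes(width: usize, extra_rows: usize) -> usize {
    extra_rows * width * 4
}

fn main() {
    // 64x64 image: 16384-byte base buffer, at worst 8192 extra bytes (50%).
    assert_eq!(64 * 64 * 4, 16_384);
    assert_eq!(extra_bytes(64, 32), 8_192);

    // 1920x1080 image: under 3% overhead even in the worst case.
    let base = 1920 * 1080 * 4;
    let extra = extra_bytes(1920, 32);
    assert!(extra * 100 / base < 3);
}
```

The relative overhead shrinks with image height, since the extra strip is a fixed number of rows.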
I'm not sure if that's also the defined behavior on Windows / macOS. Let's hope it is. |
This panic can still happen even after the fix, here's an image that triggers it: waterfront.jpg Tested on commit e34e5bd (latest as of this writing) |
Out of those images sent as unwrap-panics.gz, are there any failing in that archive?
|
As of commit cff242b, none of the files in unwrap-panics.gz fail anymore. |
The attached image triggers the following panic in zune-jpeg:
It happens on 88 images out of ~5500 images I have tested. It seems to be the only panic to occur on this dataset.
1130214.jpg
The code to reproduce it is the same as in #10
Tested on commit 1f92bac