This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Panic on some real-world images at mcu.rs:354:48 #11

Closed
Shnatsel opened this issue Jul 2, 2022 · 18 comments

Comments

@Shnatsel
Contributor

Shnatsel commented Jul 2, 2022

The attached image triggers the following panic in zune-jpeg:

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', zune-jpeg/src/mcu.rs:354:48

It happens on 88 images out of ~5500 images I have tested. It seems to be the only panic to occur on this dataset.

1130214.jpg

The code to reproduce it is the same as in #10

Tested on commit 1f92bac

@Shnatsel
Contributor Author

Shnatsel commented Jul 2, 2022

Thanks for the quick fix!

@etemesi254
Copy link
Owner

Welcome.

Also, while you're at it, there are some other quality issues I have seen crop up, like
#12

While running tests, could you read the files, write them back out, and open some of them to check for visual defects?

Currently I'm using a function like this to spot defects:

use std::fs::OpenOptions;
use std::io::Write;

fn write(in_file: &str, out_file: &str) {
    // Catch panics so one bad file doesn't abort the whole run.
    std::panic::catch_unwind(|| {
        let mut d = Decoder::new();
        d.set_num_threads(1).unwrap();
        d.set_output_colorspace(ColorSpace::RGBX);

        // Decode with zune-jpeg...
        let pixels = d.decode_file(in_file).unwrap();

        // ...and re-encode with mozjpeg (libjpeg-turbo) for visual inspection.
        let mut comp = mozjpeg::Compress::new(mozjpeg::ColorSpace::JCS_EXT_RGBX);
        comp.set_size(d.width() as usize, d.height() as usize);
        comp.set_mem_dest();
        comp.start_compress();

        assert!(comp.write_scanlines(&pixels));
        comp.finish_compress();

        let jpeg_bytes = comp.data_to_vec().unwrap();
        let mut v = OpenOptions::new()
            .write(true)
            .create(true)
            .open(out_file)
            .unwrap();
        v.write_all(&jpeg_bytes).unwrap();
    })
    .unwrap();
}

which uses the mozjpeg crate, backed by libjpeg-turbo, to write the files back out.

Since you have a larger corpus, you might be able to identify such defects more quickly than I can.

But if that's not possible, it's fine; I still appreciate the bug reports.

@Shnatsel
Contributor Author

Shnatsel commented Jul 2, 2022

Ah, looking for decoding differences is a great idea! It's something I wanted to dabble in with jpeg-decoder as well, but never got around to it.

I think the simplest way to do that would be decoding the file with zune-jpeg and writing it back to disk in some lossless format, like PNG or BMP, and then running imagemagick difference analysis on the result compared to the input jpeg. Imagemagick would decode the original with libjpeg-turbo internally. I've used a similar setup to test resvg once and it worked well.

If you provide me with a snippet that decodes the input JPEG file and writes it to lightly compressed PNG or to BMP, I'll handle the rest.

@Shnatsel
Contributor Author

Shnatsel commented Jul 2, 2022

Also I don't think this is actually fixed, I am still seeing a lot of panics, just at mcu.rs:356:48 this time. Here's a larger sample of images triggering this so you can test it on more than one example: unwrap-panics.tar.gz

@etemesi254
Owner

One thing to note is that libjpeg-turbo needs to be bit-identical to libjpeg, so whatever optimizations it does must uphold that.

JPEG was probably not designed with multithreading in mind; even though I have it working here, it's mainly magic and sacrifices.

Specifically:

  1. The horizontal-vertical (HV) upsampler in this library and the one in libjpeg-turbo will never produce the same data, and HV-sampled images are the kind most commonly found in the wild.
    This is because libjpeg-turbo uses the whole downsampled image to do up-sampling, while I use image chunks.

  2. The color converter is different, with a rounding bias of +1/-1 relative to libjpeg-turbo.

This puts the output within +2/-2 of libjpeg-turbo's, with no way of reducing that without the library becoming single-threaded.
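To illustrate the second point, here is a hedged sketch (not zune-jpeg's or libjpeg-turbo's actual code) of how two fixed-point YCbCr-to-RGB conversions that differ only in their rounding step stay within one value of each other per channel; the function names and the `FIX_1_402` constant are illustrative:

```rust
// Two hypothetical fixed-point computations of the red channel from Y/Cr
// that differ only in rounding bias, sketching why two correct decoders
// can disagree by a small constant without either being "wrong".
const FIX_1_402: i32 = 91881; // 1.402 * 2^16, the usual libjpeg-style scaling

fn red_round_nearest(y: i32, cr: i32) -> i32 {
    // round-to-nearest: add half the divisor before the shift
    (y + ((FIX_1_402 * (cr - 128) + 32768) >> 16)).clamp(0, 255)
}

fn red_truncate(y: i32, cr: i32) -> i32 {
    // truncating shift: can land one below round-to-nearest
    (y + ((FIX_1_402 * (cr - 128)) >> 16)).clamp(0, 255)
}

fn main() {
    let mut max_diff = 0;
    for y in 0..256 {
        for cr in 0..256 {
            let d = (red_round_nearest(y, cr) - red_truncate(y, cr)).abs();
            max_diff = max_diff.max(d);
        }
    }
    // The per-channel bias stays within 1; stacked with a different
    // upsampler it can reach the +2/-2 range described above.
    println!("max difference: {max_diff}");
}
```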

@etemesi254 etemesi254 reopened this Jul 2, 2022
@Shnatsel
Contributor Author

Shnatsel commented Jul 2, 2022

Indeed, I am not looking for them to match up perfectly. This is never going to happen with lossy encoding formats anyway.

The way I've done this with resvg was by calculating the similarity score for all the images, then sorting them by similarity, and looking at the most diverging images. That worked great - actual error cases had very low similarity scores compared to the rest, it was easy to tell when to stop looking from the similarity scores alone.
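The scoring step can be sketched in a few lines of plain Rust; this is a minimal illustration of the idea (RMSE over raw pixel buffers, then sort by divergence), with made-up buffer contents and file names rather than anything from either library's API:

```rust
// Sketch of the similarity-ranking idea: compute RMSE between two decoded
// RGB buffers and sort files by score so the most-diverging images surface
// first. The buffers and file names here are illustrative.
fn rmse(a: &[u8], b: &[u8]) -> f64 {
    assert_eq!(a.len(), b.len());
    let sum_sq: f64 = a
        .iter()
        .zip(b)
        .map(|(&x, &y)| {
            let d = x as f64 - y as f64;
            d * d
        })
        .sum();
    (sum_sq / a.len() as f64).sqrt()
}

fn main() {
    // Two hypothetical decodes of the same file that differ slightly.
    let ours = [100u8, 120, 140, 160];
    let reference = [101u8, 119, 140, 162];

    let mut scored = vec![("a.jpg", rmse(&ours, &reference)), ("b.jpg", 0.0)];
    // Most-diverging first; actual error cases stand far above the pack.
    scored.sort_by(|x, y| y.1.partial_cmp(&x.1).unwrap());
    println!("{scored:?}");
}
```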

@etemesi254
Owner

etemesi254 commented Jul 2, 2022

Also I don't think this is actually fixed, I am still seeing a lot of panics, just at mcu.rs:356:48 this time. Here's a larger sample of images triggering this so you can test it on more than one example: unwrap-panics.tar.gz

Looking into it.

Seems to be an issue with images with odd-numbered dimensions (these usually require padding bytes).
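For context, the padding requirement comes from JPEG storing whole MCUs: a decoder must round each dimension up to the MCU size and crop back afterwards. A minimal sketch of that rounding (the function name is illustrative, not zune-jpeg's API):

```rust
// JPEG stores whole MCUs, so a decoder rounds each dimension up to the
// MCU size (8, or up to 16/32 with chroma subsampling) and then crops
// back down to the declared image size.
fn padded_dimension(dim: usize, mcu: usize) -> usize {
    // round up to the next multiple of `mcu`
    (dim + mcu - 1) / mcu * mcu
}

fn main() {
    // e.g. a 331x333 image with 2x2 subsampling decodes in 16-pixel MCUs
    println!("{}", padded_dimension(331, 16)); // -> 336
    println!("{}", padded_dimension(333, 16)); // -> 336
    println!("{}", padded_dimension(64, 8)); // already aligned -> 64
}
```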

@Shnatsel
Contributor Author

Shnatsel commented Jul 3, 2022

I have rigged up a comparison of the image crate (backed by jpeg-decoder) against imagemagick using the following code. First, I have a converter from JPEG to PNG using image:

use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    use image::io::Reader as ImageReader;
    let input = std::env::args().nth(1).unwrap();
    let output = std::env::args().nth(2).unwrap();
    let img = ImageReader::open(input)?.decode()?;
    img.save(output)?;
    Ok(())
}

I invoke it from a Linux shell script to convert from whatever the input format is to PNG (BMP would be faster but would lose transparency) and then run the imagemagick compare command against the original:

#!/bin/sh

set -e

input="$1"
output="$(mktemp --tmpdir result_XXXXXXXXXXXXX.png)"
trap 'rm -f "$output"' EXIT

target/release/image-convert "$input" "$output" || echo "Failed to decode $input" 1>&2
similarity=$(compare -quiet -metric RMSE "$input[0]" "$output" /dev/null 2>&1) || true
echo "$similarity $input"

Then run it in parallel to speed things up and capture the output:

fd '\.jpe?g' | nice ionice parallel ./image-compare.sh > ~/similarities.txt 2>~/errors.txt

A simple sort -n will then show the most diverging images.

I have only run it on a subset of my corpus so far, but it seems to be holding up surprisingly well. I have tested it for decoding errors quite extensively before, but not for incorrect decoding, and I'm happy to see that it's not happening.

@etemesi254
Owner

etemesi254 commented Jul 3, 2022 via email

@etemesi254
Owner

etemesi254 commented Jul 3, 2022 via email

@etemesi254
Owner

etemesi254 commented Jul 30, 2022

An update.

1. We multi-thread image decoding, meaning we internally chunk the image (primitively).

The issue is that the chunking can sometimes be wrong, especially when the image is small and has dimensions not divisible by 8 (or 32, depending on the sampling factors).

The solution I currently have for this is:

  1. Just allocate a bigger buffer.

The issue with this is that it becomes an overhead for everyone else, because allocating more means we have a larger memory footprint, which bothers everyone (and me especially).

So I'm still trying to think of a good solution to this.
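A minimal sketch of the "just allocate bigger" option: size the output buffer for a whole number of MCU rows and expose only `width * height` pixels to the caller. The function name and parameters are illustrative, not zune-jpeg's actual API:

```rust
// Size the output buffer for a whole number of MCU rows so the last,
// partially-filled row of MCUs has somewhere to land, then expose only
// width * height pixels to the caller.
fn output_buffer_len(width: usize, height: usize, channels: usize, mcu_height: usize) -> usize {
    // round the height up to the next multiple of the MCU height
    let padded_height = (height + mcu_height - 1) / mcu_height * mcu_height;
    width * padded_height * channels
}

fn main() {
    // 1920x1080 RGBX output with 16-pixel MCU rows: 1080 rounds up to 1088,
    // so the over-allocation is 8 rows, i.e. 8 * 1920 * 4 = 61440 bytes,
    // under 1% of the ~8.3 MB image.
    let exact = 1920 * 1080 * 4;
    let padded = output_buffer_len(1920, 1080, 4, 16);
    println!("overhead: {} bytes", padded - exact); // 61440
}
```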

@Shnatsel
Contributor Author

How big would the memory overhead be in practice? Over-allocating by up to an extra 32 pixels in each dimension doesn't sound too bad.

@etemesi254
Owner

It depends.

We want to over-allocate a whole row with a height of 8-32 pixels and a width equal to the image's width.

@Shnatsel
Contributor Author

So in bytes that would be, at worst, 32 pixels * 32 bits per pixel * image width. Let's say the image is small, just 64x64 pixels, and we over-allocate it by 50%. That's ~16kB for the base image and an extra ~8kB we've over-allocated.

I don't think anyone's going to notice an extra 8kB of memory usage, and that's at resolutions where the over-allocation overhead is very large - 50%. For larger images it's going to be more like 5% - and even that occurs only on a handful of files!

Besides, if that extra memory is never written to, then the OS will not even provision the memory pages for it. So I'm really not convinced that this kind of overhead is a problem at all.

@etemesi254
Owner

I'm not sure if that's also the defined behavior on Windows / macOS.

Let's hope it is.

@Shnatsel
Contributor Author

This panic can still happen even after the fix, here's an image that triggers it: waterfront.jpg

Tested on commit e34e5bd (latest as of this writing)

etemesi254 added a commit that referenced this issue Jul 31, 2022
@etemesi254
Owner

etemesi254 commented Oct 11, 2022 via email

@Shnatsel
Contributor Author

As of commit cff242b, none of the files in unwrap-panics.tar.gz cause a panic. Neither does the waterfront.jpg image attached later.
