Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

to_luma methods can have floating point issues #1214

Closed
CrackedP0t opened this issue May 10, 2020 · 7 comments · Fixed by #1651
Closed

to_luma methods can have floating point issues #1214

CrackedP0t opened this issue May 10, 2020 · 7 comments · Fixed by #1651

Comments

@CrackedP0t
Copy link

CrackedP0t commented May 10, 2020

In the rgb_to_luma and bgr_to_luma methods, the floating point arithmetic can result in incorrect results. For example:

use image::Pixel;

fn main() {
    let pixel = image::Rgb::from([13, 13, 13]);
    
    println!("{:?}", pixel.to_luma()); // Prints `Luma([12])`; should print `Luma([13])`
}

This is because in rgb_to_luma, l is equal to 12.99999904632568359375 before being converted back to a u8.

@CrackedP0t
Copy link
Author

CrackedP0t commented May 10, 2020

I've come up with two ways to fix this: one that keeps using floating point math, and one that uses integers.

The first just rounds the luma value up before converting it to its original type:

const SRGB_LUMA: [f32; 3] = [0.2126, 0.7152, 0.0722];

#[inline]
fn rgb_to_luma<T: Primitive>(rgb: &[T]) -> T {
    let l = SRGB_LUMA[0] * rgb[0].to_f32().unwrap()
        + SRGB_LUMA[1] * rgb[1].to_f32().unwrap()
        + SRGB_LUMA[2] * rgb[2].to_f32().unwrap();

    NumCast::from(l.round()).unwrap()
}

The second instead uses integer math, avoiding all the jank that comes with floats as well as running faster (I benchmarked it as running about twice as fast). The only primitives that are used in pixels are u8s and u16s, and both will fit inside the u32 when multiplied in this way.

const SRGB_LUMA: [u32; 3] = [2126, 7152, 722];
const SRGB_LUMA_DIV: u32 = 10000;

#[inline]
fn rgb_to_luma<T: Primitive>(rgb: &[T]) -> T {
    let l = SRGB_LUMA[0] * rgb[0].to_u32().unwrap()
        + SRGB_LUMA[1] * rgb[1].to_u32().unwrap()
        + SRGB_LUMA[2] * rgb[2].to_u32().unwrap();
    NumCast::from(l / SRGB_LUMA_DIV).unwrap()
}

I'd love feedback on these ideas!

@HeroicKatora
Copy link
Member

I don't like the use of floating point multiplication here as well. A fraction based-approach would definitely make the finer details more obvious and avoid these subtle inconsistencies. (I am reminded of how unexpectedly difficult it is to find the mid-point of an interval). It would certainly be possible to do (more) precise arithmetic with floating point here as well. However, considering a) the extra observation that it already is slower than an integer method and b) that there is no color space information and this transformation c) that it was not intended to be 100% precise, there specialized libraries such as palette, this tradeoff does not seem worth it.

@HeroicKatora
Copy link
Member

HeroicKatora commented May 10, 2020

One problem I can see with the integer idea: The function is templated over arbitrary T: Primitive and indeed this parameter is exposed when it is used in the implementation of FromColor<Rgb<T>> for Luma<T>. In particular, the instantiation T = f32 is currently valid. However the integer implementation would panic on some inputs (NaN and Inf) and produce results with much less accuracy.

In the long term this may be solved with specialization, or by implementing the trait only for a small internal set of type parameters, but in the short term both of these are infeasible. The former requires nightly, the latter is not possible in a point release in any case. It's something to keep in mind for the next major release where we plan to address color spaces in some capacity.

Which leaves us with the rounding solution or another form of floating point precision correction.

@aschampion
Copy link
Contributor

I'm also unhappy with the float conversion here, but settled for that for the reasons @HeroicKatora mentions to do with needing to support different primitive types and depths in the absence of specialization. If we can find a trick to avoid it for integral types, that'd be great.

A problem with the rounding solution for float types: conventionally in images, floats are in [0, 1]. Rounding would binarize the output.

@kaj
Copy link
Member

kaj commented May 10, 2020

Back in #720 i provided an Enlargeable type for doing integer math on pixel values, so that <u8 as Enlargeable>::Larger would resolve to u32 and <u32 as Enlargeable>::Larger would be u64. Maybe that would be useful here as well? Enlargeable could get a f32 implementation, where Larger could be f64.

Addendum: I'm not sure Enlargeable is a good name for that type. What it really is is a means to accumulate a few hundred instances of the base type.

kaj added a commit to kaj/image that referenced this issue May 11, 2020
This is an attempt to fix image-rs#1214, using the `Enlargeable` trait to
select a suitable type for the calculation in `rgb_to_luma` and
`bgr_to_luma`.

Integer pixel types will use suitable integer pixel types for the
calculation, while floating point pixel types will use floating point
types.

This PR also provides the `Enlargeable` trait for all `Primitive`
types (luckily, `Primitive` is not implemented for 128 bit numeric
types).
@kaj
Copy link
Member

kaj commented May 11, 2020

@CrackedP0t , is the benchmark you used in the repository? I would be interested to see if #1215 is as fast as the code you tested for integer pixel types (and no slower than the current code for floating-point pixel types).

@CrackedP0t
Copy link
Author

Here's a Gist with the code I used

With this benchmark, the old float method takes ~70 ms, and both the int and enlarge methods take ~20 ms, with the enlarge method usually running a few ms faster.

@HeroicKatora HeroicKatora added this to To do in Version 0.24 Feb 13, 2021
HeroicKatora pushed a commit to HeroicKatora/image that referenced this issue Jan 12, 2022
This is an attempt to fix image-rs#1214, using the `Enlargeable` trait to
select a suitable type for the calculation in `rgb_to_luma` and
`bgr_to_luma`.

Integer pixel types will use suitable integer pixel types for the
calculation, while floating point pixel types will use floating point
types.

This PR also provides the `Enlargeable` trait for all `Primitive`
types (luckily, `Primitive` is not implemented for 128 bit numeric
types).

The `Color` implementation of all library-defined structs is then
restricted to a subset of Primitive channel types. This is necessary to
allow its conversion operations to utilize the private `Enlargeable`.
Notably however we _relax_ the type itself to not utilize any trait
bound. It's a simple wrapper around an array now.
Version 0.24 automation moved this from To do to Done Jan 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Version 0.24
  
Done
4 participants