New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix cmyk_to_rgb causing off by one rounding errors. #1136
Conversation
Running cmyk_to_rgb(&[0, 0, 0, 254]) produces the RGB color [0, 0, 0] when it should in fact produce RGB [1, 1, 1]: R = 255 * (1 - C/255) * (1-K/255) R = 255 * (1-0/255) * (1-254/255) R = 255 * 1 * 1/ 255 = 1. This bug occurs with a range of other values as well - 159 / 255 of the values [0, 0, 0, 0..255] are affected. The fix involves reducing the number of multiplications and divisions, and also yields a ~25% speedup (results are from randomly generated images). ``` 250 px Orig took 0.0000020s Fast took 0.0000015s 2500 px Orig took 0.0000154s Fast took 0.0000121s 25000 px Orig took 0.0002961s Fast took 0.0002334s 250000 px Orig took 0.0017928s Fast took 0.0014527s 2500000 px Orig took 0.0255831s Fast took 0.0213677s ```
If it's a non-breaking change, it should be against |
Sorry - that's a pretty obvious feature to have, and I should've checked before closing and opening a new PR. I do think clarifying the purpose of master/next in the Contributors section would be helpful for future contributors. I'm not sure if you would consider this change as breaking, since it does change output for certain values (from incorrect results to correct results). |
Ah, 'breaking' as in a SemVer sense of changes to the interface. I don't think there are many value changes that would be considered breaking unless there was an explicit promise of additional guarantees in the documentation. (Maybe in exceptional circumstances such as format detection). Fixing the correctness of color conversion is not. |
It's also possible to further optimize the code like so, but the difference is marginal (~ 1-5% on my computer) and requires specifying "target-cpu=native", otherwise performance regressions occur.
|
I wanted to remark about how we might want to do the fix now, and performance later but then it occurred to me: Can we not avoid going through floating point entirely and stick to precise integer arithmetic? Your reformulation of the conversion should make it much more clear how to avoid the floating point steps by treating all intermediate computations as fractional arithmetic over the fixed denominator (e.g. |
Regardless: Can you push some tests and the benchmark as a separate commit so one can verify before and after the change? |
This test checks all possible colours produced by CYMK (in one channel), and checks that the RGB conversion is correct.
This is equivalent, but in general, division is slower than multiplication for all types, which is what makes my solution a lot faster (uses 1 div per step instead of 4), while the multiplies are pretty cheap. |
Please benchmark this before claiming it. Those are fixed integer divisions of 16-bit length which will get optimized to a multiplication with the inverse (or a shift). The floating point solution has three separate u8->fp->u8 conversions. Furthermore, an integer implementation has a much better chance of being compiled to SIMD instructions. |
The following function
Benchmarked, best of 10, built with
As we can see, the current implementation using f32 and multiplies is faster than both the original solution and proposed u16 division solution. |
Can we try a seriously optimized variant? Note the elided bounds checks if you check the assembly and the resulting use of SIMD (the main point to prefer short integer arithmetic). (Prediction: 4× to 16× speedup). pub fn cmyk_to_rgb_int(input: &[u8]) -> Vec<u8> {
let count = input.len()/4;
let mut output = vec![0; 3*count];
let in_pixels = input[..4*count].chunks_exact(4);
let out_pixels = output[..3*count].chunks_exact_mut(3);
for (pixel, outp) in in_pixels.zip(out_pixels) {
let c = u16::from(pixel[0]);
let m = u16::from(pixel[1]);
let y = u16::from(pixel[2]);
let k = 255u16 - u16::from(pixel[3]);
// CMY -> RGB
let r = (((255 - c)*k)/255) as u8;
let g = (((255 - m)*k)/255) as u8;
let b = (((255 - y)*k)/255) as u8;
outp[0] = r;
outp[1] = g;
outp[2] = b;
}
output
} |
But you have previously written:
And now we only have a tenth of the pixels in more time:
What are we even benchmarking here? |
Sorry, didn't include the release flag - it seems that the performance is in favour of your implementation, so I pushed that - correct benchmarks below:
|
What hardware are you running? On two machines I could test it on, the floating point turned out faster by ~5%. Those are an |
I'm running an i5-8350U - the integer solution is 5% faster when I compile with |
Then it may just flipflop depending on memory timings and exact CPU. I personally find integer code easier to reason about (don't have to remember rounding rules etc.) but if performance is so close, then it is a matter of taste. Let's just get it in and maybe we can follow up on a more exact perf later. |
I've added tests to cover the colors that the previous version failed to get correct. From what I've tried, I'd have to make cmyk_to_rgb public in order to add code in bench/ to benchmark it, which I don't know if I should do, or add a |
You can add a benchmark within the usual tests as well, just need to put it behind a mod test {
#[cfg(feature = "benchmarks")]
#[bench]
fn cmyk(b: &mut test::Bencher) {
…
}
} |
I've added benchmarks - everything should be ready to merge now. |
Thank you, and special appreciation for compairing all the various implementations. |
Running cmyk_to_rgb(&[0, 0, 0, 254]) produces the RGB color [0, 0, 0] when it should in fact produce RGB [1, 1, 1]:
R = 255 * (1 - C/255) * (1-K/255)
R = 255 * (1-0/255) * (1-254/255)
R = 255 * 1 * 1/ 255 = 1.
This bug occurs with a range of other values as well - 159 / 255 of the values [0, 0, 0, 0..255] are affected.
The fix involves reducing the number of multiplications and divisions, and also yields a ~15-25% speedup (results are from randomly generated images, using
cargo build --release
).I license past and future contributions under the dual MIT/Apache-2.0 license,
allowing licensees to chose either at their option.
(Resubmitted PR against next branch, not master - it's not specified which branch to specify pull requests against.)