YCoCg Heuristics #258

Open
DarkZeros opened this issue Apr 13, 2016 · 8 comments

@DarkZeros

I noticed that some images compress better when the YCoCg transform is disabled. A clear example: https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png

My analysis led me to the correlation between colours as the culprit. YCoCg is useful because it decorrelates the channels for most real-life images, which leads to efficient MANIAC compression later on. Artificial images, however, tend to have pure colours in RGB space, so for them RGB space is more "decorrelated" than YCoCg.

E.g.:
Dice histogram:
http://i.snag.gy/bChdw.jpg

Kodim1 histogram:
http://i.snag.gy/k0ZnR.jpg

I am in the process of writing some heuristics based on image color histograms/correlations. I just opened this issue to let you know about it; any comments or suggestions are welcome.
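To make the decorrelation argument concrete, here is a minimal sketch that measures the average absolute correlation between channel pairs in RGB and in YCoCg. It assumes the well-known reversible YCoCg-R lifting, which may differ in detail from what the encoder does, and the two pixel lists are made-up data, not the dice or Kodak images.

```python
import math

def rgb_to_ycocg_r(r, g, b):
    """Reversible YCoCg-R lifting (assumed variant, not necessarily the encoder's)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def mean_abs_correlation(pixels):
    """Average |correlation| over the three channel pairs of a list of 3-tuples."""
    a, b, c = zip(*pixels)
    pairs = [(a, b), (a, c), (b, c)]
    return sum(abs(pearson(x, y)) for x, y in pairs) / 3

# Hypothetical data: a grey ramp (photo-like, channels move together)
# and pure red on black (artificial, only one channel ever changes).
grey_ramp = [(v, v, v) for v in range(0, 256, 5)]
red_on_black = [(255, 0, 0), (0, 0, 0)] * 10

for name, px in [("grey ramp", grey_ramp), ("red on black", red_on_black)]:
    rgb = mean_abs_correlation(px)
    ycocg = mean_abs_correlation([rgb_to_ycocg_r(*p) for p in px])
    print(f"{name}: RGB {rgb:.2f}, YCoCg {ycocg:.2f}")
```

On the grey ramp the RGB channels are fully correlated and YCoCg removes that; on the pure-red data it is the other way round, because a single RGB channel gets smeared across Y, Co and Cg.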

@jonsneyers
Member

Sounds interesting!
One other reason why we're using YCoCg instead of RGB is to get better progressive decoding (Y/luma gets priority over the CoCg/chroma channels, to get better progressive previews faster). So when staying in RGB space, we should probably reorder to GRB (green first) since that is closer to luma first.

I think lossless WebP optionally does a SubtractGreen transform, which transforms R,G,B to R-G,G,B-G. Maybe that also makes sense to do.
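For reference, a subtract-green transform and its inverse can be sketched as below. This follows the WebP lossless convention of 8-bit channels with wrap-around modulo 256; the function names are made up for this example, and a format that tracks signed channel ranges would not need the masking.

```python
def subtract_green(r, g, b):
    """Forward transform: R,G,B -> R-G, G, B-G (8-bit, wrapping modulo 256)."""
    return (r - g) & 0xFF, g, (b - g) & 0xFF

def add_green(r, g, b):
    """Inverse transform: add green back to the red and blue residuals."""
    return (r + g) & 0xFF, g, (b + g) & 0xFF

# Grey and white pixels collapse to zero in the R and B residual channels.
for pixel in [(255, 255, 255), (128, 128, 128), (200, 100, 50)]:
    coded = subtract_green(*pixel)
    assert add_green(*coded) == pixel
    print(pixel, "->", coded)
```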

@DarkZeros
Author

I analyzed the 3 transforms on dice:
YCoCg : 151 kB
RGB : 140 kB (-7%)
R-G,G,B-G : 136 kB (-10%)

R-G,G,B-G seems good because it eliminates the white dots on the dice from the R and B components.

It seems there is quite a lot of room for improvement here for some specific images, even more than I originally expected!
Maybe we can make a transform that generates a custom-fitted color space for a given image, to ensure maximum compression. The transform would be slightly slower, though, since more math is involved (plus some space in the file to specify which one is being used).

EDIT: I made a mistake in the G,R-G,B-G test, leading to an invalid file. Corrected now!

@jonsneyers
Member

I'm implementing a PlanePermute transformation with optional "subtract one channel from the others". For now, the encoder will not use it by default (just use YCoCg), unless you manually disable YCoCg, then it transforms R,G,B to G,R-G,B-G. Subtracting green usually seems to be a good idea (compared to just keeping RGB in whatever order), and encoding green first makes sense since it contributes most to luma.

If on some particular image, something like B, G-B, R-B turns out to be better, the bitstream will support that (and the decoder can decode it), but the encoder for now just never does that.

(This transform could also be useful to reorder the planes in RGBG images)
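A rough sketch of what a permute-plus-subtract transform and its inverse could look like on a single pixel. The function names and the `order` parameter are made up for illustration and say nothing about how the bitstream actually stores the permutation; values are kept as signed integers rather than wrapped.

```python
def permute_subtract(pixel, order, subtract=True):
    """Reorder channels by `order`; optionally subtract the new first channel from the rest."""
    p = [pixel[i] for i in order]
    if subtract:
        p = [p[0]] + [c - p[0] for c in p[1:]]
    return tuple(p)

def inverse_permute_subtract(coded, order, subtract=True):
    """Undo permute_subtract: add the first channel back, then restore the original order."""
    p = list(coded)
    if subtract:
        p = [p[0]] + [c + p[0] for c in p[1:]]
    out = [0] * len(order)
    for pos, src in enumerate(order):
        out[src] = p[pos]
    return tuple(out)

pixel = (200, 100, 50)  # (R, G, B)
for order in [(1, 0, 2), (2, 1, 0)]:  # G,R-G,B-G and B,G-B,R-B
    coded = permute_subtract(pixel, order)
    assert inverse_permute_subtract(coded, order) == pixel
    print(order, "->", coded)
```

With `order=(1, 0, 2)` this gives G, R-G, B-G; with `order=(2, 1, 0)` it gives B, G-B, R-B.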

jonsneyers added a commit that referenced this issue Apr 14, 2016
plane permute/subtract (cf #258)
some decoder fuzzing
@psykauze
Contributor

Yeah, the plane ordering might be useful for RGGB purposes. Subtracting green might be useful there too, but I'm not sure, because the average values of the R, G and B planes are not the same.

@jonsneyers
Member

So, the format supports everything we want: RGB, YCoCg, channel permutation and subtracting one channel from the others. From the decoder / format spec perspective, this is "done".

From the encoder perspective, there's the matter of deciding which color transform(s) to apply; currently it just always does YCoCg except if you specify -Y, then it does G (R-G) (B-G). It is computationally too expensive to try all possible combinations and fully encode the image, then select the best one. Maybe some simpler heuristic is possible...

@jonsneyers
Member

@DarkZeros Do you have an idea of a simple encoder heuristic to decide which color transform to use? Suppose it's OK to actually do a few different transforms (e.g. YCoCg, G(R-G)(B-G), GRB), but it's not OK to do any real encoding. Is there some kind of cheap way to estimate the amount of decorrelation? Perhaps just compare the total of the absolute values of the pixels in the 'chroma' channels, the idea being that values close to zero mean better compression / better decorrelation?
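The "sum of absolute chroma values" idea could be sketched like this. The candidate set, the names, and the YCoCg-R lifting are assumptions made for the example; in a real encoder the pixels would probably come from a subsampled image.

```python
def ycocg_r(r, g, b):
    """Reversible YCoCg-R lifting (assumed variant)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    return t + (cg >> 1), co, cg

def g_rg_bg(r, g, b):
    return g, r - g, b - g

def grb(r, g, b):
    return g, r, b

CANDIDATES = {"YCoCg": ycocg_r, "G(R-G)(B-G)": g_rg_bg, "GRB": grb}

def chroma_cost(pixels, transform):
    """Total |value| in the second and third channel after the transform."""
    return sum(abs(c1) + abs(c2) for _, c1, c2 in (transform(*p) for p in pixels))

def pick_transform(pixels):
    """Return the candidate with the lowest chroma cost, plus all costs."""
    costs = {name: chroma_cost(pixels, f) for name, f in CANDIDATES.items()}
    return min(costs, key=costs.get), costs

# Hypothetical sample: one white and one pure red pixel.
best, costs = pick_transform([(255, 255, 255), (255, 0, 0)])
print(best, costs)
```

One caveat with this metric: plain GRB keeps R and B as raw, non-centered values, so it is penalized regardless of how compressible those planes are. Comparing residuals after prediction instead of raw values would avoid that bias.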

@DarkZeros
Author

@jonsneyers Hi, I haven't had enough time to invest in this project, sorry for that...
My original idea was to use the correlation between channels as a quick metric. I could even downscale the image before computing it.
But after some coding and testing I found that the correlation between channels gives no real insight into whether the MANIAC encoder will produce a better or worse result. This is because MANIAC encodes each channel separately.
My current idea is to measure each channel's self-correlation (how predictable it is) and select the transformation that produces the highest self-correlation on average. But I am unsure how to proceed, since it will depend on whether the encoding is interlaced or not, and on the pixel predictor used.
Also, the heuristic has to be fast, and that is the real issue.
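One cheap way to put a number on "self-correlation" is the lag-1 autocorrelation of each channel along a scanline, averaged over the three channels. This is only a sketch under that assumption: it ignores interlacing and the actual pixel predictor, which are exactly the open questions above, and the scanline is made-up data.

```python
import math

def lag1_autocorr(xs):
    """Pearson correlation between a sequence and itself shifted by one sample."""
    a, b = xs[:-1], xs[1:]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    if saa == 0 and sbb == 0:
        return 1.0  # constant channel: treated as trivially predictable
    if saa == 0 or sbb == 0:
        return 0.0
    return sab / math.sqrt(saa * sbb)

def mean_self_correlation(scanline, transform):
    """Average lag-1 autocorrelation over the three transformed channels."""
    channels = zip(*(transform(*p) for p in scanline))
    return sum(lag1_autocorr(list(c)) for c in channels) / 3

def identity(r, g, b):
    return r, g, b

def g_rg_bg(r, g, b):
    return g, r - g, b - g

# Hypothetical scanline: a smooth grey ramp.
scanline = [(v, v, v) for v in range(0, 100, 10)]
for name, f in [("RGB", identity), ("G(R-G)(B-G)", g_rg_bg)]:
    print(name, round(mean_self_correlation(scanline, f), 3))
```

The candidate transform with the highest average would be selected. Whether that ranking matches real encoded sizes is untested here.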

Maybe the simplest way forward is to perform a downscaled R0 encoding as a quick test of which transform is best.

@DarkZeros
Author

DarkZeros commented Jan 31, 2017

@jonsneyers OK, I found a very good YCoCg heuristic: the predictor heuristic.
I modified the predictor heuristic (using plane 1) to output its raw data, then compared RGB vs YCoCg.
I also encoded the files both ways (-Y and normal), then plotted a graph of the predictor guess vs the real file size.

The predictor seems to be accurate most of the time, and when it is not, it is in the "close cases".
When the predictor value is clear-cut, the decision is always correct and also makes the file much smaller.
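The general shape of such a predictor-based guess might look like the sketch below: predict each sample from its neighbours, total the absolute residuals per color transform, and pick the smaller total. Everything here is an assumption made for illustration, not the encoder's actual heuristic: the median predictor, summing over all planes by default (the comment above used plane 1), the YCoCg-R lifting, and the tiny test images.

```python
def ycocg_r(r, g, b):
    """Reversible YCoCg-R lifting (assumed variant)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    return t + (cg >> 1), co, cg

def identity(r, g, b):
    return r, g, b

TRANSFORMS = {"YCoCg": ycocg_r, "RGB": identity}

def residual_cost(plane):
    """Sum of |actual - predicted| using median(left, top, left + top - topleft)."""
    cost = 0
    for y, row in enumerate(plane):
        for x, v in enumerate(row):
            if x == 0 and y == 0:
                pred = 0
            elif y == 0:
                pred = row[x - 1]          # first row: only the left neighbour exists
            elif x == 0:
                pred = plane[y - 1][x]     # first column: only the top neighbour exists
            else:
                left, top = row[x - 1], plane[y - 1][x]
                grad = left + top - plane[y - 1][x - 1]
                pred = sorted((left, top, grad))[1]
            cost += abs(v - pred)
    return cost

def guess_transform(image, planes=(0, 1, 2)):
    """Return the transform with the lowest total residual cost, plus all costs."""
    costs = {}
    for name, f in TRANSFORMS.items():
        coded = [[f(*p) for p in row] for row in image]
        costs[name] = sum(
            residual_cost([[px[k] for px in row] for row in coded]) for k in planes
        )
    return min(costs, key=costs.get), costs

# Hypothetical 2x2 images: a red/white checker and a grey gradient.
red, white = (255, 0, 0), (255, 255, 255)
checker = [[red, white], [white, red]]
gradient = [[(10, 10, 10), (20, 20, 20)], [(20, 20, 20), (30, 30, 30)]]
print(guess_transform(checker))
print(guess_transform(gradient))
```

Passing `planes=(1,)` restricts the comparison to plane 1 only.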

[image: graph of predictor guess vs real file size]

My dataset is clearly biased towards YCoCg (Kodak dataset + icons); even so:
Total size YCoCg: 269.706 MB
Total size RGB: 272.365 MB
Total size Predictor: 269.661 MB

Images best YCoCg: 1365/2181
Images best RGB: 966/2181
Images best Predictor: 1870/2181

I will keep investigating the best predictor and will submit a pull request.
