# YCoCg Heuristics #258

Open
opened this Issue Apr 13, 2016 · 8 comments

3 participants

### DarkZeros commented Apr 13, 2016

I noticed that some images compress better with the YCoCg transform disabled. A clear example: https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png

My analysis led me to the correlation between colours as the culprit. YCoCg is useful because it decorrelates the channels of most real-life images, which leads to efficient MANIAC compression later on. Artificial images, however, tend to use pure colours in RGB space, so for them RGB space is more "decorrelated" than YCoCg. For example:

Dice histogram: http://i.snag.gy/bChdw.jpg
Kodim1 histogram: http://i.snag.gy/k0ZnR.jpg

I am in the process of writing some heuristics based on image colour histograms/correlations. I just opened this issue to let you know about it; any comments or suggestions are welcome.
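
To illustrate the point about pure colours, here is a minimal Python sketch. It assumes the standard reversible YCoCg-R lifting steps, which may differ in detail from the exact variant FLIF implements, and the function names are made up for this example.

```python
def rgb_to_ycocg_r(r, g, b):
    """Forward YCoCg-R lifting transform (integer, exactly reversible)."""
    co = r - b
    t = b + (co >> 1)      # >> floors, also for negative values
    cg = g - t
    y = t + (cg >> 1)
    return (y, co, cg)


def ycocg_r_to_rgb(y, co, cg):
    """Inverse of rgb_to_ycocg_r: undo the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return (r, g, b)


# Pure red: one non-zero channel in RGB, three after the transform.
pure_red = rgb_to_ycocg_r(255, 0, 0)       # (63, 255, -127)

# Mid grey: three non-zero channels in RGB, only luma after the transform.
mid_grey = rgb_to_ycocg_r(128, 128, 128)   # (128, 0, 0)
```

A saturated colour spreads its energy over all three planes after the transform, while a grey pixel collapses to luma only. That matches the observation that photographic content benefits from YCoCg and flat synthetic colours may not.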
Member

### jonsneyers commented Apr 13, 2016

 Sounds interesting! One other reason why we're using YCoCg instead of RGB is to get better progressive decoding (Y/luma gets priority over the CoCg/chroma channels, to get better progressive previews faster). So when staying in RGB space, we should probably reorder to GRB (green first) since that is closer to luma first. I think lossless WebP optionally does a SubtractGreen transform, which transforms R,G,B to R-G,G,B-G. Maybe that also makes sense to do.

### DarkZeros commented Apr 14, 2016

I analyzed the three transforms on the dice image:

- YCoCg: 151 kB
- RGB: 140 kB (-7%)
- R-G,G,B-G: 136 kB (-10%)

R-G,G,B-G seems good because it eliminates the white dots on the dice from the R and B components. There seems to be quite a lot of room for improvement here for some specific images, even more than I originally expected! Maybe we can make a transform that generates a custom-fitted colour space for a given image, so that we ensure maximum compression. The transform would be slightly slower, though, since more math is involved (plus some space in the file to specify which one is being used).

EDIT: I made a mistake in the G,R-G,B-G test, leading to an invalid file. Corrected now!
Member

### jonsneyers commented Apr 14, 2016

 I'm implementing a PlanePermute transformation with optional "subtract one channel from the others". For now, the encoder will not use it by default (just use YCoCg), unless you manually disable YCoCg, then it transforms R,G,B to G,R-G,B-G. Subtracting green usually seems to be a good idea (compared to just keeping RGB in whatever order), and encoding green first makes sense since it contributes most to luma. If on some particular image, something like B, G-B, R-B turns out to be better, the bitstream will support that (and the decoder can decode it), but the encoder for now just never does that. (This transform could also be useful to reorder the planes in RGBG images)
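
As a rough illustration of the "permute, then optionally subtract the first channel from the others" idea, here is a Python sketch. The function names and the `order`/`subtract` parameters are hypothetical and do not reflect how the FLIF bitstream actually encodes the transform.

```python
def permute_subtract(pixel, order=(1, 0, 2), subtract=True):
    """Reorder the channels of an (R, G, B) pixel by index and optionally
    subtract the first output channel from the other two.
    The default order (1, 0, 2) gives G, R-G, B-G."""
    first, second, third = (pixel[i] for i in order)
    if subtract:
        return (first, second - first, third - first)
    return (first, second, third)


def inverse_permute_subtract(planes, order=(1, 0, 2), subtract=True):
    """Undo permute_subtract and return the original (R, G, B) pixel."""
    first, second, third = planes
    if subtract:
        second += first
        third += first
    pixel = [0, 0, 0]
    for value, index in zip((first, second, third), order):
        pixel[index] = value
    return tuple(pixel)


# A white pixel keeps its value only in the first plane: (255, 0, 0).
white = permute_subtract((255, 255, 255))

# The same helper can express B, G-B, R-B by changing the order.
blue_first = permute_subtract((10, 20, 30), order=(2, 1, 0))   # (30, -10, -20)
```

With green first and subtraction enabled, white and grey pixels become zero in the second and third planes, which is the effect described for the white dots on the dice.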

### jonsneyers added a commit that referenced this issue Apr 14, 2016

```
added FLIF16 version bit
plane permute/subtract (cf #258)
some decoder fuzzing
```
b09b6da
Contributor

### psykauze commented Apr 15, 2016

Yeah, the plane ordering might be useful for RGGB images. Subtracting green might be useful too, but I'm not sure, because the average values of the R, G, and B planes are not the same.

Member

### jonsneyers commented Aug 28, 2016

 So, the format supports everything we want: RGB, YCoCg, channel permutation and subtracting one channel from the others. From the decoder / format spec perspective, this is "done". From the encoder perspective, there's the matter of deciding which color transform(s) to apply; currently it just always does YCoCg except if you specify `-Y`, then it does G (R-G) (B-G). It is computationally too expensive to try all possible combinations and fully encode the image, then select the best one. Maybe some simpler heuristic is possible...

Closed

Member

### jonsneyers commented Nov 17, 2016

 @DarkZeros Do you have an idea of a simple encoder heuristic to decide which color transform to use? Suppose it's OK to actually do a few different transforms (e.g. YCoCg, G(R-G)(B-G), GRB), but it's not OK to do any real encoding. Is there some kind of cheap way to estimate the amount of decorrelation? Perhaps just compare the total of the absolute values of the pixels in the 'chroma' channels, the idea being that values close to zero mean better compression / better decorrelation?
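
The "sum of absolute chroma values" idea could be sketched in Python like this. It is only an illustration: the helper names are invented, the YCoCg step assumes the standard YCoCg-R lifting rather than FLIF's exact code, and plain GRB is left out because its second and third channels are not centred on zero, which would bias the comparison.

```python
def to_ycocg(r, g, b):
    """YCoCg-R forward lifting (assumed variant, see lead-in)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    return (t + (cg >> 1), co, cg)


def to_g_rg_bg(r, g, b):
    """Green first, then R-G and B-G."""
    return (g, r - g, b - g)


def chroma_cost(pixels, transform):
    """Sum of absolute values of the two non-luma channels."""
    total = 0
    for r, g, b in pixels:
        _, c1, c2 = transform(r, g, b)
        total += abs(c1) + abs(c2)
    return total


def pick_transform(pixels, candidates):
    """Return the name of the candidate with the lowest chroma cost."""
    return min(candidates, key=lambda name: chroma_cost(pixels, candidates[name]))


CANDIDATES = {"YCoCg": to_ycocg, "G(R-G)(B-G)": to_g_rg_bg}

# Pure red, pure blue and white favour subtract-green in this toy example.
flat_colours = [(255, 0, 0), (0, 0, 255), (255, 255, 255)]
choice = pick_transform(flat_colours, CANDIDATES)   # "G(R-G)(B-G)"
```

This needs one pass over the pixels per candidate and no entropy coding. Whether a low score actually tracks the final file size is exactly the open question here.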

### DarkZeros commented Nov 18, 2016

@jonsneyers Hi, I haven't had enough time to invest in this project, sorry about that...

My original idea was to use the channel correlation as a quick metric. I could even downscale the image before computing the correlation. But after some coding and testing I found that the correlation between channels does not give any proper insight into whether the MANIAC encoder will produce a better or worse result. This is because MANIAC encodes each channel separately.

My current idea is to measure each channel's self-correlation (how predictable it is) and select the transformation that produces the highest self-correlation on average. But I am unsure how to proceed, since it will depend on whether the encoding is interlaced or not, and on the pixel predictor used. The heuristic also has to be fast, and that is the real issue. Maybe the simplest way forward is to perform a downscaled R0 encoding as a quick test for the best transform.
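
A very rough Python sketch of the "how predictable is each channel" idea follows. It assumes a plain left-neighbour predictor, which is much simpler than the predictors FLIF really uses, and all names are hypothetical.

```python
def residual_cost(plane):
    """Sum of |pixel - left neighbour| over a 2-D plane (a list of rows).
    A lower value means the plane is easier to predict from itself."""
    total = 0
    for row in plane:
        for x in range(1, len(row)):
            total += abs(row[x] - row[x - 1])
    return total


def split_planes(image, transform):
    """Apply a per-pixel transform and split the result into three planes.
    `image` is a list of rows of (R, G, B) tuples."""
    planes = ([], [], [])
    for row in image:
        converted = [transform(*px) for px in row]
        for c in range(3):
            planes[c].append([px[c] for px in converted])
    return planes


def transform_cost(image, transform):
    """Total per-channel residual cost; channels are scored independently."""
    return sum(residual_cost(p) for p in split_planes(image, transform))


def identity(r, g, b):
    return (r, g, b)


def to_ycocg(r, g, b):
    """YCoCg-R forward lifting (assumed variant, not taken from FLIF)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    return (t + (cg >> 1), co, cg)


# One row alternating pure red and black.
stripes = [[(255, 0, 0), (0, 0, 0), (255, 0, 0), (0, 0, 0)]]
rgb_cost = transform_cost(stripes, identity)     # 765
ycocg_cost = transform_cost(stripes, to_ycocg)   # 1335
```

Each plane is scored on its own, so the metric ignores cross-channel correlation, in line with the observation that MANIAC encodes channels separately. Running it on a downscaled copy of the image would keep it cheap.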

### DarkZeros commented Jan 31, 2017 • edited

@jonsneyers OK, I found a very good YCoCg heuristic: the predictor heuristic.

I modified the predictor heuristic (using plane 1) to output its raw data, then compared the values for RGB vs YCoCg. I also encoded the files both ways (`-Y` and normal), then plotted the predictor guess against the real file size.

The predictor seems to be accurate most of the time, and when it is not, it is in the "close cases". When the predictor value is clear, the decision is always correct and also makes the file much smaller.

My dataset is clearly biased towards YCoCg (Kodak dataset + icons), but even so:

- Total size YCoCg: 269.706 MB
- Total size RGB: 272.365 MB
- Total size Predictor: 269.661 MB
- Images best YCoCg: 1365/2181
- Images best RGB: 966/2181
- Images best Predictor: 1870/2181

I will keep investigating to find the best predictor, and will submit a pull request.