This repository has been archived by the owner on Mar 16, 2023. It is now read-only.

Limited palette modes #1

Closed
danya02 opened this issue Feb 20, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@danya02
Contributor

danya02 commented Feb 20, 2023

To store data more efficiently, we should use more than the binary color mode. Video compression makes full-color RGB impossible to round-trip via YouTube, but we could store fewer bits per pixel than full color while still storing more than 1.

For example, one option could be to store one bit in each of the red, green and blue channels, yielding a 1-bit-per-channel RGB palette with 8 different colors. Then, when reading, you can round each channel value to the nearer of 0 and 255 to get the 3 bits in a pixel. Based on my own experiments with Twitter, this scheme survives their JPEG compression, and I expect that it will probably also work for YouTube.
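
A minimal sketch of that 1-bit-per-channel idea, in plain Python. The function names `bytes_to_pixels` and `pixels_to_bytes` are made up for illustration and are not part of this project, and the rounding threshold of 128 is an assumption for 8-bit channels:

```python
def bytes_to_pixels(data: bytes) -> list[tuple[int, int, int]]:
    """Pack the bits of `data`, three at a time, into pixels whose channels are 0 or 255."""
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    bits += [0] * (-len(bits) % 3)  # pad with zeros to a multiple of 3 bits
    return [
        (bits[i] * 255, bits[i + 1] * 255, bits[i + 2] * 255)
        for i in range(0, len(bits), 3)
    ]


def pixels_to_bytes(pixels, length: int) -> bytes:
    """Round every channel to the nearer of 0 and 255, then rebuild `length` bytes."""
    bits = [1 if channel >= 128 else 0 for pixel in pixels for channel in pixel]
    out = bytearray()
    for i in range(0, length * 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

The receiver has to know the original length (here passed in as `length`) to discard the padding bits; how that length is communicated is left open in this sketch.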

More generally, we could take a list of 2^N entries, where each entry is an RGB color. The sender encodes each group of N bits as the corresponding color from the list, and the receiver rounds each received pixel to the closest color in that list (with closeness determined by a perceptual metric similar to the one the encoder uses).
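
Roughly, the general palette scheme could look like the sketch below. The 4-color `PALETTE` (so N = 2) and the function names are hypothetical, and plain squared Euclidean distance in RGB is only a stand-in for the perceptual metric mentioned above:

```python
# Hypothetical palette with 2**2 entries, so each pixel carries N = 2 bits.
PALETTE = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]


def encode_symbols(symbols, palette):
    """Map each N-bit symbol (an index into the palette) to its color."""
    return [palette[symbol] for symbol in symbols]


def nearest_index(pixel, palette):
    """Index of the palette color closest to `pixel`.

    Squared Euclidean distance in RGB is an assumption made for this sketch;
    a perceptual metric would replace the `key` function.
    """
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, palette[i])),
    )


def decode_symbols(pixels, palette):
    """Round every received pixel to the nearest palette entry."""
    return [nearest_index(pixel, palette) for pixel in pixels]
```

With this palette, a distorted pixel such as `(240, 20, 30)` is rounded back to red, which is index 2.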

The color list also needs to be communicated, and for this, one option would be to include some frames of black-and-white pixels that serve as a header.
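
One possible shape for such a header, as a sketch only: the layout below (one byte holding N, followed by 3 bytes per color, one black or white pixel per bit) is an assumption made up for illustration, not something the project defines, and the function names are hypothetical too.

```python
WHITE, BLACK = (255, 255, 255), (0, 0, 0)


def palette_to_header(palette):
    """Serialize the palette as black-and-white pixels, one pixel per bit."""
    n_bits = len(palette).bit_length() - 1  # palette is assumed to have 2**n_bits entries
    payload = bytes([n_bits]) + bytes(c for color in palette for c in color)
    return [
        WHITE if (byte >> shift) & 1 else BLACK
        for byte in payload
        for shift in range(7, -1, -1)
    ]


def header_to_palette(pixels):
    """Recover the palette from header pixels, thresholding each at mid-gray."""
    bits = [1 if sum(pixel) >= 384 else 0 for pixel in pixels]  # 384 = 3 * 128

    def read_byte(pos):
        value = 0
        for bit in bits[pos:pos + 8]:
            value = (value << 1) | bit
        return value

    count = 2 ** read_byte(0)
    values = [read_byte(8 + 8 * i) for i in range(3 * count)]
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
```

Because the header only uses black and white, it can be read with the same thresholding as the binary color mode before the palette is known.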

@azukaar

azukaar commented Feb 20, 2023

I came here for this. The storage efficiency could be greatly improved. Also, building in some redundancy would allow experimenting with higher-risk palettes to optimize the final storage percentage.

Another two things to consider:

  • The same bytes could be represented by different colours, with the colour chosen based on its neighbours in order to reduce potential compression artefacts (e.g. not having 2 shades of red next to each other)

  • The sound channel could also be used for additional bandwidth

@danya02
Contributor Author

danya02 commented Feb 21, 2023

For avoiding compression artifacts, we need someone who has experience with the codecs involved, or else some experimentation, to determine the spatial correlation of the different distortions. I'm actually not sure that deliberately breaking apart the areas of contiguous color is a good idea -- then it's harder for the encoder to encode the shape of that area, so it might end up having to compromise on the color instead. Also, another thing to consider is to try minimizing the inter-frame difference, so that there's less data to compress in the first place.

For redundancy, we're looking for an error-correction scheme with a special property: that it can deal with many small errors, rather than one big error. This is a rather unusual property; neither Hamming nor Reed-Solomon does this. Input from computer scientists is needed.

When I tried to encode data into videos, I tried to use QR codes instead of raw bytes. I think it worked quite well to deal with the compression issues, since QR codes have ECC built in, but I had problems with extracting them in the right order and actually putting them back together into a file: https://youtube.com/watch?v=HzU69Dm-BEs

The sound channel isn't trivially composable with the main data, and in any case it can store only a fraction of the data that the main video can. What it could be used for is storing a checksum of the data, so that decoding errors can be detected. One strategy for correcting them could be to try again with a higher-resolution version of the video.
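
As a sketch of the checksum idea: the choice of SHA-256 and the function names here are assumptions, and how the digest actually gets into (and out of) the sound channel is not covered. The `candidates` argument stands for decodes of the same video at increasing resolutions.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Digest to store alongside the video (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).digest()


def first_valid_decode(candidates, expected: bytes):
    """Return the first decoded payload whose checksum matches, or None.

    `candidates` is any iterable of decoded payloads, for example one per
    video resolution, ordered from lowest to highest.
    """
    for decoded in candidates:
        if checksum(decoded) == expected:
            return decoded
    return None
```

If `candidates` is a generator, higher-resolution versions are only decoded when the lower ones fail the check.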

@xiaotuzididi

good
