-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pad the LZW code table if less than minimum size #14
Conversation
Build failure unrelated? |
Thank you for looking into this! I need to reacquaint myself with the spec and the code a bit, but will try to reply soon, maybe later today. |
Your explanation makes perfect sense; I must have missed that part of the spec. I'm not sure if this would matter in practice, but it seems like to be 100% conformant, we should pad up to |
Right now numColors is the padded size from the color table, so it's guaranteed to be a power of two but its minimum value is 2 rather than 4. The 4 is kind of a magical number, maybe adding a doc comment to the LzwEncoder constructor specifying that numColors must be a power of two? Or I could make it a constant with a comment on its derivation? |
I have an alternate fix in #15 |
Oh right -- forgot that I think I prefer your other fix for now, and afterward I can refactor a bit to explicitly derive |
We were deriving the minimum code size from the number of colors in the LZW code table, which is sort of backwards. The spec defines the "clear" code's value (and implicitly, the number of colors in the LZW code table) based on the minimum code size. I also made some tweaks to emphasize that the number of colors in the LZW code table may not match the color table size, as @rendaw pointed out in square#14. The former quantity is now just a local variable in `defaultCodeTable`, so it's more contained. The latter quantity I renamed to `colorTableSize`.
We were deriving the minimum code size from the number of colors in the LZW code table, which is sort of backwards. The spec defines the "clear" code's value (and implicitly, the number of colors in the LZW code table) based on the minimum code size. I also made some tweaks to emphasize that the number of colors in the LZW code table may not match the color table size, as @rendaw pointed out in #14. The former quantity is now just a local variable in `defaultCodeTable`, so it's more contained. The latter quantity I renamed to `colorTableSize`.
I think this fixes #11.
My understanding is that while the color table has a minumum size of 2, the LZW minimum code size of 2 indicates that the code table has 4 colors, so the code table needs to be padded with two empty colors before the cc and eoi codes. I believe decoders were failing because they expected cc and eoi codes to have different code indexes and were thus misinterpreting the whole stream (first code in the stream is always cc).
I think this is indicated on page 32 here https://www.w3.org/Graphics/GIF/spec-gif89a.txt although I'm bad with specs without lots of pictures and pretty diagrams... it says
so for a code size of 2, the cc index would be 4 (0-1 would be colors, 2-3 ??).
gifencoder currently has 2 as the current minimum for the LZW minimum code size value, and I believe this is correct (i.e. the issue isn't that the minimum code size is being set too high). For reference, the definition says a bit above that that for 2 color images the minimum code size is 2:
I couldn't find information on padding. My understanding is that since the code table is used by matching color table indexes, reusing color index 0 for the padding elements is fine since a lookup for color 0 will never get that far.
In any case I tried this and I could read the output file with a couple programs (not entirely sure they aren't all using the same library).