Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compress: unify Huffman logic in flate and bzip2 #20485

Open
dsnet opened this issue May 24, 2017 · 3 comments

Comments

Projects
None yet
2 participants
@dsnet
Copy link
Member

commented May 24, 2017

The Huffman encoding in both flate and bzip2 is identical except for some minor differences:

  • bzip2 treats the leading bits in a bitstream as the MSB of a byte, while flate treats the leading bits as the LSB of a byte.
  • bzip2 allows Huffman codes to be up to 20-bits long, while flate allows codes up to 15-bits long.

Currently, the implementation of Huffman encoding in flate is superior (and faster) to the one in bzip2. The differences are very minor and it should be able to unify the two without any performance hit. We should extract the common logic as compress/internal/huffman.

@dsnet dsnet added this to the Go1.10 milestone May 24, 2017

@dsnet dsnet self-assigned this May 24, 2017

@dsnet dsnet added the Performance label Aug 30, 2017

@dsnet dsnet modified the milestones: Go1.10, Unplanned Sep 26, 2017

@adamdrake

This comment has been minimized.

Copy link

commented Nov 8, 2017

Looks like an interesting performance optimization and stdlib simplification. Any plans to work on this personally or is it open for the community in general?

@dsnet

This comment has been minimized.

Copy link
Member Author

commented Nov 8, 2017

I re-wrote the flate decoder from the ground up in https://github.com/dsnet/compress/tree/d2570c4d5b0229583afd32c7c4767e51f314c608. It uses the unified Huffman logic idea I wrote about here, but I haven't gotten around to merging it into standard library.

The performance is significantly faster than the stdlib implementation (~1.4x).

@adamdrake

This comment has been minimized.

Copy link

commented Nov 13, 2017

That sounds great! I hope it makes it into a future release as those speed improvements would be welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.