Binlog reading - grace checking of GZip signature #9567

Open
JanKrivanek opened this issue Dec 21, 2023 · 0 comments
Labels
help wanted (Issues that the core team doesn't plan to work on, but would accept a PR for. Comment to claim.)
Priority:3 (Work that is nice to have)
triaged

@JanKrivanek
Member

Context

Report gracefully when a non-GZip stream is encountered while reading a binlog.

Background

Idea by @KirillOsenkov:

One other thought I had that's not necessarily for this PR, but somewhat related. Right now all binlog files start with 1F 8B 08 which is the GZip stream signature bytes (08 indicating Deflate compression). If we change the compression method for the outer stream at some point in the future, the envelope format will likely have a different signature. Long term I'd love to investigate Zstd or Lzma compression, I remember I saw more than 2x improvement in binlog sizes when I uncompressed Gzip/Deflate and recompressed with 7-Zip.

Wondering if we could prepare for that future today by gracefully showing a message if the binlog file doesn't start with 1F 8B 08.

Currently we handle it relatively well: "There was an exception while reading the log file: Found invalid data while decoding." Wondering if there's anything else we can do here, or probably not?
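A minimal sketch of such a check, written in Python for brevity rather than in MSBuild's C#. The function name and the message wording are hypothetical; only the `1F 8B 08` signature comes from the discussion above.

```python
GZIP_SIGNATURE = bytes([0x1F, 0x8B, 0x08])  # GZip magic (1F 8B) + Deflate method (08)


def describe_signature_problem(stream):
    """Return None if the stream starts with the GZip signature,
    otherwise a human-readable message (hypothetical wording)."""
    header = stream.read(len(GZIP_SIGNATURE))
    if header == GZIP_SIGNATURE:
        return None
    if len(header) < len(GZIP_SIGNATURE):
        return "The file is too short to be a binlog (%d bytes read)." % len(header)
    return (
        "The file does not start with the GZip signature 1F 8B 08 "
        "(found %s). It may use an unsupported compression format."
        % header.hex(" ").upper()
    )
```

A reader could call this before handing the stream to the decompressor and show the returned message instead of the generic decoding exception.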

Further Thoughts

This might not be that helpful now, as we support only GZip streams. But should we (or a contributor :-)) decide to add support for other compression mechanisms, it would become a must. We'd then probably use the signature to switch between the decompression stream implementations (or throw a more descriptive exception if the signature is unknown).
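The signature-based switch could look roughly like the sketch below, again in Python for illustration. Binlogs use only GZip today; the XZ entry, the `open_envelope` function, and the `UnknownEnvelopeError` type are assumptions made up for this example, not existing MSBuild names.

```python
import gzip
import lzma

# (signature, name, opener). Only GZip is in use today; the XZ entry is a
# hypothetical future format, included purely to show the dispatch.
DECOMPRESSORS = [
    (bytes([0x1F, 0x8B, 0x08]), "GZip", lambda s: gzip.GzipFile(fileobj=s)),
    (bytes([0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00]), "XZ", lambda s: lzma.LZMAFile(s)),
]


class UnknownEnvelopeError(Exception):
    """Raised when the leading bytes match no known signature."""


def open_envelope(raw):
    """Pick a decompressor from the leading bytes of a seekable binary stream."""
    longest = max(len(signature) for signature, _, _ in DECOMPRESSORS)
    start = raw.tell()
    header = raw.read(longest)
    raw.seek(start)  # rewind so the decompressor sees the whole stream
    for signature, _name, opener in DECOMPRESSORS:
        if header.startswith(signature):
            return opener(raw)
    known = ", ".join(name for _, name, _ in DECOMPRESSORS)
    raise UnknownEnvelopeError(
        "Unrecognized file signature %s; supported formats: %s."
        % (header.hex(" ").upper(), known)
    )
```

Keeping the signatures in one table means a new format only needs a new entry, and the error for unknown input can list what is supported.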

Support for uncompressed streams might be nice as well: I have found myself quite a few times decompressing a binlog to troubleshoot various issues with it, but then being unable to use the result as is.
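For reference, the manual decompression step can be sketched as follows, assuming the outer envelope is a plain GZip stream as described above. The function name is hypothetical, and the output is the raw inner stream, which the existing readers do not accept as input.

```python
import gzip
import shutil


def decompress_envelope(compressed_stream, raw_output):
    """Copy the decompressed contents of a GZip-wrapped binary stream
    into raw_output (any writable binary file object)."""
    with gzip.GzipFile(fileobj=compressed_stream) as decompressed:
        shutil.copyfileobj(decompressed, raw_output)
```

With files on disk, this would be called with two handles opened in binary mode, one for reading the binlog and one for writing the decompressed copy.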

JanKrivanek added the help wanted and needs-triage labels on Dec 21, 2023
AR-May added the triaged and Priority:3 labels and removed the needs-triage label on Mar 14, 2024