
Fix LZ long-match encoding overflow (#753)

Closed
terrelln wants to merge 2 commits into facebook:dev from terrelln:export-D104838040


terrelln (Contributor) commented May 12, 2026

Summary:

The LZ encoder capped matchLength() at UINT16_MAX because sequence match
lengths are stored as uint16_t. If the match had been walked backward at least
UINT16_MAX bytes, then the match-finding process would resume at a position
that had already been inserted into the hash table. This would result in a match
with distance <= 0, and corruption would ensue.
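
The arithmetic behind this failure can be sketched with a small model. This is an illustration only, not OpenZL code: `Emitted`, `emit_capped`, `resumes_in_inserted_region`, and the parameter names are hypothetical, and the model assumes that every position up to and including the one where the match was found is already in the hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one emitted match; positions are byte offsets. */
typedef struct {
    size_t start;  /* first byte covered by the emitted match */
    size_t resume; /* position where match finding continues */
} Emitted;

/*
 * found_at: position whose hash lookup produced the match (assumed to be
 *           in the hash table already, along with everything before it)
 * backward: bytes the match was extended backward from found_at
 * forward:  bytes matched forward from found_at
 */
static Emitted emit_capped(size_t found_at, size_t backward, size_t forward)
{
    assert(backward <= found_at); /* cannot walk back past the input start */

    size_t total = backward + forward;
    if (total > UINT16_MAX) {
        total = UINT16_MAX; /* the length must fit in a uint16_t field */
    }

    Emitted e;
    e.start  = found_at - backward;
    e.resume = e.start + total;
    return e;
}

/* Nonzero when searching resumes at a position already in the table. */
static int resumes_in_inserted_region(size_t found_at, Emitted e)
{
    return e.resume <= found_at;
}
```

Under these assumptions, `emit_capped(100000, 10, 500)` resumes at 100500, past the position where the match was found. `emit_capped(100000, 70000, 500)` starts at 30000, is capped to 65535 bytes, and resumes at 95535, which lies inside the region already inserted. A lookup there can return a candidate at or after the current position, which is the `distance <= 0` case described above.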

Differential Revision: D104838040

@meta-cla meta-cla Bot added the cla signed label May 12, 2026

meta-codesync Bot commented May 12, 2026

@terrelln has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104838040.

terrelln added a commit to terrelln/openzl that referenced this pull request May 12, 2026
Summary:

The LZ encoder capped `matchLength()` at `UINT16_MAX` because sequence match
lengths are stored as `uint16_t`. If the match had been walked backward at least
`UINT16_MAX` bytes, then the match-finding process would resume at a position
that had already been inserted into the hash table. This would result in a match
with `distance <= 0`, and corruption would ensue.

Differential Revision: D104838040
@meta-codesync meta-codesync Bot changed the title from "Fix LZ long-match encoding overflow" to "Fix LZ long-match encoding overflow (#753)" May 12, 2026
@terrelln terrelln force-pushed the export-D104838040 branch from f526842 to bfe6f2f May 12, 2026 18:34
terrelln added 2 commits May 15, 2026 09:02
Summary:

Allow the fuzzer to find the bug in D104838040 with short inputs.

I haven't gotten the fuzzer to reproduce it yet, but in theory it should be able to.

Reviewed By: kevinjzhang

Differential Revision: D104843371
Summary:

The LZ encoder capped `matchLength()` at `UINT16_MAX` because sequence match
lengths are stored as `uint16_t`. If the match had been walked backward at least
`UINT16_MAX` bytes, then the match-finding process would resume at a position
that had already been inserted into the hash table. This would result in a match
with `distance <= 0`, and corruption would ensue.

Differential Revision: D104838040
@terrelln terrelln force-pushed the export-D104838040 branch from bfe6f2f to 36914a0 May 15, 2026 16:03
terrelln added a commit to terrelln/openzl that referenced this pull request May 15, 2026
Summary:

The LZ encoder capped `matchLength()` at `UINT16_MAX` because sequence match
lengths are stored as `uint16_t`. If the match had been walked backward at least
`UINT16_MAX` bytes, then the match-finding process would resume at a position
that had already been inserted into the hash table. This would result in a match
with `distance <= 0`, and corruption would ensue.

Reviewed By: Cyan4973

Differential Revision: D104838040
@meta-codesync meta-codesync Bot closed this in 59cc558 May 19, 2026

meta-codesync Bot commented May 19, 2026

This pull request has been merged in 59cc558.
