You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The row hash strategy doesn't use a chainTable, similarly to ZSTD_fast. Therefore, existing checks that strategy != ZSTD_fast whose actual intent it was to check for a chaintable, are no longer accurate, and all existing sites must be migrated to ZSTD_allocateChainTable(). I didn't see any more instances of this in the codebase.
In this case, ZSTD_reduceIndex() with row hash could actually end up trying to access memory outside the cctx in cases where the chainLog was large enough (even though that's not a param row hash actually uses).
ZSTD_reduceIndex() actually has just the right params for us to use, so the fix is fairly straightforward. The params passed are appliedParams, so the matchfinder mode won't be ZSTD_urm_auto (and we assert that in ZSTD_allocateChainTable()).
Would that make sense to add a test that cover this use case ?
We already have a category "large data test".
Could that fit in ? Or would such a test need an unreasonably long time ?
Alternatively, could we retrofit one of these tests to cover this case ?
I'm surprised the issue was not detected by these tests, but I presume none of the tests large enough to trigger ZSTD_reduceIndex() was using rowHash strategy due to speed concerns.
Would that make sense to add a test that cover this use case ?
We already have a category "large data test".
Could that fit in ? Or would such a test need an unreasonably long time ?
Alternatively, could we retrofit one of these tests to cover this case ?
I'm surprised the issue was not detected by these tests, but I presume none of the tests large enough to trigger ZSTD_reduceIndex() was using rowHash strategy due to speed concerns.
I think @terrelln mentioned that he might add some more fuzzer tests. #2601 adds a simple test that would catch this.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The row hash strategy doesn't use a chainTable, similarly to
ZSTD_fast. Therefore, existing checks thatstrategy != ZSTD_fastwhose actual intent it was to check for a chaintable, are no longer accurate, and all existing sites must be migrated toZSTD_allocateChainTable(). I didn't see any more instances of this in the codebase.In this case,
ZSTD_reduceIndex()with row hash could actually end up trying to access memory outside thecctxin cases where thechainLogwas large enough (even though that's not a param row hash actually uses).ZSTD_reduceIndex()actually has just the right params for us to use, so the fix is fairly straightforward. Theparamspassed areappliedParams, so the matchfinder mode won't beZSTD_urm_auto(and we assert that inZSTD_allocateChainTable()).Repro of the bug: