Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix undefined behavior in ZSTD_decompressStream() #3258

Merged

Conversation

daniellerozenblit
Copy link
Contributor

TLDR

This PR fixes two instances of undefined behavior (0 offset to Null pointer) in ZSTD_decompressStream(). Also it adds some additional clang ubsan CI jobs, in addition to existing gcc ubsan jobs.

Reproduce Bug

Run CC=clang make uasan-test-zstream with clang11 or higher. Make sure to add the test case that was added in this PR (in tests/zstreamtest.c) since it exposes one of these errors.

Benchmarking

This bug fix does not seem to negatively impact decompression performance.

I benchmarked on an Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz machine with core isolation and turbo disabled. I measured decompression time for silesia.tar (203M), compiling with GCC11 and clang15. I ran each scenario 5 times interleaved to eliminate bias from background processes.

Clang Results

original: 0m0.472s = 430M/s
this PR: 0m0.469s = 433M/s

This PR increased compression speed by around 0.698%, probably due to random noise.

GCC Results

original: 0m0.466s = 436M/s
this PR: 0m0.466s = 436M/s

This PR did not change compression speed (0.00% decrease).

@@ -2059,7 +2059,7 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB
if (ZSTD_isError(decompressedSize)) return decompressedSize;
DEBUGLOG(4, "shortcut to single-pass ZSTD_decompress_usingDDict()")
ip = istart + cSize;
op += decompressedSize;
op = op ? op + decompressedSize : op;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detailing the thought process :

That such a scenario can even be triggered in ubsan tests feels strange initially:
for execution to reach this code position, it needs to pass the test :
(oend-op) >= zds->fParams.frameContentSize.
So, if there is any content, then it's required that op != NULL.

However, thinking about edge cases,
it's indeed possible that frameContentSize==0.
This is an empty frame (yet still a valid frame).
It's obviously not an expected outcome in general, the point is that it can happen, it's allowed.
In which case, it becomes technically possible to "decompress" such an empty frame into NULL.

Now, there are more philosophical questions associated to this scenario :
Is it normal to ask to decompress a frame into a NULL buffer ?
Should it be an instant fail ? or should it succeed in the specific case where frameContentSize==0 ?

Note that this code block is just an "optimization" : it triggers single-pass mode when space conditions are met, which requires less workload than full-featured streaming mode.
Which means that skipping this block is not an "end" sentence : it just requires to employ the more complex streaming mode.

Therefore, one possible idea here could be to add a op != NULL condition to enter this code block,
so that op==NULL is no longer a possible scenario to support inside the code block
(it must employ the more general streaming code path).
However, such a solution does not appear "obviously" better than the ternary operator you selected.

I would say, the only piece of information missing is why the scenario (op==NULL) can technically happen here, justifying the existence of this ternary operator to any future contributor reviewing the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that a very brief comment would be useful here.

@daniellerozenblit daniellerozenblit marked this pull request as ready for review September 8, 2022 21:46
Copy link
Contributor

@embg embg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks great! Just a few very minor nits, and we should be good to land.

lib/decompress/zstd_decompress.c Outdated Show resolved Hide resolved
@@ -2059,7 +2059,7 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB
if (ZSTD_isError(decompressedSize)) return decompressedSize;
DEBUGLOG(4, "shortcut to single-pass ZSTD_decompress_usingDDict()")
ip = istart + cSize;
op += decompressedSize;
op = op ? op + decompressedSize : op;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that a very brief comment would be useful here.

.github/workflows/dev-long-tests.yml Outdated Show resolved Hide resolved
.github/workflows/dev-long-tests.yml Outdated Show resolved Hide resolved
lib/decompress/zstd_decompress.c Show resolved Hide resolved
tests/Makefile Show resolved Hide resolved
tests/zstreamtest.c Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants