Fix ZstandardStream truncating multi-frame zstd responses to the first frame by christosk92 · Pull Request #129047 · dotnet/runtime

christosk92 · 2026-06-05T15:21:58Z

Problem

A zstd stream may be a sequence of frames concatenated back-to-back (RFC 8878 §3), and many encoders/CDNs emit one frame per buffer — so large Content-Encoding: zstd HTTP responses commonly arrive as multiple ~64 KB frames. ZstandardStream decoded only the first frame and silently dropped the rest, surfacing downstream as truncated payloads (in the linked issue, System.Text.Json failing at exactly BytePositionInLine: 65536).

Root cause is in ZstandardDecoder.Decompress:

if (result == 0) { _finished = true; return OperationStatus.Done; }

ZSTD_decompressStream returning 0 means end-of-frame, not end-of-stream (lib/zstd.h: "0 when a frame is completely decoded and fully flushed"). _finished is a one-way latch and Decompress starts with if (_finished) return OperationStatus.Done;, so once the first frame completes every subsequent frame is discarded.

Fix

This mirrors the existing multi-member handling in GZipStream/DeflateStream (Inflater.ResetStreamForLeftoverInput + DeflateStream.InflatorIsFinished), adapted to zstd:

ZstandardStream.TryDecompress now loops across frames. When the decoder reports Done and more input is available that begins with a zstd frame magic number, it continues decoding the next frame on the same native context. No reset is needed between frames — ZSTD_decompressStream automatically begins the next frame on the following call (so window-size / dictionary settings carry over), which keeps this consistent with the native streaming contract.
A new ZstandardDecoder.PrepareForNextFrame() (internal) clears only the managed end-of-frame latch.
StartsWithZstdFrame peeks the 4-byte frame magic (standard 0xFD2FB528, or skippable 0x184D2A50–0x184D2A5F) to distinguish a following frame from trailing non-zstd data after the final frame, which is left untouched — matching how DeflateStream treats data after the last gzip member.
A new _atFrameBoundary flag makes end-of-input a clean end (rather than truncated data) only when the last frame finished cleanly, so the existing strict-validation truncation detection still fires for genuinely truncated frames.

Only the streaming path was affected. The static one-shot helpers are already multi-frame-correct (ZstandardDecoder.TryDecompress → ZSTD_decompress, which concatenates natively; TryGetMaxDecompressedLength → ZSTD_decompressBound, which is multi-frame-aware), so they are unchanged.

Tests

Added to CompressionStreamUnitTests.Zstandard.cs (sync + async each):

ZstandardStream_ConcatenatedFrames_DecompressesAllFrames — two concatenated frames whose payloads span the 64 KB internal buffer; asserts the full concatenated output (this fails without the fix).
ZstandardStream_ConcatenatedFrames_AcrossReads_DecompressesAllFrames — a non-seekable stream returning one byte per read, so each frame magic is discovered across multiple underlying reads (exercises the HttpClient-style no-rewind path and the magic-split-across-reads case).
ZstandardStream_FrameFollowedByTrailingData_StopsAtEndOfFrame — a single frame followed by non-zstd trailing bytes; asserts the payload decodes and the trailing bytes remain available on the (seekable) base stream.

Existing tests, including StreamTruncation_IsDetected, should continue to pass.

Notes

I haven't built the full runtime locally, so I'm relying on CI to validate. Happy to adjust the approach — in particular, whether the frame-to-frame continuation should live in ZstandardStream (as here, paralleling DeflateStream) or inside ZstandardDecoder, and whether trailing-data tolerance should match GZipStream exactly. A deterministic, dependency-free repro is linked from the issue: https://github.com/christosk92/zstd-net11-repro

ZSTD_decompressStream returns 0 at the end of each frame, not the end of the stream, but ZstandardDecoder.Decompress treated that as end-of-stream via a permanent _finished latch. A zstd stream made of multiple concatenated frames (valid per RFC 8878) -- e.g. large HTTP Content-Encoding: zstd responses -- was therefore silently truncated to its first frame, surfacing downstream as truncated payloads (e.g. JSON parse errors at exactly 65536 bytes). ZstandardStream now continues decoding subsequent frames on the same native context (mirroring GZipStream multi-member handling), distinguishes a following frame from trailing data via the frame magic number, and only treats end-of-input as a clean end when at a frame boundary. Adds regression tests for concatenated frames (buffered, split across reads, and with trailing data). Fixes dotnet#129038.

dotnet-policy-service · 2026-06-05T15:23:41Z

Tagging subscribers to this area: @karelz, @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates Zstandard decompression to correctly handle multiple concatenated zstd frames (common in HTTP Content-Encoding: zstd) without truncation, and adds regression coverage for multi-frame and trailing-data scenarios.

Changes:

Extend ZstandardStream decompression to continue decoding across concatenated frames and distinguish frames vs. trailing non-zstd data.
Add tests to verify full output for concatenated frames (including across fragmented reads) and correct handling of trailing data after the final frame.
Add decoder support method to clear managed end-of-frame state between frames.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
src/libraries/System.IO.Compression/tests/Zstandard/CompressionStreamUnitTests.Zstandard.cs	Adds regression tests for concatenated frames, across-read boundaries, and trailing-data handling.
src/libraries/System.IO.Compression/src/System/IO/Compression/Zstandard/ZstandardStream.Decompress.cs	Implements multi-frame decoding logic and adjusts truncation detection at frame boundaries.
src/libraries/System.IO.Compression/src/System/IO/Compression/Zstandard/ZstandardDecoder.cs	Adds `PrepareForNextFrame` to clear managed end-of-frame state for concatenated streams.

christosk92 · 2026-06-05T15:32:36Z

+                if (destination.IsEmpty)
+                {
+                    // The caller provided a zero-byte buffer.  This is typically done in order to avoid allocating/renting
+                    // a buffer until data is known to be available.  We don't have perfect knowledge here, as _decoder.Decompress
+                    // will return DestinationTooSmall whether or not more data is required.  As such, we assume that if there's
+                    // any data in our input buffer, it would have been decompressible into at least one byte of output, and
+                    // otherwise we need to do a read on the underlying stream.  This isn't perfect, because having input data
+                    // doesn't necessarily mean it'll decompress into at least one byte of output, but it's a reasonable approximation
+                    // for the 99% case.  If it's wrong, it just means that a caller using zero-byte reads as a way to delay
+                    // getting a buffer to use for a subsequent call may end up getting one earlier than otherwise preferred.
+                    Debug.Assert(status == OperationStatus.DestinationTooSmall);
+                    if (_buffer.ActiveLength != 0)
+                    {
+                        Debug.Assert(bytesWritten == 0);
+                        return true;
+                    }
+                }


False positive. ZstandardDecoder.Decompress returns DestinationTooSmall (never NeedMoreData) whenever destination is empty, via an unconditional guard before any native call. So at this point the status is always DestinationTooSmall, including for 0-byte reads. This matches the pre-existing assert, so leaving it as-is.

christosk92 · 2026-06-05T15:32:37Z

+            public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
+                Task.FromResult(Read(buffer.AsSpan(offset, count)));
+
+            public override ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken cancellationToken = default) =>
+                new ValueTask<int>(Read(buffer.Span));


Fixed. Both ReadAsync overrides now return a canceled Task/ValueTask when the token is already canceled.

christosk92 · 2026-06-05T15:32:39Z

+                    // Fewer than ZstdFrameMagicLength bytes remain: not enough to tell whether another
+                    // frame follows (its magic number may be split across reads) or this was the last
+                    // frame. Hand back any output now and resolve on the next call / underlying read.
+                    // Because we're at a frame boundary, end-of-input is treated as a clean end rather
+                    // than truncation (see _atFrameBoundary checks in Read/ReadAsync).
+                    lastResult = OperationStatus.NeedMoreData;
+                    return bytesWritten != 0;


Good catch. Fixed the inconsistency too: at a frame boundary, end-of-input now reports Done so 1 to 3 trailing bytes are rewound on a seekable stream, matching the >= 4 byte case. Added a test for a frame followed by 1 to 3 trailing bytes.

- Report Done at end-of-input when at a frame boundary so 1-3 trailing bytes after the final frame are rewound on a seekable base stream, consistent with the >= 4 byte trailing-data case. - SingleByteReadStream.ReadAsync test helper now honors the CancellationToken. - Add regression test for a frame followed by 1-3 trailing bytes (seekable).

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

+                if (destination.IsEmpty)
+                {
+                    // The caller provided a zero-byte buffer.  This is typically done in order to avoid allocating/renting
+                    // a buffer until data is known to be available.  We don't have perfect knowledge here, as _decoder.Decompress
+                    // will return DestinationTooSmall whether or not more data is required.  As such, we assume that if there's
+                    // any data in our input buffer, it would have been decompressible into at least one byte of output, and
+                    // otherwise we need to do a read on the underlying stream.  This isn't perfect, because having input data
+                    // doesn't necessarily mean it'll decompress into at least one byte of output, but it's a reasonable approximation
+                    // for the 99% case.  If it's wrong, it just means that a caller using zero-byte reads as a way to delay
+                    // getting a buffer to use for a subsequent call may end up getting one earlier than otherwise preferred.
+                    Debug.Assert(status == OperationStatus.DestinationTooSmall);


+            Assert.Equal(expected.Length, output.Length);
+            Assert.Equal(expected, output.ToArray());


christosk92 · 2026-06-05T15:38:49Z

@dotnet-policy-service agree

The while(true) loop assigns lastResult before any use, so the pre-loop initializer was a dead store (IDE0059, treated as a build error). Also drop the redundant re-assignment of Done in the trailing-data branch.

Copilot AI review requested due to automatic review settings June 5, 2026 15:21

github-actions Bot added the area-System.IO.Compression label Jun 5, 2026

dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label Jun 5, 2026

Copilot AI reviewed Jun 5, 2026

View reviewed changes

christosk92 mentioned this pull request Jun 5, 2026

HttpClient / ZstandardStream silently truncates multi-frame zstd responses to the first frame (.NET 11 preview 4) #129038

Open

christosk92 marked this pull request as ready for review June 5, 2026 15:33

Copilot AI review requested due to automatic review settings June 5, 2026 15:33

Copilot AI reviewed Jun 5, 2026

View reviewed changes

Fix IDE0059 build error: remove redundant lastResult assignments

e0861db

The while(true) loop assigns lastResult before any use, so the pre-loop initializer was a dead store (IDE0059, treated as a build error). Also drop the redundant re-assignment of Done in the trailing-data branch.

This was referenced Jun 6, 2026

"We stopped hearing from agent Azure Pipelines 32. Verify the agent machine is running and has a healthy network connection" dotnet/dnceng#1886

Open

Android WebSocket failure #121518

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix ZstandardStream truncating multi-frame zstd responses to the first frame#129047

Fix ZstandardStream truncating multi-frame zstd responses to the first frame#129047
christosk92 wants to merge 3 commits into
dotnet:mainfrom
christosk92:fix-zstd-multiframe-129038

christosk92 commented Jun 5, 2026

Uh oh!

dotnet-policy-service Bot commented Jun 5, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

christosk92 Jun 5, 2026

Uh oh!

christosk92 Jun 5, 2026

Uh oh!

christosk92 Jun 5, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

christosk92 commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		Assert.Equal(expected.Length, output.Length);
		Assert.Equal(expected, output.ToArray());

Conversation

christosk92 commented Jun 5, 2026

Problem

Fix

Tests

Notes

Uh oh!

dotnet-policy-service Bot commented Jun 5, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

christosk92 Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

christosk92 Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

christosk92 Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

christosk92 commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants