Fix string buffer size calculation in Utf8JsonWriter.WriteStringValue #127047

Merged
eiriktsarpalis merged 6 commits into dotnet:main from prozolic:StringConverter
Apr 27, 2026

@prozolic prozolic commented Apr 17, 2026

#103155

Use WriteStringValueSegment for large strings
In StringConverter.Write, this PR uses chunk-based WriteStringValueSegment instead of WriteStringValue when the input string length exceeds the safe threshold. This prevents the IndexOutOfRangeException that could occur when writing extremely large strings.
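
For illustration, chunked writing with WriteStringValueSegment can look roughly like the sketch below. It assumes .NET 9 or later, where Utf8JsonWriter.WriteStringValueSegment(ReadOnlySpan<char>, bool) is available. The helper name WriteStringInChunks and the chunk size are hypothetical and not taken from this PR.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical helper (not from the PR): writes one JSON string value in
// fixed-size segments. chunkSize is illustrative, not the PR's threshold.
static void WriteStringInChunks(Utf8JsonWriter writer, ReadOnlySpan<char> value, int chunkSize)
{
    if (chunkSize <= 0)
    {
        throw new ArgumentOutOfRangeException(nameof(chunkSize));
    }

    // Emit non-final segments while more than one chunk remains.
    while (value.Length > chunkSize)
    {
        writer.WriteStringValueSegment(value.Slice(0, chunkSize), isFinalSegment: false);
        value = value.Slice(chunkSize);
    }

    // The last (possibly empty) segment closes the string value.
    writer.WriteStringValueSegment(value, isFinalSegment: true);
}

using var stream = new MemoryStream();
using (var writer = new Utf8JsonWriter(stream))
{
    WriteStringInChunks(writer, "a fairly \"long\" string", chunkSize: 4);
}

// Prints the same JSON string token that a single WriteStringValue call would produce.
Console.WriteLine(Encoding.UTF8.GetString(stream.ToArray()));
```

Each segment is escaped and transcoded on its own, so the buffer needed at any point is bounded by the chunk size rather than by the full string length.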

This PR precomputes maxRequiredBytes from the original input length before dispatching to Utf8JsonWriter.WriteStringMinimized and Utf8JsonWriter.WriteStringIndented:

  • With escaping: value.Length * MaxExpansionFactorWhileEscaping (6)
  • Without escaping: value.Length * MaxExpansionFactorWhileTranscoding (3)

The new calculation is safe for inputs up to int.MaxValue / 6 (~357M chars). Utf8JsonWriter.WriteStringValue enforces a maximum input size of 166_666_666 chars (JsonConstants.MaxUnescapedTokenSize), so no overflow occurs in practice.
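
The bounds above can be checked with a few lines of arithmetic. This is only a sketch: the constant values are the ones quoted in this description, and MaxRequiredBytes is a hypothetical helper, not the method in Utf8JsonWriter.

```csharp
using System;

// Values as quoted in the PR description.
const int MaxExpansionFactorWhileEscaping = 6;
const int MaxExpansionFactorWhileTranscoding = 3;
const int MaxUnescapedTokenSize = 166_666_666;

// Hypothetical helper: worst-case byte count derived from the original input length.
// checked(...) makes any overflow throw instead of silently wrapping.
static int MaxRequiredBytes(int inputLength, bool needsEscaping)
{
    int factor = needsEscaping
        ? MaxExpansionFactorWhileEscaping
        : MaxExpansionFactorWhileTranscoding;
    return checked(inputLength * factor);
}

// Largest input length whose worst-case escaped size still fits in an int.
int largestSafeLength = int.MaxValue / MaxExpansionFactorWhileEscaping;

Console.WriteLine(largestSafeLength);                                              // 357913941
Console.WriteLine(MaxRequiredBytes(MaxUnescapedTokenSize, needsEscaping: true));   // 999999996
Console.WriteLine(MaxRequiredBytes(MaxUnescapedTokenSize, needsEscaping: false));  // 499999998
```

Because 166_666_666 is well below 357_913_941, the multiplication cannot overflow for any input the writer accepts.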

Copilot AI review requested due to automatic review settings April 17, 2026 06:02
@dotnet-policy-service dotnet-policy-service Bot added the community-contribution label (indicates that the PR has been added by a community member) Apr 17, 2026
@prozolic prozolic marked this pull request as ready for review April 17, 2026 06:09

Copilot AI left a comment

Pull request overview

This PR addresses #103155 by preventing IndexOutOfRangeException when serializing extremely large strings. It updates StringConverter.Write to switch to chunked Utf8JsonWriter.WriteStringValueSegment for strings beyond a computed safe length threshold.

Changes:

  • Add chunked string writing in StringConverter.Write using WriteStringValueSegment for very large inputs.
  • Introduce new OuterLoop tests intended to cover extremely large string serialization (including indented scenarios).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/StringConverter.cs | Uses segmented string writing past a computed safe threshold to avoid overflow/crash paths. |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Value.WriteTests.cs | Adds new OuterLoop coverage for extremely large string serialization (root + indented + array element). |

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 17, 2026 06:17

Copilot AI left a comment

Pull request overview

This PR addresses an IndexOutOfRangeException when serializing extremely large strings by switching StringConverter.Write to use chunked Utf8JsonWriter.WriteStringValueSegment once the input string exceeds a computed “safe length” threshold, avoiding overflow-prone size computations in the non-segmented writer path.

Changes:

  • Add chunked segmented string writing in StringConverter.Write for very large strings.
  • Introduce helper logic to compute a max “safe” string length based on worst-case escaping/transcoding expansion.
  • Add OuterLoop tests covering extremely large string serialization (including indented scenarios).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/StringConverter.cs | Routes very large strings through chunked WriteStringValueSegment to avoid overflow/indexing failures in large escaping/transcoding cases. |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Value.WriteTests.cs | Adds OuterLoop coverage for very large string serialization and indentation scenarios to reproduce/guard against the reported crash. |

add ConditionalTheory and ConditionalFact in test case.
Copilot AI review requested due to automatic review settings April 23, 2026 14:51
@prozolic prozolic changed the title from "Use WriteStringValueSegment for large strings" to "Fix string buffer size calculation in Utf8JsonWriter.WriteStringValue" Apr 23, 2026

Copilot AI left a comment

Pull request overview

Addresses IndexOutOfRangeException when serializing very large strings by fixing Utf8JsonWriter’s buffer sizing logic for escaped UTF-16 string values, and adds regression tests covering near-limit escaped string writes.

Changes:

  • Update Utf8JsonWriter string value write path to use a precomputed max-required-byte count when sizing buffers for escaped strings.
  • Add Utf8JsonWriter outerloop tests that write extremely large escaped string values (minimized + indented).
  • Add an end-to-end JsonSerializer.Serialize outerloop test for extremely large escaped strings.
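
To make the overflow concern concrete, the sketch below shows one plausible way a size computation can wrap. It assumes, purely for illustration, that the byte count is derived from the already-escaped length rather than from the original input length. That assumption is not confirmed in this conversation, and all names are hypothetical.

```csharp
using System;

// Values quoted in the PR description; the names here are illustrative.
const int MaxInputChars = 166_666_666;
const int EscapingFactor = 6;
const int TranscodingFactor = 3;

int inputLength = MaxInputChars;

// Assumed problematic shape: escape first, then size from the escaped length.
// The escaped length itself still fits in an int (999_999_996) ...
int worstCaseEscapedChars = checked(inputLength * EscapingFactor);
// ... but multiplying it again by the transcoding factor wraps to a negative number.
int sizedFromEscaped = unchecked(worstCaseEscapedChars * TranscodingFactor);

// Shape described in this PR: size once, from the original input length.
int sizedFromInput = checked(inputLength * EscapingFactor);

Console.WriteLine(sizedFromEscaped < 0); // True
Console.WriteLine(sizedFromInput);       // 999999996
```

A negative or too-small size request is the kind of value that can lead to out-of-range writes later, which is why the precomputed bound is taken from the original length.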

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/libraries/System.Text.Json/src/System/Text/Json/Writer/Utf8JsonWriter.WriteValues.String.cs | Fixes max-buffer-size computation for escaped UTF-16 string values to avoid overflow leading to out-of-range writes. |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Utf8JsonWriterTests.cs | Adds large escaped-string writer regression tests (minimized + indented). |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Value.WriteTests.cs | Adds end-to-end serializer regression test for extremely large escaped strings. |

@eiriktsarpalis eiriktsarpalis enabled auto-merge (squash) April 27, 2026 13:57
@eiriktsarpalis eiriktsarpalis merged commit aba46e3 into dotnet:main Apr 27, 2026
91 checks passed
@prozolic prozolic deleted the StringConverter branch April 27, 2026 14:46

Labels

area-System.Text.Json, community-contribution (indicates that the PR has been added by a community member)

3 participants