Fix string buffer size calculation in Utf8JsonWriter.WriteStringValue #127047
eiriktsarpalis merged 6 commits into dotnet:main
In StringConverter.Write, use chunk-based WriteStringValueSegment instead of WriteStringValue when the input string length exceeds the safe threshold. This prevents IndexOutOfRangeException that could occur when writing extremely large strings.
Pull request overview
This PR addresses #103155 by preventing IndexOutOfRangeException when serializing extremely large strings. It updates StringConverter.Write to switch to chunked Utf8JsonWriter.WriteStringValueSegment for strings beyond a computed safe length threshold.
Changes:
- Add chunked string writing in StringConverter.Write using WriteStringValueSegment for very large inputs.
- Introduce new OuterLoop tests intended to cover extremely large string serialization (including indented scenarios).
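The chunked approach described above can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: the `ChunkedStringWriter` class, the `WriteInChunks` and `Serialize` helpers, and the `ChunkLength` value are hypothetical names chosen for the example, and the real change uses a computed safe-length threshold that is not shown on this page. It assumes a runtime where `Utf8JsonWriter.WriteStringValueSegment` is available.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

public static class ChunkedStringWriter
{
    // Hypothetical chunk size for illustration only; not the threshold used by the PR.
    private const int ChunkLength = 4096;

    // Writes one JSON string value as a sequence of segments.
    public static void WriteInChunks(Utf8JsonWriter writer, string value)
    {
        ReadOnlySpan<char> remaining = value.AsSpan();

        while (remaining.Length > ChunkLength)
        {
            int length = ChunkLength;

            // Conservative choice: do not end a segment on a high surrogate,
            // so each surrogate pair stays inside a single segment.
            if (char.IsHighSurrogate(remaining[length - 1]))
            {
                length--;
            }

            writer.WriteStringValueSegment(remaining.Slice(0, length), isFinalSegment: false);
            remaining = remaining.Slice(length);
        }

        // The last (possibly empty) segment closes the string value.
        writer.WriteStringValueSegment(remaining, isFinalSegment: true);
    }

    // Convenience wrapper that returns the JSON text produced by WriteInChunks.
    public static string Serialize(string value)
    {
        var buffer = new ArrayBufferWriter<byte>();
        using (var writer = new Utf8JsonWriter(buffer))
        {
            WriteInChunks(writer, value);
        } // Dispose flushes pending bytes into the buffer.

        return Encoding.UTF8.GetString(buffer.WrittenSpan);
    }
}
```

Because each segment is sized independently, the writer never has to reserve a buffer proportional to the whole input length, which is the property the chunked path relies on.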
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/StringConverter.cs | Uses segmented string writing past a computed safe threshold to avoid overflow/crash paths. |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Value.WriteTests.cs | Adds new OuterLoop coverage for extremely large string serialization (root + indented + array element). |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Pull request overview
This PR addresses an IndexOutOfRangeException when serializing extremely large strings by switching StringConverter.Write to use chunked Utf8JsonWriter.WriteStringValueSegment once the input string exceeds a computed “safe length” threshold, avoiding overflow-prone size computations in the non-segmented writer path.
Changes:
- Add chunked segmented string writing in StringConverter.Write for very large strings.
- Introduce helper logic to compute a max “safe” string length based on worst-case escaping/transcoding expansion.
- Add OuterLoop tests covering extremely large string serialization (including indented scenarios).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/Value/StringConverter.cs | Routes very large strings through chunked WriteStringValueSegment to avoid overflow/indexing failures in large escaping/transcoding cases. |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Value.WriteTests.cs | Adds OuterLoop coverage for very large string serialization and indentation scenarios to reproduce/guard against the reported crash. |
add ConditionalTheory and ConditionalFact in test case.
and WriteStringIndented, using the original input length.
- With escaping: value.Length * MaxExpansionFactorWhileEscaping
- Without escaping: value.Length * MaxExpansionFactorWhileTranscoding
The new calculation is safe for inputs up to int.MaxValue / 6 (~357M chars). WriteStringValue enforces a maximum input size of 166_666_666 chars (MaxUnescapedTokenSize), so no overflow occurs in practice.
Pull request overview
Addresses IndexOutOfRangeException when serializing very large strings by fixing Utf8JsonWriter’s buffer sizing logic for escaped UTF-16 string values, and adds regression tests covering near-limit escaped string writes.
Changes:
- Update Utf8JsonWriter string value write path to use a precomputed max-required-byte count when sizing buffers for escaped strings.
- Add Utf8JsonWriter outerloop tests that write extremely large escaped string values (minimized + indented).
- Add an end-to-end JsonSerializer.Serialize outerloop test for extremely large escaped strings.
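The worst-case expansion that these tests exercise can be observed at a much smaller scale. The sketch below is not one of the PR's tests: `EscapeExpansionDemo` and `EscapedByteCount` are hypothetical names, and it assumes the writer's default encoder, which escapes `<` as the six-byte sequence `\u003C`.

```csharp
using System;
using System.Buffers;
using System.Text.Json;

public static class EscapeExpansionDemo
{
    // Returns the number of UTF-8 bytes the writer emits for a single string value,
    // including the two surrounding quote characters.
    public static long EscapedByteCount(string value)
    {
        var buffer = new ArrayBufferWriter<byte>();
        using var writer = new Utf8JsonWriter(buffer);

        writer.WriteStringValue(value);
        writer.Flush(); // BytesCommitted only reflects flushed output.

        return writer.BytesCommitted;
    }
}
```

Comparing `EscapedByteCount(new string('a', 100))` with `EscapedByteCount(new string('<', 100))` shows the gap between one byte per character and six bytes per character, which is why the buffer size has to be derived from the input length multiplied by the escaping factor.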
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/libraries/System.Text.Json/src/System/Text/Json/Writer/Utf8JsonWriter.WriteValues.String.cs | Fixes max-buffer-size computation for escaped UTF-16 string values to avoid overflow leading to out-of-range writes. |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Utf8JsonWriterTests.cs | Adds large escaped-string writer regression tests (minimized + indented). |
| src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Value.WriteTests.cs | Adds end-to-end serializer regression test for extremely large escaped strings. |
#103155
Use WriteStringValueSegment for large strings
In StringConverter.Write, this PR uses chunk-based WriteStringValueSegment instead of WriteStringValue when the input string length exceeds the safe threshold. This prevents the IndexOutOfRangeException that could occur when writing extremely large strings.
This PR precomputes maxRequiredBytes before dispatching to Utf8JsonWriter.WriteStringMinimized and Utf8JsonWriter.WriteStringIndented, using the original input length.
- With escaping: value.Length * MaxExpansionFactorWhileEscaping (6)
- Without escaping: value.Length * MaxExpansionFactorWhileTranscoding (3)
The new calculation is safe for inputs up to int.MaxValue / 6 (~357M chars). Utf8JsonWriter.WriteStringValue enforces a maximum input size of 166_666_666 chars (JsonConstants.MaxUnescapedTokenSize), so no overflow occurs in practice.
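The arithmetic behind that claim can be checked with a short sketch. The constants below mirror the values quoted in the description; the `BufferSizeMath` class, the `MaxSafeLength` property, and the `MaxRequiredBytes` method are hypothetical names for illustration, and the sketch covers only the multiplication, not any extra bytes the writer reserves for quotes, separators, or indentation.

```csharp
using System;

public static class BufferSizeMath
{
    // Values as quoted in the PR description.
    public const int MaxExpansionFactorWhileEscaping = 6;
    public const int MaxExpansionFactorWhileTranscoding = 3;
    public const int MaxUnescapedTokenSize = 166_666_666;

    // Largest input length whose worst-case escaped size still fits in an int.
    public static int MaxSafeLength => int.MaxValue / MaxExpansionFactorWhileEscaping;

    // Worst-case byte count for a UTF-16 input of the given length.
    // 'checked' turns a silent wraparound into an OverflowException.
    public static int MaxRequiredBytes(int length, bool needsEscaping)
    {
        int factor = needsEscaping
            ? MaxExpansionFactorWhileEscaping
            : MaxExpansionFactorWhileTranscoding;

        return checked(length * factor);
    }
}
```

With these numbers, `MaxSafeLength` is 357,913,941, and the largest input that WriteStringValue accepts needs at most 166,666,666 × 6 = 999,999,996 bytes, well below int.MaxValue (2,147,483,647). An input one character longer than `MaxSafeLength` would overflow, which the `checked` expression reports as an exception instead of producing a negative size.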