Improve deserialization of JSON primitives into JsonElement #116419

PranavSenthilnathan · 2025-06-08T21:44:15Z

Creating a JsonElement carries extra overhead: in addition to the required UTF-8 payload, a MetadataDb has to be created and stored. We can reduce this overhead by caching read-only databases for short primitives. This PR only affects deserialization of a JsonElement that is part of a larger deserialization, such as extension data and dictionaries (when the value type is object, JsonElement, or JsonNode). That should cover most places where a JsonElement for a primitive is created, but nothing prevents us from extending it to top-level JsonElement deserialization as well.
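
To make the affected paths concrete, here is a minimal sketch using only public System.Text.Json APIs. It is not code from the PR; `AffectedPathsDemo` and `WithExtensionData` are hypothetical names chosen for illustration. It shows that values in a `Dictionary<string, object>` and in an extension data dictionary surface as `JsonElement` instances, which is where a per-value MetadataDb would be needed.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public static class AffectedPathsDemo
{
    // Hypothetical POCO: JSON properties with no matching member land in Extra.
    public sealed class WithExtensionData
    {
        public string? Known { get; set; }

        [JsonExtensionData]
        public Dictionary<string, JsonElement>? Extra { get; set; }
    }

    // With default options, non-null values of a Dictionary<string, object>
    // are boxed JsonElement instances.
    public static JsonValueKind KindOf(string json, string key)
    {
        var dict = JsonSerializer.Deserialize<Dictionary<string, object>>(json)!;
        return ((JsonElement)dict[key]).ValueKind;
    }

    // Returns the JsonElement stored in the extension data for an unmatched property.
    public static JsonElement ExtraValue(string json, string name)
    {
        var poco = JsonSerializer.Deserialize<WithExtensionData>(json)!;
        return poco.Extra![name];
    }
}
```

For example, `AffectedPathsDemo.KindOf("{ \"Foo\": 42, \"Bar\": 3.14 }", "Foo")` yields `JsonValueKind.Number`. The sketch deliberately avoids JSON `null` values, which deserialize to a null `object` rather than a `JsonElement`.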

Caching is keyed on the length in bytes of the UTF-8 JSON payload. The thresholds were chosen arbitrarily: 8 bytes for numbers and 16 bytes for strings.
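
Because the key is the UTF-8 byte length rather than the character count, non-ASCII strings reach the threshold sooner. The sketch below illustrates that point only. `ThresholdDemo` and its members are hypothetical, and both the inclusive comparison and measuring strings without their quotes are assumptions rather than details confirmed by the description.

```csharp
using System.Text;

public static class ThresholdDemo
{
    // Thresholds quoted from the PR description.
    // Assumption: the comparison is inclusive.
    private const int MaxCachedNumberBytes = 8;
    private const int MaxCachedStringBytes = 16;

    // rawNumber is the number exactly as written in the JSON text.
    public static bool NumberFits(string rawNumber) =>
        Encoding.UTF8.GetByteCount(rawNumber) <= MaxCachedNumberBytes;

    // Assumption: content is the string value without its surrounding quotes.
    public static bool StringFits(string content) =>
        Encoding.UTF8.GetByteCount(content) <= MaxCachedStringBytes;
}
```

Under these assumptions `3.14` (4 bytes) and `barValue` (8 bytes) qualify, while a string of nine `é` characters does not, since each `é` encodes to 2 bytes in UTF-8.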

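The perf results show up to ~20% improvement in mean time in some cases (ratios as low as 0.78). Where the cache applies, allocations drop by 80 B (number payload) or 96 B (string payload) per operation.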

Benchmarks


BenchmarkDotNet v0.14.1-nightly.20250107.205, Windows 11 (10.0.26100.4061)
AMD Ryzen 9 9950X 4.30GHz, 1 CPU, 32 logical and 16 physical cores
.NET SDK 10.0.100-preview.3.25201.16
  [Host]     : .NET 10.0.0 (10.0.25.17105), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-TYDCSN : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-ODTNID : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20  
MinIterationCount=15  WarmupCount=1

Method	Toolchain	Payload	Mean	Ratio	Allocated	Alloc Ratio
Baseline	main	{ "Foo": "foo", "Bar": "barValue" }	163.0 ns	1.00	264 B	1.00
Baseline	PR	{ "Foo": "foo", "Bar": "barValue" }	163.2 ns	1.00	264 B	1.00

DeserializeObjectDictionary	main	{ "Foo": "foo", "Bar": "barValue" }	197.3 ns	1.00	656 B	1.00
DeserializeObjectDictionary	PR	{ "Foo": "foo", "Bar": "barValue" }	175.7 ns	0.89	560 B	0.85

DeserializeJsonNodeDictionary	main	{ "Foo": "foo", "Bar": "barValue" }	152.1 ns	1.00	464 B	1.00
DeserializeJsonNodeDictionary	PR	{ "Foo": "foo", "Bar": "barValue" }	149.7 ns	0.98	464 B	1.00

DeserializeJsonElementDictionary	main	{ "Foo": "foo", "Bar": "barValue" }	189.2 ns	1.00	616 B	1.00
DeserializeJsonElementDictionary	PR	{ "Foo": "foo", "Bar": "barValue" }	156.0 ns	0.82	520 B	0.84

DeserializeExtensionObjectDictionary	main	{ "Foo": "foo", "Bar": "barValue" }	211.8 ns	1.00	680 B	1.00
DeserializeExtensionObjectDictionary	PR	{ "Foo": "foo", "Bar": "barValue" }	175.7 ns	0.83	584 B	0.86

DeserializeExtensionJsonElementDictionary	main	{ "Foo": "foo", "Bar": "barValue" }	206.0 ns	1.00	640 B	1.00
DeserializeExtensionJsonElementDictionary	PR	{ "Foo": "foo", "Bar": "barValue" }	178.1 ns	0.86	544 B	0.85

DeserializeExtensionJsonObject	main	{ "Foo": "foo", "Bar": "barValue" }	173.3 ns	1.00	544 B	1.00
DeserializeExtensionJsonObject	PR	{ "Foo": "foo", "Bar": "barValue" }	176.9 ns	1.02	544 B	1.00

Baseline	main	{ "Foo": 42, "Bar": 3.14 }	176.1 ns	1.00	256 B	1.00
Baseline	PR	{ "Foo": 42, "Bar": 3.14 }	172.6 ns	0.98	256 B	1.00

DeserializeObjectDictionary	main	{ "Foo": 42, "Bar": 3.14 }	208.5 ns	1.00	632 B	1.00
DeserializeObjectDictionary	PR	{ "Foo": 42, "Bar": 3.14 }	165.3 ns	0.79	552 B	0.87

DeserializeJsonNodeDictionary	main	{ "Foo": 42, "Bar": 3.14 }	212.2 ns	1.00	664 B	1.00
DeserializeJsonNodeDictionary	PR	{ "Foo": 42, "Bar": 3.14 }	171.7 ns	0.81	584 B	0.88

DeserializeJsonElementDictionary	main	{ "Foo": 42, "Bar": 3.14 }	207.8 ns	1.00	592 B	1.00
DeserializeJsonElementDictionary	PR	{ "Foo": 42, "Bar": 3.14 }	162.1 ns	0.78	512 B	0.86

DeserializeExtensionObjectDictionary	main	{ "Foo": 42, "Bar": 3.14 }	215.9 ns	1.00	656 B	1.00
DeserializeExtensionObjectDictionary	PR	{ "Foo": 42, "Bar": 3.14 }	189.3 ns	0.88	576 B	0.88

DeserializeExtensionJsonElementDictionary	main	{ "Foo": 42, "Bar": 3.14 }	214.4 ns	1.00	616 B	1.00
DeserializeExtensionJsonElementDictionary	PR	{ "Foo": 42, "Bar": 3.14 }	175.4 ns	0.82	536 B	0.87

DeserializeExtensionJsonObject	main	{ "Foo": 42, "Bar": 3.14 }	235.0 ns	1.00	744 B	1.00
DeserializeExtensionJsonObject	PR	{ "Foo": 42, "Bar": 3.14 }	200.0 ns	0.85	664 B	0.89

Benchmarking code

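using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Nodes;
using System.Text.Json.Serialization;
using BenchmarkDotNet.Attributes;

// Categories is assumed to come from the dotnet/performance benchmark harness.
[HideColumns("Job", "Min", "Max", "Median", "Error", "StdDev", "RatioSD", "Gen0")]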
public class ExtensionJson
{
    private const string JsonStringValues = "{ \"Foo\": \"foo\", \"Bar\": \"barValue\" }";
    private const string JsonNumberValues = "{ \"Foo\": 42, \"Bar\": 3.14 }";

    [Params(JsonStringValues, JsonNumberValues)]
    public string Payload { get; set; }

    [Benchmark]
    [BenchmarkCategory(Categories.Libraries, Categories.JSON)]
    public object Baseline() =>
        JsonSerializer.Deserialize<JsonElement>(Payload)!;

    [Benchmark]
    [BenchmarkCategory(Categories.Libraries, Categories.JSON)]
    public object DeserializeObjectDictionary() =>
        JsonSerializer.Deserialize<Dictionary<string, object>>(Payload)!;

    [Benchmark]
    [BenchmarkCategory(Categories.Libraries, Categories.JSON)]
    public object DeserializeJsonNodeDictionary() =>
        JsonSerializer.Deserialize<Dictionary<string, JsonNode>>(Payload)!;

    [Benchmark]
    [BenchmarkCategory(Categories.Libraries, Categories.JSON)]
    public object DeserializeJsonElementDictionary() =>
        JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(Payload)!;

    [Benchmark]
    [BenchmarkCategory(Categories.Libraries, Categories.JSON)]
    public object DeserializeExtensionObjectDictionary() =>
        JsonSerializer.Deserialize<ExtensionObjectDictionary>(Payload)!;

    [Benchmark]
    [BenchmarkCategory(Categories.Libraries, Categories.JSON)]
    public object DeserializeExtensionJsonElementDictionary() =>
        JsonSerializer.Deserialize<ExtensionJsonElementDictionary>(Payload)!;

    [Benchmark]
    [BenchmarkCategory(Categories.Libraries, Categories.JSON)]
    public object DeserializeExtensionJsonObject() =>
        JsonSerializer.Deserialize<ExtensionJsonObject>(Payload)!;

    public class ExtensionObjectDictionary
    {
        [JsonExtensionData]
        public Dictionary<string, object> Properties { get; set; }
    }

    public class ExtensionJsonElementDictionary
    {
        [JsonExtensionData]
        public Dictionary<string, JsonElement> Properties { get; set; }
    }

    public class ExtensionJsonObject
    {
        [JsonExtensionData]
        public JsonObject Properties { get; set; }
    }
}

Copilot

Pull Request Overview

This PR improves deserialization of JSON primitives into JsonElement by caching immutable MetadataDb instances for common literal values, strings, and numbers. Key changes include updating Parse methods to use a ref Utf8JsonReader and introducing token-specific caching in MetadataDb, along with a minor adjustment in the metadata buffer sizing logic.

Updated Parse logic to pass reader by reference and select caching based on token type.
Introduced new MetadataDb creation methods (for literal, string, and number values) and a locked cache for small primitives.
Adjusted the condition for enlarging the MetadataDb buffer.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
JsonDocument.Parse.cs	Adjusted parsing logic to use ref Utf8JsonReader and to call new CreateLockedFor* methods based on token type.
JsonDocument.MetadataDb.cs	Introduced new caching methods for literals, strings, and numbers, and modified the buffer enlargement check.

Comments suppressed due to low confidence (2)

src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.MetadataDb.cs:266

Changing the condition from '>=' to '>' alters when the buffer is enlarged. Please verify that the new check correctly prevents buffer overflows when appending new rows.

if (Length > _data.Length - DbRow.Size)
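
The difference between the two comparisons can be checked with a small sketch. `GrowthCheckDemo` is a hypothetical stand-in, and the row size of 12 bytes is an assumption about `DbRow.Size`. The arithmetic holds for any positive row size.

```csharp
public static class GrowthCheckDemo
{
    // Assumed row size in bytes; the real constant is DbRow.Size.
    public const int RowSize = 12;

    // Check quoted above: grow only when a full row no longer fits.
    public static bool NeedsGrowth(int length, int capacity) =>
        length > capacity - RowSize;

    // The '>=' form that the comment describes as the previous condition.
    public static bool NeedsGrowthOld(int length, int capacity) =>
        length >= capacity - RowSize;
}
```

With a buffer of exactly one row (`capacity == RowSize`) and `length == 0`, the `>` form reports that no growth is needed, and the row fits exactly. The `>=` form would grow the buffer even though there is room. Once the buffer is full (`length == capacity`), both forms request growth, so under these assumptions the `>` check does not permit writing past the end.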

src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.Parse.cs:771

Subtracting 2 from the payload length assumes that the JSON string always includes both starting and ending quotes. Please confirm that this logic safely handles all edge cases.

MetadataDb database = MetadataDb.CreateLockedForString(utf8Json.Length - 2, reader.ValueIsEscaped);

dotnet-policy-service · 2025-06-08T21:44:46Z

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

eiriktsarpalis · 2025-06-24T11:40:12Z

src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.MetadataDb.cs

+            private static readonly MetadataDb LockedNull =
+                CreateLockedForNonStringPrimitiveImpl(JsonTokenType.Null, JsonConstants.NullValue.Length);
+
+            // Index i is a singleton database for all numbers of length i


What does "all numbers of length i" mean exactly? Is it that we're caching all possible numbers that are up to 8 characters long? That would be ~ $10^8$ different numbers, not accounting for decimal points or exponentials.

PranavSenthilnathan · 2025-06-25

No, the metadata database doesn't store the actual content, only the index into the actual content. The document still needs to store the UTF-8. But for primitive JSON values (like strings and numbers), the metadata is just "what is the start offset of the value" and "how long is the value" (and "is the value escaped" for strings, but I chose not to cache escaped strings). The start offset is always 0 for a number and 1 for a string (to skip the quote), so we just need to store "how long is the value". So there would only be 8 cached MetadataDbs, which represent 0 <= n < 10^8 actual values.
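
The idea of one shared database per length can be sketched as follows. This is an illustration, not the PR's implementation: `LengthKeyedCacheDemo` and `Metadata` are hypothetical names, and the real MetadataDb is a packed byte buffer rather than a record.

```csharp
using System;
using System.Text;

public static class LengthKeyedCacheDemo
{
    // Hypothetical stand-in for a single-row metadata database: it records
    // where the value starts and how long it is, never the bytes themselves.
    public sealed record Metadata(int StartOffset, int Length);

    private const int MaxCachedNumberLength = 8; // threshold from the PR description

    // Index i holds the shared metadata for every number whose UTF-8 form is i bytes long.
    private static readonly Metadata[] s_numberCache = BuildCache();

    private static Metadata[] BuildCache()
    {
        var cache = new Metadata[MaxCachedNumberLength + 1];
        for (int i = 1; i <= MaxCachedNumberLength; i++)
        {
            cache[i] = new Metadata(StartOffset: 0, Length: i);
        }
        return cache;
    }

    // Short numbers share a cached instance; longer ones get a fresh one.
    public static Metadata ForNumber(ReadOnlySpan<byte> utf8Number) =>
        utf8Number.Length > 0 && utf8Number.Length <= MaxCachedNumberLength
            ? s_numberCache[utf8Number.Length]
            : new Metadata(0, utf8Number.Length);

    // The payload is still stored per document; the metadata only indexes into it.
    public static string Slice(byte[] payload, Metadata meta) =>
        Encoding.UTF8.GetString(payload, meta.StartOffset, meta.Length);
}
```

In this sketch `42` and `17` resolve to the same `Metadata` instance because both are 2 bytes long, yet `Slice` returns different text for each because the payload bytes differ.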

eiriktsarpalis · 2025-06-26

I see now. I take it these are only used if the strings or numbers don't have preceding or trailing whitespace or comments?

PranavSenthilnathan · 2025-06-27

Yep, this is used only in the ParseValue path, so the ReadOnlyMemory<byte> that the JsonDocument holds will only contain the single value without leading/trailing junk.

eiriktsarpalis · 2025-06-27

Can you check if we have relevant test coverage just in case?
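
A check along these lines could exercise the whitespace and comment case through public APIs. This is a sketch of the kind of coverage being asked about, not the tests added to the PR, and `TriviaDemo` is a hypothetical name.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

public static class TriviaDemo
{
    // Deserializes into a JsonElement dictionary (skipping comments) and
    // returns the raw JSON text held by the element stored under key.
    public static string RawTextOf(string json, string key)
    {
        var options = new JsonSerializerOptions
        {
            ReadCommentHandling = JsonCommentHandling.Skip,
        };
        var dict = JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(json, options)!;
        return dict[key].GetRawText();
    }
}
```

For the input `{ "Foo":   42   /* c */ , "Bar": "barValue"   }`, the raw text for `Foo` should be exactly `42` and for `Bar` exactly `"barValue"`, with no surrounding whitespace or comment text.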

cache metadatadb

e4f5a12

PranavSenthilnathan requested a review from eiriktsarpalis June 8, 2025 21:44

PranavSenthilnathan self-assigned this Jun 8, 2025

Copilot AI review requested due to automatic review settings June 8, 2025 21:44

PranavSenthilnathan added the area-System.Text.Json label Jun 8, 2025

Copilot AI reviewed Jun 8, 2025

build-analysis bot mentioned this pull request Jun 9, 2025

System.Net.Quic.Tests.MsQuicTests.WriteTests failed with "System.Net.Quic.QuicException : The connection timed out from inactivity." #105177

Open

eiriktsarpalis reviewed Jun 24, 2025

src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.MetadataDb.cs

PranavSenthilnathan added 2 commits June 27, 2025 22:17

Add tests for trailing trivia

0d82e80

Merge branch 'main' of https://github.com/dotnet/runtime into jsonele…

41bcd20

…ment-primitive

This was referenced Jun 28, 2025

Occasional failure in "browser-wasm windows Release LibraryTests: Build Product" #116671

Open

browser-wasm Windows build error #116746

Open

[iOS/tvOS] System.Runtime.Tests crash with signal 4 #116815

Open

wasm build failure in CI #117017

Open
