Random-access document model for JSON (JsonDocument) #28132

Closed
bartonjs opened this issue Dec 11, 2018 · 10 comments
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.Json

@bartonjs
Member

JsonDocument

This class holds the parsed model of the JSON payload. It rents storage from an array pool, and it implements IDisposable so that the rented storage can be returned to the pool.

Like XmlDocument, this type only really has a creation routine (Parse) and access to the root element.

An instance takes a ReadOnlyMemory during construction, and holds that ReadOnlyMemory until Dispose (caller modification of the data between Parse and Dispose leads to non-deterministic, unsupported behavior).
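
A minimal sketch of the Parse / RootElement / Dispose lifecycle described above. It assumes the System.Text.Json library as it later shipped; the sample payload and the ReadName helper are made up for illustration and are not part of the proposal.

```csharp
using System;
using System.Text;
using System.Text.Json;

byte[] payload = Encoding.UTF8.GetBytes("{\"name\":\"corefx\"}");
Console.WriteLine(ReadName(payload));

// Hypothetical helper: parse UTF-8 bytes, read one string property,
// then return the pooled storage by disposing the document.
static string ReadName(byte[] utf8Json)
{
    // byte[] converts implicitly to ReadOnlyMemory<byte>. The document keeps
    // referencing that memory until Dispose, so the array must not be
    // modified while the document is alive.
    using (JsonDocument doc = JsonDocument.Parse(utf8Json))
    {
        // GetString() allocates a System.String, so the returned value
        // remains usable after the document has been disposed.
        return doc.RootElement.GetProperty("name").GetString();
    }
}
```

Returning a materialized string rather than a JsonElement is deliberate here: the element would be unusable once the using block ends.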

JsonElement

This type is effectively a cursor into the document. It is tied to the document that created it, so once that document is disposed, this type's members throw ObjectDisposedException. To keep memory utilization at a minimum it is a struct, effectively a union of what would otherwise be JsonArray, JsonObject, JsonComment, JsonString, JsonNumber, JsonProperty, and whatever we'd do for true/false/null.
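
To illustrate the lifetime coupling, here is a small sketch (again assuming the shipped System.Text.Json library; the payload is invented) in which an element is read once while its document is alive and once after the document has been disposed.

```csharp
using System;
using System.Text.Json;

JsonDocument doc = JsonDocument.Parse("{\"answer\":42}");
JsonElement answer = doc.RootElement.GetProperty("answer");

// While the document is alive, the element reads from the document's storage.
int value = answer.GetInt32();

doc.Dispose();

bool threwAfterDispose = false;
try
{
    // The struct is only a cursor into the (now disposed) document,
    // so the same call is expected to fail.
    answer.GetInt32();
}
catch (ObjectDisposedException)
{
    threwAfterDispose = true;
}

Console.WriteLine($"value={value}, threwAfterDispose={threwAfterDispose}");
```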

Update

Based on the previous API review:

  • JsonDocument.Parse now also accepts string, ReadOnlySpan<char>, and ReadOnlySequence<byte>, and manages things appropriately.
  • (Try)GetRawBytes is gone
  • TryGetValue overloads are now Try-prefixed versions of the non-Try methods (e.g. TryGetInt32())
  • [Try]GetDecimal was added
  • [Try]GetSingle and [Try]GetUInt32 were added
  • The string (and ReadOnlySpan<char> and ReadOnlySpan<byte>) indexer(s) are now a GetProperty method group to convey that they're more than O(1) expensive.
  • Split EnumerateChildren to EnumerateArray and EnumerateObject.
  • Custom enumerable/enumerators now implement the interfaces
  • EnumerateObject gets JsonProperty values instead of JsonElement (name is delay allocated as a System.String)
  • Added JsonValueType, which is JsonTokenType except:
    • StartObject => Object
    • StartArray => Array
    • None => Undefined (to match the ECMAScript undefined value)
    • -EndObject
    • -EndArray
    • -Comment
  • The JsonReaderOptions parameter of JsonDocument.Parse now defaults to default(JsonReaderOptions)
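
A rough sketch of how several of the changes listed above fit together: the string overload of Parse, the Try-prefixed number getters, the GetProperty/TryGetProperty method group, and EnumerateArray. It assumes the System.Text.Json library as it later shipped, and the JSON payload and variable names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

const string json = "{\"count\":3000000000,\"tags\":[\"a\",\"b\"]}";

using JsonDocument doc = JsonDocument.Parse(json);
JsonElement root = doc.RootElement;

JsonElement count = root.GetProperty("count");

// 3000000000 is a Number but does not fit in Int32, so the Try method
// reports false instead of throwing; Int64 is wide enough.
bool fitsInt32 = count.TryGetInt32(out _);
bool fitsInt64 = count.TryGetInt64(out long asInt64);

// TryGetProperty reports a missing property without throwing,
// whereas GetProperty would throw KeyNotFoundException.
bool hasMissing = root.TryGetProperty("missing", out _);

// EnumerateArray walks the array elements in document order.
var tags = new List<string>();
foreach (JsonElement tag in root.GetProperty("tags").EnumerateArray())
{
    tags.Add(tag.GetString());
}

Console.WriteLine($"fitsInt32={fitsInt32}, fitsInt64={fitsInt64}, asInt64={asInt64}");
Console.WriteLine($"hasMissing={hasMissing}, tags={string.Join(",", tags)}");
```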

Changes NOT made:

  • No first-class support for nullable at this time
  • The property indexer (now deleted) does not return a nullable JsonElement, because neither the indexer nor methods lifted
namespace System.Text.Json
{
    public sealed partial class JsonDocument : IDisposable
    {
        public JsonElement RootElement { get; }
        public void Dispose();
        public static JsonDocument Parse(ReadOnlySequence<byte> utf8Json, JsonReaderOptions readerOptions = default);
        public static JsonDocument Parse(System.ReadOnlyMemory<byte> utf8Json, JsonReaderOptions readerOptions = default);
        public static JsonDocument Parse(System.ReadOnlyMemory<char> json, JsonReaderOptions readerOptions = default);
        public static JsonDocument Parse(string json, JsonReaderOptions readerOptions = default);
    }
    public readonly partial struct JsonElement
    {
        // InvalidOperationException if Type is not Array
        // IndexOutOfRangeException when appropriate
        public JsonElement this[int index] { get; }

        public JsonValueType Type { get; }

        // InvalidOperationException if Type is not Array
        public JsonElement.ArrayEnumerator EnumerateArray();
        public int GetArrayLength();

        // InvalidOperationException if Type is not Object
        public JsonElement.ObjectEnumerator EnumerateObject();
        public bool TryGetProperty(ReadOnlySpan<byte> utf8PropertyName, out JsonElement value);
        public bool TryGetProperty(ReadOnlySpan<char> propertyName, out JsonElement value);
        public bool TryGetProperty(string propertyName, out JsonElement value);
        // KeyNotFoundException if the property is not found
        public JsonElement GetProperty(ReadOnlySpan<byte> utf8PropertyName);
        public JsonElement GetProperty(ReadOnlySpan<char> propertyName);
        public JsonElement GetProperty(string propertyName);

        // InvalidOperationException if Type is not True or False
        public bool GetBoolean();

        // InvalidOperationException if Type is not Number
        // FormatException if value does not fit
        public decimal GetDecimal();
        public double GetDouble();
        public int GetInt32();
        public long GetInt64();
        public float GetSingle();
        public string GetString();
        [CLSCompliantAttribute(false)]
        public uint GetUInt32();
        [CLSCompliantAttribute(false)]
        public ulong GetUInt64();

        // InvalidOperationException if Type is not Number
        // false if value does not fit.
        public bool TryGetDecimal(out decimal value);
        public bool TryGetDouble(out double value);
        public bool TryGetInt32(out int value);
        public bool TryGetInt64(out long value);
        public bool TryGetSingle(out float value);
        [CLSCompliantAttribute(false)]
        public bool TryGetUInt32(out uint value);
        [CLSCompliantAttribute(false)]
        public bool TryGetUInt64(out ulong value);

        public override string ToString();

        public partial struct ArrayEnumerator : IEnumerable<JsonElement>, IEnumerator<JsonElement>, IEnumerable, IEnumerator, IDisposable
        {
            public JsonElement Current { get; }
            public void Dispose();
            public JsonElement.ArrayEnumerator GetEnumerator();
            public bool MoveNext();
            public void Reset();
        }
        public partial struct ObjectEnumerator : IEnumerable<JsonProperty>, IEnumerator<JsonProperty>, IEnumerable, IEnumerator, IDisposable
        {
            public JsonProperty Current { get; }
            public void Dispose();
            public JsonElement.ObjectEnumerator GetEnumerator();
            public bool MoveNext();
            public void Reset();
        }
    }
    public readonly partial struct JsonProperty
    {
        public string Name { get; }
        public JsonElement Value { get; }
    }
    public enum JsonValueType : byte
    {
        Array = (byte)2,
        False = (byte)6,
        Null = (byte)7,
        Number = (byte)4,
        Object = (byte)1,
        String = (byte)3,
        True = (byte)5,
        Undefined = (byte)0,
    }
    public partial struct Utf8JsonWriter
    {
        public void WriteElement(string propertyName, JsonElement value);
        public void WriteElement(ReadOnlySpan<char> propertyName, JsonElement value);
        public void WriteElement(ReadOnlySpan<byte> propertyName, JsonElement value);
        public void WriteElementValue(JsonElement element);
    }
}

In order to keep the overhead low this type does not support data manipulation (insert/append, delete, or update).

@bartonjs
Member Author

Examples:

private static string GetTargetFrameworkMoniker(string runtimeConfigJsonPath)
{
    byte[] assetsJson = File.ReadAllBytes(runtimeConfigJsonPath);

    using (JsonDocument doc = JsonDocument.Parse(assetsJson, default))
    {
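        // The name-based indexer was removed; GetProperty throws KeyNotFoundException if a name is missing.
        return doc.RootElement.GetProperty("runtimeOptions").GetProperty("tfm").GetString();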
    }
}

public static string[] GetProbingPaths(string runtimeConfigJsonPath)
{
    byte[] assetsJson = File.ReadAllBytes(runtimeConfigJsonPath);

    using (JsonDocument doc = JsonDocument.Parse(assetsJson, default))
    {
        JsonElement runtimeOptions = doc.RootElement.GetProperty("runtimeOptions");

        if (runtimeOptions.TryGetProperty("additionalProbingPaths", out JsonElement probingPaths))
        {
            int len = probingPaths.GetArrayLength();

            if (len > 0)
            {
                string[] ret = new string[len];
                int idx = 0;

                foreach (JsonElement element in probingPaths.EnumerateArray())
                {
                    ret[idx++] = element.GetString();
                }

                return ret;
            }
        }
    }

    return Array.Empty<string>();
}

Transliteration of https://github.com/aspnet/AspNetCore/blob/1fd3fb764af4439bbd76d1f4a9601a14c92111ab/src/Security/src/Microsoft.AspNetCore.Authentication.OAuth/Claims/JsonKeyClaimAction.cs#L33-L55:

public override void Run(JsonElement userData, ClaimsIdentity identity, string issuer)
{
    // GetProperty replaces the removed name-based indexer.
    JsonElement value = userData.GetProperty(JsonKey);

    if (value.Type == JsonValueType.Array)
    {
        foreach (JsonElement child in value.EnumerateArray())
        {
            AddClaim(child.ToString(), identity, issuer);
        }
    }
    else if (value.Type != JsonValueType.Object)
    {
        AddClaim(value.ToString(), identity, issuer);
    }
}

private void AddClaim(string value, ClaimsIdentity identity, string issuer)
{
    if (!string.IsNullOrEmpty(value))
    {
        identity.AddClaim(new Claim(ClaimType, value, ValueType, issuer));
    }
}

@Joe4evr
Contributor

Joe4evr commented Dec 12, 2018

So in the API Design Review video, there was a bit of discussion around handling of string literals with the upcoming Utf8String in mind. Target-typing should naturally be a thing:

Utf8String str = "Hello UTF-8 world!";

What wasn't mentioned is that part of the feature's language design, I think, includes (or at least has room for) a literal prefix to force interpretation to a Utf8String:

var str = u"Also a UTF-8 literal.";

Not sure off-hand if that's final or not, but that at least would resolve the concern Immo brought up.

@JesperTreetop
Contributor

Will there also be decimal GetDecimal() and bool TryGetValue(out decimal value) on JsonElement? Parsing it as a double is destructive.
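
To make the precision concern concrete, here is a small sketch using GetDecimal and GetDouble from the API listing above. It assumes the System.Text.Json library as it later shipped, and the payload is invented: a number with more significant digits than a double can represent.

```csharp
using System;
using System.Text.Json;

using JsonDocument doc = JsonDocument.Parse("{\"price\":12345678901234567.89}");
JsonElement price = doc.RootElement.GetProperty("price");

// decimal carries enough significant digits to hold this value exactly.
decimal asDecimal = price.GetDecimal();

// double has roughly 15 to 17 significant decimal digits, so the low digits are lost.
double asDouble = price.GetDouble();

bool decimalIsExact = asDecimal == 12345678901234567.89m;
bool doubleMatchesDecimal = (decimal)asDouble == asDecimal;

Console.WriteLine($"decimalIsExact={decimalIsExact}, doubleMatchesDecimal={doubleMatchesDecimal}");
```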

@TylerBrinkley
Contributor

TylerBrinkley commented Dec 21, 2018

I suppose ObjectEnumerator cannot implement both IReadOnlyDictionary<string, JsonElement> and IEnumerable<JsonProperty>? What's the purpose of having an explicit JsonProperty type as opposed to using KeyValuePair<string, JsonElement>? Lazy conversion of the Name to a string?

@bartonjs
Member Author

bartonjs commented Jan 3, 2019

I suppose ObjectEnumerator cannot implement both IReadOnlyDictionary<string, JsonElement> and IEnumerable<JsonProperty>?

It's just the iterator (and enumerable); the dictionary-like lookups are already on JsonElement. The latest iteration also eliminated the name-based indexers, since they're "more expensive than they look".

What's the purpose of having an explicit JsonProperty type as opposed to using KeyValuePair<string, JsonElement>? Lazy conversion of the Name to a string?

That, and future expansion (e.g. a way to expose the property name as a Utf8String once that feature comes in).
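
As a rough illustration of the enumerator and JsonProperty shape discussed here (assuming the System.Text.Json library as it later shipped; the payload is made up), iterating an object might look like this:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

using JsonDocument doc = JsonDocument.Parse("{\"tfm\":\"netcoreapp3.0\",\"version\":3}");

var names = new List<string>();
foreach (JsonProperty property in doc.RootElement.EnumerateObject())
{
    // Reading Name is what materializes the System.String for the property name;
    // code that only inspects Value never pays for that allocation.
    names.Add(property.Name);

    // Value is an ordinary JsonElement cursor into the same document.
    JsonElement value = property.Value;
    Console.WriteLine($"{property.Name} = {value}");
}
```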

@terrajobst
Member

Modulo a few feedback items we discussed today I think we can mark the API as approved.

@fredrikhr
Contributor

Concerning the discussion about reading streams: how about adding support for parsing from a PipeReader?

@bartonjs
Member Author

dotnet/corefx#34485 did not provide the new methods on Utf8JsonWriter; those are still TBD.

@ahsonkhan
Member

@bartonjs, can this issue now be closed given that we have added interop with reader/writer and document/element?

@bartonjs
Member Author

Yep, good call.

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 3.0 milestone Feb 1, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 14, 2020
@bartonjs bartonjs removed their assignment Jul 26, 2021