Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Utf8JsonReader.Skip() with support for incomplete data #33295

Open
ahsonkhan opened this Issue Nov 7, 2018 · 5 comments

Comments

Projects
None yet
5 participants
@ahsonkhan
Copy link
Member

commented Nov 7, 2018

Edit: Updating API proposal since a void returning Skip() (that we initially discussed) does not work for partial data.

API additions:

public ref partial struct Utf8JsonReader
{
    // Necessary, given customer usage
    // Matches what Json.NET does.
    // If at StartArray/StartObject, skips to end of current object/array returning to a lower depth
    // If at a property name, skip it, and then skip the "value" IF it was a array/object, ending with EndArray/EndObject. If the "value" is a primitive type (not array/object), only skips the property name and nothing else.
    // If at any other token type, does nothing (for example primitive value type), but returns true
    // Returns false IF there wasn't enough data to successfully skip (i.e. Read() returned false).
    // If returning false, walks back the Utf8JsonReader state to before Skip was called. To successfully skip, get more data and create a new instance of the reader.
    // It will never return false if isFinalBlock = true
    // Otherwise, either returns true (for valid JSON) or throws a JsonReaderException for invalid JSON.
    public bool TrySkip() { throw null; }
   // Alternative names: TryFinish{Value}? If we don't add Skip that throws, call this one Skip?

    // In the worst case, TrySkip is 50-70% slower, and generally ~10% slower
    // since it has to recover and reset the reader to the previous state.
    public void Skip() { throw null; } // Throws if isFinalBlock = false

    // Optional helpers - might not be necessary
    public bool TrySkipToDepth(int targetDepth) { throw null; }
    // There are concerns with user passing in reader.CurrentDepth and expecting different semantics
    // No real scenario has come up to support require this yet.
}

Used in SignalR:
https://github.com/aspnet/AspNetCore/blob/9de42d516e6ee9c8eb17700e134f3ac8a7a682ff/src/SignalR/common/Shared/SystemTextJsonExtensions.cs#L57-L72
https://github.com/aspnet/AspNetCore/blob/9de42d516e6ee9c8eb17700e134f3ac8a7a682ff/src/SignalR/common/SignalR.Common/src/Protocol/HandshakeProtocol.cs#L179-L206
https://github.com/aspnet/AspNetCore/blob/9de42d516e6ee9c8eb17700e134f3ac8a7a682ff/src/SignalR/common/Http.Connections.Common/src/NegotiateProtocol.cs#L210-L264

Used in DependencyModel:
https://github.com/dotnet/core-setup/blob/45f9401bf62faf0d3446cfd8681d35cc3487367a/src/managed/Microsoft.Extensions.DependencyModel/UnifiedJsonReader.Utf8JsonReader.cs#L27-L51
https://github.com/dotnet/core-setup/blob/45f9401bf62faf0d3446cfd8681d35cc3487367a/src/managed/Microsoft.Extensions.DependencyModel/DependencyContextJsonReader.cs#L360-L376

Sample usage when TrySkip() returns false:

string jsonString = "{\"property\": {}}";
byte[] dataUtf8 = Encoding.UTF8.GetBytes(jsonString);
JsonReaderState state = default;

//Partial input: "{\"property\": {
var json = new Utf8JsonReader(dataUtf8.AsSpan(0, 14), isFinalBlock: false, state);
json.Read();
Assert.Equal(JsonTokenType.StartObject, json.TokenType);
int consumed = (int)json.BytesConsumed;
int counter = 0;
while (!json.TrySkip())
{
    // False the first call, True the second time
    counter++;
    // get more data somehow (e.g. from underlying stream/pipe), and appropriately set isFinalBlock
    json = new Utf8JsonReader(dataUtf8.AsSpan(consumed), isFinalBlock: true, json.CurrentState);
}
Assert.Equal(1, counter);
Assert.Equal(JsonTokenType.EndObject, json.TokenType);
Assert.Equal(dataUtf8.Length - consumed, json.BytesConsumed);
Assert.False(json.Read());

The naiive implementation (that doesn't support incomplete data, and returns void) is as follows:

        public void Skip()
        {
            if (TokenType == JsonTokenType.PropertyName)
            {
                Read();
            }

            if (TokenType == JsonTokenType.StartObject || TokenType == JsonTokenType.StartArray)
            {
                int depth = CurrentDepth;
                // Note the less than OR equal, given the semantics of CurrentDepth
                while (Read() && depth <= CurrentDepth)
                {
                }
            }
        }

Prototype implementation:
ahsonkhan@7fa12d1#diff-dbc577928a29b3154f5b195f56722391R218

@ahsonkhan

This comment has been minimized.

Copy link
Member Author

commented Mar 13, 2019

@ghuntley

This comment has been minimized.

Copy link
Member

commented Mar 19, 2019

@terrajobst

This comment has been minimized.

Copy link
Member

commented Mar 19, 2019

TrySkip() is unfortunate because it's an unbounded buffering read; compared to calling Read() which would throw data away as you go. That looks like a perf trap. But it's convenient so let's keep it.

We shouldn't do Skip() because callers typically can't know whether it's the final block.

The TrySkipToDepth(int targetDepth) API seems fine. Skipping is a weird term, but hey, that's the term everyone else is using so let's keep that.

@ghuntley

This comment has been minimized.

Copy link
Member

commented Mar 19, 2019

Skipping is a weird term, but hey, that's the term everyone else is using so let's keep that.

Related https://docs.microsoft.com/en-us/dotnet/api/system.xml.xmlreader.skip?view=netstandard-2.0

@ahsonkhan

This comment has been minimized.

Copy link
Member Author

commented Apr 1, 2019

We shouldn't do Skip() because callers typically can't know whether it's the final block.

I want to circle back to this decision. I think we should still provide this API. Just because callers might not know if it's the final block doesn't mean they won't sometimes know. And in that case, asking them to call the unnecessarily slow "TrySkip" and ignore the result (since it will always be true or we throw for invalid JSON data) doesn't make sense.

IsFinalBlock = true: https://github.com/dotnet/core-esetup/blob/18780678da576d8c629066f50af30159a8859b2f/src/managed/Microsoft.Extensions.DependencyModel/DependencyContextJsonReader.Utf8JsonReader.cs#L27
User-provided Skip: https://github.com/dotnet/core-setup/blob/18780678da576d8c629066f50af30159a8859b2f/src/managed/Microsoft.Extensions.DependencyModel/UnifiedJsonReader.Utf8JsonReader.cs#L99-L113

IsFinalBlock = true: https://github.com/aspnet/AspNetCore/blob/9de42d516e6ee9c8eb17700e134f3ac8a7a682ff/src/SignalR/common/SignalR.Common/src/Protocol/HandshakeProtocol.cs#L110
User-provided Skip: https://github.com/aspnet/AspNetCore/blob/9de42d516e6ee9c8eb17700e134f3ac8a7a682ff/src/SignalR/common/Shared/SystemTextJsonExtensions.cs#L57-L72

Therefore, can we add back:

public void Skip() { throw null; } // Throws if isFinalBlock = false

@maryamariyan maryamariyan removed their assignment Apr 3, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.