Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API Proposal: SequenceReader<T>.TryRead overloads to read a specified number of elements #30778

Closed
davidfowl opened this issue Sep 6, 2019 · 37 comments · Fixed by #57921
Closed
Labels
api-approved API was approved in API review, it can be implemented area-System.Buffers help wanted [up-for-grabs] Good issue for external contributors
Milestone

Comments

@davidfowl
Copy link
Member

The use case is to return a contiguous ReadOnlySpan<byte> or a ReadOnlySequence<byte> the current position.

public partial ref struct SequenceReader<T>
{
    bool TryRead(int count, out ReadOnlySpan<T> value);
    bool TryRead(int count, out ReadOnlySequence<T> value);
}

One use case for this type of method is length prefixed network protocols e.g.

void Parse(ref ReadOnlySequence<byte> buffer)
{
    var reader = new SequenceReader<byte>(buffer);
    // Read the length prefix, then read the payload
    if (!reader.TryReadBigEndian(out int length) || !reader.TryRead(length, out ReadOnlySequence<byte> payload))
    {
        return;
    }
    
    ProcessPayload(payload);
}
@davidfowl
Copy link
Member Author

@JeremyKuhne
Copy link
Member

We should consider adding skip/start to align with dotnet/corefx#40871. It isn't as necessary in this case as we're moving the reader forward (unlike peeking, where there is no way to skip without rewinding), but it would improve perf a little.

@softworkz
Copy link

The problem I'm seeing with this approach over my other suggestion here (https://github.com/dotnet/corefx/issues/40885) is the code flow.

  1. You might not always want to actually consume and advance
  2. You might want to read from another position
  3. You don't want to check the return value each time:
    When parsing certain protocols, you'd rather want to check the remaining content for parsing one time, and then read out a number of values subsequently. Checking the return value each time makes the parsing code ugly.
    I think it should rather throw an Exception instead when there's not enough data in the sequence.

Also this does not allow elegant parsing code like in the "Example 2" here: https://github.com/dotnet/corefx/issues/40885

@JeremyKuhne
Copy link
Member

You might not always want to actually consume and advance

The TryPeek APIs dotnet/corefx#40871 should address that part of it.

You might want to read from another position

I don't think we should be turning the reader into a random access reader. The APIs are specifically designed around performance, skipping around to random SequencePositions is expensive and will increase the cognitive overhead on how to use the reader efficiently.

I think it should rather throw an Exception instead when there's not enough data in the sequence.

Only our positioning APIs throw, and they never should if your code is written correctly. I don't think we'll want to change the design so that some read methods throw and some don't. It also makes writing high performance consuming code more difficult as try..catch makes JIT optimizations much more difficult.

Also this does not allow elegant parsing code like in the "Example 2" here: dotnet/corefx#40885

While you can't write an inline call with the out, it allows the no-throw design for high performance. The current design is pretty consistent, I don't think we want to mix another usage pattern in.

@softworkz
Copy link

softworkz commented Sep 6, 2019

I have written high performance parsing code and the bottom line was that I was faster by copying to a byte array and parsing from there than by using the insufficient crippled API of the SequenceReader in its current form.

The TryPeek APIs dotnet/corefx#40871 should address that part of it.

You're exactly hitting the problem here: "part of it"

Now tell me, which code should a consumer of this API write, when he doesn't need to peek for a byte, but for an unsigned 32bit integer?

@JeremyKuhne
Copy link
Member

You're exactly hitting the problem here: "part of it"

No API is going to address every use case. I'm not sure what you're suggesting here.

Now tell me, which code should a consumer of this API write, when he doesn't need to peek for a byte, but for an unsigned 32bit integer?

We have binary reader extensions. We didn't put peek versions of those as there was no request. Please feel free to create a proposal if you think it's useful. You can build the same sort of helpers in the meantime.

@softworkz
Copy link

What proposing here is the functionality that any of those helper methods would need to implement. So why should we propose a hundred helper method when a single method would be sufficient?

Those 6 extension methods that you are referencing is just a minimal subset of methods available from BinaryPrimitives for example.

Also, .NET Core recently added hundreds of various TryParse(..) overloads working on spans (mostly ReadOnlySpan<char). The single method that I'm proposing here allows using all of them.

Are you seriously suggesting that I should create proposals for duplicating all those TryParse and binary extension methods?

No API is going to address every use case. I'm not sure what you're suggesting here.

It's not about a specific single use case. It's about the ability to make this usable in a hundred of cases.

That single method that I'm proposing would allow that.

@JeremyKuhne
Copy link
Member

Are you seriously suggesting that I should create proposals for duplicating all those TryParse and binary extension methods?

No. I'm suggesting you create TryPeek equivalent set of the things you want, if you want them. If you don't, that's ok.

It's about the ability to make this usable in a hundred of cases. That single method that I'm proposing would allow that.

I don't see how understand how your proposal provides more functionality then this proposal or the "peek" option that doesn't advance the reader. Yes, you can do it inline, but I've already gone into the reasons why we are unlikely to do that. If you have examples to demonstrate where your proposal provides functionality this does not please share them.

@JeremyKuhne
Copy link
Member

For the proposal example, here is what you currently would have to write:

void Parse(ref ReadOnlySequence<byte> buffer)
{
    var reader = new SequenceReader<byte>(buffer);

    // Read the length prefix, validate we have enough content left
    if (!reader.TryReadBigEndian(out int length) || reader.Remaining < length)
    {
        return;
    }

    ProcessPayload(reader.Sequence.Slice(reader.Position, length));
}

With this proposal it becomes:

void Parse(ref ReadOnlySequence<byte> buffer)
{
    var reader = new SequenceReader<byte>(buffer);
    // Read the length prefix, then read the payload
    if (!reader.TryReadBigEndian(out int length) || !reader.TryRead(length, out ReadOnlySequence<byte> payload))
    {
        return;
    }
    
    ProcessPayload(payload);
}

It's more awkward for a span in the current code as you'd have to use TryCopyTo and manage your own span buffer. Presumption is the span API here would allocate an array like the other APIs do when it has to.

@softworkz
Copy link

softworkz commented Sep 6, 2019

With my proposal (https://github.com/dotnet/corefx/issues/40885) it's a one-liner:

var myValue = BinaryPrimitives.ReadUInt32BigEndian(reader.GetSpan(offset, 4));

PS: I need UInt32, not Int32 - what would I do currently (except casting)?

@davidfowl
Copy link
Member Author

With your proposal what gets returned when length exceeds the bounds?

@JeremyKuhne
Copy link
Member

JeremyKuhne commented Sep 6, 2019

PS: I need UInt32, not Int32 - what would I do currently (except casting)?

Just cast, there is another discussion bout adding the unsigned overloads. There is no cost to casting, which is why we didn't add them originally- just trying to be deliberate in adding API so we didn't end up with a really huge surface area. #30580

@softworkz
Copy link

softworkz commented Sep 6, 2019

With your proposal what gets returned when length exceeds the bounds?

An IndexOutOfRangeException (or similar). I have written parsing code that is processing large volumes of data and there's just a single try catch at the very top of the call stack.
Jeremy pointed out that high performance code should not use Exception handling and I agree in a way that it should not be used for controlling code flow.
The caller can check if there's enough data for calling, so IMO it's legit to throw as that would be the caller's fault.

@JeremyKuhne
Copy link
Member

The caller can check if there's enough data for calling, so IMO it's legit to throw as that would be the caller's fault.

There is also some cost to adding the throw in the reader as well. I spent over two months getting the fast path as fast as possible- adding a single line or method call can measure in some of the cases.

Perf not withstanding, we didn't go that route with the design here so it would not align with the rest of the API.

@softworkz
Copy link

The caller can check if there's enough data for calling, so IMO it's legit to throw as that would be the caller's fault.

There is also some cost to adding the throw in the reader as well. I spent over two months getting the fast path as fast as possible- adding a single line or method call can measure in some of the cases.

The check for the condition ("buffer long enough") is required in either case.
How would some conditionally called code affect performance by its sole existence as long as it's never called?
Or is the compiler adding some MSIL stuff due to the fact that the method could throw?

@JeremyKuhne
Copy link
Member

Or is the compiler adding some MSIL stuff due to the fact that the method could throw?

When we do throw we use throw "helpers" in these critical paths which put the throw in another method as a throw prevents the method from getting inlined. Just having another method call can also impact the ability of the JIT to inline / optimize in other ways. For the fast paths the overhead is very, very small so seemingly trivial things can have a measurable impact.

@softworkz
Copy link

I understand, but I'm not sure if the impact can be really significant considering that even the most elementary members are throwing exceptions - just think of an array's item accessor.

@JeremyKuhne
Copy link
Member

I understand, but I'm not sure if the impact can be really significant considering that even the most elementary members are throwing exceptions - just think of an array's item accessor.

Sure, bounds checking period has a cost and it is enough to measure in the fast paths here where you're effectively just doing Span operations. There are some incredibly subtle JIT interactions we're considering in the implementation here- matching arguments to sub methods, having a single final return, etc. As I mentioned, we spent months flogging this code, running perf measurements and looking at the output IL and final JIT'ted code. It was quite the slog trying to squeeze every last drop of perf we could out of it (without resorting to unsafe code if possible).

@softworkz
Copy link

A few months ago - when I was struggling trying to make some reasonable use of ReadOnlySequence - I've been reading through a lot of discussions here on that subject involving @stephentoub and @davidfowl and a few others from which I gained the impression that - under acknowledgement of the limited accessibility of the ReadOnlySequence API - there should be introduced something allowing some real-world use of this - namely the SequenceReader.

I had assumed that the low-level kind of use would be using ReadOnlySequence directly and the purpose of the SequenceReader would be to allow using it in a more convenient way.
That's why I've been criticizing that the SequenceReader doesn't make it much easier to consume a ReadOnlySequence and instead of doing better, it shares many of the oddities of the ReadOnlySequence API. It might be highly highly optimized now, but it doesn't meet the intended purpose.

Or have I misunderstood the intended purpose of the SequenceReader?

@davidfowl
Copy link
Member Author

I don’t think it’s fair to say it doesn’t make using the ReadOnlySequence easier. That may not be the case for your specific scenario but it certainly does it make it easier to use for a majority of other scenarios and also makes several nasty edge cases completely disappear.

@davidfowl
Copy link
Member Author

It might also be good to have an API like this on ReadOnlySequence directly

@JeremyKuhne
Copy link
Member

Or have I misunderstood the intended purpose of the SequenceReader?

The intent was to make it easier to handle cross segment reading, with a high priority on performance given the desire to use this in performance critical situations (e.g. handling web requests). While we didn't purposefully want to make it hard to use, it was specifically designed to be high performance no matter how you use it. That meant that it had to be somewhat aligned with the reality of the data structure it was working with- which doesn't fall directly in line with models people are used to such as Stream or StreamReader/Writer.

We're trying to be very deliberate with adding/expanding API to try and make sure that we're adding APIs that will stand up to the test of time. These APIs will be around for decades after all. :)

@softworkz
Copy link

I think the problem is that you haven't recognized yet, that there's a much wider range of potential for this technique. You have focused on and optimized for just a small range of specific use cases. But there's more - even cases where performance doesn't play a primary role. That's why I think that API proposals should not only be judged by how it would perform in those specific scenarios that you are currently focusing on.

Of course, each public method should do its job with the best possible performance.
There are always operations that are more costly than others, but when the result is required and there's no other way to achieve the same in a faster way, then that shouldn't be a reason not to implement that method.
Probably it would still be faster than the user doing it with custom code.

@FiniteReality
Copy link

I personally don't like the first overload. It's deceptively simple, and the only sensible approach when it comes up to a boundary would be to fail. For example:

// Sequence:
// [\x08 1 2 3 4] [5 6 7 8]

bool TryReadMessage(ref ReadOnlySequence<byte> buffer, out string message)
{
    var reader = new SequenceReader<byte>(buffer);
    if (!reader.TryRead(out byte length) || !reader.TryRead(length, out ReadOnlySpan<byte> data))
    {
        message = null;
        return false;
    }

    message = Encoding.UTF8.GetString(data);
    return true;
}

I would expect this method to return false, as the read could not read the specified number of bytes due to the boundary in the sequence.

This, in my opinion, means that more code will be written to only use the second ReadOnlySequence overload, ignoring the first overload entirely.

@davidfowl
Copy link
Member Author

It wouldn't fail, it would allocate an array.

@FiniteReality
Copy link

It wouldn't fail, it would allocate an array.

That seems a bit odd to me considering most of these APIs seem to be designed to avoid that.

@davidfowl
Copy link
Member Author

That seems a bit odd to me considering most of these APIs seem to be designed to avoid that.

There's already prior art with TryReadTo. The ReadOnlySpan<T> overload gives you a span if it can in a non allocating slice and will fallback to allocating if it can't. https://github.com/dotnet/corefx/blob/59c6b1ab96aa2e99ed6ecb33a8bcf9283ce1ea27/src/System.Memory/src/System/Buffers/SequenceReader.Search.cs#L80

@FiniteReality
Copy link

I see. Thanks for the clarification.

@JeremyKuhne
Copy link
Member

JeremyKuhne commented Sep 10, 2019

Video

Approved: https://github.com/dotnet/corefx/issues/40962#issuecomment-530060857

        public bool TryRead(int length, out ReadOnlySpan<T> value);
        public bool TryRead(long length, out ReadOnlySequence<T> value);

@softworkz
Copy link

It wouldn't fail, it would allocate an array.

That seems a bit odd to me considering most of these APIs seem to be designed to avoid that.

@FiniteReality - You might be interested in my comment https://github.com/dotnet/corefx/issues/40871#issuecomment-530088885 regarding allocation probability.

@ahsonkhan
Copy link
Member

See https://github.com/dotnet/corefx/issues/40871#issuecomment-542928968

Based on having recently ratified guidance for working with spans, the API as reviewed needs discussion again; because we feel that when returning spans it should always be caller-provided data.

So bool TryPeek(int skip, int count, Span<T> destinationIfAcrossSegments, out ReadOnlySpan<T> value) is a better fit, the returned (via out) span is either came from the ctor (a slice of CurrentSpan) or a slice of value. But nothing is ever "made up" then returned via the span.

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 5.0 milestone Feb 1, 2020
@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review labels Feb 20, 2020
@terrajobst
Copy link
Member

terrajobst commented Feb 20, 2020

Video

  • The first API will have to allocate when count causes spanning blocks in the sequence. The bigger the count, the more likely it is to cross such a boundary. This could be changed to allow the caller to pass in a pre-allocated span or we could change the semantics to only succeed if the data is available contiguously. This API will need more design.
  • The second overload is fine but we should change the name to TryReadExact to make it clear that returning less or more is both not possible.
public partial ref struct SequenceReader<T>
{
    bool TryReadExact(int count, out ReadOnlySequence<T> value);
}

@reflectronic
Copy link
Contributor

We talked about this API (and some others) more in #30807 (comment). Specifically, we discussed putting the out parameter first (as it is on every other SequenceReader API), and also renaming value to sequence. For the sake of having a congruent API surface, that should probably be done for this API.

On the topic of the Span overload, I think that keeping the overload is also important for having a congruent API surface. There are existing APIs like TryReadTo and TryReadToAny that take both Span and Sequence overloads; it would be strange/annoying to have a new API, which is similar in purpose to the existing ones, but with no Span overload "because reasons."

Of course, having some type of alternative which doesn't necessarily allocate is useful, especially since the buffer size can be much more predictable. But I think that the matching APIs should be added for this method, and likewise anything new we come up with should be added to TryReadTo/TryReadToAny/TryPeek/etc.

@tannergooding
Copy link
Member

Moved this to future and marked as up-for-grabs. @davidfowl if this is needed for 6.0.0, please let us know.

@davidfowl
Copy link
Member Author

This is fine for .NET 7

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 23, 2021
@lateapexearlyspeed
Copy link
Contributor

Hi, I would try and learn this. I have raised draft PR and will change its status when it is ready for review, thanks !

@lateapexearlyspeed
Copy link
Contributor

PR is ready for review now, please review, thanks.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 13, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Nov 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Buffers help wanted [up-for-grabs] Good issue for external contributors
Projects
None yet
Development

Successfully merging a pull request may close this issue.