Proposal: SequenceReader<T>.ReadToEnd / AdvanceToEnd #29360
Comments
cc @JeremyKuhne to triage/give first pass. |
This is presuming that you want a […]. I can see the argument where you want to pass the reader on to something else that will read more if there is anything left to read. In that case we can provide an optimized AdvanceToEnd. I'm a little more hesitant on ReadToEnd:
var remaining = sequenceReader.Sequence.Slice(sequenceReader.Position);
sequenceReader.AdvanceToEnd();
It's also a bit of an odd fit with the rest of the API. We don't have anything else quite the same. |
Yeah, that's mostly the idea. The way I understand it, I might be wrong there, but I expect it to happen quite often that one wishes to consume the whole remaining sequence if no delimiter were found.
My proposal was actually more about […]. In fact, a quick search through the code in AspNetCore led me to this:
More context
I'm basically trying to implement a parser atop […]. I can split text in chunks, but entities need to be decoded as a whole. So, the logic would be as follows:
The fact that |
That was my understanding as well. In designing the reader it was driven primarily by performance. I'll take this to review, the others may allay my concerns or come up with good terminology for the
I'm happy to see additional proposals come through. :) |
@JeremyKuhne it seems to me this is not required to ship 3.0. Moving to future. If I am wrong please let me know. |
In building a small sample for using SequenceReader I stumbled into a similar requirement. See https://github.com/stevejgordon/SequenceReaderSample In my case, I'm parsing a sequence of comma-separated values. Once I've found all items using the delimiter, the remainder of the stream is assumed to be the final item (no delimiter after it). My sample grabs the final data using TryCopyTo which works but is something I'd see having to do often enough that a method to ReadToEnd would be handy. I'd like to get the final data as a Span directly. @JeremyKuhne - I'd welcome any feedback you have on the sample in general as I plan to post a blog about what it does when using SequenceReader. |
@stevejgordon Thanks for the extra info. In your example, the final item doesn't need a reader; it can simply be […]. I'd also consider changing […] |
Thanks very much, @JeremyKuhne. I've updated the sample per your feedback. I appreciate you getting back to me so quickly. Once the post is done I'll ping the link over to you. |
Published an introduction to using SequenceReader based on my sample to https://www.stevejgordon.co.uk/an-introduction-to-sequencereader |
@stevejgordon nice! One note- the |
@JeremyKuhne I started working on the doc, and I plan to open a PR soon. |
@JeremyKuhne To be honest, I did scan through that API documentation and should probably include a link to that somewhere in the post! I was referring to more user-facing, here's how you use it documentation which it seems @davidfowl has covered! :-) |
Based on offline discussion with @JeremyKuhne, since we already have the property […]
ReadOnlySequence<byte> unreadSequence = reader.Sequence.Slice(reader.Position);
public ref partial struct SequenceReader<T> where T : unmanaged, IEquatable<T>
{
// …
public void AdvanceToEnd() { /* … */ }
public ReadOnlySequence<T> UnreadSequence { get; }
// …
} |
This was approved in dotnet/corefx#40962 as:
namespace System.Buffers
{
public ref struct SequenceReader<T>
{
// Optimized API to position the reader at the end of the sequence (much faster than what users can write)
public void AdvanceToEnd();
// Pairs with existing Span<T> UnreadSpan;
public readonly ReadOnlySequence<T> UnreadSequence { get; }
}
}
Here are the rough implementations:
/// <summary>
/// The unread portion of the <see cref="Sequence"/>.
/// </summary>
public readonly ReadOnlySequence<T> UnreadSequence
{
get => Sequence.Slice(Position);
}
public void AdvanceToEnd()
{
if (_moreData)
{
Consumed = Length;
CurrentSpan = default;
CurrentSpanIndex = 0;
_currentPosition = Sequence.End;
_moreData = false;
}
}
Marking as up-for-grabs to fully implement/test. |
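As a rough illustration of how the two approved members fit together, here is a small sketch that splits comma-separated UTF-8 data, similar in spirit to the sample discussed earlier in the thread. The names CsvFieldSplitter and Split are hypothetical, and the sketch assumes a runtime where UnreadSequence and AdvanceToEnd are available (this should mean .NET 5 or later).

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Text;

public static class CsvFieldSplitter
{
    // Hypothetical helper: splits a UTF-8 buffer on commas. The text after the
    // last comma is taken from UnreadSequence instead of a manual Slice call.
    public static List<string> Split(ReadOnlySequence<byte> buffer)
    {
        var fields = new List<string>();
        var reader = new SequenceReader<byte>(buffer);

        // Consume every delimited field; the reader advances past each comma.
        while (reader.TryReadTo(out ReadOnlySequence<byte> field, (byte)','))
        {
            fields.Add(Encoding.UTF8.GetString(field.ToArray()));
        }

        // No delimiter left: whatever is unread is treated as the final field.
        ReadOnlySequence<byte> rest = reader.UnreadSequence;
        fields.Add(Encoding.UTF8.GetString(rest.ToArray()));

        // Position the reader at the end of the sequence in one step.
        reader.AdvanceToEnd();
        return fields;
    }
}
```

Copying each field with ToArray keeps the sketch short; code that cares about allocations would more likely work on the spans or sequences directly.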
Rationale
Today, one can read a sequence up to any delimiter and advance to/after the delimiter by using SequenceReader<T>.TryReadTo or SequenceReader<T>.TryReadToAny.
This is quite convenient when working with bounded and well-delimited data, but can be cumbersome when processing sequences of an unbounded data stream (e.g. data coming from a PipeReader in an uncontrolled manner).
The problem this proposal seeks to address is: Read data as far as possible until any delimiter is reached.
It applies in these cases
It is possible today to have something like that:
Or something like that:
However, both these options come with their own problems:
[…] SequenceReader<T>.Advance, which will enumerate every remaining segment once. This is pointless and wasteful if the desired result is simply to jump to the end, and also redundant because it has already been done by the call to SequenceReader<T>.TryReadToAny.
[…] ReadOnlySequence<byte>, while that may not have been required in the first place.
In this case, an API such as SequenceReader<T>.ReadToAnyOrEnd would work, but it would still feel a bit clumsy to use, compared to other methods.
Plus, half of the desired feature can already be achieved from another method, so it would be better to provide only the other half if possible.
Hence this proposal about adding the ability to skip to the end of a sequence.
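The snippets for the two options mentioned above are not preserved in this copy of the thread. As a rough sketch only, a workaround of the kind described might look like the following. The class ManualReadToEnd and the helper ReadToAnyOrEnd are made up for illustration. The sketch relies only on members that already existed when the proposal was written (TryReadToAny, Sequence, Position, Advance, Remaining).

```csharp
using System;
using System.Buffers;

public static class ManualReadToEnd
{
    // Hypothetical helper: returns the data up to the first delimiter, or
    // everything that is left when no delimiter is present.
    public static ReadOnlySequence<byte> ReadToAnyOrEnd(
        ref SequenceReader<byte> reader, ReadOnlySpan<byte> delimiters)
    {
        if (reader.TryReadToAny(out ReadOnlySequence<byte> chunk, delimiters,
                                advancePastDelimiter: false))
        {
            return chunk;
        }

        // Fallback: slice the unread part by hand...
        ReadOnlySequence<byte> rest = reader.Sequence.Slice(reader.Position);
        // ...then advance over it, which is the extra pass the proposal
        // would like to avoid.
        reader.Advance(reader.Remaining);
        return rest;
    }
}
```

The final Advance call is the part that an AdvanceToEnd method would replace.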
Proposed API
Method AdvanceToEnd would simply update the internal state of the reader to the end of the sequence. Hopefully, no complex operation will be required here.
Method ReadToEnd would be pretty similar, but return the remaining part of the sequence before advancing to the end.
Example
Read the maximum amount of data until a delimiter:
Skip to delimiter(s) or skip the sequence entirely:
Questions
Should it be bool TryReadToEnd(out ReadOnlySequence<T> sequence) instead of ReadOnlySequence<T> ReadToEnd()? (Returns false if the end has already been reached)