Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review System.Buffers.SequenceReader<T> proposals #30807

Closed
JeremyKuhne opened this issue Sep 10, 2019 · 12 comments
Closed

Review System.Buffers.SequenceReader<T> proposals #30807

JeremyKuhne opened this issue Sep 10, 2019 · 12 comments
Assignees
Milestone

Comments

@JeremyKuhne
Copy link
Member

This is a meta issue for #30779, #30778, #30771, #30770, and #29360 which add new APIs to SequenceReader<T>. This captures the overall change for the review. Detailed discussions are in the linked issues.

Proposed new surface area:

namespace System.Buffers
{
    public ref struct SequenceReader<T>
    {
        // Optimized API to position the reader at the end of the sequence (much faster than what users can write)
        public void AdvanceToEnd();

        // Pairs with existing Span<T> UnreadSpan;
        public ReadOnlySequence<T> UnreadSequence { get; }

        // Peeking out T, while skipping. This is more performant than users can write (avoids rewinding).
        public bool TryPeek(int skip, out T value);

        // Overloads for TryRead that allow reading out a given count rather than to some delimiter (as with existing
        // API span out will slice if it can or allocate and copy if it has to).
        bool TryRead(int count, out ReadOnlySpan<T> value);
        bool TryRead(int count, out ReadOnlySequence<T> value);

        // Equivalent "Peek" versions. They need a skip as peeking doesn't advance the reader and rewinding is super expensive.
        public bool TryPeek(int count, out ReadOnlySpan<T> value);
        public bool TryPeek(int skip, int count, out ReadOnlySpan<T> value);
        public bool TryPeek(int count, out ReadOnlySequence<T> value);
        public bool TryPeek(int skip, int count, out ReadOnlySequence<T> value);
 
        // Pairs with existing TryCopyTo(Span<T> destination), which does not advance the reader (neither does this)
        public bool TryCopyTo(int skip, Span<T> destination);

        // Also proposed, but having this can lead to writing slow code.
        public bool SetPosition(SequencePosition position);
   }
}
@JeremyKuhne JeremyKuhne self-assigned this Sep 10, 2019
@ahsonkhan
Copy link
Member

cc'ing interested parties based on the issues listed:
@davidfowl, @GoldenCrystal, @stevejgordon, @softworkz, @FiniteReality,

@softworkz
Copy link

IMO, the SequenceReader should not get any API added involving a SequencePosition parameter. Right now, it doesn't have any other - except the Position property.

I think there should be a clear separation between ReadOnlySequence and SequenceReader APIs, where ReadOnlySequence is dealing with SequencePosition objects and SequenceReader using integer offsets.

@jkotas jkotas changed the title Review System.Buffers proposals Review System.Buffers.SequenceReader<T> proposals Sep 10, 2019
@terrajobst
Copy link
Member

terrajobst commented Sep 10, 2019

Video

  • We should rename the skip parameters to offset
    • We don't use skip anywhere else and it kind of implies a side effect
    • offset implies relative to something and for a reader it makes to be relative to current position
  • We should the count parameter to length
  • We should remove the overloads of TryPeek that take two arguments and whose first argument is count -- because it conflicts TryPeek(int skip, out T value)
  • Arguments referring to position and lengths should be typed as long
  • We should add readonly annotations
  • It's sad that TryPeek() that returns a span might have to allocate if the span crosses buffers but there is no way around that.
  • FDG: we need to consider what we do for out parameters where the out type is the same as the input type as that allows you modify the this while it's running (ask Levi for details).
  • public void SetPosition(SequencePosition position); needs more work as there are a lot of concerns (like O(n) complexity).
namespace System.Buffers
{
    public ref struct SequenceReader<T>
    {
        // Optimized API to position the reader at the end of the sequence (much faster than what users can write)
        public void AdvanceToEnd();

        // Pairs with existing Span<T> UnreadSpan;
        public readonly ReadOnlySequence<T> UnreadSequence { get; }

        // Overloads for TryRead that allow reading out a given count rather than to some delimiter (as with existing
        // API span out will slice if it can or allocate and copy if it has to).
        public bool TryRead(int length, out ReadOnlySpan<T> value);
        public bool TryRead(long length, out ReadOnlySequence<T> value);

        // Peeking out T, while skipping. This is more performant than users can write (avoids rewinding).
        public readonly bool TryPeek(long offset, out T value);

        // Equivalent "Peek" versions. They need a offset as peeking doesn't advance the reader and rewinding is super expensive.
        public readonly bool TryPeek(long offset, int length, out ReadOnlySpan<T> value);
        public readonly bool TryPeek(long offset, long length, out ReadOnlySequence<T> value);
 
        // Pairs with existing TryCopyTo(Span<T> destination), which does not advance the reader (neither does this)
        public readonly bool TryCopyTo(long offset, Span<T> destination);
   }
}

@reflectronic
Copy link
Contributor

reflectronic commented Nov 4, 2019

Shouldn't the parameter names be span and sequence for methods with either overload? Existing methods that provide both that follow that pattern, and it is useful for writing code like reader.TryReadTo(span: out var span, '\n').

Also, the existing methods have the out parameter first & I think that these should stay congruent with those methods.

@ahsonkhan
Copy link
Member

Shouldn't the parameter names be span and sequence for methods with either overload? Existing methods that provide both that follow that pattern, and it is useful for writing code like reader.TryReadTo(span: out var span, '\n').

Being consistent with existing APIs make sense to me (rename value to span, sequence).

Also, the existing methods have the out parameter first & I think that these should stay congruent with those methods.

Maybe. I can see changing the TryRead APIs with the length parameter to have the out paramter first. I am not sure about the TryPeek ones that have offset/length. But again, like you said, they would be more consistent.

I would be fine with both changes, but curious to hear what others have to say (@terrajobst, @JeremyKuhne, @bartonjs). The suggestion is:

    public ref struct SequenceReader<T>
    {
-        public bool TryRead(int length, out ReadOnlySpan<T> value);
+        public bool TryRead(out ReadOnlySpan<T> span, int length);
-        public bool TryRead(long length, out ReadOnlySequence<T> value);
+        public bool TryRead(out ReadOnlySequence<T> sequence, long length);

-        public readonly bool TryPeek(long offset, out T value);
+        public readonly bool TryPeek(out T value, long offset);

-        public readonly bool TryPeek(long offset, int length, out ReadOnlySpan<T> value);
+        public readonly bool TryPeek(out ReadOnlySpan<T> span, long offset, int length);
-        public readonly bool TryPeek(long offset, long length, out ReadOnlySequence<T> value);
+        public readonly bool TryPeek(out ReadOnlySequence<T> sequence, long offset, long length);
   }

Here are the relevant existing APIs that shipped in 3.0 (for reference):

        public readonly bool TryPeek(out T value) { throw null; }
        public bool TryRead(out T value) { throw null; }
        public bool TryReadTo(out System.Buffers.ReadOnlySequence<T> sequence, System.ReadOnlySpan<T> delimiter, bool advancePastDelimiter = true) { throw null; }
        public bool TryReadTo(out System.Buffers.ReadOnlySequence<T> sequence, T delimiter, bool advancePastDelimiter = true) { throw null; }
        public bool TryReadTo(out System.Buffers.ReadOnlySequence<T> sequence, T delimiter, T delimiterEscape, bool advancePastDelimiter = true) { throw null; }
        public bool TryReadTo(out System.ReadOnlySpan<T> span, T delimiter, bool advancePastDelimiter = true) { throw null; }
        public bool TryReadTo(out System.ReadOnlySpan<T> span, T delimiter, T delimiterEscape, bool advancePastDelimiter = true) { throw null; }
        public bool TryReadToAny(out System.Buffers.ReadOnlySequence<T> sequence, System.ReadOnlySpan<T> delimiters, bool advancePastDelimiter = true) { throw null; }
        public bool TryReadToAny(out System.ReadOnlySpan<T> span, System.ReadOnlySpan<T> delimiters, bool advancePastDelimiter = true) { throw null; }

@bartonjs
Copy link
Member

bartonjs commented Nov 13, 2019

out values always go after all non-out values (except defaulted parameters)

@ahsonkhan
Copy link
Member

ahsonkhan commented Nov 13, 2019

out values always go after all non-out values (except defaulted parameters)

There are some existing SequenceReader APIs that don't follow that though (out parameter is the first one for existing TryReadTo APIs). So, is it better to be consistent within the type here?

@MarcoRossignoli
Copy link
Member

@ahsonkhan this meta issue has got approved label but seem that a lot of related issue are ready to review...can you confirm?I'd like to work on some of those but I don't know if make sense.

@JeremyKuhne
Copy link
Member Author

@MarcoRossignoli- we need to get the other issues resolved again, so we can't do those yet.

We need to revisit: dotnet/corefx#40780 and #30779.

@ahsonkhan
Copy link
Member

We need to revisit: dotnet/corefx#40780 and #30779.

FYI, that should be #30778 (not dotnet/corefx#40780).

@MarcoRossignoli
Copy link
Member

Ok thanks!

@JeremyKuhne
Copy link
Member Author

Closing the meta issue and tracking via the linked issues.

@dotnet dotnet locked as resolved and limited conversation to collaborators Dec 12, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants