-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Improve default handling in ReadOnlySequence.Slice #38431
Conversation
@@ -455,7 +455,7 @@ public ReadOnlySequence<T> Slice(SequencePosition start, SequencePosition end) | |||
public ReadOnlySequence<T> Slice(SequencePosition start) | |||
{ | |||
BoundsCheck(start); | |||
return SliceImpl(start); | |||
return SliceImpl(start.Equals(default) ? Start : start); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I recall, this code is quite perf-sensitive. Are we OK with introducing branches with checks like these?
Maybe start.GetObject() == null
instead?
It's probably not worth it, but consider even calling a SliceImpl helper that takes object/int (and passing in _startObject
, _startInteger
rather than creating a SequencePosition to pass in which gets deconstructed anyway).
cc @davidfowl, @pakrym
@@ -355,7 +355,7 @@ private void BoundsCheck(in SequencePosition position) | |||
// Storing this in a local since it is used twice within InRange() | |||
ulong startRange = (ulong)(((ReadOnlySequenceSegment<T>)startObject!).RunningIndex + startIndex); | |||
if (!InRange( | |||
(ulong)(((ReadOnlySequenceSegment<T>)position.GetObject()!).RunningIndex + sliceStartIndex), | |||
(ulong)((position.GetObject() != null ? ((ReadOnlySequenceSegment<T>)position.GetObject()!).RunningIndex : 0) + sliceStartIndex), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems counter to the annotated bang (!) on position.GetObject()
. @safern - what should be the correct annotation here? Should we remove the !
?
nit: We may want to extract this out in a local to help with readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler will not do deep data flow analysis here to track that you already checked for null on the call of GetObject()
and that you're calling it again when it is not null. So unfortunately the !
is still needed here. I just validated that: https://sharplab.io/#v2:EYLgtghgzgLgpgJwD4AEBMBGAsAKFwYgDsBXAG1ImFLgAI5DLrcUBmG9GgYRoG9caB7NgEtCMGgFkAFABEaABwD2UYTGGLCASl79BASBQB2BctXrCAOgDicGAHlgAKzgBjGFO0BCALw0S5GgB+GikpAC1NJRU1DWtbB2c3D00LAElCABM4AA8aEBoABgBuXQEAX1wKvBxWdjQaOT4cPVKaRSdXGGCAfXbE8V9/UhKcQRpW2r7O4Jt7DqTtbwA+Gl75mBGq5jYOMJ1m2tFxdKzcnhoAc1simihrmjKaXwxNoA
If perf is not affected and adding an extra local doesn't hurt, that would be the only way to remove the !
, but if that hurts the produced IL we should leave as is.
cc: @cston I guess this behavior is by design and not something the compiler can infer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@safern, yes, this is by design - the compiler tracks the nullability of locals, parameters, and fields and properties of those only.
The test location seems fine to me. For perf testing, I would start by looking at the existing perf tests which are measuring just the API call specifically (do a before/after comparison). If there are some tests missing, please add them: For more general, end-to-end testing and impact, you would have to take a look at where such APIs are used most heavily, which in this case is in Kestrel Web Server (this is probably not necessary for such a change). For instance: cc @jkotalik, who might also be able to help with that (if needed) |
Edit: Patch is complete now. Can be reviewed Just adding a note here: the patch is not complete yet. There’s bugs in the other Slice APIs that I’m fixing. I’ll have a complete patch up later tonight or tomorrow morning |
ba77d2d
to
0a9071b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, LGTM (pending some perf numbers, even from micro-benchmarks)
src/System.Memory/tests/ReadOnlyBuffer/ReadOnlySequenceTests.Default.cs
Outdated
Show resolved
Hide resolved
src/System.Memory/tests/ReadOnlyBuffer/ReadOnlySequenceTests.Default.cs
Outdated
Show resolved
Hide resolved
@@ -455,7 +455,7 @@ public ReadOnlySequence<T> Slice(SequencePosition start, SequencePosition end) | |||
public ReadOnlySequence<T> Slice(SequencePosition start) | |||
{ | |||
BoundsCheck(start); | |||
return SliceImpl(start); | |||
return SliceImpl(start.GetObject() == null ? Start : start); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we are already checking whether start.GetObject() is null in BoundsCheck
:
long runningIndex = ((ReadOnlySequenceSegment<T>?)position.GetObject())?.RunningIndex ?? 0;
But there isn't a great way to avoid that (and only occurs in one of the branches).
Maybe, store the result of start.Object == null
in a bool and pass that in to BoundsCheck
. Then runningIndex
can be 0 unless that bool is true (in which case we can get rid of the null checks there). Something to consider, if it can reduce the instruction size of BoundsCheck
and Slice
.
long runningIndex = 0;
if (positionIsNotNull)
{
Debug.Assert(position.GetObject() != null);
runningIndex = ((ReadOnlySequenceSegment<Byte>)position.GetObject()).RunningIndex;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Store start.GetObject() != null
in a local and re-use in the SliceImpl
call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. In the works :)
@@ -122,6 +122,39 @@ public void Default_SliceNegativeLength() | |||
Assert.Throws<ArgumentOutOfRangeException>("length", () => buffer.Slice(buffer.Start, -1L)); | |||
} | |||
|
|||
[Fact] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share the code coverage numbers. Let's make sure all the new branches are covered (and if not, add more tests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, uploading the zip containing the code coverage works.
Looks like we have other branches too. Like in Slice(long start, SequencePosition end)
and also the new if (positionIsNotNull)
branch in BoundsCheck
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Welp, it looks like github doesn't like zips
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"\scratch2\scratch\prgovi\ROS_Slice_CodeCoverage.zip"
@@ -451,34 +454,25 @@ private SequenceType GetSequenceType() | |||
private static int GetIndex(int Integer) => Integer & ReadOnlySequence.IndexBitMask; | |||
|
|||
[MethodImpl(MethodImplOptions.AggressiveInlining)] | |||
private ReadOnlySequence<T> SliceImpl(in SequencePosition start, in SequencePosition end) | |||
private ReadOnlySequence<T> SliceImpl(in object? startObject, in int startIndex, in object? endObject, in int endIndex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a quick view of the disassembly (in a similar sample), this pattern seems to help reduce instruction count. So looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a quick view of the disassembly (in a similar sample), this pattern seems to help reduce instruction count.
On Linux, too?
Perf test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pgovind was going to do some perf validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I thought this would've performed better too, but it does not and I don't have an explanation for why yet. It seems that calling SliceImpl(new SequencePosition(someObject, someIndex), someLength)
is somehow faster than calling SliceImpl(someObject, someIndex, endObject, endLength)
🤷♂️.
Edit: I modified the patch to include just the bug fix and no refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Linux, too?
@stephentoub, why do you think the behavior would be different on Linux?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you think the behavior would be different on Linux?
Different calling convention?
summary: better: 1, geomean: 1.203 worse: 1, geomean: 1.121 total diff: 2 | Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality| | ----------------------------------------------------------------------------- | ---------:| ----------------:| ----------------:| --------:| | MicroBenchmarks.corefx.System.Memory.ReadOnlySequenceBenchmarks.StartPosition | 1.12 | 9.69 | 10.87 | | | Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality| | -------------------------------------------------------------------------------- | ---------:| ----------------:| ----------------:| --------:| | MicroBenchmarks.corefx.System.Memory.ReadOnlySequenceBenchmarks.StartPosition_An | 1.20 | 5.74 | 4.77 | |
Updated with the latest numbers: I setup some benchmarks here: The results are as follows:
|
src/System.Memory/src/System/Buffers/ReadOnlySequence.Helpers.cs
Outdated
Show resolved
Hide resolved
@@ -329,7 +329,7 @@ public ReadOnlySequence<T> Slice(long start, SequencePosition end) | |||
FoundInFirstSegment: | |||
// startIndex + start <= int.MaxValue | |||
Debug.Assert(start <= int.MaxValue - startIndex); | |||
return SliceImpl(new SequencePosition(startObject, (int)startIndex + (int)start), end); | |||
return SliceImpl(new SequencePosition(startObject, (int)startIndex + (int)start), new SequencePosition(sliceEndObject, (int)sliceEndIndex)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we potentially mismatching the object and the index here? sliceEndObject
could be pointing to _startObject
, but the sliceEndIndex
is still the index from end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. When sliceEndObject
is null(happens only when default is passed in), sliceEndIndex
will be 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, this issue needs to be resolved.
src/System.Memory/src/System/Buffers/ReadOnlySequence.Helpers.cs
Outdated
Show resolved
Hide resolved
Assuming the CI finishes before snapping, can I merge this patch? Are we ok with the perf numbers? |
Hello @ahsonkhan! Because this pull request has the p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (
|
Assert.Equal(0, slicedSequence.Length); | ||
|
||
// Slice(x, default) returns empty if x = 0. Otherwise throws | ||
sequence = sequence.Slice(2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should cover both the Slice(int, SequencePosition) and the Slice(SequencePosition, int) cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How so? I would feel more confident if we had separate test cases for both overloads, explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this offline, and it makes sense to me now.
src/System.Memory/src/System/Buffers/ReadOnlySequence.Helpers.cs
Outdated
Show resolved
Hide resolved
We'll look out for regressions. |
* Improve default handling in ReadOnlySequence.Slice Commit migrated from dotnet/corefx@5f6cba8
Fixes https://github.com/dotnet/corefx/issues/35254
The null dereference was because we now allow nullable objects in SequencePosition. Once I fixed that, there's still the issue of what
default(SequencePosition)._object
should be. If the expectation is that the default start position is 0, it makes sense thatdefault(SequencePosition)._object = Start
?Also, this is my first patch in corefx, so I'd appreciate comments on the new test and its location. Also, this seems like a small change, but how do folks generally measure the performance impacts of new changes?