Memory and ReadOnlyMemory validation errors not matching #23670

Drawaes · 2017-09-28T03:41:42Z

Updated as per the suggestion of a create method
Updated with design from @stephentoub which allows all cases to be covered
Updated with design change from @benaadams to allow type inference for T
Put in namespace and removed the Create to leave only the Dangerous Create
Added question around moving current Dangerous Create Method

Rationale

A major use case of [ReadOnly]Span/Memory is to replace handing around array buffers and their offsets and count.

One of the major benefits of the design, as I see it, is that it moves bounds checks out to where the buffer is created, which is excellent. However, when upgrading legacy code there seems to be a blocker, in that Stream defines the triplet to be

public int SomeMethod(byte[] buffer, int offset, int count);

This is normally then teamed up with checks on

"is buffer null" == null argument exception
"is offset negative or past the end of the buffer?" == argument out of range exception with offset as field referenced
"is count > buffer.length, or < 0, or count + offset > buffer.length" == argument out of range exception

The issue with the current design is that for anything that takes that triplet, you have to validate the inputs manually before creating a Memory, or risk throwing exceptions whose parameter names don't match.

This causes double validation to take place: once in the legacy code and once in the Memory creation. As Memory is often used on the "hot paths" of code, this is a penalty for using Memory.
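To make the double validation concrete, here is a minimal sketch of what a legacy-style method has to do today. `LegacyReader` and `ReadCore` are hypothetical names used only for illustration, and the exception types are simplified to the ones listed above (real stream implementations differ in places). The `Memory<byte>(T[], int, int)` constructor is the existing one, and it checks the same arguments a second time.

```csharp
using System;

public static class LegacyReader
{
    // Hypothetical legacy entry point using the Stream-style triplet.
    public static int Read(byte[] buffer, int offset, int count)
    {
        // First validation: kept so the reported parameter names stay
        // "buffer", "offset" and "count".
        if (buffer == null)
            throw new ArgumentNullException(nameof(buffer));
        if (offset < 0 || offset > buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));
        if (count < 0 || count > buffer.Length - offset)
            throw new ArgumentOutOfRangeException(nameof(count));

        // Second validation: the constructor checks null and bounds again.
        var memory = new Memory<byte>(buffer, offset, count);
        return ReadCore(memory);
    }

    // Hypothetical memory-based implementation; it only reports the length here.
    private static int ReadCore(Memory<byte> destination) => destination.Length;
}
```

Dropping the first set of checks would leave the constructor to throw, but with `array`, `start` and `length` semantics rather than the names the legacy API documents.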

Proposal

Add "unsafe" methods to create memory and readonly memory that avoid the extra checks. Then the checks can be maintained in the code with the same error messages and the double check penalty doesn't have to be paid.

namespace System.Runtime.InteropServices
{
    public static class Span
    {
        public static Span<T> DangerousCreate<T>(T[] array, int start, int length);
    ...
    }
    // ... same for ReadOnlySpan<T>

    public static class Memory
    {
        public static Memory<T> DangerousCreate<T>(T[] array, int start, int length);
    }

    // ... same for ReadOnlyMemory<T>
}

Usage

byte[] buffer;

var span = Span.DangerousCreate(buffer, offset, count);
// vs
var span = Span<byte>.DangerousCreate(buffer, offset, count);

Outstanding Questions

Should the existing method

[MethodImpl(MethodImplOptions.AggressiveInlining)]
[EditorBrowsable(EditorBrowsableState.Never)]
public static Span<T> DangerousCreate(object obj, ref T objectData, int length)

be moved to the new types, and should the IDE hiding be removed?

References

dotnet/corefx#24295 The code this came up in (it means an extra validation path without it)

Below no longer matters as the legacy code can maintain its own named validation.



|Class|ArrayName|Start|Length|
|---|---|---|---|
|FileStream|buffer|offset|count|
|NetworkStream|buffer|offset|count|
|BufferedStream|buffer|offset|count|
|MemoryStream|buffer|offset|count|
|StreamReader|buffer|index|count|
|StreamWriter|buffer|index|count|
|Stream|buffer|offset|count|
|SslStream|buffer|offset|count|


benaadams · 2017-09-28T16:38:27Z

Currently it's

 ReadOnlyMemory(T[] array, int start, int length)

Change to

 ReadOnlyMemory(T[] buffer, int offset, int count)

Seems good

cc: @ahsonkhan, @KrzysztofCwalina

ahsonkhan · 2017-09-28T17:42:56Z

Would the same apply to Span/ReadOnlySpan?
https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/System/Span.cs#L71-L78

This should go through API review to make sure we want this change and apply it in other places too (to remain consistent).

benaadams · 2017-09-28T17:47:48Z

Would the same apply to Span/ReadOnlySpan?

/cc @stephentoub

Drawaes · 2017-09-28T17:53:13Z

Yeah it looks from a quick tour around that it is the same for the non async methods that take span.

@ahsonkhan If you want I can update the top comment to be in proper API review form?

ahsonkhan · 2017-09-28T17:55:22Z

If you want I can update the top comment to be in proper API review form?

Yes please. Tyvm.
We have other APIs in Span that use start and length too, like Slice:

public Span<T> Slice(int start, int length)

Drawaes · 2017-09-28T18:09:57Z

Updated

KrzysztofCwalina · 2017-09-28T18:31:29Z

We cannot rename Span.Length to Span.Count. Span is like an array and arrays have Length property.

I also don't want the ctor parameter to be called "count" while the property it initializes is called "Length".

To solve the problem outlined in this issue, could we add a helper method that would create Span out of T[], offset, count while validating the three parameters and throwing the right exception?

Drawaes · 2017-09-28T18:42:43Z

That seems like a reasonable idea to me.

stephentoub · 2017-09-28T18:45:03Z

Is the suggestion to have two different APIs for constructing a {ReadOnly}Span/Memory<T> from a T[]/int/int triplet, just with different parameter names and thus different exceptions that get thrown?

benaadams · 2017-09-28T18:47:20Z

To solve the problem outlined in this issue, could we add a helper method that would create Span out of T[], offset, count while validating the three parameters and throwing the right exception?

Precedent is the (Value)Tuple Create methods, so you don't always have to specify type

var t = ValueTuple.Create(array);

rather than

var t = new ValueTuple<int[]>(array);

So

public static class Span
{
    public static Span<T> Create<T>(T[] buffer, int offset, int count) 
    {
        // Validate
        return Span<T>.DangerousCreate(buffer, offset, count);
    }
}

For

var span = Span.Create(array, offset, count);

Rather than

var span = new Span<int>(array, offset, count);

Obv Tuple is more annoying as it has many type params rather than just one

Drawaes · 2017-09-28T19:11:58Z

Removed my comment as it was out of order due to my phone ;) Added @benaadams / @KrzysztofCwalina 's idea as the main API request above.

benaadams · 2017-09-28T19:15:19Z

Would need to be in a new non-generic static class; else you'd have to specify the type anyway and it would lose some advantages; so would be a bit of a weird addition other than the change in names of params

e.g.

Span<int>.Create(array, offset, count);

rather than

Span.Create(array, offset, count);

Drawaes · 2017-09-28T19:53:23Z

Would it work to add an unsafe, unchecked internal method/constructor in coreclr, then add the same to the type-forwarded types in corefx? Then a helper method could be added to corefx.

This would only need to be internal, as most streams are in corefx and I imagine there are very few external custom ones. That way there are no external API additions or changes.

benaadams · 2017-09-28T20:04:33Z

DangerousCreate is there for Span anyway, though it's checking length (but not other params) https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/Span.cs#L108-L113

Drawaes · 2017-09-28T20:10:52Z

So would just need something like that for memory. Would pay for a double length check but it would be less than a full check on all the params.

stephentoub · 2017-10-09T14:26:41Z

could we add a helper method that would create Span out of T[], offset, count while validating the three parameters and throwing the right exception?

How about just adding a Dangerous/Unsafe factory for {ReadOnly}Memory<T> that doesn't do any argument validation at all, leaving it up to the call site? Then I can do my own argument validation if needed, and places where I know the arguments are valid I can skip the checks entirely. There are a bunch of places I'm seeing where, as I'm switching over from using array-based to memory-based calls, I'm having to construct {ReadOnly}Memory<T> instances from T[]/offset/count triplets that I know are all valid.

Drawaes · 2017-10-09T14:33:47Z

Seems like that is going to be the most flexible option.

Drawaes · 2017-10-09T20:30:12Z

@stephentoub do you have a shape in mind for this factory (don't call it factory please :P ) I will happily update the top comment with the idea.

stephentoub · 2017-10-09T21:43:44Z

do you have a shape in mind for this factory

public ref struct Span<T>
{
    public static Span<T> DangerousCreate(T[] array, int start, int length);
    ...
}
// ... same for ReadOnlySpan<T>

public struct Memory<T>
{
    public static Memory<T> DangerousCreate(T[] array, int start, int length);
    ...
}
// ... same for ReadOnlyMemory<T>

Drawaes · 2017-10-09T21:53:54Z

Updated top issue.

benaadams · 2017-10-09T23:13:54Z

Push it out to a static class and can get type inference and throw in the Create with renamed parameter validation?

public static class Span
{
    public static Span<T> Create<T>(T[] buffer, int offset, int count)
    {
         // validate params
        return DangerousCreate(buffer, offset, count);
    }

    public static Span<T> DangerousCreate<T>(T[] buffer, int offset, int count);
    ...
}
// ... same for ReadOnlySpan<T>

public static class Memory
{
    public static Memory<T> Create<T>(T[] buffer, int offset, int count)
    {
         // validate params
        return DangerousCreate(buffer, offset, count);
    }

    public static Memory<T> DangerousCreate<T>(T[] buffer, int offset, int count);
    ...
}
// ... same for ReadOnlyMemory<T>

byte[] array;

var span = Span.DangerousCreate(array, start, length);
// vs
var span = Span<byte>.DangerousCreate(array, start, length);

// drop in validation Exception matching
var memory = Memory.Create(array, start, length);

Also then the Throws can be in non-generic class

Drawaes · 2017-10-09T23:20:10Z

I prefer that, but @stephentoub you're the one that gets to be in the API reviews, what's your thinking? (Apart from the fact we could never use the var in corefx ;) )

stephentoub · 2017-10-10T02:56:45Z

Seems reasonable. @KrzysztofCwalina, @ahsonkhan, was there strong reasoning for the ctor-based span/memory creation vs factory-based span/memory creation?

Drawaes · 2017-10-10T16:01:45Z

@benaadams @stephentoub updated top issue to match Ben's design

KrzysztofCwalina · 2017-10-10T16:07:02Z

I am not sure it's worth adding two types to System root namespace just so we can have the factory methods.
Also, I don't think we should have Dangerous APIs on such types. We decided to move all dangerous APIs to Marshal-like type in System.Runtime.InteropServices.

benaadams · 2017-10-10T16:11:09Z

Move the entire classes into System.Runtime.InteropServices; the argument rename is for interop also :)

namespace System.Runtime.InteropServices
{
    public static class Span
    {
        public static Span<T> Create<T>(T[] buffer, int offset, int count);
        public static Span<T> DangerousCreate<T>(T[] buffer, int offset, int count);

Drawaes · 2017-10-10T16:31:04Z

I am easy on location and on naming; the Type Inference is more "friendly". If you want it moved, @KrzysztofCwalina, what would you call the class if it went to CompilerServices?

public static class Unsafe
{
    public static Span<T> DangerousCreateSpan<T>(T[] buffer, int offset, int count);
    public static ReadOnlySpan<T> DangerousCreateReadonlySpan<T>(T[] buffer, int offset, int count);
    public static Memory<T> DangerousCreateMemory<T>(T[] buffer, int offset, int count);
    public static ReadOnlyMemory<T> DangerousCreateReadonlyMemory<T>(T[] buffer, int offset, int count);
}

Means no new type, and they all show up in the IDE side by side. You also have made it super clear what the story is here?

ahsonkhan · 2017-10-28T01:35:49Z

Move the entire classes into System.Runtime.InteropServices; the argument rename is for interop also :)

@Drawaes, can you please update the original post with the namespace specified?

the Type Inference is more "friendly"

Don't we still get the type inference regardless of whether it lives in System or System.Runtime.InteropServices?

Do we need the Create methods? For the original concern about error matching, isn't DangerousCreate sufficient?

Drawaes · 2017-10-28T02:15:25Z

Does that look better?

We have the namespace moved to System.Runtime.InteropServices
We get type Inference
Dropped the Create methods; not sure why they were still there, as you are correct that DangerousCreate seems to cover all of the issues I had.

ahsonkhan · 2017-10-28T02:18:32Z

Do the argument names in DangerousCreate have to be different than what we use in the Span constructor or can we continue to use index/length?

Drawaes · 2017-10-28T02:22:53Z

You are correct, sir: they should match the Span names, as it doesn't matter what they are called for my purpose now, and ... consistency.

I fixed it ( I left the example code as Array, offset to make the point about what they are likely to be ;) )

ahsonkhan · 2017-10-28T02:28:03Z

Sorry for the mistake. I meant start/length.

index -> start?
buffer -> array?

https://github.com/dotnet/corefx/blob/master/src/System.Memory/ref/System.Memory.cs#L46

public Span(T[] array, int start, int length)

Edit: Also, should we also add an overload that only takes an array (which skips the null and co-variance checks)?

Drawaes · 2017-10-28T02:31:01Z

No problem, I was being lazy and didn't check as well 🤣 (fixed)

benaadams · 2017-10-28T08:33:01Z

Question would be, should the existing Span overload

[MethodImpl(MethodImplOptions.AggressiveInlining)]
[EditorBrowsable(EditorBrowsableState.Never)]
public static Span<T> DangerousCreate(object obj, ref T objectData, int length)

Also move to System.Runtime.InteropServices? T is specified by ref T objectData

Drawaes · 2017-10-28T09:35:32Z

Seems reasonable, the only query would be around shipping status and back compat.

benaadams · 2017-10-28T09:37:18Z

Seems reasonable, the only query would be around shipping status and back compat.

Question for the api reviewers :)

Drawaes · 2017-10-28T09:52:12Z

Above our pay grades... :lol:

terrajobst · 2017-10-31T18:44:41Z

Video

We went back and forth, but it seems we're in agreement that these can be useful. We shouldn't expose static types Span and Memory in InteropServices. Instead, we should add them to the new MemoryMarshal type.
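For readers following up later: one factory of this kind that does exist on `MemoryMarshal` in .NET Core 2.1 and later is `MemoryMarshal.CreateSpan<T>(ref T, int)`, which takes a reference and a length and does not validate the length. The sketch below is only an illustration of how a caller that has already validated a triplet could use it; `UncheckedSpan` is a hypothetical helper name, not an API from this proposal.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only.
public static class UncheckedSpan
{
    // The caller is assumed to have validated the triplet already:
    // buffer != null, 0 <= offset < buffer.Length, 0 <= count <= buffer.Length - offset.
    public static Span<byte> Create(byte[] buffer, int offset, int count)
    {
        // Taking "ref buffer[offset]" still bounds-checks offset;
        // CreateSpan itself performs no validation of count.
        return MemoryMarshal.CreateSpan(ref buffer[offset], count);
    }
}
```

Note that this sketch cannot express an empty span positioned at the very end of the array, because indexing `buffer[buffer.Length]` throws. It also covers spans only; it says nothing about an unchecked `Memory<T>` factory.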

Drawaes · 2017-11-02T22:52:07Z

Started a PR in coreclr dotnet/coreclr#14833

jkotas · 2017-11-02T23:17:17Z

I do not think these APIs are a good addition.

Memory is a "slow" type. We are losing security and gaining nothing by having a constructor that omits bounds checks.
I am ok with having the low level constructor for Span, but it should be just DangerousCreate moved from the Span type to MemoryMarshal. I do not think we should have convenience constructors that take arrays to make skipping the bounds check easy.

Drawaes · 2017-11-02T23:24:28Z

The reason mainly is

Most of the stream types run the validation checks already. I couldn't remove these checks when updating to take memory from an array (or span for that matter) because it would change the exception message for the field that fails, making it a breaking change.

That means there are plenty of times that instead of the 5 checks (null, index < length & index > 0 using overflow for 1 check, index + count < length, length > 0) you end up with 10 checks every time you call.

This might seem like a "slow" operation, but often these async methods can return synchronously, and with ValueTask there are no allocations. So say SslStream reads 16k frames and the client then "sips" 2k from it: the next call for 2k will be a very quick synchronous return, and the 5 extra checks start to show.

ahsonkhan · 2017-11-02T23:29:37Z

I do not think these APIs are a good addition.

The reason mainly is

Would having these methods as internal only solve both concerns?

jkotas · 2017-11-02T23:42:31Z

the 5 extra checks start to show

Memory(T[], int, int) has these preconditions:

            if (array == null)
                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
            // This one should be completely optimized out for `Memory<byte>`.
            if (default(T) == null && array.GetType() != typeof(T[]))
                ThrowHelper.ThrowArrayTypeMismatchException();
            if ((uint)start > (uint)array.Length || (uint)length > (uint)(array.Length - start))
                ThrowHelper.ThrowArgumentOutOfRangeException();

So I see 3 extra checks - they will be like 7 well predicted instructions (and JIT may optimize some of it out). They should cost nothing compared to other overhead associated with Memory<T>.
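The combined bounds check in the constructor relies on unsigned comparison, which is why it counts as a single range test rather than four separate comparisons. A small sketch of the same expression follows; `RangeCheck.IsValid` is a hypothetical name introduced here for illustration.

```csharp
using System;

// Hypothetical helper mirroring the bounds test quoted above.
public static class RangeCheck
{
    public static bool IsValid(int arrayLength, int start, int length)
    {
        // Casting a negative int to uint yields a value above int.MaxValue,
        // so one unsigned compare rejects both "negative" and "too large".
        return (uint)start <= (uint)arrayLength
            && (uint)length <= (uint)(arrayLength - start);
    }
}
```

For example, `RangeCheck.IsValid(10, -1, 1)` is false because `(uint)(-1)` is larger than 10, and `RangeCheck.IsValid(10, 5, 6)` is false because only 5 elements remain after `start`.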

Could you please share an example of real-world code that shows the problem?

omariom · 2017-11-03T00:37:46Z

This is how it looks on .NET Framework x86

jkotas · 2017-11-03T00:52:32Z

I meant to e.g. have some measurement that shows how much faster a real-world async Stream Read method will get if it has this special constructor. (Also, this is not what the code will look like, because the actual code is written using the non-inlineable ThrowHelper.)

For better or worse, the framework APIs are designed with explicit argument checks. Yes, these argument checks add extra code everywhere. But the execution cost of these extra argument checks is relatively small. We have done experiments in the past with "fast and crash" modes where we removed things like bounds checking everywhere - the performance benefits of such modes were surprisingly small because these checks tend to execute pretty fast. I am wondering whether we know what we are really getting out of these duplicated APIs.

I guess I should wait for the link to the API discussion to be posted to see what arguments were used to justify these APIs.

ahsonkhan · 2017-11-03T00:57:48Z

I guess I should wait for the link to the API discussion to be posted to see what arguments were used to justify these APIs.

https://youtu.be/bHwCPVNQLwo?t=1h16m53s

omariom · 2017-11-03T01:10:06Z

@jkotas
I meant to illustrate that the comparison, as you said, is a few instructions with a couple of well predicted branches.

    mov esi, [ebp+0xc]
    test edx, edx
    jz L001e
    mov eax, [edx+0x4]
    cmp eax, esi
    jb L0049
    sub eax, esi
    cmp eax, [ebp+0x8]
    jb L0049

jkotas · 2017-11-03T13:32:05Z

I have listened to the discussion.

My take is that if we are worried about redundant checks, we should teach the JIT to eliminate them. The JIT does eliminate some of the duplicate checks already, e.g. null checks. For example, the code for this method:

static int FetchElementUsingSpan(int[] x)
{
    if (x == null)
        throw new ArgumentNullException();
    ReadOnlySpan<int> s = new ReadOnlySpan<int>(x);
    return s[0];
}

Is this:

test!My.FetchElementUsingSpan(Int32[]):
00007ff9`ea1108d0 56              push    rsi
00007ff9`ea1108d1 4883ec20        sub     rsp,20h
00007ff9`ea1108d5 4885c9          test    rcx,rcx
00007ff9`ea1108d8 7414            je      test!My.FetchElementUsingSpan(Int32[])+0x1e (00007ff9`ea1108ee)
00007ff9`ea1108da 488d4110        lea     rax,[rcx+10h]
00007ff9`ea1108de 8b4908          mov     ecx,dword ptr [rcx+8]
00007ff9`ea1108e1 83f900          cmp     ecx,0
00007ff9`ea1108e4 762b            jbe     test!My.FetchElementUsingSpan(Int32[])+0x41 (00007ff9`ea110911)
00007ff9`ea1108e6 8b00            mov     eax,dword ptr [rax]
00007ff9`ea1108e8 4883c420        add     rsp,20h
00007ff9`ea1108ec 5e              pop     rsi
00007ff9`ea1108ed c3              ret
...

Notice that there are two null checks: one in my method and the other in the ReadOnlySpan implementation itself. However, the JIT figured out that the second check is redundant and optimized it out. The JIT is not as good at eliminating redundant bounds checks today. That is something to fix in the JIT, not something to optimize API design around.

I do not think we should be adding duplicate APIs that differ just in having vs. not having argument validation. Having two ways to do a thing and making people think about it is negative value for the framework. As always, I can be convinced by data that this makes sense.

Drawaes · 2017-11-03T13:42:01Z

If the null checks are already removed and there is scope to look at removing the others via the JIT, then I would be fine with dropping the API request. It's been through the review process though, so others would need to make the final choice.

How about this: I will do some measurements in an end-to-end scenario. If the differences are within the margin of error, we could remove it. If they aren't, then there is something more to discuss.

jkotas · 2017-11-03T14:00:33Z

@Drawaes Sounds good. Thanks!

karelz · 2017-11-21T17:54:09Z

FYI: The API review discussion was recorded - see https://youtu.be/bHwCPVNQLwo?t=4584 (23 min duration)

tarekgh · 2018-01-10T21:05:39Z

I am closing this review per the latest discussion on this issue. Here are some comments:

We already have Span.DangerousCreate, and we have another overlapping issue, dotnet/corefx#26139, which proposes moving Span.DangerousCreate to MemoryMarshal.DangerousCreate.
@Drawaes can still provide his E2E scenario measurements, and we can consider whether it is worth doing something more (e.g. adding a MemoryMarshal.DangerousCreate that supports creating Memory objects too), or suggest any change in dotnet/corefx#26139. @ahsonkhan is already aware of that and can update dotnet/corefx#26139 as needed. @Drawaes, feel free to post your measurement numbers here or in dotnet/corefx#26139.

KrzysztofCwalina assigned ahsonkhan Oct 9, 2017

ahsonkhan removed their assignment Nov 2, 2017

joshfree assigned tarekgh Jan 9, 2018

tarekgh closed this as completed Jan 10, 2018

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the 2.1.0 milestone Jan 31, 2020

dotnet locked as resolved and limited conversation to collaborators Dec 20, 2020
