New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safety rules for ref-like types like `Span<T>` #264

Closed
wants to merge 5 commits into
base: master
from

Conversation

@VSadov
Member

VSadov commented Mar 13, 2017

Please do not merge yet.

Show outdated Hide outdated proposals/span-safety.md
Show outdated Hide outdated proposals/span-safety.md

@jkotas jkotas referenced this pull request Mar 16, 2017

Merged

Update SpanBench test #10216

## `ref-like` types must be stack-only. ##
C# compiler already has a concept of a restricted types that covers special platform types such as `TypedReference`. In some cases those types are referred as ref-like types as well. We should probably use some different term to refer to `TypedReference` and similar types - like `restricted types` to avoid confusion.

This comment has been minimized.

@ahsonkhan

ahsonkhan Mar 16, 2017

Member

a restricted types

Should be a restricted type

@ahsonkhan

ahsonkhan Mar 16, 2017

Member

a restricted types

Should be a restricted type

```
Note: returning by a ordinary writeable reference appears to be ok by itself, but with values on heap being impossible, values in local frame being not returnable and variables passed from the caller being readonly, it may not matter.

This comment has been minimized.

@ahsonkhan

ahsonkhan Mar 16, 2017

Member

a ordinary

an ordinary

@ahsonkhan

ahsonkhan Mar 16, 2017

Member

a ordinary

an ordinary

2. Some implementations of `Span<T>` literally contain a managed pointer in one of its fields. Managed pointers are not supported as fields of heap objects and code that manages to put a managed pointer on the GC heap typically crashes at JIT time.
3. It is permitted for a `Span<T>` to refer to data in the local stack frame - individual local variables or `stackalloc`-ed arrays. A scenario when an instance of a `Span<T>` outlives the referred data would lead to undefined behavior, including type-safety violations and heap corruptions.
All the above problems would be alleviated if instances of `Span<T>` would be allowed as stack data types only. In fact for the `#3` we need a stronger guarantee. Details on that to follow below.

This comment has been minimized.

@ahsonkhan

ahsonkhan Mar 16, 2017

Member

Details on that to follow below.

Can you add a link/reference to the "details"?

@ahsonkhan

ahsonkhan Mar 16, 2017

Member

Details on that to follow below.

Can you add a link/reference to the "details"?

@jkotas

This comment has been minimized.

Show comment
Hide comment
@jkotas

jkotas Mar 17, 2017

Member

BTW: Any thoughts on how the ref-like types are going to be identified?

Member

jkotas commented Mar 17, 2017

BTW: Any thoughts on how the ref-like types are going to be identified?

## `ref-like` types must be stack-only. ##
C# compiler already has a concept of a restricted types that covers special platform types such as `TypedReference`. In some cases those types are referred as ref-like types as well. We should probably use some different term to refer to `TypedReference` and similar types - like `restricted types` to avoid confusion.

This comment has been minimized.

@jkotas

jkotas Mar 17, 2017

Member

There is also byref-like type defined in ECMA spec. I think that one is fine - we are just making it to cover more types and relaxing the rules.

@jkotas

jkotas Mar 17, 2017

Member

There is also byref-like type defined in ECMA spec. I think that one is fine - we are just making it to cover more types and relaxing the rules.

This comment has been minimized.

@VSadov

VSadov Mar 29, 2017

Member

@jkotas - any suggestions?
Ideally it should be something that makes Span unconsumable by old code - like a modreq, but it is not possible to put modreq on a type.

@VSadov

VSadov Mar 29, 2017

Member

@jkotas - any suggestions?
Ideally it should be something that makes Span unconsumable by old code - like a modreq, but it is not possible to put modreq on a type.

This comment has been minimized.

@VSadov

VSadov Mar 29, 2017

Member

We can use an ordinary attribute, but it is easily ignorable.

@VSadov

VSadov Mar 29, 2017

Member

We can use an ordinary attribute, but it is easily ignorable.

This comment has been minimized.

@jkotas

jkotas Mar 29, 2017

Member

IIRC, the solution we have discussed was to mark Span with ObsoleteAttribute with special message pattern, and make newer compilers to not emit warnings for ObsoleteAttribute with this special pattern. Check notes https://github.com/dotnet/csharplang/blob/master/meetings/2016/LDM-2016-11-01.md

@jkotas

jkotas Mar 29, 2017

Member

IIRC, the solution we have discussed was to mark Span with ObsoleteAttribute with special message pattern, and make newer compilers to not emit warnings for ObsoleteAttribute with this special pattern. Check notes https://github.com/dotnet/csharplang/blob/master/meetings/2016/LDM-2016-11-01.md

- ref-like type cannot be a type of an array element
- ref-like type cannot be used as a generic type argument
- ref-like variable cannot be boxed
- ref-like type cannot be a field of ordinary not ref-like type

This comment has been minimized.

@MichalStrehovsky

MichalStrehovsky Mar 20, 2017

Member

Should "ref-like type cannot be a static field of any type" be added to this list?

@MichalStrehovsky

MichalStrehovsky Mar 20, 2017

Member

Should "ref-like type cannot be a static field of any type" be added to this list?

This comment has been minimized.

@VSadov

VSadov Mar 29, 2017

Member

Yes, ref-like type cannot be a type of a static field. It can only be an instance field of another ref-like type.

@VSadov

VSadov Mar 29, 2017

Member

Yes, ref-like type cannot be a type of a static field. It can only be an instance field of another ref-like type.

## ref-like variables can be passed as `in` parameters ##
- it is ok to pass a ref-like variable by reference to a call as long as the reference is **_readonly_** .
The reason is that exposing ref-like variables via writeable references make it nearly impossible to reason about their safety.

This comment has been minimized.

@MichalStrehovsky

MichalStrehovsky Mar 20, 2017

Member

Do we need any special rules for unmanaged pointers? When I type unsafe, I'll still be able to e.g. stackalloc an array of spans (useful), or shoot myself in the foot (not so useful), like I can with TypedReference, right?

using System;

unsafe class Program
{
    static void Leak(TypedReference* tr)
    {
        int i = 456;

        // Whoops
        *tr = __makeref(i);
    }

    static void Main()
    {
        TypedReference tr;
        Leak(&tr);
        Console.WriteLine(__refvalue(tr, int));
    }
}

(I look at this as the intended behavior of unsafe, but a blurb in the sense of "we thought about it and it's by design" might be useful.)

@MichalStrehovsky

MichalStrehovsky Mar 20, 2017

Member

Do we need any special rules for unmanaged pointers? When I type unsafe, I'll still be able to e.g. stackalloc an array of spans (useful), or shoot myself in the foot (not so useful), like I can with TypedReference, right?

using System;

unsafe class Program
{
    static void Leak(TypedReference* tr)
    {
        int i = 456;

        // Whoops
        *tr = __makeref(i);
    }

    static void Main()
    {
        TypedReference tr;
        Leak(&tr);
        Console.WriteLine(__refvalue(tr, int));
    }
}

(I look at this as the intended behavior of unsafe, but a blurb in the sense of "we thought about it and it's by design" might be useful.)

This comment has been minimized.

@VSadov

VSadov Mar 29, 2017

Member

Unsafe code is a tough subject. Some unsafe code has well-defined behavior even if not safe. Your example may actually run just fine depending on implementation.
C# as a language generally does not have a lot of guarantees about unsafe code.

I'd say the sample is legal, but its behavior is implementation specific and thus undefined.

@VSadov

VSadov Mar 29, 2017

Member

Unsafe code is a tough subject. Some unsafe code has well-defined behavior even if not safe. Your example may actually run just fine depending on implementation.
C# as a language generally does not have a lot of guarantees about unsafe code.

I'd say the sample is legal, but its behavior is implementation specific and thus undefined.

@KrzysztofCwalina

what does "receiver" in "ref-like fields are safe to return as long as the receiver is safe to return" mean?

The most permissive approach would be alias/value tracking via fixed point data-flow analysis. It is very expensive and complex approach to be practical, but will be called out here for completeness.
Simpler solutions are:
- make ref-like variables "single-assignment" similarly to the solution we have for ordinary ref variables.

This comment has been minimized.

@KrzysztofCwalina

KrzysztofCwalina Mar 22, 2017

Member

I don't think we can make them single assignment. It's super common to do span = span.Slice(count);

@KrzysztofCwalina

KrzysztofCwalina Mar 22, 2017

Member

I don't think we can make them single assignment. It's super common to do span = span.Slice(count);

This comment has been minimized.

@VSadov

VSadov Mar 29, 2017

Member

@KrzysztofCwalina - would it be ok to allow this only for spans that are free from referring to local data?

With local data you would have to do 'var anotherSpan = span.Slice(count);`

The reason is that inside a method could be many nested scopes and it would be very hard to enforce that variables from outer scopes do not end up referring indirectly to variables from inner scopes if they can be reassigned.
On the other hand, everything outside of the method lives as long as you have a reference to it, so it is safe to assign to anything.

@VSadov

VSadov Mar 29, 2017

Member

@KrzysztofCwalina - would it be ok to allow this only for spans that are free from referring to local data?

With local data you would have to do 'var anotherSpan = span.Slice(count);`

The reason is that inside a method could be many nested scopes and it would be very hard to enforce that variables from outer scopes do not end up referring indirectly to variables from inner scopes if they can be reassigned.
On the other hand, everything outside of the method lives as long as you have a reference to it, so it is safe to assign to anything.

## `ref-like` types must be readonly structs. ##
- ref-like types must be readonly.

This comment has been minimized.

@ektrah

ektrah Mar 22, 2017

Is it safe to have a ref-like type that is not readonly if all ref-like fields are readonly?

Use case: I'm currently passing around mutable structs by reference for incrementally reading from a ReadOnlySpan (struct, usage) / incrementally writing to a Span (struct, usage).

@ektrah

ektrah Mar 22, 2017

Is it safe to have a ref-like type that is not readonly if all ref-like fields are readonly?

Use case: I'm currently passing around mutable structs by reference for incrementally reading from a ReadOnlySpan (struct, usage) / incrementally writing to a Span (struct, usage).

This comment has been minimized.

@VSadov

VSadov Mar 22, 2017

Member

@ektrah - I need to think about it, but at first glance it seems unsafe. Any method on a non-readonly struct would get "this" by a read-write reference and as such can completely replace it with anything:

SomeMethod()
{
    // we can check if r1 is safe to return based on `arg1` and `arg2`
    Asn1Reader r1 =  GetReader(arg1, arg2);

    r1.Foo();

    // can we return or even use r1 after the call above?
    return r1;
}

struct Asn1Reader
{
      public void Foo()
      {
            // this will invalidate the assumptions that caller could make about `s1`
            this = new Asn1Reader(new ReadonlySpan<byte>(stackalloc byte[16]));
      }
}
@VSadov

VSadov Mar 22, 2017

Member

@ektrah - I need to think about it, but at first glance it seems unsafe. Any method on a non-readonly struct would get "this" by a read-write reference and as such can completely replace it with anything:

SomeMethod()
{
    // we can check if r1 is safe to return based on `arg1` and `arg2`
    Asn1Reader r1 =  GetReader(arg1, arg2);

    r1.Foo();

    // can we return or even use r1 after the call above?
    return r1;
}

struct Asn1Reader
{
      public void Foo()
      {
            // this will invalidate the assumptions that caller could make about `s1`
            this = new Asn1Reader(new ReadonlySpan<byte>(stackalloc byte[16]));
      }
}

This comment has been minimized.

@ektrah

ektrah Mar 22, 2017

@VSadov - stackalloc requires Foo to be unsafe, which probably means that Foo can do unsafe things with this and invalidate assumptions even if the type is readonly. This might be a better example:

struct MyStruct
{
    private ref int _x;

    public MyStruct(ref int x)
    {
        ref _x = ref x;
    }

    public void Bar()
    {
        int y = 123;
        this = new MyStruct(ref y); // error
    }

    public MyStruct Baz()
    {
        int y = 123;
        return new MyStruct(ref y); // error
    }

    public void Qux(out MyStruct result)
    {
        int y = 123;
        result = new MyStruct(ref y); // error
    }
}

To me it seems that assigning to a ref-like this should follow the same rules as returning a ref-like type, i.e., the assignment in Bar should be rejected for the same reason as the return statement in Baz.

@ektrah

ektrah Mar 22, 2017

@VSadov - stackalloc requires Foo to be unsafe, which probably means that Foo can do unsafe things with this and invalidate assumptions even if the type is readonly. This might be a better example:

struct MyStruct
{
    private ref int _x;

    public MyStruct(ref int x)
    {
        ref _x = ref x;
    }

    public void Bar()
    {
        int y = 123;
        this = new MyStruct(ref y); // error
    }

    public MyStruct Baz()
    {
        int y = 123;
        return new MyStruct(ref y); // error
    }

    public void Qux(out MyStruct result)
    {
        int y = 123;
        result = new MyStruct(ref y); // error
    }
}

To me it seems that assigning to a ref-like this should follow the same rules as returning a ref-like type, i.e., the assignment in Bar should be rejected for the same reason as the return statement in Baz.

This comment has been minimized.

@VSadov

VSadov Mar 29, 2017

Member

@ektrah - your examples are correct.

The idea with stacalloc is to have some syntax that would allow wrapping stack allocated data into a span (without leaking the ptr of course). Such construct could be used without "unsafe", since it would be type-safe, but we would need to make the resulting span not-returnable.

new ReadonlySpan<byte>(stackalloc byte[16]) is one possible example of such syntax. It is TBD though.

@VSadov

VSadov Mar 29, 2017

Member

@ektrah - your examples are correct.

The idea with stacalloc is to have some syntax that would allow wrapping stack allocated data into a span (without leaking the ptr of course). Such construct could be used without "unsafe", since it would be type-safe, but we would need to make the resulting span not-returnable.

new ReadonlySpan<byte>(stackalloc byte[16]) is one possible example of such syntax. It is TBD though.

This comment has been minimized.

@ektrah

ektrah Mar 29, 2017

@VSadov - A safe stackalloc would be nice. I'm already using unsafe stackalloc with Span<T> quite a few times: 1, 2, 3, 4, 5.

And it would really, really nice to be able to encapsulate a ref-like type together with mutable state, as I currently do in the structs linked above.

@ektrah

ektrah Mar 29, 2017

@VSadov - A safe stackalloc would be nice. I'm already using unsafe stackalloc with Span<T> quite a few times: 1, 2, 3, 4, 5.

And it would really, really nice to be able to encapsulate a ref-like type together with mutable state, as I currently do in the structs linked above.

- ref-like type cannot be used as a generic type argument
- ref-like variable cannot be boxed
- ref-like type cannot be a field of ordinary not ref-like type
- indirect restrictions, such as disallowed use of ref-like types in async methods, which are really a result of disallowing ref-like typed fields.

This comment has been minimized.

@svick

svick Mar 26, 2017

Contributor

What about edge cases, where ref-like types could be allowed? For example:

  1. Values in async methods that are immediately passed to another method. E.g. even inside an async method, F(GetRefLike()) or F(await task, GetRefLike()) could be allowed (though F(GetRefLike(), await task) couldn't).
  2. Locals in async methods with no awaits.
  3. Locals that are part of a closure of a "simple" local function (i.e. one whose closure is a struct, not a class).

I'm not sure what would be the value in allowing such cases, especially since the rules could become byzantine and closely tied to implementation details of the compiler. But I think it's worth considering.

@svick

svick Mar 26, 2017

Contributor

What about edge cases, where ref-like types could be allowed? For example:

  1. Values in async methods that are immediately passed to another method. E.g. even inside an async method, F(GetRefLike()) or F(await task, GetRefLike()) could be allowed (though F(GetRefLike(), await task) couldn't).
  2. Locals in async methods with no awaits.
  3. Locals that are part of a closure of a "simple" local function (i.e. one whose closure is a struct, not a class).

I'm not sure what would be the value in allowing such cases, especially since the rules could become byzantine and closely tied to implementation details of the compiler. But I think it's worth considering.

This comment has been minimized.

@VSadov

VSadov Mar 29, 2017

Member

I think we can look at preexisting behavior of types like TypedReference here. To make rules simple we may outright disallow ref-like types in async methods. Disallowing just the cases when ref-like is captured could lead to cases where something works in optimized code and does not in debug.

@VSadov

VSadov Mar 29, 2017

Member

I think we can look at preexisting behavior of types like TypedReference here. To make rules simple we may outright disallow ref-like types in async methods. Disallowing just the cases when ref-like is captured could lead to cases where something works in optimized code and does not in debug.

@VSadov

This comment has been minimized.

Show comment
Hide comment
@VSadov

VSadov Apr 21, 2017

Member

Based on feedback, the rules around passing span like types byref have been refined.
Because of that some of the feedback here has become inapplicable. The rest have been applied to the updated draft.

Since the discussion here is fairly long and concerns are captured in the updated document, I am closing this to start new discussion here:
#472

Member

VSadov commented Apr 21, 2017

Based on feedback, the rules around passing span like types byref have been refined.
Because of that some of the feedback here has become inapplicable. The rest have been applied to the updated draft.

Since the discussion here is fairly long and concerns are captured in the updated document, I am closing this to start new discussion here:
#472

@VSadov VSadov closed this Apr 21, 2017

@dgrunwald dgrunwald referenced this pull request Nov 15, 2017

Open

C# language support status #829

46 of 75 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment