Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal]: Practical existential types for interfaces #5556

Open
1 of 5 tasks
agocke opened this issue Dec 17, 2021 · 85 comments
Open
1 of 5 tasks

[Proposal]: Practical existential types for interfaces #5556

agocke opened this issue Dec 17, 2021 · 85 comments

Comments

@agocke
Copy link
Member

agocke commented Dec 17, 2021

Practical existential types for interfaces

  • Proposed
  • Prototype: Not Started
  • Implementation: Not Started
  • Specification: Not Started

Terminology note: What's called "existential types" in this proposal is more correctly called "associated types" in other languages. The previous proposal, referenced below, did not have the same restrictions and thus implemented something much closer to a pure existential type without type erasure. This proposal is more restrictive and doesn't support using interfaces with associated types in all locations. You can think of every mention of "existential type" in this proposal as meaning "associated type."

Intro

Previously I proposed adding some form of existential types to C#. In that proposal, I describe existential types at a high level and describe how some features could be added to C#. However, I didn't propose any particular implementation strategy and left many questions open, including the mechanism for enforcing type safety.

In this proposal I will describe a full implementation and type safety mechanism, as well as a metadata representation.

To start, the syntactic presentation. I propose that existential types be represented as abstract type members of interfaces. For example,

interface Iface
{
    // Existential type
    type Ext;

    public Ext P { get; }
}

The existential type is substituted when the interface is implemented, e.g.

class C : Iface
{
    type Iface.Ext = int;

    public int P => 0;
}

This syntax is similar to other languages with existential types and provides a clear separation from existing type parameters on types.

This syntax also emphasizes the differences described in the earlier proposal: unlike regular type parameters, which are provided by the creator of a type, existential types are provided by the implementer of the interface and are invisible at type creation.

Note: Some alternate syntaxes include

  1. interface IFace<protected Ext> (this was the syntax I used in my first proposal
  2. interface IFace<abstract Ext>
  3. interface IFace<implicit Ext>
  4. interface IFace|<Ext>

This does raise the question left open in the previous proposal: how to define type equality. Since each interface implementation may have a unique subsitution for the existential type, type equality depends on the exact type of the implementation. Notably, the interface itself is not an implementation, so the following would not type check:

void M(Iface left, Iface right)
{
    var x = left.P;
    x = right.P; // error, left.P and right.P may not be the same type
}

Worse, the type of x is difficult to express in the language, as-is. It is in some sense a type parameter, but there isn't a named type parameter in scope to use to refer to it. Inside the interface we call it Iface.Ext, but this is not actually a type, it is a type parameter. The actual type is whatever was substituted by the implementation. In the case of our example C above, the type is int.

However, we can improve the power of the feature using a different feature: existing C# generics. If we avoid using the type parameter as a type, and instead use it as a constraint, things get simpler:

void M<T>(T left, T right)
    where T : Iface
{
    var x = left.P; // `var` could be type `T.Ext`
    x = right.P; // type checks
}

With this usage, we can be confident that the implementations will produce "compatible" types. This leads to the following proposal: interfaces with type members should only be usable as generic constraints. With this restriction, we can treat type members as relatively standard C# types, usable in the places where type parameters would be permitted. That this is type safe may not be obvious, but the proposed reduction to existing .NET metadata will verify that the resulting code is type safe.

Motivating example

The examples above demonstrate simple usages, but don't give an example of practical advantages. One opportunity is improved optimizations. Consider LINQ. As Jared Parsons described in a blog post, two of the biggest weaknesses of IEnumerable<T> are the repeated interface dispatches, and the abstraction of the enumerator type. As he describes, we could improve the pattern using generics:

public interface IFastEnum<TElement, TEnumerator>
{
  TEnumerator Start { get; }
  bool TryGetNext(ref TEnumerator enumerator, out TElement value);
}

One big problem with this pattern is it makes the enumerator into either an additional type parameter which needs to be manually propagated, or public surface area. This is a job much better left to the compiler. This is how it could be written with existential types:

public interface IFastEnum<TElement>
{
    type TEnumerator;
    TEnumerator Start { get; }
    bool TryGetNext(ref TEnumerator enumerator, out TElement value);
}

The enumerator type is now appropriately elided for everyone except the implementor. A user might write

void M<TEnum>(TEnum e) where TEnum : IFastEnum<string>
{
    foreach (var elem in e)
    {
        Console.WriteLine(elem);
    }
}

This is more verbose than not using generics, but that is a more general concern about verbosity of generics.

And on the implementor side, it would look like this:

class List<T> : IFastEnum<TElement>
{
    type IFastEnum.TEnumerator = int;
    int Start => 0;
    public bool TryGetNext(ref int enumerator, out TElement value)
    {
        if (enumerator >= Count) {
            value = default(TElement);
            return false;
        }

        value = _array[enumerator++];
        return true;
    }
}

This is much the same code that Jared wrote, and should provide the same performance benefits.

Compilation

In the previous proposal, I described a lowering strategy based on logic theorems around existential and universal type equivalence. This technique is powerful and flexible, but much more complicated and difficult to implement. The above design has substantial limitations on how existential types can be used, therefore the implementation can be much simpler. If these limitations prove too onerous in the future, some restrictions may be loosened with a more complex compilation strategy.

The proposed compilation strategy is broadly quite simple: turn existential types into hidden generic parameters. This may seem extreme, but note that the language rule for type members requires the interface which contains them to only be used as constraints. This means that compilation may add generic parameters, but it will never make a method generic which wasn't before, and the type parameters will not spread past the introduction of the constraint.

Here's a simple example of the code before and after.

Before:

interface Iface
{
    type Ext : IDisposable;

    Ext P { get; }
}
class A : Iface
{
    type Ext = MemoryStream;

    MemoryStream P => new MemoryStream();
}
void Dispose<T>(T t)
    where T : Iface
{
    t.P.Dispose();
}
Dispose(new A());

After:

interface Iface<Ext>
    where Ext : IDisposable
{
    Ext P { get; }
}
class A : Iface<MemoryStream>
{
    MemoryStream P => new MemoryStream();
}
void Dispose<T, Ext>(T t)
    where T : Iface<Ext>
{
    t.P.Dispose();
}
Dispose<A, MemoryStream>(new A());

The important transformations are:

  1. Type members become additonal type parameters on the interface.
  2. Constraints on type members become constrains on the interface.
  3. Type member assignments in the implementation become type substitutions
    in the interface implementation.
  4. In all places where type parameters are declared with constraints to interfaces
    with type members, all type members must be added to the parameter list.
  5. At all callsites with synthesized type parameters, the correct type arguments must be
    inferred.

Most of the transformations are simple, but the synthesizing and inferring of type parameters may be worth some elaboration.

First, we need to establish what is necessary for inference. To do so, we need to determine which synthesized type parameters "belong" to which type parameter. This should be doable using synthesized attributes or modreqs to point to the "owning" parameter's index.

Once we know the owning parameters and the synthesized parameters, we can temporarily remove the synthesized parameters and perform type inference as currently specified in C#, or use the manual substitutions. Once the substitution is identified, we can identify the substitutions for the synthesized parameters. First, identify the needed interface by walking the constraint list in order. Next, determine the substitutions in the interface implementation. As currently proposed, the syntax only allows for a single implementation of a given interface for a given type. By searching types from most to least derived for the first implementation of the target interface, we can identify the substitutions on the argument. By matching the substitutions of the argument with the synthesized type parameters, synthesized type arguments can be generated.

The process above can be repeated for all type definitions and substitutions.

  • Open question

Consider also banning re-implementation of the same interface across inheritance. It's not a type safety violation (and therefore shouldn't need a runtime check), but it could lead to confusing behavior on which implementation is chosen, especially since it is always fully inferred.

Conclusion

Advantages

  • Relatively simple to implement and explain
  • Provides full performance benefits
  • Compact metadata representation

Drawbacks

  • More complexity in generics in C#
  • Generics are different in metadata vs. source
  • Other languages have to implement encoding/decoding separately

Design Meetings

https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-02-16.md#practical-existential-types-for-interfaces

@agocke agocke changed the title [Proposal]: [FEATURE_NAME] [Proposal]: Practical existential types for interfaces Dec 17, 2021
@HaloFour
Copy link
Contributor

I can see the utility but I really don't like how this causes the source declarations and the metadata to differ so much in what I think would be non-obvious ways. IIRC this was a serious sticking point in the earlier discussions on shapes, that the witness type had to be a generic type parameter, and that the team overall would have preferred that the runtime offer a solution that did not require something like that.

@BhaaLseN
Copy link

Could "optional type parameters" solve this? A bit borrowed from C++ templates, not sure how this would be represented in metadata though.

interface Iface<Ext = MemoryStream> where Ext : IDisposable
{
      Ext P { get; }
}
class A : Iface // implicitly: Iface<MemoryStream>
{
    MemoryStream P => new MemoryStream();
}
void Dispose<T>(T t)
    where T : Iface
{
    t.P.Dispose();
}
Dispose(new A());

Or for the fast enumerable case:

public interface IFastEnum<TElement, TEnumerator = int>
{
    TEnumerator Start { get; }
    bool TryGetNext(ref TEnumerator enumerator, out TElement value);
}
class List<TElement> : IFastEnum<TElement> // implicitly: IFastEnum<TElement, int>
{
    int Start => 0;
    public bool TryGetNext(ref int enumerator, out TElement value)
    {
        if (index >= Count) {
            value = default(T);
            return false;
        }

        value = _array[index++];
        return true;
    }
}

// implicit: any IFastEnum<string, int>
void M<TEnum>(TEnum e) where TEnum : IFastEnum<string>
{
    foreach (var elem in e)
    {
        Console.WriteLine(elem);
    }
}
// explicit? it does allow overriding the value, while "type Foo" would not
// would only be called when the implemetation is specialized and Fancy(TM)
// question about overload resolution though.
// and how TEnumerator could be hoisted up to be a type parameter of M.
void M<TEnum>(TEnum e) where TEnum : IFastEnum<string, FancyEnumerator<string>>
{
    foreach (var elem in e)
    {
        Console.WriteLine(elem);
    }
}

@iam3yal
Copy link
Contributor

iam3yal commented Dec 17, 2021

@BhaaLseN This seems wrong to me Iface<Ext = MemoryStream> where Ext : IDisposable because existential types needs to be declared first so if anything it should be something like this Iface<Ext = MemoryStream> where Ext : type, IDisposable otherwise it's just a regular generic type parameter and to be honest it looks weird to me.

I think that "optional type parameters" is a separated feature that doesn't really solve the problem here but enhances generics that the proposed solution take advantage of and might benefit from the enhancement as well should it ever added to the language.

@TahirAhmadov
Copy link

TahirAhmadov commented Dec 17, 2021

I'm confused by the FastEnum example. First of all, the index should be enumerator and TElement should be T. But those typos aside, how is that better than just creating a new:

interface IFastEnumerator<T>
{
  bool TryMoveNext(out T item);
}
interface IFastEnumerable<T>
{
  IFastEnumerator<T> GetFastEnumerator();
}
class List<T> : IFastEnumerable<T>
{
  class FastEnumerator : IFastEnumerator<T>
  {
    public FastEnumerator(List<T> list) => this._list = list;
    int _index;
    List<T> _list;
    // or
    T[] _array; int _size; // to get rid of the indirection
    public bool TryMoveNext(out T item)
    {
      // ...
    }
  }
  public IFastEnumerator<T> GetFastEnumerator() => new FastEnumerator(this);
}
foreach(var x in list) { ... }

C# is expanded to look for GetFastEnumerator and use it accordingly. The only downside of this approach is one initial allocation of a class, which has the list reference (8 bytes) and the counter (4 bytes) = 12 bytes. Are we sure it's worth it to introduce a whole new concept of "sort of generics" to save allocating 12 bytes once per loop? And doesn't the performance cost of extra generic arguments (in terms of runtime) negate the benefit of saving that allocation?
Also, with the example in the OP, it has to repetitively put the ref index on the stack - on top of invoking an instance method using v-table. With my example, the instance invocation is still there, but the ref index push isn't. Maybe I'm wrong but it seems my example is faster.

@0x0737
Copy link

0x0737 commented Dec 17, 2021

What about abstract classes?
And hope it eventually will be a runtime feature... Compiler "tricks" like that always remind me of java's type erasure...

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Dec 17, 2021

At all callsites with synthesized type parameters, the correct type arguments must be inferred.

I wonder if there's space here to do somethign akin to how VB works with extension methods and type parameters. Where tehre can be the 'original def' with some number, and the 'reduced' version with less. Except that here it woudl be somewhat the reverse. I'm just speaking at the Language level how we might represent this, while the emitted level could def look different.

@TahirAhmadov
Copy link

Improved version of my code sample:

interface IFastEnumerator<T>
{
  bool TryMoveNext(out T item);
}
interface IFastEnumerable<T>
{
  Type GetFastEnumeratorType(); // return type which has a ctor accepting one parameter, the collection; and has TryMoveNext method
}
class List<T> : IFastEnumerable<T>
{
  struct FastEnumerator : IFastEnumerator<T>
  {
    public FastEnumerator(List<T> list) => this._list = list;
    int _index;
    List<T> _list;
    // or
    T[] _array; int _size; // to get rid of the indirection
    public bool TryMoveNext(out T item)
    {
      // ...
    }
  }
  public Type GetFastEnumeratorType() => typeof(FastEnumerator);
}
foreach(var x in list) { ... }
// becomes
var enumerator = new List<T>.FastEnumerator(list); // look for ctor which takes the list as a parameter
// the enumerator now lives on the stack
while(enumerator.TryGetNext(out readonly x)) // this call can be inlined, virtually removing any overhead
{
}

And the best part, all this is possible today, with minor changes to the C# compiler (recognize new types/method names/etc.)

@333fred
Copy link
Member

333fred commented Dec 17, 2021

Improved version of my code sample:

interface IFastEnumerator<T>
{
  bool TryMoveNext(out T item);
}
interface IFastEnumerable<T>
{
  Type GetFastEnumeratorType(); // return type which has a ctor accepting one parameter, the collection; and has TryMoveNext method
}
class List<T> : IFastEnumerable<T>
{
  struct FastEnumerator : IFastEnumerator<T>
  {
    public FastEnumerator(List<T> list) => this._list = list;
    int _index;
    List<T> _list;
    // or
    T[] _array; int _size; // to get rid of the indirection
    public bool TryMoveNext(out T item)
    {
      // ...
    }
  }
  public Type GetFastEnumeratorType() => typeof(FastEnumerator);
}
foreach(var x in list) { ... }
// becomes
var enumerator = new List<T>.FastEnumerator(list); // look for ctor which takes the list as a parameter
// the enumerator now lives on the stack
while(enumerator.TryGetNext(out readonly x)) // this call can be inlined, virtually removing any overhead
{
}

And the best part, all this is possible today, with minor changes to the C# compiler (recognize new types/method names/etc.)

That Enumerator type is not something that the compiler can determine at compile time, meaning it would have to be reflection. Associated types is how you achieve your interface proposal with actual compile time knowledge.

@TahirAhmadov
Copy link

How about:

interface IFastEnumerator<T>
{
  bool TryMoveNext(out T item);
}
interface IFastEnumerable<T>
{
  IFastEnumerator<T> GetFastEnumerator(); // this is for regular usage
}
class List<T> : IFastEnumerable<T>
{
  public struct FastEnumerator : IFastEnumerator<T> // this can be public - who cares
  {
    public FastEnumerator(List<T> list) => this._list = list;
    int _index;
    List<T> _list;
    // or
    T[] _array; int _size; // to get rid of the indirection
    public bool TryMoveNext(out T item)
    {
      // ...
    }
  }
  public IFastEnumerator<T> GetFastEnumerator() => new FastEnumerator(this); // this is for regular usage
  public FastEnumerator GetFastEnumeratorDirect() => new FastEnumerator(this); // this is for highest performance
}
foreach(var x in list) { ... }
// becomes
var enumerator = list.GetFastEnumeratorDirect(); // C# looks for a method named GetFastEnumeratorDirect
// the enumerator now lives on the stack
while(enumerator.TryGetNext(out readonly x)) // this call can be inlined, virtually removing any overhead
{
}

@333fred
Copy link
Member

333fred commented Dec 17, 2021

How about:

interface IFastEnumerator<T>
{
  bool TryMoveNext(out T item);
}
interface IFastEnumerable<T>
{
  IFastEnumerator<T> GetFastEnumerator(); // this is for regular usage
}
class List<T> : IFastEnumerable<T>
{
  public struct FastEnumerator : IFastEnumerator<T> // this can be public - who cares
  {
    public FastEnumerator(List<T> list) => this._list = list;
    int _index;
    List<T> _list;
    // or
    T[] _array; int _size; // to get rid of the indirection
    public bool TryMoveNext(out T item)
    {
      // ...
    }
  }
  public IFastEnumerator<T> GetFastEnumerator() => new FastEnumerator(this); // this is for regular usage
  public FastEnumerator GetFastEnumeratorDirect() => new FastEnumerator(this); // this is for highest performance
}
foreach(var x in list) { ... }
// becomes
var enumerator = list.GetFastEnumeratorDirect(); // C# looks for a method named GetFastEnumeratorDirect
// the enumerator now lives on the stack
while(enumerator.TryGetNext(out readonly x)) // this call can be inlined, virtually removing any overhead
{
}

That's pretty much what we have today, with the same performance issues in fully generic code.

@TahirAhmadov
Copy link

That's pretty much what we have today, with the same performance issues in fully generic code.

Oh yea, I didn't even realize it's there. Given that's already there, what are the performance issues? How can that be improved any further? (Other than getting rid of the MoveNext/Current and replacing it with TryGetNext) What's the drive for "existential types"?

@CyrusNajmabadi
Copy link
Member

the problem is when you are workign with IFastEnumerable. The compiler doesn't know about GetFastEnumeratorDirect to do this quickly. With existential types, it could.

@TahirAhmadov
Copy link

the problem is when you are workign with IFastEnumerable. The compiler doesn't know about GetFastEnumeratorDirect to do this quickly. With existential types, it could.

I'm completely confused now. The compiler already looks for GetEnumerator which returns the public struct enumerator, not the interface (per @333fred and List<T> definition).

@CyrusNajmabadi
Copy link
Member

I'm completely confused now. The compiler already looks for GetEnumerator which returns the public struct enumerator

It can do that when workign with the concrete type. But if you're workign with the interface it has no idea. That's what existential types aims to solve. You can both be working with the interface, but pass along enough information that the compielr/jit can statically know all the information necessary to generate more optimal code (namely code that doesn't need to call through virtual-dispatch or allocate on the heap).

@CyrusNajmabadi
Copy link
Member

Using Andy's example, you could pass around an IFastEnum, but get the same perf as if you passed around a concrete impl of that interface.

@TahirAhmadov
Copy link

I'm completely confused now. The compiler already looks for GetEnumerator which returns the public struct enumerator

It can do that when workign with the concrete type. But if you're workign with the interface it has no idea. That's what existential types aims to solve. You can both be working with the interface, but pass along enough information that the compielr/jit can statically know all the information necessary to generate more optimal code (namely code that doesn't need to call through virtual-dispatch or allocate on the heap).

Oh I see what you mean. Still, is it really that important to get rid of one allocation per entire loop? My first example - basically adding TryGetNext and leaving everything else as is - works just as fast per iteration. Or am I missing something else?

@CyrusNajmabadi
Copy link
Member

Still, is it really that important to get rid of one allocation per entire loop?

Yes.

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Dec 17, 2021

My first example - basically adding TryGetNext and leaving everything else as is - works just as fast per iteration.

Again, this is what we have today :)

However, you only get the benefits if you know the concrete types. But that's highly problematic. In many domains you do not. And now there are tons of allocations happening as you have to work over abstract interfaces with no additional information present to help the runtime/jit/etc. avoid them.

@TahirAhmadov
Copy link

@CyrusNajmabadi @333fred
OK I just ran performance testing, enumerating 1,000,000 items, 100 times to get an average, and the results are: (.NET 5)

  • OP example with ref int state: 00:00:00.0057360
  • My example with interface and TryGetNext instead of MoveNext/Current: 00:00:00.0055274
  • Concrete type - struct returned, and TryGetNext: 00:00:00.0047260

.NET FW 4.8:

  • OP example with ref int state: 00:00:00.0139251
  • My example with interface and TryGetNext instead of MoveNext/Current: 00:00:00.0128952
  • Concrete type - struct returned, and TryGetNext: 00:00:00.0117484

In .NET 5, the first 2 approaches (ref int and interface) sometimes switch places; to be completely fair, I think they perform equally. In .NET FW 4.8, the 2nd (interface) approach is always faster.
Code: https://pastebin.com/UA1N1KCG

@CyrusNajmabadi
Copy link
Member

@TahirAhmadov please show a BenchmarkDotnet result, including memory allocations. FUrthermore, this only shows TryGetNext appraoch which is not extendible to all the cases where existential types may be used.

@HaloFour
Copy link
Contributor

This proposal is also about the ergonomics of generic types, not performance. Those scenarios where performance benefits can be achieved but currently necessitate painful generic signatures are just a bonus.

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Dec 17, 2021

My BDN results:

|           Method |       Mean |   Error |  StdDev | Ratio | Allocated |
|----------------- |-----------:|--------:|--------:|------:|----------:|
|  IFastEnumerable | 1,647.0 ns | 4.52 ns | 4.23 ns |  1.00 |         - |
|                  |            |         |         |       |           |
| IFastEnumerable2 | 2,269.2 ns | 6.86 ns | 6.41 ns |  1.00 |      32 B |
|                  |            |         |         |       |           |
|  FastEnumerator2 |   347.1 ns | 0.71 ns | 0.66 ns |  1.00 |         - |

Gist: https://gist.github.com/CyrusNajmabadi/9a1acd7f5c2f2b63c194870533e6af62

So anywhere from 5-7x faster (not even counting things like allocations).

@TahirAhmadov
Copy link

@TahirAhmadov please show a BenchmarkDotnet result, including memory allocations. FUrthermore, this only shows TryGetNext appraoch which is not extendible to all the cases where existential types may be used.

The code is self explanatory, I don't have the time now to make a new project with that BenchmarkDotnet. I do agree that the allocation will add some memory consumption, but again, we're talking about something like 20 bytes; this only becomes a problem if the loop is executed many times - which is usually a code smell for a whole bunch of reasons, not only allocating the enumerator object. In short, I think there may be a miniscule performance benefit to squeeze out of this improvement; but the cost of creating a whole new syntax constructs plus necessary runtime changes just doesn't seem to be worth it right now.
Again - I'm going off of what's in the OP. They show the motivational example and that's what I can work with. I can't sit here and invent other ways to use a feature which I'm not the one proposing.

@CyrusNajmabadi
Copy link
Member

The code is self explanatory

The code is not acceptable as it does hand written measurements without any of the efforts or smarts that BDN puts into reliable, repeatable, and statistically significant measurements.

but again, we're talking about something like 20 bytes; this only becomes a problem if the loop is executed many times

No it's not. That's the common case. Components are producing enumerables and other parts of hte system are consuming it.

I think there may be a miniscule performance benefit to squeeze out of this improvement;

You are incorrect. Even in the basic examples you showed it's 5-7x, and no allocs.

@TahirAhmadov
Copy link

My BDN results:

|           Method |       Mean |   Error |  StdDev | Ratio |
|----------------- |-----------:|--------:|--------:|------:|-
|  IFastEnumerable | 1,647.0 ns | 4.52 ns | 4.23 ns |  1.00 |
|                  |            |         |         |       |
| IFastEnumerable2 | 2,269.2 ns | 6.86 ns | 6.41 ns |  1.00 |
|                  |            |         |         |       |
|  FastEnumerator2 |   347.1 ns | 0.71 ns | 0.66 ns |  1.00 |

Gist: https://gist.github.com/CyrusNajmabadi/9a1acd7f5c2f2b63c194870533e6af62

So anywhere from 5-7x faster (not even counting things like allocations).

The FastEnumerator2 is the concrete type struct approach - I just included it for comparison sake.
However your results are interesting. The existential type approach does seem to be about 27% faster. I'll look into it further later.

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Dec 17, 2021

The existential type approach does seem to be about 27% faster. I'll look into it further later.

No. The existential type approach would allow for the speed of FastEnumerator2 as there will be no virtual dispatch. The jit will literally know that 'start' is an int, and that there is a not-virt call to:

    public bool TryGetNext(ref int enumerator, out TElement value)
    {
        if (index >= Count) {
            value = default(T);
            return false;
        }

        value = _array[index++];
        return true;
    }

which will likely get entire inlined and optimized.

@FaustVX
Copy link

FaustVX commented Mar 28, 2022

@jnm2 Ok for that, but my other points still remains.

@agocke
Copy link
Member Author

agocke commented Mar 29, 2022

@FaustVX I gave a real-world example here: #5556 (comment)

@agocke
Copy link
Member Author

agocke commented Mar 29, 2022

@fabianoliver I read through a bit of that proposal. It's basically existential types, but without the restrictions that make them efficiently implementable. Consider the example

void IterateIfPossible(object obj)
{
    if (obj is IEnumerable<?> items) 
    {
        foreach (var item in items)
            Console.WriteLine(item);
    }
}

This can't be implemented without boxing. It's easy to see if you try to type-check it at the CLR level. Despite the fact that ? is hidden in the syntax, ? must have an actual runtime type, except that there's no uniform type that could possibly represent all types ? could be. As HaloFour mentioned in that issue, int and string do not have a shared representation. You can provide a shared representation by boxing an int, but this is effectively representing everything as a heap-allocated pointer. This is very inefficient and expensive.

In general, if you want efficient value type representations, they must remain unboxed.

@agocke
Copy link
Member Author

agocke commented Mar 29, 2022

Well, addendum, you can do it efficiently, but the only way I know of is a transformation to Skolem normal form as I proposed in #1328, and this proposal was explicitly designed to be much easier to implement and add fewer new typing rules to the language (i.e., path-dependent types)

@fabianoliver
Copy link

fabianoliver commented Mar 29, 2022

@agocke

Agreed, I think with wildcard matching, you could do

void IterateOptimisedIfPossible(object obj)
{
    // The most annoying bit about wildcard runtime type matching: obj could implement IFastEnum multiple times
    if (obj is IFastEnum<string, ?> matches && matches.Length == 1)
      Iterate(items[0]);
}

// Compiler could implement this as Iterate<T>(IFastEnum<string,T> enum), i.e. any function that contains generic wildcards would always be implicitly generic
void Iterate(IFastEnum<string, ?> enum)
{
  foreach(var e in enum)
    DoStuff(e);
}

How would IterateOptimisedIfPossible look under #5556 - would it require explicit reflection to Invoke "Iterate" with the actual type of obj as a generic arg?

The fact you could do a more unoptimised iteration using wildcards:

void IterateUnoptimised(object obj)
{
    if (obj is IFastEnum<string, ?> matches && matches.Length == 1)
      foreach(var e in matches[0])
        DoStuff(e);
}

might be a blessing and a curse at the same time. I'd think the downside is that it'd be very easy to miss this isn't optimal.
The upside is likely that it gives you optionality; in cases where you don't care about the potential boxing overhead, or want to avoid the potential overhead of an extra generic method, you could opt to use the inline version.

You're probably right that a fully fledged wildcard matching system is probably a lot harder to implement/get right.
I wonder if a first implementation could ship without runtime type matching though - i.e. no "is" using wildcards etc. I think the implementation strategy would then quite closely mirror #5556 - everything could be done with hidden generics at compile (or reflection)-time - while potentially leaving the door open to add runtime type features with a consistent syntax further down the road?

@agocke
Copy link
Member Author

agocke commented Mar 29, 2022

@fabianoliver That doesn't work. At the call to Iterate, the compiler needs to provide a generic substitution for T. The only type that could be there is object because you've already lost all static type information.

hoffmann-stefan added a commit to schiller-de/ros2_dotnet that referenced this issue Jun 24, 2022
…per types"

With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action.
The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
@Sergio0694
Copy link

Just sharing some random thoughts - if I'm reading this correctly, it seems like the proposed syntax with the associated type being an "interface member" would be preferrable as it'd make things a lot more ergonomic when using source generators? Let me make an example so I can also double check this actually make sense 😄

In ComputeSharp I have a couple of interfaces (eg. ID2D1PixelShader) that users can implement on a type to indicate that type is a shader type to run on the GPU. The source generator then kicks in and generates a whole bunch of supporting methods for that shader type (eg. to get the transpiled HLSL code, the compiled shader bytecode, other metadata info, marshal the constant buffer data, etc.). These methods are then used by ComputeSharp to actually manage the execution of shaders.

Conceptually, imagine something like this:

interface IFoo
{
    void SomeMethodUsersShouldImplement();
    
    void SourceGeneratedMethod1();
    void SourceGeneratedMethod2();
    // ...
}

What users do is the following:

partial struct MyType : IFoo
{
    public void SomeMethodUsersShouldImplement() { /* ... */ }
}

The generator then kicks in and generates all those additional methods. With associated types, this could be:

interface IFoo
{
    type TGenerated : IGenerated;

    void SomeMethodUsersShouldImplement();
    
    interface IGenerated
    {
        static abstract void SourceGeneratedMethod1();
        static abstract void SourceGeneratedMethod2();
        // ...
    }
}

And users would then write:

partial struct MyType : IFoo
{
    public void SomeMethodUsersShouldImplement() { /* ... */ }
}

And the generator would declare a partial declaration of this file, with some file-local type implementing the interface, and would declare that type as the associated type for this interface. Then APIs using this would just use that directly, eg.:

// Before
static void UseType<T>(T type)
    where T : IFoo
{
    type.SourceGeneratedMethod1();
}

// After
static void UseType<T>(T type)
    where T : IFoo
{
    type.TGenerated.SourceGeneratedMethod1();
}

Which seems pretty nice to me. Couple things:

  • If I'm reading the proposal correctly, this should work, right? 🤔
  • It seems having associated types be interface members would make this scenario much better, as the author of a type would not have to indicate the generic type arguments in the interface base type declaration. Instead, the generator would just declare that associated type as a type member in the generated partial type declaration. Otherwise, I don't see how could this be done at all, since users wouldn't have any type to put in the : IFoo<TGenerated> declaration.

hoffmann-stefan added a commit to schiller-de/ros2_dotnet that referenced this issue Apr 22, 2023
…per types"

With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action.
The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
@Sergio0694
Copy link

Sergio0694 commented Apr 29, 2023

A few more thoughts on this (and comment above) after discussing the issue of "adding new members" from source generators with @CyrusNajmabadi and @jkoritzinsky the other day. Under this proposal, would existential types support being declared as partial in implementing types? Because if that was the case, the pattern mentioned in #5556 (comment) could be supported by having the user declare the partial type themselves, and the generator would just implement it, instead of also declaring it from the start (which I understand makes the generator less efficient, in theory). That is, would this be supported?

// Interface declaration shipped in a library
interface IFoo
{
    type TBar : IBar;

    void SomeMethodUsersShouldImplement();
    
    interface IBar
    {
        void SourceGeneratedMethod();
    }
}

// User code in a project referencing the library
partial class Foo : IFoo
{
    [GeneratedBar]
    partial type TBar;

    public void SomeMethodUsersShouldImplement() { }
}

// Generated code
partial class Foo
{
    partial type TBar = Bar;

    private class Bar
    {
        public void SourceGeneratedMethod() { }
    }
}

This seems like a very powerful and flexible pattern in general 👀

@333fred
Copy link
Member

333fred commented Apr 29, 2023

@Sergio0694 you almost certainly wouldn't be able to use a file type there.

@Sergio0694
Copy link

Sergio0694 commented Apr 29, 2023

@333fred updated to use a private type, just to keep things simpler and not risk getting sidetracked 😄
I didn't consider the fact file private types are very restricted in where they can be used, so yeah that'd be fair.

But in general, ignore the specific accessibility, could also just be public, doesn't matter.
The question is just - do you think we could have "partial existential type declarations"?

@333fred
Copy link
Member

333fred commented Apr 29, 2023

Probably

@abnercpp
Copy link

abnercpp commented Apr 30, 2023

I know this may be a late opinion, but I'm more inclined towards "use site" existential types, which would allow this functionality to be used for existing interfaces with no change to them whatsoever. Furthermore, the programmer can choose what he makes an explicit type parameter, an implicit type parameter, and an existential type parameter.

On top of that, we wouldn't need a new concept of interfaces that can't be used like regular ones (that is, can only be used as a generic constraint).

public static class LinqExtensionMethods
{
    // `TEnumerable` and `TReturn` are explicit.
    // `TItem` is implicit and inferred by the compiler from the `where` constraints.
    public static <TItem> TReturn Select<TEnumerable, TReturn>(this TEnumerable source, Func<TItem, TReturn> func) where TEnumerable : IEnumerable<TItem>
    {
            // Implementation doesn't matter.
    }
}

And if I don't need to know about TItem, to make it a truly existential type, I can just write something like:

    public static class PrinterExtensionMethods
    {
        // Here `T` from `IEnumerable<T>` is existential.
        public static void PrintAll<TEnumerable>(this TEnumerable source) where TEnumerable : IEnumerable<?>
        {
            // Implementation doesn't matter.
        }
    }

I believe this feature would not require any changes to the CLR, we can just make everything syntax sugar and during lowering the compiler could just make everything an explicit parameter. How these methods show up in reflection, as well as the order of the compiler-generated parameters, however, are another discussion.

@agocke
Copy link
Member Author

agocke commented May 1, 2023

@Sergio0694 The problem with partial associated types (they are associated types in this proposal, the original proposal did have something akin to existential types with full Skolemization, and this proposal was a response to that one, but this one doesn't really have existentials) is that ordering is important, so you couldn't have, for instance

partial interface I1
{
   type T1;
}
partial interface I1
{
   type T2;
}

As the ordering of those members is semantically important in the lowering. That's not to say that we couldn't choose the member syntax, but we would have to restrict members to appearing in only one declaration.

@abner-commits Your proposal doesn't really solve one of the main things that this one tries to, which is that writing signatures with tons of generic parameters really sucks.

@abnercpp
Copy link

abnercpp commented May 1, 2023

@Sergio0694 The problem with partial associated types (they are associated types in this proposal, the original proposal did have something akin to existential types with full Skolemization, and this proposal was a response to that one, but this one doesn't really have existentials) is that ordering is important, so you couldn't have, for instance

partial interface I1
{
   type T1;
}
partial interface I1
{
   type T2;
}

As the ordering of those members is semantically important in the lowering. That's not to say that we couldn't choose the member syntax, but we would have to restrict members to appearing in only one declaration.

@abner-commits Your proposal doesn't really solve one of the main things that this one tries to, which is that writing signatures with tons of generic parameters really sucks.

Except it does. I fail to see how it would "suck" to discard parameters I don't care about like this:

void DoSomething<T>(T target) where T : IInterface<?, ?, ?, ?>
{
}

Is that any less ergonomic than having to declare an interface just for usage within generics? Plus there's the advantage that this applies to already existing interfaces, and I can add constraints to the implicit types if I want, further enhancing what we can accomplish with genetics.

Here's an example of what would be possible to do with implicit type parameters + existential types:

public <TEnum> void DoSomething<TDict>(TDict target)
    where TDict : IReadOnlyDictionary<TEnum, ?>
    where TEnum : struct, Enum
{
    
}

@Sergio0694
Copy link

"As the ordering of those members is semantically important in the lowering. That's not to say that we couldn't choose the member syntax, but we would have to restrict members to appearing in only one declaration."

@agocke apologize for the potentially dumb question, but could you elaborate why the ordering would be important here? As in, why would the various associated types for the interface not be handled like other members (eg. fields, methods, etc.) that can be scattered around over multiple partial declarations, and then they're just handled accordingly by name? I'm not sure I'm following why would a type T2 being defined in the same type before/after type T1, or in another partial declaration of its parent interface, result in a different semantics for it 😅

@agocke
Copy link
Member Author

agocke commented May 1, 2023

@Sergio0694 The lowered encoding is as type parameters, so it matters whether T1 or T2 end up seen first, as the encoding would be Iface1<T1, T2> in lowered form. If a change in compilation order of the files resulted in IFace<T2, T1> that would be a binary breaking change.

@abner-commits Writing out discards is worse than not writing out discards. The other problem is that the purpose of the associated type are different.

When you have an associated type that can only be referred to by the implementor, it creates a contract where the type parameter is abstract -- an implementation detail. If you allow them to be manually substituted, then you remove the "hiding" functionality of making it an implementation detail. There's a fundamental question of where the abstraction sits.

@Sergio0694
Copy link

"The lowered encoding is as type parameters, so it matters whether T1 or T2 end up seen first, as the encoding would be Iface1<T1, T2> in lowered form. If a change in compilation order of the files resulted in IFace<T2, T1> that would be a binary breaking change."

Ooh, of course, yeah that makes perfect sense now. Thank you! 😄

@abnercpp
Copy link

abnercpp commented May 1, 2023

@Sergio0694 The lowered encoding is as type parameters, so it matters whether T1 or T2 end up seen first, as the encoding would be Iface1<T1, T2> in lowered form. If a change in compilation order of the files resulted in IFace<T2, T1> that would be a binary breaking change.

@abner-commits Writing out discards is worse than not writing out discards. The other problem is that the purpose of the associated type are different.

When you have an associated type that can only be referred to by the implementor, it creates a contract where the type parameter is abstract -- an implementation detail. If you allow them to be manually substituted, then you remove the "hiding" functionality of making it an implementation detail. There's a fundamental question of where the abstraction sits.

I believe having to write out discards is a minor inconvenience that in return allows you to have fine-grained control over generic parameters. I can have IEnumerable<?> but I can also have IReadOnlyDictionary<T, ?> (if I want to know T or add constraints to it). It feels like a more general-purpose solution than existential types in interfaces, since both can be used for the same thing, however the former can be used for even more things, and further enhances the type system.

Your point about the existential type being an implementation detail I would argue depends on the context. I may want to use the interface in a context where I do care about the existential type, but I still don't want to be tied to a specific implementation of that interface. Likewise, I may want to use that same interface and not care about its generic types. In my opinion, the level of abstraction should be up to the API consumer to decide. If I feel like the generic type doesn't matter for the context I'm in, I can discard it, otherwise I can use it as an implicit parameter, which will even make consuming my API simpler.

hoffmann-stefan added a commit to schiller-de/ros2_dotnet that referenced this issue May 3, 2023
…per types"

With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action.
The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
hoffmann-stefan added a commit to schiller-de/ros2_dotnet that referenced this issue May 15, 2023
…per types"

With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action.
The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
hoffmann-stefan added a commit to schiller-de/ros2_dotnet that referenced this issue May 19, 2023
…per types"

With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action.
The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
hoffmann-stefan added a commit to hoffmann-stefan/ros2_dotnet that referenced this issue May 19, 2023
With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action.
The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
hoffmann-stefan added a commit to hoffmann-stefan/ros2_dotnet that referenced this issue May 29, 2023
With "action wrapper types" this means the ROS messages and services that add the `GoalId`, `TimeStamp` or other fields on top of the user defined types for the action.
The introduced methods in `IRosActionDefinition` are needed to work around the missing existential type support in C# (dotnet/csharplang#5556).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests