
[Proposal] First Class Delegate Types #10303

Closed
axel-habermaier opened this issue Apr 3, 2016 · 15 comments

Comments

@axel-habermaier
Contributor

With more and more functional language features coming to C#, I think it is time to make delegate types first class citizens in C# to improve readability and tooling. The following proposal does that: It discusses the issues that I think C# currently has with delegate types and proposes a new syntax to solve these issues in a way that is mostly transparent; i.e., in most cases the generated code is the same as what you can already write by hand in C# 6.

Note that this proposal exclusively focuses on delegate types as opposed to creating and instantiating delegate instances. For the latter, we have automatic method group to delegate conversions, anonymous delegates, and lambda functions. All of these features are completely orthogonal to this proposal and would continue to work unmodified.

Current Approach

C# currently has no first class syntactic support for delegate types. Instead, it treats delegates mostly like classes, which they indeed are under the hood. The idea was to require all delegate types to be explicitly declared and named for semantic reasons in order to convey meaning to the user: For example, the System.Collections.Generic.List<T> class has a FindAll method taking a Predicate delegate, defined as follows:

public delegate bool Predicate<T>(T value);

In later versions of C# and .NET, it became somewhat less common to declare custom delegate types for "each" method. This is probably mostly due to the high number of types one has to introduce with little benefit; also, the idea was that the name conveys meaning, but since coming up with names is hard, they were mostly pretty generic anyway (for events, for instance, the names are often of the form "EventName + Handler suffix", which doesn't really tell you anything). In particular, you have to F12 into the delegate declaration to actually see the parameters the delegate expects.

Things both improved and got worse when the System.Action, System.Func, and System.EventHandler families of delegate types were added to the base class library. Most new types and methods throughout .NET make use of these delegate types today. In particular, LINQ's Where method does not use Predicate<T> like List<T>.FindAll; instead, it uses Func<T, bool>.
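As an illustration of the split between the two styles, here is a minimal sketch (the names `DelegateMismatchDemo`, `isLarge`, and `asFunc` are made up for this example). `Predicate<int>` and `Func<int, bool>` have the same signature, yet they are distinct types with no implicit conversion between them, so the same predicate has to be bridged by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DelegateMismatchDemo
{
    public static (List<int> viaFindAll, List<int> viaWhere) Run()
    {
        var list = new List<int> { 1, 2, 3, 4 };
        Predicate<int> isLarge = v => v > 2;

        // List<T>.FindAll expects a Predicate<T>.
        List<int> viaFindAll = list.FindAll(isLarge);

        // Enumerable.Where expects a Func<T, bool>. Assigning isLarge directly
        // does not compile, even though the signatures are identical:
        // Func<int, bool> f = isLarge;   // error CS0029
        // A method group conversion on Invoke bridges the two types.
        Func<int, bool> asFunc = isLarge.Invoke;
        List<int> viaWhere = list.Where(asFunc).ToList();

        return (viaFindAll, viaWhere);
    }

    public static void Main()
    {
        var (a, b) = Run();
        Console.WriteLine(string.Join(",", a)); // 3,4
        Console.WriteLine(string.Join(",", b)); // 3,4
    }
}
```

The bridge costs an extra delegate instance, which is a small-scale version of the wrapping problem discussed later for compiler-generated delegate types.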

Problems with the Current Approach

Delegates are a successful feature of both C# and the CLR. However, the way C# currently handles delegate types has several problems that are outlined in the following.

Side Note: Interestingly, F# does not use delegates to represent its function values but uses FSharpFunc-derived types instead. The reason seems to be to efficiently support function currying, which I'm not proposing here. Hence, C# would continue to use the CLR's special support for delegates.

Unhelpful tooling

The tooling is problematic (read unhelpful) when using LINQ's Aggregate method, for instance, which is declared as follows:

public static TSource Aggregate<TSource>(this IEnumerable<TSource> source, Func<TSource, TSource, TSource> func);

In particular, the tooling experience is completely broken when I try to use the method as follows:

var l = new List<int> { 1, 2, 3 };
var r = l.Aggregate((a, b) => 2 * a + b);

Question: What am I doubling, the accumulated value or the elements in the list? I for one never know and have to look it up in the remarks section of the documentation (!!); the first argument (a in this case) is the aggregated value, by the way. There is no way within Visual Studio (that I am aware of) to figure that out. Since Aggregate uses System.Func, it doesn't even help to F12 into the delegate declaration of System.Func<T1, T2, TResult>, as the delegate's parameters, just like their types, have the generic names arg1 and arg2, which is not helpful at all.
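Short of reading the documentation, the only way to find out is to run the code. A minimal sketch that records each call of the lambda (the `AggregateTraceDemo` class and the `trace` list are made up for this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateTraceDemo
{
    public static (int result, List<string> trace) Run()
    {
        var l = new List<int> { 1, 2, 3 };
        var trace = new List<string>();

        // Same lambda as above, but it logs its arguments on every call.
        int r = l.Aggregate((a, b) =>
        {
            trace.Add($"a={a}, b={b}");
            return 2 * a + b;
        });

        return (r, trace);
    }

    public static void Main()
    {
        var (result, trace) = Run();
        foreach (string line in trace)
            Console.WriteLine(line);   // a=1, b=2  then  a=4, b=3
        Console.WriteLine(result);     // 11
    }
}
```

The second call receives a=4, the result of the first call, which confirms that the first parameter is the accumulated value.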

Limitations of Action and Func

Another annoyance is the need for both System.Action and System.Func. As void is not a real type, it isn't allowed as a generic argument, hence you can't simply write Func<int, void> for a void-returning function taking an integer. Hence, Action<int> was introduced, which always seemed somewhat of a hack to me.

Because of their use of generics, System.Action and System.Func have other limitations: You can't have ref and out parameters, params parameters, or make use of pointer types, for instance. In all of these cases, you have to fall back to declaring your own delegate type explicitly.
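For illustration, this is what the fallback looks like today. The delegate names `TryParser`, `Incrementer`, and `Summer` are made up for this sketch; pointer types are left out so that it compiles without `unsafe`.

```csharp
using System;
using System.Linq;

// None of these signatures can be expressed with Action or Func.
public delegate bool TryParser(string text, out int value);
public delegate void Incrementer(ref int counter);
public delegate int Summer(params int[] values);

public static class CustomDelegateDemo
{
    public static int Run()
    {
        TryParser parse = int.TryParse;
        Incrementer bump = (ref int c) => c++;
        Summer sum = values => values.Sum();

        int total = 0;
        if (parse("40", out int parsed))
            total = parsed;          // total is 40

        bump(ref total);             // total is 41
        return sum(total, 1);        // params allows loose arguments: 41 + 1
    }

    public static void Main()
    {
        Console.WriteLine(Run());    // 42
    }
}
```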

In short, Action and Func are, in my opinion, an imperfect library-based solution to work around a C# language limitation.

Proposed Solution

The proposed solution is simple: Make delegate types a first class syntactic construct, similar to how tuples apparently become first class in C# 7. There are several issues to discuss:

  • What's the syntax?
  • What code is generated?
  • Does the proposal solve all of the above problems?
  • How does the new feature interact with other .NET languages and earlier versions of the compiler?
  • Bonus: Is it possible to automatically convert old C# code to the new syntax using a code fix without changing the semantics of the program?

Spoiler: While there will always be use cases for explicitly declared custom delegate types even if it was decided to implement this proposal, you would most likely never explicitly use Func or Action again, and ref/out/params parameters or pointers would also no longer be a reason for explicitly declared delegate types.

Syntax

The proposal is to add a new type expression syntax similar to how tuple types work.

Other Languages
Let's first take a brief look at some other languages to get some ideas and inspiration of how delegates (or more generally: function pointers) could be syntactically represented. For instance, let's declare a function pointer/delegate type for a function returning a bool and taking an int:

In C:

bool(*)(int) // assuming a bool type is declared

C++ uses the same syntax as C for free-standing functions and a slightly different syntax for member functions. The std::function template introduced with C++11 unifies both syntaxes into something that I personally find quite nice:

bool(*)(int) // free-standing function
bool(C::*)(int) // member function of class/struct C
std::function<bool(int)>

Functional languages like F# of course have first class support for function types:

int -> bool

Proposed syntax
I would propose to use a C-like syntax for all delegate types in C#, as the F# syntax just doesn't feel very C#-ish. Disclaimer: I'm not really sure whether the syntax would introduce ambiguities; if so, we could certainly come up with another acceptable syntax that is unambiguous. Therefore, this syntax is just a suggestion; please don't focus on it too much. Some examples:

// A delegate type for void-returning methods without parameters:
void()

// A delegate type for void-returning methods with an int parameter:
void(int)

// A delegate type for bool-returning methods with an int parameter
bool(int)

// a delegate type returning a void pointer, taking a ref int, an out float, 
// and zero, one, or more objects
void*(ref int, out float, params object[]) 

// a delegate type for a (int,int) tuple-returning function taking two int parameters
(int, int)(int, int)

Ideally, we would also be able to specify parameter names to improve tooling just like we can do for delegate declarations today; names would be optional, however. For instance:

void(int someName)
bool(int importantInteger)

// note how named and unnamed parameters can be mixed
void*(ref int passByRef, out float, params object[] otherParameters) 

// a delegate type for a (int,int) tuple-returning function taking an int enumerable
// for instance, could be used with the Tally function of the tuples proposal
(int count, int sum)(IEnumerable<int> values)

Examples
Here are some small code examples making use of the new delegate type syntax:

// LINQ's aggregate function; the parameter names now make it
// clear which parameter is the aggregated value and which is the element 
public static TSource Aggregate<TSource>(this IEnumerable<TSource> source, TSource(TSource aggregatedValue, TSource value) func) 
{ ... }

// A function generating another function
public static bool(T) CreatePredicate<T>(T valueToCompareTo)
{
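    // == is not defined for an unconstrained T, so compare via EqualityComparer<T>
    return value => EqualityComparer<T>.Default.Equals(value, valueToCompareTo);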
}

// Storing a delegate in a local variable
var list = new List<int> { 1, 2, 3, 4 };
bool(int) predicate = v => v > 3;
var result = list.FindAll(predicate);

// Note how the feature would also make working with
// Expression APIs a bit nicer:
Expression<bool(int)> e = ...
// would require nested generics in C# 6:
Expression<Func<int,bool>> e = ...

Code generation

The idea is that the new syntax is only syntactic sugar for Func and Action where possible. Only when these types are not available, for example because the compilation target is an old version of .NET, or because the delegate type uses pointers, ref/out parameters, etc., is a new delegate type introduced. Examples:

Using Action and Func

public static void M1(bool(int) f) { ... }
// results in
public static void M1(Func<int, bool> f) { ... }

public static void M2(void(int) f) { ... }
// results in
public static void M2(Action<int> f) { ... }

If parameters are named, I suggest adding a NamesAttribute (or something similar, probably unified with the attribute used to encode tuple names) to the base class library that can be used to encode parameter names, which are subsequently picked up by IntelliSense. That would probably work in a very similar way to how names are encoded for tuples. Examples:

public static void M3(bool(int firstArg, float, double thirdArg) f) { ... }
// results in
public static void M3([Names("firstArg", null, "thirdArg")] Func<int, float, double, bool> f) { ... }

public static void(int arg) M4() { ... }
// results in
[return: Names("arg")]
public static Action<int> M4() { ... }

// field of delegate type:
public void(int arg) _f;
// results in
[Names("arg")]
public Action<int> _f;
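To make the encoding concrete, here is a minimal sketch of how such an attribute could be declared and read back through reflection. `NamesAttribute` is hypothetical; nothing by that name exists in the base class library, and the exact shape (a `params string[]` constructor, `null` for unnamed parameters) is an assumption taken from the examples above.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute, assumed shape only.
[AttributeUsage(AttributeTargets.Parameter | AttributeTargets.ReturnValue | AttributeTargets.Field)]
public sealed class NamesAttribute : Attribute
{
    public NamesAttribute(params string[] names) { Names = names; }
    public string[] Names { get; }
}

public static class NamesDemo
{
    // What the compiler might emit for:
    //   void M3(bool(int firstArg, float, double thirdArg) f)
    public static void M3(
        [Names("firstArg", null, "thirdArg")] Func<int, float, double, bool> f) { }

    // What tooling might do to recover the parameter names.
    public static string[] ReadNames()
    {
        ParameterInfo p = typeof(NamesDemo).GetMethod(nameof(M3)).GetParameters()[0];
        return p.GetCustomAttribute<NamesAttribute>().Names;
    }

    public static void Main()
    {
        foreach (string name in ReadNames())
            Console.WriteLine(name ?? "(unnamed)");
    }
}
```

Because the attribute is plain metadata, a compiler or language that does not know about it simply sees `Func<int, float, double, bool>`.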

Open question
How would we encode the names of the elements in the tuple returned by the delegate returned by a method as in the following case? That could become complicated, but a solution to this problem probably already exists for tuples; after all, they can also be nested.

public static (int x, int y)(int z) M5() { ... }
// generated code using parameter name nesting
[return: Names("z", new [] { "x", "y" })]
public static Func<int, (int, int)> M5() { ... }

What about ref/out/params/pointers?

That is more complicated; as Action and Func cannot be used in these cases, the compiler would have to declare a new delegate type. This, however, has the same type unification issues that have previously been discussed for tuples, though I don't think there is a generic solution for this problem that is both efficient and doesn't require CLR support. Anyway, let's consider doing the simplest thing that could possibly work: Just declare a delegate type in the containing namespace with appropriate accessibility. For instance:

unsafe class C 
{
     private void*() _f;
     public void M(double*() m)
     {
           int*(int) r = ...;
           ...
     }
}

// generated code:

// The delegate for the private field would be internal; 
// the names of the generated delegates would be unutterable as for backing fields, etc.
internal unsafe delegate void* <>_f_delegate();

// The delegate for the public method parameter would be public
public unsafe delegate double* <>M_delegate();

// The delegate for the method local would be internal
internal unsafe delegate int* <>M_local_delegate(int);

unsafe class C 
{
     private  <>_f_delegate _f;

     public void M(<>M_delegate m)
     {
          <>M_local_delegate r = ...;
           ...
     }
}

One could of course consider unifications of the generated delegate types within an assembly, for instance when two methods declare a parameter with the same delegate syntax. There is one problem though that is illustrated by the following code:

// assembly A
public unsafe static void A(void*() m) { ... }

// assembly B
public unsafe static void*() B() { ... }

// assembly C
A(B()); // compiler error: the delegate type returned by B does not match the type expected by A

That is unfortunate. For tuples, this problem can be avoided by adding the underlying tuple type to the base class library. This is not possible however for the delegate types we're trying to declare here; after all, there already are Func and Action in the base class libraries, but we cannot use them due to their generic nature (by the way, do tuples support pointers? Probably not.)

I see two potential solutions for this problem: Either fix the CLR to allow structurally equal delegate types to be efficiently converted into each other. Or wrap the delegate in another delegate instance, which of course is inefficient because that would result in two delegate instantiations and invocations instead of just one. That is:

A(B()); // as above
// would result in
var _temp_ = B();
A(() => _temp_());
// bad: two delegate instances are created as well as a display class to
// capture the local, and two delegate invocations are required

Given that Action and Func based delegates are the common case, the performance penalty of the above might be acceptable. Once the CLR provides an efficient means to convert delegates, the generated code could transparently make use of that after a recompile for newer .NET versions, such as:

A(B()); // as above
// with hypothetical CLR support, would result in
A(Delegate.ConvertTo<CompilerGeneratedDelegateTypeExpectedByA>(B()));
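The two options can be approximated in today's C#. In this sketch, `TryGetA` and `TryGetB` are made-up stand-ins for the delegate types that would be generated separately in assemblies A and B. `Delegate.CreateDelegate` is an existing reflection API, not the hypothetical `Delegate.ConvertTo`; it rebinds to the same target and method, and as far as I know it only behaves as expected for single-cast delegates.

```csharp
using System;

// Structurally identical, but unrelated as far as the CLR is concerned.
public delegate bool TryGetA(out int value);
public delegate bool TryGetB(out int value);

public static class DelegateConversionDemo
{
    public static bool Produce(out int value)
    {
        value = 42;
        return true;
    }

    // Option 1: wrap. Allocates a second delegate, and every call goes
    // through two delegate invocations.
    public static TryGetA Wrap(TryGetB b) => new TryGetA(b.Invoke);

    // Option 2: rebind to the original target and method via reflection.
    // One invocation per call, but a multicast delegate would be reduced
    // to a single entry of its invocation list.
    public static TryGetA Rebind(TryGetB b) =>
        (TryGetA)Delegate.CreateDelegate(typeof(TryGetA), b.Target, b.Method);

    public static void Main()
    {
        TryGetB source = Produce;

        Wrap(source)(out int viaWrap);
        Rebind(source)(out int viaRebind);

        Console.WriteLine(viaWrap);    // 42
        Console.WriteLine(viaRebind);  // 42
    }
}
```

Option 2 pays a reflection cost at conversion time instead of a cost on every call, so neither variant is a full substitute for CLR-level support.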

Summary

Would this proposal solve the problems mentioned above?

Yes: If LINQ's Aggregate function were redefined as above, IntelliSense could show the parameter names, solving the usability issue. While we're at it, tooling could also show parameter names for custom declared delegate types. Note that Aggregate can be changed without breaking backwards compatibility: The compiled method will be unchanged (both the implementation as well as the execution-relevant metadata); only the [Names] attribute will be added, which only affects tooling, but doesn't affect binary compatibility.

Yes: We would not ever write Action or Func again; instead, the compiler would pick the correct type automatically.

Yes: We could use ref/out/params parameters and pointers with the same syntax.

Yes: Other languages and previous versions of the compiler would be unaffected by this change. They would simply see the Action and Func types as before, not getting any of the tooling improvements, however. Again, replacing old Action and Func declarations with the new syntax is binary compatible.

No: There is a type unification problem for non-Action or Func based delegates between different assemblies.

No: Other languages and older compilers would see compiler-generated delegate types in certain cases, though not in the common cases using Action and Func.

Could a code fix automatically convert code to the new syntax?

Yes: All references to Action or Func could be replaced with the new syntax without affecting the semantics of the program. In fact, the exact same code would be generated as before. Manual intervention would be required, however, to come up with meaningful parameter names, if desired. Custom declared delegate types such as Predicate<T> would of course remain completely unaffected by this proposal and where they make sense, continue to be fully supported.

@HaloFour

HaloFour commented Apr 3, 2016

I don't understand this statement about C# not having syntactic support for delegates as a first-class citizen. C# absolutely does have syntactic support, which is why you're not required to define a delegate as a class complete with all of its required members, which is what you must do in IL. Considering that a CLR delegate is a proper named type, C#'s syntax for defining a delegate is about as succinct as you can get.

Autogenerated names as a part of a public contract is a very fragile solution. If the developer inadvertently does something which causes the generated name to change then all consumers of that delegate will break. The names also have to be CLS-compliant in order to be usable from other languages, so something like <>M_delegate would certainly not be suitable.

The only real "problem" that this proposal seems to try to address is that the argument names of the general-purpose Action and Func delegates are general-purpose and not descriptive. But the solution for trying to address this through some combination of autogenerated types and/or attributes seems more complex than simply just defining a new delegate type.

@svick
Contributor

svick commented Apr 3, 2016

C function pointer syntax is notoriously hard to understand, especially in more complicated cases. Your proposal makes it better, since you're removing the middle (*) part, but I'm still not sure it's the right way to go. For example, the delegate type for the CreatePredicate method in the example would be bool(T)(T), that doesn't strike me as easy to understand, even when compared with Func<T, Func<T, bool>>
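For comparison, a minimal sketch of the nested generic spelling in current C#, specialized to `int` so that it fits in a field (the `CurriedPredicateDemo` name is made up):

```csharp
using System;

public static class CurriedPredicateDemo
{
    // Today's spelling of what the proposal would write as bool(int)(int).
    public static readonly Func<int, Func<int, bool>> CreatePredicate =
        target => value => value == target;

    public static void Main()
    {
        Func<int, bool> isThree = CreatePredicate(3);
        Console.WriteLine(isThree(3)); // True
        Console.WriteLine(isThree(4)); // False
    }
}
```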

And I believe the syntax is ambiguous, at least assuming your proposed delegate types would syntactically behave like any other type: Action().Combine(); could mean either "invoke Action (which could be for example a property) and then call Combine() on the result" or "call Combine() on the delegate type Action()".

Also, currently, unutterable names are only used as implementation details, but you are proposing to expose them publicly. I think that is not acceptable, for several reasons: it's not CLS compatible, it means the unutterable name can never change, it makes the code hard to use for languages without var.

@axel-habermaier
Contributor Author

@HaloFour: Regarding "first class": What I meant is that C# has no first class support for delegate type expressions. It does indeed have first class support for delegate declarations.

@HaloFour: The fact that no new delegate type was defined for Aggregate clearly shows that it's not just about "simply" defining a new delegate type. Also, this proposal not only considers missing parameter names and IntelliSense deficiencies a problem, but also the non-unified declaration of Action and Func-based delegates, which are sometimes even impossible to use altogether.

@svick: bool(T)(T) actually reads quite nicely in my opinion, because that's exactly what it does. Maybe it should be possible to write (bool(T))(T) as well? Not sure if that makes it clearer, though. Also, nested functions always mess up signatures, regardless of the syntax that you use.

@svick: Regarding the ambiguity of Action().Combine(): Good catch! The grammar would be ambiguous because the parser could not decide whether Action() represents an invocation expression or a type expression.

Regarding generated delegate types: This is indeed unfortunate. The problem could only be avoided if a) the CLR allowed generic pointer parameters and b) the CLR allowed ref/out generic parameters. The remaining information (out and params) could be encoded in attributes as well. I still think that this proposal has value, even if generated delegate types could only be used internally within an assembly.

@HaloFour

HaloFour commented Apr 3, 2016

@axel-habermaier It doesn't solve that non-unification, it just uses a different syntax to describe it. int(int) and void(int) would still remain incompatible delegate types. Action and Func weren't added to the BCL to address any language limitations or to attempt to handle every possible delegate signature that could ever be used. Simply declare a new delegate type. Then you'd have a named type that you can reuse anywhere you want, which I think is infinitely nicer than trying to force function signatures into the middle of other function signatures.

@axel-habermaier
Contributor Author

@HaloFour: The problem is that declaring delegate types doesn't scale, especially with more and more functional programming coming into C#. Hence, no one does it, not even the BCL. In fact, the framework design guidelines even suggest avoiding custom delegate types.

The fact that int(int) and void(int) are not signature-compatible is due to their very nature; yet the syntactic declaration would be the same. The difference between Action and Func solely exists because of a C#/CLR limitation; F# doesn't have that problem (it uses unit for void, which is a real type), and C++ allows void as template arguments. What other reasons could there possibly be for separating Action and Func?
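To illustrate the unit idea in C# terms, here is a minimal sketch assuming a hand-written `Unit` struct (hypothetical; the base class library has no public type of that name). With a real type standing in for void, `Func` alone can describe both kinds of delegates.

```csharp
using System;

// Hypothetical stand-in for F#'s unit: a type with exactly one value.
public readonly struct Unit
{
    public static readonly Unit Value = default;
}

public static class UnitDemo
{
    // Adapts a void-returning delegate to the Func family.
    public static Func<T, Unit> ToFunc<T>(Action<T> action) =>
        arg => { action(arg); return Unit.Value; };

    public static int Run()
    {
        int seen = 0;
        Func<int, Unit> f = ToFunc<int>(x => seen += x);
        f(20);
        f(22);
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 42
    }
}
```

The adapter still allocates a wrapper delegate, so this only removes the Action/Func split at the type level; it does not make existing `Action` instances usable as `Func` instances for free.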

@DavidArno

Being able to have parameter names added to Func and Action, much like the tuple proposals will add names to Tuple<...>, is a nice idea. However, this almost seems a secondary aspect of what you are proposing. The main proposal seems to be for lots of new syntax just to avoid creating custom delegates for edge cases when out, ref, or pointers can't be avoided.

@vladd

vladd commented Apr 4, 2016

Please please please don't introduce the abomination known as "C function naming scheme" into such a nice, modern, readable and understandable language as C# is. The spiral rule is a very unnatural and counter-intuitive thing.

A syntax that requires deobfuscation cannot possibly be a good syntax.

@svick
Contributor

svick commented Apr 4, 2016

@vladd If I understand this proposal and the C syntax correctly (at least to some degree), then I think the spiral rule would not apply.

If you look at the example you linked to:

void (*signal(int, void (*fp)(int)))(int);

signal is a function passing an int and a pointer to a function passing an int returning nothing (void) returning a pointer to a function passing an int returning nothing (void)

Then I think the equivalent using this proposal would be:

void(int) signal(int i, void(int) fp)

signal is a method passing an int and a delegate to a method passing an int returning nothing (void) returning a delegate to a method passing an int returning nothing (void)
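For reference, the same signature can be spelled in current C# with `Action<int>`. This is only a sketch: the single handler slot, the `SignalDemo` class, and the fact that `signum` is ignored are simplifications made up for the example.

```csharp
using System;

public static class SignalDemo
{
    // One handler slot for all signals; a real implementation would keep one per signal.
    private static Action<int> current = _ => { };

    // Installs a new handler and returns the previously installed one,
    // mirroring the shape of C's signal().
    public static Action<int> Signal(int signum, Action<int> handler)
    {
        Action<int> previous = current;
        current = handler;
        return previous;
    }

    public static void Main()
    {
        Action<int> first = n => Console.WriteLine($"first handler got {n}");
        Action<int> second = n => Console.WriteLine($"second handler got {n}");

        Signal(2, first);
        Action<int> previous = Signal(2, second);
        previous(2); // first handler got 2
    }
}
```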

@orthoxerox
Contributor

@svick what would be the type of signal, then? void(int)(int, void(int))? Well, it's better than the spiral rule, but unlike function invocation it's right-associative. I don't know how much this will complicate the parser. (int, int->void)->int->void is in my opinion easier to parse for both humans and machines.

@pawchen
Contributor

pawchen commented Apr 5, 2016

I think (void(object))(Xyz) is ambiguous when Xyz is unknown? If Xyz is a type, then this means a delegate type that returns a delegate, otherwise it's a cast?

@svick
Contributor

svick commented Apr 5, 2016

@orthoxerox The type of delegate to signal, yes.

I originally thought this was a terrible syntax (because of the C heritage). Now I'm not sure, but the ambiguity issues might be a deal breaker.

@axel-habermaier
Contributor Author

What if explicit naming were allowed for the generated delegate types, to solve the problem of unutterable, possibly changing public names? For example:

class X
{
     // using a special attribute
     public static void M([DelegateName("MyDelegate")] bool(int* i) x) { ... }

     // inline name declaration
     public static void M(bool MyDelegate(int* i) x) { ... }
}

The compiler could require all externally visible, auto-generated delegates to be explicitly named.

@GeirGrusom

@axel-habermaier

F# doesn't have that problem (it uses unit for void, which is a real type)

void also is a real type though. It just isn't handled as well by either C# or the runtime.

Calling typeof(Func<>).MakeGenericType(typeof(void)) throws an ArgumentException.
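A minimal sketch to check this claim on a given runtime (the `VoidTypeArgumentDemo` class is made up for this example):

```csharp
using System;

public static class VoidTypeArgumentDemo
{
    // Returns true if the runtime rejects void as a generic type argument.
    public static bool ThrowsArgumentException()
    {
        try
        {
            typeof(Func<>).MakeGenericType(typeof(void));
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // typeof(void) itself is a perfectly ordinary Type object.
        Console.WriteLine(typeof(void).FullName);
        Console.WriteLine(ThrowsArgumentException());
    }
}
```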

@axel-habermaier
Contributor Author

@GeirGrusom: True. Interestingly, you can even instantiate void like so: System.Runtime.Serialization.FormatterServices.GetUninitializedObject(typeof(void)) 😄

@gafter
Member

gafter commented Apr 21, 2017

Issue moved to dotnet/csharplang #470 via ZenHub

@gafter gafter closed this as completed Apr 21, 2017