Proposal: forward pipe operators #74

Open
orthoxerox opened this Issue Feb 13, 2017 · 77 comments


orthoxerox commented Feb 13, 2017

Forward pipe operators

Summary

Operators for passing values to methods/delegates/constructors/arbitrary expressions.

See also dotnet/roslyn#5445, dotnet/roslyn#3171

Motivation

Promoting the use of static (preferably stateless) functions.

Detailed design

Grammar

primary-no-array-creation-expression
    : ... //existing grammar
    | pipe-expression
    | placeholder-expression
    ;

pipe-expression
    : primary-expression pipe-operator primary-no-array-creation-expression
    ;
    
pipe-operator
    : '|>'
    | '?>'
    ;

placeholder-expression
    : '@'
    ;
    
object-creation-expression
    : 'new' type arguments? object-or-collection-initializer?
    ;

arguments
    : '(' argument-list ')'
    ;

Functionality

Binding

  1. The lhs is bound as an assignment to a temporary local in the nearest scope to ensure it is evaluated first.
  2. The rhs is bound depending on the presence of a placeholder:
    2.1. The rhs is visited by a placeholder finder
    2.1.1. Pipe expressions do not check their rhs for placeholders
    2.1.2. Placeholders report they are a placeholder
    2.1.3. Other nodes recur
    2.2. If placeholders are present, the rhs is bound as is using PipeBinder
    2.2.1. Placeholders are bound as BoundLocals by PipeBinder (regular binder binds them as BoundBadExpressions)
    2.3. If placeholders are not present, the rhs is bound as an invocation:
    2.3.1. If the rhs is an ObjectCreationExpressionSyntax with arguments, it is bound as a BoundBadExpression (tbd)
    2.3.2. If the rhs is an ObjectCreationExpressionSyntax without arguments, it is bound as a BoundObjectCreationExpression with a single argument
    2.3.3. If the rhs is a ConditionalAccessExpressionSyntax, it is (TODO: explain the logic in the prototype)
    2.3.4. If the rhs is an AwaitExpressionSyntax, (TODO: think about that)
    2.3.5. Otherwise the rhs is bound as an invocation expression with one argument
  3. The expression is bound with lhs, rhs and the type of operator (conditional or not).

Code Gen

  1. Assignment of the lhs to the temporary local is emitted and the value is discarded.
  2. Code gen depends on the conditional parameter of the BoundPipeExpression:
    2.1. If the BoundPipeExpression is conditional:
    2.1.1. If the result is null or a Nullable<T> with no value, a default value of the bound rhs type is emitted
    2.1.2. Otherwise the rhs is emitted
    2.2. If the BoundPipeExpression is not conditional, the rhs is emitted
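
As a rough illustration of the binding and code-gen steps above, the following hand-written C# approximates what the compiler might produce for `input |> Parse` and `input ?> Describe`. The names `Parse`, `Describe`, and `tmp` are hypothetical, and real lowering would operate on bound nodes rather than source text, so treat this as a sketch only.

```csharp
using System;

public static class PipeLoweringSketch
{
    // Hypothetical pure functions used only for illustration.
    public static int Parse(string s) => int.Parse(s);
    public static string Describe(string s) => "len=" + s.Length;

    // Approximates: input |> Parse
    public static int Unconditional(string input)
    {
        var tmp = input;     // the lhs is evaluated first into a temporary local
        return Parse(tmp);   // non-conditional pipe: the rhs is emitted as a one-argument invocation
    }

    // Approximates: input ?> Describe
    public static string Conditional(string input)
    {
        var tmp = input;            // the lhs is evaluated first
        return tmp == null
            ? default(string)       // null lhs: the default value of the rhs type
            : Describe(tmp);        // otherwise the rhs is emitted
    }
}
```

With these definitions, `Conditional(null)` yields `null` without ever calling `Describe`, which is the short-circuit behaviour described for the conditional operator.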

Drawbacks

I don't really see any.

Alternatives

None.

Unresolved questions

What is the correct precedence of the pipe operators? In my (invalid) prototype I placed them between ternary and coalescing operators, but I can't remember why.

Are there any other syntax nodes that protect placeholders inside? Lambdas/delegates?

Should we have an operator that chains function calls using SelectMany?

Design meetings

None that I know of.

Contributor

alrz commented Feb 13, 2017

Currently new SomeType1 is not a valid expression, so you would need to change the syntax forms allowed in that context. Also, that syntax won't work in the case of currying, e.g. e |> F(1) would be equivalent to F(1, e) (note that no actual currying is happening, it would just be allowed syntactically). I think we should require parentheses for any number of arguments (including zero) to make that consistent with the existing production rules of the language. This is what has been proposed in dotnet/roslyn#5445.


orthoxerox commented Feb 13, 2017

@alrz Requiring parentheses would violate the principle of least surprise, IMO:

SomeType e = default(SomeType);

Action<SomeType> F(int x) => blah;
void F(int x, SomeType st) => bleh;

e |> F(1); //calls the second overload
e |> F(1)(); //calls the first overload

Also, currying is usually done from the left, not from the right:

static Func<T2, TResult> Curry<T1, T2, TResult>(Func<T1, T2, TResult> func, T1 t1) 
    => t2 => func(t1, t2);

Inserting a parameter in front of the supplied parameters feels weird, I'd rather support something like dotnet/roslyn#3561 or dotnet/roslyn#8670 or dotnet/roslyn#3171

Contributor

alrz commented Feb 13, 2017

While I disagree that it would be at all useful to forward to the return value of the RHS in the context of C#, I think it would help to provide the language grammar you have in mind for the forward pipe operator.


DavidArno commented Feb 13, 2017

@orthoxerox,

I'm noticing a problem with proposals created by someone other than the team. In the latter case, there is a proposal spec that is linked to that shows the details, such as syntax. Would it be worth creating a gist, for example, that could be used as the spec for this proposal (the same idea would apply to other community proposals too)?


orthoxerox commented Feb 13, 2017

@DavidArno I guess I could fork the repo and push the proposals there.


DavidArno commented Feb 13, 2017

Ah good point. Hadn't thought of doing that. Excellent idea as a PR from that is easier too.


orthoxerox commented Feb 13, 2017

@alrz Could you add some markdown examples of grammar specs to your proposed issue template? This would help a whole lot.

Contributor

alrz commented Feb 13, 2017

In that proposal you could use a general (or restricted, depending on the precedence) expression as RHS since there is no syntactic requirements. Then allowed forms will be checked semantically. e.g only permit expressions that a value can be forwarded to, namely, invocation-expression, object-creation-expression, an await-expression that contains any of applicable expressions (recursively), etc.

Note that expression '|>' expression is ambiguous as it does not specify any precedence for expressions in which the operator is applied, also it wouldn't allow new SomeType as it's not a valid expression.


orthoxerox commented Feb 13, 2017

@alrz in my case it's not restricted at all, since the pipe expression replaces all invocation-likes: the rhs provides an invocable. I think only object creation expression is weird, since it doesn't actually provide an invocable and has to be syntactically valid without both argument list and initializers now.

I should check if my approach works with await, though.


YaakovDavis commented Feb 13, 2017

It would be nice if piping to arg-positions other than the last were supported. This can be achieved by a new place-holder syntax, e.g.:

void M(int a, int b, int c) {}
2 |> M(1, .., 3);

Alternative place-holder suggestion:

2 |> M(1, @, 3);

This is especially useful given that methods in C# codebases typically aren't designed for piping purposes.


orthoxerox commented Feb 13, 2017

@YaakovDavis yes, shorthand lambdas are an option as well, but they have very complicated boundary detection rules. Maybe if the placeholders were restricted to argument lists and not arbitrary expressions... but then they wouldn't be as useful as full-powered shorthand lambdas.


YaakovDavis commented Feb 13, 2017

@orthoxerox,

shorthand lambdas

Are you referring to the proposed lambda syntax from dotnet/roslyn#5445?
I see my proposal as orthogonal to that. Piped-value place-holders could work with, or without the lambda support. Please correct me if I miss something.


orthoxerox commented Feb 13, 2017

@YaakovDavis I think you must've linked the wrong proposal. Anyway, yes, I am referring to that syntax. I wouldn't call your proposal orthogonal to the shorthand lambda, since piping to a shorthand lambda would look the same as piping to an invocation with a placeholder:

2 |> M(1, .., 3);
2 |> M(1, @, 3);

This makes placeholders a strict subset of the shorthand lambda functionality. Return values are commonly used as method receivers, not just regular arguments, so if we allow placeholders to be used there as well, they might be much more useful than I originally thought...


YaakovDavis commented Feb 13, 2017

@orthoxerox Seems like you're referring to dotnet/roslyn#3561. Am I correct?
I wasn't aware of this proposal till now.

Let's continue the discussion after confirming that we're on the same page :)

Contributor

HaloFour commented Feb 13, 2017

I'm not sure why forward pipe operator makes a language more "functional". More "fluent", maybe.

In my opinion exposing any of these four operators opens a can of worms as you'll soon run into paradigmatic differences between C# and functional languages. Currying and overloading are two big ones. Those two features generally can't coexist in the same language (or at least in the same context, which is the line that F# walks). In F#, piping an expression into a function that accepts multiple arguments results in a currying operation. That clearly can't happen here. So you still end up with arguments that have to be defined and embedded at the call site.

Personally I prefer extension methods. They achieve roughly the same functionality but fit more into the C# syntax. Hopefully local extension methods will be supported at some point.

Contributor

alrz commented Feb 13, 2017

@HaloFour In my opinion, this should not be perceived as a move to make the language more "functional". That's just a construct to write more readable call chains with the same lexical order as they will be executed or without leaving the expression context.

Member

gafter commented Feb 13, 2017

@orthoxerox

If the rhs is a lambda declaration, it is optimized away by inlining the body.

That doesn't make any sense, as the lambda would be an error by virtue of not having a (delegate) target type. (You can't "optimize" an error)

Does anyone have a plausible concrete use case for this that demonstrates that it would be of value in frequently occurring code patterns?

Contributor

HaloFour commented Feb 13, 2017

@gafter

You can't "optimize" an error

I beg to differ. Isn't that called "fail fast"?


orthoxerox commented Feb 13, 2017

@gafter I guess this snippet is a stub from an older version of the proposal where shorthand lambdas were included. The lambda expression would be rewritten before being bound, so it wouldn't be an error yet. With longhand lambdas this doesn't make much sense, I agree.


orthoxerox commented Feb 13, 2017

@HaloFour the presence of |> nudges the developers towards writing pure static functions because they are easy to chain.

I'm kinda warming up to the idea of adding placeholders to the proposal, to offset the impossibility of currying.


MgSam commented Feb 14, 2017

I think placeholders to indicate where to use the value being piped are necessary. Eliding or shortening the argument list of a method looks odd and would no doubt be confusing and introduce potential ambiguities.

While it's certainly possible to create similar behavior in today's C# using an extension method that accepts a Func<T1, T2>, it is, of course, not nearly as efficient as the proposed pipe operator could be, since it requires closing over any local variables you want to use.

@gafter Any nested function calls seem a pretty great use case for this. Reading function calls from the inside out makes it unnecessarily hard to follow complex logic.

var foo = Substring(ParseJson(LoadData())[5].Value, 0, 3)

vs

var foo = LoadData() |> ParseJson(@)[5].Value |> Substring(@, 0, 3)

In fact, I wish the MS Excel guys would take a page from F# and add this operator already. Trying to decipher tons of nested function calls is a nightmare.
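
For comparison, here is a minimal sketch of the extension-method workaround mentioned above, written in C# that compiles today. `Pipe`, `LoadData`, `Split`, and `Substring` are hypothetical stand-ins chosen so the example is self-contained; they are not part of the proposal.

```csharp
using System;

public static class PipeExtensions
{
    // Hypothetical helper: feeds value into func, mimicking `value |> func`.
    public static TResult Pipe<T, TResult>(this T value, Func<T, TResult> func) => func(value);
}

public static class PipeDemo
{
    // Hypothetical stand-ins for the data-loading and parsing functions.
    static string LoadData() => "alpha,beta,gamma";
    static string[] Split(string data) => data.Split(',');
    static string Substring(string s, int start, int length) => s.Substring(start, length);

    // Nested form: has to be read from the inside out.
    public static string Nested() =>
        Substring(Split(LoadData())[1], 0, 3);

    // Chained form: reads in execution order, at the cost of one delegate per step.
    public static string Piped() =>
        LoadData()
            .Pipe(Split)
            .Pipe(parts => parts[1])
            .Pipe(s => Substring(s, 0, 3));
}
```

Both methods compute the same value. The lambdas here capture nothing, but a step that referenced a local variable would need a closure, which is the overhead a built-in operator could avoid.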


orthoxerox commented Feb 14, 2017

Updated the OP with grammar and more detailed design, including placeholders.

Contributor

alrz commented Feb 14, 2017

@gafter AFAIK this was championed by dotnet/roslyn#5445's assignee as it's labeled "Ready", no? If so, is there a spec draft somewhere that we can base the discussion on it to ensure that we're on the same page?


AlphaDriveLegal commented Feb 14, 2017

Hi All,
I was wondering why the |> operator was picked for this. My thought was that something like -> might be easier to understand. I didn't see anyone else mentioning that here so I just wanted to throw it out:

var results = 
  from x in items
  where x != null && x.Age > 21
  select x
  ->Skip(5)->Take(10)->ToList()
  ;

Member

gafter commented Feb 14, 2017

@alrz That proposal is different, as it has the input parameter added to the argument list. This proposal (in its first iteration, at least) requires there be no argument list.

Contributor

alrz commented Feb 14, 2017

@gafter Should I open another issue for that one or this suffices?

Member

gafter commented Feb 14, 2017

@alrz You can hold discussion wherever you want. If you have a different proposal to discuss then it makes more sense to do so on a different issue.


orthoxerox commented Feb 14, 2017

@alrz Why do you think that argument lists are better than no argument lists?

Contributor

alrz commented Mar 6, 2017

Looking a little closer, the proposed @ syntax is exactly what #91 is trying to accomplish.

Note: I'll use Mathematica syntax for #91; using # for parameters and & to terminate.

For example,

.Select(# + 1 &)
// equivalent to
.Select(x => x + 1)

So, using the same syntax in the context of forwarding would look like this (#96 syntax for RHS):

value 
|> $"the value is {#}" &
|> Console.WriteLine(); 

Delegates and method references (#84) could also be used (which looks like what's proposed here for RHS),

value
|> Console.WriteLine; 

I think this closes the gap between #96 and #74.

/cc @orthoxerox @HaloFour


gulshan commented Mar 6, 2017

I imagine |> being a postfix operator for function/method calling. Meaning "abc" |> Write is just Write("abc"). Thus I proposed to remove the parenthesis after the method-name. That will also help to reduce ambiguity with overloading. And my proposal of using await and return with |> comes from the notion of it being postfix. Using using within the chain was a bit too far of an idea.


orthoxerox commented Mar 6, 2017

@gulshan the parens were the reason why @alrz opened his own proposal.

@alrz yes, originally two ideas were combined in my roslyn proposal. Without a delimiter shorthand lambdas were a pain to parse, so I dropped them. I'm not sure if a terminator is the right option there, though. I think a prefix would work better with the existing parser.


orthoxerox commented May 13, 2017

I've implemented a prototype that combines pipes with placeholders and shorthand lambdas. No conditional pipe support yet, though. Will push it tomorrow when I get home.


MgSam commented Aug 8, 2017

Interesting reading- ECMAScript is considering a similar proposal: https://github.com/tc39/proposal-pipeline-operator.


mharasimowicz commented Nov 19, 2017

I support this proposal, it can simplify code when flow is clear.

BTW: It's possible to write similar functionality today by writing your own delegate classes and overloading the | operator, but the code is quite verbose, which in practice defeats its purpose.

I've created a gist with example code.

https://gist.github.com/mharasimowicz/a4ff68ab4b890d8b866dcdbd6795cc3b
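
To give a feel for the operator-overloading approach, here is a minimal sketch written independently of the linked gist. The `Piped<T>` wrapper and the demo names are hypothetical. Because a user-defined operator cannot introduce its own type parameters, this version only supports steps that keep the same type.

```csharp
using System;

// Hypothetical wrapper type; not taken from the linked gist.
public readonly struct Piped<T>
{
    public T Value { get; }
    public Piped(T value) => Value = value;

    // `piped | func` applies func to the wrapped value and re-wraps the result.
    public static Piped<T> operator |(Piped<T> left, Func<T, T> func) =>
        new Piped<T>(func(left.Value));
}

public static class PipedDemo
{
    public static int Run()
    {
        // The operands are typed delegate variables rather than inline lambdas.
        Func<int, int> addOne = x => x + 1;
        Func<int, int> twice = x => x * 2;

        var result = new Piped<int>(3) | addOne | twice;
        return result.Value; // (3 + 1) * 2
    }
}
```

The wrapping, unwrapping, and explicitly typed delegates are the kind of verbosity the comment refers to. To my knowledge a bare lambda cannot be used directly as an operand of `|`, which adds to the ceremony.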


gulshan commented Nov 19, 2017

The ECMAScript proposal made async part of the pipeline. Feels great!


Jehoel commented Dec 15, 2017

One thing I'd love to use this for is for "postfix-casts". It feels unnatural to me that the C# cast (T) operator is prefix, not postfix, because it disregards the expected left-to-right evaluation order for non-parenthesised operations, which is how Linq's Cast<T>() method works.

Compare:

((Int32)enumerable.Select( foo => foo.Bar ).SingleOrDefault() ?? 123).ToString();

With:

enumerable.Select( foo => foo.Bar ).SingleOrDefault() ?? 123 |> (Int32)@ |> @.ToString();

I feel |> (Int32)@ |> syntax could be improved somewhat - any thoughts?


dzmitry-lahoda commented May 18, 2018

I do fun programming again with .Do & ?.Do.

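        using System;

        public static class DoExtensions
        {
            // Runs a side effect on the target, then returns the target so the chain can continue.
            public static T Do<T>(this T target, Action<T> action)
            {
                action(target);
                return target;
            }

            // Passes the target to func and returns func's result.
            public static TResult Do<T, TResult>(this T target, Func<T, TResult> func) =>
                func(target);
        }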

dadhi commented May 19, 2018

@dzmitry-lahoda,

Agreed.

For me the main benefit is the postfix notation, which naturally fits my train of thought: a transformation chain from input to required result. This chain naturally flows forward, and that is where piping or your extension methods fit.

Given the usual notation I often need to return back to store the result of first transformation, then jump forward again to use the result.

It may not need to be applied everywhere, but it helps me a lot in small methods or LINQ queries to produce cleaner code. Usually it makes it possible to convert such methods from statement body to expression body, cutting more noise.

Even if it may only be applied to single-parameter methods, that's fine. One improvement at a time!


R Blah(T input)
{
    var key = GetKey(input);
    var value = FindValue(key);
    return value == null ? null : Process(value); // null propagation cannot be applied here
}

Extension methods:


R Blah(T input) =>
    GetKey(input).Do(FindValue)?.Do(Process);

Pipe operator(s):


R Blah(T input) => 
    input |> GetKey |> FindValue ?> Process;


masaeedu commented Jun 19, 2018

Instead of tying placeholders and the pipeline operator together (at cost of increased complexity to both), can we instead have two simpler operators?

v |> f |> g |> h
// => always desugars to (and typechecks/dispatches identically to)
h(g(f(v)))
// without regard to number of arguments or overloads of f, g, h

In order to deal with multi-argument functions you can explicitly do:

v |> f |> (x => g(42, "foo", x)) |> ...

If the simple lambda wrapping x => ...x... feels too tedious, we can have an independent language feature for making partial application more convenient:

f(?, "foo")
// => desugars to
x => f(x, "foo")

Now you can use the two features in isolation or together:

v |> f |> g(42, "foo", ?) |> ...

Of course special casing the emit of g(42, "foo", ?) to skip creation of a lambda when inside a |> pipeline (for performance reasons) is fine, so long as the runtime semantics are identical.
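
A small sketch of the intended equivalence, using today's C# with a hypothetical `Pipe` helper in place of `|>` and an explicit lambda in place of the `?` placeholder. `F` and `G` are made-up functions for illustration.

```csharp
using System;

public static class DesugarSketch
{
    // Hypothetical functions standing in for f and g.
    static int F(int v) => v + 1;
    static string G(int a, string b, int x) => $"{a}-{b}-{x}";

    // Stand-in for the proposed operator: `v |> fn` is simply fn(v).
    static TResult Pipe<T, TResult>(T v, Func<T, TResult> fn) => fn(v);

    // Models: v |> f |> g(42, "foo", ?)
    public static string ViaPipeline(int v)
    {
        // g(42, "foo", ?) written out as the lambda it would desugar to.
        Func<int, string> partial = x => G(42, "foo", x);
        return Pipe(Pipe(v, F), partial);
    }

    // The nested call the pipeline is meant to be identical to.
    public static string Direct(int v) => G(42, "foo", F(v));
}
```

`ViaPipeline` and `Direct` return the same string for any input. The only difference is the delegate created for `partial`, which is what the suggested special-casing inside a pipeline would remove.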


theunrepentantgeek commented Jun 19, 2018

The problem with resorting to lambdas is that they always allocate - and they often capture.

This is going to create a lot of heap debris:

v 
|> f 
|> x => g(42, "foo", x) 
|>

Whereas, with the original proposal, this would not:

v 
|> f 
|> g(42, "foo") 
|>

masaeedu commented Jun 19, 2018

I guess it's too idealistic to hope there's some emit or JIT optimization that makes (x => g(42, "foo", x))(v) cost the same as g(42, "foo", v)?


theunrepentantgeek commented Jun 19, 2018

AFAIK, there isn't at present, and I believe designing a language feature in a way that requires CLR changes dramatically reduces the likelihood of implementation.


orthoxerox commented Jun 19, 2018

@masaeedu @theunrepentantgeek that doesn't have to be a CLR change. (x => g(42, "foo", x))(v) is an invalid expression in C# and it can be lowered to a sequence expression (var x = v; g(42, "foo", x)) by the compiler.


masaeedu commented Jun 21, 2018

@HaloFour

I don't recall an overabundance of scenarios in my code where I write anything like d(c(b(a)))

If you've written code like foo.Select(x => x + 1).Where(x => x < 42).SelectMany(Factors).Distinct(), you've written code like d(c(b(a))).

Contributor

HaloFour commented Jun 21, 2018

@masaeedu

If you've written code like foo.Select(x => x + 1).Where(x => x < 42).SelectMany(Factors).Distinct(), you've written code like d(c(b(a))).

No, that's writing code like a().b().c().d(). The fact that it's converted into nested static method calls by the compiler doesn't really matter. The same would be true of any pipe operator.


masaeedu commented Jun 21, 2018

The Select, Where, SelectMany etc. are static methods, the fact that they're sugared for .Whatever() application is what doesn't really matter here. There are many more such functions that I'd like to be able to apply to values in a chain which can't be given fixed definitions as static methods, but are instead more conveniently computed from other functions and values.


Jehoel commented Jun 30, 2018

@theunrepentantgeek

The problem with resorting to lambdas is that they always allocate - and they often capture.

Is allocation on every invocation, or is a single capture-container object created on-demand and shared by subsequent calls (a-la switch's Dictionary)?

...and wouldn't the JIT inline the method invocations anyway?


theunrepentantgeek commented Jun 30, 2018

I believe that it depends on whether the lambda involves any captures. If it doesn't capture, I believe it will be allocated once. But if it does capture, I'm pretty sure that lambdas are separately allocated for every execution.
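
One way to observe this is to compare delegate instances by reference. The sketch below uses hypothetical names. Reuse of the non-capturing delegate is a compiler implementation detail rather than a language guarantee, so its result is not stated here.

```csharp
using System;

public static class LambdaAllocationSketch
{
    // Captures nothing, so the compiler is free to cache a single delegate instance.
    public static Func<int, int> NonCapturing() => x => x + 1;

    // Captures the parameter `offset`, so each call needs its own closure object.
    public static Func<int, int> Capturing(int offset) => x => x + offset;

    // True if two calls hand back the very same delegate object.
    public static bool NonCapturingIsReused() =>
        ReferenceEquals(NonCapturing(), NonCapturing());

    public static bool CapturingIsReused() =>
        ReferenceEquals(Capturing(1), Capturing(1));
}
```

`CapturingIsReused()` returns false, because each call captures a distinct `offset` variable. Running `NonCapturingIsReused()` on a given compiler shows whether caching applies there.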
