RFC: Lambda function expressions #49

vegorov-rbx
Collaborator

No description provided.

@vegorov-rbx vegorov-rbx added the rfc Language change proposal label Jun 8, 2021
@zeux
Collaborator

zeux commented Jun 8, 2021

This is mentioned in the RFC but this would also be the first time (I think?) that we have two syntactical variants for the same thing.

The parsing of (x)-> seems a bit complicated. My understanding is that we'll need to eat all 3 tokens, then lookahead and decide what to do - sometimes what we do is that we construct an Ast for an expression that uses a local or global based on the name?

It's unclear how to incorporate return type annotations into this syntax.

@alexmccord
Member

alexmccord commented Jun 8, 2021

I'm not sure it makes sense to get rid of the -> token if we do use the do keyword here. It might be a bit redundant, but the whole point of adding the introduction token do here is to simplify parsing logic.

Keeping the -> token will also give us an opportunity to put in a return type annotation on the lambda expressions, so something like do(x: any): boolean -> x == 1 will be valid syntax. It's not as nice to read without the -> here. This situation does come up occasionally in TypeScript when you want to filter and transform the value to a different type.

let a: Instance[] = [ new Part(), new Folder(), new Part() ];
// TS type inference infers 'Part[]' here, otherwise 'Instance[]' without return type annotation
let filtered = a.filter((x: any): x is Part => x instanceof Part);

The same will apply in Luau.

local a: {Instance} = { Instance.new("Part"), Instance.new("Folder"), Instance.new("Part") }
local filtered = filter(a, do(x): x :: Part -> x:IsA("Part"))

@vegorov-rbx
Collaborator Author

vegorov-rbx commented Jun 8, 2021

> Keeping the -> token will also give us an opportunity to put in a return type annotation on the lambda expressions

do(x: any): boolean x == 1 is already valid in the current (alternative) proposal.
Looks confusing though.

@vegorov-rbx
Collaborator Author

> It's unclear how to incorporate return type annotations into this syntax.

I don't see a problem with return type annotations, I even mention how we resolve the parsing of (x): to be a lambda function.

@zeux
Collaborator

zeux commented Jun 8, 2021

> I don't see a problem with return type annotations, I even mention how we resolve the parsing of (x): to be a lambda function.

I missed that, I think that runs into more issues.

  1. It's mildly inconsistent with existing practice. Currently we use : together with function prefixes and -> for type definitions. It feels a bit odd to use : here
  2. It needs even more lookahead, (x):number is a valid function call if it is followed by ( or string literal.

@alexmccord
Member

> do(x: any): boolean x == 1 is already valid in the current (alternative) proposal.
> Looks confusing though.

There wasn't any mention of a return type annotation, which is why I was bringing it up. But yeah, like I said it's not as nice to read.

The points that @zeux brought up earlier are also the exact reason why I am a big proponent of having the starting token. It keeps the language simple for the parser as well as readers.

@vegorov-rbx
Collaborator Author

> 1. It's mildly inconsistent with existing practice. Currently we use : together with function prefixes and -> for type definitions. It feels a bit odd to use : here

I think the opposite: it is consistent to use ':' because function definitions already use it. And we have a function here, not a function type.

> 2. It needs even more lookahead, (x):number is a valid function call if it is followed by ( or string literal.

This is a problem I hadn't thought about.
To clear things up, the design previously required only 1 token of lookahead: if we find that (x) is not followed by a disambiguating token, we just create the required AstExpr without backtracking.
Now, however, this won't be enough.

@vegorov-rbx
Collaborator Author

> There wasn't any mention of a return type annotation

Sorry, that was my mistake. I have updated the document to correctly reference function header grammar instead of function argument grammar.

@alexmccord
Member

Also, since the RFC doesn't mention it either, do we want multiple returns? I imagine yes, but this will run into ambiguous syntax as well: function_taking_a_lambda_returning_two_values(do(x, y) -> x, y).

How many arguments are we actually passing?

  1. (do(x, y) -> x, y), or
  2. (do(x, y) -> x), y

I'd probably prefer to solve it by requiring explicit parentheses syntactically around the lambda expression if it is found in expression list context and the lambda expression has more than 1 value. If the parser detects this ambiguity, it should throw a parse error pointing out this ambiguity.

@zeux
Collaborator

zeux commented Jun 8, 2021

@alexmccord Thanks - multiple returns seem to be a problem with any syntax here that doesn't require extra parentheses (something that we've avoided in the past for other features like as or if-then). I don't think this is limited to multi-value expressions assuming you mean arguments, since (x) -> x, x+1 is ostensibly valid. Many contexts where expressions are valid are actually expression list contexts - return, function arguments, table literals, local values - so almost every use of lambdas would need to be wrapped in parentheses which seems suboptimal.

@vegorov-rbx
Collaborator Author

vegorov-rbx commented Jun 9, 2021

I've changed the proposal to use the alternative with an introduction token.

Both previous and current proposals define grammar to only support a single return value (subexpr).
I have added this to the "Drawbacks" section.

@zeux
Collaborator

zeux commented Jun 11, 2021

Thanks everyone for the feedback. My overall thoughts are that we should table this proposal for now.

This proposal introduces syntax sugar - it doesn't unlock new capabilities. Ergonomics are important, but what we're discovering is that it's difficult to achieve these ergonomics without both internal and user-facing complexity.

In particular:

  • The ergonomic solution involves lack of a leading token, but this is very difficult to parse without backtracking, especially if we want to incorporate support for type annotations via :
  • To use a leading token, we need to use an existing keyword (ordinarily we'd pick something like fun/fn); the sensible choice might be do but it creates new unusual syntax to declare an anonymous function that is unlikely to make it outside of lambda syntax. This creates further divergence of syntax between short form and long form function definitions that is somewhat surprising, as between this and the type definitions we'll end up with three to four ways to talk about the functions.
  • To make matters worse, while normally Luau supports multiple returns, doing this here creates grammatical ambiguities. Solving these with parenthesized returns creates further parsing complications.
  • If we don't support multiple returns here, then the feature becomes even more alien and special-cased, since some types of lambdas won't be declarable via this short-form syntax. What's worse, presumably you'd be able to forward multiple returned values via the usual tail-return property, which means that it's not impossible to have a lambda that returns two values; you just can't get a lambda to return the two values you want. We could make sure these always return a single value but that's also inconsistent with return behavior elsewhere...
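
The tail-return point in the last bullet can be reproduced with today's long-form syntax. This is a minimal sketch, assuming hypothetical helpers `pair` and `forward`, with the single-expression lambda written out as an ordinary function because the short form is not valid Luau:

```lua
-- `pair` and `forward` are hypothetical names used only for illustration.
local function pair()
    return 1, 2
end

-- Long-form stand-in for a single-expression lambda whose body is `pair()`.
-- The body is one expression, yet the call in tail position forwards both values.
local function forward()
    return pair()
end

local count = select("#", forward())
print(count) --> 2
```

So a grammar limited to a single subexpression would still not guarantee a single return value at runtime.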

Overall it feels that while the feature is tempting, there are significant complications with the design, and pretty much every variation has severe issues one way or the other. Because of this, and because this is purely sugar, I think we should close this RFC. This is unlike something like if-expr, which addressed a critical gap in semantics that leads to severe problems today (such as people either making mistakes with incorrect use of the and..or pattern, or using IIFEs to emulate a ternary expression, which is very costly). Per the RFC proposal process this doesn't mean this feature is never going to land in any form, but that the design we can come up with and implement today is too problematic to be desirable.
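
For context, the and..or pitfall and the IIFE workaround mentioned above look roughly like this. It is a sketch only, and `pickAndOr` and `pickIife` are hypothetical names:

```lua
-- `cond and a or b` falls through to `b` whenever `a` is false or nil.
local function pickAndOr(cond, a, b)
    return cond and a or b
end

-- IIFE emulation: gives the right answer, but creates and calls a closure on each use.
local function pickIife(cond, a, b)
    return (function()
        if cond then
            return a
        else
            return b
        end
    end)()
end

print(pickAndOr(true, false, "fallback")) --> fallback (the intended result was false)
print(pickIife(true, false, "fallback"))  --> false
```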

@zeux zeux closed this Jun 14, 2021
@Hexcede

Hexcede commented Feb 19, 2023

C.c. @vegorov-rbx & @zeux

I wanted to jump into this to add to the convo since it has been a while. I considered making a new separate RFC for this, but I figured I would reply to this existing one instead with my suggestions.

In short, my thoughts are that perhaps stealing syntax from Rust would just kind of work.


In general, this alternative syntax is a lot easier to parse, both visually and for the language, especially when compared to some of the potential alternatives above. It is visually obvious, and low conflict for readability and parsing due to the lack of >, -, ( or ) in the syntax. Currently, the unique opening symbol also doesn't serve any other purpose in the language, and I think this would be a fitting way to use it.


How this addresses the conflicts above

  • Remains ergonomic while still offering a unique opening token.
  • Uses a currently invalid symbol instead of a keyword.
  • Grammatical ambiguities make a little more sense - Note that parenthesized returns would also bar using (a, b) to truncate to (a)
  • Multiple return values can be supported and still look reasonable, type defs work, and everything is consistent with existing anonymous function definitions.
  • Exists in Rust with very similar neighboring syntaxes. It'd also be familiar for Rust programmers. The biggest difference here between Rust and Luau is just the ability to return multiple values in Luau.

Exploring this syntax in Luau

In Luau it would look like this:

local lessThan = |a: number, b: number|: boolean a < b
local sortedAsc = |a: number, b: number|: (number, number) math.min(a, b), math.max(a, b)
local getOne = || 1

The parsing for this is very similar to the parsing of a typical anonymous function expression, but instead of parsing a function body, we are parsing what is basically just a return statement in disguise.

The main problems are:

  • What do we do when there is ambiguity about multiple return values (seems like there is only one practical answer here)
  • How do we determine when the expression starts (seems identical to function body parsing already)
  • How do we determine when the expression ends (seems identical to an assignment statement)

Multi-value ambiguity

When it comes to the ambiguity with multiple values, I believe it is reasonable for the lambda to just be greedy and eat everything. This makes parsing easier here, but also in a lot of other cases too, which I think makes it pretty natural over other alternatives which generally would involve limiting the functionality of the lambda.

local sortedAsc = (|a: number, b: number|: (number, number) math.min(a, b), math.max(a, b))
local lambdaMin, max = (|a: number, b: number|: (number, number) math.min(a, b)), math.max(a, b)

-- With the ambiguous case being equivalent to sortedAsc above
local sortedAscAmbig = |a: number, b: number|: (number, number) math.min(a, b), math.max(a, b)

While it is probably possible to display linter warnings in some cases, in a lot of cases the type checker ends up indicating what's going on in a way that I think most programmers would probably understand, and despite still definitely being a flaw, I think that it's an easy enough flaw to overcome.

I don't recall what the official stance on this style of function call is, but I figured I would mention it as a possibility. The unique syntax also allows for this open call syntax to be valid, and it looks consistent enough with the others not to seem weird or out of place, especially because there isn't a leading token.

someFunction "abc"
someFunction { a = 123 }
someFunction |a, b| a / b, b / a

More detail on parsing

Going into more depth about how similar this syntax is to existing anonymous function syntax, everything effectively is just a bunch of syntax substitutions in order.

  • | -> function(
  • | -> )
  • Can have type definitions prefixed by : which can either be a single type, or multiple types in parentheses
  • (Start of "body") -> return expr1, expr2, ...
  • (End of expression) -> end

Just for reference again, these two statements would be identical to each other:

local a = function(a: number, b: number, ...: number): (number, number, number...)
	return a, b, ...
end
local a = |a: number, b: number, ...: number|: (number, number, number...) a, b, ...

The problem of determining when the actual expression/statement ends is pretty much exactly the same as the end of an assignment, which is also a solved problem:

local a, b = 1, 2 -- Here we go from expression -> new statement on the next line
local a = || 1, 2 -- Here we go from expression -> new statement on the next line
local a = (|| 1, 2)
local a, b = (|| 1, 2), 3
local a, b = 1, || 2, 3

@zeux
Collaborator

zeux commented Feb 21, 2023

| has prior art in Ruby as well I believe, but the multi-return issue remains no matter what the sequence is for the beginning delimiters. Consider foldLeft(table, |a, b| a + b, 0): how does this parse?

@Gargafield

Gargafield commented Feb 21, 2023

Like this maybe:

foldLeft(table, (|a, b| a + b), 0)
-- Multiple returns
multiple(|a, b, c| (a, b, c), "Hello World")
-- And if you want to use statements you should use the normal anon function syntax:
callback(function(message)
    print(message)
    local something = 10
    return something
end)

@zeux
Collaborator

zeux commented Feb 21, 2023

That's inconsistent with other places in the grammar though, where parenthesizing suppresses multiple value evaluation, e.g. print((math.modf(1, 2)))
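
A small sketch of that existing rule, counting values with `select("#", ...)`. The call `math.modf(1.5)` is chosen here only because it returns two values:

```lua
-- Unparenthesized: both results of math.modf are passed along.
local unwrapped = select("#", math.modf(1.5))

-- Parenthesized: the call is truncated to its first result.
local wrapped = select("#", (math.modf(1.5)))

print(unwrapped) --> 2
print(wrapped)   --> 1
```

Using parentheses to group several lambda return values would therefore give them the opposite meaning from the one they have everywhere else.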

@alexmccord
Member

Personally, I expect foldLeft(table, |a, b| a + b, 0) to be parsed as foldLeft(table, (|a, b| a + b, 0)). Expression parsing is greedy, there's no reason why this should change at all.
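
Written out with the existing long-form syntax, the two readings differ in how many arguments the call receives. A minimal sketch, where `countArgs` is a hypothetical stand-in for foldLeft that only counts its arguments:

```lua
-- `countArgs` and `items` are hypothetical names used only for illustration.
local function countArgs(...)
    return select("#", ...)
end

local items = { 1, 2, 3 }

-- Reading a user likely intends: table, lambda, initial value.
local intended = countArgs(items, function(a, b) return a + b end, 0)

-- Greedy reading: `, 0` becomes a second return value of the lambda.
local greedy = countArgs(items, function(a, b) return a + b, 0 end)

print(intended) --> 3
print(greedy)   --> 2
```

Under the greedy reading the initial value silently disappears from the argument list.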

@zeux
Collaborator

zeux commented Feb 21, 2023

I agree that this is the sensible expectation. But this is also why we've decided to not go ahead with this - it's too error-prone.

@vegorov-rbx vegorov-rbx mentioned this pull request Sep 9, 2023