Skip to content

Conversation

@Kampfkarren
Copy link
Contributor

@Kampfkarren Kampfkarren commented Nov 5, 2021

@alexmccord
Copy link
Contributor

Also missing from this is what to do about them when they are in lvalue assignments, a?.b?.c = 5. We would need to make this a syntax error because it's nonsensical.

Co-authored-by: Alexander McCord <11488393+alexmccord@users.noreply.github.com>
@Kampfkarren
Copy link
Contributor Author

Good catch on not mentioning that, though do we have to make it a syntax error? I was assuming that a?.b = c would be sugar for:

if a ~= nil then
    a.b = c
end

@Kampfkarren Kampfkarren changed the title RFC: nil-navigation operator RFC: Safe navigation operator Nov 5, 2021
@Dekkonot
Copy link
Contributor

Dekkonot commented Nov 5, 2021

For the sake of usability I'd suggest a?.b = 5 be a syntax error. It seems like a footgun waiting to happen to have assignments silently fail like that. I'd much rather have it throw an error!

@Kampfkarren
Copy link
Contributor Author

Works for me, I'll put it in the RFC.

@zeux zeux added the rfc Language change proposal label Nov 6, 2021

```lua
dog?.name
dog?.getName()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. This would be equivalent to (dog?.getName)() under current/intuitive grammar. What was the intent here?

Copy link
Contributor Author

@Kampfkarren Kampfkarren Nov 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent is for getName to be ran only if dog is not nil, which matches how a few languages, including TypeScript, interpret this code. I can see this being ambiguous, but that is the intent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is the expression gets short circuited when the ?. fails. That's why a?.b.c.d.e will work, even if (a?.b).c.d.e wouldn't (because (a?.b) is it's own expression, that gets replaced with nil, then indexed by c.d.e)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the expression boundary though? E.g. a?.b + 1 presumably doesn't short circuit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have to read up on how an expression is expressed in the Luau parser, but no.

I'm not really sure how to word this, or what ambiguities I'm missing.

Copy link
Contributor

@alexmccord alexmccord Nov 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To find all that, we have prior art to help us along most of the way. It's also great because it helps to maintain some consistency in the behavior across languages.

Here's JavaScript's RFC on optional chaining: https://tc39.es/proposal-optional-chaining/

From some light reading, the reason why t?.m() automagically works in JavaScript is because they have ?. start a different branch of the grammar that is identical to normal indexing syntax, but parses in a different way if ?. is involved. Some expert in JS syntax will be able to confirm that.

For example, here's two call sites (I won't get into TemplateStringsArray because that's a red herring):

t.m`hi`;  // this is parsed as a function call that passes some TemplateStringsArray
t?.m`hi`; // this is illegal syntax

As a result of that, the expression (t?.m)() would limit the scope of safe navigation to the expression t?.m. () is a dangerous operation here because it might call some undefined value. The same applies in t?.x + 5, because + operator is not part of the index access/function call operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I think we should be explicit about the semantics of these operators in this RFC so that questions like "why can t?.x + 5 cause a runtime error while t?.x() can't" are obvious from how the RFC specifies the behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll write down in the RFC how binary operators like + operate, as well as how parentheses will operate. Is there anything else I should be thinking about?

The list of valid operators to follow the safe navigation operator would be:

```lua
dog?.name --[[ is the same as ]] if dog == nil then nil else dog.name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intent actually for this to only work with nil values, rather than a general falsiness check? It has odd interactions with operators like and if not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is only for nil. I think that attempting to index a false value should error. I mention the falsiness thing in the dog and dog.name mention of the RFC.


Doing nothing is an option, as current standard if-checks already work, as well as the `and` trick in other use cases, but as shown before this can create some hard to read code, and nil values are common enough that the safe navigation operator is welcome.

Supporting optional calls/indexes, such as `x?[1]` and `x?()`, while not out of scope, are likely too fringe to support, while adding on a significant amount of parsing difficulty, especially in the case of shorthand function calls, such as `x?{}` and `x?""`.
Copy link
Collaborator

@bradsharp bradsharp Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't say x?[index] or x?(...) are fringe cases. The following example comes to mind which uses both:

local function sendMessage(object, message, ...)
  object?.__prototype?[message]?(object, ...)
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was added by request from Luau maintainers, I think it's a reasonable request but perhaps not something that needs to be in this initial RFC.

@@ -0,0 +1,137 @@
# Safe navigation postfix operator (?)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the operator cannot be used without . and : is it even a standalone operator ? or are these really operators ?. and ?:?

For example, C# calls them Null-conditional operators ?. and ?[] and that seems to align better with the behavior that is described here (but ? and . are still separate tokens)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't ? just be sugar for if x then x... else nil? It feels like we could support them and have this be its own operator.

Copy link
Contributor

@alexmccord alexmccord Nov 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vegorov-rbx It's probably just simpler documentation-wise to say that ?. and ?[] is a whole new operator than to explain the internal details of how ? actually works from the parser/codegen standpoint. I imagine that we'd do the same in our documentation.

@bradsharp I don't think that would be a 1:1 sugar. You don't want redundant calls to __index for every ?./?:/?[ in the expression path.

local print_index_mt = {
  __index = function(self, key)
    print(key)
    return rawget(self, "_" .. key)
  end
}

local t = setmetatable({
  _p1 = setmetatable({ _p2 = 5 }, print_index_mt)
}, print_index_mt)

print(t?.p1?.p2)
-- Should print the following:
--> p1
--> p2
--> 5
--
-- And not:
--> p1
--> p1
--> p2
--> 5

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexmccord ah yes I didn't consider that. A more accurate version would be:

local _temp = <lhs>
<result> = if _temp then _temp<rhs> else nil

@zeux
Copy link
Contributor

zeux commented Nov 22, 2021

One thing that was raised internally where this RFC doesn't specify the behavior exactly, and depending on the implementation may be a significant burden, is what happens when the function call returns more than one value. E.g. print(obj:?method()). The option that is easy to support is truncating the return list, so treating this as if obj ~= nil then obj:method() else nil. Preserving both values is likely going to be difficult to model, but is maybe the less surprising behavior.

@Kampfkarren
Copy link
Contributor Author

I think returning all values if not nil, returning one nil otherwise, is the ideal outcome. I'll specify that in the RFC.

Co-authored-by: Alexander McCord <11488393+alexmccord@users.noreply.github.com>
@zeux
Copy link
Contributor

zeux commented Dec 10, 2021

In today's episode of "language design is hard",

We've discussed this internally. The big conundrum remains the semantics of ?: calls and the interaction with multret calls.

The RFC as specified (?: returns all results from the call or a single nil if the object is nil) isn't really implementable without massive/complex rewrites of expression trees in our existing VM. You can't lower an individual call expression via ?: to a conditional because conditionals can't return variadic number of arguments on both sides without extra types of instructions. Moreover, while it would likely be possible to do that by extending the VM to support this use case (which is usually a sign that the feature is becoming too complex...), it makes our future JIT plans more ambiguous as it means we need to support dynamic stack layouts that are more complex than what we strictly need to support today.

IOW I don't think we can commit to supporting this sort of multret handling today - maybe we can do that in the future but right now it feels uncomfortable.

I was originally thinking we can use the simple if lowering where we retain just a single result, but per internal discussion this seems surprising as it doesn't match regular calls while using almost the same syntax, and this RFC as written agrees that it would be surprising.

So we're a bit at the crossroads. We could:

  1. Forge ahead with the RFC as specced. I feel like this will either run into "we need to change the VM to support this small feature" or "compiler handling of this expression is really complicated and results in non-intuitive & duplicated code" + complicate the future JIT plans.

  2. Change the rules to "safe calls only return a single result". This would be inconsistent with regular calls which is surprising and arbitrarily introduces a wart in the language which feels bad.

  3. Remove ?: from this RFC, which would further descope it (with the question of whether this feature is still useful without calls being open - we have some team members think that it's still very useful to have just the indexing, but currently don't have data to validate this hypothesis)

  4. Just abandon the whole idea.

I personally feel like 3 is the best option because it seems like safe navigation is still useful and it's semantically unambiguous, and this option keeps the road open for future support for calls if we decide to implement it.

If we do go with 3 then we'd need to prohibit a?.b(args) at a syntax level because it has the same issue (if we implement the call) and is a guaranteed runtime error if we use naive semantics that follows from safe navigation ((if a == nil then nil else a.b)(args)) to make sure we actually keep the path open for future call support that may or may not happen.

Thoughts welcome.

@Kampfkarren
Copy link
Contributor Author

I am open to going for option 3 in terms of implementation, but if the problems with ?: and ?. are implementation details and not design details, is it not also acceptable to merge the RFC as is, while implementing it piecemeal (initial implementation only being x?.y, which is useful), and reserving a?.b(args)?

@zeux
Copy link
Contributor

zeux commented Dec 10, 2021

Our design process isn't really disjoint from implementation - for example, if we discover issues during implementation we may pull the RFC post-factum completely (as happened with #34) or revise it. If an RFC can only be partially implemented in the short term we won't be able to mark it as implemented for a while which is a sign that the RFC should be split in two.

Cases like this occasionally come up after the RFC gets approved, but when we identify these cases during the RFC process it's better to correct them ahead of time.

@Kampfkarren
Copy link
Contributor Author

Fair enough. I'll get to cutting out ?: and ?.() when I can.

@Kampfkarren
Copy link
Contributor Author

Double-reminder to self to finish that--been busy

@Kampfkarren
Copy link
Contributor Author

RFC has been updated to focus on x?.y.

Copy link
Contributor

@zeux zeux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@zeux zeux merged commit 5e76489 into luau-lang:master Dec 20, 2021
@Kampfkarren Kampfkarren deleted the rfc-nil-navigation-operator branch December 20, 2021 23:42
@alexmccord
Copy link
Contributor

alexmccord commented Jan 6, 2022

I'd like to raise awareness about the fact that ?. operator will create an annoying footgun when performing a child lookup on a Roblox instance. One particularly nasty case is player.Character?.HumanoidRootPart?.Position which looks safe at a glance, but actually isn't! If by some happenstance there isn't a child named HumanoidRootPart, then an error is thrown instead of short-circuiting with nil.

Something we can probably do is create a lint pass for this case where you generate a warning given an expression <expr>.SomeChild? under the following condition:

  1. The arbitrary expression is either a ClassTypeVar or a UnionTypeVar that contains ClassTypeVar.
  2. The ClassTypeVar itself does not contain the property SomeChild.

However, this lint pass cannot help in all situations. Maybe the script is in non-strict mode and so most of the types are any, causing this lint pass to be ineffective. Or maybe you wrote a function similar to the following snippet, which wouldn't satisfy the above condition because player is not a Player but is a polymorphic interface { Character: { HumanoidRootPart: { Position: a }? }? }? which isn't a ClassTypeVar.

local function GetPlayerPosition(player)
  return player?.Character?.HumanoidRootPart?.Position
end

@Kampfkarren
Copy link
Contributor Author

Oh, that's actually disappointingly problematic! I didn't even think about that, but that's a fairly strong driving force behind the feature 😦

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rfc Language change proposal

Development

Successfully merging this pull request may close these issues.

9 participants