Skip to content

Conversation

garu
Copy link
Contributor

@garu garu commented Jul 13, 2022

No description provided.

@duncand
Copy link

duncand commented Jul 13, 2022

@garu I agree with everything you have written in this pull request, and have no improvements to suggest. In particular I agree with this being explicit rather than implicit, naming the operator ?-> (its not just about ternary but because the question mark is a good mnemonic for a conditional action), always consistently using undefinedness as the test, and always consistently using the empty list as the default result. Good job! I recommend this be merged.

@duncand
Copy link

duncand commented Jul 13, 2022

Note, you have a typo in your example code, a } where you meant a ].

@abraxxa
Copy link

abraxxa commented Jul 13, 2022

I hope it's not too late to ask this: why not add no feature 'autovivification' instead? Isn't that why you need to check every level of nesting?

@KES777
Copy link

KES777 commented Jul 13, 2022

@abraxxa Look here. ?-> belongs not only to nested data structures, but also to function calls. (But I would rather prefer two characters ~>)

@abraxxa
Copy link

abraxxa commented Jul 13, 2022

Thanks, I‘ve overlooked the method call example.

@pragma-
Copy link

pragma- commented Jul 13, 2022

@abraxxa ... (But I would rather prefer two characters ~>)

@abraxxa: If two characters are preferred, I'd strongly lean towards ?> instead because ? is widely understood as the Optional/Maybe qualifier/modifier. But I think ?-> is a perfectly fine operator and I agree with the author's rationale for it (~> can be easily confused with ->, especially on small fonts; ~> is used by smart-match; ?-> looks exactly like -> but with the widely-recognized ? prepended; etc).

@garu: But I'm not so sure that I agree that ?-> needs to be made explicit in each subsequent chain. That seems rather excessively verbose. It seems to me that once you choose to optionally-chain, you very likely intend to do this for every subsequent chain. I'd very, very much prefer to write:

      $val = $data?->{deeply}{nested}[0]{data}{value}

instead of:

     $val = $data?->{deeply}?->{nested}?->[0]?->{data}?->{value};

@hernan604
Copy link

Thats interesting.

or maybe ??-> for the rest of dereferencing ie.
$val = $data??->{deeply}{nested}[0]{data}{value}

and ?-> for "this chain only" ie.
$val = $data?->{deeply}?->{nested}?->[0]?->{data}->{value};

@duncand
Copy link

duncand commented Jul 14, 2022

I think its better to keep it simple that ?-> affects only the specific step in the chain its used with, because that gives us the flexibility to have different behaviour in different places, eg sometimes we want there to be an error if the element doesn't exist, and other times we want it to be the magic empty list.

Likewise, I disagree with the idea of no feature 'autovivification' as a substitute for this new operator, because reasonably one may want to use both versions in the same file. Maybe that feature is still useful, but on its own merits independent of replacing this operator.

@duncand
Copy link

duncand commented Jul 14, 2022

Come to think of it, if this doesn't already happen, an additional complementary operator, say !->, would be good to force an error when an element doesn't exist, again instead of it auto vivifying, because sometimes we want a stricter this must be here interpretation.

@abraxxa
Copy link

abraxxa commented Jul 14, 2022

feature is scoped so you can disable it just in those blocks where you don‘t want autovivication. That case could also trigger an exception, think of it as strict mode for arrays/hashes, just like what happens when you call a nonexistent method on an object.

I‘d prefer that over a bulkload of additional operators.

@duncand
Copy link

duncand commented Jul 14, 2022

feature is scoped so you can disable it just in those blocks where you don‘t want autovivication. That case could also trigger an exception, think of it as strict mode for arrays/hashes, just like what happens when you call a nonexistent method on an object.

I‘d prefer that over a bulkload of additional operators.

Your suggestion doesn't help when using both versions of the operation in the same expression or statement, eg like how would feature handle $foo->{bar}->{baz}?->{quux}? This is a clear example of the operator method being superior.

@abraxxa
Copy link

abraxxa commented Jul 14, 2022

What are you expecting to happen from this construct?

@duncand
Copy link

duncand commented Jul 14, 2022

I expect that some of the lookups will raise an error if the element doesn't exist, and some of the other lookups will instead substitute the empty list.

@abraxxa
Copy link

abraxxa commented Jul 14, 2022

And this error on missing index should be triggered by what? Currently Perl would autovivify it, my proposed no feature ‚autovivification‘ would turn that into an exception instead.
So it would be a combination of both.

@duncand
Copy link

duncand commented Jul 14, 2022

And this error on missing index should be triggered by what? Currently Perl would autovivify it, my proposed no feature ‚autovivification‘ would turn that into an exception instead.
So it would be a combination of both.

So we would seem to be in agreement then that these should be two complimentary features rather than an either-or choice. My issue before with your suggestion was that having no feature autovivification would be a total replacement for the operator.

@abraxxa
Copy link

abraxxa commented Jul 14, 2022

Yes, I think the feature is a better alternative to the !-> operator.

@haarg
Copy link
Contributor

haarg commented Jul 14, 2022

An autovivification feature that could be disabled would be nicer for dealing with deep data structures, but it doesn't help with method chains.

@garu
Copy link
Contributor Author

garu commented Jul 14, 2022

Hi everyone!

Thank you for all the input provided so far, I really appreciate it and will accommodate what I can on an update to this PR later today. Meanwhile, let me try and address some of your comments below.

This RFC is NOT just about autovivification, it's about short-circuiting the dereference arrow on undefined values. You can already disable autovivification via the no autovivification pragma, not currently in core but available on CPAN. Also, please note that hash/array references don't always silently fail/autovivify, they only do it when the undefined value is inside a variable. If a function or method returns an undefined value, it will die:

    > perl -Mv5.36 -E 'my $x; my $y = $x->{foo}'         # $x becomes {}, $y is undef
    > perl -Mv5.36 -E 'sub foo {} my $y = foo()->{bar}'  # dies at runtime
    > perl -Mv5.36 -E 'sub foo {} my $y = foo()->[0]'    # dies at runtime

Even if this feels like a consistency bug in perl itself, that behaviour has been around for so long that backwards compatibility dictates it probably won't be fixed, so the optional chaining operator provides yet another advantage (but, again, not just that).

Providing the definedness check on arrow operators via a pragma is unfortunately not good enough as it (a) doesn't give you enough granularity in, say, $o?->m1->m2?->m3; (b) promotes action at a distance; and (c) even if only enabled lexically, provides little to no gain in terms of reading/writing, as {no feature X; $o->m1} vs if (defined $o) { $o->m1 }.

Regarding the need to be explicit in chained calls instead of having it "toggle" the feature, I understand and agree that it would be nicer to write (in fact, the original draft of this RFC had this feature), but after looking at a few examples and discussing it on p5p, it was clear that it could be harder to read, and make it easier to miss some errors (since it would be so easy to just leave it on after the first ?->). Short-circuiting instead of runtime exceptions could hide bugs, so we want developers to make a conscious and explicit choice every time they want this feature - even if writing the code becomes a little more tedious. This decision is endorsed by the fact that all implementations of the null-safe / optional chain operator in other languages also require it to be explicit. If it serves as a consolation (it did to me), CoffeeScript usage of this operator suggests that less than 20% involves more than 2 chained calls.

Adding a modifier to make the optional chain "greedy" (like ??->) and a non-null assertion operator (like !->) are interesting ideas that could be explored in the future by another RFC, but we need to do this with baby steps otherwise we risk falling yet again on the trap of feature creep and paralysis.

Finally, I'd like to invite everyone reading this to join in on this and other design discussions on the p5p mailing list, reserving this space for issues with the PR itself (typos, need for clarification on a certain section/example, etc). That way everyone is able to contribute and we keep everything centralized <3

* typo fix on example code;
* fix paragraph that approaches CPAN emulation of the feature;
* move mention of optional chains in string interpolation / regexes
  from "Open Issues" to "Future Scope", effectively addressing all
  open issues (and added nod to template literals);.
* expand "rejected ideas" FAQ to include rationale over lexical pragmas
  and related ideas, while also deepening the explanation of why it
  should be explicit.
@leonerd
Copy link
Contributor

leonerd commented Jul 15, 2022

A thought occurs to me - after a ?-> has returned undef, every new chain after that will have nothing to look at. There's no point not making all the subsequent ones optional.

$foo?->{a}{b}{c}

With the behaviour as described (where subsequent chainings are mandatory) in the above example - if $foo is undef, then trying to fetch the a key out of it will just yield undef. At which point every subsequent chaining would fail anyway.

So surely it makes sense that after a ?-> every subsequent chained access that doesn't specify an operator arrow would be an optional one.

@duncand
Copy link

duncand commented Jul 15, 2022

A thought occurs to me - after a ?-> has returned undef, every new chain after that will have nothing to look at. There's no point not making all the subsequent ones optional.

$foo?->{a}{b}{c}

With the behaviour as described (where subsequent chainings are mandatory) in the above example - if $foo is undef, then trying to fetch the a key out of it will just yield undef. At which point every subsequent chaining would fail anyway.

So surely it makes sense that after a ?-> every subsequent chained access that doesn't specify an operator arrow would be an optional one.

You are right, the optional ?-> can only reasonably follow mandatory -> in mixed settings, and ?-> can't reasonably precede ->. However, I would still make the syntax explicit at each step so it is more clear what is going on. And in addition we can add a new parse/compile-time error if anyone tries to put a -> after a ?-> to point out if they make the mistake.

@garu
Copy link
Contributor Author

garu commented Jul 15, 2022

A thought occurs to me - after a ?-> has returned undef, every new chain after that will have nothing to look at. There's no point not making all the subsequent ones optional.

Maybe I misunderstood, it feels like you're considering just the case where the optional chain returned undef. What if it doesn't?

my $foo = {};
my $c = $foo?->{a}{b}{c};

After the code above is executed, I expect $foo to be { a => { b => {} } }. However, if we write it as:

my $foo = {};
my $c = $foo?->{a}?->{b}?->{c};

Then I expect $foo to still be just {}. This may have interesting implications in code that relies on autovivification for some reason, and definitely for bug hunting (like when a JSON response from a third-party service breaks their contract and stops sending the key you want).

As for mixing mandatory and optional chains, let me give a real-world example: a certain famous global payment gateway's REST API provides a webhook that sends a POST request to your web application with JSON. That object, if present, may or may not have a "transactions" key. If it does, it is (by "contract") supposed to be an array with at least one element which always contains a "related_resources" key, which itself should be an array with at least one element and that element may or may not have a "refund" key containing refund data. So to see if you have the refunded successfully, you have to:

    my $refund_state;
    if (defined $json
        && defined $json->{transactions}
        && defined $json->{transactions}[-1]{related_resources}[-1]{refund}
    ) {
        $refund_state = $json->{transactions}[-1]{related_resources}[-1]{refund}{state};
    }

With the new optional chaining operator, it becomes:

   my $refund_state = $json?->{transactions}?->[-1]{related_resources}[-1]{refund}?->{state};

So while "transactions" may or may not be there, if it is, it must be an arrayref of at least 1 element which itself must contain a "related_resources" key, and so on, until you reach the (optional) "refund" sub-object. If for example they break their API contract and fail to provide a "related_resources", you want to know.

@leonerd
Copy link
Contributor

leonerd commented Jul 25, 2022

Folks: While implementing this I have encountered a small problem, to do with what happens if there are side-effects on the RHS of the optional operator.

In normal method calls, as well as hash or array element lookups, the arguments are all evaluated in "left-to-right" order, as you would read the code. I have an implementation in which all of these are safe - nothing dies:

undef?->method( die "method call" );

undef?->{die "hash elem"};

undef?->[die "array elem"];

But for calling subrefs, this isn't the case in normal perl:

eval: my @ord; (do { push @ord, "inv"; sub {} })->(do { push @ord, "arg1"; 1 }, do { push @ord, "arg2"; 2 }); [@ord]
[ 'arg1', 'arg2', 'inv' ]

This shows that the arguments are evaluated before the invocant, for EXPR->(...) syntax. This is problematic because it means that any side-effects in the arguments would become apparent before we evaluate the invocant to determine whether we even should:

undef?->(die "subref call")

This throws an exception from the side-effects of evaluating the arguments before it got as far as evaluating the invocant.

A potential fix for this would be to evaluate the expressions in a different order for EXPR?->(...) as compared regular (i.e. non-optional) EXPR->(...).


To be honest it's rather a mess that existing Perl already works this way:

eval: (do { say "First"; {} })->{say "Second"}
First
Second

eval: (do { say "First"; sub {} })->(say "Second")
Second
First

@duncand
Copy link

duncand commented Jul 25, 2022

I feel that the best option would be to change the behaviour of -> to be left to right, if there aren't any downsides of doing so.

@garu
Copy link
Contributor Author

garu commented Aug 15, 2022

@leonerd Thank you so much for all the time and effort you are putting into this. I think this is going to be a great addition to Perl and I miss it almost every time I code. I am following all the discussion regarding evaluation order on Perl/perl5#19997 and friends, and while I still strongly believe it is a bug and should be fixed, I wonder if we can compartmentalize and move forward with the implementation of ?-> with arguments being evaluated the way they currently are. This way we can get to optional chaining sooner and, when the argument parsing issue is resolved, we get it in optional chains for free.

What do you think?

@leonerd
Copy link
Contributor

leonerd commented Aug 19, 2022

I wonder if we can compartmentalize and move forward with the implementation of ?-> with arguments being evaluated the way they currently are. This way we can get to optional chaining sooner and, when the argument parsing issue is resolved, we get it in optional chains for free.

Yeah that seems fine. It affects function calls (both subrefs and method calls), but hash and arrayref access would be fine to begin with.

We already have an RFC 0018, so we've allocated a new number 0021 for this. Please rename (and add to the document header) and then we should be good to go.

@garu
Copy link
Contributor Author

garu commented Aug 19, 2022

We already have an RFC 0018, so we've allocated a new number 0021 for this. Please rename (and add to the document header) and then we should be good to go.

Done! Thank you!

@garu garu changed the title RFC 0018: optional chaining RFC 0021: optional chaining Aug 19, 2022
@garu
Copy link
Contributor Author

garu commented Sep 13, 2022

Hi @leonerd, @book and @rjbs! I understand you are all super busy, I'm just wondering if there is anything I can do to help move this RFC forward.

Thank you!

@leonerd
Copy link
Contributor

leonerd commented Sep 13, 2022

I believe it's awaiting the "Yes do it" rubber-stamp. I don't really have any objections to doing that. If nobody else does perhaps we say "Go do it".?

@leonerd
Copy link
Contributor

leonerd commented Sep 14, 2022

I should also clarify, nothing is stopping anyone having a go at starting implementing this. Feel free to go ahead :)

I have a mostly-working hack at it via a CPAN module and PL_infix_plugin; which is where I encountered the problem of entersub's evaluation order.

@rjbs
Copy link
Member

rjbs commented Sep 24, 2022

The status of this RFC is "waiting for an implementation to be produced."

@tonycoz
Copy link

tonycoz commented Sep 29, 2022

And this error on missing index should be triggered by what? Currently Perl would autovivify it, my proposed no feature ‚autovivification‘ would turn that into an exception instead.

Perl/perl5#18650

there didn't seem to be much interest.

@rjbs
Copy link
Member

rjbs commented Sep 30, 2022

(PSC continues to discuss this RFC, largely in the context of being blocked by Paul's comment here: #23 (comment) )

@rjbs
Copy link
Member

rjbs commented Oct 7, 2022

In status Implemented, so merged.

@rjbs rjbs merged commit 5aaa892 into Perl:main Oct 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants