Skip to content

Conversation

leonerd
Copy link
Contributor

@leonerd leonerd commented Aug 1, 2022

Evaluate the LHS expression giving the code ref, before evaluating the RHS expression giving the args.

This change does introduce a new meaning of the OPf_SPECIAL flag to the OP_ENTERSUB op, which might confuse thirdparty optree walking modules.

An alternative idea which could be explored instead, is to evaluate the LHS into a pad temporary first; making such syntax equivalent to a rewrite:

EXPR->(ARGS)
do { my $tmp = EXPR; $tmp->(ARGS) }

This would make the optree and the pad a little bit larger though.

Fixes #19997

@dur-randir
Copy link
Member

dur-randir commented Aug 3, 2022

For XS accessor generation modules this'd break the following use case (extensively present in DBIx::Class)

my $meth = $obj->can('method');
$meth->($arg1, $arg2);

With the current code it's possible to peek at the top of the stack to determine whether or not it's 'ours' call site (so should the optree optimisation be kept or not). As I see for normal method calls, aka $obj->method(...), it's still the case, but for cases like this the object is now at the bottom of the stack with no easy access. Furthermore, it makes such calls slower for no reason by introducing stack copying.

And the case this tries to fix has never been reported in all those years, so compared to DBIx::Class it's a bit less used.

@garu
Copy link
Contributor

garu commented Aug 6, 2022

Hi @leonerd! Maybe I'm doing something wrong, but after installing your branch I am still seeing the same behavior:

tmp/perl5  [entersub-evaluation-order 
▶ ./perl -v
This is perl 5, version 37, subversion 3 (v5.37.3 (v5.34.0-RC1-2556-g09b4aff6b9)) built for darwin-2level

▶ ./perl -Ilib -E 'my $x; $x->(die q(oops!))'
oops! at -e line 1.

▶ ./perl -Ilib -E 'does_not_exist(die q(oops!))'
oops! at -e line 1.

@demerphq
Copy link
Collaborator

demerphq commented Aug 6, 2022 via email

@garu
Copy link
Contributor

garu commented Aug 8, 2022

That doesnt look right to me. Why does it say two versions there?

I don't know. All I did was:

$ git clone https://github.com/leonerd/perl5.git
$ cd perl5
$ git checkout entersub-evaluation-order
$ sh Configure -de -Dusedevel
$ make
$ make test
$ ./perl -v
This is perl 5, version 37, subversion 3 (v5.37.3 (v5.34.0-RC1-2556-g09b4aff6b9)) built for darwin-2level
(...)
$ ./perl -Ilib -E 'my $x; $x->(die 42)'
42 at -e line 1.

What did I miss?

@haarg
Copy link
Contributor

haarg commented Aug 8, 2022

v5.34.0-RC1-2556-g09b4aff6b9 is the git describe output for the commit being built. The name it uses is based on the closest tag available. @leonerd's fork doesn't have all of the tags. The latest it has is v5.34.0-RC1.

@leonerd
Copy link
Contributor Author

leonerd commented Aug 8, 2022

$ ./perl -Ilib -E 'my $x; $x->(die 42)'
42 at -e line 1.

What did I miss?

I wouldn't expect that to behave any differently in my branch. It evaluates the CV (gets undef) then evaluates the args (which throws). It hasn't got as far as checking for definedness of the CV, because the evaluation already died.

My branch does affect the behaviour of code like this:

(do { die "abc" })->(die "def")

On my branch it fails "abc"; on current bleadperl it fails "def".

@demerphq
Copy link
Collaborator

demerphq commented Aug 8, 2022 via email

@leonerd
Copy link
Contributor Author

leonerd commented Aug 9, 2022

Wasn't part of the point to ensure there is something to pass the arguments to before we evaluate the arguments?

Partly. That could be a useful next step. The first step was just to get them evaluated in the right order so that kind of thing could be checked. This was also a first step towards having the optional chaining operator work properly here; ensuring it doesn't needlessly evaluate arguments if the CV invocant is undef.

@garu
Copy link
Contributor

garu commented Aug 9, 2022

For XS accessor generation modules this'd break the following use case (extensively present in DBIx::Class)

@dur-randir does DBIx::Class (or relevant dependencies) provide a test that fails when building it with this branch? It could be used to make sure any changes only get accepted if they don't break it. Apologies in advance if I misunderstood something!


PERL_STATIC_INLINE void
Perl_cx_pushsub(pTHX_ PERL_CONTEXT *cx, CV *cv, OP *retop, bool hasargs)
Perl_cx_pushsub(pTHX_ PERL_CONTEXT *cx, CV *cv, OP *retop, bool hasargs, U8 op_private)
Copy link

@bram-perl bram-perl Aug 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An observation: the value of PL_op->op_private is now passed to the function when it's relevant.
That gives the impression that PL_op is no longer/should no longer be used in this function.

But it still is: define of CX_PUSHSUB_GET_LVALUE_MASK:

#define CX_PUSHSUB_GET_LVALUE_MASK(func) \
        /* If the context is indeterminate, then only the lvalue */	\
        /* flags that the caller also has are applicable.        */	\
        (								\
           (PL_op->op_flags & OPf_WANT)					\
               ? OPpENTERSUB_LVAL_MASK					\
               : !(PL_op->op_private & OPpENTERSUB_LVAL_MASK)		\
                   ? 0 : (U8)func(aTHX)					\

It uses PL_op->op_private directly.

I don't know if this is a problem and/or if this is fixable.

This looks a bit: on one hand: for the cases where PL_op->op_private shouldn't be used in now uses an explicit 0; but on the other hand even in those cases it (might) still use PL_op->op_private.

@leonerd leonerd force-pushed the entersub-evaluation-order branch from 09b4aff to f98b53f Compare September 16, 2022 16:39
@leonerd leonerd force-pushed the entersub-evaluation-order branch from f98b53f to 4fe2b9e Compare September 16, 2022 16:48
@leonerd
Copy link
Contributor Author

leonerd commented Sep 16, 2022

Rebased on current blead. Should be ready for review now.

@leonerd leonerd force-pushed the entersub-evaluation-order branch 3 times, most recently from f5f501d to 4b88c45 Compare September 17, 2022 10:03
Avoids the incredibly subtle and hard-to-find "action-at-a-distance" by
having that function read it directly out of PL_op. See footnote in

  Perl#19997 (comment)

Also avoids having it be miscalculated and fixed elsewhere, by passing
zero in directly.
On visual inspection it would appear that the LHS expression that gives
the code reference should be evaluated first, before the arguments. This
was previously not the case, because of the order values were placed on
the stack.

By using the OPf_SPECIAL flag on OP_ENTERSUB to mean "expect to find the
CV first, rather than last" and swapping the optree order for this
syntax, we can ensure this becomes the case.

Fixes Perl#19997
@leonerd leonerd force-pushed the entersub-evaluation-order branch from 7994edd to 3c356cc Compare September 21, 2022 10:25
@leonerd
Copy link
Contributor Author

leonerd commented Sep 27, 2022

Ping all - anyone want to review this?

Copy link
Contributor

@richardleach richardleach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you just update the top item on the mark stack (set to svp+1), instead of doing this Move?
(I haven't thought this through, just musing.)

@leonerd
Copy link
Contributor Author

leonerd commented Sep 27, 2022

For XS accessor generation modules this'd break the following use case (extensively present in DBIx::Class)
...

New devel perl will do things that CPAN modules that poke at optrees will need to be aware of. This is always true

If CPAN modules don't get updated to cope then they won't like 5.38's optrees. But that is already true given a bunch of other changes for undef/emptyhv/padsv_store, etc... This was true in 5.36 because of defer and finally; 5.34 because of try/catch, 5.32 because of isa, ...

With the current code it's possible to peek at the top of the stack to determine whether or not it's 'ours' call site (so should the optree optimisation be kept or not). As I see for normal method calls, aka $obj->method(...), it's still the case, but for cases like this the object is now at the bottom of the stack with no easy access. Furthermore, it makes such calls slower for no reason by introducing stack copying.

It's still easy to find. You just have to consult the markstack as well in these cases - on 5.37.x if the OPf_SPECIAL is set. I'd be happy to suggest relevant changes to any CPAN module which needs updating to cope with this.

@leonerd
Copy link
Contributor Author

leonerd commented Sep 27, 2022

Could you just update the top item on the mark stack (set to svp+1), instead of doing this Move?
(I haven't thought this through, just musing.)

That was my first thought. I did that and it worked for the enter'ed sub itself, but then it breaks later on stack cleanup because the pushed CV itself is still on the stack. It isn't noticable in void-returning cases but it gets in the way in scalar- or list-returning ones. That situation could probably be fixed by adjusting the code in stack unwind but then it touches a bunch more places and becomes a lot more subtle in cleanup and CPAN module breakage. I found the Move() solution much simpler.

@richardleach
Copy link
Contributor

That was my first thought. I did that and it worked for the enter'ed sub itself, but then it breaks later on stack cleanup because the pushed CV itself is still on the stack. It isn't noticable in void-returning cases but it gets in the way in scalar- or list-returning ones.

Makes sense. I'm not going to suggest adding a second PUSHMARK, though I guess it could be an option.

pp_hot.c Outdated
Comment on lines 5180 to 5185
/* TODO: This is currently horribly inefficient. It can likely be made
* a lot better by adjusting the code lower down instead
* For now we'll just move all the stack args down one position
*/
PERL_UNUSED_RESULT(POPs);
/* Move all the stack args down one position to where everyone expects
* them */
Size_t count = SP - mark;
Move(svp+1, svp, count, SV *);

while(svp <= SP) {
svp[0] = svp[1];
svp++;
}
PERL_UNUSED_RESULT(POPs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is slightly more efficient, but the old version compiled down to memmove() too (details below).

I do wonder if changing this is worth the performance cost,

Tested with:

perf stat ./miniperl -e 'my $foo = sub  { }; my @args = ( 1 .. 10 ); for (1..1_000_000) { $foo->(@args) }'

After 3 trials and taking the minimum of each set (perf includes interrupt processing) this appears to make the test run about 4% slower in instructions and user time.

Raw performance numbers

Before the patch:

       225,616,164      cycles                    #    4.529 GHz                    
       777,104,152      instructions              #    3.44  insn per cycle         
       155,617,016      branches                  # 3123.946 M/sec                  
            20,608      branch-misses             #    0.01% of all branches    
       0.050176000 seconds user

       247,633,809      cycles                    #    4.360 GHz                    
       777,100,981      instructions              #    3.14  insn per cycle         
       155,617,656      branches                  # 2739.773 M/sec                  
            20,560      branch-misses             #    0.01% of all branches    
       0.057839000 seconds user

       230,714,487      cycles                    #    4.556 GHz                    
       777,107,478      instructions              #    3.37  insn per cycle         
       155,618,909      branches                  # 3073.204 M/sec                  
            20,558      branch-misses             #    0.01% of all branches    
       0.050940000 seconds user

With the patch:

       245,858,270      cycles                    #    4.591 GHz                    
       812,074,404      instructions              #    3.30  insn per cycle         
       165,612,897      branches                  # 3092.589 M/sec                  
       20,359      branch-misses             #    0.01% of all branches
       0.053863000 seconds user

       238,782,598      cycles                    #    4.598 GHz                    
       812,090,195      instructions              #    3.40  insn per cycle         
       165,614,388      branches                  # 3188.989 M/sec                  
            20,394      branch-misses             #    0.01% of all branches    
       0.052243000 seconds user

       243,833,810      cycles                    #    4.404 GHz                    
       812,133,770      instructions              #    3.33  insn per cycle         
       165,624,467      branches                  # 2991.232 M/sec                  
            20,408      branch-misses             #    0.01% of all branches    
       0.056446000 seconds user
Generated code for this block

Before:

L3123:
        movq    PL_markstack_ptr(%rip), %rdx
        movslq  (%rdx), %rcx
        movq    PL_stack_base(%rip), %rdx
        leaq    (%rdx,%rcx,8), %rsi
        movq    8(%rsi), %rbp
        leaq    8(%rsi), %rdi
        movq    %rbp, 16(%rsp)
        cmpq    %rdi, %r13
        jb      .L2943
        subq    $16, %rax
        subq    %rsi, %rax
        addq    $16, %rsi
        shrq    $3, %rax
        leaq    8(,%rax,8), %rdx
        call    memmove@PLT
        jmp     .L2943

After:

.L3123:
        movq    PL_markstack_ptr(%rip), %rax
        movslq  (%rax), %rdx
        movq    PL_stack_base(%rip), %rax
        leaq    (%rax,%rdx,8), %rsi
        movq    %rbx, %rdx
        movq    8(%rsi), %rbp
        leaq    8(%rsi), %rdi
        movq    %rbp, 16(%rsp)
        subq    %rsi, %rdx
        js      .L3000
        addq    $16, %rsi
        leaq    -8(%rbx), %r13
        call    memmove@PLT
        jmp     .L2944

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder if changing this is worth the performance cost,
...
After 3 trials and taking the minimum of each set (perf includes interrupt processing) this appears to make the test run about 4% slower in instructions and user time.

Hrm. Indeed - that sounds like a bit of a hit to be taking here :(

@iabyn
Copy link
Contributor

iabyn commented Sep 29, 2022 via email

@leonerd
Copy link
Contributor Author

leonerd commented Sep 30, 2022

  1. I'm vaguely nervous that this is touching heavily in areas related to my current and next projects: making the stack reference counted, and doing more signature work (especially eliminating @_), both of which get fairly complex around pp_entersub(). So if possible I'd prefer this work was deferred.

It is a bit of a heavy hit; additionally @tonycoz's findings above on the performance impact of it suggests that this is not a good way to implement it.

Also, it makes me very twitchy the idea of messing with long-standing behaviour which isn't a bug, and I dislike the idea of adding any more conditionals and other work to the hot pp_entersub() code paths unless really necessary.

While it currently seems like a really obscure bug, it becomes quite important when we consider RFC 0021 - Optional Chaning. The idea is to have a new ?-> operator that shortcircuits if the LHS is undefined.

Part of that shortcircuiting means not evaluating any of the righthand side if the lefthand is undefined. For structure accesses such as array and hash element lookups, these can already be easily implemented:

my $x = $aref?->[ gen_index() ];
my $y = $href?->{ gen_key() };

In these cases, the function inside the brackets does not get called if the reference on the LHS is undefined; the whole thing shortcircuits nicely.

However, in function call cases, the current interpreter implementation first evaluates the arguments to the function before it evaluates the function itself; thus meaning:

my $z = $subref?->( gen_args() );

would always invoke gen_args() even if $subref is not defined. It's an important part of the design of optional chaining that it should skip this, so this won't do.

Now, while we could define an entire new op for this case (OP_ENTERSUB_OPTIONAL?) that takes its arguments in the other order, it does then lead to the "quirk" that given the following two lines of very-similar looking code, they'd actually behave so differently:

my $i = A()->( B() );
my $j = A()?->( B() );

If we don't change the evaluation order of regular entersub, that would result in perl calling the functions B(); A(); A(); B(); which seems Surprising.

This is all why I feel it is important to make entersub run in a "less surprising" manner.

@leonerd
Copy link
Contributor Author

leonerd commented Sep 30, 2022

Plus I should add that all of the comments above about entersub on coderefs also apply to method calls:

my $z = $obj?->meth( gen_args() );

my $i = objA()->meth( B() );
my $j = objA()?->meth( B() );

@richardleach
Copy link
Contributor

Is it worth trying out the alternative suggestion (reproduced below) from the first post, just to see how that performs?

An alternative idea which could be explored instead, is to evaluate the LHS into a pad temporary first; making such syntax equivalent to a rewrite:

EXPR->(ARGS)
do { my $tmp = EXPR; $tmp->(ARGS) }

@tonycoz
Copy link
Contributor

tonycoz commented Oct 6, 2022

While it currently seems like a really obscure bug, it becomes quite important when we consider RFC 0021 - Optional Chaning. The idea is to have a new ?-> operator that shortcircuits if the LHS is undefined.

This could be handled by only doing the swapping for optional chaining, so only optional chaining pays the cost of the the stack rearrangement (or uses @richardleach's idea with the temp

It would be nice if we could just change the order of arguments to entersub, but I think that would break half of the XS on CPAN.

Now, while we could define an entire new op for this case (OP_ENTERSUB_OPTIONAL?) that takes its arguments in the other order, it does then lead to the "quirk" that given the following two lines of very-similar looking code, they'd actually behave so differently:

my $i = A()->( B() );
my $j = A()?->( B() );

If we don't change the evaluation order of regular entersub, that would result in perl calling the functions B(); A(); A(); B(); which seems Surprising.

I do wonder if changing that evaluation order for the existing operator will result in incompatible changes in behaviour, whether noisy (caught by a test), or silently breaking something.

I can see why consistent and left-to-right* order is desirable, but I'm not sure we can get it without breaking something.

  • perlop claims "In fact Perl has a general rule that the operands of an operator are evaluated in left-to-right order"

@demerphq
Copy link
Collaborator

demerphq commented Feb 8, 2023

This is in conflict and has not been touched for a while. @leonerd what are your plans here?

@jkeenan
Copy link
Contributor

jkeenan commented Jul 31, 2023

This is in conflict and has not been touched for a while. @leonerd what are your plans here?

@leonerd, ^^

@leonerd
Copy link
Contributor Author

leonerd commented Sep 18, 2023

In short: I have no idea.

In slightly longer: The underlying problem still exists, and would still get in the way of attempting to create an optional-chaining function or method call operator. I don't know whether the approach taken here would be the best solution or if it's better to attack the problem from some other direction.

@jkeenan
Copy link
Contributor

jkeenan commented Nov 22, 2023

In short: I have no idea.

In slightly longer: The underlying problem still exists, and would still get in the way of attempting to create an optional-chaining function or method call operator. I don't know whether the approach taken here would be the best solution or if it's better to attack the problem from some other direction.

My take on this pull request is that it's not going to go forward. I recommend that we close it, then substitute either a discussion of the issues on the P5P list or the opening of a GH issue with a succinct discussion of the problem informed by the comments in this ticket.

@leonerd

@jkeenan jkeenan added the Closable? We might be able to close this ticket, but we need to check with the reporter label Nov 22, 2023
@jkeenan
Copy link
Contributor

jkeenan commented Jul 13, 2024

In short: I have no idea.
In slightly longer: The underlying problem still exists, and would still get in the way of attempting to create an optional-chaining function or method call operator. I don't know whether the approach taken here would be the best solution or if it's better to attack the problem from some other direction.

My take on this pull request is that it's not going to go forward. I recommend that we close it, then substitute either a discussion of the issues on the P5P list or the opening of a GH issue with a succinct discussion of the problem informed by the comments in this ticket.

@leonerd

No one expressed opposition to my recommendation last November that we close this p.r., so doing so now. Please discuss underlying issues on list. Thanks.

@jkeenan jkeenan closed this Jul 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Closable? We might be able to close this ticket, but we need to check with the reporter hasConflicts

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Evaluation order of EXPR->(...) is confusing

10 participants