Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent block scoping #22204

Open
richardleach opened this issue May 9, 2024 · 3 comments
Open

Inconsistent block scoping #22204

richardleach opened this issue May 9, 2024 · 3 comments

Comments

@richardleach
Copy link
Contributor

Description
Given an if/else branch, such as the example below, it seems like a reasonable expectation that both branches behave identically with regard to entering a new scope - or not doing so.

sub foo {
    my $x = $_[0];
    if ($x) {
        return 1
    } else {
        return 0
    }
} 

But they don't. The else branch is wrapped in an ENTER/LEAVE pair, but the if branch is not.

5        <;> nextstate(main 3 -e:1) v ->6
-        <1> null K/1 ->-
7           <|> cond_expr(other->8) K/1 ->c
6              <0> padsv[$x:2,9] s ->7
-              <@> scope K ->-
-                 <;> ex-nextstate(main 5 -e:1) v ->8
a                 <@> return K ->b
8                    <0> pushmark s ->9
9                    <$> const[IV 1] s ->a
h              <@> leave KP ->b
c                 <0> enter ->d
d                 <;> nextstate(main 7 -e:1) v ->e
g                 <@> return K ->h
e                    <0> pushmark s ->f
f                    <$> const[IV 0] s ->g

That's not always the behaviour. In the following example, both branches are wrapped in an ENTER/LEAVE pair:

$ perl -MO=Concise,foo -e 'sub foo { $x = $_[0]; if ($x) { local $x = 2 } else { local $x = 3 }}'
main::foo:
...
5        <;> nextstate(main 2 -e:1) v:{ ->6
-        <1> null KP/1 ->-
7           <|> cond_expr(other->8) K/1 ->f
-              <1> ex-rv2sv sK/1 ->7
6                 <#> gvsv[*x] s ->7
d              <@> leave KP ->e
8                 <0> enter ->9
9                 <;> nextstate(main 4 -e:1) v:{ ->a
c                 <2> sassign sKS/2 ->d
a                    <$> const[IV 2] s ->b
-                    <1> ex-rv2sv sKRM*/LVINTRO,1 ->c
b                       <#> gvsv[*x] s/LVINTRO ->c
k              <@> leave KP ->e
f                 <0> enter ->g
g                 <;> nextstate(main 6 -e:1) v:{ ->h
j                 <2> sassign sKS/2 ->k
h                    <$> const[IV 3] s ->i
-                    <1> ex-rv2sv sKRM*/LVINTRO,1 ->j
i                       <#> gvsv[*x] s/LVINTRO ->j
-e syntax OK

The inconsistency seems undesirable because:

  • If a branch should enter a new scope but does not, buggy behaviour may result
  • If a branch needlessly enters a new scope, that harms performance

Steps to Reproduce

perl -MO=Concise,foo -e 'sub foo { my $x = $_[0]; if ($x) { return 1 } else { return 0 }}'

Expected behavior
Both branches have consistent scope behaviour.

Perl configuration
Standard blead, v5.36

@richardleach
Copy link
Contributor Author

The difference first traces back to Perl_op_scope. Within that function, dumping OP *o for the true case block shows OPf_PARENS set on the lineseq OP:

1    lineseq LISTOP(0x5564808c46c0) ===> [0x0]
     PARENT ===> [0x0]
     FLAGS = (UNKNOWN,KIDS,PARENS,SLABBED)
     |   
2    +--nextstate COP(0x5564808c4700) ===> [SELF]
     |   FLAGS = (VOID,SLABBED,MORESIB)
     |   LINE = 1
     |   PACKAGE = "main"
     |   SEQ = 4294967252
     |   
3    +--return LISTOP(0x5564808c4760) ===> [0x0]
         FLAGS = (UNKNOWN,KIDS,SLABBED)
         |   
4        +--pushmark OP(0x5564808c47a0) ===> [SELF]
         |   FLAGS = (SCALAR,SLABBED,MORESIB)
         |   
5        +--const SVOP(0x5564808c47d0) ===> [SELF]
             FLAGS = (SCALAR,SLABBED)
             SV = IV(0)

For the else block, OPf_PARENS is not set on the lineseq OP:

6    lineseq LISTOP(0x5564808c4808) ===> [0x0]
     PARENT ===> [0x0]
     FLAGS = (UNKNOWN,KIDS,SLABBED)
     |   
7    +--nextstate COP(0x5564808c4848) ===> [SELF]
     |   FLAGS = (VOID,SLABBED,MORESIB)
     |   LINE = 1
     |   PACKAGE = "main"
     |   SEQ = 4294967250
     |   
8    +--return LISTOP(0x5564808c48a8) ===> [0x0]
         FLAGS = (UNKNOWN,KIDS,SLABBED)
         |   
9        +--pushmark OP(0x5564808c48e8) ===> [SELF]
         |   FLAGS = (SCALAR,SLABBED,MORESIB)
         |   
10       +--const SVOP(0x5564808c4918) ===> [SELF]
             FLAGS = (SCALAR,SLABBED)
             SV = IV(1)

The presence or absence of OPf_PARENS here controls whether o is wrapped in enter/leave, or the lineseq OP is converted into a scope OP.

@richardleach
Copy link
Contributor Author

The callers to Perl_op_scope are:

  • Perl_yyparse (gramtype=gramtype@entry=258) at /home/rich/github/perl5/perly.y:702
/* else and elsif blocks */
else
        :       empty
        |       KW_ELSE mblock
                        {
                          ($mblock)->op_flags |= OPf_PARENS;
                          $$ = op_scope($mblock);
                        }
  • Perl_yyparse (gramtype=gramtype@entry=258) at /home/rich/github/perl5/perly.y:466
        |       KW_IF PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock else
                        {
                          $$ = block_end($remember,  
                              newCONDOP(0, $mexpr, op_scope($mblock), $else));
                          parser->copline = (line_t)$KW_IF;
                        }

I have no understanding of the parser logic and don't know why ($mblock)->op_flags |= OPf_PARENS; isn't present in both locations.

@richardleach
Copy link
Contributor Author

Here's a simple example demonstrating this:

package foo;
sub new {
    return bless {}, 'foo';
}

sub DESTROY {
    $main::status = "Finished";
}

package main;
for (0..1) {
    $status = "Started";
    print $status, "\n";
    if ($_) {
        my $magic = foo->new();
    } else {
        my $magic = foo->new();
    }
    print "  $status\n";
}
print "$status\n";

A developer reasoning about this Perl code might expect the output to be:

Started
  Finished
Started
  Finished
Finished

whereas it actually outputs this:

Started
  Finished
Started
  Started
Finished

because the destructor isn't called at the expected time when the true branch of the if statement is followed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant