Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local variable not updated in closure state #5439

Closed
yamam opened this issue Nov 12, 2018 · 26 comments

Comments

Projects
None yet
4 participants
@yamam
Copy link

commented Nov 12, 2018

Environment

jruby 9.2.3.0 (2.5.0) 2018-11-09 5833e2d Java HotSpot(TM) 64-Bit Server VM 25.191-b12 on 1.8.0_191-b12 +jit [mswin32-x86_64]

Expected Behavior

test.rb

var = 0

pr = proc do |v|
    puts "pr is called"
    if var == 0
        puts "ERROR!"
    end
end

def foo(&proc)
  proc.call
end

foo do
    begin
        var = 1
        puts "call pr"
        pr.call
    ensure
      var = 0
      puts "ensure end"
    end
end
jruby-9.2.0.0/bin/jruby test.rb
call pr
pr is called
ensure end

Actual Behavior

jruby-9.2.3.0/bin/jruby test.rb
call pr
pr is called
ERROR!
ensure end
@headius

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

This does not appear to be affected by Java integration; if I make JavaClass.new into a pure-Ruby method it behaves the same way. I will update the issue description to reflect this.

@headius headius changed the title ensure from Java callback has wrong behavior Local variable not updated in closure state Nov 12, 2018

@headius

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

This is definitely some oddity with IR. If I do any of the following, the error goes away:

  • Eliminate the ensure block
  • Add a puts at the beginning of the ensure block

Something about the ensure is causing the write to var to not go directly to the heap.

I think it may be eliminating the intermediate write to var because it sees that the ensure overwrites that value.

@subbuss @enebo When are heap variables written back to the heap? I know we do it around closure boundaries, since they need to see the most current state. In this case, after the update var = 1 a proc is called that needs to see that value, but we don't seem to push it back to the heap for either the puts call or the pr.call call.

Here's the IR for the broken version, with the main body and fall-through ensure:

block #4 (out: 5,6): CL2_LBL_0:9
  0:             line_num(lineNumber: 15)
  1:             line_num(lineNumber: 16)
  2:  %cl_2_8 := copy(str<call pr>)
  3:  %cl_2_7 := call_1o(self<%self>, %cl_2_8, callType: FUNCTIONAL, name: puts, potentiallyRefined: false)
  4:             line_num(lineNumber: 17)
  5:  %cl_2_9 := call_0o(*pr, callType: NORMAL, name: call, potentiallyRefined: false)
  6:  %cl_2_6 := copy(%cl_2_9)

block #5 (out: 11): CL2_LBL_4:20
  0:             line_num(lineNumber: 19)
  1:     *var := copy(fix<0>)
  2:             line_num(lineNumber: 20)
  3:  %cl_2_5 := copy(str<ensure end>)
  4:  %cl_2_4 := call_1o(self<%self>, %cl_2_5, callType: FUNCTIONAL, name: puts, potentiallyRefined: false)
  5:             pop_binding
  6:             restore_binding_viz(%cl_2_13)
  7:             pop_block_frame(%cl_2_14)
  8:             return(%cl_2_6)

Note there's no var = 1 at all.

If I add a puts to the ensure, it changes:

block #4 (out: 5,6): CL2_LBL_0:9
  00:             line_num(lineNumber: 15)
  01:     *var := copy(fix<1>)
  02:             line_num(lineNumber: 16)
  03: %cl_2_10 := copy(str<call pr>)
  04:  %cl_2_9 := call_1o(self<%self>, %cl_2_10, callType: FUNCTIONAL, name: puts, potentiallyRefined: false)
  05:             line_num(lineNumber: 17)
  06: %cl_2_11 := call_0o(*pr, callType: NORMAL, name: call, potentiallyRefined: false)
  07:  %cl_2_8 := copy(%cl_2_11)

block #5 (out: 11): CL2_LBL_4:20
  00:             line_num(lineNumber: 19)
  01:  %cl_2_5 := copy(str<ensure start>)
  02:  %cl_2_4 := call_1o(self<%self>, %cl_2_5, callType: FUNCTIONAL, name: puts, potentiallyRefined: false)
  03:             line_num(lineNumber: 20)
  04:     *var := copy(fix<0>)
  05:             line_num(lineNumber: 21)
  06:  %cl_2_7 := copy(str<ensure end>)
  07:  %cl_2_6 := call_1o(self<%self>, %cl_2_7, callType: FUNCTIONAL, name: puts, potentiallyRefined: false)
  08:             pop_binding
  09:             restore_binding_viz(%cl_2_15)
  10:             pop_block_frame(%cl_2_16)
  11:             return(%cl_2_8)

Looking at this I may have a theory: because there's no call immediately before var = 0 in the ensure, somehow the IR doesn't think there's any calls after var = 1, so it eliminates the assignment completely. Adding the puts causes it to propagate var to the heap.

@headius

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

Trivial reproduction that seems to support my theory:

var = 0
pr = ->(){p var}
1.times do
  var = 1
  pr.call
ensure
  var = 2
end
# => 0, not 2 because the ensure does not actually fire before pr.call
@headius

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

This appears to be a regression. JRuby 9.2 properly updates var here.

@headius headius added this to the JRuby 9.2.4.0 milestone Nov 12, 2018

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 12, 2018

This is what I see locally with master

[subbu@earth:~/jruby] jruby -X-C /tmp/x.rb
1
[subbu@earth:~/jruby] jruby /tmp/x.rb
0
[subbu@earth:~/jruby] jruby -X+C /tmp/x.rb
0
@enebo

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

@SubuSS @headius make sure you are running same passes as much as you can. Last comment points to jit or pass but I am being on a pass -X-C is not running

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 12, 2018

It has been so long that I no longer remember how to run specific passes :-) .. but, I found that ast --ir still runs and looks like Dead Code Elimination kills the "var = 1" assignment which means information about escaped variables has somehow been lost between 9.2 and master .. could be because of twiddling of flags on operations / instructions, etc .. for example.

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 12, 2018

Reverting ae777f0 seems to fix it.

@subbuss subbuss closed this in ce522d0 Nov 12, 2018

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 12, 2018

I did a blind revert - didn't analyze what was broken with it. Maybe @enebo and @headius and figure out. I don't have any more time today to look into this. Reopening so that we figure out what is going on.

@subbuss subbuss reopened this Nov 12, 2018

@enebo

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

I probably won't have time on plane to look at this but that removed irruntimehelper instr was enough magic/barrier in front of var to not eliminate it in DCE.

I feel like something has been broken with lva for a long time but the likelihood of var being first before call or other unknown instr which is call or side effect.

@enebo

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

It is like lva is correct that the local scope can see var is not used but does not know other scopes have access as the call right after var = 1 should be enough to know someone may see var with that value

@enebo

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

yay for plane wifi....

Not a solution but an explanation why the revert @subbuss did worked. When RESTORE_EXCEPTION instr is added the BB that contains it gains an addition exception edge to the ensure which bypasses the fallthrough BB which contains the var = 2. Because it can jump past this BB it cannot eliminate the first var assignment.

I feel a bit more confident that there is a generic issue with LVA which does not seem to know the variable is potentially being read from another scope (which would make any call or sideeffect invalidate marking the assignment dead). I may even be able to show this is broken without an ensure is present.

@enebo

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

The more I read this the more I don't understand how if we are in a closure and we see any call that we cannot just require all non-local-scope lvars as unconditionally alive. in LiveVariableNode.applyTransferFunction we do explicitly look for variables leaving in a call/closureaccepting and then add them to living if they are explicitly passed to the call but since we don't know what the actual call is we don't really know if that method already has access to any of these variables.

@headius

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

@enebo I think we had this debate before, weighing every write going to heap versus some tricky logic making sure var updates are visible on heap at the right time (like on call boundaries etc). It might not be too hard to measure how many reads/writes we eliminate in reality, since everything is a call (i.e. maybe we don't gain much versus the complexity).

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 12, 2018

since we don't know what the actual call is we don't really know if that method already has access to any of these variables.

I don't think you need to be that pessimistic about it. This is probably missing some check specific to when the scope being analyzed is a closure. For the method scope case, if there is a closure, the analysis conservatively assumes all things being live. I'll take a look at the code.

It might not be too hard to measure how many reads/writes we eliminate in reality, since everything is a call (i.e. maybe we don't gain much versus the complexity).

Once you start inlining, this will begin to matter more.

@enebo

This comment has been minimized.

Copy link
Member

commented Nov 13, 2018

@subbuss we are in a closure and it definitely assumes all variables are live on exit (which we pessimistically assume). The issue is within that closure we have no information on how those variables which are defined outside the closure are accessed. In the case of this bug a second closure is executed and it happens to ask for the value. If we knew no other closure accessed the variable them perhaps it is safe for us to remove var = 1. Perhaps the method itself using those variables are not important?

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 13, 2018

There is this code in LVA which indicates that 'var' should have been marked live before the call. So, the qn. is why didn't that if check succeed (assuming that is what is failing).

// Variables that belong to outer scopes should always
// be considered live since they can be accessed downstream
// of this call.
for (Variable x: problem.getNonSelfLocalVars()) {
    if (x instanceof LocalVariable && ((LocalVariable)x).getScopeDepth() > 0) {
        living.set(problem.getDFVar(x));
    }
}
@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 13, 2018

I missed this earlier, but the IR output shows: var(0:0) = copy(Fixnum:1) . Why is it not depth 1? So, I think something else earlier is screwing up .. looking at the ast --ir output, it looks like OptimizeDynScope pass .. since before the pass we had var(1:0) = copy(Fixnum:1)

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 13, 2018

This is an interesting bug. :-) So the dynscope elimination is the right thing since it doesn't have its own scope hasn't escaped and is not needed. So, getScopeDepth() can be zero and refer to outer scope vars if dynscope has been eliminated .. so, we could add an additional condition there instead of just checking scope depth.

Alternatively we could suppress elimination of dynscope if any of the ancestor scopes has escaped (as in this example), but I don't think that makes sense.

So, I am leaning to abstracting that check to ... ((LocalVariable)x).isOuterScopeVariable(scope).

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 13, 2018

Ha! There is already the 'isDefinedLocally' flag in ClosureLocalVariable which does exactly this (added in d8e4959) and used in Load/Store passes.

I just added something in LocalVariable instead ... I'll consolidate the two and replace all checks of getScopeDepth() > 0 or getScopeDepth() == 0 with the new method. All those other depth > 0 and depth = 0 checks are also problematic!

Commit coming later tonight.

@subbuss subbuss closed this in c558644 Nov 13, 2018

subbuss added a commit that referenced this issue Nov 13, 2018

Fix #5439: Track when a local var belongs to an outer scope.
* There already was definedLocally boolean in ClosureLocalVariable
  but I renamed it to isOuterScopeVar and moved it to LocalVariable.

  We don't consistently use ClosureLocalVariable when the local
  variable use is in a closure. So, moving it to LocalVariable
  ensures that we catch all uses independent of whether we build a
  LocalVariable or a ClosureLocalVariable.

  Separately, investigate why we did make this distincting between
  these two variables and if there is any code that critically
  depends on this distinction because as I found in this patch,
  we cannot simply assume lvar uses in closures will use a
  ClosureLocalVariable instance.

* Used isOuterScopeVar() method instead of the buggy comparison of
  a var's scope depth with 0 multiple places in the codebase.

  Replaced isDefinedLocally() with !isOuterScopeVar() wherever it
  was used.
@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 13, 2018

@enebo Maybe check whether this patch also fixes your inliner woes.

@subbuss

This comment has been minimized.

Copy link
Contributor

commented Nov 13, 2018

I didn't run tests because I don't know / remember how to run tests anymore, but hopefully Travis will call me out if I broke something.

In addition, we should definitely add new regression tests. Can one of you do that? I will need to page in how to do that since it has been a while and I cannot do that till much later in the week.

@headius

This comment has been minimized.

Copy link
Member

commented Nov 13, 2018

@subbu_ss I'll take care of tests. Thank you for the fix!

@headius headius reopened this Nov 14, 2018

@headius

This comment has been minimized.

Copy link
Member

commented Nov 14, 2018

This does not appear to be completely solved.

Running with JRuby master, 7b14404, the following script still fails to update var before the proc is called:

def foo
  var = 0
      pr = ->(){@var = var}
      1.times do
        var = 1
        pr.call
      ensure
        var = 2
      end
end

foo

p @var

Under JIT, this outputs "0", while interpreted it outputs "1".

The IR shows that var = 1 is getting optimized away again:

block #4 (out: 5,6): CL2_LBL_0:8
  0:             line_num(lineNumber: 4)
  1:             line_num(lineNumber: 5)
  2:  %cl_2_5 := call_0o(*pr, callType: NORMAL, name: call, potentiallyRefined: false)
  3:  %cl_2_4 := copy(%cl_2_5)

block #5 (out: 11): CL2_LBL_4:16
  0:             line_num(lineNumber: 7)
  1:     *var := copy(fix<2>)
  2:             pop_binding
  3:             restore_binding_viz(%cl_2_9)
  4:             pop_block_frame(%cl_2_10)
  5:             return(%cl_2_4)

@headius headius modified the milestones: JRuby 9.2.4.0, JRuby 9.2.5.0 Nov 14, 2018

@headius

This comment has been minimized.

Copy link
Member

commented Nov 14, 2018

Here is the spec I wrote:

describe "A value written to a variable read by a proc" do
  it "is visible to that proc" do
    o = Object.new
    def o.foo
      var = 0
      pr = ->(){@var = var}
      1.times do
        var = 1
        pr.call
      ensure
        var = 2
      end
    end

    o.foo

    expect(o.instance_variable_get(:@var)).to eq(1)
  end
end
@headius

This comment has been minimized.

Copy link
Member

commented Nov 14, 2018

False alarm...my local env got switched to 9.2.1.0 somehow.

@headius headius closed this Nov 14, 2018

@headius headius modified the milestones: JRuby 9.2.5.0, JRuby 9.2.4.0 Nov 14, 2018

headius added a commit that referenced this issue Nov 14, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.