New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

case/when eqq calls do not cache #3513

Closed
headius opened this Issue Dec 3, 2015 · 7 comments

Comments

Projects
None yet
2 participants
@headius
Member

headius commented Dec 3, 2015

Perhaps not a major loss but eqq calls in case/when do not do any caching currently, in either the interpreter or the JIT.

case foo; when Bar; end
    %v_3 = call_0o(%self ;n:foo, t:VA, cl:false)
    %v_6 = search_const(scope<0> ;name: Bar, no_priv: false)
    %v_5 = eqq(%v_6, %v_3)
    b_true(LBL_2:11, %v_5)

EqqInstr and its equivalent code in the JIT do not do any caching, and just call IRRuntimeHelpers.isEQQ.

@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo Dec 3, 2015

Member

Seems like making a SearchConstantSite like InvokeSite would solve this wouldn't it? For some reason I thought there was already some caching of constants. Or was that not for SearchConstantInstr?

Member

enebo commented Dec 3, 2015

Seems like making a SearchConstantSite like InvokeSite would solve this wouldn't it? For some reason I thought there was already some caching of constants. Or was that not for SearchConstantInstr?

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Dec 5, 2015

Member

EqqInstr is only used by case/when right now, I believe. It's basically the === call plus isTrue in one operation. We can easily make it into a cached call in both interpreter and jit without changing it.

At some point we'll want to explore optimizing case/when itself for homogeneous types, like all symbols or all fixnums. I think that will have to happen in the builder, creating a special jump table instruction that can do O(1) jumps.

Member

headius commented Dec 5, 2015

EqqInstr is only used by case/when right now, I believe. It's basically the === call plus isTrue in one operation. We can easily make it into a cached call in both interpreter and jit without changing it.

At some point we'll want to explore optimizing case/when itself for homogeneous types, like all symbols or all fixnums. I think that will have to happen in the builder, creating a special jump table instruction that can do O(1) jumps.

@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo Dec 5, 2015

Member

oh the caching of === itself. Sorry I misread that. This might be over simplistic but why is this even its own instruction? Seems like we could have just made this a callInstr for ===. There is something with undefined I don't get (so maybe it is more special than a call) but this would eliminate an instr and get caching for free. It strikes me simple specialization of calls for very specific calls should just be emitted differently in the JIT (unless it is much faster in interp too in which case the extra complexity of having an instr is worth it).

As you say we also want to optimize monotyped case/whens but that will need to happen earlier (in Builder or a pass).

Member

enebo commented Dec 5, 2015

oh the caching of === itself. Sorry I misread that. This might be over simplistic but why is this even its own instruction? Seems like we could have just made this a callInstr for ===. There is something with undefined I don't get (so maybe it is more special than a call) but this would eliminate an instr and get caching for free. It strikes me simple specialization of calls for very specific calls should just be emitted differently in the JIT (unless it is much faster in interp too in which case the extra complexity of having an instr is worth it).

As you say we also want to optimize monotyped case/whens but that will need to happen earlier (in Builder or a pass).

@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo Dec 5, 2015

Member

ok I looked at this briefly. eqqinstr has some extra semantics beyond === but they are pretty simple to work around in IRBuilder. If rhs of === is undefined it represents a valueless case/when which is for truthiness. In this case I think we can build something explicit with btrue and then for ordinary case we emit the case with a callinstr of ===.

The secondary concern with this change is ability to detect case if we want to optimize cases in a compiler pass. I would argue ther is not a huge difference in an if/else pattern which has eqq and one which has callinstr of ===. The callinstr version will be a slightly more complicated check but it will also be slightly more generic.

So I think removing eqqinstr in favor of a callinstr will:

  1. Remove an instr (yay)
  2. Allow the callinstr to cache (this bug)
  3. Not inhibit our ability to optimize case statements later.
Member

enebo commented Dec 5, 2015

ok I looked at this briefly. eqqinstr has some extra semantics beyond === but they are pretty simple to work around in IRBuilder. If rhs of === is undefined it represents a valueless case/when which is for truthiness. In this case I think we can build something explicit with btrue and then for ordinary case we emit the case with a callinstr of ===.

The secondary concern with this change is ability to detect case if we want to optimize cases in a compiler pass. I would argue ther is not a huge difference in an if/else pattern which has eqq and one which has callinstr of ===. The callinstr version will be a slightly more complicated check but it will also be slightly more generic.

So I think removing eqqinstr in favor of a callinstr will:

  1. Remove an instr (yay)
  2. Allow the callinstr to cache (this bug)
  3. Not inhibit our ability to optimize case statements later.

headius added a commit that referenced this issue Dec 7, 2015

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Dec 7, 2015

Member

I pushed a change to at least include a call site in the eqq logic.

before:

                                               user     system      total        real
1m x5 cases, 1 fixnum when                 0.130000   0.010000   0.140000 (  0.132667)
1m x5 cases, 1 3-arg fixnum when           0.290000   0.000000   0.290000 (  0.285787)
1m x5 cases, 10 fixnum whens               1.320000   0.000000   1.320000 (  1.180070)
1m x5 cases, 10 one-char str whens         1.920000   0.010000   1.930000 (  1.891116)
1m x5 cases, 10 multi-char str whens       2.030000   0.020000   2.050000 (  1.995256)
1m x5 cases, 10 one-char sym whens         1.120000   0.000000   1.120000 (  1.125812)
1m x5 cases, 10 multi-char sym whens       1.130000   0.000000   1.130000 (  1.131690)
1m x5 cases, nil-false-true-else           0.570000   0.000000   0.570000 (  0.569903)
1m x5 cases, 9 fixnum 1 custom             1.290000   0.000000   1.290000 (  1.284829)

after:

                                               user     system      total        real
1m x5 cases, 1 fixnum when                 0.090000   0.000000   0.090000 (  0.091403)
1m x5 cases, 1 3-arg fixnum when           0.260000   0.000000   0.260000 (  0.252550)
1m x5 cases, 10 fixnum whens               0.750000   0.000000   0.750000 (  0.744448)
1m x5 cases, 10 one-char str whens         1.570000   0.010000   1.580000 (  1.532922)
1m x5 cases, 10 multi-char str whens       1.670000   0.010000   1.680000 (  1.628295)
1m x5 cases, 10 one-char sym whens         0.640000   0.000000   0.640000 (  0.636944)
1m x5 cases, 10 multi-char sym whens       0.630000   0.000000   0.630000 (  0.630300)
1m x5 cases, nil-false-true-else           0.510000   0.000000   0.510000 (  0.508370)
1m x5 cases, 9 fixnum 1 custom             1.000000   0.000000   1.000000 (  1.003131)

This helps basically all cases where the cache is hit every time.

We're still rather far off from the optimized cases in 1.7 though (the O(1) hacks I made):

                                               user     system      total        real
1m x5 cases, 1 fixnum when                 0.050000   0.000000   0.050000 (  0.053000)
1m x5 cases, 1 3-arg fixnum when           0.110000   0.000000   0.110000 (  0.058000)
1m x5 cases, 10 fixnum whens               0.060000   0.010000   0.070000 (  0.060000)
1m x5 cases, 10 one-char str whens         0.100000   0.000000   0.100000 (  0.091000)
1m x5 cases, 10 multi-char str whens       1.670000   0.010000   1.680000 (  1.630000)
1m x5 cases, 10 one-char sym whens         0.070000   0.000000   0.070000 (  0.066000)
1m x5 cases, 10 multi-char sym whens       0.870000   0.000000   0.870000 (  0.877000)
1m x5 cases, nil-false-true-else           0.470000   0.000000   0.470000 (  0.471000)
1m x5 cases, 9 fixnum 1 custom             1.140000   0.010000   1.150000 (  1.138000)

I have no idea how frequently these simple cases (fixnums, one-char str, one-char sym) come up in the real world, though.

Member

headius commented Dec 7, 2015

I pushed a change to at least include a call site in the eqq logic.

before:

                                               user     system      total        real
1m x5 cases, 1 fixnum when                 0.130000   0.010000   0.140000 (  0.132667)
1m x5 cases, 1 3-arg fixnum when           0.290000   0.000000   0.290000 (  0.285787)
1m x5 cases, 10 fixnum whens               1.320000   0.000000   1.320000 (  1.180070)
1m x5 cases, 10 one-char str whens         1.920000   0.010000   1.930000 (  1.891116)
1m x5 cases, 10 multi-char str whens       2.030000   0.020000   2.050000 (  1.995256)
1m x5 cases, 10 one-char sym whens         1.120000   0.000000   1.120000 (  1.125812)
1m x5 cases, 10 multi-char sym whens       1.130000   0.000000   1.130000 (  1.131690)
1m x5 cases, nil-false-true-else           0.570000   0.000000   0.570000 (  0.569903)
1m x5 cases, 9 fixnum 1 custom             1.290000   0.000000   1.290000 (  1.284829)

after:

                                               user     system      total        real
1m x5 cases, 1 fixnum when                 0.090000   0.000000   0.090000 (  0.091403)
1m x5 cases, 1 3-arg fixnum when           0.260000   0.000000   0.260000 (  0.252550)
1m x5 cases, 10 fixnum whens               0.750000   0.000000   0.750000 (  0.744448)
1m x5 cases, 10 one-char str whens         1.570000   0.010000   1.580000 (  1.532922)
1m x5 cases, 10 multi-char str whens       1.670000   0.010000   1.680000 (  1.628295)
1m x5 cases, 10 one-char sym whens         0.640000   0.000000   0.640000 (  0.636944)
1m x5 cases, 10 multi-char sym whens       0.630000   0.000000   0.630000 (  0.630300)
1m x5 cases, nil-false-true-else           0.510000   0.000000   0.510000 (  0.508370)
1m x5 cases, 9 fixnum 1 custom             1.000000   0.000000   1.000000 (  1.003131)

This helps basically all cases where the cache is hit every time.

We're still rather far off from the optimized cases in 1.7 though (the O(1) hacks I made):

                                               user     system      total        real
1m x5 cases, 1 fixnum when                 0.050000   0.000000   0.050000 (  0.053000)
1m x5 cases, 1 3-arg fixnum when           0.110000   0.000000   0.110000 (  0.058000)
1m x5 cases, 10 fixnum whens               0.060000   0.010000   0.070000 (  0.060000)
1m x5 cases, 10 one-char str whens         0.100000   0.000000   0.100000 (  0.091000)
1m x5 cases, 10 multi-char str whens       1.670000   0.010000   1.680000 (  1.630000)
1m x5 cases, 10 one-char sym whens         0.070000   0.000000   0.070000 (  0.066000)
1m x5 cases, 10 multi-char sym whens       0.870000   0.000000   0.870000 (  0.877000)
1m x5 cases, nil-false-true-else           0.470000   0.000000   0.470000 (  0.471000)
1m x5 cases, 9 fixnum 1 custom             1.140000   0.010000   1.150000 (  1.138000)

I have no idea how frequently these simple cases (fixnums, one-char str, one-char sym) come up in the real world, though.

@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo Dec 8, 2015

Member

This is fine but I am now liking the idea of killing eqqinstr but I guess that will just end up being lots of code deletions...

Member

enebo commented Dec 8, 2015

This is fine but I am now liking the idea of killing eqqinstr but I guess that will just end up being lots of code deletions...

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Mar 7, 2016

Member

I'm going to mark this fixed and open a separate bug for ongoing optimizations.

Member

headius commented Mar 7, 2016

I'm going to mark this fixed and open a separate bug for ongoing optimizations.

@headius headius closed this Mar 7, 2016

@headius headius added this to the JRuby 9.1.0.0 milestone Mar 7, 2016

@headius headius referenced this issue Apr 20, 2016

Open

Remaining 2.3 checklist items not in JRuby 9.1 #3816

8 of 23 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment