Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double-emitting of IR_NEWREF for the same key on the snap replay #1128

Closed
Buristan opened this issue Dec 4, 2023 · 3 comments
Closed

Double-emitting of IR_NEWREF for the same key on the snap replay #1128

Buristan opened this issue Dec 4, 2023 · 3 comments

Comments

@Buristan
Copy link

Buristan commented Dec 4, 2023

The following script prints incorrect results with JIT enabled:

local string_len = string.len

local take_side
local function trace(str)
  local tab = {}
  tab.key = -1
  tab.key = string_len(str)
  if take_side then end
  return tab.key
end

-- Uncompiled function to end up side trace here.
local function trace_wp(str)
  return trace(str)
end
jit.off(trace_wp)

jit.opt.start('hotloop=1', 'hotexit=1', 'tryside=1')

trace_wp('')
trace_wp('')
take_side = true
trace_wp('')
trace_wp('')

print(trace(''))

Returns -1 instead of 0.

The trace dump is the following:

---- TRACE 1 start (command line):5
---- TRACE 1 IR
....        SNAP   #0   [ ---- ---- ]
0001 }  tab TNEW   #0    #0
0002 }  p32 NEWREF 0001  "key"
0003 }  num HSTORE 0002  -1
....        SNAP   #1   [ ---- ---- 0001 -1   ]
0004    fun SLOAD  #0    R
0005 >  fun EQ     0004  (command line):5
0006 >  str SLOAD  #1    T
0007    int FLOAD  0006  str.len
0008    num CONV   0007  num.int
0009 }  num HSTORE 0002  0008
....        SNAP   #2   [ (command line):5|---- 0001 0007 ]
0010 >  int UREFO  (command line):5  #1
0011    p32 SUB    0010  0000
0012 >  p32 UGT    0011  +32
0013 >  nil ULOAD  0010
....        SNAP   #3   [ (command line):5|---- ---- 0008 ]
---- TRACE 1 stop -> return

---- TRACE 2 start 1/2 (command line):9
---- TRACE 2 IR
0001    int SLOAD  #3    PI
0002    num PVAL   #8
0003    tab TNEW   #0    #0
0004    p32 NEWREF 0003  "key"
0005    num HSTORE 0004  -1
0006    p32 NEWREF 0003  "key"
0007    num HSTORE 0006  0002
....        SNAP   #0   [ (command line):5|---- 0003 0001 ]
0008 >  nil GCSTEP
0009    num CONV   0001  num.int
....        SNAP   #1   [ (command line):5|---- 0003 0009 ]
---- TRACE 2 stop -> interpreter

As we can see, snapshot replaying emits NEWREF twice for the same key.

As a possible approach, we can scan backwards already emitted NEWREF and take the one with the necessary operands:

diff --git a/src/lj_snap.c b/src/lj_snap.c
index 68de208f..52e1e165 100644
--- a/src/lj_snap.c
+++ b/src/lj_snap.c
@@ -624,9 +624,25 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
                 if (irr->o == IR_HREFK || irr->o == IR_AREF) {
                   IRIns *irf = &T->ir[irr->op1];
                   tmp = emitir(irf->ot, tmp, irf->op2);
+                } else if (irr->o == IR_NEWREF) {
+                  IRRef allocref = tref_ref(tr);
+                  IRRef keyref = tref_ref(key);
+                  IRRef ref = J->chain[IR_NEWREF];
+                  lj_assertJ(irref_isk(keyref),
+                             "sunk store for parent IR %04d with bad key %04d",
+                             refp - REF_BIAS, keyref - REF_BIAS);
+                  while (ref > allocref) {
+                    IRIns *newref = &J->cur.ir[ref];
+                    if (allocref == newref->op1 && newref->op2 == keyref) {
+                      tmp = ref;
+                      goto skip_newref;
+                    }
+                    ref = newref->prev;
+                  }
                 }
               }
               tmp = emitir(irr->ot, tmp, key);
+skip_newref:
               val = snap_pref(J, T, map, nent, seen, irs->op2);
               if (val == 0) {
                 IRIns *irc = &T->ir[irs->op2];

After the patch, the resulting IRs, as expected, are the following:

---- TRACE 2 start 1/2 (command line):9
---- TRACE 2 IR
0001    int SLOAD  #3    PI
0002    num PVAL   #8
0003    tab TNEW   #0    #0
0004    p32 NEWREF 0003  "key"
0006    num HSTORE 0004  0002
....        SNAP   #0   [ (command line):5|---- 0003 0001 ]
0007 >  nil GCSTEP
0008    num CONV   0001  num.int
....        SNAP   #1   [ (command line):5|---- 0003 0008 ]
---- TRACE 2 stop -> interpreter

I can't come up with counter-examples for this approach:

  • Non-constant key stores emit a snapshot after, so we can't be hit with aliasing entries.
  • table.clear() calls aren't emitted in the side trace head.
@corsix
Copy link

corsix commented Dec 8, 2023

Slightly simpler test case:

local t2
for i = 1, 75 do
  local t = {}
  t.k = false
  t.k = i > 0
  if i > 60 then t2 = t end
end
for k, v in pairs(t2) do assert(v, k) end

I also think that the patch can be simplified; allocref == newref->op1 should always be true (because of ref > allocref combined with the structure of what lj_snap_replay emits), and the while should be an if (because we should never need to look beyond the most recent NEWREF, and also it isn't safe to use anything prior to the most recent NEWREF, as NEWREF can trigger a rehash and thereby invalidate all prior results), at which point ref = newref->prev; can be dropped and some of your variables become single-use.

While playing around with test case minimisation, I also spotted what looks like an optimisation opportunity for table.new calls with constant arguments:

diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
index 1233e5f74e..4243098f1f 100644
--- a/src/lj_ffrecord.c
+++ b/src/lj_ffrecord.c
@@ -1444,6 +1456,15 @@ static void LJ_FASTCALL recff_table_new(jit_State *J, RecordFFData *rd)
 {
   TRef tra = lj_opt_narrow_toint(J, J->base[0]);
   TRef trh = lj_opt_narrow_toint(J, J->base[1]);
+  if (tref_isk(tra) && tref_isk(trh)) {
+    int32_t a = IR(tref_ref(tra))->i;
+    if (a < 0x7fff) {
+      uint32_t hbits = hsize2hbits(IR(tref_ref(trh))->i);
+      a = a > 0 ? a+1 : 0;
+      J->base[0] = emitir(IRTG(IR_TNEW, IRT_TAB), (uint32_t)a, hbits);
+      return;
+    }
+  }
   J->base[0] = lj_ir_call(J, IRCALL_lj_tab_new_ah, tra, trh);
   UNUSED(rd);
 }

@Buristan
Copy link
Author

Buristan commented Dec 9, 2023

Hi, Peter!
Thanks for your comments!
You are right. The first check is totally excessive since we replay all values for the first allocation, all for the second, etc. So, I added the corresponding assertion instead. For now, yes, there is no need in while since these allocations can't be sunk.
I've added here the updated version.

diff --git a/src/lj_snap.c b/src/lj_snap.c
index 68de208f..dc31524a 100644
--- a/src/lj_snap.c
+++ b/src/lj_snap.c
@@ -624,9 +624,25 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
                 if (irr->o == IR_HREFK || irr->o == IR_AREF) {
                   IRIns *irf = &T->ir[irr->op1];
                   tmp = emitir(irf->ot, tmp, irf->op2);
+                } else if (irr->o == IR_NEWREF) {
+                  IRRef allocref = tref_ref(tr);
+                  IRRef keyref = tref_ref(key);
+                  IRRef newref_ref = J->chain[IR_NEWREF];
+                  IRIns *newref = &J->cur.ir[newref_ref];
+                  lj_assertJ(irref_isk(keyref),
+                             "sunk store for parent IR %04d with bad key %04d",
+                             refp - REF_BIAS, keyref - REF_BIAS);
+                  if (newref_ref > allocref && newref->op2 == keyref) {
+                    lj_assertJ(newref->op1 == allocref,
+                               "sunk store for parent IR %04d with bad tab %04d",
+                               refp - REF_BIAS, allocref - REF_BIAS);
+                    tmp = newref_ref;
+                    goto skip_newref;
+                  }
                 }
               }
               tmp = emitir(irr->ot, tmp, key);
+skip_newref:
               val = snap_pref(J, T, map, nent, seen, irs->op2);
               if (val == 0) {
                 IRIns *irc = &T->ir[irs->op2];

MikePall pushed a commit that referenced this issue Dec 10, 2023
Thanks to Sergey Kaplun and Peter Cawley. #1128
MikePall pushed a commit that referenced this issue Dec 10, 2023
@MikePall
Copy link
Member

Fixed. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants