-
Notifications
You must be signed in to change notification settings - Fork 931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Internal state corruption #624
Comments
FYI latest v2.1 HEAD e9af1ab built as:
|
I can get the test case down to 3.2K lines at present, but it still classifies as huge. My leading hypothesis right now is that In support of that hypothesis, the following workaround seems to mask the problem for me: diff --git a/src/lj_parse.c b/src/lj_parse.c
index 3ae05446e3..97ce5adf70 100644
--- a/src/lj_parse.c
+++ b/src/lj_parse.c
@@ -1331,6 +1331,7 @@ static void fs_fixup_bc(FuncState *fs, GCproto *pt, BCIns *bc, MSize n)
fs->framesize, 0);
for (i = 1; i < n; i++)
bc[i] = base[i].ins;
+ bc[n] = BCINS_AD(BC_RET0, 0, 1);
}
/* Fixup upvalues for child prototype, step #2. */
@@ -1576,7 +1577,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line)
fs_fixup_ret(fs);
/* Calculate total size of prototype including all colocated arrays. */
- sizept = sizeof(GCproto) + fs->pc*sizeof(BCIns) + fs->nkgc*sizeof(GCRef);
+ sizept = sizeof(GCproto) + (fs->pc+1)*sizeof(BCIns) + fs->nkgc*sizeof(GCRef);
sizept = (sizept + sizeof(TValue)-1) & ~(sizeof(TValue)-1);
ofsk = sizept; sizept += fs->nkn*sizeof(TValue);
ofsuv = sizept; sizept += ((fs->nuv+1)&~1)*2; Alternatively, the following catches the problem with an assertion failure very early, before it gets a chance to corrupt further state: diff --git a/src/lj_meta.c b/src/lj_meta.c
index f6e6d46a1d..5b3ea666ef 100644
--- a/src/lj_meta.c
+++ b/src/lj_meta.c
@@ -426,6 +426,7 @@ void lj_meta_istype(lua_State *L, BCReg ra, BCReg tp)
{
L->top = curr_topL(L);
ra++; tp--;
+ lj_assertL(tp <= 13, "bad type %d", (int)tp);
lj_assertL(LJ_DUALNUM || tp != ~LJ_TNUMX, "bad type for ISTYPE");
if (LJ_DUALNUM && tp == ~LJ_TNUMX) lj_lib_checkint(L, ra);
else if (tp == ~LJ_TNUMX+1) lj_lib_checknum(L, ra);
diff --git a/src/lj_parse.c b/src/lj_parse.c
index 3ae05446e3..39beb656db 100644
--- a/src/lj_parse.c
+++ b/src/lj_parse.c
@@ -1331,6 +1331,7 @@ static void fs_fixup_bc(FuncState *fs, GCproto *pt, BCIns *bc, MSize n)
fs->framesize, 0);
for (i = 1; i < n; i++)
bc[i] = base[i].ins;
+ bc[n] = BCINS_AD(BC_ISTYPE, 0, 123);
}
/* Fixup upvalues for child prototype, step #2. */
@@ -1576,7 +1577,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line)
fs_fixup_ret(fs);
/* Calculate total size of prototype including all colocated arrays. */
- sizept = sizeof(GCproto) + fs->pc*sizeof(BCIns) + fs->nkgc*sizeof(GCRef);
+ sizept = sizeof(GCproto) + (fs->pc+1)*sizeof(BCIns) + fs->nkgc*sizeof(GCRef);
sizept = (sizept + sizeof(TValue)-1) & ~(sizeof(TValue)-1);
ofsk = sizept; sizept += fs->nkn*sizeof(TValue);
ofsuv = sizept; sizept += ((fs->nuv+1)&~1)*2; |
Well, I haven't gotten very far with this one. What does help in reproducing it, is to turn off ASLR system-wide, turn off all VM randomizations in Also tried with Valgrind, but that only exposed a false positive ( |
Further to my hypothesis, I added an extra assert to catch things a few steps earlier: diff --git a/src/lj_snap.c b/src/lj_snap.c
index f1358cf29b..4ec535f8f1 100644
--- a/src/lj_snap.c
+++ b/src/lj_snap.c
@@ -121,6 +121,10 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
MSize f = 0;
map[f++] = SNAP_MKPC(J->pc); /* The current PC is always the first entry. */
#endif
+ if (J->pt) {
+ lj_assertJ(J->pc >= proto_bc(J->pt) && J->pc < proto_bc(J->pt) + J->pt->sizebc,
+ "PC out of range");
+ }
while (frame > lim) { /* Backwards traversal of all frames above base. */
if (frame_islua(frame)) {
#if !LJ_FR2 This assert is subsequently tripped as so:
The trace 18 referenced by the
The function in question is There is some logic in if (bc_op(*pc) == BC_JLOOP) {
BCIns *retpc = &traceref(J, bc_d(*pc))->startins;
if (bc_isret(bc_op(*retpc))) {
if (J->state == LJ_TRACE_RECORD) {
...
} else {
pc = retpc;
setcframe_pc(cf, pc);
}
}
} This logic requires that the diff --git a/src/lj_record.c b/src/lj_record.c
index 9e41ce0562..4e091e9400 100644
--- a/src/lj_record.c
+++ b/src/lj_record.c
@@ -571,10 +571,10 @@ static LoopEvent rec_iterl(jit_State *J, const BCIns iterins)
}
/* Record LOOP/JLOOP. Now, that was easy. */
-static LoopEvent rec_loop(jit_State *J, BCReg ra)
+static LoopEvent rec_loop(jit_State *J, BCReg ra, int adv)
{
if (ra < J->maxslot) J->maxslot = ra;
- J->pc++;
+ J->pc += adv;
return LOOPEV_ENTER;
}
@@ -2424,7 +2424,7 @@ void lj_record_ins(jit_State *J)
rec_loop_interp(J, pc, rec_iterl(J, *pc));
break;
case BC_LOOP:
- rec_loop_interp(J, pc, rec_loop(J, ra));
+ rec_loop_interp(J, pc, rec_loop(J, ra, 1));
break;
case BC_JFORL:
@@ -2434,7 +2434,7 @@ void lj_record_ins(jit_State *J)
rec_loop_jit(J, rc, rec_iterl(J, traceref(J, rc)->startins));
break;
case BC_JLOOP:
- rec_loop_jit(J, rc, rec_loop(J, ra));
+ rec_loop_jit(J, rc, rec_loop(J, ra, !bc_isret(bc_op(traceref(J, rc)->startins))));
break;
case BC_IFORL: |
Fixed. Thanks Arseny and Peter! |
If this has been fixed, please also rehabilitate issue #615 for integrity sake by restoring its title and removing the "invalid" label. Thank you! |
Reported by Arseny Vakhrushev. Analysis and fix contributed by Peter Cawley. (cherry picked from commit ff1e72a) Sergey Bronnikov: * added the description and the test for the problem Part of tarantool/tarantool#8825 ------------ Related issues: * LuaJIT#611 * LuaJIT#615 * LuaJIT#624 LuaJIT@ff1e72ac tarantool/tarantool#8825 https://github.com/tarantool/tarantool/wiki/Vanilla-LuaJIT-sync-status
Reported by Arseny Vakhrushev. Analysis and fix contributed by Peter Cawley. (cherry picked from commit ff1e72a) Sergey Bronnikov: * added the description and the test for the problem Part of tarantool/tarantool#8825 ------------ Related issues: * LuaJIT#611 * LuaJIT#615 * LuaJIT#624 LuaJIT@ff1e72ac tarantool/tarantool#8825 https://github.com/tarantool/tarantool/wiki/Vanilla-LuaJIT-sync-status
Hello,
This is our second (and hopefully final) attempt to report an issue with LuaJIT 2.1 that leads to all sorts of buggy behaviour. The first attempt #615 was regretfully rejected without due attention. We've since further reduced the test case down to about 4.5K lines of code and deliberately got rid of the techniques in question to illustrate that the issue has nothing to do with them. In particular, the code:
require()
,package.*
, etc.;dofile()
,loadfile()
,loadstring()
,load()
functions;debug.getfenv()/debug.setfenv()
to get/change a coroutine's environment;debug.traceback()
to get stack traces.We seem to be unable to make it any better as any major movement or even renaming functions/fields hides the bug away. The code itself obviously makes no sense and serves solely as a way to reliably reproduce the issue without much effort.
Link to the test case --> test.zip
By simply running
luajit-2.1.0-beta3 test.lua
, one can observe all kinds of beautiful things like:Some familiar/new crash traces to get you started:
The text was updated successfully, but these errors were encountered: