Fix or ditch trace stitching #13
Trace stitching is currently broken and disabled.
It was quite useful for some code bases with many C calls that would otherwise cause a NYI. It was not so useful for other code bases where enough effort has been put into converting the code to do C calls with the FFI (e.g. OpenResty).
Previous discussion: http://www.freelists.org/post/luajit/small-script-to-reproduce-bogus-trace-stitch-errors-at-line-0-with-coroutines-in-latest-21,1
If this is to be fixed and re-enabled, please note the stitching heuristics haven't been successfully tuned at all. The trace length turned out not to be a useful criterion, when evaluated in isolation. The -Ominstitch limit was effectively used as an on/off switch.
I've been thinking about this today because the code base I work with hits a lot of "NYI: C function" aborts.
I came up with a way to fix stitching which, while somewhat hacky, seems to be our best shot.
The core idea is to store a boxed stitch link on the stack instead of storing the trace number directly on the stack. There is no lightweight boxed
Every time tracer modifies
This way even if the trace dies the stitch link can live on (with a 0 inside because
There is one very dodgy part here: touching stitch box object inside
This solution has some memory overhead (a udata with a finalizer per stitched trace, plus a metatable containing that finalizer if any trace was stitched), but this overhead seems reasonably low and is only paid for stitched traces. The solution has low implementation complexity and doesn't seem to have any architectural implications because it mostly reuses existing machinery.
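To make the boxing idea concrete, here is a toy model in plain C. It is only a sketch under loud assumptions: it is not the patch and does not use LuaJIT's internals. `StitchBox`, `Trace`, `stitch_box_new`, `trace_free`, `stitch_continue` and `stitch_box_free` are all hypothetical names, `malloc`/`free` stand in for a GC-managed udata, and `stitch_box_free` plays the role I am assuming a finalizer could play. What it shows is just the invariant: the continuation holds a box rather than a raw trace number, so the box can outlive the trace and read as 0 afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy model; none of these names come from LuaJIT. */
typedef uint32_t TraceId;          /* 0 means "no trace" */

typedef struct StitchBox {
  TraceId link;                    /* trace to continue into, or 0 */
} StitchBox;

typedef struct Trace {
  int alive;                       /* nonzero while the trace exists */
  StitchBox *box;                  /* box currently pointing at this trace */
} Trace;

#define MAX_TRACES 16
static Trace traces[MAX_TRACES];

/* Create a box that links to trace `id` and register it with the trace. */
static StitchBox *stitch_box_new(TraceId id) {
  assert(id > 0 && id < MAX_TRACES);
  StitchBox *box = malloc(sizeof *box);
  if (box == NULL) return NULL;
  box->link = id;
  traces[id].alive = 1;
  traces[id].box = box;
  return box;
}

/* Freeing a trace zeroes the box instead of leaving a stale number behind. */
static void trace_free(TraceId id) {
  assert(id > 0 && id < MAX_TRACES);
  if (traces[id].box != NULL) traces[id].box->link = 0;
  traces[id].box = NULL;
  traces[id].alive = 0;
}

/* What a continuation would consult: a trace to enter, or 0 to fall back. */
static TraceId stitch_continue(const StitchBox *box) {
  return box->link;
}

/* Stand-in for a finalizer (assumption): drop the trace's back-pointer. */
static void stitch_box_free(StitchBox *box) {
  if (box->link != 0) traces[box->link].box = NULL;
  free(box);
}
```

In this model a frame that holds the box after `trace_free` sees 0 and can fall back to the interpreter, instead of jumping through a trace number that may have been reused.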
Alternative solution I considered was to write
I have a patch implementing this fix for x86 ready - so, Mike, if you think this approach has a reasonable ratio of cost/benefit/hackiness[*] and you don't see a mistake in my reasoning then I'll port it to all other supported platforms and send you a PR.
Otherwise, it was a fun exercise - and at least I found another bug in the process :)
[*] - I personally feel that it's an OK approach, though I am not happy that I have to box a 4 byte number into a box with a 24 byte header (on x86).
After pondering over this for a while, it becomes clear what the real issue was: let the GC do its work and not try to second-guess it with a link number instead of a proper GC reference on the stack. Thank you for the insight!
I think putting the GCtrace object itself on the stack would be the cleanest solution. Freeing the trace needs to blacklist the link, of course.
The chicken-and-egg problem with code generation could be solved by putting a load from a non-moving location into the IR and filling in the final GCtrace later. Maybe split
Yep, I completely agree with your assessment that putting
The only drawback it has, as far as I can see, is that continuations pointing to these almost dead traces will retain more memory compared to a separate stitch link box approach: whole
I will give it a shot and send you a PR.
Do you mean flushing? Freeing (as in