
Convert nested interpreter calls into a single loop #20

Merged 10 commits into master from feature/one-interpreter on Feb 2, 2019



Previously, each frame of a Lua function was run in its own interpreter. While this made the implementation simpler (and possibly more elegant), it did mean the Lua stack depth was bounded by Java's (which wasn't very high, due to the 3-4 Java calls needed to execute each Lua function).

Instead, we convert the interpreter into a nested loop: the outer loop sets up the function from the current frame, and the inner one acts mostly the same as before. However, instead of using OperationHelper.invoke for Lua functions, we push a new frame and run the interpreter that way.
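As a rough sketch of the nested-loop shape (all names, opcodes, and the encoding here are hypothetical simplifications for illustration, not Cobalt's actual types): the outer loop re-reads the current frame whenever it changes, the inner loop dispatches opcodes, and a Lua-to-Lua call pushes a frame and breaks back to the outer loop instead of recursing into the JVM.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class InterpreterSketch {
    static final int OP_CALL = 0, OP_RETURN = 1, OP_LOADK = 2;

    static final class Frame {
        final int[] code;  // bytecode of this function
        int pc;            // program counter
        int result;        // simplified one-slot "register file"
        Frame(int[] code) { this.code = code; }
    }

    static int run(int[] mainCode, int[] calleeCode) {
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame(mainCode));
        outer:
        while (true) {
            Frame frame = stack.peek();     // outer loop: set up current frame
            int[] code = frame.code;
            while (true) {                  // inner loop: dispatch opcodes
                switch (code[frame.pc++]) {
                    case OP_LOADK:
                        frame.result = code[frame.pc++];
                        break;
                    case OP_CALL:
                        // Push a new frame and re-enter the outer loop
                        // rather than calling run() recursively.
                        stack.push(new Frame(calleeCode));
                        continue outer;
                    case OP_RETURN:
                        int value = frame.result;
                        stack.pop();
                        if (stack.isEmpty()) return value;
                        stack.peek().result = value; // hand result to caller
                        continue outer;
                }
            }
        }
    }
}
```

Because the call stack is a heap-allocated deque of frames rather than Java stack frames, the Lua call depth is no longer limited by the JVM's.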

We don't currently support running metamethods inside the same interpreter, but I think that is acceptable - Lua behaves the same, and supporting it would mean inlining most of OperationHelper, which bloats the interpreter even more.

As far as performance goes, it's rather hard to tell. I've run an awful lot of benchmarks, but a) they're hazy at best and b) some have improved while others have got worse. Call-heavy code in particular appears to have become worse, though I'm not entirely sure why - it's something I'd like to resolve before merging, but is not a complete blocker.

 - Replace most constants with those from Lua.*
 - Make branching instructions consume the jump target from the next
   instruction too. This is a minor optimisation, but is allowed by the
   Lua VM.
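As a hedged illustration of consuming the jump target from the next instruction (the flat encoding below is a simplification, not Lua's real instruction format): Lua guarantees that every comparison instruction is immediately followed by a JMP, so the dispatcher can decode both in one step and skip a trip around the dispatch loop.

```java
final class BranchSketch {
    // Hypothetical layout: code[pc] holds the comparison's "expected"
    // flag, and code[pc + 1] holds the offset of the JMP that the Lua
    // VM guarantees follows every comparison instruction.
    static int dispatchLessThan(int[] code, int pc, int a, int b) {
        boolean expected = code[pc] != 0;
        int jumpOffset = code[pc + 1];
        pc += 2; // consume the comparison AND its jump target together
        if ((a < b) == expected) {
            pc += jumpOffset; // branch taken: apply the fused JMP
        }
        return pc;
    }
}
```

The saving is one dispatch iteration per taken or skipped comparison; the observable behaviour is identical to decoding the JMP separately.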
Our value of Short.MAX_VALUE is fairly arbitrary, but is pretty close to Lua's limit with a largish number of variables (30).
I'm not entirely convinced this is a "correct" optimisation, and it really doesn't make a difference in practice.
While we may lose out on some performance from these methods previously being intrinsics, the throwing of ArithmeticException was causing serious performance regressions in very specific code paths.

The included code (primes.lua) would cause an overflow (throwing an exception), and then take the modulus (wrapping it back to an integer again). As a result, this code ran in over 14s (versus almost instantly after this patch, and on master).
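One way to keep the fast path while avoiding the exception (a sketch under my own assumptions, not necessarily the exact replacement used in this patch): detect overflow with the standard two's-complement sign check and widen on the slow path, instead of letting Math.addExact throw.

```java
final class ArithSketch {
    // Math.addExact is a JIT intrinsic, but it reports overflow by
    // throwing ArithmeticException; when overflow sits on a hot path
    // (as in primes.lua), building and unwinding the exception can
    // dominate runtime. A plain branch avoids that entirely.
    static long addWidening(int a, int b) {
        int sum = a + b;
        // Overflow occurred iff both operands share a sign and the
        // result's sign differs (standard two's-complement check).
        if (((a ^ sum) & (b ^ sum)) < 0) {
            return (long) a + (long) b; // widen instead of throwing
        }
        return sum;
    }
}
```

The happy path is a single add plus a predictable branch; only genuinely overflowing additions pay for the widening.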
As the interpreter was the only thing which actually created or consumed
tail calls, and this is handled internally now, we no longer need this.
A wee bit ugly, but should still be faster than coercing to and from a
@SquidDev SquidDev merged commit e380d8e into master Feb 2, 2019
@SquidDev SquidDev deleted the feature/one-interpreter branch February 2, 2019 09:10