
Stack vs. register based VMs #8

madmalik opened this issue Apr 26, 2018 · 5 comments

@madmalik

These are the two prevalent implementation techniques for virtual machines. I'd like to collect some resources about the trade offs.

On a high level:

  • Stack machines are simpler to implement
  • Stack machine instructions are shorter (often one byte) since their operands are implicit on the stack. This gets muddled by embedded constants, though; Java, for example, ends up with variable-length instructions in practice for this reason.
  • Stack machines need more instructions because they have to do stack management, which means the interpreter loop overhead is more noticeable (see the sketch below this list).
  • Register machines may be more prone to leaking implementation details into the hosted programming language. For example, Lua has a limit of 200 registers per stack frame, which directly translates into a limit of 200 local variables per function.
  • It's said that register machines are faster in general. That seems plausible to me because of the greater amount of work per instruction (and correspondingly fewer dispatches), but I don't have any real data about that.
  • I could imagine (now it gets really wonky...) that it's easier to JIT register-based instructions, but that may be totally wrong and also very dependent on other implementation details.
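
To make the instruction-count trade-off concrete, here is a minimal sketch of how `a = b + c` might be executed under each model. The opcode names and layouts are invented for illustration and are not taken from any of the VMs mentioned below:

```rust
// Stack machine: small opcodes, operands live on an implicit stack.
// `a = b + c` needs four instructions: LOAD b; LOAD c; ADD; STORE a.
#[derive(Clone, Copy)]
enum StackOp {
    LoadLocal(u8),
    StoreLocal(u8),
    Add,
}

// Register machine: fewer but wider instructions with explicit operands.
// `a = b + c` is a single instruction: ADD dst=a, lhs=b, rhs=c.
#[derive(Clone, Copy)]
enum RegOp {
    Add { dst: u8, lhs: u8, rhs: u8 },
}

fn run_stack(code: &[StackOp], locals: &mut [i64]) {
    let mut stack = Vec::new();
    for op in code {
        // Four dispatches for `a = b + c`, plus the pushes and pops.
        match *op {
            StackOp::LoadLocal(i) => stack.push(locals[i as usize]),
            StackOp::StoreLocal(i) => locals[i as usize] = stack.pop().unwrap(),
            StackOp::Add => {
                let rhs = stack.pop().unwrap();
                let lhs = stack.pop().unwrap();
                stack.push(lhs + rhs);
            }
        }
    }
}

fn run_reg(code: &[RegOp], regs: &mut [i64]) {
    for op in code {
        // A single dispatch does all the work for `a = b + c`.
        match *op {
            RegOp::Add { dst, lhs, rhs } => {
                regs[dst as usize] = regs[lhs as usize] + regs[rhs as usize];
            }
        }
    }
}

fn main() {
    // `a` in slot/register 0, `b` = 2 in slot 1, `c` = 3 in slot 2.
    let mut locals = [0i64, 2, 3];
    run_stack(
        &[
            StackOp::LoadLocal(1),
            StackOp::LoadLocal(2),
            StackOp::Add,
            StackOp::StoreLocal(0),
        ],
        &mut locals,
    );

    let mut regs = [0i64, 2, 3];
    run_reg(&[RegOp::Add { dst: 0, lhs: 1, rhs: 2 }], &mut regs);

    assert_eq!(locals[0], 5);
    assert_eq!(regs[0], 5);
}
```

The stack version executes four opcodes (plus the pushes and pops) where the register version executes one, which is the interpreter-loop overhead mentioned above; on the other hand, each stack opcode carries at most one small operand, which is what keeps the encoding compact.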

Examples:

Stack-based VMs

  • CPython
  • Ruby
  • JVM

Register-based VMs

  • Lua
  • LuaJIT
  • Dalvik

Alternatives:

There are belt machines. I don't know of any practical VM that uses this approach; I came across the concept via the Mill CPU architecture. May be worth evaluating.

@uazu

uazu commented Apr 26, 2018

Another question is whether the compiled bytecode (or word-code, etc) is intended for distribution or just for running in memory. There are different trade-offs. Stack-based is more compact for distribution. Also see WASM for a "structured stack machine" variant designed for JITting.

@madmalik
Author

Stack-based is more compact for distribution.

It would be interesting to see how variable-width register bytecode would fare in that regard, and whether there'd be a performance penalty for that approach.
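
Purely as a thought experiment (this does not describe any existing VM), a variable-width register encoding could look like a LEB128-style scheme where register indices below 128 take one byte and larger ones take two; the extra branch while decoding each operand is where a performance penalty might show up:

```rust
// Hypothetical variable-width operand encoding: register indices below 128
// fit in one byte; larger ones set the high bit and spill into a second byte.
fn encode_reg(out: &mut Vec<u8>, reg: u16) {
    if reg < 128 {
        out.push(reg as u8);
    } else {
        out.push(0x80 | (reg & 0x7f) as u8); // low 7 bits + continuation flag
        out.push((reg >> 7) as u8); // remaining high bits
    }
}

fn decode_reg(code: &[u8], pc: &mut usize) -> u16 {
    let b = code[*pc];
    *pc += 1;
    if b & 0x80 == 0 {
        b as u16
    } else {
        // The decoder pays an extra branch and an extra load here.
        let hi = code[*pc] as u16;
        *pc += 1;
        ((b & 0x7f) as u16) | (hi << 7)
    }
}

fn main() {
    let mut buf = Vec::new();
    encode_reg(&mut buf, 5); // 1 byte
    encode_reg(&mut buf, 300); // 2 bytes
    let mut pc = 0;
    assert_eq!(decode_reg(&buf, &mut pc), 5);
    assert_eq!(decode_reg(&buf, &mut pc), 300);
}
```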

@cipriancraciun

I think we must not exclude another important technique for writing VMs, that is: AST (or similar) based.

I say this technique is important mainly because many "young" or "simple" implementations choose this approach, especially since it is trivial and requires almost no additional effort or knowledge. For example, when I investigated Scheme/Lisp implementations in Rust, almost two thirds were AST-based (I counted all of them, regardless of "maturity").

Under this AST-based VM category I think there are two main approaches, distinguished mainly by their "efficiency":

  • actual parser-AST-evaluating VMs, usually recursive;
  • "expression"-based evaluating VMs, i.e. ones that have a compiler that translates the parser AST into a more "low-level" expression-based form;

For example, in my Scheme implementation ( https://github.com/volution/vonuvoli-scheme ) I took the second, expression-based evaluator route, mainly for two reasons:

  • safety, as one can't misinterpret byte-codes or argument counts taken from the stack; moreover, the Rust compiler actually performs many of the sanity checks;
  • it can be used as an intermediate language between the compiler and any other form of VM (in fact, somewhat like how Rust uses LLVM);

The expression structures can be seen at the following link:
https://github.com/volution/vonuvoli-scheme/blob/development/sources/expressions.rs
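
For readers unfamiliar with the style, here is a greatly reduced sketch of the shape such an expression form and its evaluator can take in Rust. It is not the actual vonuvoli-scheme type, just an illustration of the idea that names and structure are resolved at compile time while evaluation remains a recursive match over an enum:

```rust
// Simplified "expression" form: the parser AST has already been lowered
// (variables resolved to frame slots, etc.), but evaluation is still a
// recursive walk over a Rust enum rather than a flat bytecode array.
enum Expression {
    Constant(i64),
    LocalRef(usize), // resolved to a frame slot index at compile time
    Add(Box<Expression>, Box<Expression>),
    If(Box<Expression>, Box<Expression>, Box<Expression>),
}

fn evaluate(expr: &Expression, locals: &[i64]) -> i64 {
    match expr {
        Expression::Constant(value) => *value,
        Expression::LocalRef(slot) => locals[*slot],
        Expression::Add(lhs, rhs) => evaluate(lhs, locals) + evaluate(rhs, locals),
        Expression::If(cond, then_branch, else_branch) => {
            if evaluate(cond, locals) != 0 {
                evaluate(then_branch, locals)
            } else {
                evaluate(else_branch, locals)
            }
        }
    }
}

fn main() {
    // (if x then x + 40 else 0), with x bound to frame slot 0.
    let expr = Expression::If(
        Box::new(Expression::LocalRef(0)),
        Box::new(Expression::Add(
            Box::new(Expression::LocalRef(0)),
            Box::new(Expression::Constant(40)),
        )),
        Box::new(Expression::Constant(0)),
    );
    println!("{}", evaluate(&expr, &[2])); // prints 42
}
```

The Rust type system rules out malformed programs (the "safety" point above), at the cost of chasing pointers through Boxes instead of stepping through a flat instruction stream.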


On second thought, perhaps there is a fourth kind of VM: the "transpiled" ones, i.e. the hosted language is used to generate Rust code which is then compiled and executed (or, by extension, the hosted language can be "transpiled" to any other host language and run).

Although this fourth category isn't actually a VM, my take on this thread is that it is about "how" one can actually evaluate the hosted code.

@madmalik
Author

madmalik commented May 6, 2018

that is: AST (or similar) based.

I excluded them because I was only thinking of what you called the first approach. Literal AST walking is just too slow and cumbersome, since a parser AST is not made to be executed.

The second approach seems a lot more interesting. IMO, the lines to actual bytecode are quite blurry: the more optimized and low-level the expression-based form becomes, the more it will look like bytecode.

As I see it, it's more about the construction of the intermediate form than about an inherent difference in abstraction. Encoding and decoding expressions/operations by hand is probably more compact and gives more control, but it's clear that using standard Rust enums and structs is simpler and safer. The real performance difference remains to be seen.

Also, if the hot paths are jitted anyway, the baseline interpreter performance may become less important.

@cipriancraciun

IMO, the lines to actual bytecode are quite blurry: the more optimized and low-level the expression-based form becomes, the more it will look like bytecode.

Initially (when I started implementing my interpreter) I agreed with your definition of bytecode, i.e. anything that is "constructed" from the AST, but which is not native code.

However, after a few months I changed my opinion, and now consider these "expression"-based interpreters a middle ground between AST and bytecode, neither one nor the other: definitely better than a raw AST, but clearly less efficient than bytecode.


The real performance difference remains to be seen.

I have compared my interpreter's performance with Chibi-Scheme, which is C-based and does feature an actual bytecode VM, and the difference is clear: their bytecode VM is a few times faster than my expression VM for "CPU-intensive expressions" (i.e. arithmetic, looping, etc.). (Looking at their code, the overhead outside the VM should be similar, therefore I would consider this a "fair" comparison.)

Also, if the hot paths are jitted anyway, the baseline interpreter performance may become less important.

Indeed, I agree with the point you made above, because when I compare "real-life" Scheme code (like, for example, re-implementing a simple cat or find tool), my implementation is almost 20x faster, mainly because I implement all the builtin functionality in Rust, whereas they implement the majority of the builtin functionality in Scheme (and thus it is interpreted).


Therefore, my take for anyone wanting to implement a hosted language is: instead of investing a lot of time to design and implement a bytecode VM (either stack- or register-based), one could spend that effort implementing more efficient and extensive builtin functionality (i.e. utility functions and syntaxes) directly in Rust, so that the majority of the CPU time is spent doing actual "work" instead of "interpreting" bytecode. (Because no matter how efficient a VM is, it can't beat native code.) :)
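
A rough sketch of what that advice amounts to in code; the `Value` type, the builtin signature, and the table lookup are all invented for this example, but the point is that once the interpreter dispatches to a builtin, everything below it is compiled Rust rather than interpreted hosted code:

```rust
use std::collections::HashMap;

// Hypothetical value type and builtin signature for illustration only.
#[derive(Debug, Clone)]
enum Value {
    Number(f64),
    Text(String),
}

type Builtin = fn(&[Value]) -> Result<Value, String>;

// A builtin implemented entirely in Rust: no hosted code is interpreted here.
fn builtin_string_length(args: &[Value]) -> Result<Value, String> {
    match args {
        [Value::Text(s)] => Ok(Value::Number(s.chars().count() as f64)),
        _ => Err("string-length: expected one string".into()),
    }
}

// The interpreter only needs to look up and call the function pointer.
fn builtins() -> HashMap<&'static str, Builtin> {
    let mut table: HashMap<&'static str, Builtin> = HashMap::new();
    table.insert("string-length", builtin_string_length);
    table
}

fn main() {
    let table = builtins();
    let result = table["string-length"](&[Value::Text("hello".into())]);
    println!("{:?}", result); // Ok(Number(5.0))
}
```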
