[WIP] Replace eval with bytecode-based VM #3307
base: main
Conversation
crates/typst/src/vm/opcodes.rs
```rust
#[cfg(target_arch = "wasm32")]
return f();

#[cfg(not(target_arch = "wasm32"))]
stacker::maybe_grow(32 * 1024, 2 * 1024 * 1024, f)
```
does `while` loop use a lot of stack?
It creates a scope which allocates ~10 kB of stack space. I don't know (yet) whether the stackers are needed; I'll see as I go along.
Please forgive the horrendous commit history; VSCode was having a seizure, it seems, as I was trying to merge 👀
The aim of this PR is to completely replace the `eval` module with two new ones: `compiler` and `vm`.

Motivation
Currently, the compilation cycle of a typst document is the `parse-eval-typeset-export` pipeline, as can be seen in the figure below. The document is first parsed, then evaluated, then typeset¹, and finally exported (to PDF in this example). In principle, this is all well and fine. After all, one would expect evaluation to be slow, or at least slower than Python, but that would not be consequential since the actual layout, text shaping, etc. would be the slowest part. As it turns out, however, this could not be further from the truth. With the rise of awesome packages like tablex by @PgBiel, Cetz by @johannes-wolf, and others, the demand on the evaluation subsystem has grown.

Additionally, as typst moves more and more towards processing and displaying data natively (see my blog post on the topic), I believe it is reasonable to expect that this pressure on evaluation will grow. As such, it becomes imperative to evaluate its performance and ensure that it does not become a bottleneck.
And while evaluation used to be a relatively small part of the compilation process, thanks to all of the performance improvements elsewhere (see #2398, #2399, #2420, #2400, #2426, #2504, #2693, #2718, #2759, #2777, #2801, #2973, and #2989, to name only my own performance-focused PRs), it has become apparent that evaluating a typst document (especially one with tablex tables) has become too slow. As can be seen in the second image², all of the light blue areas are related to code evaluation, and over a complete document they account for a significant part of the total compilation time.
The solution
We can therefore take a page out of our predecessors' book: Java, Python, and ActionScript³, to name a few. We can use a binary representation of our code (henceforth referred to as the bytecode), which is a list of instructions (referred to as opcodes) that can be executed by a Virtual Machine (VM). The idea is to create a register-based VM (since they generally exhibit better performance, cf. @frozolotl) that executes the code after it has been compiled.
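As a minimal illustration of the idea (with hypothetical opcode names, not typst's actual instruction set from `crates/typst/src/vm/opcodes.rs`), a register-based VM boils down to a loop that dispatches on each instruction and operates directly on numbered registers:

```rust
/// One instruction of a toy register-based VM. The opcodes here are
/// hypothetical stand-ins, not typst's real instruction set.
#[derive(Clone, Copy)]
enum Opcode {
    /// Load a constant into register `dst`.
    LoadConst { dst: u8, value: i64 },
    /// Add registers `a` and `b`, storing the result in `dst`.
    Add { dst: u8, a: u8, b: u8 },
    /// Stop execution and return the value in register `src`.
    Ret { src: u8 },
}

/// Execute a compiled sequence of opcodes and return the result.
fn run(code: &[Opcode]) -> i64 {
    let mut regs = [0i64; 256];
    for op in code {
        match *op {
            Opcode::LoadConst { dst, value } => regs[dst as usize] = value,
            Opcode::Add { dst, a, b } => {
                regs[dst as usize] = regs[a as usize] + regs[b as usize]
            }
            Opcode::Ret { src } => return regs[src as usize],
        }
    }
    0
}

fn main() {
    // Bytecode for `1 + 2`: compiled once, executable any number of times.
    let code = [
        Opcode::LoadConst { dst: 0, value: 1 },
        Opcode::LoadConst { dst: 1, value: 2 },
        Opcode::Add { dst: 2, a: 0, b: 1 },
        Opcode::Ret { src: 2 },
    ];
    println!("{}", run(&code)); // prints 3
}
```

Because the instruction sequence is plain data, it can be cached and re-executed without ever touching the syntax tree again.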
Indeed, think of the following code:

The closure (`it => strong(it)`) needs to be executed for every single piece of text in the document. Every time, we need to re-traverse the tree of syntax nodes to evaluate it. With the bytecode VM, by contrast, the bytecode can be reused and can be very simple. (Note that in practice it's even simpler than that, but harder to understand.)
Indeed, since we can now re-use our compiled closures, we can skip the tree traversal every time the function is called, bake in some optimizations, and make your document compile faster. This really adds up over all of the closure calls in a large document!
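The compile-once, run-many idea behind reusing compiled closures can be sketched as follows; the names, the `Double` opcode, and the toy `compile` step are hypothetical stand-ins for typst's real compiler and opcodes:

```rust
use std::collections::HashMap;

/// A stand-in opcode set: the only operation is doubling the argument.
enum Op {
    Double,
}

/// Imagine an expensive syntax-tree traversal happening here, exactly once
/// per distinct closure.
fn compile(source: &str) -> Vec<Op> {
    match source {
        "x => x * 2" => vec![Op::Double],
        _ => vec![],
    }
}

/// Caches compiled closures so repeated calls skip the tree traversal.
struct Cache {
    compiled: HashMap<String, Vec<Op>>,
}

impl Cache {
    fn call(&mut self, source: &str, arg: i64) -> i64 {
        // Compile on first use, reuse the bytecode afterwards.
        let code = self
            .compiled
            .entry(source.to_string())
            .or_insert_with(|| compile(source));
        let mut value = arg;
        for op in code {
            match op {
                Op::Double => value *= 2,
            }
        }
        value
    }
}

fn main() {
    let mut cache = Cache { compiled: HashMap::new() };
    // The closure is compiled once, then called many times without
    // re-traversing the syntax tree.
    let total: i64 = (1..=3).map(|i| cache.call("x => x * 2", i)).sum();
    println!("{total}"); // prints 12
}
```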
The new architecture would therefore look as follows:
Pros vs Cons
Here are some advantages of this approach in no particular order:
Here are some disadvantages of this approach:
Current performance gains
As of writing this, the VM is not yet complete but already shows a 2-3x improvement in evaluation speed compared to main. And I expect that as it starts working and I optimize the VM itself, performance will improve. The main bottleneck in the VM is currently instruction decoding which is ungodly slow compared to the rest but I will be improving that in the coming days/couple of weeks.
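To see why instruction decoding can dominate, consider a sketch where instructions are packed into a flat byte buffer (a hypothetical encoding, not typst's actual one): before doing any useful work, every step must read the opcode byte and its operands and advance the program counter.

```rust
/// Run a toy packed-byte bytecode. The decoding bookkeeping at the top of
/// each iteration is the overhead being discussed; the encoding below is
/// made up for illustration.
fn run(bytecode: &[u8]) -> i64 {
    let mut regs = [0i64; 16];
    let mut pc = 0;
    while pc < bytecode.len() {
        // Decode: read the opcode byte and its operands on *every* step.
        match bytecode[pc] {
            // 0x01 dst imm: load an immediate byte into a register.
            0x01 => {
                let (dst, imm) = (bytecode[pc + 1], bytecode[pc + 2]);
                regs[dst as usize] = imm as i64;
                pc += 3;
            }
            // 0x02 dst a b: add two registers into `dst`.
            0x02 => {
                let (dst, a, b) = (bytecode[pc + 1], bytecode[pc + 2], bytecode[pc + 3]);
                regs[dst as usize] = regs[a as usize] + regs[b as usize];
                pc += 4;
            }
            // 0x03 src: return the value in register `src`.
            0x03 => return regs[bytecode[pc + 1] as usize],
            _ => panic!("bad opcode"),
        }
    }
    0
}

fn main() {
    // `1 + 2` as packed bytes.
    let code = [0x01, 0, 1, 0x01, 1, 2, 0x02, 2, 0, 1, 0x03, 2];
    println!("{}", run(&code)); // prints 3
}
```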
Current state
Footnotes
1. Typesetting does call into evaluation for the closures passed to many built-in functions such as `layout`, `style`, `locate`, etc., as well as for any supplements, show rules, etc. running on your document. ↩
2. This plot was made using the new `--timings` argument available since New performance timings #3096, and was run on my thesis. ↩
3. Maybe not the way we necessarily want to go, RIP ActionScript. ↩