
[WIP] Replace eval with bytecode-based VM #3307

Draft · wants to merge 103 commits into base: main
Conversation

@Dherse Dherse commented Jan 30, 2024

The aim of this PR is to completely replace the eval module with two new ones: compiler and vm.

Motivation

Currently, compiling a typst document follows the parse-eval-typeset-export pipeline shown in the figure below: the document is first parsed, then evaluated, then typeset1, and finally exported (to PDF in this example). In principle, this is all well and fine: one might expect evaluation to be slow, perhaps slower than Python even, and assume that this does not matter because the actual layout, text shaping, etc. would be the slowest part. But as it turns out, this could not be further from the truth. Indeed, with the rise of awesome packages like tablex by @PgBiel, Cetz by @johannes-wolf, and others, the demand on the evaluation subsystem has grown.

Additionally, as typst moves more and more towards processing and displaying data natively (see my blog post on the topic), I believe it is reasonable to expect this pressure on evaluation to grow. It therefore becomes imperative to evaluate its performance and ensure that it does not become a bottleneck.

And while evaluation used to be a comparatively small part of the compilation process, all of the improvements to other parts of the pipeline (see #2398, #2399, #2420, #2400, #2426, #2504, #2693, #2718, #2759, #2777, #2801, #2973, and #2989 to name only my own performance-focused PRs) have made it apparent that evaluating a typst document (especially one with tablex tables) has become too slow. As can be seen in the second image2, all of the light blue areas are related to code evaluation, and over a complete document they account for a significant part of the total compilation time.

[Image "eval (2)": the current parse-eval-typeset-export pipeline]

[Image: profiler timeline; the light blue areas are code evaluation]

The solution

We can therefore take a page out of our predecessors' books: Java, Python, and ActionScript3, to name a few. We can use a binary representation of our code (henceforth referred to as the bytecode), which is a list of instructions (referred to as opcodes) that can be executed by a Virtual Machine (VM). The idea is to create a register-based VM (since these generally exhibit better performance, cf. @frozolotl) that executes the code after it has been compiled.
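To make "register-based VM" concrete, here is a minimal hedged sketch in Rust. The opcode set, register layout, and names (`Op`, `run`) are invented for illustration and are not the ones used in this PR:

```rust
// Illustrative only: a tiny register-based VM. Instructions name the
// registers they read and write, rather than pushing/popping a stack.
#[derive(Clone, Copy)]
enum Op {
    /// Load a constant into register `out`.
    LoadConst { out: usize, value: i64 },
    /// Add registers `a` and `b`, storing the result in `out`.
    Add { a: usize, b: usize, out: usize },
    /// Stop and return the value in register `value`.
    Return { value: usize },
}

fn run(code: &[Op]) -> i64 {
    let mut regs = [0i64; 16];
    for op in code {
        match *op {
            Op::LoadConst { out, value } => regs[out] = value,
            Op::Add { a, b, out } => regs[out] = regs[a] + regs[b],
            Op::Return { value } => return regs[value],
        }
    }
    0
}

fn main() {
    // "Compiled" once; the same instruction list can then be executed
    // any number of times without re-walking a syntax tree.
    let code = [
        Op::LoadConst { out: 0, value: 2 },
        Op::LoadConst { out: 1, value: 3 },
        Op::Add { a: 0, b: 1, out: 2 },
        Op::Return { value: 2 },
    ];
    println!("{}", run(&code)); // prints 5
}
```

A register machine tends to need fewer instructions than a stack machine for the same expression, since operands are addressed directly instead of being shuffled through a stack.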

Indeed, think of the following code:

#show text: it => strong(it)

Hello, world!

The closure (it => strong(it)) needs to be executed for every single piece of text in the document. Today, every call re-traverses the tree of syntax nodes to evaluate it. With the bytecode VM, the compiled bytecode can be reused, and can be as simple as:

Args { out: R0 }
PushArg { out: R0, value: A0}
Call { callee: strong, args: R0, out: R1 }
Return { value: R1 }

(Note that in practice it's even simpler than that but harder to understand)

Indeed, since we can now reuse our compiled closures, we can skip the tree traversal every time the function is called, bake in some optimizations, and make your document compile faster. This really adds up over all of the closure calls in a large document!
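The reuse can be sketched as follows. This is a hypothetical illustration, not the PR's actual representation: the names (`CompiledClosure`, `CallStrong`) are made up, and the "strong" built-in is modeled as simply doubling an integer so the example stays self-contained:

```rust
use std::rc::Rc;

// Illustrative only: a closure is compiled to bytecode once, and the
// same shared instruction list is executed for every subsequent call.
#[derive(Clone, Copy)]
enum Op {
    /// Copy argument `arg` into register `out`.
    LoadArg { arg: usize, out: usize },
    /// Stand-in for calling a built-in (modeled as doubling).
    CallStrong { input: usize, out: usize },
    /// Return the value in register `value`.
    Return { value: usize },
}

struct CompiledClosure {
    code: Rc<[Op]>,
}

impl CompiledClosure {
    fn call(&self, args: &[i64]) -> i64 {
        let mut regs = [0i64; 8];
        for op in self.code.iter() {
            match *op {
                Op::LoadArg { arg, out } => regs[out] = args[arg],
                Op::CallStrong { input, out } => regs[out] = regs[input] * 2,
                Op::Return { value } => return regs[value],
            }
        }
        0
    }
}

fn main() {
    // Compiled once...
    let closure = CompiledClosure {
        code: vec![
            Op::LoadArg { arg: 0, out: 0 },
            Op::CallStrong { input: 0, out: 1 },
            Op::Return { value: 1 },
        ]
        .into(),
    };
    // ...then reused for every call, with no tree traversal.
    let total: i64 = (1..=3).map(|x| closure.call(&[x])).sum();
    println!("{total}"); // prints 12
}
```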

The new architecture would therefore look as follows:

[Image "eval (3)": the proposed architecture]

Pros vs Cons

Here are some advantages of this approach in no particular order:

  • No need to traverse the tree of syntax nodes for evaluation
  • Less code needed to eval typst source => better instruction cache efficiency
  • Less data needed to eval typst source => better data cache efficiency
  • Fewer allocations
  • Generally faster (fewer instructions executed per line of code)

Here are some disadvantages of this approach:

  • Harder to debug
  • Somewhat harder to maintain
  • Harder to prototype new syntax
  • A bit overkill?

Current performance gains

As of writing this, the VM is not yet complete but already shows a 2-3x improvement in evaluation speed compared to main, and I expect performance to improve further as it matures and I optimize the VM itself. The main bottleneck in the VM is currently instruction decoding, which is ungodly slow compared to the rest, but I will be improving that in the coming days/weeks.
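For context on why decoding can dominate, here is a hedged sketch of a packed byte-stream interpreter. The encoding below (one opcode byte followed by operand bytes) is invented for illustration and is not the PR's actual format; the point is that the reads and the branch on the opcode byte run for every executed instruction, right in the hot loop:

```rust
// Illustrative only: each instruction is an opcode byte plus operand
// bytes. The decode step (byte reads + match) repeats for every
// instruction executed, which is why it can dominate the profile of
// a simple interpreter.
fn run(bytecode: &[u8]) -> i64 {
    let mut regs = [0i64; 16];
    let mut pc = 0;
    loop {
        match bytecode[pc] {
            // 0x01 <out> <imm>: load an immediate (i8) into a register.
            0x01 => {
                let out = bytecode[pc + 1] as usize;
                regs[out] = bytecode[pc + 2] as i8 as i64;
                pc += 3;
            }
            // 0x02 <a> <b> <out>: add two registers.
            0x02 => {
                let a = bytecode[pc + 1] as usize;
                let b = bytecode[pc + 2] as usize;
                let out = bytecode[pc + 3] as usize;
                regs[out] = regs[a] + regs[b];
                pc += 4;
            }
            // 0x03 <value>: return a register.
            0x03 => return regs[bytecode[pc + 1] as usize],
            _ => panic!("bad opcode"),
        }
    }
}

fn main() {
    // load 2 -> r0; load 3 -> r1; add r0 r1 -> r2; return r2
    let code = [0x01, 0, 2, 0x01, 1, 3, 0x02, 0, 1, 2, 0x03, 2];
    println!("{}", run(&code)); // prints 5
}
```

Common mitigations include wider fixed-size instruction words (decode becomes a cheap field extraction) or decoding each instruction once into an in-memory enum form, trading memory for decode time.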

Current state

  • Functions in math
  • Test math
  • Plugins
  • Fix remaining bugs (if any)
  • Fix spans and tests (some spans have changed)

Footnotes

  1. Typesetting does call back into evaluation for closures passed to many built-in functions such as layout, style, locate, etc., as well as for any supplements, show rules, etc. running on your document;

  2. This plot was made using the new --timings argument available since New performance timings #3096, and was run on my thesis.

  3. Maybe not the way we necessarily want to go, RIP ActionScript

@Dherse Dherse marked this pull request as draft January 30, 2024 22:45
@Dherse Dherse added the perf Related to performance and optimization. label Jan 30, 2024
return f();

#[cfg(not(target_arch = "wasm32"))]
stacker::maybe_grow(32 * 1024, 2 * 1024 * 1024, f)
Contributor
Does the while loop use a lot of stack?

@Dherse Dherse (Author) Jan 31, 2024

It creates a scope which allocates ~10 kB of stack space. I don't know (yet) whether the stackers are needed; I'll see as I go along.

@Dherse Dherse commented Feb 4, 2024

Please forgive the horrendous commit history, VSCode was having a seizure it seems as I was trying to merge 👀

@Dherse Dherse mentioned this pull request Feb 21, 2024
@xentec xentec mentioned this pull request Mar 15, 2024
@laurmaedje laurmaedje changed the title [WIP] Replace eval with bytecode based VM Replace eval with bytecode-based VM Apr 1, 2024
@laurmaedje laurmaedje added the waiting-on-author Pull request waits on author label Apr 2, 2024
@laurmaedje laurmaedje changed the title Replace eval with bytecode-based VM [WIP] Replace eval with bytecode-based VM Apr 2, 2024
@laurmaedje laurmaedje added waiting-on-author Pull request waits on author and removed waiting-on-author Pull request waits on author labels Apr 2, 2024