[WIP] Replace eval with bytecode-based VM #3307
base: main
Conversation
crates/typst/src/vm/opcodes.rs
```rust
#[cfg(target_arch = "wasm32")]
return f();

#[cfg(not(target_arch = "wasm32"))]
stacker::maybe_grow(32 * 1024, 2 * 1024 * 1024, f)
```
does `while` loop use a lot of stack?
It creates a scope which allocates ~10 kB of stack space. I don't know (yet) whether the stackers are needed; I'll see as I go along.
Please forgive the horrendous commit history; VSCode was having a seizure, it seems, as I was trying to merge 👀
The aim of this PR is to completely replace the `eval` module with two new ones: `compiler` and `vm`.

Motivation
Currently, the compilation cycle of a typst document is the `parse-eval-typeset-export` pipeline, as can be seen in the figure below. The document is first parsed, then evaluated, then typeset¹, and finally exported (to PDF in this example). In principle, this is all well and fine. After all, one would expect evaluation to be slow, or at least slower than Python, but that would not be consequential since the actual layout, text shaping, etc. would be the slowest part. As it turns out, however, this could not be further from the truth. With the rise of awesome packages like tablex by @PgBiel, Cetz by @johannes-wolf, and others, the demand on the evaluation subsystem has grown.

Additionally, as typst moves more and more towards processing and displaying data natively (see my blog post on the topic), I believe it is reasonable to expect that this pressure on evaluation will grow. As such, it becomes imperative to evaluate its performance and ensure that it does not become a bottleneck.
And while evaluation used to be a relatively small part of the compilation process, thanks to all of the performance improvements elsewhere (see #2398, #2399, #2420, #2400, #2426, #2504, #2693, #2718, #2759, #2777, #2801, #2973, and #2989, to name only my own performance-focused PRs), it has become apparent that evaluating a typst document (especially one with tablex tables) has become too slow. As can be seen in the second image², all of the light blue areas are related to code evaluation, and over a complete document they account for a significant part of the total compilation time.
The solution
We can therefore take a page out of our predecessors' book: Java, Python, and ActionScript³, to name a few. We can use a binary representation of our code (henceforth referred to as the bytecode), which is a list of instructions (referred to as opcodes) that can be executed by a Virtual Machine (VM). The idea is to create a register-based VM (since they generally exhibit better performance, cf. @frozolotl) that executes the code after it has been compiled.
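As a minimal illustration of the idea (with hypothetical opcode names, not typst's actual instruction set from `crates/typst/src/vm/opcodes.rs`), a register-based VM boils down to a loop that dispatches on each instruction and operates directly on numbered registers:

```rust
/// One instruction of a toy register-based VM. The opcodes here are
/// hypothetical stand-ins, not typst's real instruction set.
#[derive(Clone, Copy)]
enum Opcode {
    /// Load a constant into register `dst`.
    LoadConst { dst: u8, value: i64 },
    /// Add registers `a` and `b`, storing the result in `dst`.
    Add { dst: u8, a: u8, b: u8 },
    /// Stop execution and return the value in register `src`.
    Ret { src: u8 },
}

/// Execute a compiled sequence of opcodes and return the result.
fn run(code: &[Opcode]) -> i64 {
    let mut regs = [0i64; 256];
    for op in code {
        match *op {
            Opcode::LoadConst { dst, value } => regs[dst as usize] = value,
            Opcode::Add { dst, a, b } => {
                regs[dst as usize] = regs[a as usize] + regs[b as usize]
            }
            Opcode::Ret { src } => return regs[src as usize],
        }
    }
    0
}

fn main() {
    // Bytecode for `1 + 2`: compiled once, executable any number of times.
    let code = [
        Opcode::LoadConst { dst: 0, value: 1 },
        Opcode::LoadConst { dst: 1, value: 2 },
        Opcode::Add { dst: 2, a: 0, b: 1 },
        Opcode::Ret { src: 2 },
    ];
    println!("{}", run(&code)); // prints 3
}
```

Because the instruction sequence is plain data, it can be cached and re-executed without ever touching the syntax tree again.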
Indeed, think of the following code:

The closure (`it => strong(it)`) needs to be executed for every single piece of text in the document. Every time, we need to re-traverse the tree of syntax nodes to evaluate it. With the bytecode VM, by contrast, the bytecode can be reused and can be very simple. (Note that in practice it's even simpler than that, but harder to understand.)
Indeed, since we can now re-use our compiled closures, we can skip the tree traversal every time the function is called, bake in some optimizations, and make your document compile faster. This really adds up over all of the closure calls in a large document!
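The compile-once, run-many idea behind reusing compiled closures can be sketched as follows; the names, the `Double` opcode, and the toy `compile` step are hypothetical stand-ins for typst's real compiler and opcodes:

```rust
use std::collections::HashMap;

/// A stand-in opcode set: the only operation is doubling the argument.
enum Op {
    Double,
}

/// Imagine an expensive syntax-tree traversal happening here, exactly once
/// per distinct closure.
fn compile(source: &str) -> Vec<Op> {
    match source {
        "x => x * 2" => vec![Op::Double],
        _ => vec![],
    }
}

/// Caches compiled closures so repeated calls skip the tree traversal.
struct Cache {
    compiled: HashMap<String, Vec<Op>>,
}

impl Cache {
    fn call(&mut self, source: &str, arg: i64) -> i64 {
        // Compile on first use, reuse the bytecode afterwards.
        let code = self
            .compiled
            .entry(source.to_string())
            .or_insert_with(|| compile(source));
        let mut value = arg;
        for op in code {
            match op {
                Op::Double => value *= 2,
            }
        }
        value
    }
}

fn main() {
    let mut cache = Cache { compiled: HashMap::new() };
    // The closure is compiled once, then called many times without
    // re-traversing the syntax tree.
    let total: i64 = (1..=3).map(|i| cache.call("x => x * 2", i)).sum();
    println!("{total}"); // prints 12
}
```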
The new architecture would therefore look as follows:
Pros vs Cons
Here are some advantages of this approach in no particular order:
Here are some disadvantages of this approach:
Current performance gains
As of writing this, the VM is not yet complete but already shows a 2-3x improvement in evaluation speed compared to main. And I expect that as it starts working and I optimize the VM itself, performance will improve. The main bottleneck in the VM is currently instruction decoding which is ungodly slow compared to the rest but I will be improving that in the coming days/couple of weeks.
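To see why instruction decoding can dominate, consider a sketch where instructions are packed into a flat byte buffer (a hypothetical encoding, not typst's actual one): before doing any useful work, every step must read the opcode byte and its operands and advance the program counter.

```rust
/// Run a toy packed-byte bytecode. The decoding bookkeeping at the top of
/// each iteration is the overhead being discussed; the encoding below is
/// made up for illustration.
fn run(bytecode: &[u8]) -> i64 {
    let mut regs = [0i64; 16];
    let mut pc = 0;
    while pc < bytecode.len() {
        // Decode: read the opcode byte and its operands on *every* step.
        match bytecode[pc] {
            // 0x01 dst imm: load an immediate byte into a register.
            0x01 => {
                let (dst, imm) = (bytecode[pc + 1], bytecode[pc + 2]);
                regs[dst as usize] = imm as i64;
                pc += 3;
            }
            // 0x02 dst a b: add two registers into `dst`.
            0x02 => {
                let (dst, a, b) = (bytecode[pc + 1], bytecode[pc + 2], bytecode[pc + 3]);
                regs[dst as usize] = regs[a as usize] + regs[b as usize];
                pc += 4;
            }
            // 0x03 src: return the value in register `src`.
            0x03 => return regs[bytecode[pc + 1] as usize],
            _ => panic!("bad opcode"),
        }
    }
    0
}

fn main() {
    // `1 + 2` as packed bytes.
    let code = [0x01, 0, 1, 0x01, 1, 2, 0x02, 2, 0, 1, 0x03, 2];
    println!("{}", run(&code)); // prints 3
}
```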
Current state
Footnotes
1. Typesetting does call into evaluation for the closures passed to many built-in functions such as `layout`, `style`, `locate`, etc., as well as for any supplements, show rules, etc. running on your document. ↩
2. This plot was made using the new `--timings` argument available since New performance timings #3096, and was run on my thesis. ↩
3. Maybe not the way we necessarily want to go, RIP ActionScript. ↩