Skip to content

Latest commit

 

History

History
58 lines (42 loc) · 2.01 KB

startup.md

File metadata and controls

58 lines (42 loc) · 2.01 KB

Startup benchmark

What's this

startup.sh is a benchmark intended to measure start-up time and memory footprint. For each runtimes, it instantiates a large module (ffmpeg.wasm, which is about 20MB) and executes only a small part of it. (just print the -version message)

Result

Result

  • Run on a macOS/amd64 laptop:

    MacBook Pro (15-inch, 2018)
    2.2 GHz 6-Core Intel Core i7
    
  • Raw values

  • Plotted with plot.py.

Observations

  • The default wasm3 and wasmi with lazy option (wasmi (lazy) row) perform best. However, it's mainly because of their lazy compilation/validation. With lazy compilation/validation disabled, they have certain compilation/validation overhead as shown in the wasm3 (no lazy) and wasmi rows.

    Note: While lazy validation is explicitly allowed by the spec, it's a bit controversial and thus many of runtimes don't implement it. Specifically, toywasm intentionally doesn't implement it because it complicates shared modules.

  • Toywasm and WAMR classic interpreter are second best. It's expected as they don't involve complex compilation processes.

  • WAMR fast-jit seems lightweight for a JIT-based runtime as it's advertized. It also uses a lazy compilation strategy by default. Unlike wasm3 and wasmi, it doesn't defer the validation though. The performance with lazy compilation is a bit unstable, probably because of a naive locking.

  • Toywasm's annotations have small but measurable overheads. cf. Overhead of the annotations (ffmpeg)

  • It's common for JIT-based runtimes to spawn many compilation threads to improve startup time. (thus "user" far larger than "real")

  • Some of runtimes involve surprisingly large RSS like 600MB. I'm not sure why.