
titzer/oopsla22-artifact


OVERVIEW

This archive contains both the source and compiled binaries used in experiments for the paper entitled "A fast in-place interpreter for WebAssembly", paper #273 at OOPSLA 2022. It contains source checkouts of the benchmarks (PolybenchC) as well as the 6 engines tested (JavaScriptCore, SpiderMonkey, V8, wizard, wasm3, and Wasm Micro Runtime).

SUPPORTED PLATFORM: x86-64-linux

The only supported platform for these experiments is Linux running on an x86-64 processor, due mainly to the limitations of Wizard, the experimental engine evaluated in the paper.
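
A quick way to verify the platform (standard uname, not part of the artifact scripts):

 % uname -ms
Linux x86_64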

POSSIBLY REQUIRED LIBRARIES

You may need to install some libraries to run (and definitely to build) some of the web engines.

 % sudo apt install libicu-dev python ruby bison flex cmake build-essential ninja-build git gperf

Contents

  • ./benchmarks - contains the source and compiled (.wasm) versions of all benchmarks
  • ./src - contains the source and compiled artifacts of all engines
  • ./engines - contains the compiled artifacts of some engines
  • ./data - contains output data generated by the run-*.bash scripts
  • ./data-linux-4.15-i7-8700K - contains data gathered to generate graphs in the paper
  • run-*.bash - the scripts used to run engines to generate raw data
  • summarize-*.bash - the scripts used to summarize raw data for presentation

Supported claims

The code and data in this archive were used to generate the data and graphs that support the following claims:

Section 3.4

  • Final paragraph: "the difference between the best (tuned) and worst (untuned) interpreter performance is 20% to 60% across the benchmark suite"
    • supported by additional runs, comparing with ENGINES="wizeng wizeng-slow"
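
For example, a sketch using the run scripts described under "Step-by-step instructions" below (the RUNS value is illustrative):

  % RUNS=10 ENGINES="wizeng wizeng-slow" ./run-execution-experiments.bash
  % ENGINES="wizeng wizeng-slow" ./summarize-execution.bash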

Section 4.2

  • From Figure 8
    • translation time for optimizing compilers is over 1000ns/byte
    • translation time for baseline compilers ranges from ~200-800ns/byte
    • translation time for rewriting interpreters ranges from ~20-200ns/byte
    • translation time for wizard is 3-4ns/byte

Section 4.3

  • From Figure 9
    • translation space ratio for wamr is about 3.7
    • translation space ratio for wasm3 is about 2.0
    • translation space ratio for jsc-int is about 1.0
    • translation space ratio for v8-liftoff is about 2.5-2.6
    • translation space ratio for v8-turbofan is about 2.4-2.7
    • translation space ratio for wizard is about 0.3-0.4

Section 4.4

  • From Figure 10
    • absolute execution time of benchmarks on v8-turbofan and wasm3
  • From Figure 11
    • normalized execution time of wasm3 (relative to turbofan)
      • below 1x for 4 shortest benchmarks, between 2x and 5x for middle 10, trending to 10x for remaining
    • normalized execution time of baseline compilers
      • below 1x for 4 shortest benchmarks, around 1x for middle 9, trending to 2.5x-3x for remaining
    • normalized execution time of optimizing compilers
      • 1x to 2x for nearly all benchmarks
  • From Figure 12
    • normalized execution time of all interpreters (relative to wasm3) is within 1.5x to 3.5x
    • wizard performs roughly on par with wamr-classic (outliers are +/- 10%)
    • wizard performs on par with wamr-fast for the 4 shortest benchmarks
    • wamr-fast is around 1.5x slower than wasm3 on nearly all benchmarks
    • wizard is on par (+/- 5%) with jsc-int for nearly all benchmarks
  • The jump table for wamr improves performance by roughly 2x
    • Supported by additional runs, comparing ENGINES="wamr-classic wamr-slow"
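
As with the Section 3.4 comparison above, a sketch (illustrative RUNS value):

  % RUNS=10 ENGINES="wamr-classic wamr-slow" ./run-execution-experiments.bash
  % ENGINES="wamr-classic wamr-slow" ./summarize-execution.bash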

Getting Started Guide

First, apologies for the size! The first three engines (JavaScriptCore, SpiderMonkey, and V8) are, to put it mildly, enormous pieces of software. The checkouts here for JavaScriptCore and SpiderMonkey include the source of the entire browser in which each is embedded, and the V8 checkout contains the entire JavaScript engine and its tests, which is considerable.

NO NEED TO BUILD FROM SOURCE

Building the browser engines from source is a major exercise and could take hours of machine time. So, avoid building if you can! The JavaScript shells should run directly from the checkout, as the checkouts already contain the build outputs (binary JS shells). The remaining engines are simpler and easier to build, but also should not require building.

Sample data is included

Sample data that was used to make the figures in the paper is included (in data-linux-4.15-i7-8700K).

Sample figures are included

The spreadsheet used to make the figures in the paper is included (figures.ods). Cut and paste the output of summarize-*.bash into appropriate places in the spreadsheet to regenerate them.

Scripts are included for running experiments

The scripts described in "Step-by-step instructions" below can generate all the data used to make figures in the paper.

Step-by-step instructions

Gathering data

You shouldn't need to build anything to begin generating data.

Execution time and translation time data for the 6 engines is generated by two scripts. Each has a number of configuration options that can be specified with environment variables. The default settings run experiments that can last hours, mostly due to repeating each benchmark 100 times (to get 95% confidence intervals). See "Shorter runs" below to see how to reduce the running time.

  % [DATA=<dir>] [RUNS=<N>] [ENGINES=<list>] ./run-execution-experiments.bash [<benchmark>*]
  % [DATA=<dir>] [RUNS=<N>] [ENGINES=<list>] ./run-translation-experiments.bash [<benchmark>*]
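
For example, to gather execution data for all benchmarks on just two engines (engine names as used throughout this README):

  % RUNS=100 ENGINES="wizard wasm3" ./run-execution-experiments.bash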

Summarizing data

The raw data written into a data directory consists mostly of numbers in text files. Two main scripts summarize the results for viewing or for pasting into the spreadsheet.

  % [DATA=<dir>] [ENGINES=<list>] [ERROR=1] ./summarize-execution.bash [<benchmark>*]
  % [DATA=<dir>] [ENGINES=<list>] ./summarize-translation.bash [<benchmark>*]

Note that the raw data gathered on the test machine is included in this archive, so it is possible to create a summary without running any experiments.

  % DATA=./data-linux-4.15-i7-8700K [ENGINES=<list>] ./summarize-execution.bash [<benchmark>*]

Charting data

To create figures similar to the ones in the paper, use the figures.ods spreadsheet. The scripts below generate tab-separated output. The tabs are important! Don't cut and paste from a terminal window, since terminals typically convert tabs to spaces.

  • Output of the ./summarize-translation.bash script can be pasted into the Translation sheet and the spreadsheet should update, making Figures 8 and 9.
  • Output of the ERROR=1 ./summarize-execution.bash script can be pasted into the Execution sheet and the spreadsheet should update, making Figures 10 and 11.
  • Output of the ./summarize-scatter.bash script can be pasted into the Scatter sheet and the spreadsheet should update, making the scatter plots in Figures 1 and 2.

In each of these sheets, the exact cell to paste the output data is indicated in red.
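
To keep the tabs intact, one option (plain shell redirection, nothing artifact-specific) is to write each summary to a file and import that file into the spreadsheet:

  % DATA=./data-linux-4.15-i7-8700K ./summarize-translation.bash > translation.tsv
  % DATA=./data-linux-4.15-i7-8700K ERROR=1 ./summarize-execution.bash > execution.tsv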

Expected outputs

A typical run of the execution time experiments will produce output like so:

 % RUNS=5 ./run-execution-experiments.bash 
---- bicg -----------
sm-base     0.015246     0.015196     0.014956     0.015280     0.014996  min=0.014956  avg=0.015135  stddev=0.000000
sm-opt     0.017844     0.019413     0.018549     0.018081     0.017999  min=0.017844  avg=0.018377  stddev=0.000000
v8-liftoff     0.009751     0.010067     0.011369     0.010390     0.009777  min=0.009751  avg=0.010271  stddev=0.000000
v8-turbofan     0.014428     0.015298     0.015406     0.015704     0.015669  min=0.014428  avg=0.015301  stddev=0.000000
jsc-int     0.015033     0.014778     0.015047     0.014803     0.014677  min=0.014677  avg=0.014868  stddev=0.000000
jsc-bbq     0.011690     0.011802     0.011529     0.011716     0.011849  min=0.011529  avg=0.011717  stddev=0.000000
jsc-omg     0.028569     0.026394     0.026610     0.026770     0.027004  min=0.026394  avg=0.027069  stddev=0.000000
wizard     0.012073     0.011955     0.011904     0.011948     0.011822  min=0.011822  avg=0.011940  stddev=0.000000
wasm3     0.007645     0.007467     0.007426     0.007386     0.007413  min=0.007386  avg=0.007467  stddev=0.000000
wamr-slow     0.022461     0.022326     0.022461     0.022303     0.022358  min=0.022303  avg=0.022382  stddev=0.000000
wamr-classic     0.017074     0.018284     0.016912     0.016746     0.017098  min=0.016746  avg=0.017223  stddev=0.000000
wamr-fast     0.012073     0.012204     0.012012     0.012200     0.012087  min=0.012012  avg=0.012115  stddev=0.000000
---- mvt -----------
sm-base     0.015417     0.015232     0.014920     0.015104     0.015092  min=0.014920  avg=0.015153  stddev=0.000000
sm-opt     0.017868     0.017908    ...

It will produce files in the data/ directory like so:

% ls data/execution.bicg.*
data/execution.bicg.jsc-bbq  data/execution.bicg.sm-base     data/execution.bicg.v8-turbofan   data/execution.bicg.wamr-slow
data/execution.bicg.jsc-int  data/execution.bicg.sm-opt      data/execution.bicg.wamr-classic  data/execution.bicg.wasm3
data/execution.bicg.jsc-omg  data/execution.bicg.v8-liftoff  data/execution.bicg.wamr-fast     data/execution.bicg.wizard
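
Each of these files is plain text containing the raw per-run numbers (see "Summarizing data" above), so it can be inspected directly:

 % cat data/execution.bicg.wizard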

A typical run of the translation time experiments will produce output like so:

  % RUNS=5 ./run-translation-experiments.bash

---- bicg -----------
sm-base
	us=.041685 bytes=2.345500 count=38
	us=.036911 bytes=2.345500 count=38
	us=.039786 bytes=2.345500 count=38
	us=.039786 bytes=2.345500 count=38
	us=.039786 bytes=2.345500 count=38
sm-opt
	us=.989881 bytes=1.565742 count=38
	us=1.122287 bytes=1.832802 count=38
	us=1.410752 bytes=2.301797 count=38
	us=.885304 bytes=1.636163 count=38
	us=.724369 bytes=1.565018 count=38
v8-liftoff
	us=.154576 bytes=2.957235 count=76
	us=.125776 bytes=2.957235 count=76
	us=.264695 bytes=2.957235 count=76
	us=.140202 bytes=2.957235 count=76
	us=.197135 bytes=2.957235 count=76
v8-turbofan
	us=1.768827 bytes=2.582678 count=75
	us=1.864880 bytes=2.582678 count=75
	us=1.791467 bytes=2.582678 count=76
	us=1.632527 bytes=2.582678 count=76
	us=1.801221 bytes=2.582678 count=76
jsc-int
	us=.105498 bytes=1.040967 count=38
	us=.138421 bytes=1.192343 count=38
	us=.154968 bytes=1.064465 count=38
	us=.149442 bytes=1.040967 count=38
	us=.151342 bytes=1.040967 count=38
jsc-bbq
	us=1.012372 bytes=0 count=38
	us=1.085764 bytes=0 count=38
	us=1.061444 bytes=0 count=38
...

It will produce files in the data/ directory like so:

% ls data/translation.bicg.*
data/translation.bicg.jsc-bbq.bytes  data/translation.bicg.sm-base.bytes     data/translation.bicg.v8-turbofan.bytes  data/translation.bicg.wasm3.bytes
data/translation.bicg.jsc-bbq.us     data/translation.bicg.sm-base.us	     data/translation.bicg.v8-turbofan.us     data/translation.bicg.wasm3.us
data/translation.bicg.jsc-int.bytes  data/translation.bicg.sm-opt.bytes      data/translation.bicg.wamr.bytes	      data/translation.bicg.wizard.bytes
data/translation.bicg.jsc-int.us     data/translation.bicg.sm-opt.us	     data/translation.bicg.wamr-fast.bytes    data/translation.bicg.wizard.us
data/translation.bicg.jsc-omg.bytes  data/translation.bicg.v8-liftoff.bytes  data/translation.bicg.wamr-fast.us
data/translation.bicg.jsc-omg.us     data/translation.bicg.v8-liftoff.us     data/translation.bicg.wamr.us

Expected running time

It takes approximately 2-3 hours to generate the data for the execution and translation time experiments with RUNS=100. Reducing the number of runs reduces the amount of time proportionally, so that with RUNS=5, total running time should be less than 20 minutes.

Shorter runs

To reduce the workload, either reduce the number of runs or select a subset of the benchmarks or engines. The scripts will generally overwrite data from previous runs of the same benchmark, so using a separate partial data directory is recommended.

  % mkdir -p partial
  
  % DATA=partial RUNS=5 ./run-execution-experiments.bash

  % DATA=partial RUNS=5 ./run-execution-experiments.bash bicg

  % DATA=partial RUNS=5 ENGINES="wizard wamr-fast" ./run-execution-experiments.bash bicg
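
The same DATA and ENGINES settings can then be used to summarize just the partial results:

  % DATA=partial ENGINES="wizard wamr-fast" ./summarize-execution.bash bicg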
