perf: Port the optimizer to WebAssembly #1092

Merged
merged 112 commits into from Jan 3, 2023

@samestep
Collaborator

samestep commented Sep 13, 2022

Description

This PR improves optimizer performance by an order of magnitude, building on my optimizer performance experiment from earlier this year.

Implementation strategy and design decisions

This PR adds the following build tools to our setup:

  • Cargo to build the Rust code
  • wasm-bindgen to generate JavaScript/TypeScript scaffolding around Wasm compiled from Rust
  • ts-rs to generate TypeScript types from Rust Serde types

Quoting from Stack Overflow:

Some browsers limit the size of modules that can be compiled synchronously because that blocks the main thread.

One implication of this is that compiling an autodiff graph becomes an async operation, so in particular, compileStyle (and thus also compileTrio) becomes async.
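
As a rough illustration of the difference between the synchronous and asynchronous paths, the sketch below uses only the standard WebAssembly JavaScript API on a tiny hand-assembled module that exports an `add` function. The bytes and the names `addModuleBytes`, `instantiateSync`, and `instantiateAsync` are made up for this example and are not taken from this PR.

```typescript
// Hand-assembled Wasm module exporting add(i32, i32) -> i32.
// Illustrative stand-in only; not output from this PR's code generator.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number and version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body
]);

type Add = (a: number, b: number) => number;

// Synchronous path: compiles on the calling thread. Some browsers may
// reject this on the main thread once the module is large enough.
const instantiateSync = (): Add => {
  const compiled = new WebAssembly.Module(addModuleBytes);
  const instance = new WebAssembly.Instance(compiled);
  return instance.exports.add as Add;
};

// Asynchronous path: compilation happens off the calling thread, so the
// caller (and everything that depends on it) has to become async.
const instantiateAsync = async (): Promise<Add> => {
  const { instance } = await WebAssembly.instantiate(addModuleBytes);
  return instance.exports.add as Add;
};
```

The second function shows why async-ness propagates: any caller that needs the compiled function must itself await it.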

Another implication is that loading/initializing the optimizer itself is an async operation. We considered three possible architectures to accommodate this:

  1. Don't initialize the optimizer Wasm module at load time; instead, give each function exposed by the Wasm module wrapper an implicit precondition that the module has already been loaded. Benefit: we need neither top-level await nor async versions of all those functions. Drawback: we must remember to initialize the module before every possible use, or we get race conditions.
  2. Change the Wasm module wrapper so that, rather than exposing things directly, it exposes a single async function that initializes the Wasm module and returns an object with methods for all the underlying Wasm functions. Benefit: no top-level await and no implicit preconditions. Drawback: the change is viral, because every consumer of the Wasm module must now either be async or take this object as a parameter.
  3. Use top-level await, and update our downstream packages to use ESM instead of CJS. Benefit: no race conditions, no implicit preconditions, and no viral changes to downstream function signatures. Drawback: docs-site needs to support top-level await.

This PR takes approach (1), but we would like to switch to (3) in the future if possible; see the further discussion below.
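
To make the trade-offs concrete, here is a minimal sketch of approaches (1) and (2). Apart from `ready`, every name in it (`OptimizerExports`, `loadExports`, `halve`, `initOptimizer`) is hypothetical, and `loadExports` only simulates asynchronous Wasm instantiation; it is not the wrapper code in this PR.

```typescript
// Hypothetical stand-in for the functions the Wasm module would export.
interface OptimizerExports {
  halve(x: number): number;
}

// Simulates asynchronous Wasm instantiation; real code would fetch and
// instantiate the compiled module here.
const loadExports = async (): Promise<OptimizerExports> => ({
  halve: (x) => x / 2,
});

// Approach (1): module-level state plus a `ready` promise. Every wrapped
// function carries the implicit precondition that `ready` has resolved.
let loaded: OptimizerExports | undefined;
const ready: Promise<void> = loadExports().then((e) => {
  loaded = e;
});
const halve = (x: number): number => {
  if (loaded === undefined) {
    throw new Error("optimizer not initialized; `await ready` first");
  }
  return loaded.halve(x);
};

// Approach (2): one async initializer that returns a handle. There is no
// hidden precondition, but the handle must be passed to every consumer.
const initOptimizer = (): Promise<OptimizerExports> => loadExports();

// Usage of both styles from an async entry point.
const main = async (): Promise<number> => {
  await ready; // approach (1): forgetting this line is the race condition
  const a = halve(8);
  const optimizer = await initOptimizer(); // approach (2)
  return a + optimizer.halve(4);
};
```

Approach (3) would amount to awaiting the loader at module top level, so that consumers could import `halve` directly with no precondition.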

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new ESLint warnings
  • I have reviewed any generated changes to the diagrams/ folder

Open questions

  • To reduce the time to compile the WebAssembly to native code, a followup should identify sets of objectives/constraints that are topologically similar (usually those sharing a source location in the Style program) and compile those to a single WebAssembly function, instead of duplicating their logic across many functions.
  • Because a WebAssembly.Module "can be efficiently shared with Workers, and instantiated multiple times", a followup should allow the optimizer to be run in a web worker; see this issue: Providing a positive user experience for slow Penrose trios #1095
  • If a generated gradient function crashes while computing polynomial roots, it does not undo the change it made to the stack pointer. I don't know of a good way to handle this, but I also don't know whether such a crash can actually occur.
  • While developing this PR, I added ${FORCE_COLOR+--color=true} to @penrose/optimizer's esbuild script, and ${FORCE_COLOR+--color=always} to its cargo scripts, to get color output in Nx. I've removed those because they're not portable to Windows, but it would be nice to come up with a portable way to get this back in the future; see this issue: Some scripts aren't colored when run by Nx #1176
  • Originally I expected that this PR would allow us to handle a thousand points in our hundred-points-around-star example (which would previously yield a stack overflow), but instead, after this PR it simply crashes the whole page. See Stack overflow with too many substance objects #1070
  • In the medium term, it would be nice to remove all our instances of await ready; by using a top-level await in @penrose/optimizer. As far as I can tell, this would require us to get rid of Webpack first; see docs: Migrate site from Docusaurus to VitePress #1172
  • I removed the autodiff debug node, because I don't think we ever use it, and it would have taken nontrivial effort to port to WebAssembly. I can add it back if we want, but didn't want to waste time if we don't care.
  • This PR uses a prerelease version of wasm-bindgen that includes the --keep-lld-exports flag and does not emit BigInt literals. Once version 0.2.84 is released, we should just switch to that, but for now, the CONTRIBUTING.md instructions just say to install from GitHub at a specific commit, and our CI/CD scripts pull a Linux binary which I compiled and put in a GitHub Gist.
  • Although we mostly compile Wasm asynchronously for reasons described above, there are two exceptions: in tests, and for convex partitioning. The former is fine because we only run tests in Node and not in the browser, but the latter may become an issue if we end up using convex partitioning on polygons with too many points or whose point locations are computed via sufficiently complicated expressions.
  • As you can verify by running strings packages/optimizer/target/wasm32-unknown-unknown/release/penrose_optimizer.wasm, the built WebAssembly binary contains absolute paths (starting with /Users/samueles/.cargo/registry/src/github.com-1ecc6299db9ec823/ in my case) to local files on whatever machine built the binary. See this Rust issue: Enable --remap-path-prefix for absolute paths by default rust-lang/rust#40552
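
For the first open question above, one possible starting point is to bucket objectives and constraints by the Style source location that produced them. The sketch below is only an illustration under that assumption: the `Term` shape and `groupBySource` are hypothetical, and source location is just a proxy for topological similarity, so a real implementation would still need to confirm that the graphs in each bucket have the same shape.

```typescript
// Hypothetical shape: an objective or constraint plus the location of the
// Style expression that produced it.
interface Term {
  name: string;
  source: { line: number; col: number };
}

// Group terms by source location. Each group is a candidate for being
// compiled to one shared Wasm function instead of one function per term.
const groupBySource = (terms: Term[]): Map<string, Term[]> => {
  const groups = new Map<string, Term[]>();
  for (const term of terms) {
    const key = `${term.source.line}:${term.source.col}`;
    const group = groups.get(key);
    if (group === undefined) {
      groups.set(key, [term]);
    } else {
      group.push(term);
    }
  }
  return groups;
};
```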

@codecov

codecov bot commented Sep 13, 2022

Codecov Report

Merging #1092 (2caa419) into main (831a598) will increase coverage by 0.83%.
The diff coverage is 96.04%.

❗ Current head 2caa419 differs from pull request most recent head 275b29c. Consider uploading reports for the commit 275b29c to get more accurate results

@@            Coverage Diff             @@
##             main    #1092      +/-   ##
==========================================
+ Coverage   62.07%   62.91%   +0.83%     
==========================================
  Files          59       59              
  Lines        7111     7164      +53     
  Branches     1708     1673      -35     
==========================================
+ Hits         4414     4507      +93     
+ Misses       2614     2572      -42     
- Partials       83       85       +2     

Impacted Files                                 Coverage            Δ
packages/core/src/renderer/dragUtils.ts        7.31% <0.00%>       (ø)
packages/core/src/types/ad.ts                  100.00% <ø>         (ø)
packages/core/src/index.ts                     47.31% <56.25%>     (-0.52%) ⬇️
packages/core/src/engine/Autodiff.ts           89.84% <97.60%>     (+5.18%) ⬆️
packages/core/src/utils/Wasm.ts                98.33% <98.33%>     (ø)
packages/core/src/compiler/Style.ts            67.25% <100.00%>    (+0.05%) ⬆️
packages/core/src/contrib/Utils.ts             41.81% <100.00%>    (ø)
packages/core/src/engine/AutodiffFunctions.ts  100.00% <100.00%>   (ø)
packages/core/src/engine/EngineUtils.ts        56.08% <100.00%>    (+3.87%) ⬆️
packages/core/src/utils/Util.ts                54.74% <0.00%>      (-5.31%) ⬇️
... and 2 more

@github-actions

github-actions bot commented Sep 13, 2022

± Registry diff

M	3d-projection-fake-3d-linear-algebra.svg
M	circle-example-euclidean.svg
M	collinear-euclidean.svg
M	congruent-triangles-euclidean.svg
M	continuousmap-continuousmap.svg
M	hypergraph-hypergraph.svg
M	incenter-triangle-euclidean.svg
M	lagrange-bases-lagrange-bases.svg
M	midsegment-triangles-euclidean.svg
M	non-convex-non-convex.svg
M	one-water-molecule-atoms-and-bonds.svg
M	parallel-lines-euclidean.svg
M	persistent-homology-persistent-homology.svg
M	points-around-line-shape-distance.svg
M	points-around-polyline-shape-distance.svg
M	points-around-star-shape-distance.svg
M	siggraph-teaser-euclidean-teaser.svg
M	small-graph-disjoint-rect-line-horiz.svg
M	small-graph-disjoint-rects-small-canvas.svg
M	small-graph-disjoint-rects.svg
M	tree-tree.svg
M	tree-venn-3d.svg
M	tree-venn.svg
M	two-vectors-perp-vectors-dashed.svg
M	vector-wedge-exterior-algebra.svg
M	wet-floor-atoms-and-bonds.svg
M	word-cloud-example-word-cloud.svg
M	wos-laplace-estimator-walk-on-spheres.svg
M	wos-nested-estimator-walk-on-spheres.svg
M	wos-offcenter-estimator-walk-on-spheres.svg
M	wos-poisson-estimator-walk-on-spheres.svg

📊 Performance

Key

Note that each bar component rounds up to the nearest 100ms, so each full bar is an overestimate by up to 400ms.

     0s   1s   2s   3s   4s   5s   6s   7s   8s   9s
     |    |    |    |    |    |    |    |    |    |
name ▝▀▀▀▀▀▀▀▀▀▀▀▚▄▄▄▄▄▄▄▄▄▞▀▀▀▀▀▀▀▀▀▀▀▀▚▄▄▄▄▄▄▄▄▄▖
      compilation labelling optimization rendering
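
The rounding note above can be checked with a small calculation. This is a sketch of the arithmetic only, with a hypothetical `Timings` shape; it is not the script that draws the bars. Each of the four components is rounded up to the next 100 ms, so each contributes less than 100 ms of overestimate and the total stays below 400 ms. For instance, stage times of 1234, 56, 789, and 12 ms would display as 1300 + 100 + 800 + 100 = 2300 ms against a true total of 2091 ms.

```typescript
// Hypothetical per-stage timings in milliseconds.
interface Timings {
  compilation: number;
  labelling: number;
  optimization: number;
  rendering: number;
}

// Round a single component up to the nearest 100 ms, as the bars do.
const roundUp100 = (ms: number): number => Math.ceil(ms / 100) * 100;

// Difference between the displayed bar length and the true total time.
const overestimate = (t: Timings): number => {
  const parts = [t.compilation, t.labelling, t.optimization, t.rendering];
  const shown = parts.reduce((sum, ms) => sum + roundUp100(ms), 0);
  const actual = parts.reduce((sum, ms) => sum + ms, 0);
  return shown - actual;
};
```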

Data

                                        0s   1s   2s   3s   4s   5s   6s   7s   8s   9s
                                        |    |    |    |    |    |    |    |    |    |
3d-projection-fake-3d-linear-algebra    ▝▀▚▚
allShapes-allShapes                     ▝▀▀▄▚▄▖
arrowheads-arrowheads                   ▝▀▚▚
circle-example-euclidean                ▝▀▀▀▀▀▀▞▀▀▚
collinear-euclidean                     ▝▀▀▀▚▚
congruent-triangles-euclidean           ▝▀▀▀▀▀▀▀▀▀▚▚
continuousmap-continuousmap             ▝▀▀▚▚
hypergraph-hypergraph                   ▝▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▚▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▖
incenter-triangle-euclidean             ▝▀▀▀▀▀▚▚
lagrange-bases-lagrange-bases           ▝▀▀▚▚
midsegment-triangles-euclidean          ▝▀▀▀▀▀▚▚
non-convex-non-convex                   ▝▀▀▀▀▀▚▚
one-water-molecule-atoms-and-bonds      ▝▚▚
parallel-lines-euclidean                ▝▀▀▀▚▚
persistent-homology-persistent-homology ▝▀▀▀▀▀▀▀▚▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▚▖
points-around-line-shape-distance       ▝▀▀▀▀▚▚
points-around-polyline-shape-distance   ▝▀▀▀▀▀▀▀▀▀▀▀▀▀▀▞▖
points-around-star-shape-distance       ▝▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▞▖
siggraph-teaser-euclidean-teaser        ▝▀▀▀▀▀▞▖
small-graph-disjoint-rect-line-horiz    ▝▀▀▀▀▀▀▀▀▀▀▀▀▚▚
small-graph-disjoint-rects              ▝▀▀▚▚
small-graph-disjoint-rects-large-canvas ▝▀▚▚
small-graph-disjoint-rects-small-canvas ▝▀▚▚
tree-tree                               ▝▀▀▄▄▞▖
tree-venn                               ▝▀▀▀▀▀▄▚
tree-venn-3d                            ▝▀▀▀▀▞▄▖
two-vectors-perp-vectors-dashed         ▝▀▀▞▖
vector-wedge-exterior-algebra           ▝▀▀▚▚
wet-floor-atoms-and-bonds               ▝▀▀▚▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▚
word-cloud-example-word-cloud           ▝▀▀▀▀▀▀▀▚▚
wos-laplace-estimator-walk-on-spheres   ▝▀▀▀▀▀▀▀▀▀▀▀▚▚
wos-nested-estimator-walk-on-spheres    ▝▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▞▀▀▀▀▀▀▀▚
wos-offcenter-estimator-walk-on-spheres ▝▀▀▀▀▀▀▀▀▀▀▀▀▞▖
wos-poisson-estimator-walk-on-spheres   ▝▀▀▀▀▀▀▀▀▀▀▀▚▀▀▖

@cloudflare-pages

cloudflare-pages bot commented Sep 13, 2022

Deploying with Cloudflare Pages

Latest commit: 275b29c
Status: ✅  Deploy successful!
Preview URL: https://31f7697b.penrose-72l.pages.dev
Branch Preview URL: https://wasm-optimizer.penrose-72l.pages.dev

@kyleliangus

Almost there!

@joshsunshine
Member

joshsunshine commented Dec 20, 2022

While developing this PR, I added ${FORCE_COLOR+--color=true} to @penrose/optimizer's esbuild script, and ${FORCE_COLOR+--color=always} to its cargo scripts, to get color output in Nx. I've removed those because they're not portable to Windows, but it would be nice to come up with a portable way to get this back in the future.

@samestep can you write an issue about this? Perhaps @liangyiliang can implement a workaround for Windows since he is our main Windows user.

@joshsunshine
Member

Originally I expected that this PR would allow us to handle a thousand points in our hundred-points-around-star example (which would previously yield a stack overflow), but instead, after this PR it simply crashes the whole page. See #1070

@samestep can you update #1070 so it explains the new issue?

@samestep
Collaborator Author

While developing this PR, I added ${FORCE_COLOR+--color=true} to @penrose/optimizer's esbuild script, and ${FORCE_COLOR+--color=always} to its cargo scripts, to get color output in Nx. I've removed those because they're not portable to Windows, but it would be nice to come up with a portable way to get this back in the future.

@samestep can you write an issue about this? Perhaps @liangyiliang can implement a workaround for Windows since he is our main Windows user.

Done: #1176

Originally I expected that this PR would allow us to handle a thousand points in our hundred-points-around-star example (which would previously yield a stack overflow), but instead, after this PR it simply crashes the whole page. See #1070

@samestep can you update #1070 so it explains the new issue?

Done: #1070 (comment)

@wodeni
Member

wodeni left a comment

LGTM! Thanks for going through the PR with me @samestep :p.

4 participants