Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Value serialization #3158

Merged
merged 37 commits into from
May 2, 2024
Merged

Value serialization #3158

merged 37 commits into from
May 2, 2024

Conversation

berekuk
Copy link
Collaborator

@berekuk berekuk commented Apr 6, 2024

Fixes #2081
Fixes #1198

Not working in browser yet, but CLI is functional, for the value types that I've implemented so far, including lambdas.

This response was generated in a worker thread (ProjectItem passes AST+externals to it), serialized by the worker, and deserialized back by the main thread:

$ pnpm --silent run cli run -e 'a=5; b={|x|x+a}; c = 1 to 2'
{a: 5, b: (x) => internal code, c: Sample Set Distribution}

TODO:

  • support all remaining value types and all dist subtypes (SampleSetDists are supported, but Symbolic and pointsets aren't)
  • better serialization for lambdas and expressions (expressions contain values, this forced me to write more code than necessary, and AST is still heavily duplicated)
  • check on TypeScript level that all serialized types are serializable (ideally, JSON-compatible)
  • avoid vLambda calls in serialization — they're bad for deduplication
  • error handling in web workers
  • measure performance — spinning up a new worker for each run might be too expensive
  • integrate with Next.js and browsers

Copy link

changeset-bot bot commented Apr 6, 2024

⚠️ No Changeset found

Latest commit: 6f23f20

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link

vercel bot commented Apr 6, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
quri-hub ✅ Ready (Inspect) Visit Preview May 2, 2024 8:57pm
quri-ui ✅ Ready (Inspect) Visit Preview May 2, 2024 8:57pm
squiggle-components ✅ Ready (Inspect) Visit Preview May 2, 2024 8:57pm
squiggle-website ✅ Ready (Inspect) Visit Preview May 2, 2024 8:57pm

@berekuk
Copy link
Collaborator Author

berekuk commented Apr 7, 2024

This is now semi-functional in components storybook, with these caveats:

  • only in dev mode, prod build didn't bundle the worker properly
  • value serialization now works for almost all values (except Plots; todo)
  • exceptions are forwarded, but only as strings, so no stacktraces
  • web workers are created on each simulation, so there's a ~250ms overhead per simulation on my MBP M1 Pro
  • there's no debouncing and responses can arrive in the wrong order
  • there's a weird bug where typing something simple like x=1 causes an error at first

Still, it's good enough to try and confirm the expected benefits: editing a slow code with autoruns enabled doesn't freeze the editor 🎉

@berekuk
Copy link
Collaborator Author

berekuk commented Apr 17, 2024

On performance: serialization of JSON from the worker back to the main thread is slower than I expected.

As an example, in CLI, List.upTo(1,3000) -> map({|i|2 to 3}) (3000 samplesets of 1000 samples) takes 0.7 seconds (on M1 Pro). Around 0.2s is for getStdlib(), but the remaning 0.5 seconds is still higher than 0.18s it takes on the old version.

I don't know yet if it'll be similar in the browser, Node's worker_threads are implemented differently from webworkers.

This is not a blocker, because this example is pathological (we'd have trouble rendering 3000 samplesets in React, anyway), and because unblocked main thread is more valuable, so user experience will be better, but if timings in the browser are similar, then it will cause more CPU overhead.

@berekuk
Copy link
Collaborator Author

berekuk commented Apr 17, 2024

Progress since my last comment:

  • runners API (with EmbeddedRunner and NodeWorkerRunner)
  • errors and the remaning value types are serializable now
  • serialization supports multiple types, so we can serialize and deduplicate expressions too; not AST yet, but it's straightforward to add

Things to do:

  • bundle the worker in squiggle-lang (probably with esbuild)
    • this should fix existing tests; write more tests
  • optional: implement WebWorkerRunner, remove web-worker dependency
  • use the worker in Next.js
  • run a long-lived worker (currently I spin up a new one every time, so it has to getStdlib() every time, and that's expensive)
  • make runners configurable in SqProject; possibly add a toggle in the playground to choose a runner

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

JSON representations of all Squiggle Formats. Find a way to persist squiggle results
1 participant