DEW ADR 002 ‐ Off Thread Runtime and On Disk Storage
Pierre Raybaut edited this page Jun 26, 2026
This document is an Architecture Decision Record (ADR) capturing the option selected, the design direction, and the accepted tradeoffs for moving heavy computation off the UI thread and lifting in-tab memory boundaries.
- Date: 2026-06-07
- Project: DataLab Experimental Web Interface (DEW) — NLnet NGI0 Commons Fund
- Scope: Milestone 1h — performance optimization, worker default runtime, on-disk array storage
- Repository: https://github.com/DataLab-Platform/web
- Builds on: DEW ADR #1 — Browser-Native Frontend Architecture
As the web-native prototype processed increasingly large scientific datasets (e.g. 2048²+ float64 images), two main constraints emerged:
- Memory ceilings: Pyodide runs inside a 32-bit WebAssembly sandbox (`wasm32`), capping the linear-memory heap at a maximum of 4 GB (with allocation limits near ~2 GB on several modern desktop browsers).
- UI thread blocking: Executing heavy operations directly inside Pyodide on the main browser thread freezes rendering and degrades the user experience. Furthermore, the synchronous Origin Private File System (OPFS) handle (`createSyncAccessHandle`), needed for low-latency array storage, is restricted to dedicated Web Workers by specification.
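To make the memory ceiling concrete, the arithmetic for the 2048² float64 example can be sketched as follows. The function names are illustrative, and the ~2 GB limit is taken here as 2 GiB, which is an assumption; real headroom is lower because the interpreter, temporaries, and intermediate copies share the same heap.

```typescript
// Bytes needed for a dense 2-D float64 array (8 bytes per element).
function float64ImageBytes(width: number, height: number): number {
  return width * height * Float64Array.BYTES_PER_ELEMENT;
}

// Upper bound on how many such arrays fit in a heap budget, ignoring
// every other allocation (interpreter state, temporaries, copies).
function imagesPerBudget(width: number, height: number, budgetBytes: number): number {
  return Math.floor(budgetBytes / float64ImageBytes(width, height));
}

const MiB = 1024 * 1024;
const GiB = 1024 * MiB;

// 2048 * 2048 * 8 = 33,554,432 bytes, i.e. 32 MiB per image.
const oneImageBytes = float64ImageBytes(2048, 2048);

// Assumed 2 GiB practical limit: at most 64 such images, before any overhead.
const imagesIn2GiB = imagesPerBudget(2048, 2048, 2 * GiB);
```

Since most operations allocate at least one output array per input, a workspace of a few dozen such images is enough to approach the limit.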
We need a strategy to offload execution from the UI thread while bypassing memory caps, maintaining 100% static deployability, and keeping the codebase clean.
- Static-only guarantees: Avoid using SharedArrayBuffer (to bypass COOP/COEP header requirements for straightforward sub-path static hosting).
- Workspace scaling beyond the heap: Let the active working set exceed the ~2 GB wasm32 heap limitation.
- No caller-side changes: Prevent changes in execution interfaces from causing churn across UI view structures.
- Stable and reversible rollout: Allow selective mode toggling between main-thread and worker modes.
- Option A (Status Quo): Keep Pyodide on the main thread and retain arrays strictly in memory. Leads to crash/OOM situations on large multi-image sets.
- Option B (Asynchronous OPFS Only): Maintain in-thread Pyodide, but stream array buffers to the local OPFS async API (`createWritable`). Solves memory growth but introduces high write-latency overheads.
- Option C (Dedicated Worker + Synchronous OPFS - Chosen): Move the Pyodide kernel and object model into a Dedicated Web Worker, communicating via a zero-copy transferable bridge. Inside the worker, execute synchronous reads/writes directly onto the fast OPFS handle.
- Option D (Memory64 / wasm64 builds): Adopt 64-bit WebAssembly Pyodide runtimes. Deferred until a stable upstream release is available.
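The synchronous read/write path of Option C can be sketched as an append-only spill file. This is a minimal illustration and not the project's storage layer: `SpillRef`, `spillArray`, `loadArray`, and `MemoryHandle` are hypothetical names, and `SyncHandleLike` only mirrors the subset of `FileSystemSyncAccessHandle` methods (`read`, `write`, `getSize`, `flush`) that the sketch uses. The in-memory stand-in lets the code run outside a worker.

```typescript
// Subset of FileSystemSyncAccessHandle used here. In a dedicated worker the
// real handle would come from fileHandle.createSyncAccessHandle().
interface SyncHandleLike {
  read(buffer: Uint8Array, options?: { at?: number }): number;
  write(buffer: Uint8Array, options?: { at?: number }): number;
  getSize(): number;
  flush(): void;
}

// Hypothetical reference to a spilled array: byte offset and element count.
interface SpillRef { offset: number; length: number }

// Append the array's bytes at the end of the file and return where they live.
function spillArray(handle: SyncHandleLike, data: Float64Array): SpillRef {
  const bytes = new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  const offset = handle.getSize();
  if (handle.write(bytes, { at: offset }) !== bytes.byteLength) {
    throw new Error("short write");
  }
  handle.flush();
  return { offset, length: data.length };
}

// Read a previously spilled array back into a fresh Float64Array.
function loadArray(handle: SyncHandleLike, ref: SpillRef): Float64Array {
  const out = new Float64Array(ref.length);
  const bytes = new Uint8Array(out.buffer);
  if (handle.read(bytes, { at: ref.offset }) !== bytes.byteLength) {
    throw new Error("short read");
  }
  return out;
}

// In-memory stand-in for the OPFS handle (illustration and testing only).
class MemoryHandle implements SyncHandleLike {
  private data = new Uint8Array(0);
  read(buffer: Uint8Array, options?: { at?: number }): number {
    const at = options?.at ?? 0;
    const slice = this.data.subarray(at, at + buffer.byteLength);
    buffer.set(slice);
    return slice.byteLength;
  }
  write(buffer: Uint8Array, options?: { at?: number }): number {
    const at = options?.at ?? 0;
    const end = at + buffer.byteLength;
    if (end > this.data.byteLength) {
      const grown = new Uint8Array(end);
      grown.set(this.data);
      this.data = grown;
    }
    this.data.set(buffer, at);
    return buffer.byteLength;
  }
  getSize(): number { return this.data.byteLength; }
  flush(): void { /* nothing to flush in memory */ }
}
```

Once an array is spilled, only its small `SpillRef` needs to stay on the heap, which is the property that keeps the working set below the wasm32 limit.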
We implemented and integrated Option C (Worker-hosted runtime + Synchronous OPFS spill) as the nominal mode:
- Unified Surface: Mapped identical `RuntimeApi` interfaces to hide thread routing behind a clear proxy façade (`WorkerRuntimeProxy` / `kernelWorker`).
- Zero-Copy Bridge: Payloads bypass structured clone bottlenecks by declaring internal memory buffers as transferables during message posts.
- Automatic Fallback: The runtime continues to run fully on-disk or in-RAM depending on browser capabilities, automatically falling back to an in-thread worker engine if OPFS handles fail (or if `?runtime=main` is declared).
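One way to read the fallback rule above is as a small pure function over the page's query string and a set of probed capabilities. The sketch below is an assumption about the policy, not the contents of `src/runtime/runtimeMode.ts`: `RuntimeCapabilities`, `queryParam`, and `resolveRuntimeMode` are hypothetical names, and treating a missing synchronous OPFS handle as a reason to leave worker mode is only one interpretation of the text.

```typescript
type RuntimeMode = "worker" | "main";

// Capabilities probed at startup; field names are hypothetical.
interface RuntimeCapabilities {
  workerAvailable: boolean;
  syncOpfsAvailable: boolean;
}

// Read one query parameter without depending on DOM typings.
function queryParam(search: string, name: string): string | undefined {
  const query = search.startsWith("?") ? search.slice(1) : search;
  for (const pair of query.split("&")) {
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    if (decodeURIComponent(key) === name) return decodeURIComponent(value);
  }
  return undefined;
}

// Explicit override first, then capability checks, then the worker default.
function resolveRuntimeMode(search: string, caps: RuntimeCapabilities): RuntimeMode {
  if (queryParam(search, "runtime") === "main") return "main";
  if (!caps.workerAvailable || !caps.syncOpfsAvailable) return "main";
  return "worker";
}
```

In a page, `search` would typically be `window.location.search`. Keeping the decision in a pure function makes the rollout easy to test and to reverse.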
- Flatter Heap footprint: Memory usage stays constant even when manipulating multi-gigabyte scientific workspaces as arrays spill on-disk.
- Fluid User Interface: Heavy algorithms execute off-main-thread.
- No Hosting prerequisites: Transferables avoid the need for cross-origin isolation headers.
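The transferable-based bridge mentioned above relies on the second argument of `postMessage`, which lists the buffers to move rather than copy. The sketch below shows one way to build that list; `KernelMessage` and `collectTransferables` are hypothetical names, and the message shape is an assumption. A transfer list must not contain the same buffer twice, so views sharing a buffer are de-duplicated, and `SharedArrayBuffer` instances are skipped because they are not transferable.

```typescript
// Hypothetical message shape: a command name plus array payloads.
interface KernelMessage {
  command: string;
  arrays: Float64Array[];
}

// Collect each distinct underlying ArrayBuffer exactly once.
function collectTransferables(message: KernelMessage): ArrayBuffer[] {
  const seen = new Set<ArrayBuffer>();
  for (const view of message.arrays) {
    const buffer = view.buffer;
    // Only plain ArrayBuffers can be transferred.
    if (buffer instanceof ArrayBuffer) seen.add(buffer);
  }
  return Array.from(seen);
}

// Intended use on either side of the bridge (not executed here):
//   worker.postMessage(message, collectTransferables(message));
// After the call, the sender's views are detached and have length 0.
```

Note that transferring moves the whole underlying buffer, even when a view covers only part of it, so payload arrays are best allocated with their own buffers.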
- Double Worker hop: Embedded subsystems (macros/notebook workers) communicate through the main thread to access kernel objects.
- Single-Run Interruptibility limitations: To preserve plain static hosting (avoiding `SharedArrayBuffer` interrupt cues) and maintain memory safety (avoiding a parallel compute worker instance which creates OOM crashes), interruption is executed client-side at batch loop boundaries rather than mid-computation within a single algorithm's execution loop.
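Interruption at batch loop boundaries can be sketched as a client-side loop that checks a cancellation flag before dispatching each item. This is an illustration under stated assumptions: `CancelToken`, `BatchResult`, and `runBatch` are hypothetical names, and `runOne` stands in for whatever call forwards a single computation to the kernel.

```typescript
// Hypothetical cancellation flag, flipped by the UI (e.g. a "Stop" button).
class CancelToken {
  private cancelled = false;
  cancel(): void { this.cancelled = true; }
  get isCancelled(): boolean { return this.cancelled; }
}

interface BatchResult<R> { results: R[]; interrupted: boolean }

// Run items one by one. The flag is checked only between items, so a
// computation that has already started always runs to completion.
async function runBatch<T, R>(
  items: readonly T[],
  runOne: (item: T) => Promise<R>,
  token: CancelToken,
): Promise<BatchResult<R>> {
  const results: R[] = [];
  for (const item of items) {
    if (token.isCancelled) return { results, interrupted: true };
    results.push(await runOne(item));
  }
  return { results, interrupted: false };
}
```

Because each `await` returns control to the event loop, a click handler can call `token.cancel()` while an item is in flight; the loop then stops before the next item. A single long-running algorithm still cannot be stopped mid-execution, which is the limitation described above.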
- Integrated and promoted worker mode to the nominal execution default inside src/runtime/runtimeMode.ts.
- Validated with complete Playwright integration suites and OPFS stress tests.
- Extended developer guides within `doc/architecture.md` and `doc/troubleshooting.md`.