
DEW ADR 001 ‐ Browser Native Frontend Architecture

Pierre Raybaut edited this page May 14, 2026 · 1 revision

DEW ADR #1 — Browser-Native Frontend Architecture

This document is an Architecture Decision Record (ADR) — a short, dated note that captures a single architectural choice, the alternatives that were considered, and the reasons for picking one of them. It follows the spirit of MADR (Markdown Architectural Decision Records).

  • Date: 2026-05-14
  • Project: DataLab Experimental Web Interface (DEW) — NLnet NGI0 Commons Fund
  • Scope: Task 1 — Web Frontend Prototype (closes sub-task 1a. Architecture consolidation and decision record)
  • Repository: https://github.com/DataLab-Platform/web

1. Context and problem statement

DataLab is a desktop scientific data-processing platform built on a Qt GUI (PlotPyStack) and on the headless Sigima computation engine. The DEW project aims to deliver a browser-native reimplementation of DataLab so that any user can run the full application — signals, images, ROIs, processings, fits, plugins, macros, notebooks, HDF5 I/O — from a plain URL, with no install, no account and no server-side processing of user data.

The architectural question that drove the entire prototype is:

How do we deliver a feature-rich DataLab in the browser while preserving the local-first guarantees of the desktop application (no upload, no account, full reproducibility) and reusing as much of the existing Python codebase as technically possible?

Several approaches are technically viable. This ADR records the options that were considered, the criteria used to compare them, the decision that was made for the DEW prototype, and the consequences that decision has for the rest of the project.

2. Decision drivers

The following drivers were extracted from the DEW Memorandum of Understanding, from the DataLab roadmap, and from operational constraints surfaced during the first weeks of prototyping.

  1. Local-first, zero-upload by default. DataLab is used in industrial and research contexts where data confidentiality is a hard constraint. The browser version must preserve this property: user data stays in the browser tab unless the user explicitly exports it.
  2. Maximal reuse of Sigima. Sigima embodies years of curated scientific computation work (signal/image objects, ROI semantics, processing catalogue, parameter DataSets). The web frontend must consume that engine as is, not reimplement it in JavaScript.
  3. Feature parity ambition. The frontend must be able to reach parity, or near parity, with the desktop application's processing surface rather than remain a toy demo.
  4. Static deployability. The bundle must be hostable on plain static hosting (GitHub Pages, S3, an internal nginx) and behind arbitrary sub-paths, with no Python runtime on the server side.
  5. Embeddability. Third-party host applications must be able to embed DataLab-Web as a static asset.
  6. Plugin compatibility with the desktop API. Existing DataLab plugins should run unchanged in the browser, modulo unavoidable async adaptations of dialog calls.
  7. Operational sustainability. The project is small. The chosen stack must be maintainable by the DataLab core team without permanent specialist frontend support, and must not introduce per-user infrastructure costs.

3. Considered options

Four candidate architectures were evaluated. Each is described below with its trade-offs against the seven decision drivers listed above; the Option A table additionally records an accessibility and security assessment.

Option A — Standalone React/TypeScript SPA + Sigima in Pyodide (chosen)

A static React + TypeScript bundle served from any web host, loading Pyodide (CPython compiled to WebAssembly) into the page on cold start, then installing Sigima and its scientific dependencies (numpy, scipy, scikit-image, h5py, pandas, pywavelets…) via micropip. A Python bootstrap module owns an in-browser hierarchical object model that mirrors DataLab desktop's ObjectModel. The TypeScript runtime exposes a typed bridge (DataLabRuntime) and dispatches all Python calls; UI components never touch Pyodide directly. Plotting is delegated to Plotly.js since Qt-based PlotPy is unavailable in the browser.
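
As a rough illustration of the cold-start step described above, the sketch below installs a list of packages through an injectable async installer. It assumes `micropip.install` as the default installer, which only exists inside a Pyodide runtime; the `PACKAGES` list, `install_packages` and `fake_install` are hypothetical names chosen for this example, not the prototype's actual bootstrap code.

```python
import asyncio

# Hypothetical package list for illustration; the prototype's real list may differ.
PACKAGES = ["numpy", "scipy", "sigima"]


async def install_packages(names, install=None):
    """Install packages one by one and return the names that were installed.

    `install` is an async callable taking a package name. When omitted, the
    function falls back to `micropip.install`, which is only importable
    inside a Pyodide runtime.
    """
    if install is None:
        import micropip  # available only in Pyodide

        install = micropip.install
    installed = []
    for name in names:
        await install(name)
        installed.append(name)
    return installed


async def fake_install(name):
    """Stand-in installer so the sketch also runs under plain CPython."""
    await asyncio.sleep(0)


result = asyncio.run(install_packages(PACKAGES, install=fake_install))
print(result)
```

Injecting the installer keeps the function testable outside the browser; inside Pyodide the same call would be awaited without the `install` argument.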

| Driver | Assessment |
| --- | --- |
| Local-first | ✅ Native: no network call after the initial bundle download. |
| Sigima reuse | ✅ Sigima runs as is inside Pyodide: same wheels, same code. |
| Feature parity | ✅ Demonstrated on the prototype (panels, ROIs, fits, plugins, macros, notebooks, HDF5). |
| Static deployability | ✅ A single static bundle, drop-in to GitHub Pages and any sub-path. |
| Embeddability | ✅ The bundle is wrapped by @datalab-platform/web-sdk for postMessage integration. |
| Plugin compatibility | PluginBase shim preserves the desktop API; only dialog calls become await edit_async(...). |
| Sustainability | 🟡 Pyodide cold start (~30–60 s on first visit) and bundle size (~10 MB) require care. |
| Accessibility / security | ✅ Standard React tooling; full control over CSP, sanitisation and ARIA; no server attack surface. |

Option B — JupyterLite + extensions

JupyterLite is a JupyterLab distribution that runs in the browser, also on Pyodide. The DataLab desktop UI would be reconstructed as a set of JupyterLab extensions, custom widgets and a custom application shell on top of JupyterLite.

| Driver | Assessment |
| --- | --- |
| Local-first | ✅ Same Pyodide story. |
| Sigima reuse | ✅ Same Pyodide story. |
| Feature parity | 🟡 Reachable in theory but requires fighting the notebook-centric shell, the JupyterLab command/menu model, and Lumino widgets to recreate DataLab's panel/object-tree/menu workflow. |
| Static deployability | ✅ JupyterLite deploys statically. |
| Embeddability | 🔴 JupyterLab's shell is large and opinionated; embedding it as a sub-component inside an Angular host, for example, is awkward. |
| Plugin compatibility | 🔴 DataLab plugins target a Qt-style dialog/action API, not JupyterLab's command palette and Lumino widgets. A second plugin abstraction would be needed. |
| Sustainability | 🟡 Bundle size and cold start are larger than a focused SPA; we would also depend on the JupyterLab release cadence. |

Why not chosen: JupyterLite is excellent at running notebooks in the browser, but DEW's deliverable is not a notebook environment — it is a feature-rich data-processing application whose interaction model (panels, object tree, ROI editor, contextual menus, parameter dialogs) maps poorly onto a JupyterLab shell. The integration cost exceeds the cost of building a focused React shell, and the result would be harder to embed in third-party hosts.

Option C — Server-side Python web framework (Panel / Dash / Solara / Streamlit)

Run DataLab on a server using a Python web framework (Panel, Dash, Solara, Streamlit or similar). The browser becomes a thin client; computations execute on the server.

| Driver | Assessment |
| --- | --- |
| Local-first | 🔴 Data must be uploaded to the server. Incompatible with DataLab's confidentiality requirements unless every user runs their own backend, which defeats the "no install" goal. |
| Sigima reuse | ✅ Sigima runs natively server-side. |
| Feature parity | 🟡 Achievable, but rich UI interactions (live ROI editing on large images, contrast tools, profile dragging) suffer from round-trip latency. |
| Static deployability | 🔴 Requires a Python runtime, a process manager and per-user state; operationally incompatible with the static-hosting requirement. |
| Embeddability | 🟡 Possible via iframes, but every embedding scenario also pulls in the per-user backend. |
| Plugin compatibility | ✅ Plugins would run server-side, like in the desktop. |
| Sustainability | 🔴 Per-user compute and storage costs grow linearly with adoption. |

Why not chosen: This option violates the local-first driver (#1) and the static-deployability driver (#4), which are non-negotiable for the DEW project. It would also turn DataLab into a hosted service the project cannot sustainably operate.

Option D — Server-connected SPA over the existing Web API

A pure JavaScript/TypeScript SPA talking to the FastAPI Web API that already ships with DataLab desktop and DataLab-Kernel. The browser would render Plotly views and forms, but every computation would round-trip to a Python server hosting a Sigima process.

| Driver | Assessment |
| --- | --- |
| Local-first | 🔴 Same issue as Option C: every user needs a backend. The Web API was designed for single-user notebook integration, not multi-tenant hosting. |
| Sigima reuse | ✅ Direct. |
| Feature parity | 🟡 Reachable for batch-style operations; degraded for live interactive editing. |
| Static deployability | 🔴 The SPA is static, but the system as a whole is not. |
| Embeddability | 🟡 Same caveats as Option C. |
| Plugin compatibility | ✅ Plugins run server-side. |
| Sustainability | 🔴 Same as Option C. |

Why not chosen: Same fundamental misfit as Option C. The existing Web API remains the right tool for DataLab-Kernel and for remote-control scenarios where the server is the user's own machine — it is not the right foundation for the public DEW deployment.

4. Decision

Option A (Standalone React/TypeScript SPA + Sigima in Pyodide) is adopted as the architecture for DataLab-Web.

The architecture is intentionally factored so that:

  • src/runtime/ is the only module that touches the Pyodide API. All Python calls go through DataLabRuntime; the rest of the UI consumes typed TypeScript interfaces (SignalMeta, SignalData, ProcessingDescriptor, AnalysisResult, MacroRecord, NotebookRecord…).
  • src/runtime/bootstrap.py owns the in-browser hierarchical object model (panels → groups → objects → ROIs), mirroring DataLab desktop's ObjectModel. It is re-executable across HMR reloads (the _MODEL and _CATALOG singletons survive).
  • src/runtime/processor.py introspects the Sigima catalogue and exposes build_signal_catalog() / build_image_catalog() and the apply_* dispatch — the equivalent of DataLab desktop's register_1_to_1 / register_n_to_1 / etc. machinery. New Sigima entries surface in the menu bar with no JS change.
  • src/runtime/macroWorker.ts / notebookWorker.ts host secondary Pyodide instances (one per worker) so macros and notebook cells run off the UI thread.
  • src/runtime/dlw_plugins.py and the portable shim in src/runtime/dlplugins/datalab/ provide a PluginBase API source-compatible with desktop plugins, modulo await param.edit_async(...) for dialogs.
  • src/components/ holds presentational components with no Pyodide imports; parameter dialogs are auto-generated by DataSetDialog.tsx / DataSetForm/ from guidata DataSet schemas.
  • packages/sdk/ ships the host-side @datalab-platform/web-sdk so third-party applications can embed the static bundle and drive it through postMessage.
  • HDF5 is the single durable source of truth for workspaces; IndexedDB only holds recovery caches for macros and notebooks.

The persistence model, the testing pyramid (Vitest + React Testing Library, Playwright, pytest in Pyodide) and the contributor-facing documentation all derive from this decision.
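
The catalogue introspection attributed to src/runtime/processor.py in the list above follows a general pattern that can be sketched as follows. This is not Sigima's registration mechanism: the `processing` decorator, the `_entry` attribute, the stand-in `engine` module and the `build_catalog` / `apply` helpers are all hypothetical, chosen only to show how new engine functions could surface in a menu without any change on the UI side.

```python
import inspect
import types


def processing(kind, label):
    """Hypothetical decorator tagging a function as a catalogue entry."""
    def wrap(func):
        func._entry = {"kind": kind, "label": label}
        return func
    return wrap


@processing("1_to_1", "Absolute value")
def absolute(values):
    return [abs(v) for v in values]


@processing("n_to_1", "Sum")
def addition(*series):
    return [sum(column) for column in zip(*series)]


def _helper():
    """Untagged function: must not appear in the catalogue."""


# Stand-in for the computation engine module that would be introspected.
engine = types.ModuleType("fake_engine")
engine.absolute = absolute
engine.addition = addition
engine._helper = _helper


def build_catalog(module):
    """Collect every tagged function of the module into a name -> entry map."""
    catalog = {}
    for name, func in inspect.getmembers(module, inspect.isfunction):
        entry = getattr(func, "_entry", None)
        if entry is not None:
            catalog[name] = {**entry, "func": func}
    return catalog


def apply(catalog, name, *args):
    """Dispatch a call by catalogue name, as a menu action would."""
    return catalog[name]["func"](*args)


catalog = build_catalog(engine)
print(sorted(catalog))
print(apply(catalog, "absolute", [-1, 2, -3]))
```

Because the catalogue is derived from the module at run time, adding a tagged function to the engine is enough for it to become dispatchable by name.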

5. Consequences

Positive

  • Local-first by construction. No backend, no per-user infrastructure, no upload of user data. The bundle is auditable end-to-end.
  • Maximal Sigima reuse. The same Sigima wheel that runs in DataLab desktop and in DataLab-Kernel runs in the browser. Bug fixes and feature additions in Sigima propagate automatically.
  • Static deployability. A single dist/ directory hosted behind any web server. The GitHub Actions workflow ships a fresh build to GitHub Pages on every release.
  • Embeddability. The SDK turns the bundle into a drop-in component for host applications, demonstrated by the Angular integration examples.
  • Plugin compatibility. The PluginBase shim allows the same plugin source to run in DataLab desktop and in DataLab-Web with minimal changes.
  • Security posture. No server-side execution of user notebooks / macros; the browser tab is the only sandbox the project needs to reason about.
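
To make the "minimal changes" of the plugin compatibility point concrete, here is a small sketch of a plugin whose only browser-specific adaptation is awaiting the parameter dialog. The `Param`, `PluginBase` and `ScalePlugin` classes below are hypothetical stand-ins written for this example; the real shim's API may differ, and the `answers` argument merely simulates what a user would type into the dialog.

```python
import asyncio


class Param:
    """Hypothetical stand-in for a guidata-style parameter set."""

    def __init__(self, **values):
        self.values = dict(values)

    async def edit_async(self, answers=None):
        """Pretend to show a dialog and return True when the user accepts.

        In a browser the dialog is rendered by the UI, so the Python side
        awaits the result instead of blocking.
        """
        await asyncio.sleep(0)  # yield control, as a real dialog would
        if answers is None:
            return False  # simulate the user cancelling the dialog
        self.values.update(answers)
        return True


class PluginBase:
    """Hypothetical minimal base class for this sketch."""
    name = "base"

    async def run(self, data, answers=None):
        raise NotImplementedError


class ScalePlugin(PluginBase):
    name = "scale"

    async def run(self, data, answers=None):
        param = Param(factor=1.0)
        # The awaited call is the only async adaptation in this plugin.
        if not await param.edit_async(answers):
            return data  # dialog cancelled: leave the data untouched
        return [x * param.values["factor"] for x in data]


result = asyncio.run(ScalePlugin().run([1.0, 2.0], answers={"factor": 3.0}))
print(result)
```

The rest of the plugin body stays ordinary synchronous Python; only the dialog call, and therefore the enclosing method, becomes a coroutine.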

Negative / accepted trade-offs

  • Cold-start cost. First load downloads Pyodide (~10 MB) and installs Sigima via micropip — typically 30–60 s on the user's first visit. Subsequent loads are cached by the browser. The prototype already mitigates this with a slim Plotly distribution (cbebd7e) and a worker-scoped Pyodide fixture for tests (65f2206); lazy-loading of sigima sub-modules is on the Milestone 1h backlog.
  • No PlotPy. Qt-based PlotPy is unavailable in the browser. We use Plotly.js and mirror PlotPy conventions where possible (curve styles, ROI overlays, geometry results) by following the patterns established in DataLab-Kernel/datalab_kernel/plotly_backend.py.
  • Main-thread Pyodide for processings. Today only macros and notebooks live in their own Pyodide instances; heavy processings still run on the main Pyodide instance. Moving them to a worker, with explicit cancellation primitives, is the core of Milestone 1h.
  • WebAssembly limits. A small number of Sigima dependencies (notably OpenCV) require care; the prototype uses opencv-python-headless where applicable and falls back to pure-Python or scikit-image equivalents otherwise.
  • Browser support. Modern Chromium/Firefox/Safari only; legacy browsers without WebAssembly SIMD are out of scope.

Implications for downstream tasks

  • Task 2 (Snapshot Sync Server). The local-first model implies that any sharing mechanism must be opt-in and snapshot-based. The Snapshot Sync Server is therefore designed as a manifest-only registry of immutable workspaces, not as a live execution backend.
  • Task 3 (Accessibility and Security). The static-bundle posture simplifies CSP definition and dependency auditing, and lets the accessibility work focus on a single rendering path (React + Plotly + custom dialogs).
  • Internationalisation (Milestone 1i). The framework will need to cover both UI strings (TypeScript side) and Python-side messages exposed through Pyodide; the DataLabRuntime boundary is the natural choke point to bridge the two.

6. Compliance and follow-up