
DEW Report #1 ‐ 2026‐05‐14

Pierre Raybaut edited this page May 14, 2026 · 1 revision

Report #1 - Date: 2026-05-14

Scope of the report

This is the first progress report on the DataLab Experimental Web Interface (DEW) project, funded by the NLnet Foundation as part of the NGI0 Commons Fund. It covers all the work delivered in the DataLab-Web repository from its first commit on 2026-03-06 up to the date of this report.

The DEW project is structured around three tasks:

  1. Web Frontend Prototype — feature-rich browser-native reimplementation of DataLab running Sigima in Pyodide (CPython compiled to WebAssembly).
  2. Snapshot Sync Server — a lightweight FastAPI registry of immutable workspace snapshots enabling asynchronous collaboration with manifest-only storage.
  3. Accessibility and Security — accessibility improvements guided by recognised WCAG principles, plus security hardening.

This first report focuses exclusively on Task 1 (Web Frontend Prototype), which is the core deliverable and the only task on which work has started. Tasks 2 and 3 will be covered by subsequent reports.

The following table collects the main project links related to this report:

| Item | URL |
| --- | --- |
| Repository | https://github.com/DataLab-Platform/web |
| Live demo (GH Pages) | https://datalab-platform.com/web/ |
| MoU | DataLab Experimental Web Interface (DEW): Memorandum of Understanding (NLnet) |
| Sigima (engine) | https://github.com/DataLab-Platform/Sigima |
| DataLab-Kernel | https://github.com/DataLab-Platform/DataLab-Kernel |

Executive Summary

In about two months of focused development (2026-03-06 → 2026-05-11, 109 commits), DataLab-Web has reached a state where the entire computation engine of DataLab runs inside the browser through Pyodide/WASM, paired with a custom React + TypeScript user interface modelled on the desktop Qt application. Plotting is delegated to Plotly.js since Qt-based PlotPy is unavailable in the browser.

The current prototype already covers most of Task 1's planned scope:

  • Browser runtime with a typed Python ↔ JavaScript bridge over Pyodide, automatic Sigima catalogue introspection and a hierarchical in-memory object model mirroring DataLab desktop's ObjectModel (✅ Milestone 1b).
  • Signal and image panels with synthetic generators, Plotly visualisation, contrast tools, profile extraction, ROI editor (segment, rectangular, circular, polygonal), object tree, properties panel, statistics card, metadata editor and computation history (✅ Milestone 1c).
  • Macro and notebook subsystems, each running in a dedicated Web Worker (its own Pyodide instance) and exchanging data with the UI through an async proxy API mirroring DataLab's RemoteProxy (✅ Milestone 1d).
  • Plugin system providing a PluginBase API such that desktop Qt plugin source code runs unchanged in the browser, provided dialogs use await param.edit_async(...) (✅ Milestone 1e).
  • I/O subsystem with browser-side HDF5 read/write, an HDF5 browser dialog, a text import wizard, image I/O and per-directory save (✅ Milestone 1f).
  • Static deployment configured for sub-path hosting, with a GitHub Actions workflow producing a deployable bundle (✅ Milestone 1g).
  • Initial performance optimisations (zero-copy binary transfer for large signals/images over the remote-control bridge, ~5.5× speed-up on the multi-image grid, increased timeouts and refactored async handling) (🟡 Milestone 1h).
  • Three-layer test pyramid (Vitest + React Testing Library, Playwright end-to-end suite, pytest inside Pyodide) and an associated testing-strategy document for contributors.

Beyond the planned scope, two notable developments were delivered:

  • A remote-control bridge and TypeScript client SDK (@datalab-platform/web-sdk), packaged separately and demonstrated through Angular integration examples, allowing third-party teams to embed the static bundle and drive it through postMessage.
  • An AI Assistant panel with provider tools, opening the door to in-app conversational assistance built on top of the same remote-control surface.

Milestone Status for Report #1

The following sections summarise the status of each Task 1 milestone at the time of this report.

✅ 1a. Architecture consolidation and decision record

The architectural choices have been progressively consolidated in the repository's contributor-facing documentation (README.md, CONTRIBUTING.md, .github/copilot-instructions.md, doc/testing-strategy.md, doc/notebooks.md, doc/plugins.md), which already capture the Pyodide/WASM stack, the React + TypeScript UI, the Plotly visualisation choice, the persistence model and the testing pyramid.

The dispersed rationale has been promoted into a self-contained Architecture Decision Record: DEW-ADR-001 — Browser-Native Frontend Architecture. It records the eight decision drivers (local-first, Sigima reuse, feature parity, static deployability, embeddability, plugin compatibility, sustainability, accessibility/security) and a substantiated comparison to the alternatives considered: JupyterLite extensions, server-side Python frameworks (Panel / Dash / Solara / Streamlit) and a server-connected SPA over the existing FastAPI Web API.

✅ 1b. Browser runtime and Python ↔ JS bridge

The runtime layer is in place and has been progressively hardened:

  • Pyodide loader with version pinning (PYODIDE_VERSION synchronised between runtime.ts and index.html).
  • In-browser hierarchical object model owned by src/runtime/bootstrap.py (panels, groups, objects, ROIs), re-executable across hot-reloads.
  • Sigima catalogue introspection in src/runtime/processor.py (build_signal_catalog() / build_image_catalog()), exposing the apply_* dispatch — the equivalent of DataLab desktop's register_1_to_1 / register_n_to_1 machinery.
  • Typed TypeScript wrappers around all Python calls in src/runtime/runtime.ts, with JSON-friendly helper functions on the Python side (tolist() on arrays, plain dicts).
  • Two refactors stabilising the runtime symbol surface (src/sigima/ → src/runtime/, Sigima* → DataLabRuntime / Runtime*) so that Sigima remains the engine and DataLabRuntime is the browser orchestration layer.
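
The JSON-friendly helper pattern mentioned in the list above can be illustrated with a minimal sketch. This is not the project's actual code: the function name to_jsonable and the {"re", "im"} encoding of complex values are assumptions made for this example, and array.array stands in for a NumPy array because both expose tolist().

```python
import json
from array import array


def to_jsonable(value):
    """Recursively convert a value into builtins that json.dumps accepts.

    Hypothetical helper: the name and the complex-number encoding are
    assumptions for illustration only.
    """
    if hasattr(value, "tolist"):  # numpy.ndarray and array.array both have tolist()
        return to_jsonable(value.tolist())
    if isinstance(value, complex):
        # JSON has no complex type, so split into real and imaginary parts
        return {"re": value.real, "im": value.imag}
    if isinstance(value, dict):
        return {str(key): to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


payload = to_jsonable(
    {"x": array("d", [0.0, 0.5]), "y": [1 + 2j, 3.0], "title": "demo"}
)
encoded = json.dumps(payload)  # plain JSON text, safe to hand to the JS side
```

Keeping the conversion on the Python side means the TypeScript wrappers only ever see plain lists, numbers, strings and dicts.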

Notable commits: 97fe501 (initial scaffolding), 82f380f (generic processor and feature catalogue), 717ea6c (Phase 1 validation for Pyodide runtime and serialisation queue), b29f5e4 (timeouts and async handling), 14989f7 / dd28a90 (runtime renaming refactors), 38cf93b (complex-valued Y arrays in get_signal_xy).

✅ 1c. Core UI: signal and image panels, object tree, menus, dialogs, ROI editor

The core UI has reached a state that mirrors a large portion of the desktop Qt application:

  • Signal panel: 1D curves with synthetic generators (Gaussian, Lorentzian, Voigt, Planck blackbody, sine, sawtooth, triangle, square, sinc, chirp, step, exponential, logistic, pulses, polynomial, custom expressions, noise distributions…) and full Plotly visualisation with cross-hair markers, annotations and ROI overlays.
  • Image panel: 2D arrays with synthetic generators (2D Gaussian, ramp, checkerboard, sinusoidal grating, ring pattern, Siemens star, 2D sinc, uniform / normal / Poisson noise…), zoomable Plotly heatmap, contrast adjustment, LUT range management, colormap selector with invert toggle, cross profiles, stats area tools, image grid distribution and side-by-side multi-image grid view.
  • Object tree with multi-group workspace, selection-driven menus, properties tab, metadata editor, statistics card, computation history and "last processing" reapply.
  • Menu bar driven by automatic introspection of the Sigima catalogue (operations, transforms, filters, fitting, FFT/PSD, stability analyses…). New Sigima entries appear without any JS change.
  • Dialogs auto-generated by DataSetDialog.tsx / DataSetForm/ from guidata DataSet schemas, plus dedicated dialogs for interactive fit, profile definition, "Erase area", "Create ROI grid", help and shortcuts.
  • ROI management: segment, rectangular, circular and polygonal regions of interest with a dedicated editor, live-update support and grid view.
  • View options: light/dark theme toggle, resizable splitters with persisted layout, pop-out side panel, opt-in result-overlay box, graphical titles toggle.
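
To make the idea of an introspection-driven menu bar more concrete, here is a hedged sketch using only the standard library. It does not reproduce Sigima's registration mechanism or the actual build_signal_catalog() implementation; the functions normalize and derivative, the build_catalog name and the catalogue layout are all hypothetical.

```python
import inspect
import types


def normalize(src, method="maximum"):
    """Normalize signal amplitude."""
    return src


def derivative(src):
    """Compute the first derivative."""
    return src


# Hypothetical stand-in for a module of processing functions
demo_namespace = types.SimpleNamespace(
    normalize=normalize, derivative=derivative, _private=lambda src: src
)


def build_catalog(namespace):
    """Describe every public function so that a UI can build menu entries."""
    catalog = {}
    for name, func in inspect.getmembers(namespace, inspect.isfunction):
        if name.startswith("_"):
            continue  # private helpers do not become menu entries
        # Skip the first parameter, assumed here to be the source object
        params = list(inspect.signature(func).parameters)[1:]
        doc = inspect.getdoc(func) or name
        catalog[name] = {"label": doc.splitlines()[0], "params": params}
    return catalog


catalog = build_catalog(demo_namespace)
```

With this kind of design, adding a new function to the namespace is enough for a new entry to show up in the catalogue, which is the property the menu bar bullet describes.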

Notable commits: 1e6dadb (signal creation side panel), 1cd4201 (menu bar enhancements), 01a4147 (analysis results), af257ca (ROI icons and functionality), df5947c (image processing and UI), 2d58983 (draggable splitters and side panel pop-out), e04a695 (Properties panel with stats card, array preview, metadata editor), 84fd425 (OpenCV blob detection), c779c38 (image ROI editing), fc9d1f3 (interactive fitting dialog), 095d7fd (profile definition dialog), 649bcab (light/dark theme), 5b149ee / 2e6f0ff (visualisation polish), 417cfa2 (multi-image grid), 1007aa8 (graphical titles toggle), 537b60f (HDF5 browser tree expansion).

✅ 1d. Macros and notebooks subsystem

Both subsystems are functional and isolated from the UI thread:

  • Macros: embedded Python editor (CodeMirror with autocompletion and search) plus a console, mirroring DataLab's macro system. Macros run in a dedicated Web Worker (its own Pyodide instance) and call an async proxy API mirroring DataLab's RemoteProxy. Inline tab renaming and CRUD operations are supported.
  • Notebooks: multi-tab notebook panel with code & markdown cells, persistent in-browser autosave (IndexedDB), full nbformat v4.5 .ipynb import / export, a bundled Quickstart template and bidirectional Convert to macro / Convert to notebook actions (Spyder-style # %% / # %% [markdown] separators). See doc/notebooks.md.
  • Persistence model documented and implemented: HDF5 workspace as the single durable source of truth, with IndexedDB recovery caches for macro/notebook content, a workspace dirty marker in the title bar, a beforeunload confirmation guard and a one-time cold-start recovery banner.
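
The "Convert to notebook" direction can be sketched as follows. This is an illustrative approximation under stated assumptions, not the converter shipped in DataLab-Web: the function name macro_to_notebook is hypothetical, markdown lines are assumed to carry a leading "# " prefix, and only the fields required by nbformat 4.5 are emitted.

```python
import uuid


def make_cell(kind, lines):
    """Build a minimal nbformat 4.5 cell, or return None for an empty chunk."""
    text = "\n".join(lines).strip("\n")
    if not text:
        return None
    cell = {
        "id": uuid.uuid4().hex[:8],  # nbformat 4.5 requires a cell id
        "cell_type": kind,
        "metadata": {},
        "source": text,
    }
    if kind == "code":
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def macro_to_notebook(source):
    """Split a '# %%' separated script into notebook cells (hypothetical helper)."""
    cells, kind, lines = [], "code", []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("# %%"):
            cell = make_cell(kind, lines)
            if cell is not None:
                cells.append(cell)
            is_markdown = stripped[4:].strip().startswith("[markdown]")
            kind, lines = ("markdown" if is_markdown else "code"), []
        elif kind == "markdown":
            # Drop the comment prefix that hides markdown text from Python
            lines.append(line[2:] if line.startswith("# ") else line.lstrip("#"))
        else:
            lines.append(line)
    cell = make_cell(kind, lines)
    if cell is not None:
        cells.append(cell)
    return {"nbformat": 4, "nbformat_minor": 5, "metadata": {}, "cells": cells}


macro = "# %% [markdown]\n# Quick demo\n# %%\nx = 1\nprint(x)\n"
notebook = macro_to_notebook(macro)
```

The reverse conversion would walk the cells and emit the same separators, which is what makes the two actions bidirectional.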

Notable commits: ecdc3d2 (macro management), 0e1874f (async errors in macro worker), c352441 (notebook panel + e2e tests), b473ad7 (notebook CRUD), e6ace78 (unified recent store), 14a6095 (UI alignment between macro/notebook panels), 1cf817e (workspace dirty tracking and titled HDF5 sessions), 4c0dcb6 (cold-start recovery banner), 1178def (inline macro renaming).

✅ 1e. Plugin system compatible with the desktop Qt plugin API

A first-class plugin host is in place. The same plugin source can run in DataLab desktop and in DataLab-Web provided dialogs use await param.edit_async(...). The implementation lives in src/runtime/dlw_plugins.py and the portable shim in src/runtime/dlplugins/datalab/. See doc/plugins.md.
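
The reason for the await param.edit_async(...) requirement can be illustrated with a small asyncio sketch. Everything below is a stand-in: DemoParam, its answers argument and run_action are hypothetical and do not reflect the real PluginBase or parameter API, whose signatures are not given in this report.

```python
import asyncio


class DemoParam:
    """Hypothetical parameter set with an awaitable edit dialog."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    async def edit_async(self, answers=None):
        # Yield to the event loop, as a browser dialog has to do:
        # the page cannot block while the user fills in the form.
        await asyncio.sleep(0)
        if answers is None:
            return False  # simulates the user cancelling the dialog
        for name, value in answers.items():
            setattr(self, name, value)
        return True


async def run_action(param, answers):
    """Plugin-style action: ask for parameters, then compute."""
    accepted = await param.edit_async(answers)
    if not accepted:
        return None
    return [v for v in (0.2, 0.6, 0.9) if v >= param.threshold]


result = asyncio.run(run_action(DemoParam(), {"threshold": 0.6}))
cancelled = asyncio.run(run_action(DemoParam(), None))
```

A blocking dialog call would freeze the page in the browser, whereas an awaited call lets the UI keep running until the user answers.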

Notable commits: b614e8c (initial plugin system), 57a5971 (keep UI responsive when plugin dialog crosses Pyodide bridge).

✅ 1f. I/O subsystem: browser-side HDF5 access, text import wizard, file save/load

The I/O subsystem is operational:

  • HDF5: open and save full workspaces from the browser via h5py running in Pyodide, with a dedicated HDF5 browser dialog (selection-driven tree expansion, partial loading).
  • Text import wizard for CSV / TSV / column files with delimiter detection and preview.
  • Image I/O for individual files.
  • Per-directory save dialog producing a coherent on-disk layout via the browser file APIs.
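
Delimiter detection with preview, as offered by the text import wizard, could look like the sketch below. It uses a deliberately simple heuristic (a candidate must occur the same non-zero number of times on every line) and is not the wizard's actual algorithm; detect_delimiter, preview and the candidate list are assumptions made for this example.

```python
import csv
import io

CANDIDATES = (",", ";", "\t", " ")


def detect_delimiter(text, candidates=CANDIDATES):
    """Pick the candidate with a consistent, non-zero count on every line."""
    lines = [line for line in text.splitlines() if line.strip()]
    best, best_count = None, 0
    for delimiter in candidates:
        counts = {line.count(delimiter) for line in lines}
        if len(counts) == 1:  # same number of occurrences on each line
            (count,) = counts
            if count > best_count:
                best, best_count = delimiter, count
    if best is None:
        raise ValueError("no consistent delimiter found")
    return best


def preview(text, max_rows=5):
    """Return the detected delimiter and the first parsed rows."""
    delimiter = detect_delimiter(text)
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return delimiter, rows[:max_rows]


sample = "x;y\n0.0;1.5\n1.0;2.5\n"
delimiter, rows = preview(sample)
```

A production wizard would also need to handle quoted fields, decimal commas and header detection, which this heuristic ignores.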

Notable commits: c5ed19f (HDF5 workspace open/save), 9efdc7e (HDF5 browser), 49b83df (text import wizard), 41cb77e (Blob/Uint8Array compatibility fix), 79ff29f (Save to directory), fe57302 (image I/O), 537b60f (tree expansion controls).

✅ 1g. Static deployment for sub-path hosting (e.g. GitHub Pages)

The build pipeline produces a fully static bundle that can be hosted behind any web server. Vite is configured with base: "./" so the bundle is drop-in deployable to any sub-path. A GitHub Actions workflow produces and deploys the bundle. The 📦 Release Package (app + SDK) task additionally produces datalab-web-<v>.tgz and datalab-platform-web-sdk-<v>.tgz tarballs for third-party embedding.

Notable commits: 1a88537 (GitHub Pages deployment workflow), df84065 (initial guard around automatic deployment), cbebd7e (slim Plotly dist + path resolution), 36c38e2 (shippable static bundle for third-party Angular embedding).

🟡 1h. Performance optimisation and interruptible processing

Several performance improvements have already landed:

  • Zero-copy binary transfer for large signals/images over the remote-control bridge (3afe06c).
  • ~5.5× speed-up on the multi-image grid display (745dd73).
  • Increased timeouts and refactored async handling for stability under load (b29f5e4).
  • Use of the slim Plotly distribution to reduce bundle size (cbebd7e).

➡️ Remaining actions:

  • Move long-running Sigima computations off the main thread into a dedicated Web Worker (today only macros and notebooks live in their own Pyodide instances; processings still run in the main Pyodide instance).
  • Introduce explicit cancellation primitives so that long-running processings can be interrupted from the UI, mirroring the desktop application's "separate process" option.
  • Lazy-load sigima sub-modules to further reduce cold-start time.
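
One possible shape for the cancellation primitives listed above is cooperative cancellation, where the computation checks a flag between chunks of work. The sketch below is an assumption about how this could be approached, not a description of planned code: CancelToken, Cancelled and process_in_chunks are hypothetical names, and in a Web Worker setup the flag would have to be set from outside the running computation.

```python
class Cancelled(Exception):
    """Raised when a computation stops because cancellation was requested."""


class CancelToken:
    """Hypothetical flag shared between the UI side and the computation."""

    def __init__(self):
        self._cancelled = False

    def cancel(self):
        self._cancelled = True

    @property
    def cancelled(self):
        return self._cancelled


def process_in_chunks(data, chunk_size, token, on_progress=None):
    """Square the values chunk by chunk, checking the token between chunks."""
    out = []
    for start in range(0, len(data), chunk_size):
        if token.cancelled:
            raise Cancelled(f"stopped after {len(out)} items")
        out.extend(v * v for v in data[start:start + chunk_size])
        if on_progress is not None:
            on_progress(len(out), len(data))
    return out


token = CancelToken()


def stop_halfway(done, total):
    # Simulates a user pressing "Cancel" once half of the work is done
    if done >= total // 2:
        token.cancel()


try:
    process_in_chunks(list(range(10)), 2, token, stop_halfway)
    outcome = "finished"
except Cancelled as exc:
    outcome = str(exc)
```

The granularity of the chunks sets the trade-off: smaller chunks react faster to a cancel request but add more checking overhead.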

⬜ 1i. Internationalisation framework and French translation

Not started yet. The framework will need to cover both UI strings and Python-side messages exposed through Pyodide, with French as the reference locale and a documented translation contribution workflow.

🚧 Out-of-scope additions worth noting

Two developments delivered alongside Task 1 do not strictly belong to a planned milestone but are key to the project's viability:

  • Remote-control bridge and TypeScript client SDK (3d99506, 3afe06c, 311ca0d): packaged as @datalab-platform/web-sdk under packages/sdk/, with Angular integration examples. This makes DataLab-Web embeddable as a static bundle driven through postMessage and is the foundation for the AI Assistant.
  • AI Assistant panel (9092767): in-app AI assistant with provider tools, wired through the same remote-control surface.

These additions emerged from operational opportunities (Codra-side integration use cases) and have been made compatible with the local-first model: the SDK only relays user-initiated commands, and no data leaves the browser without explicit user action.

Quality and testing

A three-layer test pyramid was set up early in the project and has grown alongside features:

  • Vitest + React Testing Library for fast unit / component coverage of the TypeScript code.
  • Playwright for end-to-end specs driven through a real browser, with a worker-scoped Pyodide fixture to amortise cold-start cost (65f2206), a consolidated suite filling coverage gaps (c08d9d6, de483f1) and isolated performance probes.
  • pytest inside Pyodide for the Python helpers shipped with the runtime (tests/python).

A dedicated testing strategy document (e879883) is published for contributors, including the promotion criteria for permanent E2E specs and the use of throwaway probes (tests/e2e/_repro_*.spec.ts) for one-shot diagnosis. Vitest is documented as a mandatory pre-completion check (d64c4dd).

Linting and formatting are enforced via ESLint, Prettier, Ruff and Pylint (f4d7ded, 8c98e0e, f8dc43f, 680e52b, 5724cae).

Assisted-by trailers and the NLnet GenAI policy

Following the NLnet GenAI policy, commits that benefit from AI assistance carry an Assisted-by: <Model> <Version> trailer. Both README.md and CONTRIBUTING.md document this requirement explicitly (04f696a). Roughly half of the commits in this period carry such a trailer; human review remained mandatory before any commit and structural decisions stayed under human responsibility.
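
As an illustration of how such a trailer could be checked mechanically, here is a small sketch that extracts Assisted-by values from the last paragraph of a commit message. The function name assisted_by and the model name "ExampleModel 1.0" are placeholders invented for this example; the report does not describe any automated check.

```python
def assisted_by(message):
    """Return the values of all Assisted-by trailers in a commit message.

    Trailers are assumed to sit in the last paragraph, after the subject.
    """
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return []  # a subject line alone carries no trailer block
    values = []
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "assisted-by" and value.strip():
            values.append(value.strip())
    return values


commit_message = (
    "Add HDF5 browser\n\n"
    "Expand tree on selection.\n\n"
    "Assisted-by: ExampleModel 1.0\n"
)
models = assisted_by(commit_message)
```

Counting the commits for which this returns a non-empty list is one way to arrive at a proportion such as the one quoted above.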

Code References for Report #1

  • Repository: github.com/DataLab-Platform/web
  • Default branch: main
  • First commit: 97fe501 (2026-03-06)
  • Latest commit at the time of this report: 537b60f (2026-05-10)
  • Total commits in the period: 109
  • Current package version: 0.1.0 (Beta) — surfaced in the menubar as a Beta badge (1aa50b1)

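Key contributor-facing documents:

  • README.md
  • CONTRIBUTING.md
  • .github/copilot-instructions.md
  • doc/testing-strategy.md
  • doc/notebooks.md
  • doc/plugins.md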

Next steps after Report #1

Short-term focus, in approximate order:

  1. Milestone 1h — Move long-running processings off the main thread and introduce cancellation primitives.
  2. Milestone 1i — Implement the internationalisation framework and ship the French translation as the reference locale.
  3. Task 2 — Start the Snapshot Sync Server (FastAPI registry, manifest schema, "Shared workspace" UI integration).
  4. Task 3 — Begin accessibility improvements (keyboard navigation, ARIA labels, contrast checks) and the security hardening pass (CSP, dependency audit in CI, sanitisation of imported macros and notebooks).

Overall progress summary

Status legend: ✅ done · 🟡 partial · 🚧 in progress · ⬜ not started.

| Task | Sub-task | Status |
| --- | --- | --- |
| 1. Web Frontend Prototype | | 🟡 |
| | 1a. Architecture consolidation and decision record | ✅ |
| | 1b. Browser runtime and Python ↔ JS bridge | ✅ |
| | 1c. Core UI: signal and image panels, object tree, menus, dialogs, ROI editor | ✅ |
| | 1d. Macros and notebooks subsystem | ✅ |
| | 1e. Plugin system compatible with the desktop Qt plugin API | ✅ |
| | 1f. I/O subsystem: browser-side HDF5 access, text import wizard, file save/load | ✅ |
| | 1g. Static deployment for sub-path hosting (e.g. GitHub Pages) | ✅ |
| | 1h. Performance optimisation and interruptible processing | 🟡 |
| | 1i. Internationalisation framework and French translation | ⬜ |
| 2. Snapshot Sync Server | | ⬜ |
| | 2a. FastAPI snapshot registry server with documented REST schemas | ⬜ |
| | 2b. UI integration in DataLab-Web ("Shared workspace" menu) | ⬜ |
| | 2c. Architecture decision record, scriptable demo scenario, end-to-end Playwright tests | ⬜ |
| 3. Accessibility and Security | | ⬜ |
| | 3a. Accessibility improvements | ⬜ |
| | 3b. Security hardening and audit response | ⬜ |
