refactor(runtimed): remove dead broadcast emissions #2065

Merged
rgbkrk merged 1 commit into main from refactor/remove-dead-broadcasts on Apr 23, 2026
@rgbkrk rgbkrk commented Apr 23, 2026

Summary

Remove 8 broadcast types fully superseded by RuntimeStateDoc CRDT sync. Net -285 lines.

Already filtered at relay (never reached frontend):

  • KernelStatus, ExecutionStarted, ExecutionDone, QueueChanged, EnvSyncState

Reached frontend but redundant with CellChangeset/CRDT materialization:

  • Output, DisplayUpdate, OutputsCleared

Remaining broadcasts (kept):

  • Comm (ephemeral widget messages, not CRDT-able)
  • EnvProgress (transient UI progress)
  • PathChanged (notebook doc concern)
  • NotebookAutosaved (event, not state)
  • KernelError (not in CRDT yet)

Why

The CRDT changeset pipeline (CellChangeset + scheduleMaterialize) is the source of truth for outputs, execution state, kernel status, and queue state. The broadcast path was a parallel channel that predated the CRDT. It could cause double-renders or stale overwrites if a broadcast arrived out of order relative to CRDT sync.

Also removes broadcast_tx from KernelState (no broadcasts left to send) and the peer relay filter that was dropping the already-filtered broadcasts.
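
To make the ordering hazard concrete, here is a minimal sketch of a view that is written by two channels. It is not the runtimed code: `DocSnapshot`, `View`, `materialize`, and `apply_broadcast` are hypothetical names, and the single `kernel_status` string stands in for whatever the real document carries. It only illustrates why a delayed event can overwrite state that was already derived from the document.

```rust
/// Hypothetical stand-in for the synced document: one authoritative status value.
struct DocSnapshot {
    kernel_status: String,
}

/// Hypothetical frontend view that two channels are allowed to write.
#[derive(Default)]
struct View {
    kernel_status: String,
}

impl View {
    /// State-based path: re-derive the view from the latest document snapshot.
    fn materialize(&mut self, doc: &DocSnapshot) {
        self.kernel_status = doc.kernel_status.clone();
    }

    /// Event-based path: apply whatever message arrives, with no ordering check.
    fn apply_broadcast(&mut self, status: &str) {
        self.kernel_status = status.to_string();
    }
}

/// Two channels: a "busy" broadcast sent earlier arrives after the
/// document already says "idle", and overwrites the materialized value.
fn view_with_late_broadcast() -> String {
    let doc = DocSnapshot { kernel_status: "idle".to_string() };
    let mut view = View::default();
    view.materialize(&doc);
    view.apply_broadcast("busy"); // delayed message, now stale
    view.kernel_status
}

/// One channel: the view is only ever derived from the document.
fn view_from_doc_only() -> String {
    let doc = DocSnapshot { kernel_status: "idle".to_string() };
    let mut view = View::default();
    view.materialize(&doc);
    view.kernel_status
}
```

In this toy model the two-channel view ends on the stale "busy" value while the document-only view ends on "idle". Removing the second writer avoids the problem without adding sequence numbers to the events.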

Test plan

  • cargo test -p runtimed - 389 tests pass
  • cargo xtask clippy - clean
  • cargo xtask lint - clean
  • Full workspace builds

Remove 8 broadcast types that are fully superseded by RuntimeStateDoc
CRDT sync:

Already filtered at relay (never reached frontend):
- KernelStatus, ExecutionStarted, ExecutionDone, QueueChanged, EnvSyncState

Reached frontend but redundant with CellChangeset/CRDT materialization:
- Output, DisplayUpdate, OutputsCleared

The CRDT changeset pipeline (CellChangeset + scheduleMaterialize) is the
source of truth for outputs and execution state. The broadcast path was
a parallel channel that predated the CRDT and could cause double-renders
or stale overwrites if messages arrived out of order.

Also removes the peer relay filter that was dropping the already-filtered
broadcasts, and removes broadcast_tx from KernelState (no broadcasts
left to send).

Remaining broadcasts (Comm, EnvProgress, PathChanged, NotebookAutosaved,
KernelError) are either ephemeral events or not representable in the CRDT.
@github-actions github-actions Bot added the daemon runtimed daemon, kernel management, sync server label Apr 23, 2026
@rgbkrk rgbkrk merged commit d8d6db3 into main Apr 23, 2026
16 of 17 checks passed
@rgbkrk rgbkrk deleted the refactor/remove-dead-broadcasts branch April 23, 2026 04:07
rgbkrk added a commit that referenced this pull request Apr 23, 2026
…broadcast (#2066)

* fix(runtimed-py): poll RuntimeStateDoc for wait_for_ready instead of broadcast

wait_for_ready was listening for KernelStatus broadcast which was
removed in #2065. Switch to polling handle.get_runtime_state() for
kernel.status == "idle", which is the CRDT source of truth.

* fix(runtimed-py): wait for state transition, not stale idle snapshot

Two-phase poll: first wait for status to leave "idle" (restart in
progress), then wait for it to return to "idle" (new kernel ready).
Prevents returning immediately against the pre-restart idle snapshot.
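
The two-phase poll described in this commit can be sketched as follows. This is an illustration in Rust under stated assumptions, not the runtimed-py implementation: the `get_status` closure is a hypothetical stand-in for reading `kernel.status` from the runtime state, and `WaitError`, the timeout, and the polling interval are invented for the example.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical error type: which phase ran out of time.
#[derive(Debug, PartialEq)]
enum WaitError {
    /// Status never left "idle" before the deadline.
    NeverLeftIdle,
    /// Status left "idle" but did not return to it before the deadline.
    NeverReturnedToIdle,
}

/// Polls `get_status` until it has left "idle" and then come back to "idle".
fn wait_for_ready<F>(
    mut get_status: F,
    timeout: Duration,
    interval: Duration,
) -> Result<(), WaitError>
where
    F: FnMut() -> String,
{
    let deadline = Instant::now() + timeout;

    // Phase 1: wait for the status to leave "idle" (restart in progress).
    loop {
        if get_status() != "idle" {
            break;
        }
        if Instant::now() >= deadline {
            return Err(WaitError::NeverLeftIdle);
        }
        thread::sleep(interval);
    }

    // Phase 2: wait for the status to return to "idle" (new kernel ready).
    loop {
        if get_status() == "idle" {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(WaitError::NeverReturnedToIdle);
        }
        thread::sleep(interval);
    }
}
```

A single-phase loop that returns on the first "idle" would succeed against the pre-restart snapshot; the first phase is what rules that out. One caveat of this sketch: if the status leaves and re-enters "idle" between two polls, phase 1 never observes the transition and the call times out.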
rgbkrk added a commit that referenced this pull request Apr 23, 2026
Reflect shipped state (PR 0 + RoomIdentity), drop the RoomDocState
substruct (D1 rationale: doc stays top-level), update field counts
against post-#2065 code, and reshape the migration path as three
atomic per-substruct PRs instead of the original scaffolding-then-
migrate split (Rust can't fake field access via methods).

Target is now three PRs: RoomBroadcasts, RoomPersistence,
RoomConnections. ~89 callsites remaining across them.
rgbkrk added a commit that referenced this pull request Apr 23, 2026
- RuntimeStateDoc moved from notebook-doc to runtime-doc (#2056)
- CRDT writes go through RuntimeStateHandle (#2059)
- Pull task uses fork()/merge() for async blob work
- Dead broadcasts removed (#2065) - manifest updates propagate via CRDT
- Updated review pointers to current file paths
rgbkrk added a commit that referenced this pull request Apr 23, 2026
* docs(specs): streaming Arrow IPC for DataFrame repr (#1816)

Design for dx emitting a Parquet head + a pull handle for
incremental Arrow IPC continuation, so huge DataFrames render
a first screenful immediately and grow in place.

Key shape:
- Head: 100-ish rows serialized as Parquet through the existing
  dx path. Sift's existing load hits immediately.
- Continuation: new `nteract.dx.stream.<id>` comm. Runtime agent
  pulls Arrow IPC chunks outside the execution-message hot path
  and appends them as blob refs in a new manifest field on the
  same output id.
- Transport is shared with #1815 (query backend).
- No mutable blobs, no ContentRef shape change — chunks are a
  JSON list of existing blob refs inside the manifest.
- Late joiners replay from the CRDT because chunks go through
  normal sync, not a side channel.

* docs: update streaming Arrow IPC spec for runtime-doc crate changes

- RuntimeStateDoc moved from notebook-doc to runtime-doc (#2056)
- CRDT writes go through RuntimeStateHandle (#2059)
- Pull task uses fork()/merge() for async blob work
- Dead broadcasts removed (#2065) - manifest updates propagate via CRDT
- Updated review pointers to current file paths

* docs: fix reserved-comm-namespace pointers in streaming Arrow IPC spec

The namespace rule moved out of CLAUDE.md and now lives in
.claude/rules/architecture.md § "Reserved Comm Namespace:
`nteract.dx.*`". Update the two spec references to point there.

No change to the design itself.

* feat(runtimed-wasm): install console_error_panic_hook on module init

Rust panics inside WASM currently surface to the frontend as an opaque
`__wbg___wbindgen_throw_6b64449b9b9ed33c` stack with wasm-function
indices and no file/line. The error reaches the App ErrorBoundary and
the "Something went wrong" fallback renders, but the cause is
invisible in packaged / CI builds.

This is exactly what's happening on UV Pyproject + UV Prewarmed E2E
today (post-#2103): something in the runtime-doc read path panics
when the daemon syncs a RuntimeState that walks through the full
lifecycle starting → running, and we have no way to name it.

Install `console_error_panic_hook::set_once()` from a
`#[wasm_bindgen(start)]` function so it runs exactly once before
any `NotebookHandle` is constructed. Panics now log with file, line,
message, and a Rust backtrace.
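
The install-once pattern can be illustrated with the standard library alone. The sketch below is an analog, not the code in this commit: the commit calls `console_error_panic_hook::set_once()` from a `#[wasm_bindgen(start)]` function, whereas this version uses `std::panic::set_hook` guarded by `std::sync::Once`. The names `install_panic_hook`, `last_panic_report`, and `LAST_PANIC` are hypothetical, and the report is kept in a static only so the example can be checked.

```rust
use std::panic;
use std::sync::{Mutex, Once};

/// Guards the hook so repeated calls install it only once.
static INSTALL: Once = Once::new();
/// Hypothetical sink for the last report, kept so the example is observable.
static LAST_PANIC: Mutex<Option<String>> = Mutex::new(None);

/// Installs a panic hook that reports file, line, and message.
/// Safe to call more than once; only the first call has an effect.
fn install_panic_hook() {
    INSTALL.call_once(|| {
        panic::set_hook(Box::new(|info| {
            let location = info
                .location()
                .map(|l| format!("{}:{}", l.file(), l.line()))
                .unwrap_or_else(|| "<unknown>".to_string());
            // panic!("literal") carries a &str, panic!("{x}") carries a String.
            let payload = info.payload();
            let message = if let Some(s) = payload.downcast_ref::<&str>() {
                (*s).to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                "<non-string panic payload>".to_string()
            };
            let report = format!("panicked at {location}: {message}");
            eprintln!("{report}");
            *LAST_PANIC.lock().unwrap() = Some(report);
        }));
    });
}

/// Returns the most recent report recorded by the hook, if any.
fn last_panic_report() -> Option<String> {
    LAST_PANIC.lock().unwrap().clone()
}
```

Calling `install_panic_hook()` at the top of an entry point mirrors what the start function does for the WASM module: the hook is in place before any other code can panic.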

Combined with #2101 (ErrorBoundary → host logger), the next failing
E2E run will emit both the React component stack and the Rust panic
payload into `e2e-logs/app.log`.

Rebuilds the WASM bundle to pick up the hook wiring.

Verification:
- `cargo xtask wasm runtimed` — succeeds
- `deno test --allow-read crates/runtimed-wasm/tests/` — shape test
  still passes (51 filtered + 1 ok, the expected set)

* feat(notebook-app): forward console.error to host logger

The wasm panic hook from the previous commit calls `console.error`.
In dev builds `attachConsole()` from tauri-plugin-log is DEV-only
(see packages/notebook-host/src/tauri/index.ts:280), and the plugin
only bridges Rust log output INTO the browser console — it doesn't
forward browser console OUT to Rust. In packaged / CI builds the
panic message goes to `console.error` and stops there.

Install a small forwarder in main.tsx: wrap `console.error` to also
call `logger.error` (host-log). WASM panics now land in notebook.log
alongside everything else, visible in CI's `e2e-logs/app.log`.

Preserves the original console.error behavior so devtools stays
unchanged. The forwarding call is in a try/catch so a logger failure
can't swallow the original error.