
Fix deprecation, dead code warnings, and other clippiness #1

Merged
rgbkrk merged 2 commits into main from fix-deprecation-warnings on Feb 11, 2026

Conversation

@rgbkrk
Member

@rgbkrk rgbkrk commented Feb 11, 2026

Replaces all deprecated create_client_shell_connection calls with create_client_shell_connection_with_identity + peer_identity_for_session from runtimelib 1.2.0. Suppresses the dead code warning on execute_with_stdin.

- sidecar: replace deprecated create_client_shell_connection with
  create_client_shell_connection_with_identity + peer_identity_for_session
  at all three call sites (run, request_kernel_info, request_python_cwd)
- runt: #[allow(dead_code)] on execute_with_stdin (useful but unused)
- remove build step for packages/ui (source-only library, no build script)
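The shape of the call-site change can be sketched with stand-in stubs. Everything below is hypothetical scaffolding — the real runtimelib 1.2.0 functions take connection info and return live sockets, and their actual signatures are not shown in this PR:

```rust
// Stub types standing in for runtimelib (hypothetical; real signatures differ).
struct ShellConnection {
    identity: Vec<u8>,
}

// Deprecated API: generated a peer identity internally.
#[allow(dead_code)]
fn create_client_shell_connection(_session_id: &str) -> ShellConnection {
    ShellConnection { identity: vec![0] }
}

// Replacement pair: derive a stable identity for the session, then pass it in
// explicitly at each call site (run, request_kernel_info, request_python_cwd).
fn peer_identity_for_session(session_id: &str) -> Vec<u8> {
    session_id.as_bytes().to_vec()
}

fn create_client_shell_connection_with_identity(identity: Vec<u8>) -> ShellConnection {
    ShellConnection { identity }
}

fn main() {
    let identity = peer_identity_for_session("session-abc");
    let conn = create_client_shell_connection_with_identity(identity.clone());
    assert_eq!(conn.identity, identity);
    println!("identity bytes: {}", conn.identity.len());
}
```

The point of the migration is that the identity becomes an explicit input rather than something each connection invents on its own.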
@rgbkrk rgbkrk force-pushed the fix-deprecation-warnings branch from d270cdd to 6a3cf14 on February 11, 2026 18:52
- Remove redundant `use env_logger` import
- Box JupyterMessage in SidecarEvent to reduce enum size (464 → ~8 bytes)
- Use shorthand field init (`content` instead of `content: content`)
- Remove useless .into() on string literals in Key::Character()
- Compare Method::GET directly instead of &Method::GET
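The Box change is the interesting one: a Rust enum is as large as its largest variant, so one oversized payload inflates every value of the type. A minimal sketch (the 456-byte struct below is a made-up stand-in for JupyterMessage):

```rust
use std::mem::size_of;

// Stand-in for a large payload such as JupyterMessage (size is illustrative).
#[allow(dead_code)]
struct BigMessage {
    _buf: [u8; 456],
}

// Every variant pays for the largest one: even Shutdown occupies ~457 bytes.
#[allow(dead_code)]
enum EventUnboxed {
    Message(BigMessage),
    Shutdown,
}

// Boxing the large variant moves the payload to the heap; the enum itself
// shrinks to pointer size (Box is non-null, so the discriminant fits in the niche).
#[allow(dead_code)]
enum EventBoxed {
    Message(Box<BigMessage>),
    Shutdown,
}

fn main() {
    assert!(size_of::<EventUnboxed>() > 456);
    assert_eq!(size_of::<EventBoxed>(), size_of::<usize>());
    println!(
        "unboxed: {} bytes, boxed: {} bytes",
        size_of::<EventUnboxed>(),
        size_of::<EventBoxed>()
    );
}
```

This is exactly the trade clippy's `large_enum_variant` lint suggests: one extra allocation per message in exchange for cheap moves and small channel slots.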
@rgbkrk rgbkrk changed the title from "Fix deprecation and dead code warnings" to "Fix deprecation, dead code warnings, and other clippiness" on Feb 11, 2026
@rgbkrk rgbkrk merged commit 814595c into main Feb 11, 2026
7 checks passed
@rgbkrk rgbkrk deleted the fix-deprecation-warnings branch February 11, 2026 19:10
rgbkrk added a commit that referenced this pull request Feb 24, 2026
…update_display_data

- Fix High #1: Normalize daemon output from JupyterMessageContent to nbformat shape
- Fix High #2: Add daemon_execution to save_setting_locally for local persistence
- Fix Medium #3: Add daemon_execution to from_json and apply_json_changes for migration
- Fix Medium #4: Add onUpdateDisplayData callback for update_display_data handling
- Fix Low #5: Remove verbose broadcast logging (keep only error logs)
rgbkrk added a commit that referenced this pull request Feb 25, 2026
…update_display_data

- Fix High #1: Normalize daemon output from JupyterMessageContent to nbformat shape
- Fix High #2: Add daemon_execution to save_setting_locally for local persistence
- Fix Medium #3: Add daemon_execution to from_json and apply_json_changes for migration
- Fix Medium #4: Add onUpdateDisplayData callback for update_display_data handling
- Fix Low #5: Remove verbose broadcast logging (keep only error logs)
rgbkrk added a commit that referenced this pull request Feb 25, 2026
* Add Tauri commands for daemon kernel execution

Extends the notebook sync client to support request/response patterns:
- Add NotebookBroadcastReceiver for kernel events from daemon
- Add send_request() method with typed frames
- Add recv_frame_any() for handling all frame types
- Forward broadcasts to frontend via daemon:broadcast event

New Tauri commands for daemon-owned execution:
- launch_kernel_via_daemon
- queue_cell_via_daemon
- clear_outputs_via_daemon
- interrupt_via_daemon
- shutdown_kernel_via_daemon
- get_daemon_kernel_info
- get_daemon_queue_state

This is Step 4 of the daemon-owned kernel execution plan.
Frontend hooks (Step 5) still need to be updated to use these.
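The request/response pattern described above can be sketched with a tagged frame enum. All names here are illustrative, not the crate's real types: requests carry an id so a later response can be matched to them, while broadcasts arrive unsolicited and are forwarded onward.

```rust
use std::collections::HashMap;

// Hypothetical typed frames for the daemon wire protocol.
#[derive(Debug, PartialEq)]
enum Frame {
    Request { id: u64, body: String },
    Response { id: u64, body: String },
    Broadcast { event: String },
}

struct Client {
    next_id: u64,
    pending: HashMap<u64, String>, // request id -> body awaiting a response
}

impl Client {
    fn send_request(&mut self, body: &str) -> Frame {
        self.next_id += 1;
        self.pending.insert(self.next_id, body.to_string());
        Frame::Request { id: self.next_id, body: body.to_string() }
    }

    // Handles all frame types: responses resolve a pending request,
    // broadcasts are handed to a callback (e.g. a daemon:broadcast event).
    fn recv_frame_any(&mut self, frame: Frame, on_broadcast: impl Fn(&str)) -> Option<String> {
        match frame {
            Frame::Response { id, body } => self.pending.remove(&id).map(|_| body),
            Frame::Broadcast { event } => {
                on_broadcast(&event);
                None
            }
            Frame::Request { .. } => None, // clients never receive requests
        }
    }
}

fn main() {
    let mut client = Client { next_id: 0, pending: HashMap::new() };
    let sent = client.send_request("queue_cell");
    let id = match sent { Frame::Request { id, .. } => id, _ => unreachable!() };

    // A broadcast may interleave before the response; both are handled.
    let none = client.recv_frame_any(
        Frame::Broadcast { event: "kernel_status".into() },
        |e| assert_eq!(e, "kernel_status"),
    );
    assert!(none.is_none());

    let reply = client.recv_frame_any(Frame::Response { id, body: "ok".into() }, |_| {});
    assert_eq!(reply.as_deref(), Some("ok"));
}
```

Keeping the pending map keyed by id is what lets one connection multiplex requests and broadcasts without head-of-line blocking.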

* Add frontend useDaemonKernel hook and broadcast types

New frontend components for daemon-owned kernel execution:

types.ts:
- DaemonBroadcast: Types for kernel status, outputs, queue changes, errors
- DaemonNotebookResponse: Types for daemon request responses

useDaemonKernel.ts:
- Hook for daemon kernel operations
- Listens for daemon:broadcast events
- Provides launchKernel, queueCell, clearOutputs, interruptKernel, shutdownKernel
- Tracks kernel status, queue state, and kernel info
- Parses output JSON from broadcasts and calls onOutput callback

This hook is separate from useKernel (local execution) to allow
gradual migration. Apps can choose which execution mode to use.

* Add daemon_execution setting to synced settings

New setting to enable/disable daemon-owned kernel execution:

Backend (settings_doc.rs, sync_client.rs):
- Add daemon_execution: bool to SyncedSettings struct
- Add get_bool/put_bool methods to SettingsDoc
- Handle boolean values in put_value for Tauri commands

Frontend (useSyncedSettings.ts):
- Add daemonExecution state and setDaemonExecution callback
- Sync across windows via settings:changed event

The setting defaults to false. When enabled (in a future PR),
the app will use useDaemonKernel instead of useKernel.
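A minimal sketch of the `get_bool`/`put_bool` accessors described above, assuming a plain map as the backing store (the real SettingsDoc is backed by an Automerge document):

```rust
use std::collections::HashMap;

// Simplified stand-in for SettingsDoc; a HashMap replaces the Automerge doc.
struct SettingsDoc {
    values: HashMap<String, bool>,
}

impl SettingsDoc {
    fn put_bool(&mut self, key: &str, value: bool) {
        self.values.insert(key.to_string(), value);
    }

    // Missing keys fall back to a caller-supplied default, which is how
    // daemon_execution can "default to false" without ever being written.
    fn get_bool(&self, key: &str, default: bool) -> bool {
        self.values.get(key).copied().unwrap_or(default)
    }
}

fn main() {
    let mut doc = SettingsDoc { values: HashMap::new() };
    assert!(!doc.get_bool("daemon_execution", false)); // unset -> default false
    doc.put_bool("daemon_execution", true);
    assert!(doc.get_bool("daemon_execution", false));
    println!("daemon_execution = {}", doc.get_bool("daemon_execution", false));
}
```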

* Fix daemon_execution field in SyncedSettings tests

* Address review feedback: output normalization, settings persistence, update_display_data

- Fix High #1: Normalize daemon output from JupyterMessageContent to nbformat shape
- Fix High #2: Add daemon_execution to save_setting_locally for local persistence
- Fix Medium #3: Add daemon_execution to from_json and apply_json_changes for migration
- Fix Medium #4: Add onUpdateDisplayData callback for update_display_data handling
- Fix Low #5: Remove verbose broadcast logging (keep only error logs)

* Wire daemon execution switch in App.tsx

When daemon_execution setting is enabled:
- Use useDaemonKernel hook for kernel operations
- Launch kernel via daemon instead of local ensureKernelStarted
- Queue cells via daemon with cell source
- Route outputs through daemon broadcasts

The switch allows testing daemon-owned kernel execution while
keeping local execution as the stable default.

* Fix daemon execution order: launch kernel before queuing cells

The daemon returns NoKernel when QueueCell is called without a running
kernel, dropping the request. Fix by ensuring the kernel is launched first,
then queuing cells once it's ready.

* Bump runtimed to 0.1.0-dev.3 and speed up dev builds

- Bump version from 0.1.0-dev.2 to 0.1.0-dev.3
- cargo xtask build now uses debug mode for runtimed/runt-cli (~50s faster)
- cargo xtask build-dmg/build-app still use release mode for distribution
- cargo xtask install-daemon unchanged (always release for perf)

* Update Cargo.lock for runtimed version bump
rgbkrk added a commit that referenced this pull request Feb 25, 2026
The daemon was regenerating Jupyter message headers when sending comm
messages to the kernel, breaking widget protocol expectations. This caused
slider interactions in one window to not sync to other windows.

Changes:
- Send full message envelope through daemon instead of just msg_type/content
- Parse and preserve frontend's original header (session, msg_id, etc.)
- Use JupyterMessageContent::from_type_and_content for proper type parsing
- Handle no_kernel response as an error condition in frontend

This fixes issue #1 from code review: daemon routing was dropping the
original frontend header/session and regenerating new headers per message.
rgbkrk added a commit that referenced this pull request Feb 25, 2026
* Enable widget support (ipywidgets) in daemon execution mode

Implement bidirectional comm message routing through the daemon to support
ipywidgets and other comm-based outputs. Comm messages from the kernel are
broadcast to all connected frontends, and widget interactions are routed back
through the daemon to the kernel. All windows sharing a notebook can now
interact with widgets with synchronized state.

- Add NotebookBroadcast::Comm variant for broadcasting comm_open/msg/close
- Handle comm messages in kernel iopub task and route through daemon
- Add NotebookRequest::SendComm for frontend→kernel comm messages
- Extend useDaemonKernel hook with onCommMessage callback and sendCommMessage
- Wire daemon comm sending in App.tsx with module-level sender pattern
- Bump daemon version to 0.1.0-dev.9

* Fix widget sync by preserving frontend message headers

The daemon was regenerating Jupyter message headers when sending comm
messages to the kernel, breaking widget protocol expectations. This caused
slider interactions in one window to not sync to other windows.

Changes:
- Send full message envelope through daemon instead of just msg_type/content
- Parse and preserve frontend's original header (session, msg_id, etc.)
- Use JupyterMessageContent::from_type_and_content for proper type parsing
- Handle no_kernel response as an error condition in frontend

This fixes issue #1 from code review: daemon routing was dropping the
original frontend header/session and regenerating new headers per message.

* Bump daemon version to 0.1.0-dev.10

* Document widget multi-window sync limitation (#276)

Update runtimed.md to reflect current widget support status:
- Widget comm routing is implemented (PR #275)
- Single-window widgets work correctly
- Multi-window sync is a known limitation tracked in #276
rgbkrk added a commit that referenced this pull request Feb 27, 2026
Track active comm channels in daemon and send CommSync broadcast to newly
connected clients, enabling widget reconstruction in secondary windows.

- Add CommState module for tracking comm_open/msg/close
- Wire CommState into NotebookRoom and kernel manager
- Send CommSync after initial Automerge sync on client connect
- Handle comm_sync in frontend to synthesize comm_open messages
- Clear comm state on kernel shutdown

Known limitations:
- Third window opened after execution may not receive comm_sync (needs investigation)
- Widget state not synced in real-time (by design - avoids feedback loops)
- Buffer paths not fully reconstructed (Codex review finding #1)
- Replay order nondeterministic (Codex review finding #2)

Closes #276

Co-Authored-By: QuillAid <261289082+quillaid@users.noreply.github.com>
rgbkrk added a commit that referenced this pull request Feb 27, 2026
* Add widget comm state sync for multi-window support

Track active comm channels in daemon and send CommSync broadcast to newly
connected clients, enabling widget reconstruction in secondary windows.

- Add CommState module for tracking comm_open/msg/close
- Wire CommState into NotebookRoom and kernel manager
- Send CommSync after initial Automerge sync on client connect
- Handle comm_sync in frontend to synthesize comm_open messages
- Clear comm state on kernel shutdown

Known limitations:
- Third window opened after execution may not receive comm_sync (needs investigation)
- Widget state not synced in real-time (by design - avoids feedback loops)
- Buffer paths not fully reconstructed (Codex review finding #1)
- Replay order nondeterministic (Codex review finding #2)

Closes #276

Co-Authored-By: QuillAid <261289082+quillaid@users.noreply.github.com>

* Address Codex review findings #2 and #3

- Use sequence numbers for deterministic comm replay order (finding #2)
  Widget models can reference other widgets via IPY_MODEL_ references. Returning comms
  in insertion order ensures dependencies are created before dependents.

- Clear comm state when launching new kernel (finding #3)
  If a kernel crashes without proper shutdown, stale comm snapshots could
  persist. Now cleared on new kernel launch, not just explicit shutdown.

Co-Authored-By: QuillAid <261289082+quillaid@users.noreply.github.com>
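The sequence-number fix for finding #2 can be sketched as follows (simplified: real comm payloads are Jupyter messages, not strings):

```rust
use std::collections::HashMap;

// Each tracked comm gets a monotonically increasing sequence number, so a
// snapshot replays in insertion order regardless of HashMap iteration order.
struct CommState {
    next_seq: u64,
    comms: HashMap<String, (u64, String)>, // comm_id -> (seq, opening payload)
}

impl CommState {
    fn comm_open(&mut self, comm_id: &str, payload: &str) {
        self.next_seq += 1;
        self.comms.insert(comm_id.to_string(), (self.next_seq, payload.to_string()));
    }

    fn comm_close(&mut self, comm_id: &str) {
        self.comms.remove(comm_id);
    }

    // Deterministic replay: sort by seq so dependencies precede dependents.
    fn snapshot(&self) -> Vec<String> {
        let mut entries: Vec<_> = self.comms.values().cloned().collect();
        entries.sort_by_key(|(seq, _)| *seq);
        entries.into_iter().map(|(_, payload)| payload).collect()
    }
}

fn main() {
    let mut state = CommState { next_seq: 0, comms: HashMap::new() };
    state.comm_open("a", "IntSlider");          // dependency, opened first
    state.comm_open("b", "HBox[IPY_MODEL_a]");  // dependent, opened second
    state.comm_open("c", "Label");
    state.comm_close("c"); // closed comms drop out of the snapshot
    assert_eq!(state.snapshot(), vec!["IntSlider", "HBox[IPY_MODEL_a]"]);
}
```

Iterating the HashMap directly would yield an arbitrary order per run, which is exactly the nondeterminism the finding flagged.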

* Add error logging for broadcast deserialization failures

Previously, failed deserializations during init were silently ignored.
Now logs a warning with the error and payload size to help diagnose
issues like comm_sync not being received.

Co-Authored-By: QuillAid <261289082+quillaid@users.noreply.github.com>

* Add diagnostic logging for comm_sync handling

Helps diagnose third window widget sync issue by logging:
- When comm_sync event is received
- Whether onCommMessage handler is available
- Unknown broadcast event types

Co-Authored-By: QuillAid <261289082+quillaid@users.noreply.github.com>

---------

Co-authored-by: QuillAid <261289082+quillaid@users.noreply.github.com>
rgbkrk added a commit that referenced this pull request Feb 28, 2026
- Fix issue #1: Remove pending completion entry if shell.send() fails,
  preventing stale entries from accumulating
- Fix issue #3: Downgrade completion logging from info! to debug! to
  reduce noise from high-frequency autocomplete requests
rgbkrk added a commit that referenced this pull request Feb 28, 2026
* Add daemon support for code completions (Tab/Ctrl+Space)

Implement code completion requests via the daemon using the Jupyter protocol's
complete_request/complete_reply messages. Uses LSP-ready CompletionItem
structure for future extensibility with additional completion sources (ruff,
basedpyright). Closes #314.

* Address review feedback: clean up pending completions on send failure

- Fix issue #1: Remove pending completion entry if shell.send() fails,
  preventing stale entries from accumulating
- Fix issue #3: Downgrade completion logging from info! to debug! to
  reduce noise from high-frequency autocomplete requests
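The cleanup for issue #1 follows a common insert-then-rollback pattern: the pending entry is registered before sending, and removed again if the send fails. A self-contained sketch, with `shell_send` as a hypothetical stand-in for the real shell socket send:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for shell.send(); fails for ids starting with "bad".
fn shell_send(msg_id: &str) -> Result<(), String> {
    if msg_id.starts_with("bad") {
        Err("socket closed".into())
    } else {
        Ok(())
    }
}

fn request_completion(
    pending: &mut HashMap<String, String>,
    msg_id: &str,
    code: &str,
) -> Result<(), String> {
    pending.insert(msg_id.to_string(), code.to_string());
    if let Err(e) = shell_send(msg_id) {
        // Roll back on failure so stale entries can't accumulate.
        pending.remove(msg_id);
        return Err(e);
    }
    Ok(())
}

fn main() {
    let mut pending = HashMap::new();
    assert!(request_completion(&mut pending, "ok-1", "im").is_ok());
    assert_eq!(pending.len(), 1);
    assert!(request_completion(&mut pending, "bad-2", "pri").is_err());
    // The failed request left no stale entry behind.
    assert_eq!(pending.len(), 1);
}
```

Without the rollback, every failed autocomplete request would leak one map entry — invisible at low volume, significant for high-frequency completion traffic.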
rgbkrk added a commit that referenced this pull request Feb 28, 2026
Bug #1: New notebooks with explicit runtime selection now set the
kernelspec metadata correctly. Previously, new_empty_with_runtime()
left kernelspec as None, causing the daemon to fall back to
default_runtime instead of respecting the user's choice.

- Set kernelspec in new_empty(), new_empty_with_runtime(),
  new_empty_with_conda_from_environment_yml(), and
  new_empty_with_uv_from_pyproject()

Bug #2: Notebooks in directories with pyproject.toml now properly
use the project's dependencies via `uv run --with ipykernel`.

- kernel_manager.rs: Branch on env_source for Python kernels;
  use `uv run` for uv:pyproject instead of pooled env
- notebook_sync_server.rs: Skip pool acquisition for uv:pyproject
  since kernel manager uses uv run directly
rgbkrk added a commit that referenced this pull request Feb 28, 2026
…#337)

* feat: extract kernel-launch crate for shared kernel launching

Create a new shared crate that both notebook and runtimed can use for
kernel launching and tool bootstrapping:

- Extract tools.rs from notebook crate into kernel-launch
- Add kernel-launch dependency to notebook and runtimed
- Update daemon to properly dispatch Python vs Deno kernels
- Deno kernels now use bootstrapped deno via kernel_launch::tools::get_deno_path()
- Python kernels require pooled environment (no kernelspec fallback)
- Add default_runtime check to auto_launch_kernel

This fixes the issue where Deno notebooks were incorrectly launching
Python kernels because the daemon wasn't checking default_runtime.

* fix: update LaunchKernel to handle Deno and require pool env

- Deno kernels skip pooled env acquisition
- Python kernels return error if pool empty (no kernelspec fallback)
- Non-prewarmed Python sources also require pool env

* feat: add inline Deno detection (runt.deno metadata)

- check_inline_deps now detects runt.deno in notebook metadata
- Auto-launch checks inline deps first, which can override default_runtime
- Notebooks with runt.deno config launch Deno kernels regardless of settings

* fix: prioritize project files over default_runtime and route conda to correct pool

Issues fixed:
1. Project files (pyproject.toml, environment.yml, pixi.toml) now take
   precedence over default_runtime setting. This prevents Deno from
   launching when opening a Python project notebook.

2. Conda-prefixed env_sources (conda:env_yml, conda:pixi, conda:inline)
   now correctly route to the conda pool instead of UV pool.

Detection priority is now:
1. Inline deps (runt.deno, runt.uv, runt.conda)
2. Project files (implies Python context)
3. Default runtime (only if no project context)

* fix: prioritize notebook kernelspec over project files for runtime detection

Detection priority is now:
1. Notebook's kernelspec (metadata.kernelspec.name) - determines python vs deno
2. For Python notebooks: resolve env (inline deps → project files → prewarmed)
3. For Deno notebooks: just launch Deno
4. For new notebooks (no kernelspec): use default_runtime setting

This allows Deno and Python notebooks to coexist in the same project directory.
A Deno notebook in a directory with pyproject.toml will still launch Deno.
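Stage 1 of that priority order can be sketched as a small pure function. This is a simplification under stated assumptions — real inputs come from notebook metadata and filesystem scans, and kernelspec names other than "deno" are treated here as Python:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Runtime {
    Python,
    Deno,
}

// Stage 1: the notebook's own kernelspec wins; default_runtime applies only
// to new notebooks with no kernelspec at all.
fn detect_runtime(kernelspec_name: Option<&str>, default_runtime: Runtime) -> Runtime {
    match kernelspec_name {
        Some("deno") => Runtime::Deno,
        Some(_) => Runtime::Python, // e.g. "python3" -> Stage 2 env resolution
        None => default_runtime,
    }
}

fn main() {
    // A Deno notebook in a pyproject.toml directory still launches Deno,
    // because the kernelspec is consulted before any project files.
    assert_eq!(detect_runtime(Some("deno"), Runtime::Python), Runtime::Deno);
    assert_eq!(detect_runtime(Some("python3"), Runtime::Deno), Runtime::Python);
    // New notebook, no kernelspec: fall back to the default_runtime setting.
    assert_eq!(detect_runtime(None, Runtime::Deno), Runtime::Deno);
}
```

Project-file detection (pyproject.toml, environment.yml, pixi.toml) then only runs inside the Python branch, which is what lets the two notebook kinds coexist in one directory.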

* docs: clarify kernel launching priority with two-stage detection

Document the new kernel launching architecture where runtime type
(Python vs Deno) is determined first, then environment resolution
for Python notebooks. This allows Python and Deno notebooks to
coexist in the same project directory.

- Add kernel-launch crate to documentation
- Explain Stage 1: Runtime Detection (kernelspec-first)
- Explain Stage 2: Python Environment Resolution (inline → project → prewarmed)
- Document Deno kernel bootstrapping via rattler
- Update developer guide (AGENTS.md) with new two-stage detection
- Update user guide (docs/environments.md) with clearer explanation
- Add detailed architecture notes to contributing/environments.md

* Fix kernel detection for new notebooks and pyproject support

Bug #1: New notebooks with explicit runtime selection now set the
kernelspec metadata correctly. Previously, new_empty_with_runtime()
left kernelspec as None, causing the daemon to fall back to
default_runtime instead of respecting the user's choice.

- Set kernelspec in new_empty(), new_empty_with_runtime(),
  new_empty_with_conda_from_environment_yml(), and
  new_empty_with_uv_from_pyproject()

Bug #2: Notebooks in directories with pyproject.toml now properly
use the project's dependencies via `uv run --with ipykernel`.

- kernel_manager.rs: Branch on env_source for Python kernels;
  use `uv run` for uv:pyproject instead of pooled env
- notebook_sync_server.rs: Skip pool acquisition for uv:pyproject
  since kernel manager uses uv run directly

* Fix inline UV dependencies to use uv run --with

For uv:inline sources, the kernel manager now uses `uv run --with <dep>`
for each inline dependency instead of a prewarmed pool environment.
This ensures that notebooks with inline dependencies (like `requests`)
actually have those packages available at runtime.

Changes:
- kernel_manager.rs: Add inline_deps parameter to launch(); handle
  uv:inline by building `uv run --with ipykernel --with <deps>...`
- notebook_sync_server.rs: Add get_inline_uv_deps() helper to extract
  dependency list from notebook metadata; pass None for pooled_env
  and extract deps when env_source is uv:inline

* feat(runtimed): add cached inline UV env support

Replace ephemeral `uv run --with` approach with cached environments
for inline UV dependencies. Environments are cached by dependency hash
in ~/.cache/runt/inline-envs/ for fast reuse on subsequent launches.

New module inline_env.rs provides:
- prepare_uv_inline_env(): Creates/returns cached venv with deps installed
- Dependency hash for cache key (sorted, stable)
- Uses `uv venv` + `uv pip install` for reliable installation

Changes:
- kernel_manager.rs: uv:inline now uses prepared cached env
- notebook_sync_server.rs: Calls prepare_uv_inline_env() before launch
  for both auto_launch_kernel and LaunchKernel request handler
- lib.rs: Export new inline_env module

TODO: Add conda:inline support using rattler (helper function added
but not yet wired up)
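The "dependency hash for cache key (sorted, stable)" idea amounts to: sort the dependency list before hashing so the same set always maps to the same cache directory. A sketch — the real implementation's hash function isn't specified in this PR, so `DefaultHasher` is a stand-in:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Sorted-then-hashed cache key for an inline-deps environment, so the same
// dependency set reuses one cached venv regardless of declaration order.
fn inline_env_key(deps: &[&str]) -> u64 {
    let mut sorted: Vec<&str> = deps.to_vec();
    sorted.sort_unstable();
    let mut hasher = DefaultHasher::new();
    sorted.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Order-insensitive: both spellings hit the same cached environment.
    assert_eq!(
        inline_env_key(&["requests", "numpy"]),
        inline_env_key(&["numpy", "requests"])
    );
    // A different dependency set gets a different environment.
    assert_ne!(inline_env_key(&["requests"]), inline_env_key(&["requests", "numpy"]));
}
```

In practice the key would be rendered as a directory name under the cache root; note that `DefaultHasher` is not stable across Rust releases, so a real on-disk cache would want a fixed hash (e.g. SHA-256 of the sorted list).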

* docs: document inline dependency environment caching

Add documentation for how uv:inline deps work:
- Cached environments in ~/.cache/runt/inline-envs/
- Hash-based caching for fast reuse
- Flow from detection to kernel launch

Add inline_env.rs to key files table.
rgbkrk added a commit that referenced this pull request Mar 1, 2026
1. Socket path mismatch (Issue #1): Session.connect() now respects
   RUNTIMED_SOCKET_PATH env var, and test fixtures set it when spawning
   daemon in CI mode.

2. Nested timeouts (Issue #2): Remove outer 10ms timeout wrapping
   recv_frame_any() since it already has internal 100ms timeout.
   This prevents repeatedly canceling mid-read.

3. sync_rx not drained (Issue #3): Use try_send() instead of send().await
   for changes channel. If receiver isn't keeping up, skip the update
   rather than blocking. Python bindings keep sync_rx alive but don't
   consume it.

4. Parse failure semantics (Issue #4): When output_type is "error" but
   parsing fails, create an error Output to preserve success=false
   semantics.

5. CONDUCTOR_WORKSPACE_PATH (Safia's comment): Use env var as preferred
   repo root fallback before walking up parent directories.
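Fix #3 is the standard "bounded channel with a slow (or absent) consumer" move: a blocking `send().await` would stall the sync loop forever, while `try_send()` drops the update when the channel is full. The same behavior can be shown with the std sync channel:

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

fn main() {
    // Capacity 2; the receiver is kept alive but never drained, mirroring
    // Python bindings that hold sync_rx without consuming it.
    let (tx, _rx) = sync_channel::<u32>(2);

    let mut delivered = 0;
    let mut skipped = 0;
    for change in 0..5 {
        match tx.try_send(change) {
            Ok(()) => delivered += 1,
            // Channel full: skip this update instead of blocking the loop.
            Err(TrySendError::Full(_)) => skipped += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    assert_eq!(delivered, 2);
    assert_eq!(skipped, 3);
    println!("delivered {delivered}, skipped {skipped}");
}
```

Dropping intermediate change notifications is safe here because the consumer only needs the latest state, not every delta.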
rgbkrk added a commit that referenced this pull request Mar 1, 2026
* feat(runtimed-py): add PyO3 bindings for daemon client

Add Python bindings for runtimed daemon operations:

- DaemonClient: pool status, ping, list rooms, flush, shutdown
- Session: connect to notebook room, start kernel, execute code
- ExecutionResult/Output: structured output types

Uses PyO3 0.28 with pyo3-async-runtimes for tokio integration.

Note: Current Session.execute() uses QueueCell shortcut which
bypasses automerge doc. See design gap notes for proper
doc-based execution flow.

* chore: add *.so to gitignore, remove stray architecture.md

- Add *.so to root gitignore for Python extension modules
- Remove contributing/architecture.md (lives on architectural-principles branch)

* feat(runtimed-py): implement document-first execution

Switch runtimed-py bindings to use ExecuteCell instead of deprecated
QueueCell. The daemon now reads cell source from the automerge document,
ensuring all connected clients see the same code being executed.

Changes:
- Add Cell class to expose cell info to Python
- Add document operations: create_cell, set_source, get_cell, get_cells, delete_cell
- Add execute_cell() using ExecuteCell (reads from doc)
- Add run() convenience method (create + execute)
- Fix sync task exit bug by storing sync_rx in SessionState
- Add timeout to recv_frame_any() to prevent blocking
- Add biased select! to prioritize commands over polling

* test(runtimed-py): add daemon integration tests

Add comprehensive integration tests for the document-first execution
pattern. Tests cover:
- Basic connectivity to daemon
- Document operations (create/update/get/delete cells)
- Cell execution via ExecuteCell (reading from automerge doc)
- Multi-client synchronization (two sessions sharing a notebook)
- Kernel lifecycle (start/interrupt/shutdown)
- Output types (stdout/stderr/display_data)
- Error handling

Supports two modes:
- Dev mode: uses existing daemon via `cargo xtask dev-daemon`
- CI mode: spawns isolated daemon with log capture

24 tests covering the core document-first architecture.

* ci(runtimed-py): add daemon integration tests to CI

Add a new job to the build workflow that runs the runtimed-py
integration tests. The job:
- Builds runtimed binary and runtimed-py Python bindings
- Runs 24 integration tests in CI mode (spawns isolated daemon)
- Uploads test logs as artifacts for debugging

Tests verify document-first execution, multi-client sync, kernel
lifecycle, output capture, and error handling.

* docs: add Python bindings documentation

Comprehensive guide for the runtimed Python package covering:
- Session API for code execution
- DaemonClient for low-level operations
- Document-first execution pattern
- Multi-client scenarios
- Result types (ExecutionResult, Output, Cell)
- Sidecar launcher for rich output

* docs(runtimed-py): add package README for PyPI

* fix(runtimed-py): address code review findings

1. Socket path mismatch (Issue #1): Session.connect() now respects
   RUNTIMED_SOCKET_PATH env var, and test fixtures set it when spawning
   daemon in CI mode.

2. Nested timeouts (Issue #2): Remove outer 10ms timeout wrapping
   recv_frame_any() since it already has internal 100ms timeout.
   This prevents repeatedly canceling mid-read.

3. sync_rx not drained (Issue #3): Use try_send() instead of send().await
   for changes channel. If receiver isn't keeping up, skip the update
   rather than blocking. Python bindings keep sync_rx alive but don't
   consume it.

4. Parse failure semantics (Issue #4): When output_type is "error" but
   parsing fails, create an error Output to preserve success=false
   semantics.

5. CONDUCTOR_WORKSPACE_PATH (Safia's comment): Use env var as preferred
   repo root fallback before walking up parent directories.
rgbkrk added a commit that referenced this pull request Mar 9, 2026
…ides

Add comprehensive documentation for developers:
- contributing/frontend-architecture.md: src/ vs apps/notebook/src/ split, path aliases, data flow
- contributing/testing.md: unified testing guide (Vitest, Rust, Hone, Python, E2E)
- contributing/typescript-bindings.md: ts-rs workflow for Rust → TypeScript types
- contributing/widget-development.md: widget system internals, WidgetStore API, adding widgets
- contributing/architecture.md: expanded Principle 5 with blob store implementation details

Addresses the high-priority documentation gaps from the gap analysis (#1-4); multi-window sync docs are intentionally omitted, as requested.
rgbkrk added a commit that referenced this pull request Mar 9, 2026
…ides (#652)

Add comprehensive documentation for developers:
- contributing/frontend-architecture.md: src/ vs apps/notebook/src/ split, path aliases, data flow
- contributing/testing.md: unified testing guide (Vitest, Rust, Hone, Python, E2E)
- contributing/typescript-bindings.md: ts-rs workflow for Rust → TypeScript types
- contributing/widget-development.md: widget system internals, WidgetStore API, adding widgets
- contributing/architecture.md: expanded Principle 5 with blob store implementation details

Addresses the high-priority documentation gaps from the gap analysis (#1-4); multi-window sync docs are intentionally omitted, as requested.
rgbkrk added a commit that referenced this pull request Mar 18, 2026
Agents (and humans) waste time when they don't know the right
incantation for rebuilding Python bindings, running integration
tests, or interacting with the dev daemon. Add a copy-paste-ready
Quick Recipes section at the top of AGENTS.md covering:

- Dev daemon env vars (the #1 footgun)
- Rebuilding runtimed-py into the correct venv
- Running Python integration tests against the dev daemon
- Launching the notebook app in dev mode
- WASM rebuild
- Pointers to the right contributing docs per task
rgbkrk added a commit that referenced this pull request Mar 18, 2026
* feat(runtimed-py): per-cell accessors for O(1) MCP reads

Add get_cell_source, get_cell_type, get_cell_outputs,
get_cell_execution_count, get_cell_ids, get_cell_metadata,
and get_cell_position to DocHandle, Session, and AsyncSession.

Each reads from the snapshot watch channel (no mutex) and only
clones the requested field — not the entire Vec<CellSnapshot>.
This makes MCP reads O(1) per field instead of O(n_cells).

Available on both Session (sync) and AsyncSession (async).
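The "clone only the requested field" point can be sketched as below. Field names on `CellSnapshot` are simplified stand-ins; the real accessors read the current value of a `watch::Receiver` rather than a plain `Vec`:

```rust
#[derive(Clone)]
struct CellSnapshot {
    id: String,
    source: String,
    cell_type: String,
}

struct DocHandle {
    snapshot: Vec<CellSnapshot>, // stand-in for the watch channel's value
}

impl DocHandle {
    // Finds the cell by id and clones out just the one field, instead of
    // cloning the entire Vec<CellSnapshot> and handing it all back.
    fn get_cell_source(&self, cell_id: &str) -> Option<String> {
        self.snapshot.iter().find(|c| c.id == cell_id).map(|c| c.source.clone())
    }

    fn get_cell_type(&self, cell_id: &str) -> Option<String> {
        self.snapshot.iter().find(|c| c.id == cell_id).map(|c| c.cell_type.clone())
    }

    fn get_cell_ids(&self) -> Vec<String> {
        self.snapshot.iter().map(|c| c.id.clone()).collect()
    }
}

fn main() {
    let handle = DocHandle {
        snapshot: vec![
            CellSnapshot { id: "c1".into(), source: "x = 1".into(), cell_type: "code".into() },
            CellSnapshot { id: "c2".into(), source: "# notes".into(), cell_type: "markdown".into() },
        ],
    };
    assert_eq!(handle.get_cell_source("c1").as_deref(), Some("x = 1"));
    assert_eq!(handle.get_cell_type("c2").as_deref(), Some("markdown"));
    assert_eq!(handle.get_cell_source("missing"), None);
    assert_eq!(handle.get_cell_ids(), vec!["c1", "c2"]);
}
```

The win is per-field: copying one `String` instead of every cell's source, outputs, and metadata on each MCP read.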

* feat(nteract-mcp): use per-cell accessors to skip full materialization

set_cell and_run: get_cell_type() instead of get_cell() — only needs
to check if the cell is code before executing, no reason to materialize
all cells and resolve blobs.

resource_cell_by_index: get_cell_ids() to resolve index → ID, then
get_cell() for just the one cell. Previously materialized every cell.

set_cell unchanged path: remove unused get_cell() call that was
materializing all cells for no reason.

Also add .pyi stubs for all new per-cell accessors on both Session
and AsyncSession.

* perf(runtimed-py): get_cell uses single-cell lookup instead of full materialization

handle.get_cell(cell_id) clones just the one matching CellSnapshot
from the snapshot watch channel. The previous code called
handle.get_cells() which cloned ALL cells, then linear-scanned to
find the one it wanted.

* fix(runtimed-py): add missing get_cell_position, reduce lock contention

Add get_cell_position to session_core, Session, and AsyncSession —
the .pyi stub declared it but the implementation was missing.

All per-cell accessors now clone the DocHandle (cheap Arc clone)
and drop the SessionState mutex before calling the accessor method.
The accessors read from a watch::Receiver that doesn't need the
mutex held, so this reduces contention with concurrent operations.

* test(runtimed-py): integration tests for per-cell accessors

20 tests covering sync + async per-cell accessor methods:
- get_cell_ids ordering
- get_cell_source, get_cell_type, get_cell_outputs,
  get_cell_execution_count, get_cell_position for existing cells
- None returns for nonexistent cells
- Position ordering matches creation order
- Consistency with get_cell() full materialization
- Source updates reflected after sync

* docs(agents): add quick recipes for common dev tasks

Agents (and humans) waste time when they don't know the right
incantation for rebuilding Python bindings, running integration
tests, or interacting with the dev daemon. Add a copy-paste-ready
Quick Recipes section at the top of AGENTS.md covering:

- Dev daemon env vars (the #1 footgun)
- Rebuilding runtimed-py into the correct venv
- Running Python integration tests against the dev daemon
- Launching the notebook app in dev mode
- WASM rebuild
- Pointers to the right contributing docs per task
quillaid added a commit that referenced this pull request Apr 11, 2026
…tarts

**Problem:**
When the MCP child process (`runt-nightly mcp`) exits after idle timeout,
the proxy holds a stale handle and never triggers restart. Upstream clients
see "Transport closed" and stay dead until manual `/mcp` restart.

This is Bug #3 from ~/reports/2026-04-11-daemon-and-proxy-bugs.md and is
the root cause of daemon orphaning issues (cascades to Bugs #1 and #2).

**Solution:**
Add background monitoring task that polls child status every 500ms and
automatically restarts when the child exits:

- Spawn monitor task after init_child() and restart_child()
- Poll child_client.is_closed() to detect exit
- Trigger restart_child() transparently
- Prevent concurrent restarts with mutex
- Send tool_list_changed notification to keep upstream alive
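The poll-and-restart loop can be sketched synchronously like this (the real code spawns a tokio task and talks to rmcp's `RunningService`; `monitor_child` and its flag-based "restart" are illustrative stand-ins, bounded by `max_polls` so the sketch terminates):

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::time::Duration;

// Hypothetical monitor loop: poll an is-closed flag on an interval and
// trigger a restart when the child exits.
fn monitor_child(
    child_closed: Arc<AtomicBool>,
    restart_count: Arc<AtomicU32>,
    poll_interval: Duration,
    max_polls: u32,
) {
    for _ in 0..max_polls {
        if child_closed.load(Ordering::SeqCst) {
            restart_count.fetch_add(1, Ordering::SeqCst);
            child_closed.store(false, Ordering::SeqCst); // stand-in for restart_child()
            break; // the restarted child gets its own fresh monitor task
        }
        std::thread::sleep(poll_interval);
    }
}

fn main() {
    let closed = Arc::new(AtomicBool::new(true)); // child already exited
    let restarts = Arc::new(AtomicU32::new(0));
    monitor_child(closed.clone(), restarts.clone(), Duration::from_millis(1), 10);
    assert_eq!(restarts.load(Ordering::SeqCst), 1);
    assert!(!closed.load(Ordering::SeqCst)); // child is "running" again
}
```

The `break` after a successful restart is what keeps the task count flat: each child generation owns exactly one monitor.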

**Restart Metrics:**
Add tracking for supervisor_status:
- restart_count() - number of child restarts
- child_uptime_secs() - current child uptime
- last_restart_time() - when last restart occurred

**Structured Logging:**
Enhanced events for lifecycle tracking:
- child_spawned (with binary path, args, upstream client)
- child_restart_requested (with restart number, uptime)
- child_exited (with uptime before restart)
- child_restart_failed (with error message)

**Testing:**
- 3 new unit tests (restart metrics, concurrent restart prevention)
- All 97 tests pass
- Manual testing still needed (idle timeout scenario)

**Design Choice:**
Used polling (500ms) instead of waiting() because RunningService doesn't
implement Clone and waiting() consumes self. Polling is simpler and
low-overhead.

Resolves: Bug #3 (MCP Proxy Child Monitoring)
Research: ~/reports/bug3-proxy-lifecycle-research.md
Plan: ~/reports/consolidated-fix-plan.md
rgbkrk pushed a commit that referenced this pull request Apr 11, 2026
…tarts (#1701)

* fix(runt-mcp-proxy): add child process monitoring for transparent restarts

**Problem:**
When the MCP child process (`runt-nightly mcp`) exits after idle timeout,
the proxy holds a stale handle and never triggers restart. Upstream clients
see "Transport closed" and stay dead until manual `/mcp` restart.

This is Bug #3 from ~/reports/2026-04-11-daemon-and-proxy-bugs.md and is
the root cause of daemon orphaning issues (cascades to Bugs #1 and #2).

**Solution:**
Add background monitoring task that polls child status every 500ms and
automatically restarts when the child exits:

- Spawn monitor task after init_child() and restart_child()
- Poll child_client.is_closed() to detect exit
- Trigger restart_child() transparently
- Prevent concurrent restarts with mutex
- Send tool_list_changed notification to keep upstream alive

**Restart Metrics:**
Add tracking for supervisor_status:
- restart_count() - number of child restarts
- child_uptime_secs() - current child uptime
- last_restart_time() - when last restart occurred

**Structured Logging:**
Enhanced events for lifecycle tracking:
- child_spawned (with binary path, args, upstream client)
- child_restart_requested (with restart number, uptime)
- child_exited (with uptime before restart)
- child_restart_failed (with error message)

**Testing:**
- 3 new unit tests (restart metrics, concurrent restart prevention)
- All 97 tests pass
- Manual testing still needed (idle timeout scenario)

**Design Choice:**
Used polling (500ms) instead of waiting() because RunningService doesn't
implement Clone and waiting() consumes self. Polling is simpler and
low-overhead.

Resolves: Bug #3 (MCP Proxy Child Monitoring)
Research: ~/reports/bug3-proxy-lifecycle-research.md
Plan: ~/reports/consolidated-fix-plan.md

* fix(runt-mcp-proxy): address code review feedback

**Code Review Fixes:**
1. **Critical:** Fix task leak on restart - old monitor now exits after
   spawning new one (prevents N+1 tasks after N restarts)
2. **Important:** Improve docstring to document polling vs waiting()
   constraint (RunningService ownership issue)
3. **Important:** Make polling interval configurable via ProxyConfig
   (monitor_poll_interval_ms, default 500ms)

**Changes:**
- Monitor task now breaks after successful restart instead of continuing
- Added monitor_poll_interval_ms field to ProxyConfig (default: 500ms)
- Updated all ProxyConfig creation sites (supervisor, mcpb, tests)
- Enhanced docstring to explain design constraint

**Testing:**
- All 97 tests still pass
- cargo check passes for runt-mcp-proxy, mcp-supervisor, mcpb-runt
- cargo xtask lint --fix passes

**Review:** superpowers:code-reviewer identified issues, all addressed
quillaid added a commit that referenced this pull request Apr 11, 2026
**Problem:**
Pool size settings written by `runt config set` were silently ignored by
the daemon. Users could set `uv_pool_size: 15` in settings.json, but the
daemon always used compiled defaults (3/3/2) because:

1. CLI writes pool sizes to settings.json correctly
2. Daemon reads from Automerge SettingsDoc
3. SettingsDoc import functions never read pool_size keys from JSON

This is Bug #1 from ~/reports/2026-04-11-daemon-and-proxy-bugs.md.

**Solution:**

1. **Add pool size import in from_json()** (initial migration)
   - Read uv_pool_size, conda_pool_size, pixi_pool_size from JSON
   - Write to SettingsDoc during first-time migration
   - File: crates/runtimed-client/src/settings_doc.rs:409-419

2. **Add pool size import in apply_json_changes()** (live updates)
   - Detect pool_size changes in settings.json
   - Update SettingsDoc when values change
   - Log changes at info level
   - File: crates/runtimed-client/src/settings_doc.rs:886-918

3. **Remove CLI seed override** (daemon.rs:516-519)
   - Removed code that overwrote pool sizes from CLI args
   - Settings.json is now the single source of truth
   - CLI defaults only apply on first launch before settings.json exists

4. **Add Pixi pool size to startup logs** (main.rs:413)
   - Was missing from daemon config dump
   - Now matches UV and Conda logging

5. **Add --pixi-pool-size CLI flag** (main.rs:72-74)
   - Completes parity with UV and Conda flags
   - Default: 2 (matches existing behavior)
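The import step reduces to "read each key, fall back to the schema default." A minimal sketch, assuming the settings JSON has already been parsed into a key-to-integer map (the real code walks serde_json values and writes into the Automerge SettingsDoc, which is elided here); key names and defaults (3/3/2) match the commit:

```rust
use std::collections::HashMap;

// Sketch of the from_json() pool-size import: missing keys keep the
// compiled schema defaults.
fn import_pool_sizes(settings: &HashMap<String, u64>) -> (u64, u64, u64) {
    let read = |key: &str, default: u64| settings.get(key).copied().unwrap_or(default);
    (
        read("uv_pool_size", 3),
        read("conda_pool_size", 3),
        read("pixi_pool_size", 2),
    )
}

fn main() {
    let mut settings = HashMap::new();
    settings.insert("uv_pool_size".to_string(), 15u64);
    // Only uv is set in settings.json; conda and pixi keep their defaults.
    assert_eq!(import_pool_sizes(&settings), (15, 3, 2));
}
```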

**Testing:**

Added integration test `test_pool_size_config_honored` that verifies:
- from_json() correctly imports pool sizes
- apply_json_changes() correctly updates on JSON changes
- No-op optimization works (no change when values match)

All unit tests pass (125 tests in runtimed-client).
New integration test passes.

**Impact:**

Before: Pool sizes effectively read-only at compile time (3/3/2)
After: Pool sizes fully configurable via settings.json or CLI commands

**Verification:**
```bash
# Set pool sizes
runt-nightly config set uv_pool_size 15
runt-nightly config set conda_pool_size 15
runt-nightly config set pixi_pool_size 15

# Restart daemon
runt-nightly daemon restart

# Verify
runt-nightly daemon status
# Expected: UV: 15/15, Conda: 15/15, Pixi: 15/15
```

Resolves: Bug #1 (Config Pool Sizes)
Research: ~/reports/bug1-config-research.md
Plan: ~/reports/consolidated-fix-plan.md
quillaid added a commit that referenced this pull request Apr 11, 2026
Fixes test_daemon_take_empty_pool timeout failure in CI.

**Root cause:**
After the Bug #1 fix, daemon warming loops read pool sizes exclusively
from SettingsDoc (which imports from settings.json). When tests set
config.uv_pool_size = 0, the daemon ignored that and used SettingsDoc
defaults (3/3/2), causing it to try warming environments during tests.

This led to timeouts because:
1. Tests use config.{uv,conda,pixi}_pool_size = 0 to disable warming
2. Daemon's SettingsDoc has no settings.json in test runs
3. SettingsDoc defaults to (3, 3, 2) from schema
4. Daemon tries to warm 3 UV + 3 Conda + 2 Pixi = 8 environments
5. Environment creation hangs or times out in CI

**Solution:**
Check BOTH config and SettingsDoc for pool sizes:
- If config.{uv,conda,pixi}_pool_size == 0 → use 0 (test mode)
- Otherwise → use SettingsDoc value (production mode)

This preserves test isolation while honoring user-configured pool sizes.
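The precedence rule is one branch per pool; a sketch (function name is illustrative):

```rust
// Test mode wins: pool size 0 in the in-process config disables warming
// outright; otherwise the live SettingsDoc value is authoritative.
fn effective_pool_size(config_size: u64, settings_doc_size: u64) -> u64 {
    if config_size == 0 {
        0 // test isolation: never warm environments
    } else {
        settings_doc_size // production: settings.json is the source of truth
    }
}

fn main() {
    assert_eq!(effective_pool_size(0, 3), 0);   // test config disables warming
    assert_eq!(effective_pool_size(3, 15), 15); // user-configured size honored
}
```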

**Changes:**
- daemon.rs:2705-2720 (UV loop): Check config.uv_pool_size before SettingsDoc
- daemon.rs:2793-2808 (Conda loop): Check config.conda_pool_size before SettingsDoc
- daemon.rs:2891-2906 (Pixi loop): Check config.pixi_pool_size before SettingsDoc

**Testing:**
- test_daemon_take_empty_pool now passes (was timing out)
- All 22 integration tests pass
- test_multiple_notebooks_concurrent_isolation is flaky but unrelated

Fixes: test_daemon_take_empty_pool timeout in PR #1702
Related: Bug #1 (Config Pool Sizes)
rgbkrk pushed a commit that referenced this pull request Apr 11, 2026
* fix(runtimed): honor pool_size settings from settings.json

**Problem:**
Pool size settings written by `runt config set` were silently ignored by
the daemon. Users could set `uv_pool_size: 15` in settings.json, but the
daemon always used compiled defaults (3/3/2) because:

1. CLI writes pool sizes to settings.json correctly
2. Daemon reads from Automerge SettingsDoc
3. SettingsDoc import functions never read pool_size keys from JSON

This is Bug #1 from ~/reports/2026-04-11-daemon-and-proxy-bugs.md.

**Solution:**

1. **Add pool size import in from_json()** (initial migration)
   - Read uv_pool_size, conda_pool_size, pixi_pool_size from JSON
   - Write to SettingsDoc during first-time migration
   - File: crates/runtimed-client/src/settings_doc.rs:409-419

2. **Add pool size import in apply_json_changes()** (live updates)
   - Detect pool_size changes in settings.json
   - Update SettingsDoc when values change
   - Log changes at info level
   - File: crates/runtimed-client/src/settings_doc.rs:886-918

3. **Remove CLI seed override** (daemon.rs:516-519)
   - Removed code that overwrote pool sizes from CLI args
   - Settings.json is now the single source of truth
   - CLI defaults only apply on first launch before settings.json exists

4. **Add Pixi pool size to startup logs** (main.rs:413)
   - Was missing from daemon config dump
   - Now matches UV and Conda logging

5. **Add --pixi-pool-size CLI flag** (main.rs:72-74)
   - Completes parity with UV and Conda flags
   - Default: 2 (matches existing behavior)

**Testing:**

Added integration test `test_pool_size_config_honored` that verifies:
- from_json() correctly imports pool sizes
- apply_json_changes() correctly updates on JSON changes
- No-op optimization works (no change when values match)

All unit tests pass (125 tests in runtimed-client).
New integration test passes.

**Impact:**

Before: Pool sizes effectively read-only at compile time (3/3/2)
After: Pool sizes fully configurable via settings.json or CLI commands

**Verification:**
```bash
# Set pool sizes
runt-nightly config set uv_pool_size 15
runt-nightly config set conda_pool_size 15
runt-nightly config set pixi_pool_size 15

# Restart daemon
runt-nightly daemon restart

# Verify
runt-nightly daemon status
# Expected: UV: 15/15, Conda: 15/15, Pixi: 15/15
```

Resolves: Bug #1 (Config Pool Sizes)
Research: ~/reports/bug1-config-research.md
Plan: ~/reports/consolidated-fix-plan.md

* docs: address code review feedback with clarifying comments

Added comments per code review:
- Warming loops now document they read from SettingsDoc not config
- Links to integration test for verification
- CLI flag default linked to schema constant

No functional changes, only documentation improvements.

* fix(runtimed): honor config pool_size=0 for test isolation

Fixes test_daemon_take_empty_pool timeout failure in CI.

**Root cause:**
After the Bug #1 fix, daemon warming loops read pool sizes exclusively
from SettingsDoc (which imports from settings.json). When tests set
config.uv_pool_size = 0, the daemon ignored that and used SettingsDoc
defaults (3/3/2), causing it to try warming environments during tests.

This led to timeouts because:
1. Tests use config.{uv,conda,pixi}_pool_size = 0 to disable warming
2. Daemon's SettingsDoc has no settings.json in test runs
3. SettingsDoc defaults to (3, 3, 2) from schema
4. Daemon tries to warm 3 UV + 3 Conda + 2 Pixi = 8 environments
5. Environment creation hangs or times out in CI

**Solution:**
Check BOTH config and SettingsDoc for pool sizes:
- If config.{uv,conda,pixi}_pool_size == 0 → use 0 (test mode)
- Otherwise → use SettingsDoc value (production mode)

This preserves test isolation while honoring user-configured pool sizes.

**Changes:**
- daemon.rs:2705-2720 (UV loop): Check config.uv_pool_size before SettingsDoc
- daemon.rs:2793-2808 (Conda loop): Check config.conda_pool_size before SettingsDoc
- daemon.rs:2891-2906 (Pixi loop): Check config.pixi_pool_size before SettingsDoc

**Testing:**
- test_daemon_take_empty_pool now passes (was timing out)
- All 22 integration tests pass
- test_multiple_notebooks_concurrent_isolation is flaky but unrelated

Fixes: test_daemon_take_empty_pool timeout in PR #1702
Related: Bug #1 (Config Pool Sizes)
rgbkrk added a commit that referenced this pull request Apr 18, 2026
Two codex-review fixes on commit 9878b6c:

**P2 #1 — widget patches before CRDT writer**. The bootstrap fallback
only mirrored patches into the local store and returned; nothing
replayed them into the CRDT once the writer became available. Under
the pre-A2 manager, the debounced flush kept retrying until the
writer appeared. Restored equivalent behavior with a small pending
queue:

- On `updateAndPersist` without a writer: queue the patch (last-wins
  merge per comm) AND mirror to the local store so the UI still
  feels responsive.
- On the next `updateAndPersist` that sees a writer: drain the
  queue in insertion order before processing the new write.
- Fallback 50ms polling timer for the "writer just showed up but no
  new interaction happened" case. Cancels when the queue drains.
- `reset()` (called on kernel restart) clears the queue — stale
  pre-restart patches shouldn't reach the new kernel.
- `clearComm(commId)` (comm_close) drops the per-comm entry.
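The queue semantics (one slot per comm, last write wins, drain in first-insertion order) can be sketched as follows. The real implementation is in the TypeScript widget manager; this Rust sketch uses illustrative names and string payloads:

```rust
use std::collections::HashMap;

// Hypothetical pending-patch queue: coalesces per-comm writes while no
// CRDT writer exists, then drains in insertion order.
struct PendingPatches {
    order: Vec<String>,               // first-insertion order of comm ids
    patches: HashMap<String, String>, // comm id -> latest patch payload
}

impl PendingPatches {
    fn new() -> Self {
        Self { order: Vec::new(), patches: HashMap::new() }
    }

    fn queue(&mut self, comm_id: &str, patch: &str) {
        if !self.patches.contains_key(comm_id) {
            self.order.push(comm_id.to_string());
        }
        self.patches.insert(comm_id.to_string(), patch.to_string()); // last wins
    }

    fn drain(&mut self) -> Vec<(String, String)> {
        let order = std::mem::take(&mut self.order);
        order
            .into_iter()
            .filter_map(|id| self.patches.remove(&id).map(|p| (id, p)))
            .collect()
    }

    fn clear_comm(&mut self, comm_id: &str) {
        self.order.retain(|id| id != comm_id);
        self.patches.remove(comm_id);
    }
}

fn main() {
    let mut q = PendingPatches::new();
    q.queue("slider", "{\"value\": 1}");
    q.queue("text", "{\"value\": \"a\"}");
    q.queue("slider", "{\"value\": 2}"); // coalesced: last write wins
    let drained = q.drain();
    assert_eq!(drained.len(), 2);
    assert_eq!(drained[0].1, "{\"value\": 2}"); // slider kept only the latest
    assert!(q.drain().is_empty()); // queue is empty after draining
}
```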

**P2 #2 — stall watchdog false positives under sustained traffic**.
The watchdog was idempotent: once armed, new outbound flushes
didn't extend the deadline. A long slider drag or play-widget loop
produces 100+ outbound flushes/sec; the daemon never quite catches
up so `generate_runtime_state_sync_reply()` never returns null;
the watchdog fires after 3s mid-interaction. Switched the arm step to
reset-on-each-flush. Each new outbound write gets a fresh window,
so healthy-but-busy sessions don't trip the recovery. The stall we
do want to catch — pipe drops the frame, daemon panics — shows up
as no more outbound to reset the timer AND no inbound within 3s,
which this still detects.
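The arm-once vs reset-on-flush distinction is a one-line change to the deadline handling; a sketch with illustrative names:

```rust
use std::time::{Duration, Instant};

// Hypothetical watchdog: every outbound flush pushes the deadline forward,
// so a busy-but-healthy session never trips it. A real stall (no more
// flushes and no inbound reply) lets the deadline lapse.
struct StallWatchdog {
    deadline: Instant,
    window: Duration,
}

impl StallWatchdog {
    fn new(window: Duration) -> Self {
        Self { deadline: Instant::now() + window, window }
    }

    fn on_outbound_flush(&mut self) {
        // Reset-on-each-flush: the old code armed once and never extended.
        self.deadline = Instant::now() + self.window;
    }

    fn stalled(&self, now: Instant) -> bool {
        now > self.deadline
    }
}

fn main() {
    let mut w = StallWatchdog::new(Duration::from_secs(3));
    let start = Instant::now();
    w.on_outbound_flush(); // sustained traffic keeps extending the window
    assert!(!w.stalled(start + Duration::from_secs(1))); // healthy and busy
    assert!(w.stalled(start + Duration::from_secs(60))); // true stall detected
}
```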

Tests updated:
- 2 new bootstrap-queue tests: drain on next update, coalescing
  multiple pre-writer writes on the same comm.
- Existing watchdog tests still pass (reset-on-flush changes timing
  slightly but not test semantics).

765 tests pass (+2); lint clean.