Skip to content

feat(runtimed-py): per-cell accessors for O(1) MCP reads#924

Merged
rgbkrk merged 6 commits intomainfrom
feat/runtimed-py-cell-accessors
Mar 18, 2026
Merged

feat(runtimed-py): per-cell accessors for O(1) MCP reads#924
rgbkrk merged 6 commits intomainfrom
feat/runtimed-py-cell-accessors

Conversation

@rgbkrk
Copy link
Copy Markdown
Member

@rgbkrk rgbkrk commented Mar 18, 2026

Per-cell accessors on the Python bindings so MCP reads are O(1) per field instead of O(n_cells) full materialization.

The problem

session.get_cell(cell_id) materializes ALL cells via handle.get_cells(), then linear-scans for the matching ID. For a 100-cell notebook, reading one cell's source means cloning 100 CellSnapshots (each with source text, outputs, metadata) just to throw 99 away.

The fix

New methods on DocHandle, Session, and AsyncSession:

Method Returns What it clones
get_cell_ids() Vec<String> Only the ID strings
get_cell_source(cell_id) Option<String> One source string
get_cell_type(cell_id) Option<String> One cell_type string
get_cell_outputs(cell_id) Option<Vec<String>> One cell's output vec (raw, no blob resolution)
get_cell_execution_count(cell_id) Option<String> One execution_count string
get_cell_metadata(cell_id) Option<serde_json::Value> One cell's metadata
get_cell_position(cell_id) Option<String> One position string

All read from the snapshot watch channel (lock-free borrow, no mutex). Linear scan over the cell list but only clone the one field requested.

MCP server integration

Wired the new accessors into the nteract MCP server:

  • set_cell and_runget_cell_type() instead of get_cell(). Only needs to check if it's code before executing, no reason to materialize all cells and resolve blobs.
  • resource_cell_by_indexget_cell_ids() to resolve index → ID, then get_cell() for just the one cell. Previously materialized every cell.
  • set_cell unchanged path — removed unused get_cell() call that was materializing all cells for no reason.

Also added .pyi type stubs for all new accessors on both Session and AsyncSession.

PR submitted by @rgbkrk's agent Quill, via Zed

Add get_cell_source, get_cell_type, get_cell_outputs,
get_cell_execution_count, get_cell_ids, get_cell_metadata,
and get_cell_position to DocHandle, Session, and AsyncSession.

Each reads from the snapshot watch channel (no mutex) and only
clones the requested field — not the entire Vec<CellSnapshot>.
This makes MCP reads O(1) per field instead of O(n_cells).

Available on both Session (sync) and AsyncSession (async).
@github-actions github-actions bot added sync Automerge CRDT sync protocol daemon runtimed daemon, kernel management, sync server quill PR authored by Quill Agent 🦆 labels Mar 18, 2026
set_cell and_run: get_cell_type() instead of get_cell() — only needs
to check if the cell is code before executing, no reason to materialize
all cells and resolve blobs.

resource_cell_by_index: get_cell_ids() to resolve index → ID, then
get_cell() for just the one cell. Previously materialized every cell.

set_cell unchanged path: remove unused get_cell() call that was
materializing all cells for no reason.

Also add .pyi stubs for all new per-cell accessors on both Session
and AsyncSession.
@github-actions github-actions bot added the mcp MCP server and agent integrations label Mar 18, 2026
…aterialization

handle.get_cell(cell_id) clones just the one matching CellSnapshot
from the snapshot watch channel. The previous code called
handle.get_cells() which cloned ALL cells, then linear-scanned to
find the one it wanted.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds per-cell read accessors across notebook-sync and the runtimed-py bindings to avoid full-cell materialization for common MCP read paths, and wires the new accessors into the nteract MCP server for cheaper cell lookups.

Changes:

  • Introduces per-cell snapshot accessors on notebook-sync::DocHandle (IDs, source, type, outputs, execution_count, metadata, position).
  • Exposes a subset of those accessors through runtimed-py Session/AsyncSession, plus .pyi stub updates.
  • Updates nteract MCP server to use get_cell_type() / get_cell_ids() to reduce unnecessary full-cell materialization.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
python/runtimed/src/runtimed/runtimed.pyi Adds type stubs for new per-cell accessors on Session and AsyncSession.
python/nteract/src/nteract/_mcp_server.py Switches MCP server code paths to use new per-cell accessors (get_cell_type, get_cell_ids).
crates/runtimed-py/src/session_core.rs Implements new per-cell accessor functions in the shared Rust core for Python bindings.
crates/runtimed-py/src/session.rs Exposes new per-cell accessors on the sync Session Python class.
crates/runtimed-py/src/async_session.rs Exposes new per-cell accessors on the async AsyncSession Python class.
crates/notebook-sync/src/handle.rs Adds per-cell snapshot accessor methods to DocHandle to reduce cloning and improve read performance.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

def get_cell_outputs(self, cell_id: str) -> list[str] | None: ...
def get_cell_execution_count(self, cell_id: str) -> str | None: ...
def get_cell_ids(self) -> list[str]: ...
def get_cell_position(self, cell_id: str) -> str | None: ...
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d2dceecget_cell_position wired through session_core, Session, and AsyncSession.

def get_cell_outputs(self, cell_id: str) -> Coroutine[Any, Any, list[str] | None]: ...
def get_cell_execution_count(self, cell_id: str) -> Coroutine[Any, Any, str | None]: ...
def get_cell_ids(self) -> Coroutine[Any, Any, list[str]]: ...
def get_cell_position(self, cell_id: str) -> Coroutine[Any, Any, str | None]: ...
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d2dceec — same commit as above.

Comment thread crates/runtimed-py/src/session_core.rs Outdated
Comment on lines +647 to +652
let st = state.lock().await;
let handle = st
.handle
.as_ref()
.ok_or_else(|| to_py_err("Not connected"))?;
Ok(handle.get_cell_source(cell_id))
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d2dceec — all per-cell accessors now clone the DocHandle (cheap Arc clone) inside a scoped block and drop the mutex before calling the accessor. The accessors read from a watch::Receiver that doesn't need the mutex held.

Comment on lines +328 to +356
/// Get a cell's source without materializing all cells.
fn get_cell_source(&self, cell_id: &str) -> PyResult<Option<String>> {
self.runtime
.block_on(session_core::get_cell_source(&self.state, cell_id))
}

/// Get a cell's type without materializing all cells.
fn get_cell_type(&self, cell_id: &str) -> PyResult<Option<String>> {
self.runtime
.block_on(session_core::get_cell_type(&self.state, cell_id))
}

/// Get a cell's raw outputs without blob resolution.
fn get_cell_outputs(&self, cell_id: &str) -> PyResult<Option<Vec<String>>> {
self.runtime
.block_on(session_core::get_cell_outputs(&self.state, cell_id))
}

/// Get a cell's execution count.
fn get_cell_execution_count(&self, cell_id: &str) -> PyResult<Option<String>> {
self.runtime
.block_on(session_core::get_cell_execution_count(&self.state, cell_id))
}

/// Get all cell IDs in document order.
fn get_cell_ids(&self) -> PyResult<Vec<String>> {
self.runtime
.block_on(session_core::get_cell_ids(&self.state))
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree these need tests. Deferring to a follow-up — integration tests require a running daemon and the test harness in python/runtimed/tests/ connects to a live socket. Will add coverage for existing/nonexistent cell cases in both sync and async flavors.

rgbkrk added 2 commits March 17, 2026 19:00
Add get_cell_position to session_core, Session, and AsyncSession —
the .pyi stub declared it but the implementation was missing.

All per-cell accessors now clone the DocHandle (cheap Arc clone)
and drop the SessionState mutex before calling the accessor method.
The accessors read from a watch::Receiver that doesn't need the
mutex held, so this reduces contention with concurrent operations.
20 tests covering sync + async per-cell accessor methods:
- get_cell_ids ordering
- get_cell_source, get_cell_type, get_cell_outputs,
  get_cell_execution_count, get_cell_position for existing cells
- None returns for nonexistent cells
- Position ordering matches creation order
- Consistency with get_cell() full materialization
- Source updates reflected after sync
@github-actions github-actions bot added test Test infrastructure and coverage documentation Improvements or additions to documentation labels Mar 18, 2026
Agents (and humans) waste time when they don't know the right
incantation for rebuilding Python bindings, running integration
tests, or interacting with the dev daemon. Add a copy-paste-ready
Quick Recipes section at the top of AGENTS.md covering:

- Dev daemon env vars (the #1 footgun)
- Rebuilding runtimed-py into the correct venv
- Running Python integration tests against the dev daemon
- Launching the notebook app in dev mode
- WASM rebuild
- Pointers to the right contributing docs per task
@rgbkrk rgbkrk force-pushed the feat/runtimed-py-cell-accessors branch from 0c59c6b to 7c9ed42 Compare March 18, 2026 02:18
@rgbkrk rgbkrk merged commit a70324e into main Mar 18, 2026
19 of 20 checks passed
@rgbkrk rgbkrk deleted the feat/runtimed-py-cell-accessors branch March 18, 2026 02:49
rgbkrk added a commit that referenced this pull request Mar 18, 2026
…tub types

Audit of merged #923 + #924 found several issues:

Rust (session_core.rs):
- collect_outputs: get_cells() + find → get_cell() (single-cell lookup)
- get_cell_metadata: get_cells() + find → handle.get_cell_metadata()
  with clone-handle-drop-lock pattern

MCP server:
- replace_match: get_cell() → get_cell_source() (only needs source)
- replace_regex: same fix

Type stubs (.pyi):
- move_cell: None → str (returns new position)
- get_cell: Cell | None → Cell (raises on missing, never None)
- set_cell_source_hidden: None → bool
- set_cell_outputs_hidden: None → bool
- set_cell_tags: None → bool
All fixed for both Session and AsyncSession.
rgbkrk added a commit that referenced this pull request Mar 18, 2026
…tub types (#927)

Audit of merged #923 + #924 found several issues:

Rust (session_core.rs):
- collect_outputs: get_cells() + find → get_cell() (single-cell lookup)
- get_cell_metadata: get_cells() + find → handle.get_cell_metadata()
  with clone-handle-drop-lock pattern

MCP server:
- replace_match: get_cell() → get_cell_source() (only needs source)
- replace_regex: same fix

Type stubs (.pyi):
- move_cell: None → str (returns new position)
- get_cell: Cell | None → Cell (raises on missing, never None)
- set_cell_source_hidden: None → bool
- set_cell_outputs_hidden: None → bool
- set_cell_tags: None → bool
All fixed for both Session and AsyncSession.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

daemon runtimed daemon, kernel management, sync server documentation Improvements or additions to documentation mcp MCP server and agent integrations quill PR authored by Quill Agent 🦆 sync Automerge CRDT sync protocol test Test infrastructure and coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants