[nd 6/5] Back NotebookDocument cell text with Loro CRDT#8849

Draft
manzt wants to merge 6 commits into main from manzt/nd-loro

@manzt manzt commented Mar 24, 2026

Prev #8842, #8843, #8844, #8845, #8846. Experiment with composing the document model with a CRDT for cell text.

The document event model from the previous PRs sequences structural changes via transactions, but cell text was still plain strings with Loro running as a parallel system. This explores a split-ownership model: NotebookDocument keeps structural metadata (ordering, names, configs), Loro owns cell text. One source of truth for each concern, no reconciliation.

```py
doc = NotebookDocument(create_doc())
doc.add_cell(cell_id, code="x = 1", name="__", config=CellConfig())
doc.loro_doc.commit()

doc.get_cell(cell_id).code  # materializes from LoroText.to_string()
```

Non-interactive writes (kernel, file-watch, code_mode) go through SetCode ops which write to Loro. Character-level edits flow through the Loro WebSocket directly. The session's LoroDoc is shared with the RTC doc manager — server-originated mutations auto-broadcast to connected clients. On the frontend, code changes flow exclusively through Loro; the transaction middleware no longer handles set-code.

@manzt manzt added the enhancement label (New feature or request) Mar 24, 2026

vercel bot commented Mar 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| marimo-docs | Ready | Preview, Comment | Mar 27, 2026 5:04pm |

@manzt manzt changed the title Back NotebookDocument cell text with Loro CRDT [nd 6/5] Back NotebookDocument cell text with Loro CRDT Mar 24, 2026
# Copyright 2026 Marimo. All rights reserved.
"""Typed wrappers for ``loro`` APIs with incomplete stubs.

The ``loro`` stubs omit return types on ``__new__`` and the

We should raise upstream tbh

codecov bot commented Mar 25, 2026

Bundle Report

Changes will increase total bundle size by 57 bytes (0.0%) ⬆️. This is within the configured threshold ✅

Detailed changes
| Bundle name | Size | Change |
| --- | --- | --- |
| marimo-esm | 25.59MB | 57 bytes (0.0%) ⬆️ |

Affected Assets, Files, and Routes:

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/cells-*.js | 57 bytes | 687.89kB | 0.01% |

Files in assets/cells-*.js:

  • ./src/core/codemirror/cells/extensions.ts → Total Size: 7.59kB

@manzt manzt force-pushed the manzt/nd-watch branch 2 times, most recently from dc4ed22 to 94074d2 Compare March 26, 2026 16:18
@github-actions github-actions bot added the bash-focus label (Area to focus on during release bug bash) Mar 26, 2026
manzt and others added 6 commits March 27, 2026 12:59
The file watcher previously sent separate UpdateCellIdsNotification and
UpdateCellCodesNotification when the notebook file changed on disk. Now
it diffs `session.document` against the reloaded cell_manager and emits
a single `NotebookDocumentTransactionNotification` with typed ops. The
session intercepts it, applies to the canonical document, stamps the
version, and forwards to the frontend. This is the same path used by
code_mode and the frontend transaction endpoint.

SyncGraphCommand for autorun stays separate since execution is not a
structural concern. The non-autorun path no longer needs to send
DeleteCellCommand individually — deletes are part of the transaction.
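
The diff step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SetCode` is the only op name taken from the description, while `CreateCell`, `DeleteCell`, `ReorderCells`, and `diff_cells` are hypothetical names, and cells are reduced to an ordered `{cell_id: code}` mapping.

```py
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class CreateCell:  # hypothetical op name
    cell_id: str
    code: str


@dataclass(frozen=True)
class DeleteCell:  # hypothetical op name
    cell_id: str


@dataclass(frozen=True)
class SetCode:  # op name from the PR; fields are assumed
    cell_id: str
    code: str


@dataclass(frozen=True)
class ReorderCells:  # hypothetical op name
    cell_ids: Tuple[str, ...]


Op = Union[CreateCell, DeleteCell, SetCode, ReorderCells]


def diff_cells(current: Dict[str, str], reloaded: Dict[str, str]) -> List[Op]:
    """Diff two ordered {cell_id: code} mappings into one list of typed ops."""
    ops: List[Op] = []
    # Cells that disappeared from disk become deletes.
    for cell_id in current:
        if cell_id not in reloaded:
            ops.append(DeleteCell(cell_id))
    # New cells become creates; changed cells become code writes.
    for cell_id, code in reloaded.items():
        if cell_id not in current:
            ops.append(CreateCell(cell_id, code))
        elif current[cell_id] != code:
            ops.append(SetCode(cell_id, code))
    # Order after deletes and appended creates; reorder only if it differs.
    surviving = [c for c in current if c in reloaded]
    created = [c for c in reloaded if c not in current]
    if surviving + created != list(reloaded):
        ops.append(ReorderCells(tuple(reloaded)))
    return ops
```

The resulting list would be wrapped in one transaction notification, so deletes, creates, and code changes reach the frontend together instead of as separate messages.
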

NotebookDocument previously stored cell code as plain strings on mutable
NotebookCell structs. This moves cell text ownership to a `LoroDoc`,
making the CRDT the single source of truth for code content while the
document continues to own structural metadata (cell ordering, names,
configs) in a new lightweight CellMeta class.

`NotebookCell` becomes a frozen, read-only snapshot materialized on
access from CellMeta + LoroText.to_string(). It is never stored
internally by the document. `SetCode` ops now perform a full
delete-then-insert on the `LoroText` container, which is the correct
semantic for non-interactive writes (kernel, file-watch, code_mode).
Character-level edits from the frontend continue to flow through the
Loro RTC WebSocket unchanged.

The new internal layout:

```py
doc = NotebookDocument(create_doc())
doc.add_cell(cell_id, code="x = 1", name="__", config=CellConfig())
doc.loro_doc.commit()

doc.get_cell(cell_id).code   # materializes snapshot from LoroText
```
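
To make the ownership split concrete, here is a self-contained toy version. It is only a sketch under stated assumptions: `ToyText` is a plain-Python stand-in for a CRDT text container (it does not use the `loro` API), `ToyDocument` and `CellSnapshot` are hypothetical names, and `CellMeta` is reduced to a single `name` field.

```py
from dataclasses import dataclass
from typing import Dict, List


class ToyText:
    """Stand-in for a CRDT text container (not the loro API)."""

    def __init__(self) -> None:
        self._chars: List[str] = []

    def insert(self, pos: int, value: str) -> None:
        self._chars[pos:pos] = list(value)

    def delete(self, pos: int, length: int) -> None:
        del self._chars[pos:pos + length]

    def __len__(self) -> int:
        return len(self._chars)

    def to_string(self) -> str:
        return "".join(self._chars)


@dataclass
class CellMeta:
    """Structural metadata owned by the document (reduced to a name here)."""
    name: str


@dataclass(frozen=True)
class CellSnapshot:
    """Read-only view built on access; never stored by the document."""
    cell_id: str
    name: str
    code: str


class ToyDocument:
    def __init__(self) -> None:
        self._meta: Dict[str, CellMeta] = {}
        self._text: Dict[str, ToyText] = {}

    def add_cell(self, cell_id: str, code: str, name: str) -> None:
        self._meta[cell_id] = CellMeta(name=name)
        text = ToyText()
        text.insert(0, code)
        self._text[cell_id] = text

    def set_code(self, cell_id: str, code: str) -> None:
        # Non-interactive write: clear the container, then insert the new text.
        text = self._text[cell_id]
        text.delete(0, len(text))
        text.insert(0, code)

    def get_cell(self, cell_id: str) -> CellSnapshot:
        # Materialize a fresh snapshot from metadata plus current text.
        return CellSnapshot(
            cell_id=cell_id,
            name=self._meta[cell_id].name,
            code=self._text[cell_id].to_string(),
        )
```

A snapshot taken before `set_code` keeps its old `code` value, which is the point of making it frozen: callers get a consistent view and must call `get_cell` again to see later edits.
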

`LoroDocManager` previously created its own `LoroDoc` with duplicate
cell data at RTC init time. Now that `NotebookDocument` owns the
`LoroDoc`, the manager just registers the session's existing doc via
`register_doc`, giving RTC clients and the document model a single
shared instance.

The cleanup timer is removed since the doc's lifetime is tied to the
session. A `subscribe_local_update` hook broadcasts server-originated
Loro mutations (from SetCode, file-watch) to connected RTC clients, and
`apply()` batches all ops into one `doc.commit()` so clients receive one
update per transaction.
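
The batching behaviour can be illustrated with a small stand-in. This is a sketch only: `ToyDoc` is not `LoroDoc`, its `subscribe_local_update` borrows the hook name from the description but its callback signature is an assumption, and ops are plain strings here.

```py
from typing import Callable, List


class ToyDoc:
    """Stand-in for a doc that notifies subscribers once per commit."""

    def __init__(self) -> None:
        self._pending: List[str] = []
        self._subscribers: List[Callable[[List[str]], None]] = []

    def subscribe_local_update(self, callback: Callable[[List[str]], None]) -> None:
        self._subscribers.append(callback)

    def mutate(self, change: str) -> None:
        # Changes accumulate until commit; nothing is broadcast yet.
        self._pending.append(change)

    def commit(self) -> None:
        if not self._pending:
            return  # empty commits produce no update
        update, self._pending = self._pending, []
        for callback in self._subscribers:
            callback(update)


def apply(doc: ToyDoc, ops: List[str]) -> None:
    """Apply every op, then commit once so subscribers see a single update."""
    for op in ops:
        doc.mutate(op)
    doc.commit()
```

With a subscriber attached, applying two ops in one call delivers one update containing both changes, rather than two separate updates.
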

Now that the backend LoroDoc owns cell text, the frontend no longer
needs to send `set-code` ops through the document transaction API or
apply them from server notifications. Code changes flow exclusively
through the Loro WebSocket sync. `cellCodeEditing` skips Loro-originated
CodeMirror changes to prevent them from round-tripping back through the
reducer and middleware.

Centralizes loro type-stub workarounds in `_loro.py` by adding
`unwrap_text` alongside the existing constructor wrappers, removing the
inline `type: ignore` from `document.py`. Guards the RTC broadcast
callback against `asyncio.QueueFull` for slow consumers. Updates the RTC
test suite to match the new `register_doc`/`get_doc` API and removes
tests for the deleted cleanup timer (the deadlock scenario they guarded
against no longer exists since the LoroDoc now lives for the session's
lifetime with no background cleanup task).
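
For the `asyncio.QueueFull` guard, a minimal sketch of the pattern looks like this. The function name `broadcast` and the per-client queue list are assumptions for illustration; the actual callback in the PR may be shaped differently.

```py
import asyncio
from typing import List


def broadcast(update: bytes, client_queues: List[asyncio.Queue]) -> int:
    """Fan an update out to per-client queues without blocking.

    Returns the number of clients whose queue was full and therefore
    skipped, so one slow consumer cannot stall or crash the broadcast.
    """
    dropped = 0
    for queue in client_queues:
        try:
            queue.put_nowait(update)
        except asyncio.QueueFull:
            dropped += 1
    return dropped
```

Dropping is only one possible policy for a full queue; disconnecting the slow client or forcing a resync are alternatives, and the sketch does not claim which one the PR uses.
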
Base automatically changed from manzt/nd-watch to main March 27, 2026 19:21