Skip to content

feat(container-loader): add readOnly option to loadFrozenContainerFromPendingState#27211

Open
anthony-murphy wants to merge 25 commits intomicrosoft:mainfrom
anthony-murphy:users/anthonm/frozen-container-readonly-option
Open

feat(container-loader): add readOnly option to loadFrozenContainerFromPendingState#27211
anthony-murphy wants to merge 25 commits intomicrosoft:mainfrom
anthony-murphy:users/anthonm/frozen-container-readonly-option

Conversation

@anthony-murphy
Copy link
Copy Markdown
Contributor

@anthony-murphy anthony-murphy commented Apr 30, 2026

Description

Adds an optional readOnly parameter (default true, preserving existing behavior) to loadFrozenContainerFromPendingState and createFrozenDocumentServiceFactory. When readOnly: false, the frozen container loads as writable so the runtime accepts DDS submissions and accumulates them in pendingStateManager for getPendingLocalState() to capture — without publishing them.

The mechanism: a sibling WritableFrozenDeltaStream is the live connection (claims include DocWrite, mode stays "read"). ConnectionManager.sendMessages matches it via isWritableFrozenDeltaStreamConnection and short-circuits before the read-mode reconnect branch — outbound writes are dropped at the network layer instead of triggering a reconnect (which under ReconnectMode.Never would close the container).

Other notable changes:

  • FrozenDeltaStream is split into two sibling classes (FrozenDeltaStream read-only, WritableFrozenDeltaStream writable) over a shared abstract base. Both internal-only. Type discrimination at consumer call sites uses predicate functions (isFrozenDeltaStreamConnection, isWritableFrozenDeltaStreamConnection) — no raw instanceof.
  • WritableFrozenDeltaStream mints a per-instance frozen-delta-stream/<uuid> clientId to avoid pendingStateManager 0x173 (replayPendingStates called twice for same clientId!) on reconnect with dirty pending ops. The read-only variant keeps the historical constant "storage-only client".
  • submitSignal is now a silent no-op on both variants (was a 403 nack on the read-only variant). Signals are ephemeral and dropping is the right behavior with no upstream; the read-only variant's submit stays defensive (403 nack) since ops are durable state.
  • createFrozenDocumentServiceFactory rewraps an already-wrapped factory if its readOnly differs from the caller — most recent intent wins.

API report and a changeset are included.

Reviewer Guidance

  • vladsud sign-off welcome on the ConnectionManager.sendMessages WritableFrozenDeltaStream short-circuit — sits in front of Remove two types of nacks - "Nonexistent client" & "Readonly client" #7753's read-mode invariant.
  • ChumpChief sign-off welcome on the @legacy @alpha shape (readOnly?: boolean polarity / positional arg on createFrozenDocumentServiceFactory).
  • jatgarg (optional) review of the FrozenDeltaStream constructor reshape at her two historical call sites.

The submit / submitSignal asymmetry on FrozenDeltaStream is intentional: ops are durable (nack to surface invariant breaks); signals are ephemeral (drop).

fixes AB#72108

…mPendingState

Currently a frozen container is always read-only via the storageOnly policy.
This adds a `readOnly` option (default `true`) so callers can load a frozen
container that surfaces as writable in order to accrue and capture additional
pending state without publishing it.

When `readOnly: false`:
- The initial connect returns a `FrozenDeltaStream` configured with `DocWrite`
  scope so `readOnlyInfo.readonly === false`.
- The runtime accepts DDS submissions; the first submit triggers a read→write
  upgrade attempt that the document service intercepts and hangs (we have no
  upstream and won't fabricate a quorum join op).
- The container settles into Disconnected. DDS local apply continues, and ops
  accumulate in the runtime's pendingStateManager so getPendingLocalState()
  can capture them.

Other changes:
- `FrozenDeltaStream` constructor switched from positional args to an options
  bag with `readOnly`, `storageOnlyReason`, and `readonlyConnectionReason`.
- Both variants nack on submit/submitSignal — under normal flow neither is
  ever called (sendMessages short-circuits read-mode ops to reconnect), so a
  nack reaching the connectionManager is the right defensive signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 30, 2026

Hi! Thank you for opening this PR. Want me to review it?

Based on the diff (334 lines, 4 files), I've queued these reviewers:

  • Correctness — logic errors, race conditions, lifecycle issues
  • Security — vulnerabilities, secret exposure, injection
  • API Compatibility — breaking changes, release tags, type design
  • Performance — algorithmic regressions, memory leaks
  • Testing — coverage gaps, hollow tests

Toggle the reviewer checkboxes above to adjust, then tick the box below to start:

  • Start review

anthony-murphy and others added 2 commits April 30, 2026 13:11
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
- submitSignal becomes a silent no-op for both FrozenDeltaStream variants.
  The old read-only behavior emitted a non-array nack payload (would crash
  nackHandler's `messages[0]` access) and, on the writable variant, would
  trigger reconnect/close on stray runtime/presence signals. Dropping is
  the correct behavior since signals are ephemeral and we have no upstream.
- createFrozenDocumentServiceFactory now re-wraps an already-wrapped
  factory when its readOnly differs from the caller's argument, so the
  caller's most recent intent wins instead of being silently dropped.
- Surface FrozenDocumentServiceFactory.readOnly/inner so the rewrap path
  can unwrap; rename the inner field for clarity.
- Tests:
  - signal submission on the writable frozen container does not close it.
  - getPendingLocalState() round-trips writable-frozen edits through a
    second writable-frozen load (the load-bearing invariant for this PR).
  - readOnly: false wins over an already-wrapped readOnly: true factory.
  - Fix a misleading comment that claimed writable variant "silently
    drops" submissions; ops actually accumulate in pendingStateManager.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anthony-murphy
Copy link
Copy Markdown
Contributor Author

Review fixes — addressed in e771980

Picking up the four findings from the Deep Review (only one was an inline thread; the other three were summary-only):

1 + 3: submitSignal payload shape and writable failure mode

Resolved together by changing submitSignal to a silent no-op for both variants. The old behavior was wrong on two axes:

  • The read-only variant emitted a non-array nack payload ({ operation, content }), and nackHandler does messages[0] — that would crash before the diagnostic could surface, so the "loud failure" intent wasn't actually achievable.
  • The writable variant nacking on a stray runtime/presence signal would close the container (when _readonlyPermissions is true) or trigger reconnect spam (when false). Either is strictly worse than dropping.

Signals are ephemeral by nature; a frozen container has no upstream; dropping is the correct behavior. submit stays defensive because ops are durable state and we want to surface invariant breaks for those. JSDoc updated to spell out the asymmetry.

Regression test added: submitting a signal does not close the container — calls runtime.submitSignal post-load and asserts the container stays open and undisposed.

2: Wrapped-factory readOnly mismatch

See inline reply on the existing thread. Short version: rewrap on mismatch, with a regression test.

4: getPendingLocalState() round-trip test

Added: captures local writes in getPendingLocalState() and round-trips through a second frozen load — N writable DDS sets → capture pending state → load a second writable-frozen container from that blob → assert the layered edits are visible alongside the original snapshot's data. This is the load-bearing invariant for the whole feature, and the test was a real gap.

Other

Stale "drops them silently" comment in the existing accepts local writes test is now corrected to describe the actual mechanism (ops accumulate in pendingStateManager; the upgrade-hang in connectToDeltaStream is what blocks the wire).

Holding the polish items (close-vs-dispose hung-promise leak, dispose-test gap on the hung promise, readonlyConnectionReason dropped on writable path) for follow-up — none of them affect correctness in current usage. PR description update to follow.

@anthony-murphy
Copy link
Copy Markdown
Contributor Author

Deep Review: Thanks for the rapid turnaround on e771980 — the submitSignal no-op (with submit staying defensive), the wrapped-factory rewrap-on-mismatch, and the getPendingLocalState() round-trip test all land cleanly and address the prior findings. The asymmetry rationale you wrote up (signals ephemeral → drop; ops durable → nack to surface invariant breaks) is the right call.

One new concern surfaced this round, posted as an inline thread on frozenServices.ts: the writable initial connect hangs for allowReconnect: false and non-interactive callers, because Container.connectToDeltaStream forces args.mode = "write" on the very first connect — bypassing the client.mode !== "write" early return that hands out the writable FrozenDeltaStream. The new tests don't cover this combination. Details in the thread; the updated review summary above has the path-to-ready.

Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
…ial connect

Container.connectToDeltaStream forces args.mode = "write" on the very first
connect when allowReconnect is false or the client is non-interactive
(container.ts:1517-1523). The previous FrozenDocumentService logic gated
on client.mode alone, so that initial connect would be intercepted by the
upgrade-hang path and the load would never complete.

Track first-vs-subsequent connect on FrozenDocumentService and always
return the writable FrozenDeltaStream on the first call, regardless of
client.mode. Mode-based hang now applies only to subsequent write-mode
connects, which is the runtime-driven read→write upgrade we actually
want to intercept. Subsequent read-mode connects still return a fresh
writable stream.

Tests:
- loadFrozenContainerFromPendingState({readOnly: false, allowReconnect: false})
  asserts readonly === false, container stays open, and pending edits
  round-trip through getPendingLocalState().
- Same assertions for the non-interactive-client variant
  (clientDetailsOverride.capabilities.interactive === false).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anthony-murphy and others added 4 commits May 1, 2026 09:55
…ect is hung

The dispose ordering on the hung write-upgrade promise is the most invariant-relevant
scenario in writable-frozen — FrozenDocumentService.dispose() and pendingConnectRejecters
exist specifically to make container.dispose() complete in this state — and it had no
test coverage.

Two regression tests:
- dispose() completes while the read→write upgrade connect is hung — triggers the
  upgrade via a DDS write, awaits the disconnect that precedes the hang, then
  asserts dispose() returns and disposed === true.
- close() does not hang while the read→write upgrade connect is hung — same flow
  via close(); pins the documented benign-leak tradeoff (close() does not propagate
  to service.dispose(), so the hung promise stays pending until GC).

Also document the close-vs-dispose tradeoff in connectToDeltaStream's hung-promise
block so future readers know which path drains pendingConnectRejecters and why.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Blocked CI lint with @typescript-eslint/no-unnecessary-type-arguments —
void is the default for timeoutPromise's T parameter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ose/close tests

The new dispose-during-hung-upgrade and close-during-hung-upgrade tests
were timing out at mocha's ~2s default while waiting 5s for a
"disconnected" event. The writable-frozen container does not transition
to a "saved" state and does not necessarily emit "disconnected" before
the upgrade-connect call lands — connection state may transition
through EstablishingConnection instead, or the runtime's outbox flush
is async enough that nothing visible happens in 2s.

Switch to a bounded 500ms sleep that yields long enough for the upgrade
attempt to register the hung promise on pendingConnectRejecters. Test
still validates the dispose drain path; it just doesn't require an
explicit disconnected signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… deep review

Six independent items, each previously held as Tier 3 polish:

- connectionManager.sendMessages: tripwire comment near the read-mode
  short-circuit so a future refactor surfaces writable-frozen's
  dependency on this exact path (the deferred reconnect-to-write is
  what FrozenDocumentService.connectToDeltaStream intercepts to hang).

- FrozenDeltaStream constructor: runtime asserts that storageOnlyReason
  and readonlyConnectionReason are not passed when readOnly: false —
  the writable variant has no readOnlyInfo to surface them on, so silent
  acceptance was misleading. Per repo convention, asserts use string
  literal messages, not hex codes.

- FrozenDeltaStream constructor: JSDoc the `as ITokenClaims` cast — only
  scopes drives observable behavior (DocRead vs DocWrite gates
  readOnlyInfo); inventing sentinels for tenant/document/user/iat/exp/ver
  would imply quorum membership we cannot honor.

- ILoadFrozenContainerFromPendingStateProps.readOnly: JSDoc rationale
  for the negative polarity (alignment with IContainer.readOnlyInfo),
  preempting the alpha-rename-window question.

- New writes-after-disconnect integration test: writes performed AFTER
  the upgrade-connect has hung still apply locally and round-trip
  through getPendingLocalState() into a second writable-frozen load.

- New unit-test file for FrozenDeltaStream covering submit's nack
  payload shape (array, code 403, one entry per op), submitSignal as
  a silent no-op for both variants, and the constructor guards above.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
anthony-murphy and others added 2 commits May 1, 2026 15:15
…am connections

Without this, loadFrozenContainerFromPendingState({ readOnly: false,
allowReconnect: false }) closes asynchronously after the first DDS write:
the read-mode reconnect scheduled by sendMessages → reconnect("write") →
ReconnectMode.Never path → closeHandler() fires on a microtask. The
existing allowReconnect:false test masked this because it captured pending
state before the close microtask could be observed.

ConnectionManager.sendMessages now recognizes a FrozenDeltaStream as the
live connection and drops outbound messages at the network layer — the
runtime's outbox keeps them in pendingStateManager for getPendingLocalState
to capture. This is the design invariant the writable-frozen flow already
depended on, made explicit and enforced rather than load-bearing on a
side-effect of the read-mode short-circuit.

Other consequences:
- Container stays Connected for writable-frozen (previously settled into
  Disconnected after the upgrade-connect hung). JSDocs in frozenServices.ts
  and createAndLoadContainerUtils.ts updated to reflect the new mechanism.
- The upgrade-hang at FrozenDocumentService.connectToDeltaStream({ mode:
  "write" }) is now defense-in-depth — unreachable in normal flow but
  retained for invariant breaks. JSDoc updated accordingly.

Tests:
- Extended `loads with allowReconnect: false` to flush microtasks after
  the initial writes, assert closed === false, perform another batch of
  writes, and assert all writes round-trip through pending state into a
  second writable-frozen load. Without the sendMessages fix this would
  fail on the closed assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…trings + changeset

After the sendMessages FrozenDeltaStream short-circuit landed, the
integration "dispose() / close() while upgrade-connect is hung" tests
no longer exercise the path they were written for: the short-circuit
returns before any read→write reconnect fires, so pendingConnectRejecters
is empty when those tests call dispose() / close(), and disposed === true
/ closed === true assertions pass on the synchronous setter alone.

Restoration: add focused unit tests in frozenServices.spec.ts that drive
FrozenDocumentService.connectToDeltaStream({mode: "write"}) directly to
exercise the rejecter:
- rejects the hung upgrade-connect when service.dispose() is called
- rejects synchronously when called after service.dispose()
- drains multiple pending rejecters on a single dispose() call

Also align the now-stale narrative across artifacts:
- Changeset: replace the "first runtime submit triggers an internal
  read→write upgrade attempt that cannot succeed, so the container settles
  into Disconnected" narrative with the actual flow (sendMessages drops
  outbound at the FrozenDeltaStream short-circuit; container stays
  Connected). Call out the submitSignal read-only-variant behavior
  change (was 403 nack, now silent no-op) explicitly.
- Integration tests: rename "dispose() while the read→write upgrade
  connect is hung" → "dispose() runs cleanly on a writable-frozen
  container after a local write" (and same for close); rename
  "captures writes performed after the upgrade-connect has hung" →
  "captures writes batched across timer/microtask boundaries". Update
  inline comments to describe the actual short-circuit-first flow and
  cross-link the focused unit test for the dispose-rejecter coverage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anthony-murphy
Copy link
Copy Markdown
Contributor Author

Tier 1 finding addressed in 411863f

The dispose/close-during-hung-upgrade integration tests were vacuous after the sendMessages short-circuit landed — confirmed. Restored real coverage via focused unit tests in `packages/loader/container-loader/src/test/frozenServices.spec.ts` that drive `FrozenDocumentService.connectToDeltaStream({mode: "write"})` directly, then assert dispose() rejects the hung promise. Three cases:

  • `rejects the hung upgrade-connect when service.dispose() is called` — flips `handedOutInitialConnection` via an initial read connect, then captures the second (write-mode) connect's promise, asserts it stays pending across microtask flushes, calls `service.dispose()`, asserts it rejects with the expected message.
  • `rejects synchronously when called after service.dispose()` — pins the disposed-first branch (`if (this.disposed) reject(...)`) of the upgrade-hang.
  • `drains multiple pending rejecters on a single dispose() call` — three concurrent hung upgrade promises, single dispose, asserts all three reject.

Picked option (b) — focused unit test against a directly-constructed `FrozenDocumentService` — over option (a) — driving connectToDeltaStream from the integration test — because (b) is lighter weight and doesn't require accessing the underlying `IDocumentService` from inside the integration container.

Also addressed Tier-3 polish in the same commit:

  • Changeset reconciled. Replaced the "first runtime submit triggers an internal read→write upgrade attempt that cannot succeed" / "settles into Disconnected" narrative with the actual flow (sendMessages drops outbound at the FrozenDeltaStream short-circuit; container stays Connected). Added an explicit call-out for the submitSignal read-only-variant behavior change (was 403 nack, now silent no-op for both variants — same rationale: signals are ephemeral and dropping is the right behavior with no upstream).
  • Test docstrings reconciled. Renamed `dispose()/close() while the read→write upgrade connect is hung` → `dispose()/close() runs cleanly on a writable-frozen container after a local write`, and `captures writes performed after the upgrade-connect has hung` → `captures writes batched across timer/microtask boundaries`. Inline comments updated to describe the actual short-circuit-first flow and cross-link the new unit test for the rejecter coverage.
  • PR description was already updated in ff37f19 to match the short-circuit-first design.

Held for the next iteration / human reviewer:

  • vladsud signoff on the connectionManager hunk (would prefer their judgment on `instanceof` vs a `policies`-style flag for refactor resilience)
  • Summarizer / id-compressor / blob JSDoc note on `readOnly?: boolean` — small addition; happy to do this in a follow-up commit if the next review still flags it
  • Flag-polarity decision — already JSDoc'd with rationale (alignment with `IContainer.readOnlyInfo.readonly`); awaiting human signoff

anthony-murphy and others added 3 commits May 1, 2026 16:06
The unit test typed `rejection` as `unknown` and called `instanceof Error`,
which TypeScript rejects for `unknown` LHS. Type rejection as
`Error | undefined` and check for `undefined` instead — equivalent
semantics, no instanceof on unknown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
assert.strictEqual is declared with `asserts actual is T`, which narrowed
`rejection` to type `undefined` for the rest of the function. The
subsequent `assert(rejection !== undefined)` then narrowed to `never`,
making `rejection.message` a TS2339 error.

Capture into a separate const before the pending-state assertion so the
narrowing scopes to that const and the post-dispose check sees rejection
as Error|undefined again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The lint rule requires catch handler parameters to be named 'error'.
Bare 'e' was caught by eslint --quiet at 178:22.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anthony-murphy
Copy link
Copy Markdown
Contributor Author

Deep Review: Thanks for the rapid turnaround on 411863f — confirmed:

  • The focused unit tests in frozenServices.spec.ts (rejects the hung upgrade-connect when service.dispose() is called, rejects synchronously when called after service.dispose(), drains multiple pending rejecters on a single dispose() call) restore real coverage of the pendingConnectRejecters drain. Driving connectToDeltaStream({mode: "write"}) directly against a constructed FrozenDocumentService is the lighter-weight option (b) and gets meaningful assertions back — the integration-test version was always going to be awkward once the sendMessages short-circuit prevented the upgrade from firing.
  • Changeset reconciliation matches the actual flow now (sendMessages short-circuit; container stays Connected; submitSignal read-only-variant 403→no-op called out explicitly).
  • Test docstring renames (...while the read→write upgrade connect is hung...runs cleanly on a writable-frozen container after a local write, and the timer/microtask-batching rename) match what the tests actually exercise post-short-circuit.

One new concern surfaced this round, posted as an inline thread on frozenServices.ts:133-137: the writable-variant reconnect branch (client.mode !== "write" → fresh FrozenDeltaStream({readOnly:false})) reuses the constant "storage-only client" clientId across instances, which will trip pendingStateManager's 0x173 (replayPendingStates called twice for same clientId!) the first time a writable-frozen container reconnects with dirty pending ops. The fix is either per-instance clientId minting (mirrored in initialClients) or — if the branch is dead-by-construction now that sendMessages short-circuits the read-mode upgrade — dropping/asserting against the branch entirely. Details, evidence, and the open design question in the inline thread; updated review summary above has the path-to-ready.

Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
Pre-fix, every FrozenDeltaStream instance shared the constant clientId
"storage-only client". A subsequent connect through
FrozenDocumentService.connectToDeltaStream's `client.mode !== "write"`
branch (e.g. an explicit container.disconnect/connect cycle on a
writable-frozen container with dirty pending ops) would replay pending
ops against the same clientId twice and trip pendingStateManager's
0x173 assert ("replayPendingStates called twice for same clientId").

Mint a fresh `frozen-delta-stream/<uuid>` clientId per instance, mirror
it in initialClients so the audience handler still observes "self" and
transitions to Connected.

Adjusted two existing tests in noDeltaStream.spec.ts that asserted the
literal "storage-only client" string — they now assert the
"frozen-delta-stream/" prefix instead.

New regression test in loadFrozenContainerFromPendingState.spec.ts
forces a disconnect/reconnect cycle on a writable-frozen container
with dirty pending ops and verifies the container survives, all writes
remain locally visible, and pending state still round-trips. Pre-fix
this would have tripped 0x173.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
Comment thread packages/loader/container-loader/src/connectionManager.ts
Comment thread packages/loader/container-loader/src/test/frozenServices.spec.ts
Comment thread packages/test/local-server-tests/src/test/noDeltaStream.spec.ts Outdated
…instance clientId; subsystem JSDoc

Three Tier-3 polish items from the deep review on 849c0ed:

- Narrow ConnectionManager.sendMessages's FrozenDeltaStream short-circuit
  to the writable variant (`!this.connection.readOnly`). The read-only
  variant retains its FrozenDeltaStream.submit 403-nack tripwire — a
  stray submit on a storage-only frozen connection should remain
  observable as an upstream invariant break. Tripwire comment expanded
  to document the read-only-vs-writable split.

- Add a focused unit test in frozenServices.spec.ts pinning that
  subsequent connectToDeltaStream({mode:"read"}) calls on a writable
  FrozenDocumentService return distinct FrozenDeltaStream instances with
  distinct frozen-delta-stream/<uuid> clientIds, and that initialClients
  mirrors the per-instance id. This pins the linchpin that prevents
  pendingStateManager 0x173 on the read-mode reconnect branch — a future
  refactor that breaks the per-instance UUID would now fail this test.

- Subsystem JSDoc note on ILoadFrozenContainerFromPendingStateProps.readOnly
  confirming summarizer / id-compressor / blob behavior is unchanged from
  the read-only frozen path (storage ops throw, no acks, no quorum
  changes); only delta is the runtime accepting DDS submissions and
  accumulating them in pendingStateManager.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anthony-murphy
Copy link
Copy Markdown
Contributor Author

Quick status note on the 8/10 verdict on ce7f4ed:

The "held subsystem JSDoc note" item in Path to Ready is already addressed in ce7f4ed4d1. See createAndLoadContainerUtils.ts:265-272:

Subsystem behavior is unchanged from the read-only frozen path regardless of readOnly: storage operations still throw (only readBlob is supported); summarizer / id-compressor never fire because no acks arrive; the quorum is whatever was captured in pending state and gains no members during the writable-frozen lifetime. The only behavioral delta is that the runtime accepts DDS submissions and accumulates them in pendingStateManager.

Remaining items from the Path to Ready are all on the human reviewer side:

  • Human sign-offs: vladsud (connectionManager short-circuit), ChumpChief (@legacy @alpha polarity / arg order), dannimad (frozen-container test pattern), jatgarg (optional, constructor refactor at her two call sites).
  • Optional Tier-3 follow-ups I am holding: focused mocked unit test for the sendMessages instanceof FrozenDeltaStream && !readOnly early-return (requires a ConnectionManager test double — agreed it would tighten regression detection but is heavier than the integration coverage already in place); noDeltaStream.spec.ts filename rename (cosmetic, file covers paths beyond just FrozenDeltaStream).

All 7 prior Deep Review iterations have closed; no fresh correctness findings on this commit. Standing down on autonomous changes until human review lands.

Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
…itable siblings

- FrozenDeltaStream (read-only) and WritableFrozenDeltaStream (writable) now extend a shared abstract FrozenDeltaStreamBase. The split replaces the prior `readOnly` flag and the runtime asserts that gated which options were valid for which variant — now encoded in types.
- New isWritableFrozenDeltaStreamConnection predicate parallels isFrozenDeltaStreamConnection. ConnectionManager.sendMessages uses the predicate instead of a raw `instanceof + .readOnly` check; all type discrimination at consumer call sites goes through predicate functions.
- Collapsed FrozenDocumentService.connectToDeltaStream: it now returns a fresh WritableFrozenDeltaStream regardless of client.mode or whether this is the initial connect. The write-mode hang fallback was unreachable in normal flow (sendMessages short-circuit blocks the only runtime path that would request a write reconnect), so the rejecter Set, disposed flag, handedOutInitialConnection flag, and dispose-time draining loop are all gone. dispose() is a no-op.
- Tests updated: removed three lifecycle tests that exercised the now-deleted hang/rejecter machinery; kept the per-instance clientId regression test. Refreshed comments in the integration-test file that referenced the dead code paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/loader/container-loader/src/frozenServices.ts
Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
Comment thread packages/loader/container-loader/src/connectionManager.ts
…est; document storageOnly consumer audit

- New `forceReadonly(true) surfaces a writable-frozen container as readonly` integration test pins the interaction with microsoft#11655's "lie disconnected" mechanism: forceReadonly(true) on a writable-frozen container falls through to the standard disconnect/reconnect-as-read flow, the new connection is another WritableFrozenDeltaStream, and `_forceReadonly = true` overrides readOnlyInfo.readonly to true regardless of the new stream's DocWrite scope. The runtime stops submitting at the readonly boundary; the sendMessages short-circuit becomes double-protection rather than the load-bearing layer.
- JSDoc note on `FrozenDocumentService.policies` records the consumer audit (only ConnectionManager.connectCore consumes `policies.storageOnly` as a frozen-container marker; all other matches are drivers reading their own policies or `IReadOnlyInfo.storageOnly`, which is derived from the live connection). Writable-frozen is intentionally indistinguishable from a normal container at the policies layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
Comment thread packages/loader/container-loader/src/frozenServices.ts Outdated
anthony-murphy and others added 3 commits May 5, 2026 09:57
…-instance UUID to writable variant

Address review feedback that the read-only frozen path's identity (clientId, IClient.user.id) shouldn't change since customers may key off it. The 0x173 replay-assert risk that motivated per-instance clientIds applies only to the writable variant, where the runtime accumulates dirty pending ops across reconnects.

- FrozenDeltaStream (read-only): clientId restored to the constant `"storage-only client"`. clientFrozenDeltaStream.user.id reverted to `"storage-only client"`. The historical clientIdFrozenDeltaStream constant is restored.
- WritableFrozenDeltaStream: still mints a per-instance `frozen-delta-stream/<uuid>` clientId; the 0x173 dodge stays where it's actually needed.
- FrozenDeltaStreamBase: constructor now takes `(clientId, claims)` parameters so each subclass picks its own scheme. claims is a single field on the base (was abstract + duplicate `as ITokenClaims` casts in two subclasses); module-level readOnlyClaims / writableClaims constants share the cast rationale.
- noDeltaStream.spec.ts: 4 assertions reverted from `assert.match(/^frozen-delta-stream\//)` back to `strictEqual("storage-only client")` to match the restored read-only identity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anthony-murphy anthony-murphy marked this pull request as ready for review May 6, 2026 01:27
Copilot AI review requested due to automatic review settings May 6, 2026 01:27
@anthony-murphy anthony-murphy requested a review from a team as a code owner May 6, 2026 01:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a readOnly?: boolean option (defaulting to true) to loadFrozenContainerFromPendingState and createFrozenDocumentServiceFactory, enabling a new “writable frozen” load mode where the runtime can accept local DDS submissions while still preventing any outbound ops from reaching a real service.

Changes:

  • Add readOnly?: boolean plumbing for frozen-from-pending-state loads and factory wrapping (default preserves existing read-only behavior).
  • Split frozen delta stream behavior into read-only vs writable variants, and add a ConnectionManager.sendMessages short-circuit to drop outbound messages for the writable-frozen connection.
  • Add targeted unit + local-server tests for the writable-frozen flow and regression coverage (pending state accrual, reconnect safety, signals, etc.).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
packages/test/local-server-tests/src/test/loadFrozenContainerFromPendingState.spec.ts Adds extensive local-server coverage for readOnly: false flow, reconnects, pending state round-trip, and lifecycle behaviors.
packages/loader/container-loader/src/test/frozenServices.spec.ts New unit tests for frozen delta stream nack/no-op behavior and per-instance clientId uniqueness for writable connections.
packages/loader/container-loader/src/frozenServices.ts Introduces writable frozen delta stream variant + per-instance clientId; adds readOnly wrapping behavior to factory/service.
packages/loader/container-loader/src/createAndLoadContainerUtils.ts Extends frozen-load props with readOnly?: boolean and forwards to frozen document service factory wrapper.
packages/loader/container-loader/src/connectionManager.ts Adds writable-frozen sendMessages short-circuit; updates FrozenDeltaStream construction to new options shape.
packages/loader/container-loader/api-report/container-loader.legacy.alpha.api.md Updates legacy alpha API report for the new optional readOnly parameter.
.changeset/frozen-container-readonly-option.md Adds release note + minor bump for the new option and behavior changes.

Comment thread .changeset/frozen-container-readonly-option.md Outdated
anthony-murphy and others added 3 commits May 5, 2026 19:27
…ts; fix changeset wording

Address Copilot review feedback on timing-dependent test waits and stale wording:

- Changeset: "synthetic FrozenDeltaStream" → "synthetic WritableFrozenDeltaStream" in the writable-context paragraph (the writable path uses WritableFrozenDeltaStream specifically, and sendMessages short-circuits on that).
- captures writes batched across timer/microtask boundaries: 500ms sleep → setTimeout(0). Crossing one macrotask boundary is sufficient for the test's intent (runtime continues to capture pending state across timer boundaries); 500ms was overkill. Renamed map keys preDisconnect/postDisconnect → preDelay/postDelay since no disconnect occurs in this test. Updated comment "FrozenDeltaStream short-circuit" → "WritableFrozenDeltaStream short-circuit".
- dispose() / close() runs cleanly: dropped the 200ms delays. The simplification that collapsed FrozenDocumentService.connectToDeltaStream branches removed the async hung-promise machinery, so there's no deferred work to wait for; dispose()/close() can run immediately after the synchronous root.set().
- forceReadonly(true): dropped the 200ms delay. forceReadonly toggles _forceReadonly synchronously, so readOnlyInfo updates without awaiting a reconnect cycle (the disconnect/reconnect-as-read triggered behind it is exercised by the survives-disconnect/reconnect test).
- survives a disconnect/reconnect cycle: replaced the 200ms sleep with Promise.race against the "connected" event and a 500ms safety-net timeout. Resolves immediately if the event fires; bounded otherwise. The safety net is documented.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI biome check on `a3360ee43c` flagged the multiline `new Promise<void>((resolve) => frozenContainer.once("connected", () => resolve()))` arrow body. Reformat to one line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> fluid-framework-docs-site@0.0.0 ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> fluid-framework-docs-site@0.0.0 serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> fluid-framework-docs-site@0.0.0 check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  288641 links
    1922 destination URLs
    2172 URLs ignored
       0 warnings
       0 errors


@anthony-murphy
Copy link
Copy Markdown
Contributor Author

Deep Review

Reviewed commit 102c584 on 2026-05-06.

Readiness: 10/10 — READY

Sign-off ready. No Tier 1–3 findings on 102c584; same code position as the a3360ee review with the surface concerns (assert(!isWritableFrozenDeltaStreamConnection) invariant guard, as ITokenClaims casts, option polarity) reading as soft polish rather than blockers. Remaining motion is the human sign-offs the PR description already requests — vladsud on the ConnectionManager.sendMessages short-circuit, ChumpChief on the @legacy @alpha API shape, jatgarg (optional) on the constructor reshape at her two historical call sites.

Context for Reviewers

  • The WritableFrozenDeltaStream short-circuit in sendMessages is the first deliberate exception to vladsud's Remove two types of nacks - "Nonexistent client" & "Readonly client" #7753 read-mode-no-submits invariant. Ordering is the entire correctness story — isWritableFrozenDeltaStreamConnection at connectionManager.ts:1093-1120 MUST stay above the read-mode reconnect branch; under ReconnectMode.Never the read-mode branch would call closeHandler and tear the container down on the first DDS write. Tripwire comment is in place; no automated enforcement. The branch is gated to isWritableFrozenDeltaStreamConnection (not FrozenDeltaStream) so the read-only variant retains its submit 403-nack tripwire.
  • clientId ownership change is asymmetric by design. Read-only variant keeps the constant "storage-only client" (hard-coded in noDeltaStream.spec.ts); only WritableFrozenDeltaStream mints a per-instance frozen-delta-stream/<uuid>. Targets the pendingStateManager 0x173 (replayPendingStates called twice for same clientId!) assert that only fires on the writable path when reconnecting with dirty pending ops. Disconnect/reconnect integration test pins this. Inline JSDoc captures the rationale; redirect any "give every frozen connection a uuid clientId for symmetry" suggestions there.
  • initialClients mirroring is load-bearing. PR Simplified model for Audience; Leverage join signal's referenceSequenceNumber for catch up logic #12164 (vladsud, 2022-11-13) made the connected transition occur only after "self" appears in Audience. The frozen-stream base mirrors each synthetic clientId into initialClients so the audience handler observes self without a real join signal; the new unit test asserts each writable stream's initialClients[0]?.clientId matches its fresh clientId.
  • allowReconnect: false and non-interactive-client first-connect interaction is also covered by the sendMessages short-circuit. Container.connectToDeltaStream forces mode = "write" on the very first connect under those conditions; the same short-circuit catches both the runtime upgrade path and the ReconnectMode.Never close path with a single early return. Integration tests pin both allowReconnect: false and non-interactive-client variants.
  • submitSignal is now a silent no-op for both variants (was 403 nack on the read-only variant in ContainerLoader: Load pending state as a frozen container #25538). Documented in the changeset; signals are ephemeral and dropping is the right behavior with no upstream. submit stays defensive (403 nack on the read-only variant) — ops are durable state, a stray submit signals an invariant break.
  • Writable-frozen intentionally diverges from Lie to ContainerRuntime about connected state when in readonly mode #11655's "lie disconnected" approach. Container stays Connected, runtime submits, sendMessages drops at the network layer. The new forceReadonly(true) integration test pins the interaction with Lie to ContainerRuntime about connected state when in readonly mode #11655's mechanism — a writable-frozen container with forceReadonly(true) surfaces as readonly.
  • Type-system enforcement of readOnly-only options matches the author's own established review standard. On PR Handle sessoionforbidden error and add storageOnlyReason to Readonly info #15907, anthony-murphy pushed back on as casts in this area. The sibling-class split (FrozenDeltaStream / WritableFrozenDeltaStream) plus isFrozenDeltaStreamConnection / isWritableFrozenDeltaStreamConnection predicates and the explicit "no raw instanceof checks" PR-description note directly mirror that standard. The surviving as ITokenClaims casts at frozenServices.ts:173-184,296-307 are explicitly bracketed by JSDoc rationale.
  • @legacy @alpha shape extends an existing props bag. PR ContainerLoader: Move getPendingLocalState to legacy/alpha #25513 review (ChumpChief, 2025-09-24): "how does this scale if there's a second thing that we want to export an alpha version for?" — anthony-murphy: "it should become an overload". The PR adds readOnly?: boolean to the existing ILoadFrozenContainerFromPendingStateProps rather than introducing a new function; positional-arg createFrozenDocumentServiceFactory(readOnly?, factory?) is the asymmetry worth scrutiny.
  • Rewrap-on-mismatch and per-instance UUID are extra defensive depth. Independent design proposals missed both hazards: createFrozenDocumentServiceFactory unwraps and rewraps when an already-wrapped factory's readOnly differs from the caller's argument so the most recent intent wins; the per-instance writable clientId defends against pendingStateManager 0x173 on reconnect with dirty pending ops.
  • Writable-frozen is a scoped opt-in layered on the original storage-only design. PR ContainerLoader: Load pending state as a frozen container #25538 introduced FrozenDocumentServiceFactory / FrozenDocumentService / FrozenDeltaStream for read-only, storage-only frozen loads. This PR preserves that default while adding readOnly?: boolean and gates the non-storage-only policy to readOnly: false.
For human reviewer
  • vladsud sign-off on the ConnectionManager.sendMessages short-circuit (packages/loader/container-loader/src/connectionManager.ts:1093-1120) — first deliberate exception to his Remove two types of nacks - "Nonexistent client" & "Readonly client" #7753 read-mode-no-submits invariant. Verify (a) the writable-frozen short-circuit precedes every submit-path branch, (b) whether the tripwire comment is sufficient or an assert(!isWritableFrozenDeltaStreamConnection(this.connection)) guard above the read-mode reconnect branch is wanted, (c) the read-only FrozenDeltaStream 403-nack tripwire remains observable. Also: confirm intended forceReadonly(true) × writable-frozen fall-through semantics pinned by the new test.
  • ChumpChief sign-off on the @legacy @alpha API shape — readOnly?: boolean polarity (negative, default true) on ILoadFrozenContainerFromPendingStateProps; positional-arg vs. options-bag asymmetry on createFrozenDocumentServiceFactory(readOnly?, factory?); whether overloading loadFrozenContainerFromPendingState({ readOnly }) is the right axis or the writable case warrants its own entry point (the ContainerLoader: Move getPendingLocalState to legacy/alpha #25513 "second thing" question now exercised). Both independent proposals chose positive polarity (loadWritable / allowLocalEdits); the JSDoc explicitly notes a positive-polarity option "can layer on top later without breaking callers."
  • jatgarg (optional) review of (a) the FrozenDeltaStream constructor positional → options-bag refactor at her two historical call sites (Handle sessoionforbidden error and add storageOnlyReason to Readonly info #15907 storageOnlyReason, Handle out of storage error from driver and make the container loadable #17131 readonlyConnectionReason); confirm the readonly event payload still carries error.errorType === DriverErrorType.outOfStorage end-to-end after the reshape; (b) the read-only variant losing its submitSignal 403-nack tripwire on the storage-only / forbidden / out-of-storage paths.
  • Sibling-class split vs single class with mode flag — taste call. PR has documented rationale (type-system narrowing of read-only-only options + instanceof predicate excluding the writable variant). Independent design proposals both chose a single-class + flag shape; PR's choice preserves a type-discriminant invariant.
  • Locus of decoupling — connection-layer (PR's choice) vs Container-layer (localEditsOnly field, canSendOps independent of readonly). Both achieve the same observable behavior; PR's blast radius is smaller.
  • Persisted as ITokenClaims cast pattern at frozenServices.ts:173-184 and :296-307. A reviewer who wants a makeFrozenClaims(scopes) helper rather than the eslint-disable bracketing should call it out explicitly. No current-correctness bug — only claims.scopes is read downstream.
  • submitSignal harmonization to silent no-op on the read-only variant — pre-existing public path; documented in the changeset. Reviewers wanting a positive consumer audit (grep results) rather than a negative claim ("no known consumer") can call it out.
  • Cannot be assessed by the pipeline — whether the disconnect/reconnect regression test currently passes in CI on 102c584; runtime behavior of presence/signal subsystems on a writable-frozen connection over time; cross-fork determinism of the per-instance UUID clientId under concurrent reconnects; production drift in PendingStateManager / Outbox ordering that would change the "ops are durable before sendMessages drops them" invariant.
Review history (10 prior reviews)
  • a3360ee 2026-05-05 · 10/10 — sign-off ready; no Tier 1–3 findings; soft-polish items survive as Tier 4
  • 4ddd5c1 2026-05-05 · 10/10 — sign-off ready; no Tier 1–3 findings; remaining motion is human sign-offs
  • 2035078 2026-05-05 · 9/10 — no Tier 1–3 findings; gap to 10/10 attributed to pending human sign-offs
  • 86c509a 2026-05-05 · 9/10 — no Tier 1–3 findings; gap to 10/10 is human sign-offs
  • ac44bd1 2026-05-04 · 8/10 — three new Tier-3 polish items (stale PR-description paragraph, policies.storageOnly consumer audit, forceReadonly(true) pinning test)
  • e93eb26 2026-05-04 · 8/10 — design holds on re-review; new disposed-asymmetry inline thread; polish + sign-offs remaining
  • ce7f4ed 2026-05-01 · 8/10 — short-circuit narrowing to writable-only and per-instance-UUID unit test landed; surviving items polish + sign-offs
  • 849c0ed 2026-05-01 · 8/10 — per-instance clientId fix lands; instanceof short-circuit narrowing and unit-test polish raised
  • cdcf203 2026-05-01 · 5/10 — writable-frozen reconnect 0x173 clientId-reuse risk on frozenServices.ts:133-137
  • ff37f19 2026-05-01 · 3/10 — vacuous dispose/close-during-hung-upgrade tests after the sendMessages short-circuit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants