Skip to content

feat(dm): persist outbound DM plaintext in chat delegate (#256)#259

Merged
sanity merged 5 commits into
mainfrom
worktree-issue-256-outbound-dms
May 16, 2026
Merged

feat(dm): persist outbound DM plaintext in chat delegate (#256)#259
sanity merged 5 commits into
mainfrom
worktree-issue-256-outbound-dms

Conversation

@sanity
Copy link
Copy Markdown
Contributor

@sanity sanity commented May 16, 2026

Problem

In-room DMs (#243 / #244) carry only ECIES ciphertext on the wire — only the recipient can decrypt the body. The sender's own UI / `riverctl dm list` rendered their own sent DMs as "sent — ciphertext only", which is terrible UX: you typed the message, you sent it, but you can't see what you typed afterwards.

Approach

Persist the sender's plaintext in the chat delegate (same backing store as room secrets / signing keys / migrated rooms). This gives us:

  • Persistence across reloads. Sent DMs still readable next week.
  • Multi-device convergence. Co-room devices share delegate state.
  • Correct purge semantics. Recipient tombstones drop the plaintext in lockstep with the contract's ciphertext eviction.
  • Encrypted at rest. Delegate storage already encrypts data tied to the room secret.

What's in this PR

Wire format (`common/src/chat_delegate.rs`)

  • `OUTBOUND_DMS_STORAGE_KEY = b"outbound_dms"`
  • `OutboundDmStore { entries: Vec }` — `Vec` not `HashMap` so JSON serialisation works per the "non-string map keys in JSON-serialized API types" bug-prevention pattern. JSON + CBOR round-trip tests pin both shapes.

UI (`ui/src/components/app/chat_delegate.rs`, `direct_messages.rs`, `dm_thread_modal.rs`)

  • `OUTBOUND_DMS` global signal, hydrated from the delegate on startup (and from any legacy delegate as part of the existing migration).
  • `save_outbound_dm()` appends an entry, enforces `MAX_DM_MESSAGES_PER_PAIR` cap eviction, queues a delegate save. All signal mutation routed through `crate::util::defer` per the WASM signal-safety rules.
  • `prune_outbound_dms_for_purges()` drops cache entries whose ciphertext is no longer present in any room. Wired into a `use_effect` in `App()` that subscribes to `ROOMS` only — not `OUTBOUND_DMS` — to avoid self-triggering on its own writes. Skips the `with_mut` entirely when nothing to remove.
  • Thread modal's view memo snapshots the cache once and consults it for outbound bubbles; miss → legacy "sent — ciphertext only" placeholder for back-compat.

CLI (`cli/src/commands/dm.rs`, `cli/src/storage.rs`)

  • New `outbound_dms.json` side file managed by `Storage`.
  • `dm send` writes plaintext after a successful state delta.
  • `dm list` consults the cache and renders plaintext for outbound bubbles; misses fall back to `<sent: ciphertext only>`.
  • Per-pair cap eviction split into `apply_per_pair_cap` for unit testing.

Migration (`legacy_delegates.toml`)

  • New V18 entry for the pre-feat: persist outbound DM plaintext in the chat delegate #256 delegate (`code_hash=6229c2a1…`, `delegate_key=11db7236…`).
  • `fire_legacy_migration_request` now also probes legacy delegates for the `outbound_dms` blob so a rebuild doesn't orphan sender plaintext.
  • Room contract WASM intentionally unchanged — only `chat_delegate.rs` types added to river-core, and dead-code elimination strips them from the room contract binary. Verified by rebuilding from pre-feat: persist outbound DM plaintext in the chat delegate #256 source and confirming the local-build hash matches a rebuild with my changes.

Testing

  • `cargo test -p river-core`: JSON + CBOR round-trip of `OutboundDmStore` (+ empty store).
  • `cargo test -p riverctl --lib dm`: per-pair cap eviction (over-cap drops oldest by timestamp; pairs are isolated).
  • `cargo test -p river-core --test migration_test`: V18 entry validates (BLAKE3 invariants, no duplicates).
  • `cargo make check-migration`: confirms old hash present in TOML.

Manual UI test deferred to deploy step.

Out of scope (per issue #256)

  • Backfilling pre-delegate outbound DMs — sent under older clients stay as ciphertext-only forever.
  • Edit/delete of sent DMs.
  • Cross-room delegate sync (same scope as room-secret sync).

Closes #256

[AI-assisted - Claude]

sanity added a commit that referenced this pull request May 16, 2026
Multi-model review (4 internal + Codex) on PR #259 surfaced two
critical bugs and several IMPORTANT issues. This commit addresses them.

## BLOCKING — Cold-start prune-clobbers-cache race

`prune_outbound_dms_for_purges` previously compared the cache to the
"live messages" set in `ROOMS` and removed entries whose ciphertext
was missing. That signal is unsafe on cold start: the cache hydrates
from the delegate before the room contract state finishes loading,
so `live_keys` is empty when the prune effect first fires —
destroying the entire cache and persisting the empty result.

Fix: switch both the UI (`prune_outbound_dms_for_purges`) and CLI
(`prune_outbound_cache_for_room`) prune helpers to act ONLY on
entries whose token appears in some recipient's
`AuthorizedRecipientPurges` envelope. Purge envelopes are monotonic
network state, so their presence is always a true tombstone signal
regardless of room-state freshness.

## BLOCKING (Codex P1) — Migration-done flag was not scoped to legacy set

`river_legacy_migration_done` was a single global localStorage flag.
Once set (after any successful migration), every subsequent delegate
WASM bump that added a new entry to `legacy_delegates.toml` was
silently blocked: `fire_legacy_migration_request` returned early
without probing the new V18 entry, making the user's saved rooms
appear lost.

Fix: scope the flag per-legacy-set by suffixing the localStorage key
with the BLAKE3 fingerprint (16 hex) of `LEGACY_DELEGATES`. Adding
or removing a legacy entry now automatically invalidates the
previous flag and re-enables migration. Pinned by
`legacy_migration_flag_is_scoped_to_set`.

## IMPORTANT — Lost-update race on concurrent saves

`save_outbound_dms_to_delegate` snapshotted the cache then awaited
the delegate round-trip; two concurrent sends could race and the
later-landing save would overwrite the fresher one, silently losing
entries.

Fix: single-flight via `OUTBOUND_DMS_SAVE_IN_FLIGHT` /
`OUTBOUND_DMS_SAVE_DIRTY` atomic-flag pair. Rapid sends coalesce
into "current save + one final catch-up" rather than racing
parallel saves.

## IMPORTANT — Legacy hydrate overwrote fresher current entries

`hydrate_outbound_dms_cache` unconditionally `insert`ed every
deserialized entry, so a legacy response arriving after the current
delegate's response could overwrite the fresher entry.

Fix: on key collision, keep whichever entry has the larger
`timestamp`.

## IMPORTANT — Missing issue-256 regression test

Extracted a pure helper `lookup_outbound_plaintext(cache, room,
recipient, token)` from the `dm_thread_modal.rs` render path. Both
the UI and CLI now route through it for outbound-bubble rendering,
and two new tests pin the load-bearing lookup tuple:
- `dm_outbound_lookup_returns_plaintext_on_hit`
- `dm_outbound_lookup_returns_err_on_miss` (three miss cases — wrong
  room, recipient, token).

## Documentation

- `.claude/rules/river-publish.md`: "How Delegate Migration Works"
  now lists `[ROOMS_STORAGE_KEY, OUTBOUND_DMS_STORAGE_KEY]` as the
  probe set and documents the per-storage-key conflict-resolution
  rule.
- `AGENTS.md` "In-Room Direct Messages": Phase 5 block describing
  the outbound-DM plaintext cache (wire format, render path, save
  path, prune path, legacy migration, CLI side).
- `cli/src/storage.rs::save_outbound_dms`: threat-model note
  clarifying CLI plaintext-at-rest is consistent with `rooms.json`.

## Tests

- `cargo test -p river-ui ... direct_messages` — 5 passing
  (incl. 2 new #256 regression tests).
- `cargo test -p river-ui ... chat_delegate::tests` — 6 passing
  (incl. 1 new per-set flag test).
- `cargo test -p riverctl --lib dm` — 6 passing.
- `cargo test -p river-core` — all green (144/29/44/+ all suites).

[AI-assisted - Claude]
@sanity
Copy link
Copy Markdown
Contributor Author

sanity commented May 16, 2026

Multi-model review complete

Ran 4 internal reviewers (code-first / testing / skeptical / big-picture) + Codex CLI. Two critical issues found and fixed in f053d4d:

BLOCKING fixes

  1. Cold-start prune wipes the cache (skeptical reviewer). The original prune compared the cache to live direct_messages entries and removed any orphan — which fired at startup before room state had finished loading, destroying the entire cache. Switched both UI and CLI prune helpers to act ONLY on entries listed in some recipient's AuthorizedRecipientPurges envelope. Purge envelopes are monotonic, so their presence is always a true tombstone signal regardless of freshness.
  2. Migration-done flag was not scoped to legacy set (Codex P1). Every WASM bump that added a new legacy entry was silently blocked for any user who already migrated under a previous set, making their data appear lost. Fixed by suffixing the localStorage key with a BLAKE3 fingerprint of the current LEGACY_DELEGATES set, so adding/removing entries auto-invalidates the flag. Pre-existing bug surfaced by this PR's V18 entry; affects every prior delegate upgrade since the gating was introduced in fix(ui): gate legacy delegate migration on current delegate response #254. Pinned by legacy_migration_flag_is_scoped_to_set.

IMPORTANT fixes

  1. Lost-update race on concurrent saves (skeptical). Single-flighted save_outbound_dms_to_delegate via an atomic-flag pair so rapid sends coalesce into "current save + one final catch-up".
  2. Legacy hydrate clobbered fresher current entries (skeptical). On key collision, keep whichever has the larger timestamp.
  3. No regression test for the issue-256 symptom (testing reviewer). Extracted a pure lookup_outbound_plaintext helper and added two tests pinning the load-bearing lookup tuple.

Documentation

  • .claude/rules/river-publish.md: documented the two-key probe set (ROOMS_STORAGE_KEY, OUTBOUND_DMS_STORAGE_KEY) and the per-key conflict-resolution rule.
  • AGENTS.md "In-Room Direct Messages": added Phase 5 block on the outbound cache.
  • cli/src/storage.rs::save_outbound_dms: threat-model note about plaintext-at-rest.

Coordination note

PR #258 (Enter-to-send / DMs to left rail) also touches dm_thread_modal.rs::send. Both are MERGEABLE against main today but conflict against each other; whoever lands second rebases the send closure.

CI rerunning now.

[AI-assisted - Claude]

sanity added 2 commits May 16, 2026 16:59
In-room DMs carry only ECIES ciphertext on the wire — only the
recipient can decrypt the body. The sender's UI / `riverctl dm list`
rendered their own sent DMs as "sent — ciphertext only", which was
terrible UX.

This change persists the sender's plaintext in the chat delegate (same
backing store as room secrets and signing keys), so:
- Reloads recover the original plaintext.
- A second device (sharing the chat delegate state) converges on the
  same outbound view.
- Recipient purges drop the plaintext in lockstep with the contract's
  ciphertext eviction.
- Encrypted at rest (delegate storage already encrypts data tied to
  the room secret).

Wire format (`common/src/chat_delegate.rs`):
- `OUTBOUND_DMS_STORAGE_KEY = b"outbound_dms"`
- `OutboundDmStore { entries: Vec<OutboundDmEntry> }` — `Vec`, not
  `HashMap`, so JSON serialization works per the
  "non-string map keys in JSON-serialized API types" bug-prevention
  pattern. JSON and CBOR round-trip tests pin both shapes.

UI (`ui/src/components/app/chat_delegate.rs`, `direct_messages.rs`,
`dm_thread_modal.rs`):
- `OUTBOUND_DMS` global signal hydrated from the delegate on startup
  (and from any legacy delegate as part of the existing migration).
- `save_outbound_dm()` appends an entry, enforces the contract's
  per-pair cap, and queues a delegate save (all signal mutation
  routed through `crate::util::defer` per the WASM signal-safety
  rules).
- `prune_outbound_dms_for_purges()` drops cached plaintext for
  ciphertext that has been purged from any room. Wired into a
  `use_effect` in `App()` that subscribes to ROOMS only — never to
  OUTBOUND_DMS — to avoid self-triggering on its own writes. The
  function skips the `with_mut` entirely when nothing needs removal.
- Thread modal's view memo snapshots the cache once and consults it
  for outbound bubbles; miss → legacy "sent — ciphertext only"
  placeholder for back-compat.

CLI (`cli/src/commands/dm.rs`, `cli/src/storage.rs`):
- New `outbound_dms.json` side file managed by `Storage`.
- `dm send` writes the plaintext after a successful state delta.
- `dm list` consults the cache and renders plaintext for outbound
  bubbles; misses fall back to `<sent: ciphertext only>`.
- `apply_per_pair_cap` is split out so per-pair eviction is unit
  tested without an `ApiClient`.

Migration (`legacy_delegates.toml`):
- Adds V18 entry for the pre-#256 delegate WASM. The new delegate
  WASM hash is `47e54c90…`; legacy migration probes also fetch the
  `outbound_dms` blob from legacy delegates so a delegate rebuild
  doesn't orphan sender plaintext.
- Room contract WASM is intentionally unchanged — only `chat_delegate`
  types in river-core are added, the room-contract crate doesn't
  import them.

Tests:
- river-core: JSON + CBOR round-trip of `OutboundDmStore`, plus empty
  store.
- riverctl: per-pair cap eviction (over-cap drops oldest by
  timestamp, pairs are isolated).

Closes #256

[AI-assisted - Claude]
Multi-model review (4 internal + Codex) on PR #259 surfaced two
critical bugs and several IMPORTANT issues. This commit addresses them.

## BLOCKING — Cold-start prune-clobbers-cache race

`prune_outbound_dms_for_purges` previously compared the cache to the
"live messages" set in `ROOMS` and removed entries whose ciphertext
was missing. That signal is unsafe on cold start: the cache hydrates
from the delegate before the room contract state finishes loading,
so `live_keys` is empty when the prune effect first fires —
destroying the entire cache and persisting the empty result.

Fix: switch both the UI (`prune_outbound_dms_for_purges`) and CLI
(`prune_outbound_cache_for_room`) prune helpers to act ONLY on
entries whose token appears in some recipient's
`AuthorizedRecipientPurges` envelope. Purge envelopes are monotonic
network state, so their presence is always a true tombstone signal
regardless of room-state freshness.

## BLOCKING (Codex P1) — Migration-done flag was not scoped to legacy set

`river_legacy_migration_done` was a single global localStorage flag.
Once set (after any successful migration), every subsequent delegate
WASM bump that added a new entry to `legacy_delegates.toml` was
silently blocked: `fire_legacy_migration_request` returned early
without probing the new V18 entry, making the user's saved rooms
appear lost.

Fix: scope the flag per-legacy-set by suffixing the localStorage key
with the BLAKE3 fingerprint (16 hex) of `LEGACY_DELEGATES`. Adding
or removing a legacy entry now automatically invalidates the
previous flag and re-enables migration. Pinned by
`legacy_migration_flag_is_scoped_to_set`.

## IMPORTANT — Lost-update race on concurrent saves

`save_outbound_dms_to_delegate` snapshotted the cache then awaited
the delegate round-trip; two concurrent sends could race and the
later-landing save would overwrite the fresher one, silently losing
entries.

Fix: single-flight via `OUTBOUND_DMS_SAVE_IN_FLIGHT` /
`OUTBOUND_DMS_SAVE_DIRTY` atomic-flag pair. Rapid sends coalesce
into "current save + one final catch-up" rather than racing
parallel saves.

## IMPORTANT — Legacy hydrate overwrote fresher current entries

`hydrate_outbound_dms_cache` unconditionally `insert`ed every
deserialized entry, so a legacy response arriving after the current
delegate's response could overwrite the fresher entry.

Fix: on key collision, keep whichever entry has the larger
`timestamp`.

## IMPORTANT — Missing issue-256 regression test

Extracted a pure helper `lookup_outbound_plaintext(cache, room,
recipient, token)` from the `dm_thread_modal.rs` render path. Both
the UI and CLI now route through it for outbound-bubble rendering,
and two new tests pin the load-bearing lookup tuple:
- `dm_outbound_lookup_returns_plaintext_on_hit`
- `dm_outbound_lookup_returns_err_on_miss` (three miss cases — wrong
  room, recipient, token).

## Documentation

- `.claude/rules/river-publish.md`: "How Delegate Migration Works"
  now lists `[ROOMS_STORAGE_KEY, OUTBOUND_DMS_STORAGE_KEY]` as the
  probe set and documents the per-storage-key conflict-resolution
  rule.
- `AGENTS.md` "In-Room Direct Messages": Phase 5 block describing
  the outbound-DM plaintext cache (wire format, render path, save
  path, prune path, legacy migration, CLI side).
- `cli/src/storage.rs::save_outbound_dms`: threat-model note
  clarifying CLI plaintext-at-rest is consistent with `rooms.json`.

## Tests

- `cargo test -p river-ui ... direct_messages` — 5 passing
  (incl. 2 new #256 regression tests).
- `cargo test -p river-ui ... chat_delegate::tests` — 6 passing
  (incl. 1 new per-set flag test).
- `cargo test -p riverctl --lib dm` — 6 passing.
- `cargo test -p river-core` — all green (144/29/44/+ all suites).

[AI-assisted - Claude]
@sanity sanity force-pushed the worktree-issue-256-outbound-dms branch from f053d4d to 5594655 Compare May 16, 2026 22:00
sanity added 3 commits May 16, 2026 17:04
Bumped by cargo make publish-river during #256 outbound-DM deploy.

[AI-assisted - Claude]
…259)

Codex P2 finding on PR #259 re-review (post-rebase): when the
`outbound_dms` delegate response arrives AFTER `rooms_data`, the
`App()` ROOMS-subscribed prune effect has already fired on an empty
cache, and intentionally does NOT subscribe to OUTBOUND_DMS (to
avoid looping on its own writes). Without a follow-up prune,
plaintext for already-purged DMs persists in the cache until some
unrelated ROOMS change re-triggers the effect.

Fix: after `hydrate_outbound_dms_cache` populates the cache,
`crate::util::defer` a follow-up `prune_outbound_dms_for_purges()`
call. setTimeout(0) macrotasks execute in FIFO enqueue order, so
the prune runs AFTER the hydrate's internal `defer(with_mut)` and
sees the freshly hydrated entries.

Per the new global rules (~/.claude/rules/{publish-from-main,
no-merge-with-red-ci, multi-model-review}.md), this fix:
- Goes through the branch + merge to main path (not a feature-branch
  republish).
- Will trigger CI re-run on the new HEAD SHA.
- Will trigger a third round of reviewers since the code changed.

[AI-assisted - Claude]
Codex re-review of `b6c923ac` flagged a race in
`save_outbound_dms_to_delegate`: between the post-await
`DIRTY.swap(false)` and the `IN_FLIGHT.store(false)`, a concurrent
caller could observe `IN_FLIGHT == true`, set `DIRTY = true`, and
return — while the in-flight save then released `IN_FLIGHT` without
re-checking `DIRTY`, stranding the dirty update with no save
running. Result: most-recent sent-DM plaintext silently lost
across reloads.

Replace the `AtomicBool` IN_FLIGHT/DIRTY pair with a
`futures::lock::Mutex` + dirty flag. The mutex serializes the
entire critical section so the TOCTOU window doesn't exist: every
caller marks dirty, queues at the mutex, and the holder drains
the dirty queue inside the lock via the inner while-loop. Same
coalescing semantics (at most 2 delegate writes per N rapid
mutations) without the race.

The mutex/dirty-flag pattern is also more obviously correct under
multi-threaded execution — even though WASM is single-threaded
today, using `AcqRel`/`Release` ordering implied multi-thread
intent and the race would manifest on any rebuild for a
multi-threaded target.

Per the new global rules
(~/.claude/rules/multi-model-review.md per-CODE-CONTENT requirement):
this fix triggers ANOTHER re-review round on the new HEAD SHA
before merge.

[AI-assisted - Claude]
@sanity sanity merged commit 9e63997 into main May 16, 2026
6 checks passed
sanity added a commit that referenced this pull request May 16, 2026
Publish from main after PR #259 merge (squash commit 9e63997) per
the new global publish-from-main rule. v30000303 had the Codex P2
hydrate-then-prune ordering race and the TOCTOU save race; both
fixed in the merged PR. v30000304 is the corrected build.

[AI-assisted - Claude]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: persist outbound DM plaintext in the chat delegate

1 participant