fix(read-path): reach complete threads, dense-second timelines, and all people in the GUI by tlongwell-block · Pull Request #1418 · block/buzz

tlongwell-block · 2026-07-01T02:10:27Z

What & why

This PR fundamentally fixes how the desktop GUI reaches channel messages, thread replies, and people — removing the "hacky limits" that silently miss messages/people. This is the Tier-1 read-path work: reachability via keyset pagination over the already-merged relay substrate (PR #1321), no schema change, no hot-path rewrite. Two lanes, one PR.

Lane 1 — messages + threads (Eva)

Threads (descendant gap). The thread panel derived replies from the channel cache, and useLoadMissingAncestors only backfilled ancestors. Replies outside the cold-load window were never fetched, so deep/old threads rendered silently incomplete. useThreadReplies closes it via the same cache seam: page get_thread_replies to the floor over the gap-free (created_at, event_id) keyset and merge into the channel cache (dedupe by id, idempotent per root, retry on error). Downstream grouping/ordering/unread unchanged.

The backend get_thread_replies keyset-ordered on event_created_at alone with no tiebreak, so a page ending mid-second dropped every remaining tied reply. Composite (event_created_at, event_id) keyset closes it. The query keys on root_event_id (which root rows lack), so results are replies-only (depth >= 1) — root excluded by construction.

Channel timeline (dense-second wall). Ordinary same-second overlap is duplicate-safe (relay until is inclusive), so the timeline doesn't drop messages normally. The one genuinely unreachable case: a single created_at second with more messages than one WS page — until never advances, everything behind it is stuck. get_channel_messages_before gives the pager a composite-keyset fallback that:

activates only on a full-page same-second stall (the proven wall),
seeds from the max event id already known at the boundary second (advances via id > before_id under created_at DESC, id ASC),
drains the entire boundary second before honoring the per-pass budget,
leaves the WS live-tail/optimistic/cold-load hot path untouched.

Lane 2 — people search / mentions / members (Wren)

Pages user search so add-users / tag-users / mention interactions reach every match instead of a hacky first-page limit (user_search.rs, useMentions.ts, members/mention/DM UIs, profile/hooks.ts). Empty-query directory browse is now paged end-to-end (relay page→offset on the non-search /query path; desktop emits a next cursor on full raw pages). E2E coverage in channels.spec.ts + mentions.spec.ts.

The blocker a live relay surfaced (and its fix)

Static review passed the thread path, but a dynamic redteam against a real relay caught a production-fatal blocker the mock hid: get_thread_replies built a kindless /query filter, and the relay bridge p-gate (p_gated_filters_authorized) 403s any kindless filter before routing — a kindless filter "could match" a p-gated kind, so the gate demands a #p=self tag the thread read doesn't send. The relay routes the thread query purely off #e + depth_limit (kind doesn't filter it), but the gate runs first. The e2e mock never modeled p-gating, so CI stayed green on a 403 — every thread-backfill call would have failed in prod.

Fixes:

messages.rs — give the thread filter the non-p-gated TIMELINE_KINDS (via a shared const + build_thread_replies_filter helper mirroring the channel-messages sibling, so the two p-gate filters can't drift). Guard unit tests assert the filter carries kinds and that every kind is absent from P_GATED_KINDS (the real invariant), plus composite-cursor paging.
e2eBridge.ts — model the relay p-gate for /query and WS REQ so a kindless #e read is now rejected in the mock too (faithful port of p_gated_filters_authorized), closing the CI blind spot. Also fix the mock subtree walk to traverse nested replies transitively (frontier BFS for depth > 1).
useThreadReplies.ts — clear the fetched-root mark on cancellation unless the paging loop completed, so an interrupted first fetch retries on reopen instead of stranding the feature. Narrow effect deps to primitive channel id/type.
Corrected a false "includes root at depth 0" contract at all three doc sites (messages.rs / tauri.ts / types.ts): results are replies-only.

Verification

Live re-verify at the final tip (the gate that caught the blocker): built the relay at this SHA on a clean-room DB, seeded root + 250 dense same-second depth-1 replies + 1 depth-2 grandchild.

kindless #e filter → HTTP 403 (blocker reproduced exactly),
shipped kinded filter → HTTP 200, 251 rows, 250/250 depth-1 reached, depth-2 reached, root excluded.

Green on the integrated branch: tsc, biome, file-sizes, px-text, cargo check, unit suites (buzz-db 124/0 incl. Postgres with a red-check proving the stub test fails on the pre-fix SQL; desktop 864/0), the dense-second Playwright spec, and all pre-push CI hooks (desktop-test, rust-tests, mobile-test, desktop-tauri-test).

Notes for review

Not in scope (deferred): the provable-no-loss relay_seq work (Tier-2) — never implemented (spec-only), named as follow-on, not reopened here.
File-size overrides bumped (messages.rs, tauri.ts) for load-bearing bindings/fixes from both lanes. Splitting the central bindings is known debt / a separate PR — no opportunistic refactor here.
Pre-existing flakes: scroll-history.spec.ts "preserves user scroll" / "does not teleport" flake on local runners at the clean baseline too (wheel timing); pass under CI's retry policy. Not touched by this change.

Final score: 9/10.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

Threads rendered silently incomplete because the desktop assembled them from the channel cache instead of fetching the reply subtree. This wires a first-class server-side thread read (`get_thread_replies` over `thread_metadata`) through the bridge and a new Tauri command, and fixes a correctness hole in its pagination. The pagination bug: `get_thread_replies` keyset-ordered on `event_created_at` alone, with `WHERE event_created_at > $cursor` and no tiebreak. Thread replies routinely share a `created_at` second (bursty threads), so once a page ended mid-second the cursor advanced past the whole second and every remaining tied reply was silently dropped — reproduced live on a seeded 40-reply thread spanning two seconds (24 + 16): paging at limit 10 lost 14 of the 24 tied replies. That is the exact "missed messages" class this work targets. Fix (Tier-1, no schema change): composite `(event_created_at, event_id)` keyset. The query orders and compares on the pair (`WHERE (event_created_at, event_id) > ($ts, $id) ORDER BY event_created_at ASC, event_id ASC`), and the cursor carries the last reply's event id as the tiebreak end to end: - buzz-db: composite cursor decode (8-byte BE seconds + raw event id) and row-comparison keyset; a bare 8-byte cursor still works for back-compat. - buzz-relay bridge: `extract_thread_cursor` reads `thread_cursor` + `thread_cursor_id` and encodes the composite bytes. - desktop: `ThreadCursor { created_at, event_id }` request/response type; the `get_thread_replies` command derives the next cursor from the last event's `(created_at, id)` transparently — no server-issued token. Verified end to end against a live relay: paging the seeded thread at limit 10 returns a set byte-identical to the full-thread oracle (40/40, no gaps, no duplicates). New DB regression test pins the same-second-tie case; +5 bridge unit tests cover the composite cursor encoding. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

…the GUI Second half of the Tier-1 read-path work (the backend thread fetch + composite keyset landed in the prior commit). This wires the frontend to those server-side reads and adds a channel-timeline escape hatch, closing the two remaining "missed messages" gaps in the desktop GUI without a schema change or a hot-path rewrite. Threads (descendant gap): the thread panel derives its replies from the channel cache, and `useLoadMissingAncestors` only backfills *ancestors* (walking `e`-tags upward). Replies that fell outside the channel cold-load window were therefore never fetched — deep or old threads rendered silently incomplete. `useThreadReplies` closes this using the same cache seam: on thread open it pages `get_thread_replies` to the floor over the gap-free `(created_at, event_id)` keyset and merges each event into the channel cache (`mergeMessages`, dedupe by id). Idempotent per (channel, root); retries on error rather than caching a partial subtree. All downstream grouping/ordering/unread derivation is unchanged — the thread simply becomes complete. Channel timeline (dense-second wall): scrollback pages history over WS `REQ` with a bare `until` (`created_at`) cursor. Ordinary same-second overlap is duplicate-safe (relay `until` is inclusive), so the timeline does not drop messages in the normal case. The one genuinely unreachable case is a single `created_at` second holding more messages than one WS page: `until` keeps returning that second's newest slice, the boundary never advances, and everything behind it is stuck. `get_channel_messages_before` gives the pager a bridge composite-keyset fallback that advances within a tied second via `id > before_id` under the relay's `created_at DESC, id ASC` order. It activates *only* on a full-page same-second stall (the proven wall), seeds from the max event id already known at the boundary second (not `baseline[0].id`, which would re-request held rows), and drains that entire boundary second before honoring the per-pass row-floor/batch budget — yielding mid-second would strand the tail behind the same stall with no guaranteed sentinel re-arm. Older history then resumes on the normal budget. The WS live-tail/optimistic/cold-load hot path is untouched. Verification: new Playwright parity test seeds a 450-message dense second (>2 WS pages) behind the cold-load window and proves the keyset fallback fires and every row is reachable. It caught a real tail-stranding bug in the first cut (drain honored the batch budget mid-second, stranding 50 of 450 permanently); the boundary-second-drain invariant is the fix. Green: tsc, biome, file-sizes, px-text, cargo check, 1399 unit tests, and the new spec. `tauri.ts`'s existing file-size override is bumped for the two load-bearing bindings (split is known debt, a separate PR). Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

Lane 2 (Wren): page user search so add-users / tag-users interactions reach every match instead of a hacky first-page limit. Independent of Lane 1's message/thread read-path (disjoint files); merged to land the read-path work as one PR. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz> * feat/read-path-people-model: test(desktop): cover people search pagination fix(desktop): page user search reachability

Keep the read-path branch mergeable so CI materializes the merge ref. Relay-only change, disjoint from the desktop read-path work. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz> * origin/main: fix(relay): enable Redis TLS for rediss:// (ElastiCache) (#1417)

…eads The channel-timeline dense-second fallback sent its composite keyset tiebreak under the filter key `n`, but the relay bridge's `extract_before_id` reads it from `before_id` (crates/buzz-relay/src/api/ bridge.rs). So on the real `/query` path `query.before_id` was never set: the fallback silently degraded to a bare inclusive `until`, re-returned the same full dense-second page, re-emitted the same cursor, and `drainOlderViaKeyset` could spin inside the boundary-second drain loop instead of breaking the wall — the exact case the escape hatch exists to fix. Caught by Wren in PR review. The Playwright spec passed anyway because its mock-store path reimplements the keyset in JS (never exercising the relay filter field), and config mode sent the same wrong key. Fix both call sites to send `before_id`, and add a Rust unit test that pins the field name so the client/relay contract can't drift silently again — extracting `build_channel_messages_before_filter` to make the emitted filter testable, mirroring `build_search_messages_filter`. Verified: 863/863 desktop cargo tests (incl. the 2 new filter tests), tsc, biome, file-sizes, px-text, 1399 node unit tests, dense-second Playwright. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

The empty-query people directory listing was terminal: it could never page past its first page, so on busy relays not all people were reachable via the DM picker / add-user / tag flows (Tyler's "all people" goal). Two halves, both fixed: - Relay: the non-search general query path (POST /query, no `search` field) built an EventQuery but never read the raw `page` extension, so no SQL OFFSET was set. Only `search`-bearing filters route to handle_bridge_search (which has its own page/per_page). Empty kind:0 listings fell through here. New `extract_page_offset` wires page -> (page-1)*limit into query.offset; query_events already applies OFFSET with deterministic `created_at DESC, id ASC` ordering, so offset paging is stable. Only fires when `page` > 1 is explicitly present, so keyset/general queries are untouched. - Desktop: search_users' empty-query branch called list_user_search_results, which always returned next_cursor: None, so the frontend could never request page 2. Now emits Some(page+1) when the relay returned a full page (raw events.len() >= max), mirroring the existing NIP-50 search branch. Raw length is the correct fullness signal since list_user_search_results dedupes/truncates. Config-mode e2eBridge delegates empty-query paging to the real relay /query, so this fix makes that path correct too; the mock-store path already paged locally. Added Rust unit tests pinning the page->offset contract (absent/page-1 -> None, page N -> (N-1)*limit, missing limit -> None) — the durable server-side guard the JS mock structurally can't verify. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

…nings Rust Lint CI runs clippy with -D warnings, so the pre-existing `.as_ref().map(|c| c.as_slice())` on the Lane 1 thread-cursor path (bridge.rs) was a red gate regardless of provenance. `as_deref()` is the exact equivalent (Option<Vec<u8>> -> Option<&[u8]>). Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

The headline Lane-1 thread-backfill fix (get_thread_replies / useThreadReplies) was dead against a real relay: the desktop command built a KINDLESS /query filter, and the relay bridge p-gate (p_gated_filters_authorized) rejects any kindless filter with 403 before routing — a kindless filter 'could match' a p-gated kind, so it demands a #p tag we don't send. The relay routes the thread query purely off #e + depth_limit (kind does not filter it), but the gate still runs first. The e2e mock didn't model p-gating, so CI stayed green on a 403. Fixes: - messages.rs: give get_thread_replies' filter the non-p-gated TIMELINE_KINDS (extracted a shared const + build_thread_replies_filter helper mirroring the channel sibling, so the two filters can't drift). Guard unit tests assert the filter carries kinds and every kind is NOT in P_GATED_KINDS (the real invariant), plus composite-cursor paging. - e2eBridge.ts: model the relay p-gate for /query and WS REQ so a kindless #e read is now rejected in the mock too — closes the CI blind spot. Also fix the mock subtree walk to traverse nested replies transitively. - useThreadReplies.ts: clear the fetched-root mark on cancellation unless the paging loop completed, so an interrupted first fetch retries on reopen instead of silently stranding the feature. Narrow effect deps to primitive channel id/type to avoid object-identity churn. - Correct the false 'includes root at depth 0' contract at all three doc sites (messages.rs / tauri.ts / types.ts): the query keys on root_event_id, which root rows lack, so results are replies-only (depth >= 1). - thread.rs: fix the nested root-stub INSERT in insert_thread_metadata (column list omitted community_id while binding it, scrambling every placeholder) to match the correct production sibling in event.rs. Add depth-2 reachability and a direct stub-branch guard (Postgres-gated). - check-file-sizes.mjs: record load-bearing overrides for messages.rs (fix + guard tests crossed 1000 from 995) and tauri.ts (+3 doc-only lines). Verified: tauri 864/0, buzz-db 124/0 (incl. Postgres, serialized, with a red-check confirming the stub test fails on the pre-fix SQL), and a live clean-room relay at this tip: kindless #e -> 403, shipped kinded filter -> 200 with 250/250 dense same-second depth-1 replies + a depth-2 nested reply reached, root excluded. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

wesbillman · 2026-07-01T16:30:41Z

Non-blocking adversarial note: the new depth_limit thread bridge route appears to use the request filter for the up-front p-gate, but not for the rows it emits.

In crates/buzz-relay/src/api/bridge.rs, the special thread path calls get_thread_replies(...) and then only checks event_in_accessible_channel before pushing events. The general query path below it additionally applies filters_match(...), reader_authorized_for_event(...), and is_author_only_event(...) result-level checks.

I don't see a current leak path from this PR because the sensitive p-gated / author-only kinds I checked don't appear to be channel-scoped thread rows today. But the invariant feels easy to regress later: a future sensitive channel-scoped kind with thread metadata could be returned here even though the request advertised only safe TIMELINE_KINDS to clear the p-gate.

Suggested hardening: mirror the general path's result-level filtering/auth checks in the thread route before serializing each reply. That keeps the auth gate and emitted rows coupled, rather than relying on today's set of thread-metadata-capable kinds.

Brings the branch current with main (~20 commits, incl. relay perf #1453/#1454 and mention ranking #1431). One conflict resolved in useMentions.ts: kept main's rankMentionCandidates pipeline, re-applied this branch's suggestion slice change (Math.max(MENTION_SUGGESTION_LIMIT, mentionCandidates.length)). Verified post-merge: tsc --noEmit clean, biome check clean, desktop unit tests 1475/1475, cargo check --workspace clean. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

…into HEAD * origin/paul/nip-am-agent-turn-metrics: fix(profile): consolidate agent profile runtime metadata (#1451) fix(desktop): simplify workspace rail badges (#1462) perf(desktop): instant channel switching — non-blocking first paint, persisted snapshots (#1452) perf(relay): bounded-concurrency multi-filter query execution (S2) (#1457) fix(desktop): classify timeline prepends so history loads don't bump unread (#1416) fix(desktop): quiet gate for workspace switches instead of boot splash (#1449) fix(read-path): reach complete threads, dense-second timelines, and all people in the GUI (#1418) E1+E3: reduce relay ingest/fan-out DB round trips; ack p99 −7–16%, fd p99 −6–28%, p999 tails −29–53% vs PR #1453 tip (#1454) perf(relay): defer post-commit dispatch and avoid verify clone (#1453) fix(relay): include git hook tools in runtime image (#1326) feat(chart): per-pod emptyDir git scratch when persistence disabled (multi-replica HA) (#1450) fix(relay): remove media bearer-token auth (#1444) fix(desktop): stop search shortcut from hijacking the sidebar (#1447) Co-authored-by: Will Pfleger <pfleger.will@gmail.com> Signed-off-by: Will Pfleger <pfleger.will@gmail.com>

npub12gtutshhh76rx0jx697f32f9tffd4hhp3hx58fp4x6u4uemkm7sqf8f757 and others added 10 commits June 30, 2026 20:59

fix(desktop): page user search reachability

b5e3c3c

Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

test(desktop): cover people search pagination

4e6c827

Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>

wesbillman approved these changes Jul 1, 2026

View reviewed changes

wesbillman mentioned this pull request Jul 2, 2026

perf(desktop): instant channel switching — non-blocking first paint, persisted snapshots #1452

Merged

tlongwell-block merged commit 7da936f into main Jul 2, 2026
29 checks passed

tlongwell-block deleted the feat/read-path-lane1-threads branch July 2, 2026 14:36

tlongwell-block mentioned this pull request Jul 2, 2026

perf(relay): bounded-concurrency multi-filter query execution (S2) #1457

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(read-path): reach complete threads, dense-second timelines, and all people in the GUI#1418

fix(read-path): reach complete threads, dense-second timelines, and all people in the GUI#1418
tlongwell-block merged 11 commits into
mainfrom
feat/read-path-lane1-threads

tlongwell-block commented Jul 1, 2026 •

edited

Loading

Uh oh!

wesbillman commented Jul 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

tlongwell-block commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What & why

Lane 1 — messages + threads (Eva)

Lane 2 — people search / mentions / members (Wren)

The blocker a live relay surfaced (and its fix)

Verification

Notes for review

Uh oh!

wesbillman commented Jul 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tlongwell-block commented Jul 1, 2026 •

edited

Loading