
fix: channel member search via NIP-50 + Typesense indexer fix for kind:0 #569

Merged: tlongwell-block merged 2 commits into main from fix/channel-add-members-nip50, May 13, 2026

@tlongwell-block
Collaborator

Problem

The 'Add members' search in channels silently hid users. Three compounding bugs:

  1. Backend cap of 1000 events. The Tauri search_users command fetched up to 2000 kind:0 events via POST /query and grepped them client-side, but the HTTP bridge clamps queries at 1000 events. On relays with more than 1000 live profiles, the rest were invisible.

  2. Frontend cap of 8 results. ChannelMemberInviteCard requested only 8 results from useUserSearchQuery, with no relevance ranking.

  3. Typesense tokenizer doesn't extract names from raw JSON content. This one cost the most time. Kind:0 events store their structured data as a JSON blob ({"display_name":"alice",...}), and Typesense's default tokenizer glues the leading " onto the next word, producing a token like "alice that doesn't match a clean q=alice. Empirical result before this fix: NIP-50 search for "Bob" finds zero users named Bob, even though their docs are indexed and retrievable. Only multi-word names like "Alice Wonderland" matched, on the second word ("Wonderland").

Fix

Three changes, smallest surface that actually solves the bug:

1. Desktop search_users uses NIP-50 (commands/profile.rs).
Sends {kinds:[0], search:q, limit:50} instead of fetching every kind:0 and grepping. The bridge already routes search to Typesense (api/bridge.rs::handle_bridge_search); we just weren't using it. Search no longer depends on the 1000-event cap, so it keeps working as the number of profiles on a relay grows.

2. New client-side ranker (nostr_convert.rs::rank_user_search_results).
Re-ranks the ≤50 Typesense hits by exact > prefix > substring on display_name (or name) > nip05 > pubkey-hex. Drops hits that only matched on about/website — those are noise in autocomplete. Dedupes by pubkey defensively.

3. Indexer flattens kind:0 content for tokenization (sprout-search/src/index.rs::flatten_kind0_for_indexing).
For kind:0 only, parse the JSON and append display_name/name/nip05 values to the indexed content string with whitespace separators. The Typesense content field is write-only (the bridge fetches canonical events from Postgres by id after Typesense returns hits — bridge.rs:471), so appending derived tokens is safe and doesn't affect any read path. about and website are deliberately excluded to avoid name-prefix false positives.
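The flattening step described in item 3 can be sketched roughly as follows. This is an illustration, not the code in sprout-search: the `ProfileFields` struct and the `flatten_for_indexing` signature are hypothetical, and JSON parsing is assumed to happen elsewhere (for example with serde_json) so the sketch stays std-only. Passing `None` models content that failed to parse.

```rust
/// Name-like fields pulled out of a kind:0 profile's JSON content.
/// Hypothetical type: parsing is assumed to happen before this step.
#[derive(Default)]
pub struct ProfileFields {
    pub display_name: Option<String>,
    pub name: Option<String>,
    pub nip05: Option<String>,
}

/// Builds the string handed to the search index.
/// For kind:0 with parsed fields, the original content is kept and the
/// name-like values are appended, separated by single spaces, so a
/// whitespace-based tokenizer sees them as clean standalone tokens.
/// Any other kind, or unparseable content (`None`), is returned unchanged.
pub fn flatten_for_indexing(
    kind: u32,
    content: &str,
    parsed: Option<&ProfileFields>,
) -> String {
    let fields = match (kind, parsed) {
        (0, Some(f)) => f,
        _ => return content.to_string(),
    };
    let mut out = content.to_string();
    // `about` and `website` are deliberately not part of this list.
    for value in [&fields.display_name, &fields.name, &fields.nip05] {
        if let Some(v) = value {
            let v = v.trim();
            if !v.is_empty() {
                out.push(' ');
                out.push_str(v);
            }
        }
    }
    out
}
```

Because the original JSON stays at the front of the string, the appended values only add tokens; nothing that was previously indexed is removed.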

Frontend limit bumped 8 → 25, so the server-ranked hits can be refined client-side before the list is truncated.
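As a rough illustration of the ranking order from item 2 (exact, then prefix, then substring on the display name or name, then nip05, then pubkey hex), here is a minimal sketch. It is not the code in nostr_convert.rs: `UserHit`, `score`, and `rank_hits` are hypothetical names, and several details are assumptions, namely case-insensitive matching, substring matching on nip05, prefix matching on the pubkey, and returning nothing for an empty query.

```rust
use std::collections::HashSet;

/// Hypothetical, already-parsed search hit.
#[derive(Debug, Clone, PartialEq)]
pub struct UserHit {
    pub pubkey: String,
    pub display_name: Option<String>,
    pub name: Option<String>,
    pub nip05: Option<String>,
}

/// Lower scores sort first. `None` means no name-like field matched,
/// e.g. the hit only matched on `about` or `website`.
fn score(hit: &UserHit, q: &str) -> Option<u8> {
    // Prefer display_name; fall back to name when it is missing or blank.
    let label = hit
        .display_name
        .as_deref()
        .filter(|s| !s.trim().is_empty())
        .or(hit.name.as_deref())
        .unwrap_or("")
        .trim()
        .to_lowercase();
    if label == q {
        return Some(0);
    }
    if label.starts_with(q) {
        return Some(1);
    }
    if label.contains(q) {
        return Some(2);
    }
    if hit.nip05.as_deref().map_or(false, |n| n.to_lowercase().contains(q)) {
        return Some(3);
    }
    if hit.pubkey.to_lowercase().starts_with(q) {
        return Some(4);
    }
    None
}

/// Dedupes by pubkey, drops unscored hits, sorts by score, truncates.
pub fn rank_hits(hits: Vec<UserHit>, query: &str, limit: usize) -> Vec<UserHit> {
    let q = query.trim().to_lowercase();
    if q.is_empty() {
        return Vec::new(); // assumption: an empty query yields no results
    }
    let mut seen = HashSet::new();
    let mut scored: Vec<(u8, UserHit)> = hits
        .into_iter()
        .filter(|h| seen.insert(h.pubkey.clone()))
        .filter_map(|h| score(&h, &q).map(|s| (s, h)))
        .collect();
    // Stable sort: ties keep the order the search backend returned.
    scored.sort_by_key(|(s, _)| *s);
    scored.into_iter().take(limit).map(|(_, h)| h).collect()
}
```

Using a stable sort means the backend's own relevance order still breaks ties within each score bucket.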

Backfill — required once after deploy

New / updated kind:0 events index correctly automatically. Existing kind:0 docs need a one-time reindex:

just reindex-kind0

The new sprout-reindex-kind0 binary streams kind:0s from Postgres in 500-row batches through SearchService::index_batch and exits. Idempotent (Typesense upsert). Safe to run repeatedly.
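To show why an upsert-based backfill is safe to repeat, here is a small in-memory sketch. It is not the sprout-reindex-kind0 binary: `Doc`, `InMemoryIndex`, and `reindex` are hypothetical stand-ins, a `HashMap` keyed by document id plays the role of the Typesense upsert, and the rows are a slice rather than a Postgres stream.

```rust
use std::collections::HashMap;

/// Hypothetical document: an event id plus the content to index.
pub struct Doc {
    pub id: String,
    pub content: String,
}

/// Stand-in for the search index; inserting by id behaves like an upsert.
pub struct InMemoryIndex {
    docs: HashMap<String, String>,
}

impl InMemoryIndex {
    pub fn new() -> Self {
        InMemoryIndex { docs: HashMap::new() }
    }

    /// Upserts a whole batch: existing ids are overwritten, new ids added.
    pub fn index_batch(&mut self, batch: &[Doc]) {
        for d in batch {
            self.docs.insert(d.id.clone(), d.content.clone());
        }
    }

    pub fn len(&self) -> usize {
        self.docs.len()
    }
}

/// Feeds rows to the index in fixed-size batches and returns the batch count.
pub fn reindex(rows: &[Doc], index: &mut InMemoryIndex, batch_size: usize) -> usize {
    let mut batches = 0;
    // `chunks` panics on 0, so clamp the batch size to at least 1.
    for batch in rows.chunks(batch_size.max(1)) {
        index.index_batch(batch);
        batches += 1;
    }
    batches
}
```

Running `reindex` twice over the same rows leaves the index with the same set of documents, which is the property that makes a repeated run harmless.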

Tests

  • cargo test -p sprout-search --lib → 21 pass (10 new)
  • cargo test --lib nostr_convert (desktop) → 37 pass (13 new)
  • cargo test -p sprout-relay --lib → 147 pass (no regressions)
  • cargo fmt --all --check → clean
  • cargo clippy -p sprout-search -p sprout-relay --all-targets -- -D warnings → clean
  • pnpm typecheck + pnpm check → clean
  • Pre-push hooks (full workspace clippy + tests + desktop/mobile/web builds) → all green

Manual e2e

Seeded 8 kind:0 profiles locally with varied name patterns (alice, Bob, Lev, Banana Joe, Malice, alicia, alice-old, Zed). Ran just reindex-kind0. Queried sprout get-users --name <q> for: alice, Bob, Lev, Banana, Zed, mal, testbot, charlie. All returned the right user(s). Before the indexer fix the same queries returned 0 results (single-word names) or only false-positive bio matches.

Replaces

Closes #567. Same UX bug; the new PR is the more correct fix (server-side, indexed) and additionally handles the single-word-name case that #567 didn't catch.

Things to watch

  • The reindex must be run once after deploy for the fix to take effect on already-indexed profiles. Documented in the justfile recipe and the binary's module docs.
  • flatten_kind0_for_indexing only extracts display_name/name/nip05. If we ever want fuzzy bio search we'd need a different approach (likely dedicated Typesense fields with weighted query_by). Out of scope here.
  • The behaviour leans on Typesense default tokenization rules. If we ever swap engines or upgrade Typesense in a way that changes tokenization, the appended tokens may become redundant but won't break anything — the original JSON is still in the doc.

Files

crates/sprout-relay/Cargo.toml                            |   4 +
crates/sprout-relay/src/bin/reindex_kind0.rs              | 116 +++++++++++
crates/sprout-search/src/index.rs                         | 189 ++++++++++++++-
desktop/scripts/check-file-sizes.mjs                      |   2 +-
desktop/src-tauri/src/commands/profile.rs                 |  51 ++--
desktop/src-tauri/src/nostr_convert.rs                    | 256 +++++++++++++++++++++
desktop/src/features/channels/ui/ChannelMemberInviteCard.tsx |   4 +-
justfile                                                  |   7 +
8 files changed, 592 insertions(+), 35 deletions(-)

# Problem

The 'Add members' search in channels silently hid users. Two compounding
bugs, plus a third discovered while testing:

1. **Backend cap of 1000 events.** The Tauri `search_users` command
   fetched up to 2000 kind:0 events via POST /query and grepped them
   client-side, but the HTTP bridge clamps queries at 1000 events. On
   relays with more than 1000 live profiles, the rest were invisible.

2. **Frontend cap of 8 results.** `ChannelMemberInviteCard` requested
   only 8 results from `useUserSearchQuery`, with no relevance ranking.

3. **Typesense tokenizer doesn't extract names from raw JSON content.**
   This one cost the most time. Kind:0 events store their structured
   data as a JSON blob (`{"display_name":"alice",...}`), and Typesense's
   default tokenizer glues the leading `"` onto the next word, producing
   a token like `"alice` that doesn't match a clean `q=alice`. Empirical
   result before this fix: NIP-50 search for "Bob" finds zero users
   named Bob, even though their docs are indexed and retrievable.

# Fix

Three changes, smallest surface that actually solves the bug:

**1. Desktop `search_users` uses NIP-50** (`commands/profile.rs`).
   Sends `{kinds:[0], search:q, limit:50}` instead of fetching every
   kind:0 and grepping. The bridge already routes `search` to Typesense
   (`api/bridge.rs::handle_bridge_search`) — we just weren't using it.
   Off the 1000-event cap. Scales to any relay size.

**2. New client-side ranker** (`nostr_convert.rs::rank_user_search_results`).
   Re-ranks the ≤50 Typesense hits by exact > prefix > substring on
   display_name (or name) > nip05 > pubkey-hex. Drops hits that only
   matched on `about`/`website` — those are noise in autocomplete.
   Dedupes by pubkey defensively. 13 new unit tests cover scoring,
   dedupe, edge cases (empty query, non-kind:0 hits, name-only profiles).

**3. Indexer flattens kind:0 content for tokenization**
   (`sprout-search/src/index.rs::flatten_kind0_for_indexing`). For
   kind:0 only, parse the JSON and append `display_name`/`name`/`nip05`
   values to the indexed `content` string with whitespace separators.
   The Typesense `content` field is write-only (the bridge fetches
   canonical events from Postgres by id after Typesense returns hits —
   bridge.rs:471), so appending derived tokens is safe. `about` and
   `website` are deliberately excluded to avoid name-prefix false
   positives. 10 new unit tests including malformed JSON tolerance,
   non-kind:0 untouched, and ordering of appended tokens.

Frontend limit bumped 8 → 25 so server ranking has room to refine
client-side before truncation.

# Backfill

New / updated kind:0 events index correctly automatically. Existing
docs need a one-time reindex:

    just reindex-kind0

The new `sprout-reindex-kind0` binary streams kind:0s from Postgres
in 500-row batches through `SearchService::index_batch` and exits.
Idempotent (Typesense upsert).

# Tests / verification

- `cargo test -p sprout-search --lib`           → 21 pass (10 new)
- `cargo test --lib nostr_convert` (desktop)    → 37 pass (13 new)
- `cargo test -p sprout-relay --lib`            → 147 pass (no regressions)
- `cargo fmt --all --check`                     → clean
- `cargo clippy -p sprout-search -p sprout-relay --all-targets -- -D warnings` → clean
- `pnpm typecheck` + `pnpm check`                → clean
- End-to-end: built relay in screen, seeded 8 kind:0 profiles with
  varied name patterns, ran `just reindex-kind0`, then queried via
  `sprout get-users --name` for: alice, Bob, Lev, Banana, Zed, mal,
  testbot, charlie. All returned the right user(s). Before the indexer
  fix the same queries returned 0 or only false-positive bio matches.

# Replaces

Closes #567 — same UX bug; this PR's solution is more correct (server
side instead of larger-fetch client side) and now also handles the
single-word-name case that #567 didn't notice.

# Things to watch

- The reindex must be run once after deploy for the fix to take effect
  on already-indexed profiles. Documented in justfile + binary header.
- `flatten_kind0_for_indexing` only extracts `display_name`/`name`/`nip05`.
  If we ever want fuzzy bio search we'd need a different field weight
  approach (likely dedicated Typesense fields). Out of scope here.

Signed-off-by: Tyler Longwell <109685178+tlongwell-block@users.noreply.github.com>
@tlongwell-block force-pushed the fix/channel-add-members-nip50 branch from 7a63445 to c7040b9 on May 13, 2026 17:38
Switches the kind:0 backfill from OFFSET-based paging to a snapshot
ceiling (`until = Utc::now()` at start) plus a keyset cursor over
`(created_at, id)` matching the underlying
`ORDER BY created_at DESC, id ASC` index.

Closes the only correctness footgun raised in review (3-way review with
Sami + Quinn): under live write traffic, OFFSET could skip a row when a
new kind:0 arrived mid-run and shifted the window. Snapshot + keyset
gives strict run-once semantics — new arrivals fall outside the snapshot
and are handled by the live index path, no skips, no duplicates at page
boundaries.

Uses `query_events`'s existing `until` + `before_id` composite-cursor
support; that API was designed for exactly this pattern.
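
The snapshot-plus-keyset pattern described above can be modelled in memory as a rough sketch. This is not the `query_events` API: `Row`, `fetch_page`, and `backfill` are hypothetical, the table is a plain slice, and the cursor semantics (strictly after the last `(created_at, id)` pair in `created_at DESC, id ASC` order) are an assumption about how such a cursor would typically work.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub created_at: i64,
    pub id: String,
}

/// Returns one page in `created_at DESC, id ASC` order, limited to rows at
/// or below the snapshot ceiling and strictly after the cursor position.
pub fn fetch_page(
    table: &[Row],
    ceiling: i64,
    cursor: Option<&(i64, String)>,
    limit: usize,
) -> Vec<Row> {
    let mut page: Vec<Row> = table
        .iter()
        // Snapshot ceiling: rows newer than the start of the run are ignored.
        .filter(|r| r.created_at <= ceiling)
        // Keyset cursor: older timestamp, or same timestamp with a larger id.
        .filter(|r| match cursor {
            None => true,
            Some((ts, id)) => r.created_at < *ts || (r.created_at == *ts && r.id > *id),
        })
        .cloned()
        .collect();
    page.sort_by(|a, b| b.created_at.cmp(&a.created_at).then_with(|| a.id.cmp(&b.id)));
    page.truncate(limit);
    page
}

/// Walks every row under the ceiling exactly once, page by page.
pub fn backfill(table: &[Row], ceiling: i64, page_size: usize) -> Vec<Row> {
    let mut seen = Vec::new();
    let mut cursor: Option<(i64, String)> = None;
    loop {
        let page = fetch_page(table, ceiling, cursor.as_ref(), page_size);
        // An empty page means the walk is finished.
        let last = match page.last() {
            Some(r) => (r.created_at, r.id.clone()),
            None => break,
        };
        cursor = Some(last);
        seen.extend(page);
    }
    seen
}
```

Unlike OFFSET paging, the position of the next page depends only on the last row already seen, so a row inserted mid-run cannot shift the window; if it is newer than the ceiling it is simply left to the live index path.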

Also documents `SearchHit.content` to flag that for kind:0 it contains
the appended-token form (display_name/name/nip05), not the canonical
event content. All production read paths refetch the canonical
StoredEvent from Postgres by id, so this is invisible today — but the
doc-comment prevents a future feature from accidentally trusting the
field.

Signed-off-by: Tyler Longwell <109685178+tlongwell-block@users.noreply.github.com>
@tlongwell-block merged commit 9a403d3 into main on May 13, 2026
15 checks passed
@tlongwell-block deleted the fix/channel-add-members-nip50 branch on May 13, 2026 19:03