fix(desktop): rank and dedupe channel member search results #567
Closed
tlongwell-block wants to merge 3 commits into
The 'Add members' search in the channel UI was silently hiding people:
relay-returned kind:0 events were filtered in arrival order and the
loop broke as soon as it had `limit` matches. Match #1500 was lost
behind 8 earlier matches every time, producing the reported
'consistent' missing-user behavior.
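A minimal sketch of the failure mode described above, not the actual `search_users` code. The `Profile` struct and `search_arrival_order` function are hypothetical stand-ins; the point is only that breaking at `limit` in arrival order can drop the best match.

```rust
/// Hypothetical stand-in for a parsed kind:0 profile.
#[derive(Debug, Clone, PartialEq)]
struct Profile {
    pubkey: String,
    display_name: String,
}

/// Mimics the buggy behavior: keep matches in arrival order, stop at `limit`.
fn search_arrival_order(profiles: &[Profile], query: &str, limit: usize) -> Vec<Profile> {
    let q = query.to_lowercase();
    let mut out = Vec::new();
    for p in profiles {
        if p.display_name.to_lowercase().contains(&q) {
            out.push(p.clone());
            if out.len() >= limit {
                break; // later (possibly better) matches are never examined
            }
        }
    }
    out
}

fn main() {
    // Eight weak substring matches arrive first; the exact match arrives last.
    let mut profiles: Vec<Profile> = (0..8)
        .map(|i| Profile {
            pubkey: format!("pk{i}"),
            display_name: format!("malice-{i}"),
        })
        .collect();
    profiles.push(Profile {
        pubkey: "pk-alice".into(),
        display_name: "alice".into(),
    });

    let hits = search_arrival_order(&profiles, "alice", 8);
    let found = hits.iter().any(|p| p.display_name == "alice");
    println!("exact match returned: {found}"); // prints "exact match returned: false"
}
```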
Two compounding bugs:
1. `ChannelMemberInviteCard` requested only 8 results (hook default).
   Bumped to 25; the backend already clamps at 50.
2. `commands::profile::search_users` had no ranking and truncated in
   arrival order. Replaced the inline filter with a new pure helper
   `nostr_convert::filter_and_rank_user_search` that:
   a. Dedupes the incoming events to the latest kind:0 per pubkey
      (max created_at, tiebreak min event id). kind:0 is replaceable
      per NIP-01, so an older stale profile must never outrank the
      live one regardless of how well its content happens to match.
   b. Scores every remaining match (exact > prefix > substring;
      display_name > nip05 > pubkey), sorts, then truncates.
The ranking mirrors the ORDER BY in `sprout-db::user::search_users`
so the relay path and the DB path stay consistent.
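The dedupe step (a) could look roughly like the sketch below. This is an illustration under assumptions, not the code in `nostr_convert.rs`: the `Kind0` struct, its field names, and the `dedupe_latest` and `search` functions are hypothetical. It shows why deduping before matching makes a stale-only match drop the pubkey.

```rust
use std::collections::HashMap;

/// Hypothetical minimal view of a kind:0 event; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct Kind0 {
    id: String, // hex event id
    pubkey: String,
    created_at: u64,
    display_name: String,
}

/// Keep one event per pubkey: highest `created_at`, ties broken by lowest id.
fn dedupe_latest(events: Vec<Kind0>) -> Vec<Kind0> {
    let mut latest: HashMap<String, Kind0> = HashMap::new();
    for ev in events {
        let replace = match latest.get(&ev.pubkey) {
            None => true,
            Some(cur) => {
                ev.created_at > cur.created_at
                    || (ev.created_at == cur.created_at && ev.id < cur.id)
            }
        };
        if replace {
            latest.insert(ev.pubkey.clone(), ev);
        }
    }
    let mut out: Vec<Kind0> = latest.into_values().collect();
    out.sort_by(|a, b| a.pubkey.cmp(&b.pubkey)); // deterministic output order
    out
}

/// Dedupe first, then match, so only live profiles can ever be returned.
fn search(events: Vec<Kind0>, query: &str) -> Vec<Kind0> {
    let q = query.to_lowercase();
    dedupe_latest(events)
        .into_iter()
        .filter(|e| e.display_name.to_lowercase().contains(&q))
        .collect()
}

fn main() {
    let stale = Kind0 {
        id: "01".into(),
        pubkey: "pk1".into(),
        created_at: 100,
        display_name: "alice".into(),
    };
    let live = Kind0 {
        id: "02".into(),
        pubkey: "pk1".into(),
        created_at: 200,
        display_name: "bob".into(),
    };
    // The stale profile matches "alice" but the live one does not,
    // so the pubkey is dropped from the results entirely.
    let hits = search(vec![stale, live], "alice");
    println!("hits for 'alice': {}", hits.len()); // prints "hits for 'alice': 0"
}
```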
Also bumped the file-size override on `nostr_convert.rs` (870->1170)
to fit the new helper and 11 unit tests, including direct regressions
for the late-match-dropped-under-limit bug and the three replaceable-
event dedupe semantics (stale-better-rank loses, live profile wins,
stale-only-match drops the pubkey).
cargo test nostr_convert::tests:: -> 35/35 pass
pnpm typecheck + pnpm check (biome + file-sizes) -> clean
cargo fmt --check -> clean
Signed-off-by: Tyler Longwell <109685178+tlongwell-block@users.noreply.github.com>
Self-review of the channel-member-search ranker turned up one under-tested corner: the NIP-01 'lowest event id retained' tiebreak that kicks in when two replaceable kind:0 events for one pubkey share `created_at`. The dedupe logic handled it, but nothing locked the behavior in, and silent regression there would put a stale profile back on top.
Test builds two events with the same Keys + custom_created_at, derives the expected winner from `a.id < b.id` (so the test is independent of which random hash happens to be smaller), then asserts both input orderings produce the same single result with the expected display name.
Bumps the file-size override on nostr_convert.rs from 1170 to 1210 to fit the new test.
Signed-off-by: Tyler Longwell <109685178+tlongwell-block@users.noreply.github.com>
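The shape of that test can be sketched as follows. This is an assumption-laden illustration: the `Kind0` struct and `winner` function are hypothetical, and the ids here are fixed strings rather than real event hashes. The expected winner is derived from the id comparison rather than hard-coded, and both input orderings are checked.

```rust
use std::cmp::Reverse;

/// Hypothetical minimal view of a kind:0 event for a single pubkey.
#[derive(Debug, Clone, PartialEq)]
struct Kind0 {
    id: String,
    created_at: u64,
    display_name: String,
}

/// Winner among events for one pubkey: newest first, then lowest id.
fn winner(events: &[Kind0]) -> Option<&Kind0> {
    events
        .iter()
        .min_by_key(|e| (Reverse(e.created_at), e.id.clone()))
}

fn main() {
    let a = Kind0 { id: "f3".into(), created_at: 100, display_name: "first".into() };
    let b = Kind0 { id: "0a".into(), created_at: 100, display_name: "second".into() };

    // Derive the expectation from the ids instead of assuming which is smaller.
    let expected_name = if a.id < b.id {
        a.display_name.clone()
    } else {
        b.display_name.clone()
    };

    // Both input orderings must select the same event.
    let ab = [a.clone(), b.clone()];
    let ba = [b.clone(), a.clone()];
    assert_eq!(winner(&ab).unwrap().display_name, expected_name);
    assert_eq!(winner(&ba).unwrap().display_name, expected_name);
    println!("tiebreak is order-independent");
}
```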
d2304d4 to
9d12187
Compare
Closing in favor of #__ (link to follow): same bug, but the new PR catches a third compounding issue (Typesense tokenization of raw kind:0 JSON content, which made vanilla NIP-50 silently miss single-word display names). Net result: server-side ranking via NIP-50 + indexer fix + a smaller client-side ranker than this PR's helper. Branch is
Replacement PR: #569
tlongwell-block added a commit that referenced this pull request on May 13, 2026
# Problem
The 'Add members' search in channels silently hid users. Two compounding
bugs, plus a third discovered while testing:
1. **Backend cap of 1000 events.** The Tauri `search_users` command
fetched up to 2000 kind:0 events via POST /query and grepped them
client-side, but the HTTP bridge clamps queries at 1000 events. On
relays with more than 1000 live profiles, the rest were invisible.
2. **Frontend cap of 8 results.** `ChannelMemberInviteCard` requested
only 8 results from `useUserSearchQuery`, with no relevance ranking.
3. **Typesense tokenizer doesn't extract names from raw JSON content.**
This one cost the most time. Kind:0 events store their structured
data as a JSON blob (`{"display_name":"alice",...}`), and Typesense's
default tokenizer glues the leading `"` onto the next word, producing
a token like `"alice` that doesn't match a clean `q=alice`. Empirical
result before this fix: NIP-50 search for "Bob" finds zero users
named Bob, even though their docs are indexed and retrievable.
# Fix
Three changes, smallest surface that actually solves the bug:
**1. Desktop `search_users` uses NIP-50** (`commands/profile.rs`).
Sends `{kinds:[0], search:q, limit:50}` instead of fetching every
kind:0 and grepping. The bridge already routes `search` to Typesense
(`api/bridge.rs::handle_bridge_search`) — we just weren't using it.
Off the 1000-event cap. Scales to any relay size.
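For illustration, the filter body could be built as in the sketch below. This is not the code in `commands/profile.rs`: the function names are hypothetical, and the hand-rolled escaping is only there to keep the example dependency-free. Real code would more likely use a JSON library such as `serde_json`. The clamp at 50 follows the limit stated above.

```rust
/// Escape a string for embedding in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Upper bound on results, matching the limit of 50 mentioned above.
const MAX_LIMIT: usize = 50;

/// Build a NIP-50 search filter for kind:0 profiles (hypothetical helper).
fn nip50_profile_filter(query: &str, limit: usize) -> String {
    format!(
        r#"{{"kinds":[0],"search":"{}","limit":{}}}"#,
        json_escape(query.trim()),
        limit.min(MAX_LIMIT)
    )
}

fn main() {
    println!("{}", nip50_profile_filter("alice", 25));
    // prints {"kinds":[0],"search":"alice","limit":25}
}
```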
**2. New client-side ranker** (`nostr_convert.rs::rank_user_search_results`).
Re-ranks the ≤50 Typesense hits by exact > prefix > substring on
display_name (or name) > nip05 > pubkey-hex. Drops hits that only
matched on `about`/`website` — those are noise in autocomplete.
Dedupes by pubkey defensively. 13 new unit tests cover scoring,
dedupe, edge cases (empty query, non-kind:0 hits, name-only profiles).
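A rough sketch of such a ranker, under stated assumptions rather than the real `rank_user_search_results`: the `Hit` struct is hypothetical, match type is assumed to dominate field (an exact nip05 match outranks a prefix display-name match), and an empty query is assumed to return nothing.

```rust
use std::collections::HashSet;

/// Hypothetical parsed search hit; only rankable fields are kept.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    pubkey: String,
    display_name: Option<String>,
    name: Option<String>,
    nip05: Option<String>,
}

/// Lower score ranks first; `None` means no rankable field matched.
fn score(hit: &Hit, q: &str) -> Option<u8> {
    let fields = [
        hit.display_name.as_deref().or(hit.name.as_deref()),
        hit.nip05.as_deref(),
        Some(hit.pubkey.as_str()),
    ];
    let mut best: Option<u8> = None;
    for (field_rank, field) in fields.iter().enumerate() {
        if let Some(value) = field {
            let v = value.to_lowercase();
            let match_rank = if v == q {
                0
            } else if v.starts_with(q) {
                1
            } else if v.contains(q) {
                2
            } else {
                continue;
            };
            let s = (match_rank * 3 + field_rank) as u8;
            best = Some(best.map_or(s, |b| b.min(s)));
        }
    }
    best
}

/// Dedupe by pubkey, drop unrankable hits, sort by score, truncate.
fn rank(hits: Vec<Hit>, query: &str, limit: usize) -> Vec<Hit> {
    let q = query.trim().to_lowercase();
    if q.is_empty() {
        return Vec::new(); // assumption: empty query yields no results
    }
    let mut seen = HashSet::new();
    let mut scored: Vec<(u8, Hit)> = hits
        .into_iter()
        .filter(|h| seen.insert(h.pubkey.clone())) // first hit per pubkey wins
        .filter_map(|h| score(&h, &q).map(|s| (s, h)))
        .collect();
    scored.sort_by_key(|(s, _)| *s); // stable: ties keep server order
    scored.into_iter().take(limit).map(|(_, h)| h).collect()
}

fn main() {
    let hit = |pk: &str, dn: &str| Hit {
        pubkey: pk.into(),
        display_name: Some(dn.into()),
        name: None,
        nip05: None,
    };
    let hits = vec![
        hit("aa01", "malice"), // substring match
        hit("bb02", "zed"),    // matched only on bio server-side: dropped here
        hit("cc03", "alice"),  // exact match
        hit("dd04", "alice2"), // prefix match
    ];
    let names: Vec<String> = rank(hits, "alice", 25)
        .into_iter()
        .filter_map(|h| h.display_name)
        .collect();
    println!("{names:?}"); // prints ["alice", "alice2", "malice"]
}
```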
**3. Indexer flattens kind:0 content for tokenization**
(`sprout-search/src/index.rs::flatten_kind0_for_indexing`). For
kind:0 only, parse the JSON and append `display_name`/`name`/`nip05`
values to the indexed `content` string with whitespace separators.
The Typesense `content` field is write-only (the bridge fetches
canonical events from Postgres by id after Typesense returns hits —
bridge.rs:471), so appending derived tokens is safe. `about` and
`website` are deliberately excluded to avoid name-prefix false
positives. 10 new unit tests including malformed JSON tolerance,
non-kind:0 untouched, and ordering of appended tokens.
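The appending step might be sketched as below. This is an illustration, not `flatten_kind0_for_indexing` itself: it assumes the JSON has already been parsed into a hypothetical `ProfileFields` struct, and the append order (display_name, name, nip05) is an assumption.

```rust
/// Hypothetical already-parsed subset of a kind:0 content object.
struct ProfileFields<'a> {
    display_name: Option<&'a str>,
    name: Option<&'a str>,
    nip05: Option<&'a str>,
}

/// Append name-like values to the raw content, separated by whitespace,
/// so a tokenizer sees them as clean standalone words.
fn flatten_for_indexing(raw_content: &str, fields: &ProfileFields) -> String {
    let mut out = String::from(raw_content);
    for value in [fields.display_name, fields.name, fields.nip05]
        .into_iter()
        .flatten()
    {
        let v = value.trim();
        if !v.is_empty() {
            out.push(' ');
            out.push_str(v);
        }
    }
    out
}

fn main() {
    let raw = r#"{"display_name":"alice","about":"alice's friend"}"#;
    let fields = ProfileFields {
        display_name: Some("alice"),
        name: None,
        nip05: None,
    };
    // `about` is not part of ProfileFields, so it is never appended.
    println!("{}", flatten_for_indexing(raw, &fields));
}
```

Because the raw content is kept as a prefix, nothing that was indexed before is lost; the derived tokens are purely additive.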
Frontend limit bumped 8 → 25 so the client-side ranker has room to
refine the server's ranking before truncation.
# Backfill
New / updated kind:0 events index correctly automatically. Existing
docs need a one-time reindex:
    just reindex-kind0
The new `sprout-reindex-kind0` binary streams kind:0s from Postgres
in 500-row batches through `SearchService::index_batch` and exits.
Idempotent (Typesense upsert).
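The batching and idempotency can be illustrated with an in-memory stand-in. Everything here is hypothetical (`Doc`, `FakeIndex`, `reindex`); the real binary streams rows from Postgres and calls `SearchService::index_batch`, whose signature is not shown in this PR. The sketch only demonstrates that upsert-by-id makes a second run a no-op on the document count.

```rust
use std::collections::HashMap;

/// Hypothetical document to index.
struct Doc {
    id: String,
    content: String,
}

/// Hypothetical in-memory index; upsert by id makes re-runs idempotent.
#[derive(Default)]
struct FakeIndex {
    docs: HashMap<String, String>,
    batches: usize,
}

impl FakeIndex {
    fn index_batch(&mut self, batch: &[Doc]) {
        for d in batch {
            self.docs.insert(d.id.clone(), d.content.clone()); // upsert
        }
        self.batches += 1;
    }
}

/// Batch size taken from the description above.
const BATCH_SIZE: usize = 500;

/// Feed all rows to the index in fixed-size batches.
fn reindex(rows: &[Doc], index: &mut FakeIndex) {
    for batch in rows.chunks(BATCH_SIZE) {
        index.index_batch(batch);
    }
}

fn main() {
    let rows: Vec<Doc> = (0..1200)
        .map(|i| Doc {
            id: format!("ev{i}"),
            content: format!("profile {i}"),
        })
        .collect();

    let mut index = FakeIndex::default();
    reindex(&rows, &mut index);
    reindex(&rows, &mut index); // second run changes nothing but the batch count

    println!("docs={} batches={}", index.docs.len(), index.batches);
    // prints "docs=1200 batches=6"
}
```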
# Tests / verification
- `cargo test -p sprout-search --lib` → 21 pass (10 new)
- `cargo test --lib nostr_convert` (desktop) → 37 pass (13 new)
- `cargo test -p sprout-relay --lib` → 147 pass (no regressions)
- `cargo fmt --all --check` → clean
- `cargo clippy -p sprout-search -p sprout-relay --all-targets -- -D warnings` → clean
- `pnpm typecheck` + `pnpm check` → clean
- End-to-end: built relay in screen, seeded 8 kind:0 profiles with
varied name patterns, ran `just reindex-kind0`, then queried via
`sprout get-users --name` for: alice, Bob, Lev, Banana, Zed, mal,
testbot, charlie. All returned the right user(s). Before the indexer
fix the same queries returned 0 or only false-positive bio matches.
# Replaces
Closes #567 — same UX bug; this PR's solution is more correct (server
side instead of larger-fetch client side) and now also handles the
single-word-name case that #567 didn't notice.
# Things to watch
- The reindex must be run once after deploy for the fix to take effect
on already-indexed profiles. Documented in justfile + binary header.
- `flatten_kind0_for_indexing` only extracts `display_name`/`name`/`nip05`.
If we ever want fuzzy bio search we'd need a different field weight
approach (likely dedicated Typesense fields). Out of scope here.
Signed-off-by: Tyler Longwell <109685178+tlongwell-block@users.noreply.github.com>
Problem
The "Add members" search in the channel UI silently hid users. When a user typed a name or pubkey, some people consistently never appeared in the dropdown while others did, and the hidden set was stable across attempts.
Two compounding bugs:
1. Frontend cap of 8. `ChannelMemberInviteCard` requested only 8 results from `useUserSearchQuery` (which also defaults to 8). The backend would have accepted up to 50.
2. Backend had no ranking and truncated in arrival order. `commands::profile::search_users` fetched up to 2000 kind:0 events from the relay, scanned them in whatever order the relay returned them, and `break`ed as soon as `users.len() >= max`. If the match for "alice" was relay event #1500 and 8 other matches appeared earlier, alice was lost every time. Relay return order is stable-ish, so the same people stayed hidden, matching the reported symptom exactly.
The DB-side function in `crates/sprout-db/src/user.rs` already had proper ranking (exact > prefix > contains). The Tauri relay path just wasn't using equivalent logic.
Fix
1. Bumped the limit requested by `ChannelMemberInviteCard` from 8 → 25. Backend already clamps at 50.
2. Replaced the inline filter in `commands::profile::search_users` with a new pure helper `nostr_convert::filter_and_rank_user_search(events, query, limit)` that:
   a. Dedupes the incoming events to the latest kind:0 per pubkey (max `created_at`, tiebreak min event id per NIP-01). kind:0 is replaceable, so a stale older profile must never outrank the live one regardless of how well its content happens to match.
   b. Scores every remaining match (exact > prefix > substring; display_name > nip05 > pubkey), sorts, then truncates.
The ranking mirrors the `ORDER BY` in `sprout-db::user::search_users` so the relay path and the DB path stay consistent.
Tests
12 new unit tests in `nostr_convert::tests`, including:
- a direct regression for the late-match-dropped-under-limit bug;
- the three replaceable-event dedupe semantics (stale-better-rank loses, live profile wins, stale-only-match drops the pubkey);
- the NIP-01 lowest-event-id tiebreak when two kind:0 events for one pubkey share `created_at`.
Notes / things to watch
- `NewDirectMessageDialog` still passes `limit: 8` to the same hook by design (DM groups are capped at 8 recipients). Same backend bug used to affect it; with this fix the ranking is correct globally, so its limit-of-8 now reflects the intended UX.
- The DB path sorts on the raw `display_name`; the new client path sorts on the lowercased value. This gives different orderings for mixed-case names within the same rank tier. Not user-visible at limit 25; left as-is for simpler code.
- Bumped the file-size override on `nostr_convert.rs` (870 → 1210) to fit the new helper and tests.
Files