Conversation

@alltheseas

Summary

Closes #43

  • Add inbox_relays module that fetches, caches (1h TTL / 60s on error), and publishes kind 10050 relay lists (cache shape sketched after this list)
  • Route DM gift_wrap_to() to recipient's preferred inbox relays with automatic fallback to pool broadcast on failure
  • Publish own kind 10050 on connect, advertising read-capable relays as DM inboxes
  • Republish 10050 (debounced, 800ms) on every relay config change (add/remove/toggle/mode)
  • Exclude write-only relays from 10050; publish empty event to clear stale lists when no readable relays exist
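
A minimal sketch of the cache entry and TTL check implied by the first bullet. The names (CachedRelays, CACHE_TTL_SECS, CACHE_TTL_ERROR_SECS, fetched_at, fetch_ok) mirror those used in the review discussion further down; treat the shape as illustrative rather than the module's exact code.

use std::time::Instant;

// Illustrative constants matching the behavior described above:
// 1h TTL after a successful fetch, 60s retry window after a failed one.
const CACHE_TTL_SECS: u64 = 60 * 60;
const CACHE_TTL_ERROR_SECS: u64 = 60;

// Cached kind 10050 lookup result for one pubkey.
struct CachedRelays {
    relays: Vec<String>, // inbox relay URLs from the recipient's 10050
    fetched_at: Instant, // when the lookup completed
    fetch_ok: bool,      // false => fetch failed, use the short retry TTL
}

impl CachedRelays {
    // An entry is usable until its TTL (success or error) has elapsed.
    fn is_fresh(&self) -> bool {
        let ttl = if self.fetch_ok { CACHE_TTL_SECS } else { CACHE_TTL_ERROR_SECS };
        self.fetched_at.elapsed().as_secs() < ttl
    }
}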

Changed files

File                 Change
inbox_relays.rs      New — fetch, cache, publish, send helper + 9 unit tests
lib.rs               Register mod inbox_relays
message/sending.rs   DM send → inbox_relays::send_gift_wrap
message/mod.rs       Reaction & edit sends → inbox_relays::send_gift_wrap
pivx.rs              Payment & QR sends → inbox_relays::send_gift_wrap
commands/relays.rs   Publish 10050 on connect; republish on config changes

Test plan

  • cargo check — no new warnings
  • 9 unit tests pass (tag parsing, cache TTL/error, debounce coalescing); a tag-parsing sketch follows this list
  • Manual: send DM to user with kind 10050 published — verify gift wrap routed to their inbox relays
  • Manual: send DM to user without kind 10050 — verify fallback to pool broadcast
  • Manual: change relay config — verify 10050 republished with updated relay list
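
As a reference for the tag-parsing tests mentioned above, here is a sketch of the behavior they cover. The helper name parse_relay_tags is hypothetical and the tags are modeled as plain string vectors; the real parser operates on nostr event tags inside inbox_relays.rs.

// Hypothetical helper mirroring the behavior the tests describe: keep the URL
// from each ["relay", "<url>"] tag, ignore every other tag kind.
fn parse_relay_tags(tags: &[Vec<String>]) -> Vec<String> {
    tags.iter()
        .filter(|tag| tag.first().map(String::as_str) == Some("relay"))
        .filter_map(|tag| tag.get(1).cloned())
        .collect()
}

#[test]
fn extracts_urls_ignores_non_relay_and_handles_missing() {
    let tags = vec![
        vec!["relay".to_string(), "wss://inbox.example.com".to_string()],
        vec!["p".to_string(), "abcdef".to_string()], // non-relay tag: ignored
        vec!["relay".to_string()],                   // missing URL: skipped
    ];
    assert_eq!(
        parse_relay_tags(&tags),
        vec!["wss://inbox.example.com".to_string()]
    );
    assert!(parse_relay_tags(&[]).is_empty()); // empty tag list
}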

🤖 Generated with Claude Code

alltheseas and others added 2 commits February 9, 2026 22:14
…050)

Add inbox_relays module that fetches, caches, and publishes kind 10050
relay lists so DM gift wraps are delivered to each recipient's preferred
inbox relays instead of broadcasting to all pool relays.

- Fetch recipient's 10050 relay list with 1h cache (60s on error)
- Route gift_wrap_to() to inbox relays, fall back to pool on failure
- Publish own 10050 on connect advertising read-capable relays
- Republish 10050 (debounced) on every relay config change

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9 tests covering:
- Tag parsing: extracts URLs, ignores non-relay tags, handles empty/missing
- Cache: store/retrieve, TTL expiry, empty results, error short TTL
- Debounce: generation counter supersedes earlier calls (see the sketch below)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
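
A minimal sketch of the generation-counter debounce described above. The 800ms window comes from the PR summary; schedule_republish and REPUBLISH_GENERATION are illustrative names, not the module's actual items.

use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

// Monotonic generation counter; every relay config change bumps it.
static REPUBLISH_GENERATION: AtomicU64 = AtomicU64::new(0);
const DEBOUNCE_MS: u64 = 800;

// Called on every relay config change. Only the most recent caller within the
// debounce window actually runs `publish`; earlier pending tasks see a newer
// generation when they wake up and return without publishing.
fn schedule_republish<F>(publish: F)
where
    F: FnOnce() + Send + 'static,
{
    // Claim a new generation; any earlier pending task is now stale.
    let my_gen = REPUBLISH_GENERATION.fetch_add(1, Ordering::SeqCst) + 1;
    tokio::spawn(async move {
        tokio::time::sleep(Duration::from_millis(DEBOUNCE_MS)).await;
        if REPUBLISH_GENERATION.load(Ordering::SeqCst) == my_gen {
            // Still the latest request: publish the kind 10050 event here
            // (the real code would await an async publish call).
            publish();
        }
    });
}
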
@YuurinBee

Thanks for taking the time to work on and submit this PR! @JSKitty will review it in detail tomorrow.

@JSKitty left a comment

Great work on this one! Kind 10050 is an excellent improvement to DM delivery for cross-Nostr-client compatibility. Absolutely welcome this PR.

I've manually tested all three unticked items from your test plan:

  • Send DM to user with kind 10050 published — verified gift wrap routed to their inbox relays ✓
  • Send DM to user without kind 10050 — verified fallback to pool broadcast ✓
  • Change relay config — verified 10050 republished with updated relay list ✓

Tested between Vector and 0xChat, and confirmed by watching relays directly for kind 10050 events. All working correctly.

One improvement I'd like to see before merge — there's a cache stampede ("thundering herd") issue in get_or_fetch_inbox_relays. If a user sends multiple messages rapidly to the same recipient with a cold cache (think meme-spamming or rage-typing), each message independently fetches the same 10050 from relays in parallel. This wastes bandwidth and puts unnecessary pressure on relays.

Here's an explanation from my Claude on the why and how, with suggestions:


Thundering Herd Fix — Per-Key Fetch Serialization

The problem: get_or_fetch_inbox_relays drops the cache lock before starting the network fetch. During the ~5 second fetch window, every concurrent DM send to the same recipient sees a cache miss and spawns its own duplicate fetch. With N rapid messages, you get N identical relay queries.

Falling back to pool broadcast during the fetch window isn't ideal either — it defeats the purpose of inbox relay routing for those messages.

The fix: Double-checked locking with a per-key tokio::sync::Mutex. Cache hits remain lock-free (fast path unchanged), but cache misses for the same pubkey serialize through a per-key lock. The second caller waits for the first fetch to complete, then hits the cache on the double-check.

Scoped entirely to inbox_relays.rs — one new static, ~15 line refactor of get_or_fetch_inbox_relays, no signature changes, no new dependencies.

static FETCH_LOCKS: Lazy<std::sync::Mutex<HashMap<PublicKey, Arc<tokio::sync::Mutex<()>>>>> =
    Lazy::new(|| std::sync::Mutex::new(HashMap::new()));

async fn get_or_fetch_inbox_relays(client: &Client, pubkey: &PublicKey) -> Vec<String> {
    // Fast path: cache hit (no per-key lock needed)
    {
        let cache = INBOX_RELAY_CACHE.lock().unwrap();
        if let Some(entry) = cache.get(pubkey) {
            let ttl = if entry.fetch_ok { CACHE_TTL_SECS } else { CACHE_TTL_ERROR_SECS };
            if entry.fetched_at.elapsed().as_secs() < ttl {
                return entry.relays.clone();
            }
        }
    }

    // Per-key lock — serializes fetches for the same pubkey only
    let key_lock = {
        let mut locks = FETCH_LOCKS.lock().unwrap();
        locks.entry(*pubkey).or_default().clone()
    };
    let _guard = key_lock.lock().await;

    // Double-check: another task may have filled the cache while we waited
    {
        let cache = INBOX_RELAY_CACHE.lock().unwrap();
        if let Some(entry) = cache.get(pubkey) {
            let ttl = if entry.fetch_ok { CACHE_TTL_SECS } else { CACHE_TTL_ERROR_SECS };
            if entry.fetched_at.elapsed().as_secs() < ttl {
                return entry.relays.clone();
            }
        }
    }

    // We won the race — do the actual fetch
    let result = fetch_inbox_relays(client, pubkey).await;

    {
        let mut cache = INBOX_RELAY_CACHE.lock().unwrap();
        cache.insert(
            *pubkey,
            CachedRelays {
                relays: result.relays.clone(),
                fetched_at: Instant::now(),
                fetch_ok: result.fetch_ok,
            },
        );
    }

    result.relays
}

Behavior during a 10-message spam burst:

  1. Message 1: cache miss → acquires per-key lock → starts fetch
  2. Messages 2–10: cache miss → .lock().await on the same key (non-blocking for other pubkeys)
  3. Fetch completes → result cached → lock released
  4. Messages 2–10 wake up → double-check cache → hit → all use the real relay list

No duplicate fetches, no pool broadcast fallback, different pubkeys never block each other.

…d locking

When multiple messages are sent rapidly to the same recipient with a cold
cache, each would spawn duplicate fetches for the same kind 10050 event.
This wasted bandwidth and put unnecessary pressure on relays.

Implementation:
- Extracted get_or_fetch_with_lock: generic function with injectable fetch
- Production get_or_fetch_inbox_relays wraps it with fetch_inbox_relays
- Tests use same function with mock fetch, eliminating code duplication
- Double-checked locking: fast path (cache hit) → per-key lock → double-check
- Only first concurrent caller fetches; others wait ~5s then hit cache
- Different pubkeys never block each other (per-key async Mutex)

Memory management (strictly bounded, even on cancellation/panic; see the sketch after this commit message):
- FETCH_LOCKS stores Weak<Mutex> references, not Arc
- Mutex allocation freed when Arc refcount drops (immediate)
- FetchLockEntryCleanup drop guard ensures cleanup on normal return,
  task cancellation, and panic unwind
- Drop implementation checks Arc::strong_count == 2 (guard + upgrade temp)
  to detect last holder, then removes map entry
- Periodic retain() every 100 misses as fallback safety net
- True bounded growth: map size = in-flight fetches only, no idle leak

Performance:
- Periodic pruning: O(n/100) amortized cost vs O(n) per miss
- Drop-guard cleanup: O(1) per call, runs unconditionally
- Avoids global critical section bottleneck under heavy miss fan-out

Testing (production code path, zero duplication):
- concurrent_fetches_for_same_pubkey_serialize: 10 tasks → 1 fetch,
  verifies lock entry removed after all waiters complete
- fetch_locks_do_not_accumulate_after_calls_complete: verifies drop-guard
  removes entries immediately after single-caller fetches
- cancelled_fetch_cleans_up_lock_entry: spawns task with long sleep,
  aborts it, verifies entry still removed via drop guard
- TEST_GLOBALS_LOCK serializes ALL tests touching global statics
- All 12 unit tests pass with --test-threads=16

Addresses PR feedback from reviewer testing between Vector and 0xChat.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
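
For readers following the memory-management notes in the commit above, a simplified sketch of the Weak-map plus drop-guard idea. It keeps only the essentials: this version detects the last holder with a strong_count == 1 check instead of the commit's strong_count == 2 check, omits the periodic retain() safety net, and uses a stand-in key type; FETCH_LOCKS and FetchLockEntryCleanup follow the commit's naming, everything else is illustrative.

use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

use once_cell::sync::Lazy; // assumed available, as in the review snippet above

// Stand-in key type; the real map is keyed by the recipient's PublicKey.
type Key = [u8; 32];

// The map holds only Weak references, so an entry never keeps a lock alive by itself.
static FETCH_LOCKS: Lazy<Mutex<HashMap<Key, Weak<tokio::sync::Mutex<()>>>>> =
    Lazy::new(|| Mutex::new(HashMap::new()));

// Drop guard: every caller that acquires a per-key lock holds one of these.
// When the last guard for a key drops (normal return, task cancellation, or
// panic unwind), the map entry is removed.
struct FetchLockEntryCleanup {
    key: Key,
    lock: Arc<tokio::sync::Mutex<()>>,
}

impl Drop for FetchLockEntryCleanup {
    fn drop(&mut self) {
        let mut map = FETCH_LOCKS.lock().unwrap();
        // Only guards hold strong references (the map holds a Weak), so a
        // count of 1 means no other task is using or waiting on this key.
        // A rare drop interleaving can leave a dead Weak behind; per the
        // commit, the real code adds a periodic retain() pass for that case.
        if Arc::strong_count(&self.lock) == 1 {
            map.remove(&self.key);
        }
    }
}

// Upgrade the existing per-key lock or insert a fresh one, returning a guard
// that both exposes the lock and cleans up the map entry on drop.
fn acquire(key: Key) -> FetchLockEntryCleanup {
    let mut map = FETCH_LOCKS.lock().unwrap();
    let lock = match map.get(&key).and_then(Weak::upgrade) {
        Some(existing) => existing,
        None => {
            let fresh = Arc::new(tokio::sync::Mutex::new(()));
            map.insert(key, Arc::downgrade(&fresh));
            fresh
        }
    };
    FetchLockEntryCleanup { key, lock }
}

A call site inside get_or_fetch_with_lock would then look roughly like `let cleanup = acquire(key); let _serialized = cleanup.lock.lock().await;` followed by the double-checked cache logic from the review snippet; the map entry disappears as soon as the last waiter for that key finishes.
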
@alltheseas force-pushed the feat/nip17-inbox-relays branch from 9b9d4a5 to ccc13e2 on February 10, 2026 19:31
@alltheseas
Author

ccc13e2 👀

@JSKitty added the enhancement label on Feb 10, 2026

@JSKitty left a comment

tACK ccc13e2

Massively appreciate this feature being added to Vector, a step closer to true Nostr portability. 🤟

I will include you in our release notes credits! 💚

@JSKitty merged commit 2f5f21e into VectorPrivacy:master on Feb 10, 2026

Development

Successfully merging this pull request may close these issues.

Add NIP-17 kind 10050 (DM relay list) support for reliable cross-client DM delivery

3 participants