
fix: bad relay 24h expiry + unsupported relay auto-disconnect #33

Merged
barrydeen merged 1 commit into main from fix/relay-health-recovery on Feb 22, 2026

@barrydeen
Owner

Summary

Fixes the relay count dropping from 60 to ~6 over time. There are two root causes.

Problem

  1. Bad relay markings are permanent. The HealthTracker persists bad relays to SharedPreferences with no expiry. Transient issues (WebSocket deflater crashes from concurrent sends, network blips) cause mid-session disconnects. After 4 disconnects, a relay is marked bad forever. Over time, most relays get marked bad.

  2. Push notification relays waste subscriptions. Relays like notify.damus.io appear in relay lists but don't support the standard Nostr protocol. Every REQ triggers an "Unsupported message" NOTICE, polluting the console and wasting subscription capacity.

Changes

RelayHealthTracker

  • _badRelays changed from Set<String> to Map<String, Long> (URL → timestamp)
  • Bad relays expire after 24 hours; isBad() and getBadRelays() prune expired entries
  • On load, expired entries are skipped; old format (no timestamps) is discarded entirely — this immediately clears all stale bad relay lists on upgrade
  • BAD_DISCONNECT_SESSIONS raised from 4 to 6 (less aggressive marking)
  • Added clearAllBadRelays() for manual recovery
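
The expiry behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `BadRelayTracker`, the injectable `now` clock, the `expiryMs` parameter (standing in for BAD_RELAY_EXPIRY_MS), and the `load()` signature are all assumptions, and SharedPreferences persistence and thread safety are left out.

```kotlin
// Hypothetical sketch of timestamped bad-relay tracking with expiry.
// Names other than isBad/getBadRelays/clearAllBadRelays are illustrative.
class BadRelayTracker(
    // Injectable clock so expiry can be exercised without waiting 24 hours
    private val now: () -> Long = { System.currentTimeMillis() },
    // Stands in for BAD_RELAY_EXPIRY_MS (24 hours in milliseconds)
    private val expiryMs: Long = 24L * 60 * 60 * 1000,
) {
    // Relay URL -> epoch millis at which the relay was marked bad
    private val badRelays = mutableMapOf<String, Long>()

    fun markBad(url: String) {
        badRelays[url] = now()
    }

    fun isBad(url: String): Boolean {
        pruneExpired()
        return url in badRelays
    }

    fun getBadRelays(): Set<String> {
        pruneExpired()
        return badRelays.keys.toSet()
    }

    fun clearAllBadRelays() {
        badRelays.clear()
    }

    // Load previously persisted entries, skipping any that have already expired
    fun load(persisted: Map<String, Long>) {
        val cutoff = now() - expiryMs
        persisted.filterValues { it > cutoff }
            .forEach { (url, markedAt) -> badRelays[url] = markedAt }
    }

    // Drop every entry whose timestamp is at or before the expiry cutoff
    private fun pruneExpired() {
        val cutoff = now() - expiryMs
        badRelays.entries.removeAll { it.value <= cutoff }
    }
}
```

Pruning lazily on read means no background timer is needed; an expired relay simply stops being reported as bad the next time it is queried.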

RelayPool — unsupported relay detection

  • Track consecutive "unsupported"/"not supported" NOTICEs per relay
  • After 3 unsupported notices → disconnect and block the relay URL
  • Counter resets when relay sends a valid event or EOSE (proves it works)
  • Blocked URLs cleared on disconnectAll() (account switch)
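
A rough sketch of the counting logic described above, under stated assumptions: the class name `UnsupportedRelayDetector` and its method names are hypothetical, the NOTICE text is matched case-insensitively (the PR does not say how matching is done), and the actual disconnect is left to the caller.

```kotlin
// Hypothetical sketch of unsupported-relay detection; not the PR's RelayPool code.
class UnsupportedRelayDetector(private val threshold: Int = 3) {
    // Relay URL -> unsupported NOTICEs seen since the last valid event or EOSE
    private val noticeCounts = mutableMapOf<String, Int>()
    private val blockedUrls = mutableSetOf<String>()

    // Returns true when the caller should disconnect the relay.
    fun onNotice(url: String, message: String): Boolean {
        val text = message.lowercase() // case-insensitive match is an assumption
        if ("unsupported" !in text && "not supported" !in text) return false

        val count = (noticeCounts[url] ?: 0) + 1
        noticeCounts[url] = count
        if (count < threshold) return false

        // Threshold reached: block the URL so it is not reconnected
        noticeCounts.remove(url)
        blockedUrls.add(url)
        return true
    }

    // A valid event or EOSE proves the relay works, so the counter resets
    fun onEventOrEose(url: String) {
        noticeCounts.remove(url)
    }

    fun isBlocked(url: String): Boolean = url in blockedUrls

    // Mirrors disconnectAll() on account switch: forget counters and blocks
    fun disconnectAll() {
        noticeCounts.clear()
        blockedUrls.clear()
    }
}
```

Resetting on any valid event or EOSE keeps a relay that occasionally rejects one message type from being blocked, while a push-only relay that never answers a REQ reaches the threshold quickly.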

Testing

  • Clear app data → verify no bad relays loaded
  • Connect with notify.damus.io in relay list → verify it gets auto-disconnected after 3 notices
  • Run app for extended period → verify relay count stays stable (no gradual drop)
  • Check that previously-bad relays recover after 24h

Two issues causing relay count to drop from 60 to 6:

1. Bad relay markings were permanent (persisted forever in SharedPrefs).
   Transient issues like WebSocket deflater crashes caused relays to
   accumulate mid-session failures, get marked bad, and never recover.

   Fix:
   - Bad relays now expire after 24 hours (BAD_RELAY_EXPIRY_MS)
   - _badRelays changed from Set<String> to Map<String, Long> (URL → timestamp)
   - Expired entries pruned on isBad()/getBadRelays() and on load
   - Old format (StringSet without timestamps) discarded on upgrade,
     immediately clearing stale bad relay lists
   - BAD_DISCONNECT_SESSIONS raised from 4 to 6 (less aggressive)
   - Added clearAllBadRelays() for manual recovery

2. Relays like notify.damus.io (push notification only) don't support
   standard Nostr protocol. They respond to every REQ with 'Unsupported
   message' NOTICE, wasting subscriptions and polluting the console.

   Fix:
   - Track consecutive 'unsupported'/'not supported' NOTICEs per relay
   - After 3 unsupported notices, disconnect and block the relay URL
   - Counter resets when relay sends a valid event or EOSE
   - Blocked URLs cleared on disconnectAll (account switch)
@barrydeen barrydeen merged commit 4ac2277 into main Feb 22, 2026
@barrydeen barrydeen deleted the fix/relay-health-recovery branch March 4, 2026 01:29