Severity
Critical — effectively bricks both apps (rider + drivestr). No Nostr → no rides, no offers, no chat, no profile/wallet sync. User cannot recover via normal means (force-close, cache clear all fail).
Steps to Reproduce
- Open the app, navigate to Settings → Relay Management.
- Tap "Reconnect to Relays".
- Observe: all relays drop to disconnected and never come back.
- Tap the button again → no effect (does not force reconnect).
- Force-close the app and relaunch → still no relay connections.
- Clear app cache → still no relay connections.
Expected Behavior
- Tapping "Reconnect to Relays" should briefly drop and re-establish WebSocket connections to all configured relays.
- At minimum, a subsequent tap, or a fresh app launch, should restore connectivity.
Actual Behavior
- All relays go to DISCONNECTED and stay there.
- The button becomes a no-op on repeat presses.
- App restart and cache clear do not recover the connection state.
Affected Code (both apps share common/)
Handler wiring (identical in both MainActivities):
onReconnect = {
    nostrService.relayManager.disconnectAll()
    nostrService.relayManager.connectAll()
}
UI button: common/src/main/java/com/ridestr/common/ui/RelayManagementScreen.kt:122 — the isReconnecting flag flips back to false after a fixed delay regardless of actual connection result, which masks failure and produces the "button does nothing" feel on retry.
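One way to avoid the fixed-delay masking is to clear the flag only when a real result arrives. The sketch below is a hypothetical state holder, not the actual RelayManagementScreen code; `ReconnectButtonState`, `onTap`, and `onResult` are illustrative names, and it assumes the reconnect call can report how many relays came back.

```kotlin
// Hypothetical UI state holder (illustrative only; not taken from the codebase).
class ReconnectButtonState {
    var isReconnecting = false
        private set
    var lastError: String? = null
        private set

    // Returns true if a reconnect should be started; repeat taps while busy are ignored.
    fun onTap(): Boolean {
        if (isReconnecting) return false
        isReconnecting = true
        lastError = null
        return true
    }

    // Called with the actual outcome instead of after a fixed delay.
    fun onResult(connected: Int, total: Int) {
        isReconnecting = false
        lastError = if (connected == 0) "0 of $total relays connected" else null
    }
}
```

With this shape, a failed reconnect leaves a visible error string rather than a button that silently returns to idle.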
Manager / connection (functions involved): connectAll / disconnectAll, connect, disconnect, scheduleReconnect.
Suspected Root Cause(s)
A disconnectAll() immediately followed by connectAll() runs on the UI thread. RelayConnection.disconnect() sets shouldReconnect = false and closes the socket asynchronously outside the lock. connect() then sets shouldReconnect = true and opens a new socket. Several things can go wrong:
- Stale-callback path: the new socket's onOpen may fire before the old socket's onClosed arrives. The onClosed guard checks socket !== webSocket and bails, but if any callback is ordered differently relative to the state assignment, we can end up with _state.value == DISCONNECTED while socket references a now-dead WebSocket.
- connect() short-circuit: connect() returns early if state is CONNECTING or CONNECTED. If a previous reconnect attempt left state stuck in CONNECTING (e.g. a never-completed handshake on a torn-down socket), subsequent presses become no-ops. This matches the "second press does nothing" symptom.
- Survives restart / cache clear: this is the strongest clue that persisted state matters. Cache clear preserves SharedPreferences (only the cache dir is wiped). If the user has custom relays saved and one or more of them is unreachable, the relay list itself may be the actual problem, but the UI gives zero feedback. Worth verifying whether the user has custom relays configured. Also worth checking that RelayManager is initialized from the current effective relay list rather than RelayConfig.DEFAULT_RELAYS (see RelayManager.kt:43): if NostrService is constructed once with defaults, custom relays set later may never be honored.
- No retry budget reset on manual reconnect: reconnectAttempts backoff persists across the manual button press. If we've already backed off to 60s, the user may think nothing is happening when in fact a delayed retry is pending.
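The short-circuit and backoff hypotheses above can be illustrated with a toy model. This is a sketch under assumptions, not the real RelayConnection: `ToyConnection`, `ConnState`, `forceReconnect`, and the 1 s backoff base are hypothetical, and only the 60 s cap comes from this report.

```kotlin
enum class ConnState { DISCONNECTED, CONNECTING, CONNECTED }

// Toy model of the suspected behavior; names echo the report but this is not the real class.
class ToyConnection {
    var state = ConnState.DISCONNECTED
        private set
    var reconnectAttempts = 0
        private set
    var opens = 0 // number of sockets actually opened
        private set

    // Mirrors the reported guard: return early when already CONNECTING or CONNECTED.
    fun connect() {
        if (state == ConnState.CONNECTING || state == ConnState.CONNECTED) return
        state = ConnState.CONNECTING
        opens++
    }

    // Simulates a failed handshake; the attempt counter grows.
    fun onFailure() {
        state = ConnState.DISCONNECTED
        reconnectAttempts++
    }

    // Exponential backoff capped at 60 s (base of 1 s is an assumption).
    fun backoffMillis(): Long = minOf(60_000L, 1_000L shl minOf(reconnectAttempts, 16))

    // Hypothetical forced path: clear state and counter regardless of the current state.
    fun forceReconnect() {
        state = ConnState.DISCONNECTED
        reconnectAttempts = 0
        connect()
    }
}
```

In this model, calling connect() twice without the handshake completing opens only one socket, which matches the no-op symptom. forceReconnect() bypasses the guard and also brings the backoff back to its base value.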
Suggested Fix Direction (for triage, not prescriptive)
- Make onReconnect call a dedicated relayManager.forceReconnectAll() that:
  - Resets reconnectAttempts to 0 on each connection,
  - Awaits actual socket teardown before re-opening (don't fire-and-forget),
  - Re-reads the effective relay list from SettingsRepository so custom relay changes take effect,
  - Returns a result the UI can surface (success / per-relay failure with reason).
- Surface per-relay errors in the UI rather than only showing aggregate connected count.
- Add an "are you using custom relays?" diagnostic line on the Relay Management screen.
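For triage discussion, the forceReconnectAll() idea could look roughly like the sketch below. Everything here is an assumption for illustration: the `Relay` interface, `RelayResult`, the `loadEffectiveRelays` callback (standing in for a read from SettingsRepository), and the timeout are hypothetical and do not describe the existing RelayManager API. It uses blocking futures from the JVM standard library, so it would have to run off the UI thread.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.TimeUnit

// Hypothetical abstraction over one relay connection; the real RelayConnection may differ.
interface Relay {
    val url: String
    fun resetBackoff()                   // e.g. set reconnectAttempts to 0
    fun close(): CompletableFuture<Unit> // completes once the old socket is fully torn down
    fun open(): CompletableFuture<Unit>  // completes on open, fails on handshake error
}

sealed class RelayResult {
    abstract val url: String
    data class Connected(override val url: String) : RelayResult()
    data class Failed(override val url: String, val reason: String) : RelayResult()
}

// Blocking sketch: await teardown, re-read the relay list, reset backoff, report per relay.
fun forceReconnectAll(
    current: List<Relay>,
    loadEffectiveRelays: () -> List<Relay>,
    timeoutSeconds: Long = 10
): List<RelayResult> {
    // 1. Tear down existing sockets and wait for each close to finish (errors ignored here).
    current.forEach { relay ->
        runCatching { relay.close().get(timeoutSeconds, TimeUnit.SECONDS) }
    }
    // 2. Re-read the list so custom relays are honored, then open each relay in turn.
    return loadEffectiveRelays().map { relay ->
        relay.resetBackoff()
        val result: RelayResult = try {
            relay.open().get(timeoutSeconds, TimeUnit.SECONDS)
            RelayResult.Connected(relay.url)
        } catch (e: Exception) {
            // ExecutionException wraps the real cause; a timeout has no message.
            RelayResult.Failed(relay.url, e.cause?.message ?: e.message ?: e.javaClass.simpleName)
        }
        result
    }
}
```

Relays are opened one after another to keep the sketch short; a real implementation would likely open them concurrently. The returned list is what the Relay Management screen could render as per-relay status.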
Environment
- Reported via internal test on 2026-05-14.
- Branch: claude/cranky-curie-1d8700 (master at fa54d0a).
- App(s): rider-app and drivestr (shared common/ code, so both should be affected; please confirm).
Acceptance Criteria