Retry PendingConfirm null lookups before dropping#64

Merged: entrius merged 1 commit into test from harden/pending-confirm-null-retry on Apr 17, 2026
@LandynDev
Collaborator

Summary

When the validator's PendingConfirm loop polls a user's source tx and the chain provider returns tx_info=None, the entry is dropped immediately and the reservation expires silently. If the tx is still propagating through the mempool — or the validator's RPC happens to be on a node that hasn't seen it yet — the user's already-sent funds are stranded with no protocol recourse.

This PR treats tx_info=None as transient for the first 3 consecutive polls, mirroring the existing handling for ProviderUnreachableError: extend the reservation and keep the entry queued. The entry is dropped only after the limit, and the counter resets on any non-null response.
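
For reviewers, the retry rule can be sketched roughly as below. This is an illustration, not the code in this PR: `NullPollTracker` and `on_poll` are hypothetical names, the string results stand in for the validator's real actions, and the sketch assumes the drop happens on the first null poll after the limit has been used up.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

PENDING_CONFIRM_NULL_RETRY_LIMIT = 3  # value stated in this PR


@dataclass
class NullPollTracker:
    """Hypothetical stand-in for the in-memory counter held on Validator."""

    pending_confirm_null_polls: Dict[str, int] = field(default_factory=dict)

    def on_poll(self, key: str, tx_info: Optional[dict]) -> str:
        """Classify one poll result as 'process', 'retry', or 'drop'."""
        if tx_info is not None:
            # Any non-null response resets the consecutive-null counter.
            self.pending_confirm_null_polls.pop(key, None)
            return "process"

        count = self.pending_confirm_null_polls.get(key, 0) + 1
        if count <= PENDING_CONFIRM_NULL_RETRY_LIMIT:
            # Still inside the retry budget: the caller would extend the
            # reservation and leave the entry queued.
            self.pending_confirm_null_polls[key] = count
            return "retry"

        # Budget exhausted (assumed: drop on the poll after the limit).
        self.pending_confirm_null_polls.pop(key, None)
        return "drop"
```

Under these assumptions, four consecutive null polls for the same key yield `retry`, `retry`, `retry`, `drop`, and a single non-null poll in between starts the count over.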

  • PENDING_CONFIRM_NULL_RETRY_LIMIT = 3 in constants.py
  • pending_confirm_null_polls in-memory counter on Validator, mirrors extend_reservation_voted_at in shape and lifecycle
  • Stale-key cleanup at the top of initialize_pending_user_reservations covers both dicts in one pass — no leak
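
The single-pass cleanup in the last bullet might look something like the following sketch. `prune_stale_keys` is a hypothetical helper, and the key type and the way live reservations are enumerated are assumptions; only the two dict names come from this PR.

```python
from typing import Dict, Iterable


def prune_stale_keys(active_keys: Iterable[str], *trackers: Dict[str, object]) -> None:
    """Remove entries whose key no longer matches a live pending reservation."""
    live = set(active_keys)
    for tracker in trackers:
        # Copy the keys first so deleting while iterating is safe.
        for key in list(tracker):
            if key not in live:
                del tracker[key]


# Illustrative state: only reservation "a" is still pending.
extend_reservation_voted_at = {"a": 10, "b": 20}
pending_confirm_null_polls = {"b": 2, "c": 1}
prune_stale_keys(["a"], extend_reservation_voted_at, pending_confirm_null_polls)
```

Pruning both dicts against the same set of live keys keeps their lifecycles aligned, so neither can accumulate entries for reservations that no longer exist.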

Narrows the still-propagating-tx window without touching the genuinely-not-found path (typo, dropped from mempool) — those still drop, just ~36s later.

Test plan

  • ruff format + ruff check pass (verified locally)
  • Existing tests/test_pending_confirm_queue.py still passes (state-store-only; unaffected)
  • E2E suite 02 (happy path) — reservation→confirm flow unchanged
  • Manual: stop the validator's BTC RPC briefly during a swap; the entry should be retried instead of dropped

A user's source tx is often invisible to a validator's RPC for the first
few seconds after submission (mempool propagation lag, regional RPC
differences). Dropping the pending entry on the first null poll left the
user without protocol recourse if their tx was still propagating.

Treat tx_info=None like a transient provider failure for the first
PENDING_CONFIRM_NULL_RETRY_LIMIT (3) consecutive polls — extend the
reservation and keep the entry queued. Counter resets on any non-null
response and is cleaned up alongside extend_reservation_voted_at.
@entrius merged commit 9fdc61c into test on Apr 17, 2026
2 checks passed