[Bug] miner unmarked send cache can be dropped after poll misses #353

@JSONbored

Summary

A miner can lose its local destination-send idempotency record if the poller temporarily stops seeing an active swap. After enough transient get_swap() misses, the poller drops the swap from its active set; the miner then calls cleanup_stale_sends() with that reduced set. Currently, that cleanup removes stale send-cache entries even when mark_fulfilled has not landed yet.

If the same swap is later visible again, the fulfiller no longer has the cached destination tx and treats the swap as unsent, so it can broadcast destination funds again instead of retrying mark_fulfilled for the original tx.

Affected path

  • allways/miner/swap_poller.py: repeated None refreshes can drop an active swap from the poller active set.
  • neurons/miner.py: the miner forwards the current active ids into cleanup_stale_sends().
  • allways/miner/fulfillment.py: stale send-cache entries are removed without preserving unmarked records.
  • allways/miner/fulfillment.py: process_swap() sends destination funds when no cached SentSwap exists.
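
The chain above can be modeled with a small toy, shown here only to make the sequence concrete. ToyPoller, naive_cleanup, MAX_MISSES, and the dict-based send cache are hypothetical stand-ins, not the real allways code; the actual miss threshold and data structures may differ.

```python
from dataclasses import dataclass, field

MAX_MISSES = 3  # assumed threshold; the real poller's value may differ


@dataclass
class ToyPoller:
    """Hypothetical poller: drops a swap after MAX_MISSES consecutive None refreshes."""
    active: set = field(default_factory=set)
    misses: dict = field(default_factory=dict)

    def refresh(self, swap_id, result):
        if result is None:
            # Count the miss and drop the swap once the threshold is reached.
            self.misses[swap_id] = self.misses.get(swap_id, 0) + 1
            if self.misses[swap_id] >= MAX_MISSES:
                self.active.discard(swap_id)
        else:
            # A successful refresh resets the counter and (re)activates the swap.
            self.misses[swap_id] = 0
            self.active.add(swap_id)


def naive_cleanup(send_cache, active_ids):
    """Models the reported behavior: drop every cached send whose swap is not active."""
    for swap_id in list(send_cache):
        if swap_id not in active_ids:
            del send_cache[swap_id]


poller = ToyPoller(active={7})
send_cache = {7: "0xdest_tx"}  # funds already broadcast, mark_fulfilled still pending

for _ in range(MAX_MISSES):
    poller.refresh(7, None)  # transient get_swap() misses
naive_cleanup(send_cache, poller.active)  # miner forwards the reduced active set

poller.refresh(7, {"id": 7})  # swap becomes visible again
# With no cached tx, a fulfiller would treat the swap as unsent.
would_resend = 7 in poller.active and 7 not in send_cache
```

In this toy, the swap is active again but its cached destination tx is gone, which is the state in which process_swap() would broadcast a second payment.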

Expected behavior

Once a miner has successfully broadcast destination funds for a swap, an unmarked local send-cache entry should continue blocking duplicate sends until the miner either records mark_fulfilled or has definitive contract state that the cached tx is safe to discard.

Actual behavior

A poller visibility gap can make an active swap look stale to cleanup. Cleanup can remove the unmarked cached tx, and rediscovery can make the fulfiller send destination funds again.

Suggested fix

Keep unmarked stale send-cache records during cleanup and only remove stale records that have already been marked fulfilled. Rediscovered swaps should use the retained tx to retry mark_fulfilled, not broadcast another destination payment.
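
A minimal sketch of that cleanup rule, assuming the send cache is a dict keyed by swap id. The SentSwap fields (dest_tx_hash, marked_fulfilled) and the cleanup_stale_sends signature shown here are assumptions for illustration; the real definitions in allways/miner/fulfillment.py may look different.

```python
from dataclasses import dataclass


@dataclass
class SentSwap:
    # Hypothetical shape; the real SentSwap may carry different fields.
    swap_id: int
    dest_tx_hash: str
    marked_fulfilled: bool = False


def cleanup_stale_sends(send_cache, active_ids):
    """Remove stale entries only if they are already marked fulfilled.

    Returns the list of removed swap ids.
    """
    removed = []
    for swap_id, sent in list(send_cache.items()):
        if swap_id in active_ids:
            continue  # still active: never touched by cleanup
        if not sent.marked_fulfilled:
            continue  # stale but unmarked: keep it to block a duplicate send
        del send_cache[swap_id]
        removed.append(swap_id)
    return removed


cache = {
    1: SentSwap(1, "0xaaa", marked_fulfilled=True),  # stale and marked
    2: SentSwap(2, "0xbbb"),                         # stale and unmarked
    3: SentSwap(3, "0xccc"),                         # still active
}
removed = cleanup_stale_sends(cache, active_ids={3})
```

Only swap 1 is removed here. Swap 2 stays in the cache even though it is no longer in the active set, so a later rediscovery finds its tx and can retry mark_fulfilled.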

Suggested tests

  • Unmarked stale SentSwap entries are retained while marked stale entries are removed.
  • Repeated poller misses followed by cleanup and rediscovery do not call send_dest_funds() again.
  • The rediscovered swap retries mark_fulfilled with the retained cached tx.
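
One possible shape for the regression test, written against a test double rather than the real fulfiller. FakeFulfiller and run_regression are hypothetical; the method names mirror the ones in this issue, but their bodies and signatures are assumptions, and the cleanup here already applies the suggested fix.

```python
from dataclasses import dataclass, field


@dataclass
class FakeFulfiller:
    """Test double; method names mirror the issue, bodies are assumptions."""
    cache: dict = field(default_factory=dict)  # swap_id -> [dest_tx, marked]
    sends: list = field(default_factory=list)
    mark_calls: list = field(default_factory=list)
    mark_succeeds: bool = False

    def send_dest_funds(self, swap_id):
        self.sends.append(swap_id)
        return f"tx-{swap_id}-{len(self.sends)}"

    def mark_fulfilled(self, swap_id, dest_tx):
        self.mark_calls.append((swap_id, dest_tx))
        return self.mark_succeeds

    def process_swap(self, swap_id):
        entry = self.cache.get(swap_id)
        if entry is None:
            # No cached send: broadcast and record the tx before marking.
            entry = [self.send_dest_funds(swap_id), False]
            self.cache[swap_id] = entry
        if not entry[1]:
            # Cached but unmarked: retry mark_fulfilled with the same tx.
            entry[1] = self.mark_fulfilled(swap_id, entry[0])

    def cleanup_stale_sends(self, active_ids):
        # Suggested fix: only marked entries are eligible for removal.
        for swap_id in list(self.cache):
            if swap_id not in active_ids and self.cache[swap_id][1]:
                del self.cache[swap_id]


def run_regression():
    f = FakeFulfiller()
    f.process_swap(42)            # sends once; mark_fulfilled does not land
    f.cleanup_stale_sends(set())  # poller dropped swap 42 after repeated misses
    f.mark_succeeds = True
    f.process_swap(42)            # swap rediscovered
    return f
```

A test built on this would assert that send_dest_funds() ran exactly once and that both mark_fulfilled attempts used the same cached tx.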

Duplicate check

I did not find an open issue or PR for this exact cleanup_stale_sends / unmarked send-cache failure mode. The related crash-window double-send issue was a different root cause: losing state before the cache entry exists. This case happens after the cache entry exists and cleanup removes it.

Labels: bug
