fix: harden RTC auto-pay idempotency markers #5562

Merged
Scottcjn merged 3 commits into Scottcjn:main from iamdinhthuan:fix-auto-pay-idempotency-markers
May 18, 2026

Conversation

@iamdinhthuan
Contributor

Summary

  • ignore RTC-AutoPay-Confirmed markers unless they were posted by the repo owner or github-actions[bot]
  • derive a stable payment key from the repo, PR, owner payment comment, amount, and recipient
  • post a trusted RTC-AutoPay-Started marker before transfer and include the same key as the RustChain idempotency_key
  • skip reruns that already have a trusted started or confirmed marker for the same payment key, avoiding duplicate transfer attempts after a confirmation-comment failure
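
As a rough illustration of the stable payment key described in the bullets above, the sketch below hashes the five inputs into one hex string. The function name `derive_payment_key`, the JSON encoding, the choice of SHA-256, and the sample values are assumptions made for illustration; the PR conversation does not show the actual derivation in scripts/auto-pay.py.

```python
import hashlib
import json


def derive_payment_key(repo: str, pr_number: int, comment_id: int,
                       amount: str, recipient: str) -> str:
    """Hypothetical helper: hash the payment tuple into a stable key.

    The amount is passed as a string so the caller decides its canonical
    form (for example "10" versus "10.0") before hashing.
    """
    # JSON-encoding the list keeps field boundaries unambiguous, so
    # ("ab", "c") and ("a", "bc") cannot collapse into the same input.
    material = json.dumps([repo, pr_number, comment_id, amount, recipient],
                          separators=(",", ":"))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Sample values are made up for the example.
key_a = derive_payment_key("owner/repo", 5562, 1001, "10", "alice")
key_b = derive_payment_key("owner/repo", 5562, 1001, "10", "alice")
key_c = derive_payment_key("owner/repo", 5562, 1001, "20", "alice")
```

Because the key depends only on values that are fixed once the owner posts the payment comment, a rerun of the same workflow would produce the same key, while a changed amount or recipient would produce a different one.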

Why

A random PR commenter could spoof the old confirmation marker and suppress a legitimate owner-approved payout. The old flow also transferred RTC before writing any durable workflow-owned marker, so a successful transfer followed by a failed confirmation comment could be retried as a second transfer.
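
The spoofing concern can be sketched as an author filter applied before any marker is honored. This is a minimal sketch, not the code from scripts/auto-pay.py: the function name `find_trusted_confirmation`, the comment dict shape (`author`, `body`), and the `key=` marker layout are all assumptions.

```python
TRUSTED_BOT = "github-actions[bot]"
CONFIRMED_MARKER = "RTC-AutoPay-Confirmed"


def find_trusted_confirmation(comments, repo_owner, payment_key):
    """Return the first confirmation comment from a trusted author, or None.

    `comments` is assumed to be a list of dicts with "author" and "body" keys.
    """
    trusted_authors = {repo_owner, TRUSTED_BOT}
    for comment in comments:
        if comment["author"] not in trusted_authors:
            continue  # markers posted by anyone else are ignored
        body = comment["body"]
        if CONFIRMED_MARKER in body and payment_key in body:
            return comment
    return None


# A spoofed marker from an arbitrary commenter does not count.
comments = [
    {"author": "random-user", "body": "RTC-AutoPay-Confirmed key=abc123"},
]
spoof_result = find_trusted_confirmation(comments, "Scottcjn", "abc123")

# The same marker posted by the workflow bot does.
comments.append(
    {"author": "github-actions[bot]", "body": "RTC-AutoPay-Confirmed key=abc123"}
)
trusted_result = find_trusted_confirmation(comments, "Scottcjn", "abc123")
```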

Validation

python3 -m pytest -q scripts/test_auto_pay_idempotency.py .github/actions/rtc-auto-bounty/test_award_rtc.py
python3 -m py_compile scripts/auto-pay.py scripts/test_auto_pay_idempotency.py .github/actions/rtc-auto-bounty/award_rtc.py
git diff --check

Observed locally: 42 passed, py_compile passed, and git diff --check was clean.

Duplicate / safety checks

Before opening this PR I searched public issues and PRs for RTC-AutoPay-Confirmed marker spoof idempotency, auto-pay duplicate transfer confirmation comment failure, idempotency_key auto-pay, RTC-AutoPay-Started payment_key, and scripts/auto-pay.py idempotency. I did not find an exact duplicate. PR #5359 is related to transient retry handling in .github/actions/rtc-auto-bounty/award_rtc.py, but it does not touch scripts/auto-pay.py or the trusted marker/idempotency-key flow fixed here.

No private keys, tokens, credentials, or account-internal data are included.

@github-actions
Contributor

Welcome to RustChain! Thanks for your first pull request.

Before we review, please make sure:

  • Non-doc PRs have a BCOS-L1 or BCOS-L2 label
  • Doc-only PRs are exempt from BCOS tier labels when they only touch docs/**, *.md, or common image/PDF files
  • New code files include an SPDX license header
  • You've tested your changes against the live node

Bounty tiers: Micro (1-10 RTC) | Standard (20-50) | Major (75-100) | Critical (100-150)

A maintainer will review your PR soon. Thanks for contributing!

@github-actions (bot) added labels on May 17, 2026: BCOS-L1 (Beacon Certified Open Source tier; required for non-doc PRs), size/L (PR: 201-500 lines)
@iamdinhthuan force-pushed the fix-auto-pay-idempotency-markers branch from bae1544 to a4cdba0 on May 17, 2026 15:31
Contributor

@kekehanshujun kekehanshujun left a comment

Reviewing this as part of the RustChain code review bounty.

The idempotency key and trusted-marker checks are useful, but the new RTC-AutoPay-Started marker creates a payment-liveness bug. The workflow posts the started marker before calling /wallet/transfer; if the transfer then fails with ConnectionError, timeout, a 5xx, or any exception before a confirmed/failed marker is posted, the started marker remains on the PR. On the next run, find_existing_payment_marker() treats that trusted started marker as final enough to skip, so the contributor may never be paid.

The tests cover a failed confirmation-comment post after a successful transfer, but not a failed transfer after the started marker is posted. Please either stop treating RTC-AutoPay-Started as a skip condition unless it has a bounded freshness window and a matching pending transfer, or post the started marker only after the transfer endpoint has accepted an idempotent request. Add a regression where transfer_rtc raises after the started comment and the next run still retries the transfer.

@iamdinhthuan
Contributor Author

Addressed the liveness issue in commit 6452cc0. RTC-AutoPay-Started is no longer treated as a final skip marker, so a transfer ConnectionError after the started comment can be retried with the same idempotency_key. Trusted final markers still stop reruns.

Added/updated regressions:

  • transfer connection failure after RTC-AutoPay-Started retries and then confirms
  • confirmation-comment failure retries with the same idempotency key rather than relying on the started marker as final state
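
The shape of the first regression can be sketched with an in-memory comment list and a transfer stub that fails once. Everything here is illustrative: `run_auto_pay`, `flaky_transfer`, and the marker strings are hypothetical stand-ins, and the trusted-author check is left out to keep the example short.

```python
def run_auto_pay(comments, payment_key, transfer):
    """Hypothetical single workflow run over an in-memory comment list."""
    confirmed = f"RTC-AutoPay-Confirmed {payment_key}"
    if confirmed in comments:
        return "skipped"  # only a final marker stops a rerun
    started = f"RTC-AutoPay-Started {payment_key}"
    if started not in comments:
        comments.append(started)
    # If this raises, the started marker stays behind but is not final.
    transfer(idempotency_key=payment_key)
    comments.append(confirmed)
    return "paid"


calls = []


def flaky_transfer(idempotency_key):
    """Fail on the first call, succeed afterwards."""
    calls.append(idempotency_key)
    if len(calls) == 1:
        raise ConnectionError("simulated network failure")


comments = []
try:
    first = run_auto_pay(comments, "key-1", flaky_transfer)
except ConnectionError:
    first = "failed"
second = run_auto_pay(comments, "key-1", flaky_transfer)  # retries the transfer
third = run_auto_pay(comments, "key-1", flaky_transfer)   # confirmed marker now present
```

The retry reuses the same key, so it is only safe if the transfer endpoint deduplicates on that key.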

Validation run on the PR branch:

python3 -m pytest -q scripts/test_auto_pay_idempotency.py .github/actions/rtc-auto-bounty/test_award_rtc.py
python3 -m py_compile scripts/auto-pay.py scripts/test_auto_pay_idempotency.py .github/actions/rtc-auto-bounty/award_rtc.py
git diff --check

Result: 43 passed; py_compile and diff check passed.

I did not bundle CI dependency changes here because missing full-suite dependencies are already covered separately by open PR #5350; this PR stays scoped to auto-pay idempotency/liveness.

Contributor

@HCIE2054 HCIE2054 left a comment

LGTM!

@2balmprune 2balmprune left a comment

Reviewed updated head 6452cc03858bea53c96e4633429efc8aaf84b8f1 after the RTC-AutoPay-Started liveness follow-up.

The earlier blocker is resolved on this head: RTC-AutoPay-Started is no longer treated as a final skip marker, so a transfer connection failure after the started comment can retry with the same idempotency_key. Final trusted RTC-AutoPay-Confirmed markers still stop reruns, and untrusted confirmation markers no longer suppress an owner payment directive. The key is stable for the repo/PR/payment comment/amount/recipient tuple, so a confirmation-comment failure can rerun without changing the transfer idempotency key.

Validation performed:

  • git diff --check origin/main...HEAD -- scripts/auto-pay.py scripts/test_auto_pay_idempotency.py
  • python -B -m py_compile scripts/auto-pay.py scripts/test_auto_pay_idempotency.py
  • python -B -m pytest -q scripts/test_auto_pay_idempotency.py -> 3 passed
  • python -B -m py_compile .github/actions/rtc-auto-bounty/award_rtc.py .github/actions/rtc-auto-bounty/test_award_rtc.py
  • python .github/actions/rtc-auto-bounty/test_award_rtc.py -> 40 tests passed

No remaining blocker from my review.

@ZacharyZhang-NY ZacharyZhang-NY left a comment

I found a blocking duplicate-payment risk in the proposed idempotency flow. The script now sends idempotency_key to POST /wallet/transfer, but the current wallet_transfer_v2 endpoint in node/rustchain_v2_integrated_v2.2.1_rip200.py does not read or persist idempotency_key; rg only finds the field in scripts/auto-pay.py. That means a rerun after transfer success plus confirmation-comment failure will call /wallet/transfer again, and the server will create a second pending transfer.

Evidence from this PR head 6452cc0:

  • scripts/auto-pay.py posts RTC-AutoPay-Started before transfer, but find_existing_payment_marker() only checks RTC-AutoPay-Confirmed, so a started marker does not stop the rerun.
  • node/rustchain_v2_integrated_v2.2.1_rip200.py:/wallet/transfer validates from_miner/to_miner/amount/reason and inserts a new pending_ledger row with a fresh tx_hash; it has no idempotency_key lookup or uniqueness guard.
  • A local probe simulating confirmation-comment failure ran main() twice and produced transfer_calls=2 with the same idempotency_key. Against the current server implementation those are two distinct transfer attempts.

Validation performed:

  • python -B -m py_compile scripts/auto-pay.py scripts/test_auto_pay_idempotency.py .github/actions/rtc-auto-bounty/award_rtc.py
  • git diff --check origin/main...HEAD -- scripts/auto-pay.py scripts/test_auto_pay_idempotency.py
  • PYTHONPATH=scripts python -B -m pytest -q scripts/test_auto_pay_idempotency.py --noconftest -> 3 passed
  • python -B -m pytest -q scripts/test_auto_pay_idempotency.py .github/actions/rtc-auto-bounty/test_award_rtc.py --noconftest -> 43 passed
  • Manual non-idempotent-server probe: two transfer calls after confirmation-comment failure, same idempotency_key.

Please either implement server-side idempotency for /wallet/transfer before relying on this field, or make the workflow's trusted Started marker block reruns after the transfer boundary is crossed and require manual reconciliation.

@2balmprune 2balmprune left a comment

Follow-up after rechecking head 6452cc03858bea53c96e4633429efc8aaf84b8f1: changing my prior approval to request changes.

The liveness fix handles the started-marker retry path, but the duplicate-transfer guard still relies on /wallet/transfer honoring idempotency_key. I do not see that support in the current server endpoint. scripts/auto-pay.py sends idempotency_key in the payload, but node/rustchain_v2_integrated_v2.2.1_rip200.py::wallet_transfer_v2() validates the admin payload and inserts a fresh pending_ledger row with a new tx hash/pending id; it does not read or persist the idempotency key. The new regression mocks idempotent server behavior with accepted_transfers.setdefault(key, ...), so it proves the client reuses the key but not that the real transfer endpoint dedupes it.

Failure case:

  1. auto-pay posts RTC-AutoPay-Started
  2. /wallet/transfer succeeds and creates a pending transfer
  3. posting RTC-AutoPay-Confirmed fails
  4. rerun does not treat started as final, so it calls /wallet/transfer again with the same idempotency_key
  5. current server creates another pending transfer because the key is ignored
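
The five steps above can be reproduced against a stand-in server that accepts the key but never consults it. This is a sketch modelled on the description in this review, not the real node code; `NonIdempotentServer`, `run_auto_pay`, and the sample recipient and amount are hypothetical.

```python
import itertools


class NonIdempotentServer:
    """Stand-in for a transfer endpoint that ignores idempotency_key."""

    def __init__(self):
        self.pending = []
        self._ids = itertools.count(1)

    def transfer(self, to_miner, amount, idempotency_key=None):
        # The key is accepted but never consulted, so every call adds a row.
        row = {"pending_id": next(self._ids), "to": to_miner, "amount": amount}
        self.pending.append(row)
        return row["pending_id"]


def run_auto_pay(server, comments, key, confirm_fails=False):
    if f"RTC-AutoPay-Confirmed {key}" in comments:
        return "skipped"
    started = f"RTC-AutoPay-Started {key}"
    if started not in comments:
        comments.append(started)                         # step 1
    server.transfer("alice", 10, idempotency_key=key)    # steps 2 and 4
    if confirm_fails:
        raise RuntimeError("simulated comment failure")  # step 3
    comments.append(f"RTC-AutoPay-Confirmed {key}")
    return "paid"


server = NonIdempotentServer()
comments = []
try:
    run_auto_pay(server, comments, "key-1", confirm_fails=True)
except RuntimeError:
    pass  # the workflow run fails after the transfer was created
rerun_result = run_auto_pay(server, comments, "key-1")
pending_ids = [row["pending_id"] for row in server.pending]  # step 5
```

After the rerun, the stand-in ledger holds two pending rows for one approved payment, which is the duplicate the client-side key alone cannot prevent.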

Please either implement idempotency support in the transfer endpoint by storing/checking the payment key before creating a new pending transfer, or keep the workflow in manual-fallback/non-green state after a successful transfer but failed confirmation so maintainers cannot accidentally rerun into a second pending payout.

Validation:

  • rg -n "idempotency_key|wallet_transfer|/wallet/transfer|pending_id" scripts node .github/actions
  • python -B -m pytest -q scripts/test_auto_pay_idempotency.py -> 3 passed
  • python .github/actions/rtc-auto-bounty/test_award_rtc.py -> 40 tests passed

The tests pass, but the real endpoint does not enforce the idempotency property the client now depends on.

@iamdinhthuan requested a review from Scottcjn as a code owner on May 17, 2026 16:29
@github-actions (bot) added labels on May 17, 2026: node (Node server related), tests (Test suite changes)
@iamdinhthuan
Contributor Author

iamdinhthuan commented May 17, 2026

Addressed the auto-pay retry blocker in commit 5d3df08.

What changed:

  • /wallet/transfer now honors an optional idempotency_key by deriving a stable tx_hash and returning the existing pending transfer on identical retries.
  • Reusing the same key with changed transfer details now returns 409 idempotency_key_conflict and does not create another pending debit.
  • Added server-side regression coverage plus the missing SPDX header on the auto-pay test file.
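
The server-side behavior listed above can be sketched with an in-memory ledger. This is an illustration under stated assumptions, not the code in node/rustchain_v2_integrated_v2.2.1_rip200.py: the class `IdempotentLedger`, the SHA-256 derivation of tx_hash, the tuple return shape, and the sample miner names are all hypothetical.

```python
import hashlib
import threading


class IdempotentLedger:
    """Hypothetical in-memory stand-in for the pending-transfer table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = {}  # tx_hash -> {"pending_id": int, "details": tuple}

    def transfer(self, from_miner, to_miner, amount, idempotency_key):
        # Deriving tx_hash from the key makes retries land on the same row.
        tx_hash = hashlib.sha256(idempotency_key.encode("utf-8")).hexdigest()
        details = (from_miner, to_miner, amount)
        with self._lock:  # lookup and insert happen under one lock
            row = self._rows.get(tx_hash)
            if row is None:
                row = {"pending_id": len(self._rows) + 1, "details": details}
                self._rows[tx_hash] = row
            elif row["details"] != details:
                # Same key, different transfer: refuse without inserting.
                return 409, {"error": "idempotency_key_conflict"}
            return 200, {"pending_id": row["pending_id"], "tx_hash": tx_hash}

    def pending_count(self):
        with self._lock:
            return len(self._rows)


ledger = IdempotentLedger()
first = ledger.transfer("treasury", "alice", 10, "key-1")
retry = ledger.transfer("treasury", "alice", 10, "key-1")
conflict = ledger.transfer("treasury", "alice", 25, "key-1")
```

An identical retry gets the original pending_id and tx_hash back, and a changed amount under the same key is rejected, so the ledger never holds more than one row per key.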

Verification:

  • RED first: PYTHONPATH=node python3 -B -m pytest -q tests/test_wallet_transfer_admin_idempotency.py --tb=short failed with duplicate pending IDs (2 == 1).
  • PYTHONPATH=node python3 -B -m pytest -q tests/test_wallet_transfer_admin_idempotency.py scripts/test_auto_pay_idempotency.py tests/test_wallet_transfer_admin_key_unset.py tests/test_signed_transfer_replay.py node/tests/test_payout_preflight.py --tb=short -> 41 passed.
  • python3 -B -m py_compile node/rustchain_v2_integrated_v2.2.1_rip200.py tests/test_wallet_transfer_admin_idempotency.py scripts/test_auto_pay_idempotency.py
  • python3 tools/bcos_spdx_check.py --base-ref origin/main
  • git diff --check HEAD^..HEAD

@TJCurnutte TJCurnutte left a comment

Approved. I validated the auto-pay idempotency hardening at head 5d3df088bbb56da56eeff905a5f38b4d093e4b9b.

Proof run from a detached PR worktree:

git diff --check origin/main...HEAD -- node/rustchain_v2_integrated_v2.2.1_rip200.py scripts/auto-pay.py scripts/test_auto_pay_idempotency.py tests/test_wallet_transfer_admin_idempotency.py
python3 -B -m py_compile node/rustchain_v2_integrated_v2.2.1_rip200.py scripts/auto-pay.py scripts/test_auto_pay_idempotency.py tests/test_wallet_transfer_admin_idempotency.py
python3 -B -m pytest -q scripts/test_auto_pay_idempotency.py tests/test_wallet_transfer_admin_idempotency.py
# 5 passed, 1 warning in 0.07s

The retry model now has the right boundary: scripts/auto-pay.py derives a stable payment_key from repo, PR, owner payment-comment id, amount, and recipient; untrusted confirmation markers are ignored; the new RTC-AutoPay-Started marker does not suppress a retry; and the transfer request sends that key as idempotency_key.

The node-side /wallet/transfer path validates the key shape, derives a deterministic tx_hash, returns the existing pending transfer on an exact retry, and rejects reused keys with changed transfer parameters as 409 idempotency_key_conflict without inserting a second ledger row. The focused tests cover comment-post failure retry, transfer connection retry, exact pending-ledger reuse, and changed-transfer conflict.

Live duplicate gates were clear before posting: PR open/non-draft, not self-authored, no existing @TJCurnutte PR review, and no current @TJCurnutte bounty-issue claim for this PR.

@2balmprune 2balmprune left a comment

Reviewed updated head 5d3df088bbb56da56eeff905a5f38b4d093e4b9b after the server-side idempotency-key follow-up. Approved.

This resolves my previous blocker: /wallet/transfer now validates an optional idempotency_key, derives a stable tx_hash from it, looks up the existing pending transfer under the write lock, returns the same pending_id/tx_hash on identical retries, and returns 409 idempotency_key_conflict if the same key is reused with changed transfer details. That closes the duplicate-pending-transfer path for the auto-pay retry case where transfer creation succeeds but the confirmation comment fails.
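
For readers who want to see what "looks up the existing pending transfer under the write lock" can look like with durable storage, here is a minimal SQLite sketch. The table name pending_ledger comes from this thread; the column layout, the `open_ledger` and `transfer` helpers, the SHA-256 derivation, and the integer amounts are assumptions, and the real endpoint may be structured differently.

```python
import hashlib
import sqlite3


def open_ledger():
    # isolation_level=None leaves transaction control to explicit BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE pending_ledger ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " tx_hash TEXT NOT NULL UNIQUE,"
        " to_miner TEXT NOT NULL,"
        " amount INTEGER NOT NULL)"
    )
    return conn


def transfer(conn, to_miner, amount, idempotency_key):
    """Return (status, pending_id); pending_id is None on conflict."""
    tx_hash = hashlib.sha256(idempotency_key.encode("utf-8")).hexdigest()
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before the lookup
    try:
        row = conn.execute(
            "SELECT id, to_miner, amount FROM pending_ledger WHERE tx_hash = ?",
            (tx_hash,),
        ).fetchone()
        if row is None:
            cur = conn.execute(
                "INSERT INTO pending_ledger (tx_hash, to_miner, amount)"
                " VALUES (?, ?, ?)",
                (tx_hash, to_miner, amount),
            )
            result = (200, cur.lastrowid)
        elif (row[1], row[2]) == (to_miner, amount):
            result = (200, row[0])  # exact retry: reuse the existing row
        else:
            result = (409, None)    # same key, changed transfer details
        conn.execute("COMMIT")
        return result
    except Exception:
        conn.execute("ROLLBACK")
        raise


conn = open_ledger()
first = transfer(conn, "alice", 10, "key-1")
retry = transfer(conn, "alice", 10, "key-1")
conflict = transfer(conn, "alice", 25, "key-1")
row_count = conn.execute("SELECT COUNT(*) FROM pending_ledger").fetchone()[0]
```

The UNIQUE constraint on tx_hash is a second guard in this sketch: even if two writers raced past the lookup, the database would refuse the duplicate insert.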

Validation performed:

  • git diff --check origin/main...HEAD -- node/rustchain_v2_integrated_v2.2.1_rip200.py scripts/auto-pay.py scripts/test_auto_pay_idempotency.py tests/test_wallet_transfer_admin_idempotency.py
  • python3 -B -m py_compile node/rustchain_v2_integrated_v2.2.1_rip200.py scripts/auto-pay.py scripts/test_auto_pay_idempotency.py tests/test_wallet_transfer_admin_idempotency.py
  • Static recheck of wallet_transfer_v2() confirmed identical idempotency-key retries return the existing pending row before balance/pending-debit checks, while changed transfer details conflict instead of inserting another row.

I could not run the Flask-backed pytest locally because this review environment is missing the Flask/pytest dependencies, but the patch logic addresses the blocker I raised.

Contributor

@HCIE2054 HCIE2054 left a comment

LGTM!

Contributor

@jaxint jaxint left a comment

LGTM! Great work on this PR. 🚀

@Scottcjn Scottcjn merged commit f0e8ca2 into Scottcjn:main May 18, 2026
10 of 11 checks passed

Labels

  • BCOS-L1: Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs)
  • node: Node server related
  • size/L: PR: 201-500 lines
  • tests: Test suite changes

8 participants