Skip to content

[UTXO-BUG] Fix dual-write reward settlement#6537

Merged
Scottcjn merged 3 commits into
Scottcjn:mainfrom
jonathanpopham:codex/fix-utxo-dual-write-reward-settlement
May 29, 2026
Merged

[UTXO-BUG] Fix dual-write reward settlement#6537
Scottcjn merged 3 commits into
Scottcjn:mainfrom
jonathanpopham:codex/fix-utxo-dual-write-reward-settlement

Conversation

@jonathanpopham
Copy link
Copy Markdown
Contributor

Summary

  • Fixes a UTXO dual-write reward-settlement failure in the Phase 1/2 migration path covered by the Red Team UTXO bounty: [BOUNTY] Red Team UTXO Implementation — Find Bugs, Earn RTC (50-200 RTC) rustchain-bounties#2819
  • Configures the finalize_epoch() SQLite connection with sqlite3.Row before passing it into UtxoDB.apply_transaction(), which expects row-style access.
  • Batches all miner reward outputs into one or more mining_reward transactions instead of applying one mining_reward per miner at the same block height.

Vulnerability class

High/Medium: UTXO dual-write reward settlement failure.

With UTXO_DUAL_WRITE=1, finalize_epoch() passed a plain SQLite connection into the UTXO layer. UtxoDB.apply_transaction() uses row["n"]/named-row access, so that path can fail before UTXO rewards are written. After that is fixed, the old per-miner loop still attempts multiple mining_reward transactions at epoch * 144; the UTXO layer correctly permits only one mining reward per block height, so multi-miner epochs can fail and roll back settlement.

Impact: enabling UTXO dual-write can prevent epoch rewards from settling for normal multi-miner epochs, leaving the migration path unusable until fixed.

Fix

  • Set conn.row_factory = sqlite3.Row before creating the epoch cursor.
  • Collect miner UTXO reward outputs during the account update loop.
  • Apply batched mining_reward transactions after reward calculation, using one block height per batch inside the epoch window.
  • Cap batch size with the UTXO layer MAX_OUTPUTS value.

Validation

  • Added regression coverage in node/tests/test_integrated_balance_scale.py for row-style UTXO connections and multi-miner UTXO reward batching.
  • python3 -m compileall -q node/rustchain_v2_integrated_v2.2.1_rip200.py node/tests/test_integrated_balance_scale.py
  • git show --check --stat --oneline HEAD

Bounty wallet/miner ID for payout if accepted: jonathanpopham

@jonathanpopham jonathanpopham requested a review from Scottcjn as a code owner May 28, 2026 21:43
@github-actions
Copy link
Copy Markdown
Contributor

Welcome to RustChain! Thanks for your first pull request.

Before we review, please make sure:

  • Non-doc PRs have a BCOS-L1 or BCOS-L2 label
  • Doc-only PRs are exempt from BCOS tier labels when they only touch docs/**, *.md, or common image/PDF files
  • New code files include an SPDX license header
  • You've tested your changes against the live node

Bounty tiers: Micro (1-10 RTC) | Standard (20-50) | Major (75-100) | Critical (100-150)

A maintainer will review your PR soon. Thanks for contributing!

@github-actions github-actions Bot added BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related tests Test suite changes size/M PR: 51-200 lines and removed BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related tests Test suite changes labels May 28, 2026
@jonathanpopham
Copy link
Copy Markdown
Contributor Author

Payout address update if this bounty claim is accepted: RTC4730a64f821a590972e8b37781fae5d568b5865c.

@jonathanpopham
Copy link
Copy Markdown
Contributor Author

Maintainer note: this is a non-doc UTXO/node fix and should be tagged BCOS-L1 or BCOS-L2 per the project PR checklist. I do not have label permission on the upstream repo; the current checks are otherwise passing except the repository-wide test job baseline failure.

Copy link
Copy Markdown
Contributor

@eliasx45 eliasx45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed current head 84b99077773673743cd2f9a5fd71e30d03e42623.

Verdict: request changes.

The production change is directionally sensible: finalize_epoch() now sets conn.row_factory = sqlite3.Row before passing the shared connection into UtxoDB.apply_transaction(), and it batches miner reward outputs so the UTXO layer does not see multiple mining_reward transactions at the same block height.

However, I cannot approve yet because the focused test file does not pass on this Windows checkout.

First run in the repo venv failed before executing tests because prometheus_client is not installed here and is not listed in requirements.txt or requirements-node.txt:

ModuleNotFoundError: No module named 'prometheus_client'

After installing prometheus_client into the repo venv only so I could continue verification, the focused tests executed their assertions but failed in teardown because the temp SQLite DB remained open:

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: '...\\scale.db'

I reproduced that with both:

PYTHONPATH=node .\.venv\Scripts\python.exe -m unittest node.tests.test_integrated_balance_scale

and the two new focused tests:

PYTHONPATH=node .\.venv\Scripts\python.exe -m unittest node.tests.test_integrated_balance_scale.TestIntegratedBalanceScale.test_finalize_epoch_passes_row_connection_to_utxo_db node.tests.test_integrated_balance_scale.TestIntegratedBalanceScale.test_finalize_epoch_batches_multi_miner_utxo_rewards

Other checks:

  • python -m compileall -q node/rustchain_v2_integrated_v2.2.1_rip200.py node/tests/test_integrated_balance_scale.py passed.
  • git diff --check origin/main...HEAD passed.
  • git merge-tree --write-tree origin/main HEAD produced a clean tree (ade7636699de7e08766704225254b5b61ce24ae5).

Requested fix: make the focused test module runnable from a clean repo test environment and close/release the SQLite handles on Windows before TemporaryDirectory.cleanup(). A small cleanup step such as explicitly stopping/closing imported-node background resources, or at minimum a deterministic gc.collect()/retry around the temp DB cleanup if that matches the repo's test pattern, should be enough. Please also account for prometheus_client as a test dependency or avoid requiring it for this test module.

@github-actions github-actions Bot added BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related tests Test suite changes labels May 28, 2026
@jonathanpopham
Copy link
Copy Markdown
Contributor Author

Updated in b88bb0c.

  • Merged current origin/main into the branch; the PR now includes main at d9ac21c.
  • Removed the focused test dependency on external prometheus_client by installing a local fake module only during the integrated module import.
  • Closed SQLite helper connections with contextlib.closing, dropped the imported module reference, and forced GC before TemporaryDirectory cleanup so Windows can release scale.db.

Local verification passed:

PYTHONPATH=node /tmp/rustchain6537-py311-venv/bin/python -m compileall -q node/rustchain_v2_integrated_v2.2.1_rip200.py node/tests/test_integrated_balance_scale.py
PYTHONPATH=node /tmp/rustchain6537-py311-venv/bin/python -m unittest node.tests.test_integrated_balance_scale

Result: 4 focused tests passed.

@jonathanpopham
Copy link
Copy Markdown
Contributor Author

CI note: the remaining red test check is the repo-wide pytest sweep. Current main is also red on the same CI workflow at d9ac21c, and this PR head has all BCOS/label/PoC checks green.

The failures reported by that broad job are outside this PR path: monitor payload shape, setup-miner artifact hash, tx-handler auth/status expectations, stale-data UTXO tests, and wallet-network retry tests. The focused test for this fix passed locally after the update, as noted above.

Copy link
Copy Markdown
Contributor

@eliasx45 eliasx45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed updated head b88bb0c4f2faffb4d6b717b94abd1b16a0ce3f6d after the Windows/test-harness follow-up.

Verdict: approve.

The blockers from my previous review are resolved:

  • The test no longer depends on an installed external prometheus_client; it installs a local fake module during the integrated module import.
  • The Windows temp SQLite cleanup issue is fixed; the focused tests no longer fail with PermissionError on scale.db cleanup.

Validation on this Windows checkout:

  • PYTHONPATH=node .\.venv\Scripts\python.exe -m compileall -q node/rustchain_v2_integrated_v2.2.1_rip200.py node/tests/test_integrated_balance_scale.py passed.
  • PYTHONPATH=node .\.venv\Scripts\python.exe -m unittest node.tests.test_integrated_balance_scale passed: 4 tests OK.
  • Direct focused rerun passed: test_finalize_epoch_passes_row_connection_to_utxo_db and test_finalize_epoch_batches_multi_miner_utxo_rewards both OK.
  • After fetching current origin/main, git diff --check origin/main...HEAD passed.
  • git merge-tree --write-tree origin/main HEAD produced a clean tree (fd40b5f71ef21962f23090ba5e41cd64ca55a2d8).

The hosted broad test check is still red, but the current main branch is also red on that broad workflow per the author's note, and the PR-scoped integrated balance tests now pass locally. I do not see a remaining blocker in the changed files.

Copy link
Copy Markdown
Contributor

@crystal-tensor crystal-tensor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #6537 — Fix dual-write reward settlement

Security Assessment: ✅ APPROVED — Medium/High

Severity: High (UTXO migration path broken)

Summary: Two bugs in finalize_epoch() when UTXO_DUAL_WRITE=1:

  1. SQLite connection lacked row_factory = sqlite3.Row, causing row["n"] access to fail
  2. Per-miner mining_reward transactions at the same block height violate UTXO's one-mining-reward-per-block-height rule

Strengths

  1. Both bugs correctly identified and fixed: Row factory fix is straightforward; batching ensures one mining_reward tx per block height.

  2. Batching respects UTXO constraints: Batch size capped at UTXO_MAX_OUTPUTS, distributed across EPOCH_SLOTS block heights.

  3. Safety check: If reward batches exceed EPOCH_SLOTS, raises RuntimeError — prevents silent failure.

  4. Test coverage: test_integrated_balance_scale.py covers both fixes.

Security Impact

Confirmed: Without fix #1, any epoch finalization with UTXO_DUAL_WRITE=1 fails at the first mining reward. Without fix #2, multi-miner epochs fail and roll back entirely.

Final Verdict

APPROVED — Critical UTXO migration path fix. Ready to merge.

@Scottcjn Scottcjn merged commit 6d4bc7d into Scottcjn:main May 29, 2026
10 of 11 checks passed
@Scottcjn
Copy link
Copy Markdown
Owner

Merged + paid 25 RTC (Medium #2867). Reserved at github:jonathanpopham.

Codex review: feature-flagged path real, fix correct, tests good. Thanks @jonathanpopham.

Scottcjn added a commit that referenced this pull request May 30, 2026
Introduces three foundation pieces to eliminate the recurring UTXO-OOM
bug class (4 [UTXO-BUG] fixes shipped this week — #6526, #6535, #6537,
#6562, #6563, #6571 — all the same .fetchall() shape):

1. node/db_helpers.py (190 LOC):
   - fetch_page(conn, sql, params, *, limit, offset=0, max_limit=1000)
     - Always appends LIMIT/OFFSET before issuing SELECT
     - Rejects sql already containing LIMIT (case-insensitive)
     - Rejects limit > max_limit or negative limit/offset
   - fetch_one_or_none(conn, sql, params)
     - For queries that MUST return 0 or 1 row
     - Raises if >1 row materializes
   - count_estimate(conn, table, *, where=None, params=())

2. tests/test_db_helpers.py (208 LOC, 23 tests):
   - Happy path, edge cases, limit enforcement
   - SQL-already-has-LIMIT rejection (upper/lower/mixed case)
   - offset behavior, semicolon handling, zero-limit, etc.
   - All 23 pass against in-memory sqlite

3. scripts/check_fetchall.sh (117 LOC):
   - CI guard greps node/ for .fetchall() outside tests/deprecated
   - For each hit: checks same-line or prior-line opt-in annotation
     # fetchall-ok: <reason>  where reason in:
     bounded-by-schema, pragma-result, internal-test-helper,
     already-paginated
   - Currently informational (will be wired into GH Actions in Part B)

What this PR does NOT do (left intentionally claimable for #6627 bounty):
- Site conversion of the ~50 .fetchall() instances in
  node/rustchain_v2_integrated_v2.2.1_rip200.py
- Annotation sweep on the ~175 legit sites across other modules
- GH Actions wire-in (.github/workflows/check_fetchall.yml)
- Part B (25 RTC): CI guard wire + annotation sweep
- Part A2 (25 RTC, if claimed separately): main-file conversion

Scott as author, NOT claiming the bounty — this is operator foundation
work so contributors can claim the larger sweep against a stable helper.

Closes: foundation portion of #6627
Refs: #6526, #6535, #6537, #6562, #6563, #6571 (already-merged instances of the class)

Co-authored-by: Scott Boudreaux <scottbphone12@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related size/M PR: 51-200 lines tests Test suite changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants