[UTXO-BUG] Fix dual-write reward settlement#6537
Conversation
|
Welcome to RustChain! Thanks for your first pull request. Before we review, please make sure:
Bounty tiers: Micro (1-10 RTC) | Standard (20-50) | Major (75-100) | Critical (100-150) A maintainer will review your PR soon. Thanks for contributing! |
|
Payout address update if this bounty claim is accepted: |
|
Maintainer note: this is a non-doc UTXO/node fix and should be tagged |
eliasx45
left a comment
There was a problem hiding this comment.
Reviewed current head 84b99077773673743cd2f9a5fd71e30d03e42623.
Verdict: request changes.
The production change is directionally sensible: finalize_epoch() now sets conn.row_factory = sqlite3.Row before passing the shared connection into UtxoDB.apply_transaction(), and it batches miner reward outputs so the UTXO layer does not see multiple mining_reward transactions at the same block height.
However, I cannot approve yet because the focused test file does not pass on this Windows checkout.
First run in the repo venv failed before executing tests because prometheus_client is not installed here and is not listed in requirements.txt or requirements-node.txt:
ModuleNotFoundError: No module named 'prometheus_client'
After installing prometheus_client into the repo venv only so I could continue verification, the focused tests executed their assertions but failed in teardown because the temp SQLite DB remained open:
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: '...\\scale.db'
I reproduced that with both:
PYTHONPATH=node .\.venv\Scripts\python.exe -m unittest node.tests.test_integrated_balance_scale
and the two new focused tests:
PYTHONPATH=node .\.venv\Scripts\python.exe -m unittest node.tests.test_integrated_balance_scale.TestIntegratedBalanceScale.test_finalize_epoch_passes_row_connection_to_utxo_db node.tests.test_integrated_balance_scale.TestIntegratedBalanceScale.test_finalize_epoch_batches_multi_miner_utxo_rewards
Other checks:
python -m compileall -q node/rustchain_v2_integrated_v2.2.1_rip200.py node/tests/test_integrated_balance_scale.pypassed.git diff --check origin/main...HEADpassed.git merge-tree --write-tree origin/main HEADproduced a clean tree (ade7636699de7e08766704225254b5b61ce24ae5).
Requested fix: make the focused test module runnable from a clean repo test environment and close/release the SQLite handles on Windows before TemporaryDirectory.cleanup(). A small cleanup step such as explicitly stopping/closing imported-node background resources, or at minimum a deterministic gc.collect()/retry around the temp DB cleanup if that matches the repo's test pattern, should be enough. Please also account for prometheus_client as a test dependency or avoid requiring it for this test module.
|
Updated in b88bb0c.
Local verification passed: Result: 4 focused tests passed. |
|
CI note: the remaining red The failures reported by that broad job are outside this PR path: monitor payload shape, setup-miner artifact hash, tx-handler auth/status expectations, stale-data UTXO tests, and wallet-network retry tests. The focused test for this fix passed locally after the update, as noted above. |
eliasx45
left a comment
There was a problem hiding this comment.
Re-reviewed updated head b88bb0c4f2faffb4d6b717b94abd1b16a0ce3f6d after the Windows/test-harness follow-up.
Verdict: approve.
The blockers from my previous review are resolved:
- The test no longer depends on an installed external
prometheus_client; it installs a local fake module during the integrated module import. - The Windows temp SQLite cleanup issue is fixed; the focused tests no longer fail with
PermissionErroronscale.dbcleanup.
Validation on this Windows checkout:
PYTHONPATH=node .\.venv\Scripts\python.exe -m compileall -q node/rustchain_v2_integrated_v2.2.1_rip200.py node/tests/test_integrated_balance_scale.pypassed.PYTHONPATH=node .\.venv\Scripts\python.exe -m unittest node.tests.test_integrated_balance_scalepassed: 4 tests OK.- Direct focused rerun passed:
test_finalize_epoch_passes_row_connection_to_utxo_dbandtest_finalize_epoch_batches_multi_miner_utxo_rewardsboth OK. - After fetching current
origin/main,git diff --check origin/main...HEADpassed. git merge-tree --write-tree origin/main HEADproduced a clean tree (fd40b5f71ef21962f23090ba5e41cd64ca55a2d8).
The hosted broad test check is still red, but the current main branch is also red on that broad workflow per the author's note, and the PR-scoped integrated balance tests now pass locally. I do not see a remaining blocker in the changed files.
crystal-tensor
left a comment
There was a problem hiding this comment.
Code Review: PR #6537 — Fix dual-write reward settlement
Security Assessment: ✅ APPROVED — Medium/High
Severity: High (UTXO migration path broken)
Summary: Two bugs in finalize_epoch() when UTXO_DUAL_WRITE=1:
- SQLite connection lacked row_factory = sqlite3.Row, causing row["n"] access to fail
- Per-miner mining_reward transactions at the same block height violate UTXO's one-mining-reward-per-block-height rule
Strengths
-
Both bugs correctly identified and fixed: Row factory fix is straightforward; batching ensures one mining_reward tx per block height.
-
Batching respects UTXO constraints: Batch size capped at UTXO_MAX_OUTPUTS, distributed across EPOCH_SLOTS block heights.
-
Safety check: If reward batches exceed EPOCH_SLOTS, raises RuntimeError — prevents silent failure.
-
Test coverage: test_integrated_balance_scale.py covers both fixes.
Security Impact
Confirmed: Without fix #1, any epoch finalization with UTXO_DUAL_WRITE=1 fails at the first mining reward. Without fix #2, multi-miner epochs fail and roll back entirely.
Final Verdict
✅ APPROVED — Critical UTXO migration path fix. Ready to merge.
|
Merged + paid 25 RTC (Medium #2867). Reserved at Codex review: feature-flagged path real, fix correct, tests good. Thanks @jonathanpopham. |
Introduces three foundation pieces to eliminate the recurring UTXO-OOM bug class (4 [UTXO-BUG] fixes shipped this week — #6526, #6535, #6537, #6562, #6563, #6571 — all the same .fetchall() shape): 1. node/db_helpers.py (190 LOC): - fetch_page(conn, sql, params, *, limit, offset=0, max_limit=1000) - Always appends LIMIT/OFFSET before issuing SELECT - Rejects sql already containing LIMIT (case-insensitive) - Rejects limit > max_limit or negative limit/offset - fetch_one_or_none(conn, sql, params) - For queries that MUST return 0 or 1 row - Raises if >1 row materializes - count_estimate(conn, table, *, where=None, params=()) 2. tests/test_db_helpers.py (208 LOC, 23 tests): - Happy path, edge cases, limit enforcement - SQL-already-has-LIMIT rejection (upper/lower/mixed case) - offset behavior, semicolon handling, zero-limit, etc. - All 23 pass against in-memory sqlite 3. scripts/check_fetchall.sh (117 LOC): - CI guard greps node/ for .fetchall() outside tests/deprecated - For each hit: checks same-line or prior-line opt-in annotation # fetchall-ok: <reason> where reason in: bounded-by-schema, pragma-result, internal-test-helper, already-paginated - Currently informational (will be wired into GH Actions in Part B) What this PR does NOT do (left intentionally claimable for #6627 bounty): - Site conversion of the ~50 .fetchall() instances in node/rustchain_v2_integrated_v2.2.1_rip200.py - Annotation sweep on the ~175 legit sites across other modules - GH Actions wire-in (.github/workflows/check_fetchall.yml) - Part B (25 RTC): CI guard wire + annotation sweep - Part A2 (25 RTC, if claimed separately): main-file conversion Scott as author, NOT claiming the bounty — this is operator foundation work so contributors can claim the larger sweep against a stable helper. Closes: foundation portion of #6627 Refs: #6526, #6535, #6537, #6562, #6563, #6571 (already-merged instances of the class) Co-authored-by: Scott Boudreaux <scottbphone12@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
finalize_epoch()SQLite connection withsqlite3.Rowbefore passing it intoUtxoDB.apply_transaction(), which expects row-style access.Vulnerability class
High/Medium: UTXO dual-write reward settlement failure.
With
UTXO_DUAL_WRITE=1,finalize_epoch()passed a plain SQLite connection into the UTXO layer.UtxoDB.apply_transaction()usesrow["n"]/named-row access, so that path can fail before UTXO rewards are written. After that is fixed, the old per-miner loop still attempts multiplemining_rewardtransactions atepoch * 144; the UTXO layer correctly permits only one mining reward per block height, so multi-miner epochs can fail and roll back settlement.Impact: enabling UTXO dual-write can prevent epoch rewards from settling for normal multi-miner epochs, leaving the migration path unusable until fixed.
Fix
conn.row_factory = sqlite3.Rowbefore creating the epoch cursor.MAX_OUTPUTSvalue.Validation
node/tests/test_integrated_balance_scale.pyfor row-style UTXO connections and multi-miner UTXO reward batching.python3 -m compileall -q node/rustchain_v2_integrated_v2.2.1_rip200.py node/tests/test_integrated_balance_scale.pygit show --check --stat --oneline HEADBounty wallet/miner ID for payout if accepted:
jonathanpopham