Skip to content

fix: read system health from stats index in estimatecollateral (Bug #34)#395

Open
JohnnyLawDGB wants to merge 1 commit intoDigiByte-Core:feature/digidollar-v1from
JohnnyLawDGB:fix/estimatecollateral-stale-health
Open

fix: read system health from stats index in estimatecollateral (Bug #34)#395
JohnnyLawDGB wants to merge 1 commit intoDigiByte-Core:feature/digidollar-v1from
JohnnyLawDGB:fix/estimatecollateral-stale-health

Conversation

@JohnnyLawDGB
Copy link
Copy Markdown

Summary

  • Bug Fix: Update dnsseeds for mainnet / testnet #34: estimatecollateral reports system_health=0 / emergency / DCA=2x when the system is actually healthy at 475-528%. Doubles the reported collateral requirement, blocking valid mints with "Insufficient funds."
  • Root cause: estimatecollateral calls GetSystemMetrics() which reads a static cache that is only populated when ScanUTXOSet() runs inside getdigidollarstats or getprotectionstatus. If neither has run, the cache has totalDDSupply=0, and line 2579 hardcodes systemHealth=0.
  • Fix: Read totalCollateral and totalDDSupply from the digidollar stats index (fast path) or fall back to ScanUTXOSet() (slow path), matching the pattern getdigidollarstats and getprotectionstatus already use. Also change the totalDD==0 fallback from health=0 to health=30000 (max) — an empty system has nothing at risk.
  • Single file change: src/rpc/digidollar.cpp
  • Still present in RC29 — RH-36a fixed a different cache race (TOCTOU after restart), not this code path.

Evidence — same node, same moment

RPC Health DCA Status
getdigidollarstats 475% 1x healthy
getprotectionstatus 475 1x secure
estimatecollateral 0 2x emergency

System has 15 active positions, 1.69M DGB locked, 150K cents DD supply.

Test plan

  • Build clean on RC28 tag — no warnings
  • Confirmed bug still present on RC29 binary (same GetSystemMetrics() code path)
  • After fix: estimatecollateral returns health=510 / DCA=1x, matching other RPCs
  • Tier-4 mint that required 134K DGB (broken) now correctly requires 67K DGB
  • Mint succeeds at 67K DGB — previously rejected with "Insufficient funds"
  • getdigidollarstats, getprotectionstatus behavior unchanged

🤖 Generated with Claude Code

…igiByte-Core#34)

estimatecollateral read system health from a static cache
(GetSystemMetrics) that is only populated when ScanUTXOSet() runs
inside getdigidollarstats or getprotectionstatus. If estimatecollateral
ran first, the cache had totalDDSupply=0, causing the health check to
hardcode systemHealth=0 (emergency) and apply a 2x DCA multiplier.
This doubled the reported collateral requirement even when the system
was healthy at 475-528%.

The mint transaction builder uses the same path, so valid mints were
rejected with "Insufficient funds for collateral and fees."

Fix: read totalCollateral and totalDDSupply from the digidollar stats
index (fast, already synced) or fall back to ScanUTXOSet(), matching
the pattern getdigidollarstats and getprotectionstatus already use.
Also change the totalDD==0 case from health=0 to health=30000 (max)
since an empty system has nothing at risk.

Tested on testnet21: estimatecollateral now returns health=510 / DCA=1x
matching getdigidollarstats and getprotectionstatus. Tier-4 mint that
previously required 134K DGB now correctly requires 67K and succeeds.
Confirmed still present in RC29 (RH-36a fixed a different cache race).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@DigiSwarm
Copy link
Copy Markdown

Review: PR #395fix: read system health from stats index in estimatecollateral (Bug #34)

Thanks for this fix, @JohnnyLawDGB — this is a well-researched and cleanly scoped bug fix. The root cause analysis in the PR description is excellent, and the inline comments explaining the "why" behind each change make it easy to follow.

What's working well

  • Root cause is correctly identified. GetSystemMetrics() reads from s_currentMetrics, and the ScanUTXOSet() call inside it is commented out (// Internal calls don't have chainstate access). So unless another RPC like getdigidollarstats has been called first, the cache is zero-initialized — leading to totalDDSupply=0systemHealth=0 → 2x DCA multiplier → "Insufficient funds." This is clearly the bug.

  • The fix follows established patterns. The stats-index fast-path with BlockUntilSyncedToCurrentChain() + LookUpStats(), falling back to ForceFlushStateToDisk() + ScanUTXOSet() under LOCK(cs_main), is the exact same pattern used in getdigidollarstats. This is the right approach — reuse proven patterns.

  • The systemHealth = 30000 for empty system is correct. I verified that CalculateSystemHealth() in src/consensus/dca.cpp itself returns 30000 when totalDD == 0 with the log message "No DigiDollars in circulation, returning maximum health." The old code's systemHealth = 0 was inconsistent with the DCA module's own semantics. An empty system has infinite collateral ratio — no liabilities, no risk. This fix aligns estimatecollateral with the DCA module's intended behavior.

  • Lock scoping is correct. cs_main is acquired minimally for ActiveChain().Tip() and released before the index lookup. In the fallback path, holding cs_main during ScanUTXOSet() is consistent with getdigidollarstats.

  • Ubuntu CI passes. macOS failure is infrastructure (Node.js 20 deprecation warnings, homebrew link noise) — no compile or test errors related to this change.

Items to address

1. Silent fallthrough when stats index fails to sync (medium priority)

if (g_digidollar_stats_index) {
    if (g_digidollar_stats_index->BlockUntilSyncedToCurrentChain()) {
        // ... read stats ...
    }
    // ← If BlockUntilSyncedToCurrentChain() returns false, falls through silently
}

If the stats index exists but is catching up from far behind, BlockUntilSyncedToCurrentChain() returns false. The code then silently proceeds with totalCollateral_est = 0 and totalDD_est = 0, which hits the totalDD == 0 case and sets systemHealth = 30000 (max health). This could mask a real problem — the system might have significant DD supply with degraded health, but estimatecollateral would report "healthy, 1x collateral."

By contrast, getdigidollarstats throws an RPC error in this case:

throw JSONRPCError(RPC_INTERNAL_ERROR, "DigiDollar stats index is syncing...");

Suggestion: Either throw like getdigidollarstats does, or fall through to the UTXO scan fallback path instead of silently proceeding with zeros:

if (g_digidollar_stats_index) {
    if (!g_digidollar_stats_index->BlockUntilSyncedToCurrentChain()) {
        throw JSONRPCError(RPC_INTERNAL_ERROR, 
            "DigiDollar stats index is still syncing. Try again shortly.");
    }
    // ... read stats ...
}

2. No fallback when index exists but LookUpStats() returns empty (low priority)

If pindex is null or LookUpStats() returns std::nullopt (possible during a reorg or with a pruned index), the code falls through with zeros. The else branch (UTXO scan) only runs when g_digidollar_stats_index is entirely absent. Consider adding a fallback to UTXO scanning when the index lookup fails.

3. Inconsistent health semantics across RPCs (informational)

After this fix, estimatecollateral will use systemHealth = 30000 when totalDD == 0, but getdigidollarstats in the same file uses systemHealth = 0 for the same condition (with the comment "No DD minted = 0% health, not 30000%"). Both are defensible — display vs. calculation semantics — but a user calling both RPCs with no DD minted will see conflicting values. A brief comment noting this intentional difference would help future contributors.

4. No regression test (nice-to-have)

The manual testing on a live node is well-documented and convincing. For long-term confidence, a functional test that calls estimatecollateral on a fresh node (without first calling getdigidollarstats) would guard against regressions. Not a merge blocker, but worth considering as a follow-up.

Summary

This is a solid, well-scoped fix for a real user-facing bug. The code quality is high and follows the file's established patterns. The main item I'd like to see addressed before merge is #1 (silent fallthrough on index sync failure) — it's a small change that prevents a potentially misleading health reading. Items #2-4 are minor and could be addressed as follow-ups.

We are currently pulling the branch locally to build and run the full test suite — will report back with results.

Thanks again for the thorough bug analysis and clean fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants