Skip to content

chore: prevent non-determinsm between validators and tests#1663

Merged
MicBun merged 3 commits into
mainfrom
QADeterminism
Jan 12, 2026
Merged

chore: prevent non-determinsm between validators and tests#1663
MicBun merged 3 commits into
mainfrom
QADeterminism

Conversation

@MicBun
Copy link
Copy Markdown
Collaborator

@MicBun MicBun commented Jan 12, 2026

resolves: https://github.com/trufnetwork/bridge-validator/issues/2

Summary by CodeRabbit

  • Tests

    • Added extensive ERC20 bridge determinism tests covering cross-node consistency, signature verification, reward aggregation, message-hash determinism, voting-power commutativity, duplicate/empty-epoch cases, and Merkle ordering.
  • Bug Fixes

    • Fixed Ethereum-signed message signature verification to use the proper prefixed hash and standard V handling.
    • Enforced deterministic ordering for reward query results.
  • Style

    • Adjusted import ordering for test shims.

✏️ Tip: You can customize this high-level summary in your review settings.

@MicBun MicBun self-assigned this Jan 12, 2026
@MicBun MicBun added the consensus-change PR that contains consensus changes label Jan 12, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 12, 2026

📝 Walkthrough

Walkthrough

Adds comprehensive determinism tests for the ERC20 bridge (cross-node and single-node), updates signature recovery to use Ethereum-prefixed message hashing and normalizes V values, and makes getRewardsForEpoch query results deterministic via an ORDER BY clause.

Changes

Cohort / File(s) Summary
Cross-node determinism tests
node/exts/erc20-bridge/erc20/cross_node_determinism_test.go
Adds 5 kwiltest-guarded tests that simulate two independent PostgreSQL-backed nodes to assert cross-node Merkle root parity, signature verification consistency (including corrupted sig rejection), deterministic reward aggregation, epoch message hash stability, and commutative voting power.
Single-node determinism tests
node/exts/erc20-bridge/erc20/determinism_test.go
Adds 6 kwiltest-guarded tests covering duplicate withdrawal aggregation within an epoch, Merkle-tree determinism under shuffled inserts, deterministic ordering for equal amounts, stable signature ordering across queries, BFT/voting edge cases (some skipped), and empty-epoch handling.
Signature recovery tests update
node/exts/erc20-bridge/erc20/validator_signer_test.go
Changes test recovery to compute Ethereum-prefixed message hash (keccak256(prefix + message)) and normalizes V from Gnosis-style (31/32) to standard Ethereum (27/28) before calling crypto.SigToPub.
Deterministic query ordering
node/exts/erc20-bridge/erc20/meta_sql.go
Adds ORDER BY recipient ASC, amount ASC to getRewardsForEpoch SQL query to produce deterministic result ordering.
Import housekeeping (tests)
node/exts/erc20-bridge/erc20/for_test_epoch_shims.go
Minor import reorder only; no functional or API changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • pr-time-tracker

Poem

🐰 I hopped between two nodes with cheer,

roots matched and signatures clear;
Rewards lined up, totals true,
Votes agreed in varied view—
A carrot-coded test for you! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 4
❌ Failed checks (2 warnings, 2 inconclusive)
Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The linked issue #2 requests withdraw integration (TN -> Mainnet) support and finality query mechanisms, but the PR adds only determinism tests without implementing the requested withdraw functionality. Either implement the withdraw integration and finality query logic described in issue #2, or clarify that this PR is preparatory work for a subsequent withdraw implementation PR.
Description check ⚠️ Warning The PR has no description in the body; it only references issue #2 without explaining how the added tests relate to withdraw integration or what determinism issues they address. Add a detailed description explaining: the determinism issues discovered, how they affect validator operations, and how these tests prevent regressions in withdraw integration.
Title check ❓ Inconclusive The title mentions preventing 'non-determinism' (with a typo) between validators and tests, but it's unclear what specific non-determinism issues are being addressed. Clarify the title to specifically describe what non-determinism is being prevented, e.g., 'Add determinism tests for ERC20 bridge validator operations' or 'Ensure deterministic ordering in ERC20 reward processing.'
Out of Scope Changes check ❓ Inconclusive The PR adds extensive determinism tests (5 cross-node tests, 6 single-node tests) and updates validator signing logic, which are not explicitly mentioned in the linked issue about withdraw integration. Clarify whether the determinism test infrastructure is a dependency for the withdraw implementation, or if these tests address separate determinism concerns discovered during development.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@holdex
Copy link
Copy Markdown

holdex Bot commented Jan 12, 2026

Time Submission Status

Member Status Time Action Last Update
MicBun ✅ Submitted 4h Update time Jan 12, 2026, 3:03 PM

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
node/exts/erc20-bridge/erc20/validator_signer_test.go (1)

297-318: Test assertion contradicts ON CONFLICT DO NOTHING behavior.

The test calls voteEpoch twice with the same (epoch_id, voter, nonce) combination (the PRIMARY KEY) but different signatures. The voteEpoch function uses ON CONFLICT DO NOTHING, which silently ignores the second insert. The test should verify that sig1 remains stored, not sig2. Either the assertion at line 317 is incorrect, or the test is not catching the actual behavior.

Additionally, the comment at line 304 is misleading—it says "duplicate vote updated" but the SQL does neither: it simply ignores the duplicate insert.

🤖 Fix all issues with AI agents
In @node/exts/erc20-bridge/erc20/cross_node_determinism_test.go:
- Around line 558-572: The test double-prefixes the message:
computeEpochMessageHash produces the raw messageHash, you then manually apply
EthereumSignedMessagePrefix to create ethSignedMessageHash and pass that into
signMessage which itself prefixes again; instead pass the raw messageHash into
signMessage (i.e., replace passing ethSignedMessageHash with messageHash) so
signatures match production behavior, and ensure voteEpoch is still called with
the signature returned by signMessage and that verification paths expect the
single-prefix signing scheme.
- Around line 244-249: The test double-prefixes the message: it constructs
ethSignedMessageHash by applying EthereumSignedMessagePrefix and Keccak256 then
calls signMessage which re-adds the prefix; fix by removing the manual prefixing
and pass the raw messageHash into signMessage (i.e., stop using
ethSignedMessageHash/prefix here and call signMessage(messageHash,
validatorKey)), ensuring references to EthereumSignedMessagePrefix and
ethSignedMessageHash are removed or only used where appropriate so the final
signed value matches the behavior in validator_signer_test.go.

In @node/exts/erc20-bridge/erc20/determinism_test.go:
- Around line 423-443: computeEpochMessageHash already returns the message to be
Ethereum-prefixed by signMessage, but the test pre-prefixes it into
ethSignedMessageHash (using EthereumSignedMessagePrefix) and then calls
signMessage, causing double-prefixing and mismatched signatures; fix by removing
the manual prefixing and pass the raw messageHash into signMessage (or
alternatively change signMessage call to accept a flag for already-prefixed
input), i.e., stop constructing ethSignedMessageHash and call
signMessage(messageHash, key) so signatures are consistent with signMessage's
internal prefixing.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0955c8d and b1fcbf4.

📒 Files selected for processing (5)
  • node/exts/erc20-bridge/erc20/cross_node_determinism_test.go
  • node/exts/erc20-bridge/erc20/determinism_test.go
  • node/exts/erc20-bridge/erc20/for_test_epoch_shims.go
  • node/exts/erc20-bridge/erc20/meta_sql.go
  • node/exts/erc20-bridge/erc20/validator_signer_test.go
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: outerlook
Repo: trufnetwork/kwil-db PR: 1609
File: node/exts/erc20-bridge/erc20/meta_extension.go:665-684
Timestamp: 2025-09-16T11:49:48.043Z
Learning: Kwil enforces deterministic ordered transaction processing (tx by tx), which eliminates race conditions and TOCTOU vulnerabilities that would exist in traditional concurrent database systems. Balance checks followed by balance updates are effectively atomic at the transaction level.
🧬 Code graph analysis (2)
node/exts/erc20-bridge/erc20/validator_signer_test.go (1)
node/exts/erc20-bridge/erc20/meta_extension.go (1)
  • EthereumSignedMessagePrefix (61-61)
node/exts/erc20-bridge/erc20/cross_node_determinism_test.go (5)
node/exts/ordered-sync/for_test_reset.go (1)
  • ForTestingReset (12-12)
node/exts/erc20-bridge/erc20/for_test_shims.go (1)
  • ForTestingResetSingleton (177-189)
node/exts/evm-sync/chains/chains.go (1)
  • GetChainInfoByID (103-111)
node/exts/erc20-bridge/erc20/meta_extension.go (2)
  • PendingEpoch (2021-2025)
  • EthereumSignedMessagePrefix (61-61)
node/exts/erc20-bridge/utils/crypto.go (1)
  • EthStandardVerifyDigest (147-172)
🔇 Additional comments (12)
node/exts/erc20-bridge/erc20/meta_sql.go (1)

469-469: Good fix for deterministic ordering.

Adding ORDER BY recipient ASC, amount ASC ensures consistent reward ordering across all nodes when generating Merkle trees. Since epoch_rewards aggregates by (epoch_id, recipient) via the ON CONFLICT clause in issueReward, there's at most one row per recipient per epoch—making the amount ASC secondary sort technically redundant but harmless as a safeguard.

node/exts/erc20-bridge/erc20/for_test_epoch_shims.go (1)

9-10: No functional changes.

Import reordering only; the logic remains unchanged.

node/exts/erc20-bridge/erc20/validator_signer_test.go (2)

95-107: Correct Ethereum signature recovery handling.

The test now properly accounts for the Ethereum signed message prefix when verifying signature recovery. The flow is:

  1. signMessage internally applies the prefix before signing
  2. For recovery, the test recomputes ethSignedMessageHash with the same prefix
  3. V is adjusted from 27/28 → 0/1 for crypto.SigToPub

This aligns with the EthereumSignedMessagePrefix constant and EthStandardVerifyDigest utility.


531-542: Consistent signature recovery handling.

Same correct pattern as in TestVoteEpochAction—applying the Ethereum prefix for recovery verification.

node/exts/erc20-bridge/erc20/cross_node_determinism_test.go (3)

20-207: Well-designed cross-node determinism test.

This test correctly validates that two independent nodes processing identical withdrawals in different orders produce the same Merkle root. The test properly:

  • Uses separate DB transactions simulating independent nodes
  • Inserts rewards in forward vs. reverse order
  • Uses consistent escrowAddr and blockHash for both computations
  • Asserts root equality

This directly validates the ORDER BY fix in getRewardsForEpoch.


315-444: Good aggregation determinism test coverage.

The table-driven tests thoroughly verify reward aggregation including edge cases with large numbers. This validates the ON CONFLICT DO UPDATE behavior in issueReward.


446-467: Simple and effective hash determinism test.

Validates that computeEpochMessageHash is a pure function producing consistent output.

node/exts/erc20-bridge/erc20/determinism_test.go (5)

20-122: Good test for reward aggregation.

Validates that multiple issueReward calls to the same recipient correctly aggregate via ON CONFLICT DO UPDATE.


124-231: Effective Merkle determinism test.

This test validates the core fix by verifying that shuffled insertion orders produce identical Merkle roots. The expected total (13850) is correctly calculated.


233-350: Good edge case test for duplicate amounts.

This test validates that ORDER BY recipient ASC, amount ASC produces deterministic results even when amounts are identical. The expected lexicographic ordering is correctly specified.


485-588: Good test structure, appropriately skipped.

The BFT threshold test cases are well-defined and document the expected behavior. The skip with TODO comment is appropriate since it requires mock validator infrastructure.


590-667: Good edge case handling test.

Validates that empty epochs (no rewards) are handled gracefully with nil returns, preventing unnecessary finalization.

Comment thread node/exts/erc20-bridge/erc20/cross_node_determinism_test.go Outdated
Comment thread node/exts/erc20-bridge/erc20/cross_node_determinism_test.go
Comment thread node/exts/erc20-bridge/erc20/determinism_test.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @node/exts/erc20-bridge/erc20/determinism_test.go:
- Around line 340-349: The test's expectedOrder uses lowercase addresses but
Recipient.Hex() returns EIP-55 checksummed addresses; update the expectedOrder
values to the EIP-55 checksummed forms (so the entries for addresses containing
letters match common.Address.Hex()), ensuring the expectedOrder array used in
the require.Equal(t, expectedOrder, firstOrdering, ...) call exactly matches the
checksummed output produced by Recipient.Hex().
🧹 Nitpick comments (2)
node/exts/erc20-bridge/erc20/determinism_test.go (1)

484-587: Good test structure, but all cases are skipped.

The test cases are well-defined and cover important BFT threshold edge cases. Consider tracking this TODO in an issue to ensure it's not forgotten.

Would you like me to open an issue to track the implementation of mock validator injection for these BFT threshold tests?

node/exts/erc20-bridge/erc20/cross_node_determinism_test.go (1)

211-224: Unused database connections in signature verification test.

The test creates db1 and db2 but never uses them since signature verification is purely cryptographic and doesn't require database access. Consider removing these to simplify the test.

♻️ Suggested simplification
 func TestCrossNodeSignatureVerification(t *testing.T) {
-	// Create two separate database connections
-	db1, err := newTestDB()
-	if err != nil {
-		t.Skip("PostgreSQL not available")
-	}
-	defer db1.Close()
-
-	db2, err := newTestDB()
-	if err != nil {
-		t.Skip("PostgreSQL not available")
-	}
-	defer db2.Close()
-
 	orderedsync.ForTestingReset()
 	defer orderedsync.ForTestingReset()
 	ForTestingResetSingleton()
 	defer ForTestingResetSingleton()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b1fcbf4 and 84e150e.

📒 Files selected for processing (3)
  • node/exts/erc20-bridge/erc20/cross_node_determinism_test.go
  • node/exts/erc20-bridge/erc20/determinism_test.go
  • node/exts/erc20-bridge/erc20/validator_signer_test.go
🧰 Additional context used
🧬 Code graph analysis (3)
node/exts/erc20-bridge/erc20/validator_signer_test.go (1)
node/exts/erc20-bridge/erc20/meta_extension.go (1)
  • EthereumSignedMessagePrefix (61-61)
node/exts/erc20-bridge/erc20/determinism_test.go (4)
node/exts/ordered-sync/for_test_reset.go (1)
  • ForTestingReset (12-12)
node/exts/erc20-bridge/erc20/for_test_shims.go (1)
  • ForTestingResetSingleton (177-189)
node/exts/evm-sync/chains/chains.go (1)
  • GetChainInfoByID (103-111)
node/exts/erc20-bridge/erc20/meta_extension.go (1)
  • PendingEpoch (2021-2025)
node/exts/erc20-bridge/erc20/cross_node_determinism_test.go (5)
node/exts/erc20-bridge/erc20/for_test_shims.go (1)
  • ForTestingResetSingleton (177-189)
node/exts/evm-sync/chains/chains.go (1)
  • GetChainInfoByID (103-111)
common/common.go (1)
  • Engine (206-224)
node/exts/erc20-bridge/erc20/meta_extension.go (1)
  • PendingEpoch (2021-2025)
node/exts/erc20-bridge/utils/crypto.go (1)
  • EthStandardVerifyDigest (147-172)
🔇 Additional comments (11)
node/exts/erc20-bridge/erc20/validator_signer_test.go (3)

95-107: LGTM! Correct Ethereum signature recovery with prefixed message hash.

The update properly handles the Ethereum signed message prefix by computing ethSignedMessageHash and adjusting V from 27/28 to 0/1 for crypto.SigToPub. This aligns with the EthereumSignedMessagePrefix constant from meta_extension.go and matches how signMessage creates signatures.


297-317: LGTM! Comments accurately document ON CONFLICT DO NOTHING behavior.

The updated comments correctly explain that duplicate votes are silently ignored and the first signature is retained, which aligns with the test assertions.


531-542: LGTM! Consistent signature recovery pattern.

This follows the same correct pattern as in TestVoteEpochAction, ensuring consistency across tests.

node/exts/erc20-bridge/erc20/determinism_test.go (4)

20-122: LGTM! Good test for reward aggregation via ON CONFLICT DO UPDATE.

This test properly verifies that multiple withdrawals by the same recipient in the same epoch are aggregated into a single row with the summed amount.


124-231: LGTM! Well-structured determinism test for Merkle tree computation.

The test correctly verifies that shuffling the insertion order of withdrawals produces identical Merkle roots across all iterations.


352-482: LGTM! Good test for signature ordering determinism.

The test properly verifies that getEpochSignatures returns signatures in a consistent order across multiple queries, regardless of insertion order.


589-666: LGTM! Good edge case test for empty epochs.

This test properly verifies that epochs with no rewards are handled gracefully, returning nil values to signal that the epoch should not be finalized.

node/exts/erc20-bridge/erc20/cross_node_determinism_test.go (4)

20-207: LGTM! Well-structured cross-node Merkle consistency test.

This test effectively simulates two nodes processing the same withdrawals in different orders and verifies they compute identical Merkle roots, which is critical for consensus.


288-417: LGTM! Comprehensive aggregation determinism tests.

Good table-driven test structure covering various amount combinations including large numbers near uint256 limits. The test complements TestDuplicateWithdrawalsSameEpoch from determinism_test.go by testing more edge cases.


419-440: LGTM! Simple and effective determinism test for message hash computation.


442-567: LGTM! Good test for voting power commutativity.

The test effectively verifies that the order in which validators submit their votes doesn't affect the final vote count, which is important for consensus consistency.

Comment thread node/exts/erc20-bridge/erc20/determinism_test.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
node/exts/erc20-bridge/erc20/determinism_test.go (1)

543-587: Consider moving t.Skip to the beginning of each subtest.

The t.Skip at line 585 runs after significant setup work (database transaction, instance creation, epoch creation, finalization). Moving the skip to the beginning of the subtest would avoid unnecessary resource allocation.

💡 Suggested improvement
 	for _, tc := range testCases {
 		t.Run(tc.name, func(t *testing.T) {
+			// TODO: Requires mock validator injection - implement in separate PR
+			t.Skip("Requires mock validator injection - TODO in separate PR")
+
 			tx, err := db.BeginTx(ctx)
 			require.NoError(t, err)
 			defer tx.Rollback(ctx)
 
 			app := setup(t, tx)
 			// ... rest of setup ...
-
-			// Mock validators with specified voting power
-			// TODO: This requires injecting mock validators into app.Validators
-			// For now, skip this test as it requires more infrastructure
-			t.Skip("Requires mock validator injection - TODO in separate PR")
 		})
 	}
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 84e150e and 6b8d717.

📒 Files selected for processing (2)
  • node/exts/erc20-bridge/erc20/cross_node_determinism_test.go
  • node/exts/erc20-bridge/erc20/determinism_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • node/exts/erc20-bridge/erc20/cross_node_determinism_test.go
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: outerlook
Repo: trufnetwork/kwil-db PR: 1609
File: node/exts/erc20-bridge/erc20/meta_extension.go:665-684
Timestamp: 2025-09-16T11:49:48.043Z
Learning: Kwil enforces deterministic ordered transaction processing (tx by tx), which eliminates race conditions and TOCTOU vulnerabilities that would exist in traditional concurrent database systems. Balance checks followed by balance updates are effectively atomic at the transaction level.
🧬 Code graph analysis (1)
node/exts/erc20-bridge/erc20/determinism_test.go (3)
node/exts/ordered-sync/for_test_reset.go (1)
  • ForTestingReset (12-12)
node/exts/erc20-bridge/erc20/for_test_shims.go (1)
  • ForTestingResetSingleton (177-189)
node/exts/erc20-bridge/erc20/meta_extension.go (1)
  • PendingEpoch (2021-2025)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test
🔇 Additional comments (6)
node/exts/erc20-bridge/erc20/determinism_test.go (6)

1-18: LGTM!

Build tag and imports are appropriate. Using math/rand for test shuffling is acceptable since cryptographic randomness isn't required for determinism testing.


20-122: LGTM!

Well-structured test that verifies the ON CONFLICT DO UPDATE aggregation behavior. The test correctly validates that multiple withdrawals by the same recipient in the same epoch produce a single aggregated row with the correct sum.


124-231: LGTM!

Excellent determinism test that validates Merkle tree consistency across different insertion orders. The approach of running 10 iterations with shuffled inputs and verifying identical roots is effective for catching non-deterministic behavior. This directly addresses the PR's objective of preventing non-determinism between validators.


233-351: LGTM!

This test effectively validates that the ORDER BY recipient ASC, amount ASC clause (mentioned in the AI summary for meta_sql.go) produces deterministic ordering even when amounts are identical. The expected order correctly reflects byte-wise sorting of Ethereum addresses, and the EIP-55 checksummed format matches the Recipient.Hex() output.


353-483: LGTM!

Good test for vote ordering determinism. The approach of inserting votes in random order and then verifying consistent retrieval order across multiple queries is effective. The comment at line 441 clarifies that signMessage handles Ethereum prefix internally, which aligns with the validator_signer_test.go changes mentioned in the summary.


590-667: LGTM!

Good edge case test that validates proper handling of empty epochs. Verifying that genMerkleTreeForEpoch returns zero leaves and nil values for an epoch with no rewards is important for preventing consensus issues when validators encounter empty epochs.

@MicBun MicBun merged commit a72f8a7 into main Jan 12, 2026
4 checks passed
@MicBun MicBun deleted the QADeterminism branch January 12, 2026 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

consensus-change PR that contains consensus changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant