fix(forks,validation): post-fork by default on fresh chains + GCREdit hash mismatch#861
Conversation
…fix GCREdit hash mismatch on post-fork txs
Two coupled fixes that surfaced while debugging why a freshly-booted
node refused every transfer with `GCREdit mismatch` even on a chain
whose genesis.balances had loaded correctly.
1) DEFAULT_FORK_CONFIG.osDenomination.activationHeight: null → 0
Every chain shipped via `./run -b true` / `wipe_and_reboot.sh` is a
brand-new chain with no pre-fork peers to maintain wire compatibility
with. The previous default (`null`) was a back-compat hedge for the
incentives-campaign testnet that crossed the fork mid-flight; that
window has closed and operators on a clean docker boot were silently
ending up pre-fork (only whole-DEM transfers accepted, sub-DEM
SubDemPrecisionError on every "10% of a non-round balance" op).
gasFeeSeparation stays at `null` because its activation requires a
real treasury address — the placeholder zero address is loader-rejected
when activationHeight !== null. Operators who want it on a fresh chain
set both fields explicitly.
Operators upgrading an existing pre-fork chain MUST still pin
`activationHeight: <future block>` explicitly in
data/genesis.json.forks.osDenomination — overriding the default to
null remains the safe path for a live cross-fork upgrade.
`data/genesis.json` in this repo is correspondingly updated:
osDenomination.activationHeight 0 (active from block 0); gasFeeSeparation
stays null until a real treasury is wired in.
2) endpointValidation.handleValidateTransaction normalises the
regenerated gcr_edits through the SDK serializer before hashing.
Root cause: when the fork is active, the SDK serializer rewrites every
`gcr_edits[].amount` for balance/escrow/validatorStake entries to a
canonical OS decimal string (e.g. the 1-DEM gas edit becomes
`amount: "1000000000"`). The node then runs `GCRGeneration.generate(tx)`
to recompute the expected edits, but that helper is a pure author
producing the raw author shape (`amount: 1` as a JS number in DEM for
the gas edit). Hashing the two shapes diverged, surfacing as
`GCREdit mismatch` on every post-fork transfer even though the edit
set was structurally identical (subtract + add + gas + nonce, same
accounts, same base amounts).
Fix: wrap the regenerated edits into a throwaway
`{ ...tx.content, gcr_edits }` envelope and run
`denomination.serializeTransactionContent` on it, then pull `.gcr_edits`
back out. That reuses the SDK's canonical wire-shape transform (which
walks gcr_edits[] per transformEditPostFork / …PreFork) without us
having to mirror the per-edit-type rewrite logic node-side.
Block-height source is `Chain.getLastBlockNumber()`; postFork is
decided by `isForkActive("osDenomination", blockHeight)`, the same
gate the SDK ultimately consults via getNetworkInfo.
3) Pre-existing forks unit tests that asserted "default is null" now
pin activationHeight=null explicitly so they continue exercising
the legacy pre-fork path while the library default is post-fork.
No assertion changes — just an explicit setup line.
Tests:
- 24 forks unit tests now pass on the new default.
- Remaining 6 failures in testing/forks/integration.test.ts +
osDenomination migration tests are pre-existing and reproduce on
stabilisation baseline (verified by stashing this diff and re-running);
out of scope for this PR.
Manual repro on dev.node2 mainnet-beta:
- pre-fix: `bun transfer_10pct.mjs` failed at confirm with "GCREdit mismatch"
- with this fix + wipe_and_reboot.sh applied: confirm + broadcast land,
receiver balance updates within one block.
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
Disabled knowledge base sources:
WalkthroughThis PR updates the ChangesosDenomination fork default and validation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR delivers two coupled fixes for fresh-chain boot issues: it changes the
Confidence Score: 4/5Safe to merge for fresh-chain deployments; live-upgrade operators should be aware of the fork-boundary timing gap documented in the PR description. The default-flip and the normalisation fix both address clearly diagnosed bugs that were blocking every post-fork transfer on fresh chains. The test hardening is correct and well-isolated. The two remaining concerns — the silent src/libs/network/endpointValidation.ts — the normaliseGcrEditsForHash helper deserves a closer look, particularly the empty-array fallback and the block-height sampling point. Important Files Changed
Sequence DiagramsequenceDiagram
participant SDK as SDK (Client)
participant Node as Node (endpointValidation)
participant Chain as Chain.getLastBlockNumber()
participant GCRGen as GCRGeneration.generate()
participant Ser as denomination.serializeTransactionContent()
SDK->>SDK: getNetworkInfo → isForkActive(blockH_sdk)
SDK->>SDK: serializeTransactionContent(content, postFork_sdk)
note over SDK: gcr_edits[].amount = OS string (post-fork)
SDK->>Node: submit Transaction (signed, wire-shape gcr_edits)
Node->>GCRGen: generate(tx) → raw edits (DEM numbers)
Node->>Chain: getLastBlockNumber() → blockH_node
note over Node: isForkActive("osDenomination", blockH_node) → postFork_node
Node->>Ser: "serializeTransactionContent({...tx.content, gcr_edits: regenEdits}, postFork_node)"
Ser-->>Node: "normalisedRegen (gcr_edits[].amount = OS string)"
Node->>Node: hash(normalisedRegen) vs hash(tx.content.gcr_edits)
note over Node: ✅ both OS strings → hashes match
Reviews (1): Last reviewed commit: "fix(forks,validation): make fresh chains..." | Re-trigger Greptile |
| const parsed = JSON.parse(serialised) as { gcr_edits: GCREdit[] } | ||
| return parsed.gcr_edits ?? [] |
There was a problem hiding this comment.
Silent empty fallback masks serializer failure
parsed.gcr_edits ?? [] returns an empty array if denomination.serializeTransactionContent produces JSON without a gcr_edits key (e.g. unexpected content shape or SDK API drift). In that path, JSON.stringify([]) hashes to a fixed value that will never match the SDK edits, so every transaction silently fails with the generic "GCREdit mismatch" error rather than a diagnostic message pointing at the serializer. A log.warn before the fallback would let operators distinguish a genuine edit mismatch from a serializer contract break.
| const blockHeight = await Chain.getLastBlockNumber() | ||
| const postFork = isForkActive("osDenomination", blockHeight) |
There was a problem hiding this comment.
Fork-state divergence at live-upgrade boundary
blockHeight is sampled at node validation time, but the SDK determines postFork when it calls getNetworkInfo before signing. For a chain doing a live upgrade with an explicit activationHeight: N, a transaction signed at block N-1 (SDK sees pre-fork → serializes pre-fork) can arrive at the node after block N is accepted. The node then samples block N or later, normalises as post-fork, and the hash comparison fails even though the transaction is structurally valid. Fresh chains with activationHeight: 0 are unaffected, but a comment noting this known gap for upgrade scenarios would help operators debugging spurious mismatches at fork transitions.
…38,0) so post-fork OS values cannot overflow at INSERT (#863) Concretely: a 10 % transfer out of a 10^18 DEM genesis-funded account is 10^17 DEM × 10^9 OS/DEM = 10^26 OS — three orders of magnitude past PG `bigint` (int8, max ≈ 9.22 × 10^18). Pre-this-fix `validateTransaction` accepted the tx (`getAccountBalance` already reads `gcr_main.balance` as `numeric(38, 0)`) but `insertBlock` crashed the consensus loop: QueryFailedError: value "100000000000000000000000000" is out of range for type bigint The four money-shaped columns on `transactions` (`amount`, `networkFee`, `rpcFee`, `additionalFee`) are widened to `numeric(38, 0)` to match `gcr_main.balance`. They share its `bigintNumericTransformer` so the application-level type stays `bigint` end-to-end — readers / writers don't change shape. Why not just sub-DEM amounts (which fit)? OS-magnitude wire values already land in `amount` post-fork (see DemosTransactions.pay where `wireAmount = amountOs.toString()` post-fork), and any genesis-funded account big enough to be useful (founder + incentives wallets) will trip this on first transfer. Migration: 1779834000000-WidenTransactionsMoneyColsToNumeric. Idempotent ALTER COLUMN … TYPE numeric(38, 0) USING (implicit cast); defaults restored explicitly after the TYPE change (Postgres drops them). `down()` reverses to `bigint` — narrowing is lossy on any row with a post-fork-magnitude value, so the migration log clarifies operators must wipe before reverting. Manual repro on a fresh devnet booted with the funded-genesis fixture: - pre-fix: confirm() returns valid=true, broadcast lands tx in reference_block=2, then node-1 crashes at insertBlock with "value '1e26' is out of range for type bigint", chain stops at block 2, no receiver-side balance update visible. - post-fix: chain advances to block 9+, receiver balance moves from 10^27 OS (genesis) to 1.1×10^27 OS (genesis + 10^26 OS transfer). GCREdit hash compare still passes (PR #861 is the predecessor here). Why 9 decimals and not 18 (Ethereum-style): Demos uses 1 DEM = 10^9 OS. The overflow comes from supply magnitude (founder accounts pre-funded at 10^18 DEM), not decimal count. Co-authored-by: tcsenpai <tcsenpai@discus.sh>
…he hash compare (#867) dev.node2 production logs after PR #861 still showed: gcrEditsHash: 16f9495100b98... (constant across runs) txGcrEditsHash: 6dabe23e82ad... (varies per run) → GCREdit mismatch The regen side normalises (txhash blanked + serializer-walk) but the tx-side input was hashed straight from `tx.content.gcr_edits` without the same treatment. Whatever field varies in the wire payload — most likely a populated `txhash` on inbound edits some SDK flow ships — ended up in the hash and diverged from the blanked-regen value. The right thing is symmetry: run both sides through the IDENTICAL normalisation. After this patch: - txhash: forced empty on both regen and tx-side edits before hash - amount shape: both sides walk through `normaliseGcrEditsForHash` so the serializer's per-edit-type rewrite applies uniformly Any future SDK or peer that ships gcr_edits with a different edit-internal shape (re-keyed JSON object order, populated txhash, post-fork OS string vs pre-fork DEM number on the amount field) now hashes identically on both sides of the comparison. The compare is truly checking "did the SDK generate the same logical edit set as GCRGeneration.generate?" instead of "does the SDK happen to produce byte-identical JSON to GCRGeneration.generate?". Verified by manual repro on dev.node2 after a rebuild that picks this up: - pre-fix: confirm() returns valid=false, "GCREdit mismatch" - post-fix: confirm() returns valid=true, broadcast lands (Pre-existing PR #861 normalisation was load-bearing on the regen side; this PR makes the symmetric move on the tx-side input.) Co-authored-by: tcsenpai <tcsenpai@discus.sh>
…ion so applyGasFeeSeparation prepends don't break the hash compare (#871) Root cause of the dev.node2 GCREdit mismatch that survived PRs #861/#867/#869: confirmTransaction runs `applyGasFeeSeparation` when the gasFeeSeparation fork is active, which PREPENDS node-computed fee edits onto `tx.content.gcr_edits` in place. The hash comparison ran AFTER that mutation, so: tx.content.gcr_edits (mutated): [fee_a, fee_b, ..., subtract, add, gas, nonce] GCRGeneration.generate(tx): [subtract, add, gas, nonce] The compare was diverging by exactly the prepended fee edits — a structural mismatch the SDK has no way to predict because the fee distribution is the validator's computation, not the signer's. dev.node2 reproduced and the local devnet did NOT because the local devnet boots with `gasFeeSeparation.activationHeight: null` → `applyGasFeeSeparation` is gated off → no prepend → the (already shipped) #861/#867 normalisation was sufficient. Fix: snapshot `tx.content.gcr_edits` via structured clone BEFORE calling `confirmTransaction`, then run the hash compare against the snapshot instead of the mutated array. The invariant we want to verify is "did the SDK ship the same edits GCRGeneration would regenerate?" — that's what the snapshot captures. Fee edits live downstream of this check and become part of the signed validity data via the existing `signValidityData` flow. Verified by tracing call sites: - applyGasFeeSeparation.ts:187 mutates tx.content.gcr_edits in place - validateTransaction.ts:133 calls it inside confirmTransaction - endpointValidation.ts:35 calls confirmTransaction, line ~42 was then reading the already-mutated tx.content.gcr_edits Manual repro from earlier in the session: BROADCAST=1 bun transfer_10pct.mjs against dev.node2 returned `valid: false` / "GCREdit mismatch" on every attempt despite the source on disk being correct (post-#867). Tracing showed regen produced a constant `16f9495100b98…` hash (the SDK-shipped 4-edit set) while tx-side hash varied per run — and the varying part was the SHA-256 of the prepended fee edits, whose contents depend on per-block fee state. Co-authored-by: tcsenpai <tcsenpai@discus.sh>
Summary
Two coupled fixes surfaced while debugging why a freshly-booted node refused every transfer with
GCREdit mismatchon a chain whosegenesis.balanceshad loaded correctly (PR #858 prerequisite applied).Change 1 —
osDenomination.activationHeightdefault:null→0Every chain shipped via
./run -b true/wipe_and_reboot.sh/docker --cleanis a brand-new chain with no pre-fork peers to maintain wire compatibility with. The previous default (null) was a back-compat hedge for the incentives-campaign testnet that crossed the fork mid-flight; that window has closed.With the old default, operators on a clean docker boot silently ended up pre-fork — only whole-DEM transfers accepted,
SubDemPrecisionErroron every "10% of a non-round balance" op.gasFeeSeparationstays atnullbecause its activation requires a real treasury address (the placeholder zero is loader-rejected whenactivationHeight !== null). Operators who want it on a fresh chain set both fields explicitly.Cross-fork upgrade path preserved. Operators upgrading an existing pre-fork chain MUST still pin
activationHeight: <future block>explicitly — overriding tonullremains the safe path for a live cross-fork upgrade.data/genesis.jsonin this repo updated to match:osDenomination.activationHeight: 0.Change 2 —
endpointValidationnormalises regeneratedgcr_editsbefore hashingRoot cause of the
GCREdit mismatcherrors:gcr_edits[].amountfor balance/escrow/validatorStake entries to a canonical OS decimal string when the fork is active. The 1-DEM gas edit becomesamount: "1000000000".GCRGeneration.generate(tx)to recompute the expected edits. That helper is a pure author producing the raw author shape (amount: 1as a JS number in DEM for the gas edit).GCREdit mismatchon every post-fork transfer even though the edit set is structurally identical (subtract + add + gas + nonce, same accounts, same base amounts).Fix: wrap the regenerated edits into a throwaway
{ ...tx.content, gcr_edits }envelope and rundenomination.serializeTransactionContenton it, then pull.gcr_editsback out. That reuses the SDK's canonical wire-shape transform (pertransformEditPostFork/…PreFork) without us having to mirror the per-edit-type rewrite logic node-side.Block-height source is
Chain.getLastBlockNumber();postForkis decided byisForkActive("osDenomination", blockHeight), the same gate the SDK ultimately consults viagetNetworkInfo.Change 3 — Forks unit-test setup hardened
Pre-existing forks unit tests that asserted "default is null" now pin
activationHeight = nullexplicitly so they continue exercising the legacy pre-fork path while the library default is post-fork. No assertion changes — just an explicit setup line per test.Manual repro on
dev.node2.demos.sh:53552bun transfer_10pct.mjsfailed atconfirm()with"GCREdit mismatch".wipe_and_reboot.shapplied: confirm + broadcast land, receiver balance updates within one block (verified via repeatedgetAddressInfopolls).Diagnosis artefacts
Side-by-side dump of SDK-shipped
tx.content.gcr_editsvsGCRGeneration.generate(tx)regen (seedebug_gcr_edits.mjs— kept out of this PR; sanity script):Test plan
bun test testing/forks/— 156 pass / 9 skip (Postgres-gated)bun run type-check-ts— no new errors introduced (pre-existingl2ps-messaging+worker-threads-testerrors unchanged)dev.node2returns "broadcast OK" + receiver balance incrementdev.node3: requireswipe_and_reboot.shfirst (chain isn't yet on a build with PR fix(genesis): overlay genesisData.balances on snapshot rows #858)Pre-existing failures NOT introduced by this PR
bun test testing/forks/shows 6 failures that reproduce onstabilisationbaseline:forks integration (P3d) > scenario 1: snapshot replay across fork height …osDenomination migration > Test 3 / 4 / 6 / 9 / 10Verified by stashing this diff and re-running — same failures. Out of scope.
Summary by CodeRabbit
Configuration Changes
Bug Fixes
Tests