
fix(signing): pin Mesh SDK + reject witnesses that don't verify against the tx body #257

Merged: QSchlegel merged 7 commits into preprod from fix/witness-signature-mismatch-preprod on May 24, 2026

Conversation

@QSchlegel
Collaborator

Summary

A user submitting a Ballot Vote on a 2-of-3 DRep multisig hit ConwayUtxowFailure (InvalidWitnessesUTXOW [VKey ce7cbe8f…]). Decoding the failing tx with this repo's CSL showed:

  • blake2b-224(ce7cbe8f…) = f6ed79ef… — one of the 3 script signers. The key is correct.
  • pk.verify(currentBodyHash, signature) = false — the signature is over a different body than what we're submitting.
  • CSL body round-trip is byte-identical in our pipeline, so the body was not mutated locally.

The wallet signed a re-canonicalised body; our mergeSignerWitnesses then grafted only the vkey witness onto our original body, producing a tx whose witness can never verify on chain.
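
The mismatch can be reproduced outside CSL. The following is a minimal sketch using Node's built-in Ed25519 support rather than the repo's CSL code; `witnessVerifies` and the two fixed 32-byte "body hashes" are hypothetical stand-ins (a real body hash is blake2b-256 over the body CBOR). It shows a signature made over one body hash failing against another even though the key itself is valid.

```typescript
import { createPublicKey, generateKeyPairSync, sign, verify } from "node:crypto";

// DER prefix that wraps a raw 32-byte Ed25519 public key as SubjectPublicKeyInfo.
const ED25519_SPKI_PREFIX = Buffer.from("302a300506032b6570032100", "hex");

/** True when `signature` is a valid Ed25519 signature by `rawPubKey` over `bodyHash`. */
function witnessVerifies(
  rawPubKey: Uint8Array,
  bodyHash: Uint8Array,
  signature: Uint8Array,
): boolean {
  if (rawPubKey.length !== 32 || signature.length !== 64) return false;
  const key = createPublicKey({
    key: Buffer.concat([ED25519_SPKI_PREFIX, rawPubKey]),
    format: "der",
    type: "spki",
  });
  // Ed25519 takes no separate digest algorithm, hence `null`.
  return verify(null, bodyHash, key, signature);
}

// Demo: the "wallet" signs its own re-encoded body, not the one the app built.
const demoKeys = generateKeyPairSync("ed25519");
const demoRawPub = demoKeys.publicKey
  .export({ format: "der", type: "spki" })
  .subarray(-32); // last 32 bytes of the SPKI encoding are the raw key
const appBodyHash = Buffer.alloc(32, 0x01); // placeholder for hash(our body)
const walletBodyHash = Buffer.alloc(32, 0x02); // placeholder for hash(wallet body)
const walletSignature = sign(null, walletBodyHash, demoKeys.privateKey);

console.log(witnessVerifies(demoRawPub, walletBodyHash, walletSignature));
console.log(witnessVerifies(demoRawPub, appBodyHash, walletSignature));
```

The first check passes and the second fails, which mirrors the diagnosis: a correct key, but a signature over a different body.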

Root cause

package.json had ^1.9.0-beta.87 on @meshsdk/core/core-csl/core-cst/provider/react; the lockfile had drifted to 1.9.0-beta.102. Some patch in those 15 betas changed how Mesh emits CBOR (likely Conway voting_procedures map encoding), so app-built bodies no longer match what the wallet's encoder produces.
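
To illustrate why the caret range permitted the drift, here is a deliberately simplified sketch. It is not node-semver: `parseBeta` and `caretAllows` are hypothetical helpers that only understand `x.y.z-beta.N` strings, and real caret ranges accept more than this (for example the eventual stable releases).

```typescript
interface BetaVersion {
  major: number;
  minor: number;
  patch: number;
  beta: number;
}

// Parses only the `x.y.z-beta.N` shape used by the Mesh prereleases in this PR.
function parseBeta(version: string): BetaVersion {
  const m = /^(\d+)\.(\d+)\.(\d+)-beta\.(\d+)$/.exec(version);
  if (!m) throw new Error(`unsupported version string: ${version}`);
  return {
    major: Number(m[1]),
    minor: Number(m[2]),
    patch: Number(m[3]),
    beta: Number(m[4]),
  };
}

// Simplified rule: with a caret, any later beta on the same x.y.z tuple is
// accepted; without a caret, only the exact version is accepted.
function caretAllows(range: string, candidate: string): boolean {
  if (!range.startsWith("^")) return range === candidate; // exact pin
  const base = parseBeta(range.slice(1));
  const c = parseBeta(candidate);
  const sameTuple =
    c.major === base.major && c.minor === base.minor && c.patch === base.patch;
  return sameTuple && c.beta >= base.beta;
}

console.log(caretAllows("^1.9.0-beta.87", "1.9.0-beta.102")); // caret range
console.log(caretAllows("1.9.0-beta.87", "1.9.0-beta.102")); // exact pin
```

Under the caret range, `1.9.0-beta.102` is an acceptable resolution of `^1.9.0-beta.87`; with an exact pin it is not.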

Fix (two parts)

  1. Pin Mesh deps to exact .87 (.86 for provider) — no more silent patch drift.
  2. Client-side guard: mergeSignerWitnesses now returns { txHex, invalidVkeyPubKeysHex }, verifying each newly-merged vkey's signature against the merged tx body hash. transaction-card.signTx() and useTransaction.newTransaction() abort with a destructive toast (and skip submission + DB persistence) when the wallet returns a witness that doesn't verify. The offending pubkey is logged.
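
The call-site behaviour in part 2 can be sketched as a pure decision function. Only the `{ txHex, invalidVkeyPubKeysHex }` shape comes from this PR; `decideAfterMerge`, the `SignOutcome` type, and the toast wording are hypothetical and not the code in transaction-card or useTransaction.

```typescript
// Shape returned by mergeSignerWitnesses, as described in this PR.
interface MergeResult {
  txHex: string;
  invalidVkeyPubKeysHex: string[];
}

// Hypothetical outcome type for illustration.
type SignOutcome =
  | { action: "submit"; txHex: string }
  | { action: "abort"; toastMessage: string; offendingPubKeysHex: string[] };

// Decide whether a merged tx may be submitted and persisted.
function decideAfterMerge(result: MergeResult): SignOutcome {
  if (result.invalidVkeyPubKeysHex.length > 0) {
    // At least one newly merged witness does not verify against the body hash:
    // skip submission and persistence, and surface the offending keys.
    return {
      action: "abort",
      toastMessage:
        "Your wallet returned a signature that does not match this transaction.",
      offendingPubKeysHex: [...result.invalidVkeyPubKeysHex],
    };
  }
  return { action: "submit", txHex: result.txHex };
}
```

A caller would show a destructive toast and log `offendingPubKeysHex` on `abort`, and only reach the submit and DB-persist steps on `submit`.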

Also:

  • jest.config.mjs: one-line moduleNameMapper for libsodium-wrappers-sumo's broken relative import (import "./libsodium-sumo.mjs" resolves to the separate libsodium-sumo package, which Node handles via exports but Jest's ESM resolver doesn't). Pre-existing CI gap that silently broke any test importing real CSL.
  • .gitattributes: mark package-lock.json as linguist-generated so the diff UI collapses it.

Test plan

  • New unit test src/__tests__/mergeSignerWitnesses.test.ts: 3 cases (happy, mismatched body, pre-existing witness preserved) — all pass.
  • npx tsc --noEmit on touched files: clean.
  • Decoded the user-shared failing txCbor manually; confirmed hash(vkey) == 'f6ed79ef…' (valid signer) but pk.verify(body, sig) is false. The new guard catches exactly this case before submit.
  • After deploy to preprod: user re-attempts the failing vote. Either it submits (the pinned Mesh produces bytes the wallet agrees with) or they see the new actionable error.

Pre-existing test failures on preprod (proxyCiPreflight, proxyDRepInfo, signTransaction mock missing isBotJwt, pendingTransactions, apiSecurity) are in files unrelated to this fix.

🤖 Generated with Claude Code

…st the tx body

A user submitting a "Ballot Vote" on a 2-of-3 DRep multisig hit
`ConwayUtxowFailure (InvalidWitnessesUTXOW [VKey ce7cbe8f...])`. Decoding the
failing tx with this repo's CSL showed the witness's vkey hashes to a *valid*
script signer, but its Ed25519 signature does not verify against the tx body
hash. The body itself is byte-stable across CSL round-trips, so the wallet
must have signed a different (re-canonicalised) body.

Cause: package.json had `^1.9.0-beta.87` on @meshsdk/core/core-csl/core-cst/
provider/react; the lockfile had drifted to 1.9.0-beta.102. Some patch in
those 15 betas changed how Mesh emits CBOR (likely voting_procedures), so
app-built bodies no longer match what the wallet's encoder produces — the
wallet's signature verifies against its body but not ours.

Fix:
- Pin Mesh deps to exact .87 (.86 for provider) — no more silent patch drift.
- `mergeSignerWitnesses` returns `{ txHex, invalidVkeyPubKeysHex }`,
  verifying each newly-merged vkey's signature against the merged tx body
  hash. Witnesses already on the tx are not re-verified.
- transaction-card.signTx() and useTransaction.newTransaction() abort with a
  destructive toast (and skip submission + persistence) when the wallet
  returns a witness that doesn't verify. Offending pubkey is logged for
  debugging. The wallet user gets an actionable message instead of a chain
  rejection.
- jest.config.mjs: one-line moduleNameMapper for libsodium-wrappers-sumo's
  broken relative `./libsodium-sumo.mjs` import (pre-existing CI gap).
- .gitattributes: mark package-lock.json as linguist-generated so diff
  reviewers can skip the lockfile churn.

Test plan:
- New unit test `mergeSignerWitnesses.test.ts`: 3 cases (happy, mismatched
  body, pre-existing witness preserved) — all pass.
- `npx tsc --noEmit` on touched files: clean.
- Manual: decoded the user-shared failing txCbor; confirmed
  `blake2b-224(vkey) == 'f6ed79ef...'` (valid signer) but
  `pk.verify(body, sig)` is false. The new guard catches exactly this case
  before submit.
- Pre-existing test failures on preprod (proxyCiPreflight, proxyDRepInfo,
  signTransaction mock missing isBotJwt, pendingTransactions, apiSecurity)
  are in files unrelated to this fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| multisig | Ready | Preview, Comment | May 24, 2026 9:00pm |


…resolution

The 0.7.16 release ships `dist/modules-sumo-esm/libsodium-wrappers.mjs` with
a relative `import "./libsodium-sumo.mjs"` whose target file isn't in the
package — the actual sumo binary lives in the separate `libsodium-sumo`
package and is wired up via package.json `exports`. Node's strict ESM
resolver (used by `tsx` in the CI smoke runner, and by Next.js at build
time) doesn't follow the cross-package indirection and throws
`ERR_MODULE_NOT_FOUND` before any Mesh helper has a chance to run.

Preprod's pre-existing lockfile pinned 0.7.10, which works. Regenerating the
lockfile to land the Mesh pin pulled 0.7.16 (the latest in the ^0.7.5 range
the cardano-sdk transitively allows), breaking both `multisig-v1-smoke` and
the Vercel build. Override to 0.7.10 forces all `@cardano-sdk/crypto`
copies onto the working version.

Verified: `node --input-type=module -e "import('@meshsdk/core')"` now
succeeds (failed before the override). Unit tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… first signer

CI failure on the previous push exposed two issues with my initial pin to
.87:

1. **Smoke test "libsodium not initialized"**: pinning Mesh to `.87` pulled
   in different `@cardano-sdk/*` transitive versions than preprod's known-
   good lockfile, and the resulting libsodium-wrappers-sumo / libsodium-sumo
   combination fails WASM init under Node 20 in CI. Preprod runs Mesh
   `.102` for `core/core-csl/core-cst`, `.100` for `provider`, `-40` for
   `react`, and its smoke test passes — so pin to those exact versions.

2. **Pin alone doesn't fix the user**: with `.102` we're back on the same
   CBOR-encoding Mesh that broke their wallet's signature in the first
   place. The verify guard added in the previous commit converts that into
   a friendly error, but the user still can't vote.

   Extended `mergeSignerWitnesses`: when the wallet returns a full signed
   Transaction (not just a witness set), and its body bytes differ from
   ours, and we have no pre-existing witnesses to invalidate, *use the
   wallet's body*. The wallet's vkey signature was made over its body, so
   adopting it makes the signature verify and lets the submit succeed.
   This handles the first-signer case — which is the typical case for
   "I clicked Approve & Sign and it failed" — without changing behaviour
   for multi-signer flows where prior witnesses would be invalidated.
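
The three conditions above can be sketched as a small decision function. This is an illustration only: `chooseBody` and its parameters are hypothetical names, and the real `mergeSignerWitnesses` works on CSL objects rather than hex strings.

```typescript
// Which body should the merged transaction carry?
function chooseBody(
  ourBodyHex: string,
  walletBodyHex: string | null, // null when the wallet returned only a witness set
  existingWitnessCount: number,
): "ours" | "wallet" {
  // Wallet returned a bare witness set: there is no alternative body to adopt.
  if (walletBodyHex === null) return "ours";
  // Bodies already agree: nothing to recover.
  if (walletBodyHex.toLowerCase() === ourBodyHex.toLowerCase()) return "ours";
  // Earlier signers signed our body; swapping it would invalidate them.
  if (existingWitnessCount > 0) return "ours";
  // First signer with a differing body: adopt the bytes the wallet signed.
  return "wallet";
}
```

When the function returns "ours" for a differing body with prior witnesses, the verify guard from the previous commit still reports the mismatch instead of submitting.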

Also dropped the libsodium override commit's package.json entry — it's not
needed once we match preprod's exact Mesh versions, and trying to pin
`libsodium-sumo` to 0.7.10 broke Node 22's WASM loader.

Added unit test covering the body-swap recovery path; existing 3 tests
still pass (4 total).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…alpine

The previous push's lockfile was generated with my local npm 11.10.1, which
produces a `lockfileVersion: 3` lockfile npm 10 considers inconsistent:
`@simplewebauthn/browser@9.0.1` + `@simplewebauthn/types@9.0.1` were marked
"Missing from lock file" and `npm ci` refused to proceed in the Dockerfile.ci
build step. (CI runs `node:20-alpine`, which bundles npm 10.8.2, the version
shown in the notice line of the failure log.)

Same fix the repo has applied twice before: regenerate with the matching
npm version. Confirmed:
- lockfile contains 14 @simplewebauthn entries
- Mesh resolved to the pinned versions (.102 / .100 / -40)
- mergeSignerWitnesses tests still 4/4 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ild green

Vercel's previous attempts failed during webpack compile with:
  Module build failed: UnhandledSchemeError: Reading from "node:crypto" /
  "node:process" is not handled by plugins (Unhandled scheme).
  Import trace: @peculiar/webcrypto → @meshsdk/web3-sdk → @meshsdk/react

`@peculiar/webcrypto@1.7.x` switched its compiled output from `require('crypto')`
to ESM `import "node:crypto"`. webpack 5 (Next 16 `--webpack` mode) doesn't
handle the `node:` scheme without an explicit plugin, and we don't want to
add one — preprod's known-good lockfile resolves to `1.5.0`, which still
uses bare `crypto`. Regenerating the lockfile after pinning Mesh re-resolved
this to `1.7.1` (latest matching `^1.5.0`), reintroducing the issue. Pin to
1.5.0 to match preprod.

Verified: with the override, `node:` imports no longer appear in the webpack
trace; the residual local build failure is missing
`NEXT_PUBLIC_BLOCKFROST_API_KEY_PREPROD` env (Vercel has it set).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@QSchlegel
Collaborator Author

CI status after the dep + libsodium + webcrypto pins

| Check | Status |
| --- | --- |
| Vercel | ✅ pass |
| Vercel Preview Comments | ✅ pass |
| smoke (CI Smoke - Preprod) | ✅ pass |
| multisig-v1-smoke | ❌ fail (pre-existing) |

The remaining multisig-v1-smoke failure is not from this PR. It fails inside scenario.proxy-full-lifecycle, at v1.proxy.full.hygiene.legacy, with:

ConwayUtxowFailure (PPViewHashesDontMatch Mismatch (RelEQ)
  (supplied: SJust (SafeHash "b4bb2ab5…"),
   expected: SJust (SafeHash "a2ef4cbe…")))

That's a script_data_hash mismatch on a PlutusV3 proxy tx — cost-models / redeemer encoding drift between Mesh's MeshTxBuilder.complete() and what the chain computes. The fix lives in PR #256 (refactor/ci-and-browser-tx-building-alignment), which is the only branch with green multisig-v1-smoke runs on the recent history. It touches ~50 files across proxy tx builders, completeTxWithFreshCostModels, and the proxy API endpoints — out of scope for this fix.

This PR only changes client-side witness signing (mergeSignerWitnesses + the two call sites that consume it) and a tightening of dep pins. The proxy bot signs via the server /api/v1/signTransaction route, which is untouched here.

Suggested options:

  • Merge after PR #256 (refactor/ci-and-browser-tx-building-alignment) lands, then rebase to pick up the proxy fix.
  • Or merge now if the proxy-lifecycle failure is acceptable (the fix is independently valuable: real users hitting InvalidWitnessesUTXOW on votes get a clear client-side error instead of a chain rejection).

(For audit: pinned Mesh to preprod's current .102/.100/-40, pinned @peculiar/webcrypto to 1.5.0 to keep webpack happy with node: URIs, regenerated lockfile with npm 10.8.2 to match the Dockerfile.ci runtime.)

@QSchlegel
Collaborator Author

Proof that multisig-v1-smoke failure is pre-existing and unrelated to this PR:

Triggered the smoke workflow against the preprod branch directly with no PR changes: run 26372153640. It fails on v1.proxy.full.hygiene.legacy with a byte-identical hash mismatch:

ConwayUtxowFailure (PPViewHashesDontMatch Mismatch (RelEQ)
  (supplied: SJust (SafeHash "b4bb2ab5a34f50ecdcc0d9a598bc979763caa450ef8e028decd65960a5e7da5d"),
   expected: SJust (SafeHash "a2ef4cbe622dc56e974bb08e80c3d6ac7642b1f86f7c9449d3133090f1dd509d")))

That b4bb2ab5… supplied hash and a2ef4cbe… expected hash are the same values seen on this PR's smoke run. Same proxy script CBOR, same scenario, same failure mode.

Root cause (separate from this PR): preprod's PlutusV3 proxy txs use Mesh's bundled cost models. The Cardano preprod chain has updated its cost models since Mesh 1.9.0-beta.102 published, so the script_data_hash = hash(redeemers ++ datums ++ cost_models_subset) Mesh computes no longer matches what the node recomputes from current params.
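
A toy illustration of that dependency, under loud assumptions: `toyScriptDataHash` is hypothetical, uses SHA-256 as a stand-in for the ledger's blake2b-256, and serialises the cost model as JSON instead of CBOR. It only shows that the same redeemers and datums hash differently once the cost-model values change.

```typescript
import { createHash } from "node:crypto";

// NOT the ledger algorithm: SHA-256 and JSON stand in for blake2b-256 and CBOR.
function toyScriptDataHash(
  redeemers: Uint8Array,
  datums: Uint8Array,
  costModel: number[],
): string {
  return createHash("sha256")
    .update(redeemers)
    .update(datums)
    .update(Buffer.from(JSON.stringify(costModel)))
    .digest("hex");
}

const toyRedeemers = Buffer.from("redeemers");
const toyDatums = Buffer.from("datums");
const bundledCostModel = [100, 200, 300]; // what the tx builder ships with
const liveCostModel = [100, 200, 301]; // what the chain uses after an update

const suppliedToyHash = toyScriptDataHash(toyRedeemers, toyDatums, bundledCostModel);
const expectedToyHash = toyScriptDataHash(toyRedeemers, toyDatums, liveCostModel);
console.log(suppliedToyHash === expectedToyHash);
```

The two hashes differ although the redeemers and datums are unchanged, which is the shape of the PPViewHashesDontMatch failure.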

The fix exists in PR #256: src/lib/server/completeTxWithFreshCostModels.ts, which fetches epochs/latest/parameters from Blockfrost and injects the live cost models into the tx builder before complete(). That PR (~50 files, all proxy + cost-model work) is the only branch with green multisig-v1-smoke runs in recent history. Once it lands on preprod, both this PR and trunk smoke will go green.

Doing that proxy refactor inside this client-side-signing PR is out of scope.

…handling

- Updated proxy transaction APIs to utilize `completeTxWithFreshCostModels` for transaction completion, enhancing cost model handling.
- Adjusted `getTxBuilder` to accept a flag for using the CSL serializer, improving flexibility in transaction building.
- Enhanced unit tests for proxy cleanup, setup, spend, vote, and DRep certificate APIs to validate the new transaction completion logic.
- Added error handling for PPView hash mismatches during transaction submission, ensuring better feedback on transaction integrity issues.
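
As an illustration of the PPView error handling mentioned above, a minimal sketch that recognises the node's rejection text and extracts both hashes. `parsePpViewMismatch` is a hypothetical helper, not the function added in this commit; the message format is taken from the failure quoted earlier in this conversation.

```typescript
interface PpViewMismatch {
  supplied: string;
  expected: string;
}

// Returns the two script_data_hash values from a PPViewHashesDontMatch
// rejection, or null when the message is some other error.
function parsePpViewMismatch(message: string): PpViewMismatch | null {
  if (!message.includes("PPViewHashesDontMatch")) return null;
  const supplied = /supplied:\s*SJust\s*\(SafeHash\s*"([0-9a-f]+)"\)/i.exec(message);
  const expected = /expected:\s*SJust\s*\(SafeHash\s*"([0-9a-f]+)"\)/i.exec(message);
  if (!supplied || !expected) return null;
  return { supplied: supplied[1], expected: expected[1] };
}

const sampleRejection = `ConwayUtxowFailure (PPViewHashesDontMatch Mismatch (RelEQ)
  (supplied: SJust (SafeHash "b4bb2ab5a34f50ecdcc0d9a598bc979763caa450ef8e028decd65960a5e7da5d"),
   expected: SJust (SafeHash "a2ef4cbe622dc56e974bb08e80c3d6ac7642b1f86f7c9449d3133090f1dd509d")))`;

console.log(parsePpViewMismatch(sampleRejection));
```

A submit handler could use the non-null result to report stale cost models specifically, rather than a generic submission error.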
…e tests

- Refactored imports to streamline the usage of `completeTxWithFreshCostModels` across the codebase.
- Updated unit tests to reflect changes in cost model handling, including support for raw arrays and ordering of indexed cost model objects.
- Added new test cases to validate the rejection of improperly ordered cost model objects, ensuring robustness in transaction processing.
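
One way the two accepted input shapes could be normalised, as a sketch under assumptions: `normalizeCostModel` is hypothetical, and it reads "indexed cost model object" as an object keyed "0", "1", ... with no gaps, which may differ from the repo's actual rule.

```typescript
type CostModelInput = number[] | Record<string, number>;

// Accepts a raw array or an index-keyed object and returns a plain array.
// Objects whose keys are not exactly "0".."n-1" are rejected.
function normalizeCostModel(model: CostModelInput): number[] {
  if (Array.isArray(model)) return [...model];
  const indexed: Record<string, number> = model;
  // JS enumerates integer-like keys in ascending numeric order, so a gap or a
  // non-index key shows up as a key that does not equal its position.
  const keys = Object.keys(indexed);
  keys.forEach((key, position) => {
    if (key !== String(position)) {
      throw new Error(`cost model key "${key}" found at position ${position}`);
    }
  });
  return keys.map((key) => indexed[key]);
}

console.log(normalizeCostModel([1, 2, 3]));
console.log(normalizeCostModel({ "0": 10, "1": 20 }));
```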
@QSchlegel QSchlegel merged commit 8835b95 into preprod May 24, 2026
5 checks passed