feat: #225 E7 — ERC-4337 accept-batch callData builders (atomic P.2+P.3) #227

Merged: hanwencheng merged 86 commits into main from claude/225-onchain-k11-accept on Jun 9, 2026
hanwencheng commented Jun 7, 2026

What this is

#225 — on-chain K11-gated agent-accept (#164 plan E7 + the account-auth cutover): the accept becomes one P256Account.executeBatch([registerAgentDevice, setScope]) UserOp — P.2 + P.3 in one block, one K11 signature, atomic.

This PR is the Rust foundation + the cutover tooling/spec. All Rust is pure + CI-verifiable; the cutover script is bash -n-clean (its live run is operator-gated).

What landed

Rust foundation (all unit-tested):

  1. crates/agentkeys-core/src/erc4337.rs — callData builders (register_agent_device / set_scope / execute_batch / accept_batch_calldata), golden-tested byte-for-byte vs foundry cast.
  2. crates/agentkeys-broker-server/src/sponsored_accept.rs assemble_accept_userop: composes the batch callData + the broker VerifyingPaymaster co-sign (#200 Stage A, "feat: #164 sponsored ERC-4337 register + v2-demo harness restructure") into a complete PackedUserOperation + the userOpHash to K11-sign.
  3. crates/agentkeys-backend-client/src/protocol.rs — the daemon↔broker /v1/accept/{build,submit} wire types (#204 one-owner, "refactor: #203 agentkeys-backend-client — ONE owner for the broker/worker chain"), frozen-keyset tests + regenerated fixtures.
  4. sponsored_accept.rs PackedUserOp → WireUserOp conversion + the /v1/accept/build response builder (round-trip + server-side frozen-key tests).

Cutover tooling + spec (the e2e unblock):

  5. docs/plan/chain/account-auth-cutover.md — the precise, idempotent, phased cutover spec, with the consequence (redeploy → state reset → full re-bootstrap) called out, and the decoupling finding: the master can be registered as a P256Account on today's contracts (via the existing erc4337-register-master.sh), so only the accept batch's setScope half actually needs the disruptive redeploy.
  6. scripts/heima-cutover-account-auth.sh — the one new script: forces the account-auth v2-set redeploy (FORCE_DEPLOY=1 heima-bring-up.sh), idempotent via a CUTOVER_DONE_<profile> marker + a read-only setScope-selector (d8e9e3c6) bytecode probe, --yes-gated (destructive), surgical-helper classified. bash -n clean.

cargo test + clippy green across core, broker-server, backend-client.

What did NOT land (remaining #225 work)

  • Broker /v1/accept/{build,submit} axum routes — mirror the wire shapes server-side, J1_master auth (like mint_cap), eth_call operatorMasterWallet/getNonce, call assemble_accept_userop + into_build_response; submit relays handleOps. Needs the broker co-sign key + new env (entry_point/paymaster/gas) wired through the #201 3-file env discipline ("Config data class + lazy, config-driven memory list (Phases 1–5)").
  • Daemon wiring — call /v1/accept/build → K11-sign → /v1/accept/submit; replaces the deployer-cast send in register_pairing.
  • Browser ceremony: ceremony.tsx Touch ID over the userOpHash. Hardware; operator-verified.
  • Running the cutover: heima-cutover-account-auth.sh --yes on live mainnet + re-bootstrap (operator), OR the no-disruption partial path (register master-as-account today). Live-mainnet operator action.

Refs #225. Continues #164 / #171 / #200.

🤖 Generated with Claude Code

…st live infra)

The agent-facing consumer of the #216 cred-fetch primitive, verified end-to-end
against the LIVE broker + cred worker:

- agentkeys-cli: `agentkeys cred fetch <service>` (cred_admin.rs) — mints a
  master-self/agent CredFetch cap → BackendClient.cred_fetch → STS → cred worker
  → decrypt → prints the plaintext. Adds the agentkeys-backend-client dep (the
  #204 one-owner path; no re-typed wire shapes).
- harness/cred-fetch-demo.sh — the real e2e: a master VAULTS a probe cred via the
  daemon (web path), then the agent FETCHES it via the CLI (agent path), asserting
  the EXACT secret round-trips through cap-mint → STS → cred worker → S3 → decrypt.
  Idempotent (fixed `cred-e2e-probe`), --ci-tolerant, real-only. Contract-compliant
  (STEP_TOTAL=4, ok/skip/fail, EXIT-trap daemon cleanup).
- keep-docs-in-sync: harness/CLAUDE.md orchestrator table + operator-runbook-harness.md.

VERIFIED LIVE (this run): master vaulted via daemon (HTTP 200), agent
`cred fetch` returned the EXACT key (len matched) — broker.litentry.org +
cred.litentry.org. #216's cred half is proven, not just compiled.

Remaining #216: the Hermes wire (phase1-wire Phase 4.0) — plant the fetched key
into Hermes instead of $OPENROUTER_API_KEY (the full sandbox surprise).
… live, real LLM)

Carries the #216 cred-fetch through the Hermes wire — the complete agent-side
guarantee, proven end-to-end against the LIVE broker + cred worker + aiosandbox:

  master VAULTS the LLM key  (daemon: cap-mint cred-store → STS → cred worker → S3)
    → agent CRED-FETCHES it  (agentkeys cred fetch: cap-mint cred-fetch → STS → decrypt)
    → plant into Hermes      (~/.hermes/.env + hermes config set model.*) IN THE SANDBOX
    → Hermes RUNS on the vault key (real LLM smoke) — NO OPENROUTER_API_KEY in the agent env

harness/cred-wire-demo.sh (STEP_TOTAL=6, contract-compliant, headless): asserts
the key Hermes uses == the master-vaulted key (sha), and that it arrived via the
vault fetch, not an ambient env var (the sandbox shell has no OPENROUTER_API_KEY;
the .env value is the cred-fetch result). The durable, no-Touch-ID complement to
phase1-wire-demo.sh Phase 4.0b — same wire result without the interactive gates.
Routes through the shared agentkeys-backend-client (#204).

VERIFIED LIVE (this run, real OpenRouter key):
  step 4  ok agent fetched the vaulted key from the vault (len=73, sha fddff3ff…) — no env read
  step 5  ok planted the vault-fetched key into ~/.hermes/.env + hermes config
  step 6  ok 6.1 vault-sourced — the key Hermes will use == the master-vaulted key, NOT an env var
  step 6  ok 6.2 llm smoke — Hermes answered using the VAULT-FETCHED key: "OK"
Exit 0. A REAL deepseek-v4-flash call via OpenRouter answered "OK" on the
vault-fetched key — #216's acceptance ("the agent runs on MY authorized key, not
the operator's env") proven with real data.

Idempotent (FIXED openrouter service; the .env key-line is rewritten not appended);
daemon killed on exit; --ci-tolerant. keep-docs-in-sync: harness/CLAUDE.md +
docs/operator-runbook-harness.md.
…→ dev fallback)

Replaces the operator-env-key write (#216's named target: phase1-wire-demo.sh:1072)
with the vault path: Phase 4.0b now fetches the agent's LLM key from the master's
VAULT via `agentkeys cred fetch cred:<service>` and plants THAT into the sandbox
Hermes — the $OPENROUTER_API_KEY/$LLM_API_KEY env becomes a clearly-labelled
DEV-ONLY fallback.

- Phase 4.0b resolves WIRE_KEY VAULT-FIRST (the agent-identity cred-fetch: operator
  session authorizes, actor=agent device — mirrors the memory cap-mint identity
  model), env-fallback only when the vault is unavailable. Backward-compatible: with
  no vaulted key / no cred scope the fetch fails and it degrades to the env key
  exactly as before, so the change is fallback-safe.
- SEED_SCOPE_SERVICES also grants the agent its cred scope (bare `$SERVICE` — the
  cred-fetch cap-mint hashes the bare service, unlike memory's `memory:<ns>`) so the
  P.3 pairing grant authorizes the vault fetch.
- Honest labelling throughout: the 0.6 step, the header, and the top overview now
  state the env key is the dev fallback and the vault is primary; the 4.0 ok line
  prints which source the planted key came from.
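
The vault-first / env-fallback resolution described above lives in a bash harness; the following is only a minimal Rust restatement of the decision, under stated assumptions. `resolve_wire_key`, `KeySource`, and the closure-based fetch are hypothetical names for illustration, not identifiers from phase1-wire-demo.sh or the agentkeys crates.

```rust
/// Where the planted key came from (the harness prints this on its ok line).
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub enum KeySource {
    Vault,
    DevEnvFallback,
}

/// Resolve the LLM key vault-first. The env value is used only when the
/// vault fetch fails (no vaulted key, no cred scope, vault unreachable).
/// Returns None when neither source yields a non-empty key.
pub fn resolve_wire_key<F>(vault_fetch: F, env_key: Option<String>) -> Option<(String, KeySource)>
where
    F: FnOnce() -> Result<String, String>,
{
    match vault_fetch() {
        Ok(key) if !key.is_empty() => Some((key, KeySource::Vault)),
        // Any vault failure degrades to the clearly-labelled dev-only env key.
        _ => env_key
            .filter(|k| !k.is_empty())
            .map(|k| (k, KeySource::DevEnvFallback)),
    }
}
```

Returning the source alongside the key is what lets the caller print which path was taken, so a silent fallback cannot pass for a vault fetch.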

The full vault chain (master vaults → agent cred-fetches → plant → Hermes runs on
it, real LLM smoke) is proven headless + live by harness/cred-wire-demo.sh (this
PR). The interactive agent-identity path additionally needs the operator's Touch ID
cred-scope grant (P.3) + a seeded vault — until then Phase 4.0b labels + uses the
dev fallback.
…n fix (verified live)

Completes the CLI cred surface with the store half of `cred fetch`, and folds the
daemon's hand-rolled cred-store body into the crate (closing a #204 drift gap):

- agentkeys-backend-client: `CredStoreBody`/`CredStoreResp`/`CredStoreInput`/
  `CredStoreResult` (mirror the CredFetch types) + `BackendClient::cred_store`
  (cap-mint CredStore → per-actor STS under the VAULT role → cred worker
  `/v1/cred/store` → encrypt + S3 PUT). Exported from the crate.
- agentkeys-daemon: `store_master_credential_inner` now builds the worker body from
  the crate-owned `CredStoreBody` instead of an inline `serde_json::json!({...})`
  (#204 — "broker/worker request shapes have ONE owner"; a drifted field is now a
  compile error, matching the memory-put path).
- agentkeys-cli: `agentkeys cred store <service> --secret|--secret-env` (master-self
  by default). `--secret-env NAME` keeps the plaintext off argv / out of the shell
  history + process list. Prints the worker S3 key.
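
A rough sketch of the `--secret|--secret-env` choice, to show why the env form keeps the plaintext off argv: only the variable NAME travels on the command line. The function name, the injected `lookup` closure, and the "reject both flags" rule are assumptions made for this example, not the CLI's actual code.

```rust
/// Resolve the secret from either an inline value or a named env var.
/// `lookup` stands in for the process environment so the logic is testable;
/// in a real binary it would be `|name| std::env::var(name).ok()`.
/// Rejecting both flags at once is an assumption of this sketch.
pub fn resolve_secret<L>(
    secret: Option<String>,
    secret_env: Option<&str>,
    lookup: L,
) -> Result<String, String>
where
    L: Fn(&str) -> Option<String>,
{
    match (secret, secret_env) {
        (Some(_), Some(_)) => Err("pass only one of --secret / --secret-env".to_string()),
        // Inline value: convenient, but visible in argv and shell history.
        (Some(s), None) => Ok(s),
        // Env form: argv carries only the variable name, never the plaintext.
        (None, Some(name)) => lookup(name).ok_or_else(|| format!("env var {name} is not set")),
        (None, None) => Err("one of --secret / --secret-env is required".to_string()),
    }
}
```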

VERIFIED LIVE (CLI-only store→fetch round-trip, master-self):
  stored `cred-store-probe` → bots/941…/credentials/cred-store-probe.enc
  ✅ CLI store→fetch ROUND-TRIP PASS — agentkeys cred store works end-to-end

Scope note: this is the master-self vault primitive. The master provisioning a key
INTO the agent's S3 prefix (so the agent fetches with actor=agent) needs dual
bearers (operator session for cap-mint + agent session for the STS PrincipalTag)
and is #214's authorization-side job — deliberately out of #216 scope.

clippy -D warnings clean; cargo check green.
…web app + CLI, fresh start)

Restructures the wire runbook from a CLI/sandbox + memory-only "run the demo" doc
into the single fresh-start guide for testing the WHOLE wire — both the #216
vault-fetched LLM key and the permissioned memory — two ways:

- New top: the two guarantees, a two-paths table (web app vs CLI, same agent side),
  the fastest test (`harness/cred-wire-demo.sh`), and a fresh-start checklist
  (3 setup scripts + sandbox + OpenRouter key + master identity).
- Path A — Web app: `bash dev.sh` → onboard → vault the key (credentials page) →
  pair+authorize (pairing page, Touch ID). Honest "wired vs pending" note: the web
  vault + #214 pairing are real/on-chain today; the agent-identity vault-fetch needs
  #214's dual-bearer master-provisioning (not wired yet), so the master-self
  cred-wire-demo is the end-to-end proof.
- Path B — CLI: the existing phase1-wire-demo walkthrough, reframed.
- LLM-key gate now documents Phase 4.0b vault-first/env-fallback; "Verifying it
  worked" splits into the two deterministic checks; +3 web/cred troubleshooting rows;
  Appendix B gains the `cred store`/`cred fetch` primitives; cross-refs add the new
  demos + #216/#214 + dev.sh.

keep-docs-in-sync: folds back the cred-wire-demo + cred-store + Phase 4.0b changes
from this PR into the operator runbook.

Caught in review: Path A had the agent run in the sandbox (agentkeys-daemon

--request-pairing → cred fetch → wire hermes) but never said how the compiled
agentkeys / agentkeys-daemon / agentkeys-mcp-server binaries get INTO the sandbox.
They can't run there unless cross-built for the sandbox's Linux arch and uploaded
(the sandbox is aarch64/x86 Linux, not the operator's Mac) — which is what Path B /
phase1-wire-demo.sh Phase 1 does (target/sandbox-linux cross-build → sbx_put).

Rewrote Path A to be honest:
- The web app is ONLY the master's console; it does not provision the agent device.
- A. Vault the LLM key — fully standalone (no sandbox).
- B. Pair — needs the agent binaries in the sandbox first; and phase1-wire's Phase 1
  bundles the cross-build/upload WITH the CLI pairing (Phase P lives inside Phase 1),
  so there's no clean "binaries only" command and no one-command web-pairing flow yet
  (drive the web claim by hand: upload binaries, open a request, claim in the UI).
- C. End-to-end is the headless cred-wire-demo.sh / Path B.
Also corrected my own first attempt, which suggested `--skip-2..5` to "stage only the
sandbox" — that still runs Phase 1 and therefore CLI-pairs the agent.
…t + add sandbox-build-push.sh

Per review: the runbook treated Path A as leaning on Path B's harness for the agent
side. Now each path is a self-contained quick-start.

- NEW harness/sandbox-build-push.sh — Path A's standalone "compile agentkeys + push to
  the sandbox" command. Cross-builds the 3 binaries (agentkeys / -mcp-server / -daemon)
  for the sandbox's aarch64-Linux arch in the SAME cached arm64 builder image + cargo
  volumes phase1-wire-demo uses (warm tree re-pushes in seconds), uploads them to
  ~/.local/bin. Build + push ONLY — never pairs/wires. Re-run after any local change so
  the in-sandbox agent runs current source. VERIFIED live: pushed to the sandbox, and
  `agentkeys cred --help` there confirms the current #216 source.
- operator-runbook-wire.md restructured: "Two independent paths — pick one" with BRIEF
  quick-starts for each (Path A = sandbox-build-push.sh + dev.sh + 3 UI actions; Path B =
  one phase1-wire-demo command) + a "neither path" headless check (cred-wire-demo). Path A
  details now use sandbox-build-push.sh (dropped the phase1-wire dependence + the
  now-moot "harness bundles pairing" caveat); kept the honest #214 wired-vs-pending note.
- keep-docs-in-sync: harness/CLAUDE.md inventory + operator-runbook-harness.md.
…broker-url

Operator hit `Error: --broker-url (or AGENTKEYS_BROKER_URL) required for
--request-pairing` running the runbook command in the sandbox — my Path A command
dropped the required flag. Verified the corrected invocation in the live sandbox
(produces a pairing_code). Folded the complete, correct flow into Path A:

  1. sandbox: agentkeys-daemon --request-pairing --broker-url https://broker.litentry.org
     → prints pairing_code + a state_file (the request_id lives in the file, not stdout)
  2. web UI: claim the pairing_code (Touch ID)
  3. sandbox: agentkeys-daemon --retrieve-pairing --request-id <from state file> --broker-url …

Matches phase1-wire-demo.sh Phase P.0/P.1b exactly. Fixed both the quick-start and the
Path A — details command.
…needed)

`agentkeys-daemon --request-pairing` / `--retrieve-pairing` required --broker-url
(or AGENTKEYS_BROKER_URL) and errored without it — friction for the Path-A operator
running them in the sandbox. These commands ALWAYS need a broker, so default it:

- main.rs: new `const DEFAULT_PAIRING_BROKER_URL = "https://broker.litentry.org"`;
  run_request_pairing + run_retrieve_pairing now `unwrap_or_else(default)` instead of
  erroring. `--broker-url` / `AGENTKEYS_BROKER_URL` still override (e.g. a test broker).
  Deliberately NOT a global arg default — `--ui-bridge`'s unset broker_url keeps its
  "fall back to pre-sourced AWS creds" meaning (the §191 pre-Stage-7 path).

VERIFIED live: cross-built + pushed the daemon to the sandbox; `agentkeys-daemon
--request-pairing` (no flag) now defaults to prod + opens a §10.2 request (code
9ZpC8nwu…) — the "--broker-url required" error is gone.
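
For illustration, the defaulting described above can be sketched as a small pure function. The constant name and value are the ones this commit states; the function name and the flag-over-env precedence are assumptions of the sketch (the commit only says both still override the default).

```rust
/// Default broker for the pairing subcommands (name and value as stated in
/// the commit message).
pub const DEFAULT_PAIRING_BROKER_URL: &str = "https://broker.litentry.org";

/// Pick the broker URL for --request-pairing / --retrieve-pairing.
/// Assumed precedence: explicit flag, then env var, then the prod default.
/// Kept local to the pairing commands so an unset broker_url elsewhere
/// (e.g. --ui-bridge) keeps its own meaning.
pub fn pairing_broker_url(flag: Option<String>, env: Option<String>) -> String {
    flag.or(env)
        .unwrap_or_else(|| DEFAULT_PAIRING_BROKER_URL.to_string())
}
```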

Runbook (Path A quick-start + details) simplified to drop the flag; notes the prod
default + the override. clippy -D warnings clean; daemon tests green.
…-create.sh

`accept pairing · Touch ID` POSTed /v1/agent/pairing/register and got 502. Root
cause: register_pairing derived the agent-register script as a SIBLING of
--register-master-script, but the two are NOT co-located — dev.sh's master register
is harness/scripts/heima-register-first-master.sh while heima-agent-create.sh lives
in <repo>/scripts/. The sibling path (harness/scripts/heima-agent-create.sh) doesn't
exist, so `bash <missing>` exited non-zero → register_agent_device errored → 502.

Fix: resolve heima-agent-create.sh from candidates — the sibling (co-located case)
AND <repo>/scripts/ derived from the master script path — picking the first that
exists; fail with a clear SERVICE_UNAVAILABLE message if neither is found.
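
A minimal sketch of the candidate-based resolution, assuming the repo root is some ancestor of the master script. `candidate_paths` and `resolve_script` are hypothetical names, and walking every ancestor is a simplification chosen for the example; the daemon's real derivation of `<repo>/scripts/` may differ.

```rust
use std::path::{Path, PathBuf};

/// Candidate locations for `name`, in priority order:
/// 1. next to the master script (the co-located case);
/// 2. `<ancestor>/scripts/<name>` for each ancestor directory, which
///    covers `<repo>/scripts/` when the master script sits under the repo.
pub fn candidate_paths(master_script: &Path, name: &str) -> Vec<PathBuf> {
    let mut out = Vec::new();
    if let Some(dir) = master_script.parent() {
        out.push(dir.join(name));
    }
    for dir in master_script.ancestors().skip(1) {
        let candidate = dir.join("scripts").join(name);
        if !out.contains(&candidate) {
            out.push(candidate);
        }
    }
    out
}

/// Return the first candidate that exists as a file, or a clear error
/// (the bridge would map this to SERVICE_UNAVAILABLE rather than a 502
/// from running a missing script).
pub fn resolve_script(master_script: &Path, name: &str) -> Result<PathBuf, String> {
    candidate_paths(master_script, name)
        .into_iter()
        .find(|p| p.is_file())
        .ok_or_else(|| format!("{name} not found near {}", master_script.display()))
}
```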

Verified: scripts/heima-agent-create.sh accepts exactly the args register_agent_device
passes (--label/--agent-address/--actor-omni/--device-key-hash/--pop-sig, from-pubkey
mode auto-detected), and a dry-run with the live agent details returns
{"ok":true,"skipped":"already-registered"} → register_agent_device → Ok(None) → 200.
The "no Touch ID" is expected (browser passkey UserOp is the E7-pending frontend item;
the register goes through the daemon script shell-out today). clippy -D warnings clean;
daemon tests green.
…ull request_id (slice 1)

The master pairing card showed a truncated "PAIR-CODE" that was actually the
request_id (never the agent's one-time code), with no value the operator could
cross-check against the agent — a confused-deputy surface (#224). Slice 1 surfaces
the values that ARE on both sides today, with no broker change/deploy:

- daemon (pending_binding_to_request): map the broker's device_key_hash →
  `deviceKeyHash` (+ short); keep `id` (the full request_id). The agent's
  `--request-pairing` already prints device_key_hash + D_pub, so these are the
  cross-verifiable identity.
- agent (run_request_pairing): print device_key_hash on the human-facing line so the
  operator reads it off the agent to compare.
- frontend (PairingRequest type + pairing card): replace the misleading "pair-code"
  with **device key hash · verify on agent** + **D_pub · verify on agent** (full) +
  **request id** (full handle). Operator confirms the device matches before
  accept · Touch ID.
- test: pending_binding_maps_to_pairing_request asserts the full deviceKeyHash.

Deferred to slice 2 (needs a broker change + deploy): created_at/expires_at
timestamps on the card (the broker pending row has no timestamps today) and the
`--force` supersede-prior-requests behavior. clippy/fmt clean; daemon tests + frontend
typecheck green.
…ual reload

acceptPairing did registerPairing + refreshPairing but never re-fetched the
actor tree, so a freshly-registered agent only appeared in the device/permission
views after the operator reloaded the page. Re-fetch listActors after a
successful register (matches finishPairingCeremony), surfacing the paired device
immediately.

The agent-accept gate (#225 / #164 E7) lands the device binding (registerAgentDevice,
P.2) and the scope grant (setScope, P.3) in ONE P256Account.executeBatch UserOp —
one block, one K11 signature, atomic. This adds the pure callData encoders that the
batch needs (the genuinely new primitive); the sponsored-UserOp envelope is already
owned by the broker's sponsor.rs (#200 Stage A).

New crates/agentkeys-core/src/erc4337.rs:
- register_agent_device_calldata  — registerAgentDevice(bytes32,bytes32,bytes32,bytes,bytes)
- set_scope_calldata              — setScope(bytes32,bytes32,bytes32[],bool,uint128,uint128,uint128,uint32)
- execute_batch_calldata          — executeBatch(address[],uint256[],bytes[])
- accept_batch_calldata           — the headline: executeBatch([register, setScope]);
                                     threads the agent's actor_omni into both inner calls
                                     so they can't disagree on which agent they bind.

Hand-rolled ABI (no alloy/ethabi — matches sponsor.rs/audit::calldata style), reusing
the public audit::calldata::selector so selectors never drift. Golden-tested byte-for-byte
against foundry cast for all three:
  cast calldata "registerAgentDevice(bytes32,bytes32,bytes32,bytes,bytes)" ...
  cast calldata "setScope(bytes32,bytes32,bytes32[],bool,uint128,uint128,uint128,uint32)" ...
  cast calldata "executeBatch(address[],uint256[],bytes[])" "[reg,scope]" "[0,0]" "[reg_cd,scope_cd]"
fixtures committed under src/testdata/. cargo test + clippy green.
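
To make the hand-rolled ABI approach concrete, here is a small sketch of how `executeBatch(address[],uint256[],bytes[])` arguments lay out: three head offsets, then each dynamic array in order. This is not the code in erc4337.rs. The function names are hypothetical, the selector is passed in rather than computed (std has no keccak256), and values are limited to u128 for brevity.

```rust
/// One 32-byte ABI word holding `v` as a big-endian unsigned integer.
fn word(v: u128) -> [u8; 32] {
    let mut w = [0u8; 32];
    w[16..].copy_from_slice(&v.to_be_bytes());
    w
}

/// ABI `bytes`: length word, then the data right-padded to a 32-byte multiple.
fn encode_bytes(data: &[u8]) -> Vec<u8> {
    let mut out = word(data.len() as u128).to_vec();
    out.extend_from_slice(data);
    out.resize(32 + (data.len() + 31) / 32 * 32, 0);
    out
}

/// ABI `bytes[]`: count, then one offset per element (measured from the
/// first offset word), then the encoded elements.
fn encode_bytes_array(items: &[Vec<u8>]) -> Vec<u8> {
    let encoded: Vec<Vec<u8>> = items.iter().map(|i| encode_bytes(i)).collect();
    let mut out = word(items.len() as u128).to_vec();
    let mut offset = 32 * items.len();
    for e in &encoded {
        out.extend_from_slice(&word(offset as u128));
        offset += e.len();
    }
    for e in &encoded {
        out.extend_from_slice(e);
    }
    out
}

/// selector ‖ head (3 offsets) ‖ address[] ‖ uint256[] ‖ bytes[].
pub fn sketch_execute_batch_calldata(
    selector: [u8; 4],
    targets: &[[u8; 20]],
    values: &[u128],
    datas: &[Vec<u8>],
) -> Vec<u8> {
    // address[]: count, then each address left-padded to 32 bytes.
    let mut addrs = word(targets.len() as u128).to_vec();
    for t in targets {
        let mut w = [0u8; 32];
        w[12..].copy_from_slice(t);
        addrs.extend_from_slice(&w);
    }
    // uint256[]: count, then each value as one word.
    let mut vals = word(values.len() as u128).to_vec();
    for v in values {
        vals.extend_from_slice(&word(*v));
    }
    let payloads = encode_bytes_array(datas);

    // Head offsets are measured from the start of the arguments,
    // i.e. the byte right after the 4-byte selector.
    let head_len = 3 * 32;
    let mut out = selector.to_vec();
    out.extend_from_slice(&word(head_len as u128));
    out.extend_from_slice(&word((head_len + addrs.len()) as u128));
    out.extend_from_slice(&word((head_len + addrs.len() + vals.len()) as u128));
    out.extend(addrs);
    out.extend(vals);
    out.extend(payloads);
    out
}
```

The nested offsets are where hand-rolled encoders usually go wrong, which is why a byte-for-byte comparison against `cast calldata` output is a sensible check for this kind of code.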

First slice of #225; the submission client (#200 Stage B), the daemon wiring, the
browser ceremony, and the on-chain cutover remain (tracked in #225).

Ties the two existing halves into one ready-to-sign PackedUserOperation:
- intent: agentkeys_core::erc4337::accept_batch_calldata (the atomic
  executeBatch([registerAgentDevice, setScope]), P.2+P.3)
- sponsorship: the broker EIP-191 co-signs the VerifyingPaymaster getHash
  (J1-gated Sybil gate = gas-free), via crate::sponsor (#200 Stage A).

New crates/agentkeys-broker-server/src/sponsored_accept.rs:
- AcceptUserOpParams — every chain-derived value (nonce/gas/fees/validity/addrs)
  is an explicit input (nothing hardcoded; caller reads them on-chain).
- assemble_accept_userop(params, broker_sk) -> AssembledAcceptUserOp { user_op,
  user_op_hash, paymaster_get_hash }. Sets paymasterAndData[20:52] (the gas word)
  provisionally so paymaster_get_hash commits the limits the broker approves, then
  rebuilds paymasterAndData with the real co-sign appended; computes the userOpHash
  the master K11 signs. Pure (broker key only, no chain I/O).

Broker-side because the paymaster co-sign needs the broker key; the daemon will
call this via an endpoint and just K11-sign the returned userOpHash (the #200
division of labour). 3 unit tests: callData==accept batch + sender==master +
empty account sig + deterministic hash; paymasterAndData layout + broker co-sign
recovers to the broker EOA; grant change => userOpHash change. cargo test + clippy green.
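
The paymasterAndData layout referred to above follows the ERC-4337 v0.7 packing: bytes [0:20] are the paymaster address, [20:52] are two uint128 gas limits, and the paymaster's own data follows. A hedged sketch of the two-pass build is below. All function names are hypothetical, and the `cosign` closure is a stand-in that only sees the provisional bytes, whereas the real VerifyingPaymaster getHash commits to the whole UserOp.

```rust
/// Pack two uint128 values into one 32-byte word: `hi ‖ lo`, big-endian.
pub fn pack_u128_pair(hi: u128, lo: u128) -> [u8; 32] {
    let mut w = [0u8; 32];
    w[..16].copy_from_slice(&hi.to_be_bytes());
    w[16..].copy_from_slice(&lo.to_be_bytes());
    w
}

/// paymaster (20) ‖ verificationGasLimit (16) ‖ postOpGasLimit (16) ‖ data.
pub fn paymaster_and_data(
    paymaster: [u8; 20],
    verification_gas_limit: u128,
    post_op_gas_limit: u128,
    paymaster_data: &[u8],
) -> Vec<u8> {
    let mut out = Vec::with_capacity(52 + paymaster_data.len());
    out.extend_from_slice(&paymaster); // [0:20]
    out.extend_from_slice(&pack_u128_pair(verification_gas_limit, post_op_gas_limit)); // [20:52]
    out.extend_from_slice(paymaster_data); // [52:]
    out
}

/// Two-pass build: first lay out everything except the signature so the
/// hash can commit to the gas limits, then append the co-signature.
/// `validity` stands for whatever validity window bytes the paymaster expects.
pub fn two_pass_paymaster_and_data<S>(
    paymaster: [u8; 20],
    verification_gas_limit: u128,
    post_op_gas_limit: u128,
    validity: &[u8],
    cosign: S,
) -> Vec<u8>
where
    S: Fn(&[u8]) -> Vec<u8>,
{
    let provisional =
        paymaster_and_data(paymaster, verification_gas_limit, post_op_gas_limit, validity);
    let signature = cosign(&provisional);
    let mut out = provisional;
    out.extend_from_slice(&signature);
    out
}
```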

Slice 2 of #225. Next: the broker HTTP endpoint wrapping this + the daemon call +
the Stage-B handleOps submit (cast-based, mirrors the E8 proof). Refs #225.

Defines the daemon<->broker protocol for the on-chain K11-gated accept, in the
ONE owner crate per the #204 rule (the daemon deps backend-client; the broker
mirrors these shapes server-side, pinned by the frozen key-set tests):

- BuildAcceptUserOpRequest  — POST /v1/accept/build (J1_master): register fields
  (device_key_hash, agent_pop_sig, link_code_redemption) + the granted scope
  (services + u128 caps as wire-safe decimal strings + period_seconds).
- WireUserOp                — ERC-4337 v0.7 PackedUserOperation, hex per field;
  mirrors broker sponsor::PackedUserOp. The daemon fills  with the
  master K11 assertion over user_op_hash.
- BuildAcceptUserOpResponse — { user_op, user_op_hash, entry_point, chain_id }.
- SubmitAcceptUserOpRequest / SubmitAcceptUserOpResponse — POST /v1/accept/submit
  → EntryPoint.handleOps (Stage B), returns { ok, tx_hash, block_number }.

Fixtures regenerated via dump-protocol-fixtures + frozen key-set tests for the
three request bodies (build_accept_userop_request, wire_user_op,
submit_accept_userop_request). cargo test + clippy + fixture --check green.
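
Why decimal strings for the u128 caps: many JSON consumers parse numbers as 64-bit floats, which are exact only up to 2^53, so a large cap sent as a bare number can silently change value. A minimal sketch of the string round-trip follows; `cap_to_wire` and `cap_from_wire` are hypothetical helper names, not the crate's API.

```rust
/// Encode a u128 cap as a decimal string for the JSON wire.
pub fn cap_to_wire(cap: u128) -> String {
    cap.to_string()
}

/// Decode a decimal-string cap. Only ASCII digits are accepted, so a
/// leading '+', whitespace, or an empty string is rejected up front
/// (u128's FromStr alone would accept a leading '+').
pub fn cap_from_wire(s: &str) -> Result<u128, String> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("not a decimal string: {s:?}"));
    }
    s.parse::<u128>()
        .map_err(|e| format!("out of u128 range: {e}"))
}
```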

Slice 3 of #225. Next: the broker /v1/accept/{build,submit} handlers (mirror these
shapes server-side, gate on J1, call assemble_accept_userop) + the daemon call +
K11-sign. Refs #225.

The connective piece the broker accept handler returns: convert the internal
sponsor::PackedUserOp into the hex-encoded wire shape and shape the build body.

crates/agentkeys-broker-server/src/sponsored_accept.rs:
- WireUserOp — broker-side mirror of backend_client::protocol::WireUserOp (the
  broker doesn't dep that crate; frozen key-set tests on both sides pin them).
- WireUserOp::from_packed — hex-0x each PackedUserOp field.
- BuildAcceptResponse + AssembledAcceptUserOp::into_build_response — the
  /v1/accept/build body { user_op, user_op_hash, entry_point, chain_id }.

3 unit tests: every wire field round-trips back to the original bytes; the build
response carries the accept-batch callData + the userOpHash + entry_point/chain_id;
WireUserOp JSON keys match the backend-client frozen shape (server-side #204 pin).
cargo test + clippy green.
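
The "hex-0x each field" conversion and its round-trip property can be sketched with std only. These helper names are assumptions for the example and are not taken from sponsored_accept.rs; the strict decoder (prefix required, even length, hex digits only) is one reasonable choice rather than the crate's documented behaviour.

```rust
/// Encode bytes as lowercase hex with a 0x prefix (empty input gives "0x").
pub fn to_hex0x(bytes: &[u8]) -> String {
    let mut s = String::with_capacity(2 + bytes.len() * 2);
    s.push_str("0x");
    for b in bytes {
        s.push_str(&format!("{b:02x}"));
    }
    s
}

/// Decode a 0x-prefixed hex string back to bytes.
pub fn from_hex0x(s: &str) -> Result<Vec<u8>, String> {
    let body = s.strip_prefix("0x").ok_or("missing 0x prefix")?;
    if body.len() % 2 != 0 {
        return Err("odd number of hex digits".to_string());
    }
    // Checking digits first keeps the slicing below on ASCII boundaries
    // and rejects inputs like "+f" that from_str_radix would accept.
    if !body.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("non-hex character".to_string());
    }
    (0..body.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&body[i..i + 2], 16).map_err(|e| e.to_string()))
        .collect()
}
```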

Slice 4 of #225. Next (the I/O layer, happy-path gated on a deployed P256Account
master): the axum /v1/accept/{build,submit} routes — J1_master auth (mirror
mint_cap) + eth_call operatorMasterWallet/getNonce + assemble_accept_userop +
into_build_response; submit relays EntryPoint.handleOps. Refs #225.

The precise, idempotent spec for the live-mainnet cutover that unblocks the #225
e2e (PR #227's /v1/accept flow needs the master to BE a deployed P256Account, not
the current EOA). docs/plan/chain/account-auth-cutover.md specifies:

- The gap: registry/scope sources are account-auth in code (E3) but the LIVE
  bytecode is pre-E3; heima-bring-up's cast-code idempotency check skips the
  redeploy, so account-auth never goes live.
- The consequence (loud): DeployAgentKeysV1 redeploys to NEW addresses → all
  on-chain state (master, agents, scopes, epoch, audit) resets → full re-bootstrap;
  demo breaks until re-bootstrapped. Operator-gated, announced, NOT in the plain flow.
- 6 phases (pre-flight → redeploy v2 set FORCE_DEPLOY → redeploy P256AccountFactory
  → onboarding-as-account → re-bootstrap actors → code/doc updates → broker redeploy),
  each idempotent with explicit skip checks.
- Idempotency strategy for a REDEPLOY (cast-code alone is insufficient since the old
  contracts also have code): a CUTOVER_DONE_<profile> marker + a live setScope
  account-auth ABI capability probe.
- The two scripts to implement (heima-cutover-account-auth.sh +
  heima-deploy-master-account.sh), the setup-heima.sh --cutover-account-auth wiring,
  the #201 env 3-file discipline, rollback (restore the .pre-cutover.bak env), and
  the arch.md §10/§12 + deployed-contracts.md sync owed at Phase 5.

Refs #225. Scopes the cutover named in erc4337-master-account.md §3.1.
… master-as-account

Diligence correction to the cutover spec after finding the onboarding-as-account
step already exists:

- Phase 3 (onboarding-as-account) reuses the existing `erc4337-register-master.sh`
  (build+submit) — it already does factory.createAccount + EntryPoint-deposit +
  register-first-master-as-account, idempotently. Dropped the proposed (redundant)
  `heima-deploy-master-account.sh`; only ONE new script remains (the cutover
  orchestrator `heima-cutover-account-auth.sh`).
- Decoupling finding (from that script's header): master-as-account is VIABLE on
  the LIVE pre-cutover registry (no EOA-only guard), so operatorMasterWallet[omni]
  can be the P256Account TODAY — no disruptive redeploy needed for that half.
  The cutover is only required for the accept batch's setScope (P.3): the live
  scope has setScopeWithWebauthn, not the msg.sender-gated setScope. So work can
  stage: register master-as-account now + exercise /v1/accept/build against it;
  do the registry/scope redeploy only when account-auth setScope is needed e2e.

Refs #225.
…er orchestrator)

The one new script the cutover spec calls for. Forces a redeploy of the v2 set so
the account-auth sources (E3) go live, making the #225 accept batch's setScope (P.3)
real. Idempotent + safe + bash -n clean.

scripts/heima-cutover-account-auth.sh:
- Phase 0: pre-flight (assert local AgentKeysScope.sol is account-auth — setScope
  present, setScopeWithWebauthn gone) + back up the env addresses to
  operator-workstation.env.pre-cutover.bak (idempotent: skip if present).
- Phase 1: redeploy via FORCE_DEPLOY=1 heima-bring-up.sh, then verify + set the
  CUTOVER_DONE_<profile> marker. DESTRUCTIVE → gated behind --yes; refuses otherwise.
  Idempotency ground truth is a read-only probe: the live scope bytecode carrying the
  setScope selector d8e9e3c6 (the marker is just the fast path).
- Phase 2: factory CHECK only (E5 recover() isn't needed for accept; no reusable
  factory-deploy helper exists, so it doesn't blind-deploy).
- Prints the follow-ups: re-register master-as-account (erc4337-register-master.sh),
  re-bootstrap agents/scopes, the repo edits (heima-scope-set.sh→setScope, arch.md),
  broker redeploy (setup-broker-host.sh --ref main).
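
The script itself is bash; as an illustration only, the idempotency decision (marker as fast path, bytecode probe as ground truth) could be expressed like this in Rust. The selector value d8e9e3c6 is the one quoted above; the function names and the plain byte-window scan are assumptions of this sketch. A window scan is a heuristic, since the same four bytes could in principle appear in unrelated data.

```rust
/// setScope selector as quoted in the commit message.
pub const SET_SCOPE_SELECTOR: [u8; 4] = [0xd8, 0xe9, 0xe3, 0xc6];

/// True if the deployed bytecode (hex, optional 0x prefix) contains the
/// 4-byte selector anywhere. Malformed hex is treated as "not found".
pub fn bytecode_has_selector(code_hex: &str, selector: [u8; 4]) -> bool {
    let trimmed = code_hex.trim();
    let body = trimmed.strip_prefix("0x").unwrap_or(trimmed);
    if body.len() % 2 != 0 || !body.bytes().all(|b| b.is_ascii_hexdigit()) {
        return false;
    }
    let code: Vec<u8> = (0..body.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&body[i..i + 2], 16).unwrap())
        .collect();
    code.windows(4).any(|w| w == selector.as_slice())
}

/// Decide whether the destructive redeploy still has to run.
/// The marker short-circuits; otherwise the live bytecode decides.
pub fn cutover_needed(marker_set: bool, live_code_hex: &str) -> bool {
    if marker_set {
        return false;
    }
    !bytecode_has_selector(live_code_hex, SET_SCOPE_SELECTOR)
}
```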

Classified as a directly-callable SURGICAL helper (the three-entry-points exemption
for destructive heima-*-revoke/-rotate tools) — NOT wired into setup-heima.sh's plain
flow, since a plain run must never reset on-chain state. Spec updated to match.

Verified: bash -n clean; --help + unknown-arg guard work; setScope selector d8e9e3c6
confirmed against the earlier cast golden vectors. Cannot run e2e here (live mainnet
redeploy). Refs #225.
…chain show

The script died immediately with "no RPC" because it used a made-up resolution
(AGENTKEYS_CHAIN_RPC_HTTP — a broker-runtime var — plus an invented RPC_HTTP_HEIMA
fallback), neither of which operator-workstation.env carries. Diagnosis: both
heima-bring-up.sh:122 and setup-heima.sh:195 resolve the chain RPC the same way —
`agentkeys chain show "$CHAIN" | jq -r .rpc.http` (no RPC env key exists). Switched
to that; added jq + agentkeys to the tool pre-check. Verified live: it now resolves
https://rpc.heima-parachain.heima.network and runs to the destructive --yes gate.

Also: back up the env to $HOME/.agentkeys/<name>.pre-cutover.bak instead of next to
the git-tracked operator-workstation.env (a .bak there would surface as untracked).
Verified the backup lands in ~/.agentkeys and leaves git status clean.

Other assumptions re-checked against reality (correct): the SCOPE/REGISTRY/FACTORY
address keys exist in operator-workstation.env; the profile suffix uses the sibling
idiom tr 'a-z-' 'A-Z_'; the phase-0 guard holds (source AgentKeysScope.sol has
setScope, no setScopeWithWebauthn). Refs #225.
…t-cutover re-bind path)

Adds docs/operator-runbook-account-auth-cutover.md — the full 5-step operator procedure
for the disruptive cutover, in the operator-runbook-*.md convention (H1, > warning blocks,
ordered steps, rollback).

Writing it surfaced a correctness bug in the earlier spec + the script's printed follow-ups:
post-cutover, agent binding + scope grants go through ACCOUNT UserOps (the #225 accept
flow), because account-auth gates registry/scope writes on msg.sender == operatorMasterWallet
(the P256Account). The pre-cutover scripts do NOT work post-cutover — verified:
  - heima-agent-create.sh sends registerAgentDevice from the deployer EOA (≠ the account);
  - heima-scope-set.sh calls setScopeWithWebauthn (the assertion-in-calldata path account-auth
    removes; the new setScope is msg.sender-gated, no assertion param).
So the runbook leads with two warnings: (1) SEQUENCING — run the cutover only AFTER the
#225 accept flow is wired, else agents are stranded (you can re-register the master but not
re-bind agents); (2) DESTRUCTIVE — state reset → full re-bootstrap.

Corrected to match:
- spec Phase 4 (re-bind = #225 accept flow, not heima-agent-create/heima-scope-set);
- spec Phase 5 (drop the bogus heima-scope-set.sh setScopeWithWebauthn→setScope edit — it's a
  pre-cutover tool, retired post-cutover; just arch.md §10/§12 + deployed-contracts.md);
- the script's printed follow-ups (point at the #225 accept flow + the new runbook).

Verified: script bash -n clean; runbook H1/no-frontmatter/warnings present. Refs #225.
…re-onboard) + fix step-4 command

Per the "no user, only developer, register again" reality:
- Reframe: nothing to migrate. The cutover proper is redeploy + verify + broker redeploy
  (steps 1-3); registering the master + pairing agents (4-5) is just normal onboarding on
  the fresh contracts, not a special re-bootstrap. Dropped the "DESTRUCTIVE / announce +
  schedule / state NOT migrated" alarm.
- Master register is still REQUIRED (the new registry is empty → registerAgentDevice would
  revert OperatorNotRegistered), but it's one command, not the placeholder build/submit dance:
      bash harness/scripts/erc4337-register-master.sh --operator-omni 0x<omni>   # auto Touch ID
  The old step-3 block was not executable (raw 0x<…>/<N> placeholders + a hand-wavy "K11 signs
  the userop_hash"). The default `register` mode auto-runs k11 webauthn-keygen + webauthn-userop-sign;
  the build/submit two-phase split is only for the browser web-flow.
- Synced the script's printed follow-ups + the spec Phase 3 to the one-command form.

Refs #225.

The "accept pairing" had no unpair, and the revoke that did exist (actor-detail
view) only flipped LOCAL daemon state — the device stayed registered on chain. Both
gaps closed:

Daemon (ui_bridge.rs):
- revoke_device now shells out to heima-device-revoke.sh (--agent <label>; agent-tier
  needs no K11; idempotent) BEFORE flipping local state — a binding isn't gone until
  SidecarRegistry.revokeAgentDevice says so. On-chain failure returns 502 and leaves
  local state untouched (no silent local-only revoke). New helpers: resolve_repo_script
  (mirrors register_pairing's heima-agent-create.sh resolution) + revoke_agent_device
  (mirrors register_agent_device). Errors loudly if --register-master-script / the
  revoke script is absent (chain-unconfigured). Test updated to mock the script + a
  make_state_with_script helper; passes. clippy clean.

Frontend:
- Unpair button on each paired-device card (pairing.tsx) → onUnpair → the existing
  K11/confirm revoke flow.
- confirmAction now AWAITS client.revokeDevice + re-fetches the authoritative actor
  tree, instead of fire-and-forget + optimistic flip — so a failed on-chain revoke
  surfaces and a success reflects the real chain state. tsc clean.

NOTE: this revoke is currently deployer-signed (like accept). Per the "sensitive
UserOps need Touch ID" task, it joins accept on the list to be K11-gated (arch.md +
the real gate, next).
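
A minimal sketch of the revoke ordering described above (chain first, local state only after the chain call succeeds). All names here are hypothetical; the closure stands in for the shell-out to heima-device-revoke.sh, and the real daemon code in ui_bridge.rs may be structured differently.

```rust
use std::collections::HashMap;

/// Outcome of a revoke attempt (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum RevokeOutcome {
    /// The chain revoke succeeded and the local binding was then cleared.
    Revoked,
    /// The chain revoke failed; local state was left untouched.
    /// The daemon would map this case to a 502 response.
    ChainFailed(String),
}

/// Revoke on chain first, then flip local state.
/// `onchain_revoke` is an assumed stand-in for the script shell-out.
fn revoke_device<F>(
    local_active: &mut HashMap<String, bool>,
    label: &str,
    onchain_revoke: F,
) -> RevokeOutcome
where
    F: FnOnce(&str) -> Result<(), String>,
{
    match onchain_revoke(label) {
        Ok(()) => {
            // Only now is the binding really gone, so record it locally.
            local_active.insert(label.to_string(), false);
            RevokeOutcome::Revoked
        }
        // No silent local-only revoke: report the failure, change nothing.
        Err(e) => RevokeOutcome::ChainFailed(e),
    }
}
```

Taking the chain call as a closure keeps the ordering testable without a chain, which is the same idea as mocking the script in the updated test.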
…perations

Per the "any sensitive UserOp must be Touch-ID-gated" requirement: formalized the
inline enumeration at §10 (scope grant/revoke, device add/revoke, K10 rotation,
recovery, audit-row mint, typed-data sign) into an explicit table in a new §10.1a.

States the rule — every master-authority mutation is a P256Account UserOp, and
every P256Account UserOp is K11-gated by validateUserOp (challenge == userOpHash) —
maps each sensitive op to its on-chain call + UI trigger + gate status, and marks
accept / unpair / scope-grant as ⏳ #225 (deployer-signed today, no Touch ID — the
gap between the rule and the running code, being closed by E7 + the cutover). Also
draws the authority-vs-usage boundary: cap-mint + worker reads/writes are J1+cap
gated, NOT per-op Touch ID (re-prompting per memory read would be unusable), except
high-value payments above payment_k11_threshold.

Single source of truth (terminology rule) — extends the existing §10 enumeration
rather than duplicating it. Refs #225.
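
The authority-vs-usage rule above can be sketched as a single decision function. This is an illustration of the stated rule only, not code from arch.md or the daemon: the `Op` and `Gate` types, the grouping of operations, and the integer payment amount are assumptions.

```rust
/// Operation classes (hypothetical grouping of the ops listed in the table).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    /// Scope grant/revoke, device add/revoke, K10 rotation, recovery, and so on.
    MasterMutation,
    CapMint,
    WorkerReadWrite,
    /// A payment; the unit of `amount` is an assumption.
    Payment { amount: u64 },
}

/// Which gate applies to an operation.
#[derive(Debug, PartialEq)]
enum Gate {
    /// P256Account UserOp, K11-gated (Touch ID) by validateUserOp.
    K11TouchId,
    /// J1 session plus capability, no per-op Touch ID.
    J1Cap,
}

fn gate_for(op: Op, payment_k11_threshold: u64) -> Gate {
    match op {
        // Every master-authority mutation is Touch-ID-gated.
        Op::MasterMutation => Gate::K11TouchId,
        // Usage ops stay J1+cap gated, except payments above the threshold.
        Op::Payment { amount } if amount > payment_k11_threshold => Gate::K11TouchId,
        Op::Payment { .. } | Op::CapMint | Op::WorkerReadWrite => Gate::J1Cap,
    }
}
```

The comparison is strict (`>`), following the "above payment_k11_threshold" wording; whether the boundary value itself prompts is not stated in the source.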
…rser

Starts the real Touch-ID gate (task 2b). The broker handlers/accept.rs:
- BuildAcceptRequest — server-side mirror of the backend-client wire type (the
  /v1/accept/build body, J1_master-gated; broker doesn't dep backend-client, the
  frozen key-set test there pins the shape).
- parse_register_and_grant — pure parse of the wire request into the typed
  agentkeys_core::erc4337 AgentRegister + ScopeGrant that assemble_accept_userop
  consumes. Service strings → bytes32 via keccak256(lowercase(service)) — the SAME
  hash heima-scope-set.sh writes (verified: keccak("memory:personal") golden), so a
  service id is byte-identical on every path. Caps as decimal strings (wire-safe).

3 unit tests (golden service-id, lowercasing, bad-hex/short/non-numeric rejection);
cargo test + clippy green.

Next 2b slices: the axum /v1/accept/build handler (J1 auth like mint_cap + eth_call
operatorMasterWallet/getNonce + assemble_accept_userop + into_build_response), which
needs new ENTRYPOINT/PAYMASTER env (3-file discipline) + the broker EVM co-sign key
loaded; then /v1/accept/submit (handleOps, Stage B); the daemon accept wiring; the
ceremony.tsx browser Touch ID over the userOpHash. Refs #225.
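
A rough sketch of the rejection side of such a pure parser (bad hex, short input, non-numeric caps, plus service lowercasing). The function names are hypothetical, caps are narrowed to u128 for the example, and the keccak256 step is left out because it needs an external crate; this is not the actual parse_register_and_grant.

```rust
/// Lowercase a service id before hashing (the hashing itself is omitted here).
fn normalize_service(service: &str) -> String {
    service.to_lowercase()
}

/// Parse a 0x-prefixed 20-byte address, rejecting bad hex and wrong lengths.
fn parse_address(s: &str) -> Result<[u8; 20], String> {
    let hex = s
        .strip_prefix("0x")
        .ok_or_else(|| "missing 0x prefix".to_string())?;
    if hex.len() != 40 {
        return Err(format!("expected 40 hex chars, got {}", hex.len()));
    }
    // Check up front so slicing below is always on ASCII boundaries.
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("non-hex character in address".to_string());
    }
    let mut out = [0u8; 20];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).map_err(|e| e.to_string())?;
    }
    Ok(out)
}

/// Parse a cap sent as a decimal string (u128 is an assumption for this sketch).
fn parse_cap(s: &str) -> Result<u128, String> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("cap must be a decimal string, got {:?}", s));
    }
    s.parse::<u128>().map_err(|e| e.to_string())
}
```

Checking the digit set explicitly matters because Rust's integer parsers accept a leading `+`, which a wire format would normally want to reject.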
The keystone of the Touch-ID gate: J1_master-gated, assembles the sponsored
executeBatch([registerAgentDevice, setScope]) UserOp and returns the userOpHash
the master K11-signs.

handlers/accept.rs:
- build_accept_response (PURE, tested): request + chain reads (master account +
  nonce) + config + broker co-sign key → BuildAcceptResponse (via the slice-1 parser
  + sponsored_accept::assemble_accept_userop + into_build_response).
- accept_build (axum): bearer + verify_session_jwt + operator_omni == session omni;
  load_accept_config (env: ENTRYPOINT/PAYMASTER/BROKER_SPONSOR_SIGNER_{ADDRESS,KEY},
  registry/scope profile-aware, gas defaults as named consts); eth_call
  operatorMasterWallet (404→CONFLICT if no master) + getNonce; build_accept_response.
- Route POST /v1/accept/build wired in lib.rs.

4 unit tests (parser ×3 + build_accept_response assembles the batch op: sender==master,
0x47e1da2a executeBatch callData, userOpHash present). cargo build + clippy green.

Live prereqs (operator): the new ENTRYPOINT/PAYMASTER/sponsor-key env (set by
setup-broker-host.sh) + a deployed P256Account master (the cutover). Next: slice 3
/v1/accept/submit (handleOps, Stage B). Refs #225.
…, Stage B)

Relays the K11-signed accept UserOp to EntryPoint.handleOps — the broker is sponsor
+ submitter (VerifyingPaymaster covers the account gas; the broker EOA fronts the
outer tx, reimbursed). Submits via `cast send` (the repo's chain-mutation pattern;
E8 proved the handleOps incantation on mainnet; the broker host ships foundry).

handlers/accept.rs:
- SubmitAcceptRequest (mirror; user_op.signature now carries the K11 assertion).
- cast_handleops_arg (pure, tested) — WireUserOp → the cast PackedUserOperation tuple
  for handleOps((address,uint256,bytes,bytes,bytes32,uint256,bytes32,bytes,bytes)[],address).
- accept_submit (axum): J1 auth + cast send handleOps + parse {tx_hash, block_number}.
- Route POST /v1/accept/submit wired.

5 unit tests green; clippy clean. NOTE: passing --private-key on the command line exposes
the key in the process list (ps); production should move the submitter to the broker
fee-payer keystore (follow-up, commented in code).
Next: slice 4 daemon accept wiring (build → browser K11-sign → submit). Refs #225.
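
For illustration, rendering a wire-form UserOp as the single tuple-array argument for the handleOps signature above could look like the sketch below. The struct and function names are hypothetical (the real helper is cast_handleops_arg), and the field order simply follows the nine-field tuple in that signature.

```rust
/// Wire-form UserOp with every field already rendered as a string
/// (hypothetical struct; field order follows the handleOps tuple).
struct WireUserOp {
    sender: String,
    nonce: String,
    init_code: String,
    call_data: String,
    account_gas_limits: String,
    pre_verification_gas: String,
    gas_fees: String,
    paymaster_and_data: String,
    signature: String,
}

/// Render a one-element array of tuples, e.g. "[(a,b,...)]", for `cast send`.
fn handleops_tuple_arg(op: &WireUserOp) -> String {
    format!(
        "[({},{},{},{},{},{},{},{},{})]",
        op.sender,
        op.nonce,
        op.init_code,
        op.call_data,
        op.account_gas_limits,
        op.pre_verification_gas,
        op.gas_fees,
        op.paymaster_and_data,
        op.signature
    )
}
```

Keeping this step pure (string in, string out) is what lets it be unit-tested without spawning cast.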
… → still 502)

`sudo bash setup-broker-host.sh` sets HOME=/root and runs `command -v cast` with root's
PATH, so a foundry in the LOGIN user's ~/.foundry (observed: /home/agentkey/.foundry/bin/cast)
was invisible — the script fell into the (root) auto-install, which failed, and never copied
cast to /usr/local/bin, so the broker kept 502'ing 'spawn cast'. Now _find_cast searches
$SUDO_USER's home + any /home/*/.foundry + root in addition to command -v + $HOME, so the
operator's existing cast is found and copied regardless of sudo. Verified by simulation
(command -v empty → the $SUDO_USER fallback locates the home-dir cast). bash -n clean.
…rts) + verify receipt

New error after cast was found: "Failed to estimate gas: … revert, data: 0x". Heima's
eth_estimateGas can't simulate handleOps (a bare 0x, NOT an ERC-4337 FailedOp), and the
submit also used --json which can't parse Heima's mixHash-less receipt. Match the
mainnet-proven erc4337-register-master.sh pattern:

- pass --gas-limit (default 4_000_000, override AGENTKEYS_HANDLEOPS_GAS_LIMIT) and drop
  eth_estimateGas entirely.
- drop --json; read the tx hash from cast's human output (printed before the receipt parse
  may error), and verify the OUTCOME via a direct eth_getTransactionReceipt (new
  eth_receipt_status helper) — never cast's exit code.
- clear outcomes: did-not-broadcast vs reverted-on-chain (wrong passkey / unregistered
  master / paymaster) vs success, instead of the opaque estimation error.

Build green; 16 broker accept tests pass.
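
A sketch of the outcome classification described above, assuming the tx hash comes from cast's human output and the status string is the `status` field of the eth_getTransactionReceipt result. The enum and function are hypothetical, and the `Unconfirmed` case (hash seen, no usable receipt yet) is an addition for completeness.

```rust
/// Submit outcomes (hypothetical type; `Unconfirmed` is not named in the source).
#[derive(Debug, PartialEq)]
enum SubmitOutcome {
    /// No tx hash was printed, so nothing reached the chain.
    NotBroadcast,
    /// A hash exists but no receipt (or an unrecognized status) was returned.
    Unconfirmed,
    /// Receipt status 0x0: wrong passkey, unregistered master, paymaster, etc.
    RevertedOnChain,
    /// Receipt status 0x1.
    Success,
}

/// Decide the outcome from the chain's receipt, never from cast's exit code.
fn classify(tx_hash: Option<&str>, receipt_status: Option<&str>) -> SubmitOutcome {
    match (tx_hash, receipt_status) {
        (None, _) => SubmitOutcome::NotBroadcast,
        (Some(_), Some("0x1")) => SubmitOutcome::Success,
        (Some(_), Some("0x0")) => SubmitOutcome::RevertedOnChain,
        (Some(_), _) => SubmitOutcome::Unconfirmed,
    }
}
```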
…y + migration checklist

Single migration-facing home for every Heima-vs-Ethereum EVM divergence the
repo works around, each as gap → symptom → workaround → code site → what-changes-
on-eth, plus a Heima→Ethereum migration checklist:
  1. eth_estimateGas reverts (bare 0x) on handleOps → pinned --gas-limit
  2. mixHash-less receipts break cast/forge parsing → on-chain re-verify
  3. forge create pre-broadcast estimation errors → cast send --create
  4. forge script header validation (prevrandao) → evm_version=london
  5. EVM execution IS Cancun (header format is not a capability signal)
  6. year-prefixed chain_id (mainnet 212013, paseo 2013)

Linked from CLAUDE.md 'Heima EVM compatibility level' (capability-proof home)
and arch.md's ERC-4337 master-account paragraph (the Cancun-not-London claim).
All cited values verified against the live code; all relative links resolve.
…agnosis

Root-caused a '#225 handleOps reverted on-chain (wrong passkey)' report that was
NOT a passkey issue. Decoded the failed UserOp on-chain:
  - the broker builds the accept against registry 0x1ac62f1c / scope 0xd44b375d
    (pre-cutover, setScopeWithWebauthn) while the repo's current account-auth
    contracts are 0xdA44a8D6 / 0x5e94f76e (setScope) — the broker host was never
    re-synced after the account-auth cutover (deployment DRIFT).
  - on that stale registry the operator's operatorMasterWallet is the deployer
    EOA (no code), so the P256Account UserOp sender is an EOA → handleOps reverts.
    No passkey can fix an EOA master.

Two fixes, neither is the passkey:
  - accept.rs: /v1/accept/build now eth_getCode-checks the master and rejects an
    EOA master up front (before any Touch ID / gas) with the actionable cause,
    instead of building a doomed UserOp that misreports as 'wrong passkey'.
  - runbook: rewrite the 'handleOps reverted on-chain' troubleshooting entry to
    lead with (1) broker registry/scope drift check + re-sync, (2) EOA-master
    check, and demote the wrong-passkey case to last (it's the least likely).

Broker accept tests green (16). The onboarding fix (bind a P256Account master
instead of an EOA) + the broker re-sync are the follow-ups.
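
The up-front EOA check can be sketched as follows, assuming the input is the hex string returned by eth_getCode ("0x" for an address with no code). The function names and error text are illustrative, not the ones in accept.rs.

```rust
/// True when the eth_getCode result contains any bytecode.
fn has_code(code_hex: &str) -> bool {
    let body = code_hex.strip_prefix("0x").unwrap_or(code_hex);
    !body.is_empty()
}

/// Reject an EOA master before any Touch ID prompt or gas is spent.
fn check_master_is_account(master: &str, code_hex: &str) -> Result<(), String> {
    if has_code(code_hex) {
        Ok(())
    } else {
        Err(format!(
            "master {} has no code (EOA); bind a P256Account master and check for registry drift",
            master
        ))
    }
}
```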
… app)

Implementation plan to make web onboarding bind the master as a passkey
P256Account (not a deployer EOA), so the #225 Touch-ID accept works. Completes
E7 (erc4337-master-account.md): the browser-passkey master register that
ceremony.tsx left display-only.

Covers all three layers the user asked for, plus the actor-page UI:
  - Broker: re-sync to the current account-auth contracts + a drift guard
    (accept-env registry/scope must match the compiled chain profile, fail loud).
  - Daemon: two-phase passkey register endpoints (/v1/master/register/{build,
    submit}) replacing the single-shot EOA shell-out; dev.sh default repoint.
  - Web app: wire the browser Touch ID to the register UserOp (the two get()s);
    and SHOW the on-chain account address on the actor page (ApiActor.account_
    address — master = its P256Account, agent = device addr, EOA/none states get
    a CTA/warning).
  - Register script: stale 9-param registerFirstMasterDevice → live 10-param
    selfAttestation ABI (selector adad96a6).

Grounded in the verified 2026-06-09 on-chain diagnosis (broker on stale
pre-cutover registry 0x1ac62f1c/scope 0xd44b375d vs current 0xdA44a8D6/0x5e94f76e;
EOA master). Includes an implementation-order table, testing, migration of
existing EOA-bound operators, and open questions (3→2 Touch IDs via a contract
change; deposit funding; agent account address). Indexed from the web-flow
README + linked from erc4337-master-account.md E7.
…on, reject EOA)

Onboarding-P256Account plan, steps 1+3. The master register moves to the pure
account model so onboarding binds operatorMasterWallet[omni] = the passkey
P256Account (a contract), not a deployer EOA — the root fix for the #225 accept.

Contract (SidecarRegistry.registerFirstMasterDevice):
  - drop the K11Assertion selfAttestation param (10->8 param, selector 0x93b14d7c).
    The account's validateUserOp (run by the EntryPoint over the userOpHash, which
    commits this calldata) IS the passkey proof; the explicit #166 self-attestation
    is subsumed by the account model (resolves erc4337-master-account.md §4 for the
    dev system; first-master front-run binding is a documented prod follow-up).
  - REJECT an EOA msg.sender (new error MasterMustBeAccount) — an EOA has no
    validateUserOp, so it could never sign the downstream master mutations. This
    structurally retires the EOA-master class of bug that made handleOps revert.
  - reduces onboarding to ONE register Touch ID (2 total: enroll + register).

Tests: rewrite the first-master tests to the account model (master etched as a
contract; new test_RejectsEoaMaster; removed the self-attestation front-run test).
72 forge tests pass.

VERSION 0.1 -> 0.2 (forces redeploy + chain-profile bump on next bring-up).

Register script (erc4337-register-master.sh): 8-param calldata (selector matches
the compiled contract) + deposit 5 HEI from the deployer (per plan §8).
Onboarding-P256Account plan, steps 6-7. The daemon web onboarding now binds the
master as a passkey P256Account instead of the deployer EOA.

- K11-finish runs the erc4337-register-master.sh 'build' (deploy the P256Account
  + fund its 5-HEI EntryPoint deposit + assemble the register UserOp) and returns
  the userOpHash the browser signs (EnrollFinishResponse.register_userop_hash +
  register_account; chain='register-pending'), stashing the build context in a new
  pending_register state slot.
- NEW POST /v1/master/register/submit: decode the browser get() assertion (new CLI
  decode_web_userop_assertion), run 'submit' -> EntryPoint.handleOps, bind
  operatorMasterWallet[omni] = the P256Account, set registered_master.
- Idempotent skip (operator already has a master) short-circuits to
  chain='master-registered', no pending.
- Replaced the single-shot EOA register_master_device with run_register_script +
  register_master_build + register_master_submit. cred_id_hash = master_cred_id_
  hash(omni) (the synthetic signer key the accept also uses); pubX/pubY split from
  the K11 COSE pubkey.

73 daemon tests pass (rewrote the register-script parser tests; added split_cose_xy).
dev.sh repoint + the frontend second-Touch-ID wiring land next.
….sh repoint

Onboarding-P256Account plan, steps 8 + 11. Completes the onboarding flow so the
master binds as a passkey P256Account end-to-end.

- ceremony.tsx: after K11 enroll, when the daemon returns chain='register-pending'
  + a register userOpHash, a SECOND Touch ID (getAssertionOverHash, allowCredentials
  pinned to the master passkey just created) signs it → client.registerMasterSubmit
  → the daemon lands handleOps, binding operatorMasterWallet = the smart account.
  The register step label is no longer 'deferred (E7)'.
- client: K11EnrollResult gains chain/registerUserOpHash/registerAccount; new
  registerMasterSubmit(assertion) on the AgentKeysClient interface + the daemon
  HttpClient (POST /v1/master/register/submit) + the EmptyBackend stub.
- dev.sh: the daemon --register-master-script default → erc4337-register-master.sh
  (the passkey P256Account path), NOT the deprecated EOA register.

tsc clean (exit 0). Actor-page account-address display lands next.
…teps 9-10)

The actor detail page now shows the master's passkey P256Account address
(operatorMasterWallet) — the address the #225 accept binds — distinguishing a
bound smart-account master from an unregistered one.

- daemon: RegisteredMaster gains `account` (set on register submit + idempotent
  skip); list_actors/get_actor enrich each actor's serialized JSON with
  account_address + account_type (master → p256account, agent → device, unbound →
  none) via enrich_actor_account. get_actor now returns the enriched Value.
- client: apiToActor maps account_address/account_type → accountAddress/accountType;
  Actor type gains the fields.
- dashboard.tsx: master detail shows '<P256Account> · passkey P256Account (ERC-4337
  · operatorMasterWallet)', or a 'not yet bound — complete onboarding' hint.

73 daemon tests pass (get_actor asserts account_type); clippy -D warnings clean;
tsc clean.
The v0.2 contract redeploy (chain: redeploy commit) minted new addresses, and
audit_decode::onchain_event_decodes_calldata_against_real_abi FAILED in CI — it
hardcoded the v0.1 CredentialAudit address (0x8336968273…) and compared it to the
value the decoder resolved from the (now-updated) chain profile (0xe869E1…).

Fix: assert tx.to_address against the SAME chain profile the decoder used
(profile().contract("CredentialAudit").address), so it survives every redeploy
instead of pinning a literal. Also remove a stale scope address from a
calldata.rs comment. No other address-pinned assertions exist.

Workspace tests green; fmt clean.
…ot the stale host worker env

Root-caused a post-redeploy '#225 EOA-master' accept failure. After the operator
redeployed the v0.2 contracts (new registry 0xF50e…) + re-synced the broker, the
accept STILL read operatorMasterWallet from the OLD pre-cutover registry
0x1ac62f1c (→ the deployer EOA), even though the re-onboard correctly bound a
P256Account (0xb1aAf…, a contract) on the new registry.

Cause: setup-broker-host.sh resolved SCOPE/REGISTRY/K3 from the broker host's
/etc/agentkeys/worker-{creds,memory}.env (read_envfile_var) BEFORE sourcing
operator-workstation.env, and the later '[[ -z ]]' fallback couldn't override the
already-set (stale) value. So the broker pinned itself to the first-deployed
registry and re-running the script read its OWN stale output — never picking up a
redeploy's fresh addresses (operator-workstation.env IS updated on deploy).

Fix: contract addresses (deploy OUTPUTS, change every redeploy) now resolve
CLI flag > operator-workstation.env (source of truth) > worker-env (last resort),
by moving the worker-env reads to AFTER the env source. Buckets/RPC (operator
overrides) keep their sticky-first-run behavior. bash -n clean.
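
The precedence for contract addresses can be written down as a small pure function. The script itself is bash; this Rust sketch with hypothetical names only illustrates the ordering (CLI flag, then operator-workstation.env, then worker-env), treating empty values as unset.

```rust
/// Resolve a contract address: CLI flag > operator-workstation.env > worker-env.
/// Empty strings count as unset, mirroring a `[[ -z ]]` style check.
fn resolve_address<'a>(
    cli_flag: Option<&'a str>,
    workstation_env: Option<&'a str>,
    worker_env: Option<&'a str>,
) -> Option<&'a str> {
    let non_empty = |v: Option<&'a str>| v.filter(|s| !s.is_empty());
    non_empty(cli_flag)
        .or(non_empty(workstation_env))
        .or(non_empty(worker_env))
}
```

With this ordering a stale worker-env value can no longer shadow the fresh address that a redeploy writes to operator-workstation.env.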
…rd wrong-passkey trap)

Root-caused a post-bind '#225 SIG_VALIDATION_FAILED' accept: re-onboarding mints a
BRAND-NEW passkey each time (create() always does) + overwrites ak_master_cred_id,
but registerFirstMasterDevice is first-master-only so it can't re-bind — the account
stays on the FIRST passkey while the accept auto-selects the LATEST one → AA24.

Fixes:
- ceremony.tsx (B1): if the master is ALREADY bound (onboarding-state chain ==
  'master-registered'), do NOT create() a new passkey — reuse the bound one (or, if
  the local pointer is gone, fall back so the UI offers reset). No more orphans.
- ceremony.tsx (B2): persist ak_master_cred_id ONLY after registerMasterSubmit
  SUCCEEDS (and never overwrite it on an idempotent skip) — a stored-but-unbound
  pointer IS the wrong-passkey trap.
- daemon (B3): POST /v1/master/reset clears registered_master + pending_register +
  the persisted master-session coords (keeps the email/J1 session) so onboarding
  drops to chain:'none' + can re-enroll. Cannot touch on-chain operatorMasterWallet
  (first-master-only) or the OS passkey (WebAuthn) — the response note says so.
- client: resetMaster() on the interface + HttpClient + EmptyBackend.
- webauthn.ts (B4): masterPasskeyPresent(credId) — best-effort 'is the passkey still
  there?' probe (the operator may have deleted it in System Settings).
- App.tsx: a resetMaster handler + an actionable toast — on an accept SIG_VALIDATION/
  revert or a missing-passkey Touch-ID failure, surface a 'Reset master' button + the
  delete-passkey + re-onboard guidance.

Filed agentKeys#231 for the deferred broker drift guard. 73 daemon tests, tsc clean,
fmt clean.
…oss onboarding/accept

Diagnose Touch-ID / SIG_VALIDATION_FAILED bugs at a glance: filter the browser console
by '[agentkeys]' and compare the master ACCOUNT + the signing PASSKEY (credential id)
at the two moments they must agree.

New lib/debug.ts::akLog(event, data). Logged moments:
  onboarding (ceremony.tsx):
    - K11 passkey CREATED (generatedCredentialId, rpId)
    - master already bound → REUSING passkey / bound-but-no-local-pointer
    - master account assembled (account, chain, registerUserOpHash, passkeyCredentialId)
    - signing REGISTER userOpHash (account, signingCredentialId)
    - master REGISTERED + signer persisted (account, txHash, boundCredentialId) / failed
  accept (App.tsx):
    - built UserOp (masterAccount = operatorMasterWallet, userOpHash, entryPoint)
    - signing userOpHash (requestedCredentialId)
    - assertion produced (requestedCredentialId vs signingCredentialId + autoSelectMatched)
    - submit OK ✅ / FAILED ❌ (masterAccount, signingCredentialId, txHash/detail)
    - reset: cleared credential id

The decisive line: boundCredentialId (onboarding) vs signingCredentialId (accept) — a
mismatch (or autoSelectMatched:false) is the wrong-passkey trap. No secrets logged
(credential ids/addresses/hashes are public). tsc clean.
The reset was reachable ONLY via the failure toast (after an accept reverted). Add a
persistent 'reset master' danger button in the master actor detail header (where agents
get 'revoke device'), so the operator can reset proactively — e.g. when they deleted
the master passkey in System Settings and know they must re-onboard. A window.confirm
spells out that it does NOT delete the OS passkey or unbind on chain. Wired to the
existing resetMaster handler. tsc clean.
…ploying

registerFirstMasterDevice is first-master-ONLY, so operatorMasterWallet[omni]
is immutable once set. If the operator loses/deletes the master passkey, its
on-chain P256Account is unusable and the ONLY recovery was redeploying the whole
contract set — and the web "reset master" button only cleared LOCAL state, so a
re-onboard still hit the immutable binding and accept kept failing
SIG_VALIDATION_FAILED (the reported bug).

Add resetMaster(bytes32 operatorOmni) as an owner-gated (deployer) escape hatch
that unbinds in place, plus wire it end-to-end so the reset button actually
clears the on-chain binding:

- Contract (VERSION 0.2->0.3): resetMaster wipes the operator's whole device
  list + clears operatorMasterWallet/recoveryThreshold/operatorNonce; `owner`
  immutable captured at construction; MasterReset event; NotOwner error. Two new
  forge tests (owner-only gating + clears-and-allows-re-register). 26/26 pass.
- scripts/heima-reset-master.sh: surgical recovery helper (like
  heima-device-revoke.sh) — owner-gated cast send, idempotent (skips when already
  address(0)), probes owner() to detect a pre-0.3 registry and fail loud with the
  redeploy path.
- Daemon POST /v1/master/reset now clears BOTH local state AND the on-chain
  binding (reset_master_onchain shells out to the script); the response carries
  an `onchain` block so the UI reports whether the unbind landed.
- Frontend: MasterResetResult/MasterResetOnchain types; the resetMaster handler +
  the "reset master" button copy now say it unbinds on-chain (no longer
  "local only"); surfaces a clear failure note when the on-chain step didn't land.
- Docs: arch.md master-mutation table gains a resetMaster row;
  deployed-contracts.md SidecarRegistry ABI fixed to the live 8-param
  registerFirstMasterDevice (dropping the stale 9-param self-attestation form —
  pre-existing drift from the deferred step-13 doc sync) + resetMaster/owner
  added + version line 0.1->0.2-live/0.3-pending; plan doc §5/§5a updated.

Operator follow-up to make 0.3 live: FORCE_DEPLOY=1 bash scripts/heima-bring-up.sh,
then setup-broker-host.sh --ref main, then re-onboard. The fresh deploy also
clears any currently-stranded binding (new registry = empty state).
…redeploy)

build_if_needed only watched *.rs under each binary's own crate dir, so a
heima-bring-up.sh redeploy — which rewrites crates/agentkeys-core/chain-profiles/
*.json, a different crate and not a *.rs — was missed. dev.sh then printed
"binary is current — skipping build" and ran a STALE binary compiled (via
include_str!) with the OLD contract addresses, silently pointing the local stack
at the orphaned registry after every redeploy.

Watch the chain-profiles JSON explicitly: a redeploy now forces the rebuild
(cargo recompiles agentkeys-core off the changed include_str! dep, then the
daemon + mcp binaries), so `bash dev.sh` alone picks up new addresses.
…ding warn)

The "chain profile changed — rebuilding" line only fires when the rebuild is
triggered SPECIFICALLY by the profile being newer than an existing binary; on a
first run (no binary) or after a code change (*.rs newer) it rebuilds via those
branches instead, so that line is skipped even though the rebuild happened. You
then can't tell which contract set the running stack is on.

Add log_chain_profile() — unconditional, every run, printed right after the
builds: `chain profile: <chain> — contract_set_version X · SidecarRegistry 0x…`,
read from the same crates/agentkeys-core/chain-profiles/<chain>.json the binaries
include_str! at compile time. When the source VERSION (expected set version) is
ahead of the deployed contract_set_version, it prints a loud REDEPLOY PENDING
warning with the FORCE_DEPLOY=1 heima-bring-up.sh command — so a stale-address
run is impossible to miss. Graceful when jq is absent / AGENTKEYS_CHAIN has no
profile.
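
The REDEPLOY PENDING condition boils down to a version comparison. A minimal sketch, assuming both versions are dot-separated non-negative integers such as "0.2" and "0.3"; the helper names are hypothetical and dev.sh does this in bash with jq.

```rust
/// Parse "0.3" into [0, 3]; returns None if any part is not a number.
fn parse_version(v: &str) -> Option<Vec<u32>> {
    v.split('.').map(|part| part.parse::<u32>().ok()).collect()
}

/// Some(true) when the source VERSION is ahead of the deployed
/// contract_set_version, None when either value cannot be parsed
/// (the caller can then skip the warning gracefully).
fn redeploy_pending(source_version: &str, deployed_version: &str) -> Option<bool> {
    let source = parse_version(source_version)?;
    let deployed = parse_version(deployed_version)?;
    // Vec comparison is lexicographic over the numeric parts.
    Some(source > deployed)
}
```

Comparing numeric parts rather than raw strings avoids "0.10" sorting below "0.9".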
dev.sh's build_if_needed/build_wasm short-circuit on an mtime/stamp heuristic, so
when that heuristic is wrong (a dependency change the watched dirs miss) the stack
runs stale binaries with no override. Add an explicit escape hatch:

  bash dev.sh --force        # (or -f, or FORCE_BUILD=1 bash dev.sh)

--force sets FORCE_BUILD=1, which makes build_if_needed rebuild the daemon + mcp
unconditionally and build_wasm bypass its stamp cache. A preflight line announces
"--force → rebuilding ..."; --help documents it; unknown args are rejected with a
hint. Plain `bash dev.sh` is unchanged (FORCE_BUILD defaults to 0).
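
The rebuild decision, including the force override and the chain-profile inputs from the earlier commit, can be sketched as one predicate. This is an illustration in Rust with hypothetical names; mtimes are plain integers here so the logic can be checked without touching the filesystem.

```rust
/// Decide whether to rebuild.
/// `binary_mtime` is None when no binary exists yet.
/// `input_mtimes` covers every watched input: the crate's *.rs files and
/// the chain-profile JSON files pulled in via include_str!.
fn needs_rebuild(binary_mtime: Option<u64>, input_mtimes: &[u64], force: bool) -> bool {
    if force {
        return true; // explicit escape hatch, ignores the heuristic
    }
    match binary_mtime {
        None => true, // first run
        Some(built) => input_mtimes.iter().any(|&m| m > built),
    }
}
```

The heuristic is only as good as the watched input list, which is why an unconditional override is still useful.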
The reset-master control was only on the master actor's detail page (you had to
drill into the actor to find it). Add it to the global nav "account" section,
directly under "log out", in danger red with a [⟲] marker and a confirm dialog —
so the recovery path is discoverable from anywhere. Calls the existing resetMaster
handler (unbinds local + on-chain). The contextual button on the master detail
page stays. tsc clean.
…+ env)

FORCE_DEPLOY=1 heima-bring-up.sh deployed the fresh 0.3 contract set on-chain
(SidecarRegistry 0xC63E6f64…, + new AgentKeysScope/K3EpochCounter/CredentialAudit,
replacing the orphaned 0.2 0xF50ef960… set) and rewrote heima.json +
operator-workstation.env — but those were left UNCOMMITTED in the working tree.

Consequence (the #225 "still SIG_VALIDATION_FAILED" report): the local daemon
compiled the new registry (0xC63E6f64) and onboarded the master into it, but the
broker deploys from origin/<branch>, which still had the OLD 0xF50ef960 — so the
broker read operatorMasterWallet from the orphaned registry (the stranded
0xb1aaf7 master), built the accept UserOp for it, and the new passkey signature
failed on-chain. Split registry = guaranteed accept failure.

Commit the deploy artifacts so the broker host can deploy from the same set the
daemon runs. profile ⟷ env verified in sync (check-deployed-contracts-sync.sh);
VERSION == contract_set_version == 0.3. deployed-contracts.md updated to 0.3-live
with a loud "rebuild the broker from the same committed profile" note.

Operator step to finish: setup-broker-host.sh --ref claude/225-onchain-k11-accept
on the broker host, then reset + re-onboard once.
…rtifacts)

Root cause of the recurring #225 "still SIG_VALIDATION_FAILED after re-onboard":
heima-bring-up.sh rewrites heima.json + operator-workstation.env on a fresh deploy
but does NOT commit them, and its step-7 reminder never named the consequence. So
an operator redeploys the broker (which pulls origin/<branch>) before pushing the
new addresses → broker compiles the OLD registry while the daemon onboards into the
NEW one → broker reads operatorMasterWallet from the orphaned registry → accept
handleOps reverts SIG_VALIDATION_FAILED, looking exactly like a wrong-passkey bug.

Enforcement (per the CI-guard + CLAUDE.md rule bar):
- heima-bring-up.sh: after a fresh deploy, actively `git status` the two machine
  mirrors and, if uncommitted, print a loud guard naming the split-registry failure
  + the exact commit/push/setup-broker-host sequence. Warning only (the on-chain
  deploy already happened), graceful when git is absent.
- CLAUDE.md deployed-contract-registry rule 2: COMMIT + PUSH heima.json +
  operator-workstation.env BEFORE redeploying the broker; names the failure mode +
  the deploy → commit+push → broker-redeploy order.
…hanged)

Onboarding fires Touch ID twice — create() the passkey, then get() to sign its
on-chain registration UserOp — and users found the second prompt confusing/random
(the userOpHash can't be known until create() yields the pubkey, so the two
ceremonies are inherent; see the #225 discussion). Don't change the flow; make the
UI say so up front:

- Up-front banner above the trust-core ceremony: "Touch ID is requested twice …
  the second prompt is expected, not a retry or an error."
- New CeremonyStep.touchId field → a "Touch ID · 1 of 2" / "2 of 2" badge on the
  two biometric steps (amber, distinct from the info-blue on-chain badge).
- Step subs spell out FIRST/SECOND prompt + "No Touch ID here" on the wallet step
  between them, so the gap doesn't read as the flow stalling.

Pure presentation: same action/onchain/order, tsc clean.
…lance check

The onboarding register escrowed a 5 HEI EntryPoint deposit per master, but the
register UserOp is the ONLY thing that deposit pays (accept/scope are
paymaster-sponsored) and it was MEASURED at ~0.028 HEI on Heima — so 5 HEI was
~180× the real cost. The unused ~4.97 HEI was stranded in each account's EntryPoint
deposit (unrecoverable after a master reset), which drained the prod deployer to
0.44 HEI. depositTo then couldn't send 5 HEI it didn't have → build exits 1 → no
2nd Touch ID → device_not_active → config/init 403 (the reported failure).

- ERC4337_DEPOSIT_WEI default 5→0.2 HEI (7× the measured ~0.03 HEI cost);
  ERC4337_MIN_DEPOSIT_WEI 1→0.05 HEI. A 0.44 HEI deployer now onboards without a
  top-up, and the dev re-onboard loop stops burning 5 HEI a pop.
- Pre-flight: before depositTo, check the deployer can afford deposit + 0.1 HEI
  (gas + Heima ExistentialDeposit keep-alive, so the spend can't reap the
  deployer), and fail with the real reason + both fixes (top up OR lower
  ERC4337_DEPOSIT_WEI) instead of the opaque "depositTo did not land". awk compare
  so a large deployer balance can't overflow bash 64-bit arithmetic.
- Plan doc §8 + step-3 table updated to 0.2 HEI with the rationale.
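
The pre-flight check amounts to one comparison in integer wei. A sketch under the assumption that 1 HEI is 10^18 wei; the constant and function names are hypothetical, and the real script does the comparison in bash with awk.

```rust
/// Assumption: HEI uses 18 decimals, like ETH.
const WEI_PER_HEI: u128 = 1_000_000_000_000_000_000;
/// 0.1 HEI kept back for gas and the ExistentialDeposit keep-alive.
const RESERVE_WEI: u128 = WEI_PER_HEI / 10;

/// Ok if the deployer can afford the deposit plus the reserve.
fn preflight_deposit(balance_wei: u128, deposit_wei: u128) -> Result<(), String> {
    let needed = deposit_wei
        .checked_add(RESERVE_WEI)
        .ok_or("deposit + reserve overflows u128".to_string())?;
    if balance_wei >= needed {
        Ok(())
    } else {
        Err(format!(
            "deployer has {} wei but needs {} wei: top up or lower ERC4337_DEPOSIT_WEI",
            balance_wei, needed
        ))
    }
}
```

With the 0.2 HEI default, a 0.44 HEI deployer passes (0.3 HEI needed), while the old 5 HEI deposit would be refused up front with the real reason.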
…se) + maxFee→40 gwei

The #225 accept failed handleOps with AA24 ("wrong passkey / SIG_VALIDATION_FAILED")
even after the master registered correctly, the registry/account/paymaster all
validated, and the signed userOpHash matched on-chain. On-chain forensics proved it
was NOT a signature problem: account.validateUserOp returns SIG_OK under unlimited
(eth_call) gas, and K11Verifier.verifyAssertion(submitted) returns true.

Root cause: the on-chain WebAuthn/P-256 verify is pure-Solidity on Heima (no RIP-7212
precompile) and costs ~1M+ gas. The broker set the accept's verificationGasLimit to
600k, so the verify ran OUT OF GAS inside the account's
`try checkUserOpSignature() catch { SIG_FAIL }` — the catch maps the OOG to
SIG_VALIDATION_FAILED, which the EntryPoint reports as AA24 (not the AA23 a bare OOG
gives). The working passkey REGISTER UserOp uses 1.5M and passes; the accept used 600k
and failed. The accept's maxFee (2 gwei) was also below Heima's ~25 gwei base fee.

- DEF_VERIFICATION_GAS_LIMIT 600_000 → 1_500_000 (matches the proven register value).
- DEF_MAX_FEE 2 gwei → 40 gwei (clears base+priority; Σ(gasLimits)×maxFee ≈ 0.15 HEI
  stays under the paymaster's 0.2 HEI deposit).
- docs/spec/heima-eth-gap.md: new gap #7 (pure-Solidity P-256 verify gas) + migration
  checklist item, so the next operator/migration knows AA24-after-valid-signer = gas.
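
The budget arithmetic behind the 40 gwei choice can be checked with a few lines. This sketch assumes 1 HEI is 10^18 wei; the 3,750,000 total gas used in the usage note is derived from the ≈0.15 HEI figure at 40 gwei, not read from the broker constants, and the function names are hypothetical.

```rust
const GWEI: u128 = 1_000_000_000;
/// Assumption: HEI uses 18 decimals, like ETH.
const WEI_PER_HEI: u128 = 1_000_000_000_000_000_000;

/// Worst-case charge: sum of gas limits times maxFeePerGas (None on overflow).
fn max_cost_wei(total_gas_limit: u128, max_fee_per_gas_wei: u128) -> Option<u128> {
    total_gas_limit.checked_mul(max_fee_per_gas_wei)
}

/// True when the worst-case charge is covered by the paymaster deposit.
fn fits_deposit(total_gas_limit: u128, max_fee_per_gas_wei: u128, deposit_wei: u128) -> bool {
    match max_cost_wei(total_gas_limit, max_fee_per_gas_wei) {
        Some(cost) => cost <= deposit_wei,
        None => false,
    }
}
```

For example, `max_cost_wei(3_750_000, 40 * GWEI)` is 0.15 HEI, and a 0.2 HEI deposit covers at most 5,000,000 gas at that fee, so any further increase to the gas limits needs to be checked against the deposit.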
…he list

The #225 E7 accept (/v1/accept/build+submit) registers the agent on-chain, but its
broker SubmitAcceptRequest carries no request_id — so the broker can't drop the
rendezvous row itself, and the accept flow never acked it. Result: a successful
accept (txHash + "submit OK") left the pairing request stuck in
GET /v1/agent/pairing/pending, so the UI never updated even though the chain
mutation landed. (The legacy register_pairing path acks via mark_bound; the E7
path didn't.)

- Daemon: new POST /v1/agent/pairing/ack (J1-gated, no Touch ID) forwarding
  {request_id} to the broker's existing /v1/agent/pending-bindings/ack → mark_bound,
  mirroring decline_pairing.
- Client: ackPairing(requestId) (+ interface + EmptyBackend).
- App.tsx accept success: optimistically drop the request from local state, then
  ackPairing(req.id) so it stays gone, then refreshPairing + listActors (the agent
  now shows up bound).

No broker change — ack_binding already exists; it ships with the broker gas-fix
redeploy. Daemon rebuilds locally (dev.sh); frontend hot-reloads.
@hanwencheng hanwencheng merged commit b9735e9 into main Jun 9, 2026
8 checks passed