Broker drift guard: fail loud when accept-config registry/scope ≠ compiled chain profile #231

@hanwencheng

Description

Context

Twice now (2026-06-09), a stale broker contract-address env caused the #225 Touch-ID accept to fail with a misleading symptom: the broker read operatorMasterWallet from an old registry while the redeployed one was live, which surfaced as "operator master 0x… is a legacy EOA" / "wrong passkey (SIG_VALIDATION_FAILED)". Both incidents cost a long diagnosis loop before the real cause (registry drift) was found.

  • 1st incident: broker on the pre-cutover registry 0x1ac62f1c while the repo had 0xdA44a8D6.
  • 2nd incident: after a v0.2 redeploy (0xF50e…), setup-broker-host.sh resolved the registry from the host's stale worker env, which took precedence over operator-workstation.env. Fixed in de4bbd1 (precedence: CLI flag > operator-workstation.env > worker-env). But nothing in the broker itself caught the drift; it silently built doomed UserOps.

Scope

Add a drift guard in the broker (in crates/agentkeys-broker-server/src/handlers/accept.rs::load_accept_config, or at broker startup) that cross-checks the accept-config SIDECAR_REGISTRY_ADDRESS / SCOPE_CONTRACT_ADDRESS / ENTRYPOINT_ADDRESS (from env) against the compiled-in agentkeys_core::chain_profile::ChainProfile (the source of truth, built from the include_str!'d heima.json). On mismatch, fail loud with an actionable error that names BOTH addresses and the re-sync command, instead of serving the accept against the wrong contracts.
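A minimal sketch of the cross-check, to make the intended shape concrete. It is not the broker's actual code: AddrDrift, normalize, and check_drift are hypothetical names, the profile addresses are passed in as plain strings rather than read through the real ChainProfile API, and the case-insensitive comparison (so a checksummed and a lowercase form of the same address do not count as drift) is an assumption. The error wording is illustrative.

```rust
use std::fmt;

/// One address that disagrees between the accept env and the compiled profile.
/// (Hypothetical type, introduced for this sketch only.)
#[derive(Debug, PartialEq)]
pub struct AddrDrift {
    pub env_var: &'static str,
    pub env_addr: String,
    pub profile_addr: String,
}

impl fmt::Display for AddrDrift {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Name BOTH addresses and the re-sync command so the error is actionable.
        write!(
            f,
            "accept-env {}={} != chain profile {}; the broker is on a STALE deployment; \
             re-sync: setup-broker-host.sh --ref <branch>",
            self.env_var, self.env_addr, self.profile_addr
        )
    }
}

/// Normalize a hex address for comparison: trim, drop a 0x/0X prefix, lowercase.
/// Assumption: mixed-case (checksummed) and lowercase forms should compare equal.
fn normalize(addr: &str) -> String {
    let a = addr.trim();
    let a = a
        .strip_prefix("0x")
        .or_else(|| a.strip_prefix("0X"))
        .unwrap_or(a);
    a.to_ascii_lowercase()
}

/// Check (env var name, env value, profile value) triples and return the
/// first mismatch, if any.
pub fn check_drift(pairs: &[(&'static str, &str, &str)]) -> Result<(), AddrDrift> {
    for &(env_var, env_addr, profile_addr) in pairs {
        if normalize(env_addr) != normalize(profile_addr) {
            return Err(AddrDrift {
                env_var,
                env_addr: env_addr.to_string(),
                profile_addr: profile_addr.to_string(),
            });
        }
    }
    Ok(())
}
```

In load_accept_config the three triples would presumably be built from the SIDECAR_REGISTRY_ADDRESS, SCOPE_CONTRACT_ADDRESS, and ENTRYPOINT_ADDRESS env values paired with the matching contract addresses from the compiled profile.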

This was deferred as step 4 of the onboarding-P256Account plan (docs/plan/web-flow/onboarding-p256account-master.md §4 "Broker drift guard").

Acceptance

  • A broker whose accept-env registry/scope/entrypoint ≠ the compiled chain profile fails at config load / boot (or refuses the /v1/accept/* route) with accept-env SIDECAR_REGISTRY_ADDRESS=0xA != chain profile 0xB — the broker is on a STALE deployment; re-sync: setup-broker-host.sh --ref <branch>.
  • Matched env → no error (normal operation).
  • CI-safe escape hatch: the CI/test env may legitimately differ from the compiled prod profile, so gate the hard-fail behind a default-on check with an explicit override env (AGENTKEYS_ACCEPT_ALLOW_ADDR_OVERRIDE=1 → downgrade to a tracing::warn!). The CI materializer sets the override.
  • Unit test: mismatched env → config error; matched → ok; override env → warn not fail.
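The three acceptance cases (match, mismatch, mismatch with override) could be modeled roughly as below. This is a sketch under assumptions: GuardOutcome, override_enabled, and apply_drift_policy are hypothetical names, eprintln! stands in for tracing::warn! so the example needs only the standard library, and treating only the exact value "1" as enabling the override is an assumption about how AGENTKEYS_ACCEPT_ALLOW_ADDR_OVERRIDE would be parsed.

```rust
/// Result of applying the drift policy when the guard does not hard-fail.
/// (Hypothetical type, introduced for this sketch only.)
#[derive(Debug, PartialEq)]
pub enum GuardOutcome {
    /// Env addresses matched the compiled chain profile.
    Ok,
    /// Drift was detected but downgraded to a warning by the override.
    Warned(String),
}

/// Interpret the raw value of AGENTKEYS_ACCEPT_ALLOW_ADDR_OVERRIDE.
/// Assumption: only the exact string "1" turns the override on.
pub fn override_enabled(value: Option<&str>) -> bool {
    matches!(value, Some("1"))
}

/// Apply the default-on policy: drift is a hard error unless the override is set.
/// `drift` is the already-formatted mismatch message, or None when everything matched.
pub fn apply_drift_policy(
    drift: Option<String>,
    allow_override: bool,
) -> Result<GuardOutcome, String> {
    match drift {
        None => Ok(GuardOutcome::Ok),
        Some(msg) if allow_override => {
            // Stand-in for tracing::warn! in the real broker.
            eprintln!("WARN: {} (override active, continuing)", msg);
            Ok(GuardOutcome::Warned(msg))
        }
        Some(msg) => Err(msg),
    }
}
```

A caller could obtain the flag with override_enabled(std::env::var("AGENTKEYS_ACCEPT_ALLOW_ADDR_OVERRIDE").ok().as_deref()). Keeping the env read outside apply_drift_policy lets the unit tests cover all three cases without mutating process-wide environment state.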

Effort

~S — a config-load cross-check (the ChainProfile::resolve + .contract(name).address comparison is already sketched in the plan §4), an override escape, and a unit test.

Metadata

Kind: Bug · Priority: High (recurring, high diagnosis cost) · Size: S · Area: broker

Labels: area/broker (Broker server, cap-token issuance, OIDC issuance)
