docs: Agent IAM strategy reset — hooks-first wire architecture#140
Merged
Companion implementation PR: #141
Strategic reset for the AI-device wedge (issue #103 lineage):

- Move agent-iam-strategy.md from docs/research/ to docs/ (it is the strategic anchor, not third-party research); update all 13 references.
- arch.md §22d: IAM-guarantee delivery (hooks-first, proxy-fallback) + the agentkeys wire CLI surface.
- New wiki glossary: IAM tool vs IAM guarantee, hooks-vs-proxy trade-off, verified hook-availability table across six runtimes.
- New plan docs/spec/plans/phase-1-fresh-user-wire-onboarding.md (the 7-step fresh-user journey + manual-vs-automatic hybrid decision).
- strategy doc §3.6/§3.7 + §4 Phase 1 + §5 Phase 3/3b updates.
- Archive the superseded Rust-runtime approach (demo runbook, verify script, setup script, sandbox Dockerfile) under docs/archived/.

The operator runbook (docs/operator-runbook-wire.md) ships in the companion implementation PR #141, since it documents the agentkeys wire + hook commands that land there.
hanwencheng added a commit that referenced this pull request on May 28, 2026
Phase 1.a of the fresh-user wire-onboarding plan. Turns the shipped MCP tools (#107) into IAM guarantees the LLM cannot bypass, via Task-Host lifecycle hooks (issue #133 track). Companion docs (strategy/arch/wiki/plan) land in PR #140.

- `agentkeys hook check|audit|memory-inject` (src/hook.rs): thin MCP JSON-RPC clients invoked by the wire-generated hook scripts. Read the host stdin payload, call an AgentKeys MCP tool, emit host-shaped stdout JSON. `check` fails CLOSED; audit + memory-inject never block.
- `agentkeys wire <runtime>` (src/wire.rs): RuntimeAdapter trait + HermesAdapter. Detects Hermes, writes hook scripts to ~/.hermes/agent-hooks/, merges a sentinel-managed `hooks:` block into ~/.hermes/config.yaml (preserves other keys, refuses to clobber a foreign hooks:), sets hooks_auto_accept: true, verifies via `hermes hooks doctor`. Idempotent (ok/skip/fail per step); --check-only reports drift without writing.
- CLI wiring: Commands::Wire + Commands::Hook + HookAction in main.rs; pub mod hook/wire in lib.rs.
- Operator runbook (docs/operator-runbook-wire.md): the 7-step fresh-user flow + three-act demo verification, moved here from the docs PR since it documents these exact commands.

13 unit tests (6 hook + 7 wire). Smoke-tested end-to-end against the in-memory MCP backend: Act 1 memory injection, Act 2 over-cap denial, auto-audit, and the full wire apply->idempotent-rerun->check-only cycle.
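The sentinel-managed merge described in this commit message can be sketched as below. This is a minimal illustration, not the code in src/wire.rs: the sentinel strings, the `Step` enum, and the `merge_hooks_block` name are assumptions, and the line-based detection of a foreign `hooks:` key is a simplification of real YAML handling. It shows the refuse-on-foreign behavior of this commit, which a later commit in the same PR changes to replace.

```rust
/// Outcome of one wire step, mirroring the ok/skip/fail reporting.
#[derive(Debug, PartialEq)]
pub enum Step {
    /// The config changed; carries the new text.
    Ok(String),
    /// The managed block is already up to date.
    Skip,
    /// The merge was refused, with a reason.
    Fail(String),
}

// Hypothetical sentinel comments; the real markers may differ.
const BEGIN: &str = "# >>> agentkeys managed hooks >>>";
const END: &str = "# <<< agentkeys managed hooks <<<";

/// Merge `block` (a YAML `hooks:` mapping) into `config` between sentinels.
pub fn merge_hooks_block(config: &str, block: &str) -> Step {
    let managed = format!("{}\n{}\n{}", BEGIN, block.trim_end(), END);

    if let (Some(start), Some(end)) = (config.find(BEGIN), config.find(END)) {
        if start < end {
            let stop = end + END.len();
            if &config[start..stop] == managed.as_str() {
                return Step::Skip; // idempotent re-run: nothing to write
            }
            // Replace only our own block; every other key is preserved.
            let mut out = String::new();
            out.push_str(&config[..start]);
            out.push_str(&managed);
            out.push_str(&config[stop..]);
            return Step::Ok(out);
        }
    }

    // No managed block found: a top-level `hooks:` key belongs to someone else.
    if config.lines().any(|line| line.starts_with("hooks:")) {
        return Step::Fail("foreign top-level hooks: block present".to_string());
    }

    let mut out = config.to_string();
    if !out.is_empty() && !out.ends_with('\n') {
        out.push('\n');
    }
    out.push_str(&managed);
    out.push('\n');
    Step::Ok(out)
}
```

Running the function twice on its own output returns `Step::Skip`, which is the property a `--check-only` style drift report could rely on.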
718362f to 0c6ce7f
This was referenced May 28, 2026
hanwencheng added a commit that referenced this pull request on May 31, 2026
* feat(cli): agentkeys wire + hook - IAM-guarantee hooks for Hermes

Phase 1.a of the fresh-user wire-onboarding plan. Turns the shipped MCP tools (#107) into IAM guarantees the LLM cannot bypass, via Task-Host lifecycle hooks (issue #133 track). Companion docs (strategy/arch/wiki/plan) land in PR #140.

- `agentkeys hook check|audit|memory-inject` (src/hook.rs): thin MCP JSON-RPC clients invoked by the wire-generated hook scripts. Read the host stdin payload, call an AgentKeys MCP tool, emit host-shaped stdout JSON. `check` fails CLOSED; audit + memory-inject never block.
- `agentkeys wire <runtime>` (src/wire.rs): RuntimeAdapter trait + HermesAdapter. Detects Hermes, writes hook scripts to ~/.hermes/agent-hooks/, merges a sentinel-managed `hooks:` block into ~/.hermes/config.yaml (preserves other keys, refuses to clobber a foreign hooks:), sets hooks_auto_accept: true, verifies via `hermes hooks doctor`. Idempotent (ok/skip/fail per step); --check-only reports drift without writing.
- CLI wiring: Commands::Wire + Commands::Hook + HookAction in main.rs; pub mod hook/wire in lib.rs.
- Operator runbook (docs/operator-runbook-wire.md): the 7-step fresh-user flow + three-act demo verification, moved here from the docs PR since it documents these exact commands.

13 unit tests (6 hook + 7 wire). Smoke-tested end-to-end against the in-memory MCP backend: Act 1 memory injection, Act 2 over-cap denial, auto-audit, and the full wire apply->idempotent-rerun->check-only cycle.

* test(harness): phase1 wire end-to-end harness + session-bearer plumbing

Adds harness/phase1-wire-demo.sh, the idempotent end-to-end test for the agentkeys wire + hook flow, reusing the setup-heima.sh account (master = MacBook, agent = aiosandbox). Two modes: --light (in-memory MCP, fully self-contained) and default (real broker + workers + Heima mainnet). Agent-side steps run via the sandbox REST API (/v1/shell/exec, /v1/file/upload); the aarch64-linux agent binary is cross-built in an arm64 rust container. Manual gates: LLM key, real Touch ID at scope grant, the Hermes surprise + confirm. ok/skip/fail per step; bash 3.2 portable.

Also closes the session-bearer gap the real-broker path needs (arch.md §22b.4 "cap-mint daemon->broker auth: session JWT only"): the hook now forwards X-AgentKeys-Session-Bearer (env AGENTKEYS_SESSION_BEARER), and `agentkeys wire` bakes it into the generated hook scripts. The MCP server already relays it to the broker cap-mint; in-memory backend ignores it.

Test plan + automation decisions: docs/spec/plans/phase1-wire-harness-test-plan.md. CLAUDE.md: one-line note that harness/demo testing runs on Heima mainnet.

Validated: hook 6/6 + wire 7/7 unit tests green; --light Phase 0 + --skip mechanism confirmed on bash 3.2. Full live in-sandbox run pending a reachable rust registry (cross-build) + hermes in the sandbox.

* test(harness): auto-resolve operator_omni + session bearer in Phase 0

Phase 0 was failing for two reusable-account fields that don't need operator input:

- operator_omni: read from the agent file (heima-agent-create.sh writes both actor_omni AND operator_omni), same as actor_omni.
- session bearer: read the JWT from the master session file ~/.agentkeys/<session-id>/session.json (.token), with a soft expiry warning (created_at + ttl_seconds < now) so a stale session is flagged before cap-mint 401s.

Both still accept --flag / env overrides. Verified: Phase 0 now passes end-to-end in real mode against the reused heima account.

* test(harness): use OPENROUTER_API_KEY env as the 0.6 LLM-key fallback

0.6 (LLM key) no longer always prompts: it falls back to OPENROUTER_API_KEY / LLM_API_KEY from the environment (export it in ~/.zshenv) and only prompts when neither is set. The resolved key is used in Phase 4 to configure the sandbox Hermes model (provider/base_url/default/api_key) so the surprise chat works; if no key is available, Phase 4 is skipped cleanly.

Verified: Phase 0 now runs fully unattended in real mode (0.6 auto from env, 0.7 from the master session). Never bakes the key into the repo.

* test(harness): add --fast host mode + rewrite runbook harness-first

--fast: host-only inner loop (seconds), with the in-memory MCP on the Mac + the three hook acts run directly. No sandbox, no aarch64 cross-build, no hermes, no account. Verified: Act 1 memory, Act 2 deny/allow, audit all green.

operator-runbook-wire.md rewritten as the single "run the demo" doc: harness-first (TL;DR one-command per mode), the three modes table (--fast / --light / real), prerequisites per mode, the manual gates (LLM key auto from OPENROUTER_API_KEY, Touch ID, the surprise), updated troubleshooting (session refresh, RUST_BUILD_IMAGE), and the old manual 7-step flow kept as Appendix B.

* test(harness): remove --fast mode

Drop the host-only --fast path (run_fast() + flag + main branch + header). Back to two modes: --light (sandbox, in-memory backend) and default (real broker + workers + Heima mainnet). Runbook updated to match; --light is the lighter inner loop.

* test(harness): cache cross-build + fix sandbox wiring end-to-end

Verifying the demo end-to-end surfaced a chain of bugs that silent failures (|| true, multi-line ErrorObservation, existence-only checks) had been masking. All fixed; a --light run is now fully green and the Hermes surprise is memory-aware (references the Chengdu travel memory).

Build caching (re-runs fast + idempotent like a local cargo build):

- derived builder image (pkg-config + libssl-dev baked once; reqwest's default native-tls links libssl) instead of apt-get per build
- named docker volumes for the cargo registry + git cache (no crates.io re-fetch every run); target/ already bind-mounted
- source-aware rebuild gate (rebuild only when a tracked .rs/Cargo.toml/Cargo.lock is newer than the binary; else skip)
- restart the sandbox MCP server when a fresh binary is uploaded

Sandbox wiring fixes:

- upload binaries to ~/.local/bin, not /usr/local/bin: the sandbox upload API runs non-root -> Errno 13 (resolve_sbx_paths resolves $HOME)
- runnability probe uses --help (the MCP server has no --version)
- MCP_PORT default 18088: 8088 collides with the sandbox gem-server (Address already in use)

Phase 4 (the surprise):

- write OPENROUTER_API_KEY to ~/.hermes/.env + set provider=openrouter (was provider=custom + never wrote the key); single-line commands (the sandbox rejects multi-line payloads); verified, not || true-masked
- wiring precheck fails loud when hooks/MCP are missing instead of printing open-chat instructions for a non-memory-aware chat
- non-fatal 4.1 model smoke surfaces 429/credential errors pre-surprise
- default LLM_MODEL=deepseek/deepseek-v4-flash (':free' is 429-throttled)

Runbook: troubleshooting rows for every failure above; env overrides for the build-cache + LLM_MODEL/LLM_BASE_URL knobs.

* feat(wire): own the runtime hooks: key + audit memory ops + reproducible cross-build

agentkeys wire now OWNS the runtime's hooks: key. On (re)wire it REPLACES any existing top-level hooks: block, whether that's our own block whose sentinel comments a host re-serialization (`hermes config set`) dropped, or a hand-authored one. The IAM guarantee requires the hooks be un-bypassable, and a YAML config allows only one hooks: key, so coexistence isn't possible. Documented for users in the new docs/user-manual.md (single home for user-facing behaviors), linked from CLAUDE.md.

- wire.rs: strip_top_level_hooks() + merge_block now REPLACES (was: refused) a foreign/de-sentineled hooks: block; preserves other config keys; stays idempotent (sentineled re-runs skip). Unit test updated accordingly.
- mcp-server: audit-log every memory.get / memory.put (actor + namespace + bytes) at info, giving a server-side trail for memory reads/writes.
- harness robustness (three silent-failure traps found while iterating live):
  * pin the cross-build toolchain to the host's rustc: rust-toolchain.toml pins `channel = "stable"`, which FLOATS; a fresh container pulled 1.96 and broke clean builds of pre-release deps (crypto-common 0.2 / hybrid-array). Override via CROSS_RUST_TOOLCHAIN.
  * check the docker build EXIT CODE, not just file existence: a failed build no longer reports "ok" off a stale binary (warns + falls back, else fails).
  * check BOTH binaries for staleness: a stale mcp-server is no longer masked by an up-to-date cli.
- runbook: recovery row for "MCP died after a sandbox restart" (re-run Phases 0+1).

* fix(harness): make MCP step 1.4 mode/token-aware (idempotent restart)

Phase 1.4 reused ANY server answering :MCP_PORT/healthz, regardless of its --backend or --vendor-tokens. So a leftover real-mode server (real broker, harness-tok) from a default-mode run was reused for a --light run; the light hook's demo-tok then hit it and every memory.get / permission.check 401'd ("bearer token not recognized"). This caused the recurring interactive-memory flakiness.

Now 1.4 REUSES a running server only when its argv carries BOTH the intended --backend AND --vendor-tokens; a mismatched / stale / leftover server is killed and restarted with the correct config. Verified: a planted harness:STALE-TOK server is detected and replaced with magiclick:demo-tok, after which all three acts pass.
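The reuse rule from the step 1.4 fix above (reuse a running server only when its argv carries both the intended --backend and --vendor-tokens) can be sketched as a pure function over the process's argument list. This is an illustrative sketch only: the harness itself is a bash script, the helper names `flag_value` and `can_reuse` are hypothetical, and the backend value `in-memory` used in the usage note is an assumed spelling.

```rust
/// Return the value of `--flag value` or `--flag=value` from an argv list.
fn flag_value<'a>(argv: &'a [String], flag: &str) -> Option<&'a str> {
    let mut args = argv.iter();
    while let Some(arg) = args.next() {
        if arg.as_str() == flag {
            // Space-separated form: the value is the next argument.
            return args.next().map(|s| s.as_str());
        }
        if let Some(rest) = arg.strip_prefix(flag) {
            // Equals form: `--flag=value`.
            if let Some(value) = rest.strip_prefix('=') {
                return Some(value);
            }
        }
    }
    None
}

/// A running server is reusable only if BOTH settings match the intended run.
/// Anything else (mismatched, stale, or missing flags) means kill and restart.
pub fn can_reuse(argv: &[String], backend: &str, vendor_tokens: &str) -> bool {
    flag_value(argv, "--backend") == Some(backend)
        && flag_value(argv, "--vendor-tokens") == Some(vendor_tokens)
}
```

With this check, a leftover server started with `--vendor-tokens harness:STALE-TOK` is rejected for a run that expects `magiclick:demo-tok`, even though it still answers the health probe.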
* fix(harness): stop mcp-server before re-uploading its binary (ETXTBSY)

Uploading a new agentkeys-mcp-server while the previous one is still running failed with "Errno 26: Text file busy": Linux refuses to overwrite a running executable. 1.3 now pkills a live mcp-server before uploading its new binary (1.4 restarts it after). Verified: a changed mcp-server now uploads cleanly, the sandbox file matches the host, and the audit-logging build is live.

* feat(harness): respawn-loop MCP + log-append + explicit mode (no silent real)

Two robustness changes that close out the recurring demo flakiness.

1. Step 1.4 now starts the MCP server under a RESPAWN LOOP (`nohup bash -c 'while true; do <server>; echo [respawn]; sleep 1; done'`), so a crash self-heals without a harness re-run. The server kept dying under plain nohup, which is what left the wired hook hitting a dead :18088. The log uses >> (append) instead of > (truncate), so a restart no longer wipes the audit trail (that's why the memory log kept showing empty). Restart uses a double-pkill so the loop + any child it respawns mid-restart both go down. Verified: kill the server child → markers 1→2, healthz OK, token preserved.
2. Mode is now REQUIRED and explicit: `--light` OR `--real`, no silent default. Running the bare harness used to default to REAL mode, which flips the sandbox MCP to the live broker (harness-tok, no in-memory Chengdu fixture) and 401s the light-demo hook, the single biggest source of "no memory" confusion. The harness now errors with guidance if neither flag is given, and prints a loud MODE: banner so the active mode is never ambiguous. Added `--real`; --help + the runbook updated to {--light | --real}.

* docs(runbook): explicit --light vs --real comparison table

"light" was opaque (reads as "fewer features"). Replace the terse two-row table with a full side-by-side: backend, memory data, the Chengdu fixture, broker/chain, account, cap-mint, token, Touch ID, network, what it proves, cost/risk. Lead with the one-liner: light = self-contained demo (pre-seeded, no external deps); real = the live product wired to real infra.

* feat(cli+harness): agentkeys memory put + Mode-R memory seeding (step 1.5)

The --real demo had no Chengdu fixture (in-memory only), so the surprise found nothing. Add a way to seed the real memory worker.

- New CLI: `agentkeys memory put --namespace <ns> --content <text>` (main.rs + hook::memory_put). Writes via agentkeys.memory.put, reusing the hook MCP client (env-configured actor/operator/token/session-bearer). Errors surface (Err) so a failed seed is loud. Verified end-to-end against a local in-memory MCP: put a new namespace → memory-inject reads it back; server logs the write.
- Harness step 1.5 (Mode R ONLY; in-memory auto-seeds): seeds the demo travel memory via `agentkeys memory put`. Idempotent (skips if the namespace already returns content), prompts before the write (auto with --yes), and fails loud with the heima-scope-set.sh --webauthn command if the cap-mint Store is rejected (memory scope not granted). The cap-mint uses the master session + the agent's device_key_hash (now resolved from the agent file and passed to the real-mode MCP via --default-device-key-hash). The authorizing Touch ID is the scope grant at 0.5. SEED_MEMORY_CONTENT overrides the default fixture.
- Runbook: the --real "Chengdu surprise" row now ✅ (1.5 seeds it) + a manual-gates entry describing the seed step and its scope/session prereqs.

* feat(harness): step 1.5 self-authorizes - WebAuthn grant then seed

Per request: 1.5 (Mode R) now grants the memory scope itself instead of just failing-loud with instructions. Flow when the namespace is empty:

- 1.5a heima-scope-set.sh --webauthn --services memory → real Touch ID (on-chain idempotent; only prompts because we gate on empty-memory, and SETS the full service list; override SEED_SCOPE_SERVICES to add more)
- 1.5b agentkeys memory put → cap-mint Store → real memory worker

Still idempotent: if the namespace already returns content, the whole step skips (no Touch ID). Detects a SKIPPED grant (K11 not webauthn-enrolled) and fails loud with the `agentkeys k11 enroll --webauthn` command.

* docs(runbook): self-authorizing 1.5 seed + stale-binary trap

Match the runbook to the a6acef6 harness behavior:

- mode table: --real "Chengdu surprise" + "Touch ID" rows now say 1.5 SELF-grants the memory scope (heima-scope-set.sh --webauthn) then seeds; the Touch ID is at 1.5, not a separate 0.5 step.
- prerequisites: real-mode 1.5 seed needs the master's primary K11 enrolled in webauthn mode (else the grant skips); SEED_SCOPE_SERVICES sets the full list.
- manual gates: Touch ID is a hardware prompt --yes can't bypass; 1.5 is idempotent + self-authorizing (check → grant → seed, skip if populated).
- env overrides: SEED_MEMORY_CONTENT / SEED_SCOPE_SERVICES.
- troubleshooting: the stale-PATH-binary "unrecognized subcommand 'memory'" trap (rebuild + reinstall), 1.5a grant-skipped (k11 enroll), 1.5b seed fail.

* fix(harness): make --webauthn actually gate the 1.5 Touch ID grant

The --webauthn flag was vestigial: set but only shown in the log. 1.5 hardcoded heima-scope-set.sh --webauthn regardless, so `webauthn=false` in the banner contradicted a Touch ID that would still fire. Now the flag means what it says:

- 1.5 tries `agentkeys memory put` directly first, which succeeds (no Touch ID) when the scope is already granted.
- If the put is scope-rejected: grant via real Touch ID ONLY when --webauthn was passed, then retry; without --webauthn, fail loud telling the operator to re-run with `--real --webauthn` (never triggers an unexpected Touch ID).
- This also avoids a wasted Touch ID when the scope already exists.

The REAL banner now warns when webauthn=false that 1.5 won't grant the scope. Runbook: TL;DR shows `--real --webauthn`; mode table / Touch ID / seed gates all state the grant is --webauthn-gated.

* fix(harness): drop bogus --session-id from 1.5 scope grant

heima-scope-set.sh has no --session-id flag (it signs via MASTER_KEY from the env), so 1.5a failed with "unknown flag: --session-id". Removed it.

NOTE: real put/fetch is still blocked by infra, not the harness: the deployed broker returns 404 for /v1/cap/memory-{get,put} and the audit worker 404s on /v1/audit/append/v2 (the memory worker is fine, 422). Those routes need redeploying (setup-broker-host.sh) before --real can be green.

* docs(CLAUDE): origin/evm deprecated → origin/main is the default + deploy branch

evm is frozen; all new work lands on the default branch main (feature branch → PR → main). Rewrote the branch/deploy policy: the broker host deploys from origin/main via `setup-broker-host.sh --ref main` (--upgrade is a back-compat no-op; --ref drives the pull). Updated the land-the-fix policy's push target evm → main. Left the EVM-the-VM references (pallet_evm / evm_version="london") untouched; those are the Heima chain's EVM level, not the git branch.

* fix(harness): point real-mode cap-mint at the BROKER, not the signer

The cap-mint 404 was a harness config bug, not infra. The MCP server was started with --broker-url ${BACKEND_URL}, and BACKEND_URL = $AGENTKEYS_SIGNER_URL = signer.litentry.org, the dedicated SIGNER listener (:8092, key-derivation only), which 404s /v1/cap/*. The cap-mint routes live on the BROKER, fronted at OIDC_ISSUER = https://$BROKER_HOST = broker.litentry.org (verified live: cap routes return 422 = exist; jwks 200).

- Resolve BROKER_URL = AGENTKEYS_BROKER_URL → OIDC_ISSUER (the broker), and use it for the MCP --broker-url AND the 0.2 broker-healthz check (was checking the signer's static /healthz stub).
- 1.4 reuse check now also matches --broker-url (real mode), so a server pointed at the wrong broker is replaced instead of silently reused. Empty in light mode (no-op grep).

Verified live after the operator's `setup-broker-host.sh --ref main`:

- broker.litentry.org/v1/cap/{cred-fetch,memory-put} → 422 (routes exist)
- audit.litentry.org/v1/audit/append/v2 → 422 (worker fixed)
- memory.litentry.org/v1/memory/{get,put} → 422 (worker OK)

* feat(mcp): per-actor STS credential relay for worker S3 (issue #90)

The MCP http backend never forwarded per-actor STS creds, so the memory worker fell back to the EC2 instance profile (SES-only, no S3) and every memory put/get 502'd. Wire the relay end to end:

- mcp-server: http backend mints agent-tagged STS creds via the broker (mint-oidc-jwt -> AssumeRoleWithWebIdentity) and forwards them to the worker as X-Aws-* headers, AWS-scoping S3 to bots/<actor>/memory/. New config: --agent-session-bearer / --memory-role-arn / --vault-role-arn / --aws-region. Adds agentkeys-provisioner dep.
- harness: step 0.8 mints the agent session from agent_private_key (omni == actor_omni); 1.4 passes the relay args (real mode always restarts for a fresh bearer). Also 0.7 operator-session auto-mint (wallet_sig; fixes the expired/wrong-omni alice session), RUSTUP_VOL toolchain cache, sbx_exec --max-time, memory-inject </dev/null.
- cli: memory_inject no longer blocks on stdin.
- docs: operator-runbook-wire 502/0.7/0.8 rows; CLAUDE.md ssh-agentkeys.

Verified live: memory.put -> ok (s3_key bots/82a0.../memory/memory.enc), memory.get -> content, hook memory-inject -> {context}; cross-actor S3 write -> AccessDenied (per-actor IAM isolation holds).

* feat(cli): agent device-session - in-sandbox keygen + wallet_sig (interim §10.2)

Fixes the "master bootstrap" violation: heima-agent-create.sh generated the agent key on the operator laptop (cast wallet new) with a stub link code. The agent's device key must be born on the agent machine. New keystone:

- `agentkeys agent device-session` (crates/agentkeys-cli/src/device_session.rs): generates/loads a secp256k1 device key IN THE SANDBOX (0600, never leaves), derives EVM address + actor_omni (sha256) + device_key_hash (keccak256), mints a broker session via wallet_sig SIWE, and emits {agent_address, actor_omni, device_key_hash, pop_sig, session_jwt} for the master to bind on-chain. EIP-191 signing (k256 low-s, v∈{27,28}) matches the broker's ecrecover exactly. Shell-driven, no python / no new sandbox runtime. Adds k256+sha3 (already workspace deps). Verified live: fresh in-sandbox key → minted session omni == actor_omni; key present only in the sandbox.
- harness/phase1-wire-demo.sh: clean_slate step (1.2b) kills orphaned hermes chats, removes stale fake-MCP launchers, clears Hermes session/state/native memory so each demo proves recall from the live worker (SKIP_CLEAN=1 opts out).
- CLAUDE.md: "Agent-side wire demo - REAL memory only" rule + ssh-agentkeys.

Full HDKD-literal §10.2 (broker link-code endpoints + daemon keygen + HDKD omni) tracked in #144. Harness Phase P (orchestrate the ceremony + on-chain register + scope grant) lands next.

* feat(harness): Phase P - fresh in-sandbox agent pairing (install + approve)

Wires the device-session keystone into the demo so every --real run does a real arch.md §10.2-style pairing with the agent key born in the sandbox (interim; full HDKD/broker ceremony = #144).

- heima-agent-create.sh: --from-pubkey mode (--agent-address/--actor-omni/--device-key-hash/--pop-sig) registers the SANDBOX-generated device without generating a key or funding on the master; writes a KEY-LESS metadata file so heima-scope-set can resolve the actor by label. Legacy self-gen path intact.
- phase1-wire-demo.sh Phase P (after 1.3, before 1.4): mint one-time link code → `agentkeys agent device-session --regen` in the sandbox → register device on-chain → heima-scope-set --webauthn (approve permissions, Touch ID) for the FRESH actor. Sets ACTOR_OMNI/AGENT_SESSION_BEARER/DEVICE_KEY_HASH from the pairing. Fresh omni each run → empty memory → seeded at 1.5 → recalled in Act 1.
- phase0: derive OPERATOR_OMNI from the master key (no agent-file dependency); 0.4 defers actor_omni to Phase P; 0.8 skipped under fresh pairing (Phase P mints the agent session in-sandbox). --reuse-agent / AGENTKEYS_REUSE_AGENT=1 restores the legacy master-side agent for fallback.

Verified (no Touch ID / no tx): device-session → from-pubkey --dry-run assembles registerAgentDevice with the sandbox values + operator_omni from the master key; metadata file is key-less. On-chain register + Touch ID scope grant are operator-gated (run with --real --webauthn).

* docs(runbook): Phase P fresh-pairing walkthrough + how-to-run

Fold the §10.2 fresh-pairing flow into operator-runbook-wire.md:

- new "How to run - the --real --webauthn walkthrough" (step-by-step: start sandbox → run → Phase P install/pair → approve (Touch ID) → surprise).
- two-modes table, prerequisites, manual gates, flags, env, troubleshooting all updated: agent key is born in the sandbox (Phase P), Touch ID + on-chain register happen each run (fresh pairing), --reuse-agent for the legacy path.

* feat(harness): deterministic memory-injection check (hermes hooks test)

The chat "surprise" is a flaky success signal: an LLM may phrase a memory-aware reply any way, treat a past-dated memory as "not this weekend", or DISOWN the injected context as a hallucination (observed live). Add a deterministic, no-inference check:

- Phase 4.2 (authoritative): `hermes hooks test pre_llm_call` fires the hook through Hermes' OWN config-wired dispatcher and asserts stdout carries a "context" block → proves the permissioned memory is injected into the LLM request. 4.2b: `hermes hooks doctor` (all 3 hooks exec + valid JSON).
- 4.3 (the chat surprise) demoted to an OPTIONAL live demo; note to run it WHILE the gate is open (Phase 5 teardown stops the MCP).
- runbook: "Verifying it worked - deterministically (no LLM inference)" section.

Validated in --light: 4.2 → {"context":"## Memory: travel\nChengdu …"} → ok.

* docs(runbook): clarify the deterministic check runs IN THE SANDBOX (standalone)

Make explicit: the harness (laptop) does setup; `hermes hooks test pre_llm_call` is a standalone in-sandbox verify (docker exec … or the code-server terminal). There is no need to re-run the harness to re-check memory injection.

* docs(runbook): add the deterministic verify (hermes hooks test) to the TL;DR

Put `hermes hooks test pre_llm_call` front-and-center as a third TL;DR step: the no-inference pass/fail for memory injection, alongside the run commands.

* fix(harness): 1.5 auto-seeds in fresh pairing (no redundant [y/N] after Touch ID)

Phase P (P.3) already approves the [memory] scope via Touch ID for the fresh actor, and a brand-new actor's namespace is always empty, so step 1.5's legacy "memory empty - seed? [y/N]" gate (and its own scope-grant/second-Touch-ID path) was redundant friction right after the operator already approved. In fresh pairing, 1.5 now seeds automatically with a single memory.put (succeeds because P.3 granted the scope). The gated/scope-aware path is kept only for --reuse-agent (legacy, where the agent wasn't freshly paired this run).

* fix: address Codex adversarial review (PR #141)

- [high] harness 1.3: a failed cross-build now FAILS (red) instead of skip, so the demo can't silently run STALE binaries and "pass" the deterministic 4.2 verify. Explicit unsafe override: ALLOW_STALE_BINARY=1.
- [high] wire.rs: hook scripts 0o755 -> 0o700 + hooks dir 0o700. The scripts export the operator session bearer + vendor token; 0o700 closes the cross-user token-theft vector. (Same-user exposure is architectural; out-of-process custody tracked in #144.)
- [med] device_session.rs: enforce owner-only on an EXISTING key (reject symlink / non-regular / group-or-other perm bits), not just on create, so a copied/restored loose-perm key can't silently mint a session.
- [med] http_backend sts_headers: warn! on the legitimate no-relay downgrade (both absent) and ERROR on inconsistent config (exactly one of bearer/role-arn set) before calling the worker. Kept both-absent as a valid fallback (--reuse-agent / in-memory) rather than hard-failing.

Build-verified (cli + mcp-server). Findings #1/#2/#4 fixed; #3 partial by design.

* style: cargo fmt + clippy fix for CI gate

rustfmt on the agentkeys-cli files touched this session (device_session.rs, wire.rs) plus pre-existing drift in hook.rs/main.rs; and drop a redundant explicit deref in wire.rs strip_top_level_hooks (clippy explicit_auto_deref). Fixes the cargo-fmt CI job; clippy + build clean on cli and mcp-server.
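The owner-only rule from the review fix to device_session.rs (reject a symlink, a non-regular file, or any group-or-other permission bits on an existing key) could look roughly like the sketch below. It uses only the Rust standard library on Unix; the function name `check_key_file` and the error messages are hypothetical and do not come from the repository.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Accept an existing key file only if it is a regular, owner-only file.
pub fn check_key_file(path: &Path) -> io::Result<()> {
    // symlink_metadata does not follow links, so a symlink is seen as one.
    let meta = fs::symlink_metadata(path)?;
    let file_type = meta.file_type();

    if file_type.is_symlink() {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "key path is a symlink",
        ));
    }
    if !file_type.is_file() {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "key path is not a regular file",
        ));
    }

    // Keep only the permission bits; 0o077 masks group and other access.
    let mode = meta.permissions().mode() & 0o777;
    if mode & 0o077 != 0 {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("key file mode {:o} grants group/other access", mode),
        ));
    }
    Ok(())
}
```

Under these assumptions a key at mode 0600 passes, while the same file copied or restored at 0644 is rejected before any session is minted.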
hanwencheng added a commit that referenced this pull request on May 31, 2026
Merged origin/main (#140 IAM strategy reset + #141 agentkeys wire/hook + #137 audit-vector exporter + #138 CI hardening). The merge brought in the Authority-Host / Task-Host model, which obsoletes the prior agent-onboarding design (paste-a-pair-code into a remote sandbox). Redesigned the web-flow plan to match.

What changed in the plan:

- stage3-agent-usage.md: full rewrite. Agent onboarding is now pair (agentkeys agent device-session: key born in the runtime, never on the master) → wire (agentkeys wire installs 3 IAM-guarantee hooks the LLM can't bypass: pre_tool_call→check, post_tool_call→audit, pre_llm_call→memory-inject) → the three acts (permissioned memory / deterministic denial / audit) + the memory-aware surprise (deterministically backed by `hermes hooks test pre_llm_call`, not a chat reply). Adds the hook-aware live dashboard + guarantee-health panel. Preserves the 16-step isolation health check.
- overview.md: new "two-host model" section up top (Authority Host vs Task Host; IAM tool vs IAM guarantee). Act-3 TODO reframed as "Phase 2 - the wire flow." Master onboarding (Phase 1) unchanged.
- data-model.md: replaced the bootstrap/* endpoints with the pair/wire/observe surface: /v1/agents/pair/{init,bind,approve-scope}, /v1/agents/:id/{wire,unwire,verify/memory-inject,guarantee-health}, hook-tagged /v1/audit/stream. Notes the per-actor STS relay config + MCP port 18088.
- input-discipline.md: §2.5: runtime choice + wire namespaces/payment-scope are Real inputs; the agent device key is born in the runtime (master never holds it); the runtime list reflects real adapter support, never faked.
- README.md: redesign banner + updated source-of-truth + file-map row.
- dev.sh: MCP_PORT default 8088 → 18088 (8088 collides with the sandbox gem-server, per #141); header comments aligned.

Plan only, no implementation. Master-onboarding docs (stage1/stage2) untouched.
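The three-hook mapping above, together with the policy stated in the #141 commit message (`check` fails CLOSED; audit and memory-inject never block), can be sketched as a small dispatch table. This is a hedged illustration, not the code in src/hook.rs: the `HookEvent`, `Outcome`, and `resolve` names are hypothetical, and the backend call is reduced to a `Result<bool, String>`.

```rust
/// The three host lifecycle events that `agentkeys wire` hooks into.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HookEvent {
    PreToolCall,
    PostToolCall,
    PreLlmCall,
}

/// What the host is told after the hook runs.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Allow,
    Deny(String),
}

impl HookEvent {
    /// Parse the host's event name.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "pre_tool_call" => Some(HookEvent::PreToolCall),
            "post_tool_call" => Some(HookEvent::PostToolCall),
            "pre_llm_call" => Some(HookEvent::PreLlmCall),
            _ => None,
        }
    }

    /// The `agentkeys hook <subcommand>` that serves this event.
    pub fn subcommand(self) -> &'static str {
        match self {
            HookEvent::PreToolCall => "check",
            HookEvent::PostToolCall => "audit",
            HookEvent::PreLlmCall => "memory-inject",
        }
    }

    /// Only the permission check is allowed to block the host.
    pub fn fails_closed(self) -> bool {
        matches!(self, HookEvent::PreToolCall)
    }
}

/// Fold a backend result (Ok(allowed) or Err(reason)) into a host outcome.
pub fn resolve(event: HookEvent, result: Result<bool, String>) -> Outcome {
    match (event.fails_closed(), result) {
        (true, Ok(true)) => Outcome::Allow,
        (true, Ok(false)) => Outcome::Deny("permission denied".to_string()),
        // Fail closed: an unreachable or failing backend denies the call.
        (true, Err(e)) => Outcome::Deny(format!("check unavailable: {}", e)),
        // Audit and memory-inject never block, whatever happened.
        (false, _) => Outcome::Allow,
    }
}
```

The asymmetry is the design point: a dead MCP server stops tool calls through `check`, while `audit` and `memory-inject` degrade without blocking the host.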
Summary

Strategic reset for the AI-device wedge (issue #103 lineage). Docs-only PR; the companion implementation PR #141 ships `agentkeys wire` + `agentkeys hook` (and the operator runbook, since it documents those commands).

- Move `agent-iam-strategy.md` from `docs/research/` → `docs/` (it is the strategic anchor, not third-party research); update all 13 references repo-wide.
- arch.md §22d: IAM-guarantee delivery (hooks-first, proxy-fallback) + the `agentkeys wire` CLI surface.
- New wiki glossary (`docs/wiki/agent-iam-guarantee-glossary.md`): IAM tool vs IAM guarantee, hooks-vs-proxy trade-off table, verified hook-availability table across six runtimes (Claude Code, Codex, Hermes, OpenClaw, Kimiclaw, xiaozhi-server).
- New plan (`docs/spec/plans/phase-1-fresh-user-wire-onboarding.md`): the 7-step fresh-user journey + the manual-vs-automatic hybrid decision.
- Archive the superseded Rust-runtime approach under `docs/archived/*-rust-runtime-2026-05*`.

Decision recorded

Option B (vendor → Task Host → both MCPs) with hooks-first IAM guarantees; OpenAI-compatible proxy as a lower-priority fallback for hosts without a hook surface. Anchored in strategy §2.1/§2.4 (Authority Host vs Task Host) and issue #133.
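As a rough illustration of the recorded decision, the selection between the two delivery paths reduces to one question per runtime: does it expose a hook surface? The sketch below is an assumption-laden toy; the `Runtime` struct, the `Delivery` enum, and `choose_delivery` are hypothetical names, and the per-runtime hook availability would come from the glossary's verified table, not from this code.

```rust
/// How the IAM guarantee is delivered to a given runtime.
#[derive(Debug, PartialEq)]
pub enum Delivery {
    /// Preferred: lifecycle hooks installed by `agentkeys wire`.
    Hooks,
    /// Lower-priority fallback: an OpenAI-compatible proxy.
    Proxy,
}

/// Hypothetical description of a Task Host runtime.
pub struct Runtime {
    pub name: &'static str,
    pub has_hook_surface: bool,
}

/// Hooks first; the proxy only when the host has no hook surface.
pub fn choose_delivery(runtime: &Runtime) -> Delivery {
    if runtime.has_hook_surface {
        Delivery::Hooks
    } else {
        Delivery::Proxy
    }
}
```

For example, a runtime described with `has_hook_surface: true` (as Hermes is in the companion PR, which ships a Hermes adapter) resolves to `Delivery::Hooks`.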
Companion PR

Implementation (`agentkeys wire` + `agentkeys hook` + Hermes adapter + the operator runbook) lands in #141 off `main`. The two are reviewable independently; both merge to leave the tree consistent. The plan doc's relative links to `operator-runbook-wire.md` resolve on `main` once both merge.

Test plan

- `docs/archived/README.md` lists the 4 archived artifacts with superseded-by pointers
- No `docs/research/agent-iam-strategy.md` references remain

🤖 Generated with Claude Code