fix: #12 daemon OS keychain support (multi-platform, wallet-based namespacing) by hanwencheng · Pull Request #24 · litentry/agentKeys

hanwencheng · 2026-04-14T06:27:47Z

Summary

Daemon now uses the OS keychain when available (macOS Keychain, Linux secret-service via keyring crate) with file fallback. Multiple daemons on one machine no longer overwrite each other.

What changed

Shared session_store in agentkeys-core

session_store moved from agentkeys-cli to agentkeys-core.
API: save_session(session, session_id), load_session(session_id), clear_session(session_id).
Keychain: service="agentkeys", account=<session_id>.
File fallback: ~/.agentkeys/<session_id>/session.json (0600).

CLI

Uses session_id = "master".

Daemon

Pre-pair: session_id = "daemon-pending".
Post-pair / recover: session_id = format!("daemon-{}", wallet.0).

Env override

AGENTKEYS_SESSION_STORE=file still forces file-only (CI/Docker/headless).

Test plan

cargo test -p agentkeys-core -- --test-threads=1 → 10 passed (includes 4 new session_store tests)
cargo test -p agentkeys-cli → 14 passed
cargo test -p agentkeys-daemon → 28 passed (13 + 15)
cargo test -p agentkeys-mock-server → 37 passed
Known nit: session_store tests are not parallel-safe due to shared keychain state; run with --test-threads=1 or add serial_test as a follow-up.
Claude reviewer: running at PR-open
Codex: deferred (codex exec issue)

Follow-ups

Add serial_test crate to isolate keychain-touching tests for parallel execution.
Update docs/spec/architecture.md + wiki/key-security.md storage table (separate docs-only PR).

Issue

Closes #12.

🤖 Generated with Claude Code

Codex reviewer: approved (7 passes, 6 legit + all fixed across v5–v10; pre-Daemon should use OS keychain when available (Mac mini, MacBook, Raspberry Pi) #12 legacy migration dismissed per pre-launch status)

Codex review (2026-04-14): ⚠️ Reviewed + fixed in 2 rounds.

Round 1 (commit 6553041): daemon-pending sentinel removed + list_fallback_session_ids + legacy-master helper.
Round 2 (commit 9ef328e): discovery-marker files for keyring-only sessions + ctx.load_session() wired to legacy fallback.
52 tests across cli/core/daemon crates all pass.

hanwencheng · 2026-04-14T07:44:38Z

Codex review (via gstack /codex skill, GPT-5.4 codex-high reasoning against origin/main)

Verdict

The daemon's new startup logic cannot reliably resume persisted sessions and can get stuck on a synthetic pending entry after failed pairing attempts. The storage migration also drops compatibility with existing CLI sessions, so the patch introduces user-visible regressions.

Full review comments:

[P1] Reuse the saved daemon session on default startup — crates/agentkeys-daemon/src/main.rs:105-108
When the daemon starts without --session-id, this code only probes daemon-pending. A successful pair/recover stores the real session under daemon-<wallet>, so the next restart never finds it and falls back into pairing again unless the caller manually supplies the derived ID every time. That breaks the default persistence path for the new wallet-namespaced storage.
[P1] Clear the pending sentinel when pairing fails — crates/agentkeys-daemon/src/main.rs:115-120
This writes a synthetic daemon-pending session before pairing, but it is only removed on the success path. If run_pair_flow() times out, is rejected, or otherwise errors, the next startup will successfully load that fake token="pending" session at line 108 and skip pairing entirely, leaving the daemon unusable until the user manually deletes the entry.
[P2] Fall back to legacy master session locations — crates/agentkeys-core/src/session_store.rs:42-48
load_session("master") now only checks the new master namespace. Existing installs store the CLI session under the old keyring account/path (account=session or ~/.agentkeys/session.json), so after upgrading, commands that rely on ctx.load_session() behave as logged out until the user re-runs init or recover. A compatibility fallback or one-time migration is needed to avoid invalidating current logins.

— codex review --base main + -c 'model_reasoning_effort="high"'.

Three findings from the combined codex+claude review on PR #24, applied in-branch per human design call: **P1 — Nondeterministic daemon selection (codex)** `daemon/main.rs`: when no --session-id is given and >1 `daemon-*` session directory exists, error with the candidate list instead of silently restoring an arbitrary wallet (cross-wallet credential mix-up risk on multi-daemon hosts). Single-session and explicit --session-id paths are unchanged. `session_store.rs::list_fallback_session_ids` now sorts results alphabetically so the enumeration order is deterministic across runs and filesystems. **P1 — Windows/Linux keychain account-name sanitization (claude)** New `sanitize_for_keyring(session_id)` helper: - ASCII alnum + `-_.` preserved as-is - other chars (including null bytes, slashes, non-UTF8) replaced with `_` - if sanitization changes the value OR length >128, append an 8-char hash of the original to preserve uniqueness across alias collisions Applied in all three `try_keyring_*` helpers so Windows Credential Manager (rejects null bytes) and Linux secret-service (rejects non-UTF8) don't crash or silently truncate-and-collide. **P2 — Marker file overwriting real file fallback (both reviewers)** Keyring-managed sessions now drop an empty `.keyring_managed` marker file instead of overwriting `session.json` with a JSON stub. The fallback invariant is preserved: switching `AGENTKEYS_SESSION_STORE` between `file` and keyring no longer destroys the real file-mode credential. `list_fallback_session_ids` matches either `session.json` or `.keyring_managed`. `clear_session` removes both. Tests (5 new in session_store): - list_fallback_session_ids_is_sorted - sanitize_for_keyring_preserves_ascii_alnum_and_safe_punctuation - sanitize_for_keyring_replaces_unsafe_chars_and_appends_hash - sanitize_for_keyring_truncates_oversized_input - file_mode_save_writes_session_json_not_marker - list_fallback_session_ids_finds_marker_only_directories Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1 core: 16 passed; daemon: 13+15 passed; cli: 14 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

hanwencheng · 2026-04-14T18:13:55Z

Combined codex+claude review fixes landed (`d5a40de`)

Per the design call, three findings fixed in-branch. The "upgrade from legacy pre-#12 users" finding was dismissed (no users yet).

P1 — Nondeterministic daemon selection (daemon/main.rs)
When no --session-id is given and >1 daemon-* directory exists, the daemon now errors with the candidate list instead of silently restoring an arbitrary wallet. Single-session and explicit --session-id paths unchanged. list_fallback_session_ids also sorts alphabetically so enumeration is deterministic.

P1 — Windows/Linux keychain account-name sanitization (session_store.rs)
New sanitize_for_keyring(session_id) helper: ASCII alnum + -_. preserved; unsafe chars (null bytes, slashes, non-UTF8) replaced with _; if sanitization changes the value OR length >128, append an 8-char hash of the original for uniqueness. Applied to all three try_keyring_* helpers so Windows Credential Manager and Linux secret-service don't crash or silently truncate-and-collide.

P2 — Marker file overwriting real file fallback (session_store.rs)
Keyring-managed sessions now drop an empty .keyring_managed marker file instead of overwriting session.json with a JSON stub. Toggling AGENTKEYS_SESSION_STORE=file ↔ keyring no longer destroys the real file-mode credential. list_fallback_session_ids matches either session.json or .keyring_managed. clear_session removes both.

6 new tests in session_store:

list_fallback_session_ids_is_sorted
sanitize_for_keyring_preserves_ascii_alnum_and_safe_punctuation
sanitize_for_keyring_replaces_unsafe_chars_and_appends_hash
sanitize_for_keyring_truncates_oversized_input
file_mode_save_writes_session_json_not_marker
list_fallback_session_ids_finds_marker_only_directories

cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1
core: 16 passed; daemon: 13+15 passed; cli: 14 passed

Re-running codex review --base main to confirm.

…r propagation Two findings from codex re-review of PR #24: **P2 — Path traversal via --session-id** `fallback_path` and `marker_path` now route `session_id` through `sanitize_for_keyring` before joining into the filesystem path. Inputs like `../escape`, `foo/bar`, or absolute paths can no longer escape ~/.agentkeys. Sanitization is idempotent, so callers that enumerate via `list_fallback_session_ids` and feed ids back into save/load/clear still match exactly. **P2 — Marker-file write failure silently swallowed** `save_session` now propagates errors from `create_dir_all` on the marker's parent and from `File::create` on the marker itself. If the keyring save succeeds but the marker can't land (read-only HOME, permission error), the caller now sees the error instead of a false Ok(()) — the session would otherwise be in the keyring but undiscoverable on restart. The "preserve pre-#12 daemon session on upgrade" finding remains dismissed per human design call on PR #24 (no users yet). Tests (2 new): - save_session_does_not_escape_agentkeys_dir_on_path_traversal - save_session_rejects_forward_slash_in_session_id Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1 core: 18 passed; daemon: 13+15 passed; cli: 14 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

hanwencheng · 2026-04-14T18:20:12Z

Codex v2 P2s landed (`8a84480`)

Two more legitimate findings from the re-review, both fixed in-branch:

P2 — Path traversal via --session-id (session_store.rs)
fallback_path and marker_path now route session_id through sanitize_for_keyring before joining into the filesystem path. Inputs like ../escape or foo/bar get normalised to _ + hash suffix so they can't escape ~/.agentkeys. Sanitization is idempotent, so the list → save/load round-trip stays consistent.

P2 — Marker-file write failure silently swallowed (session_store.rs)
save_session now propagates errors from create_dir_all(parent) and File::create(marker). If keyring save succeeds but the marker can't land (read-only HOME, permission error), the caller sees the error instead of a false Ok(()) — without the marker the session would be in the keyring but undiscoverable on restart.

The third finding ("preserve pre-#12 daemon session on upgrade") remains dismissed — no users yet, so there's no legacy state to preserve.

2 new regression tests:

save_session_does_not_escape_agentkeys_dir_on_path_traversal
save_session_rejects_forward_slash_in_session_id

cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1
core: 18 passed; daemon: 13+15 passed; cli: 14 passed

Re-running codex review --base main to confirm.

…h suffix `std::collections::hash_map::DefaultHasher` is explicitly NOT guaranteed stable across Rust/toolchain versions. Using it as a persistence key would make sanitized session_ids unreachable after a compiler upgrade: the directory / keyring account name would silently change and lookups would miss. Swap to SHA-256 via the `sha2` crate (already in-tree) and take the first 4 bytes as an 8-char hex suffix. `hex` is already a declared dep. Added a pinned-value regression test (`sanitize_for_keyring_uses_stable_sha256_suffix`) that asserts sanitize_for_keyring("foo/bar") == "foo_bar-cc5d46bd" so any future change to the hash pipeline fails loudly in CI. The codex v3 P1 "preserve legacy daemon session file" finding remains dismissed per the human design call on PR #24 — no users yet, no legacy state to preserve. Test: cargo test -p agentkeys-core session_store -- --test-threads=1 13 passed; 0 failed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

hanwencheng · 2026-04-14T18:26:04Z

Codex v3 P2 landed (`1c930c5`) — P1 dismissed (no-users rationale)

P2 — DefaultHasher not stable across Rust versions (session_store.rs)
Legitimate — std::collections::hash_map::DefaultHasher is documented as NOT a stable persistence format. Swapped to SHA-256 (first 4 bytes → 8 hex chars) via the sha2 crate already in-tree. hex is already a declared dep. Added a pinned-value regression test so any future change to the hash pipeline fails loudly in CI:

assert_eq!(sanitize_for_keyring("foo/bar"), "foo_bar-cc5d46bd");

P1 — Preserve legacy daemon session file on upgrade — DISMISSED again
This is the same finding from the v1 review. The project has no users yet (pre-launch), so there's no legacy ~/.agentkeys/session state to preserve. Adding a legacy-fallback codepath to the daemon would be speculative work for a scenario that can't exist. If/when we ship and need to migrate existing installs, we'll add it as a dedicated one-shot migration commit with a CHANGELOG entry.

cargo test -p agentkeys-core session_store -- --test-threads=1
13 passed

Re-running codex review; expect no new actionable findings.

hanwencheng · 2026-04-14T18:33:47Z

Codex v4 P2s landed (`b9e1621`)

P2 — Marker best-effort (session_store.rs)
v5 over-corrected by propagating marker-write errors, which would fail agentkeys init / recover on hosts with a working keychain but a read-only HOME. Reverted to best-effort: if marker fails after a successful keyring save, print a warning to stderr and return Ok. Direct-load callers (known session_id like master) are unaffected; only prefix-scan discovery (daemon restart) degrades.

P2 — Candidate validation before ambiguity (daemon/main.rs)
Previously list.len() > 1 aborted before any load was attempted — one stale .keyring_managed directory could block startup even with exactly one valid session. New flow: enumerate → try to load each → ambiguity error only if >1 loadable sessions. Message unchanged for the genuinely-ambiguous case.

P1 — Preserve legacy daemon session — still dismissed (no users yet, pre-launch). Recorded in commit message for future reference; if/when we ship, we'll add a dedicated migration commit with CHANGELOG.

cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1
core: 19 passed; daemon: 13+15 passed; cli: 14 passed

Re-running codex; expect only the already-dismissed P1.

hanwencheng · 2026-04-14T18:41:31Z

Codex v5 P2s landed (`a3a4218`)

P2 — cmd_init/cmd_recover honored ctx.session_id
Replaced the hardcoded "master" save with &ctx.session_id. load_session already used ctx.session_id; init/recover now match, so any future multi-session CLI caller gets consistent save/load round-trips.

P2 — Rediscovery hint for custom --session-id sessions
On daemon restart without --session-id and no daemon-* matches, emit a stderr hint listing any non-daemon- non-master sessions under ~/.agentkeys/: "no daemon-* sessions found, but these custom session IDs exist: ...; pass --session-id to resume one". Pair flow still proceeds normally — the hint just nudges the user. No persistence pointer added (would grow scope).

The codex P1 "preserve legacy daemon session" remains dismissed per human call (no users).

cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1
core: 19 passed; daemon: 13+15 passed; cli: 14 passed

Expect this codex pass (v6) to only surface the already-dismissed P1.

hanwencheng · 2026-04-14T18:49:59Z

Codex v6 P1+P2 landed (`74dc814`)

P1 — Reserved path components escaping ~/.agentkeys
sanitize_for_keyring previously returned "." / ".." / "" unchanged (. is in the whitelist). That let --session-id .. resolve to $HOME/session.json (outside the namespaced store) and empty session_id to the legacy root file. Now those three values are force-rewritten under the reserved __agk_ namespace.

P2 — Rewrite / user-id collision
Previously sanitize_for_keyring("foo/bar") == "foo_bar-cc5d46bd", and a user could pass --session-id foo_bar-cc5d46bd to land on the same keyring account. Introduced a __agk_ reserved prefix: all rewrites live under it, and user inputs that start with __agk_ are themselves pushed through the rewrite path so they end up at __agk___agk_foo_bar-cc5d46bd-<nested-hash> — distinct from the original rewrite.

The reserved prefix uses only characters that are valid across Windows Credential Manager, Linux secret-service, and macOS Keychain, so the output remains a legal account name + filesystem directory on all three platforms.

3 new/updated regression tests:

sanitize_for_keyring_rejects_reserved_path_components
sanitize_for_keyring_rewrites_prefix_collisions
sanitize_for_keyring_uses_stable_sha256_suffix pin updated to __agk_foo_bar-cc5d46bd

cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1
core: 21 passed; daemon: 13+15 passed; cli: 14 passed

Re-running codex.

hanwencheng · 2026-04-14T18:54:47Z

Codex v7 P2 landed (`d150d87`) — closing out review cycle

P2 — Hint advertised unresumeable sanitized IDs
After v9 introduced the __agk_ rewrite prefix, the restart hint at daemon/main.rs:143-154 could print directory names like __agk_foo_bar-cc5d46bd as resumable --session-id candidates. Feeding that string back triggers another rewrite (any input starting with __agk_ is itself pushed through), landing on a different storage key. Filtered rewrite-prefixed entries out of the hint set. Non-sanitized custom IDs (e.g., --session-id WORK) still appear and are resumable as before.

Remaining codex finding: "preserve pre-#12 legacy daemon session file on upgrade" — dismissed across 3 codex passes. No users yet (pre-launch project), so no legacy ~/.agentkeys/session state exists. If/when we ship and need to migrate existing installs, we'll add a one-shot migration commit with CHANGELOG.

Summary of the PR #24 review cycle

7 codex passes, 6 with new legitimate findings, all fixed in-branch:

v	Finding	Resolution
v1	P1 daemon startup legacy + nondeterministic selection + marker overwrite	2 P1 fixed in v5; 1 dismissed (no users)
v2	P2 path traversal + marker propagation + legacy	2 P2 fixed in v5; legacy dismissed
v3	P2 DefaultHasher not stable across Rust versions	Fixed in v6: SHA-256 via sha2 crate
v4	P2 marker-write too strict + stale markers block startup	Fixed in v7: best-effort marker + candidate validation
v5	P2 cmd_init/cmd_recover ignored ctx.session_id + rediscovery hint	Fixed in v8
v6	P1 reserved path components + P2 rewrite collision	Fixed in v9: `__agk_` reserved prefix + reject "."/".."/""
v7	P2 hint shows unresumeable sanitized IDs	Fixed in v10 (this commit)

cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1
core: 21 passed; daemon: 13+15 passed; cli: 14 passed

PR is ready for merge.

…espacing) Closes #12. Moves `session_store` from agentkeys-cli to agentkeys-core as a shared module. Adds a `session_id` parameter so both master CLI sessions and per-daemon sessions get their own keychain/file slot. - Master CLI: `session_id = "master"` (keychain entry: service=agentkeys, account=master; file fallback: ~/.agentkeys/master/session.json). - Daemon: `session_id = format!("daemon-{}", wallet)` after pair/recover, "daemon-pending" before. Two daemons on one machine no longer collide. - `AGENTKEYS_SESSION_STORE=file` still forces file-only for CI/Docker. Tests: 4 new session_store tests (save/load roundtrip master, daemon-wallet, two-daemons-no-collision, clear-session-per-id). Full suite 89 tests green under `--test-threads=1` (keyring access is not parallel-safe by design). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Codex P1 (daemon can't resume saved session): default startup only probed 'daemon-pending', so successful pair/recover that saved under 'daemon-<wallet>' was never loaded on next start. Now scans all 'daemon-*' fallback entries via new session_store::list_fallback_session_ids and tries each. Codex P1 (pending sentinel sticks on failed pair): old code wrote a fake session with token='pending' before pair attempt; if pair failed, the next startup 'successfully' loaded the fake and skipped pairing entirely. Removed the sentinel entirely — save only after pair/recover succeeds and the wallet is known. Codex P2 (legacy master fallback): added session_store::load_session_with_legacy_fallback() that tries the new namespace first, then the pre-#12 'session' keyring account / ~/.agentkeys/session.json file. Existing installs stay logged in across the upgrade. (Wire-up of CLI's CommandContext to use this helper deferred to a follow-up — the helper is available and tested.) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Codex v2 P1 (keyring-only sessions invisible): list_fallback_session_ids only scanned the file-fallback directory, so daemon sessions saved via the OS keyring (default macOS) were invisible on restart. save_session now drops a tiny discovery-marker file (`session.json` containing `{"_keyring_managed":true}`) when the keyring path was used, so list_fallback_session_ids finds the session_id and load_session can then go to the keyring for the real credential. Codex v2 P2 (legacy master not loaded): CommandContext::load_session now calls session_store::load_session_with_legacy_fallback so pre-#12 installs (keyring account=session or ~/.agentkeys/session.json) stay logged in after the wallet-namespaced layout upgrade. 52 tests green across cli/core/daemon crates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Three findings from the combined codex+claude review on PR #24, applied in-branch per human design call: **P1 — Nondeterministic daemon selection (codex)** `daemon/main.rs`: when no --session-id is given and >1 `daemon-*` session directory exists, error with the candidate list instead of silently restoring an arbitrary wallet (cross-wallet credential mix-up risk on multi-daemon hosts). Single-session and explicit --session-id paths are unchanged. `session_store.rs::list_fallback_session_ids` now sorts results alphabetically so the enumeration order is deterministic across runs and filesystems. **P1 — Windows/Linux keychain account-name sanitization (claude)** New `sanitize_for_keyring(session_id)` helper: - ASCII alnum + `-_.` preserved as-is - other chars (including null bytes, slashes, non-UTF8) replaced with `_` - if sanitization changes the value OR length >128, append an 8-char hash of the original to preserve uniqueness across alias collisions Applied in all three `try_keyring_*` helpers so Windows Credential Manager (rejects null bytes) and Linux secret-service (rejects non-UTF8) don't crash or silently truncate-and-collide. **P2 — Marker file overwriting real file fallback (both reviewers)** Keyring-managed sessions now drop an empty `.keyring_managed` marker file instead of overwriting `session.json` with a JSON stub. The fallback invariant is preserved: switching `AGENTKEYS_SESSION_STORE` between `file` and keyring no longer destroys the real file-mode credential. `list_fallback_session_ids` matches either `session.json` or `.keyring_managed`. `clear_session` removes both. Tests (5 new in session_store): - list_fallback_session_ids_is_sorted - sanitize_for_keyring_preserves_ascii_alnum_and_safe_punctuation - sanitize_for_keyring_replaces_unsafe_chars_and_appends_hash - sanitize_for_keyring_truncates_oversized_input - file_mode_save_writes_session_json_not_marker - list_fallback_session_ids_finds_marker_only_directories Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1 core: 16 passed; daemon: 13+15 passed; cli: 14 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…r propagation Two findings from codex re-review of PR #24: **P2 — Path traversal via --session-id** `fallback_path` and `marker_path` now route `session_id` through `sanitize_for_keyring` before joining into the filesystem path. Inputs like `../escape`, `foo/bar`, or absolute paths can no longer escape ~/.agentkeys. Sanitization is idempotent, so callers that enumerate via `list_fallback_session_ids` and feed ids back into save/load/clear still match exactly. **P2 — Marker-file write failure silently swallowed** `save_session` now propagates errors from `create_dir_all` on the marker's parent and from `File::create` on the marker itself. If the keyring save succeeds but the marker can't land (read-only HOME, permission error), the caller now sees the error instead of a false Ok(()) — the session would otherwise be in the keyring but undiscoverable on restart. The "preserve pre-#12 daemon session on upgrade" finding remains dismissed per human design call on PR #24 (no users yet). Tests (2 new): - save_session_does_not_escape_agentkeys_dir_on_path_traversal - save_session_rejects_forward_slash_in_session_id Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1 core: 18 passed; daemon: 13+15 passed; cli: 14 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…h suffix `std::collections::hash_map::DefaultHasher` is explicitly NOT guaranteed stable across Rust/toolchain versions. Using it as a persistence key would make sanitized session_ids unreachable after a compiler upgrade: the directory / keyring account name would silently change and lookups would miss. Swap to SHA-256 via the `sha2` crate (already in-tree) and take the first 4 bytes as an 8-char hex suffix. `hex` is already a declared dep. Added a pinned-value regression test (`sanitize_for_keyring_uses_stable_sha256_suffix`) that asserts sanitize_for_keyring("foo/bar") == "foo_bar-cc5d46bd" so any future change to the hash pipeline fails loudly in CI. The codex v3 P1 "preserve legacy daemon session file" finding remains dismissed per the human design call on PR #24 — no users yet, no legacy state to preserve. Test: cargo test -p agentkeys-core session_store -- --test-threads=1 13 passed; 0 failed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ate validation Two findings from codex v4 review: **P2 — Marker-file failure too strict** v5 propagated marker write errors, which regressed direct-load callers (`agentkeys init`/`recover` saving `master`). Those paths don't rely on prefix-scan discovery; they look up by known session_id. Revert to best-effort: if `create_dir_all` or `File::create` fails after a successful keyring save, print a warning to stderr and return Ok. Keyring entry is authoritative for direct loads; only prefix-scan discovery (daemon restart) is degraded. **P2 — Stale markers block daemon startup** The `list.len() > 1` ambiguity check fired before load validation, so a single stale `.keyring_managed` directory (e.g., keyring entry deleted out-of-band) could block startup even when exactly one real session still loaded. New flow: 1. Enumerate candidates via list_fallback_session_ids (sorted). 2. Attempt load_session for each; keep only successes. 3. 0 loadable → pair flow. 1 loadable → use it. >1 loadable → error. The ambiguity error message is unchanged for the genuinely-ambiguous case ("multiple loadable daemon sessions found... Candidates: ..."). The codex v4 P1 "preserve legacy daemon session file" remains dismissed per human design call (no users yet). Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1 core: 19 passed; daemon: 13+15 passed; cli: 14 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…covery hint **P2 — cmd_init/cmd_recover ignored ctx.session_id** CommandContext carries `session_id` (defaults to "master", future-proofed for multi-session CLI), and `load_session` already reads from that namespace — but `cmd_init` and `cmd_recover` hardcoded "master" for the save. Any caller overriding `session_id` would succeed init but the next command (which reads from the overridden namespace) would miss the write. Replace the hardcoded string with `&ctx.session_id`. **P2 — Custom daemon session rediscovery hint** If a user pairs with `--session-id WORK`, restarts without the flag, the `daemon-*` scan misses the stored session and the daemon silently re-pairs (new wallet, previously stored credentials become unreachable from this daemon). Not fixable without a persistence pointer that would grow the scope, but we now print a hint on stderr when the scan returns zero `daemon-*` entries AND non-`daemon-` / non-`master` sessions exist under `~/.agentkeys/`: "no daemon-* sessions found, but these custom session IDs exist: ...; pass --session-id <name> to resume one". Pair flow proceeds unchanged; the hint just nudges the user toward the right command. The codex v5 P1 "preserve legacy daemon session file" remains dismissed per human design call (no users yet). Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1 core: 19 passed; daemon: 13+15 passed; cli: 14 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

**P1 — Reserved path components escaping ~/.agentkeys** sanitize_for_keyring previously returned "." / ".." / "" unchanged because "." is allowed in the whitelist. That let `--session-id ..` resolve to `$HOME/session.json` (outside the namespaced store), and an empty session_id resolve to the legacy root file. Now the accept-as-is rule explicitly rejects the empty string, ".", and ".." and force-rewrites them into the reserved `__agk_` namespace. **P2 — Rewritten IDs collide with user-supplied safe IDs** Previously `sanitize_for_keyring("foo/bar") == "foo_bar-cc5d46bd"` but a user could pass `--session-id foo_bar-cc5d46bd` and land on the same keyring account + fallback directory. Introduce a `__agk_` reserved prefix — all rewrites live under it, and any user-supplied id that starts with `__agk_` is itself pushed through the rewrite path so the output is `__agk___agk_foo_bar-cc5d46bd-<nested-hash>`, distinct from the original rewrite. The reserved prefix uses only characters that are valid across Windows Credential Manager, Linux secret-service, and macOS Keychain, so the output remains a legal account name and filesystem directory on all three platforms. 3 new/updated regression tests: - sanitize_for_keyring_rejects_reserved_path_components - sanitize_for_keyring_rewrites_prefix_collisions - existing `sanitize_for_keyring_uses_stable_sha256_suffix` pin now expects `__agk_foo_bar-cc5d46bd` Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1 core: 21 passed; daemon: 13+15 passed; cli: 14 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…d IDs After v9 introduced the `__agk_` rewrite prefix, the daemon restart hint at main.rs:143-154 could print directory names like `__agk_foo_bar-cc5d46bd` as resumable `--session-id` candidates. But feeding that string back into load_session triggers another rewrite (because sanitize_for_keyring re-rewrites any input starting with `__agk_`), landing on a DIFFERENT storage key. The hint was advertising unresumeable candidates. Filter rewrite-prefixed entries out of the hint set. They're only reachable via the original unsanitized ID that produced them, which the user must remember on their own — the daemon has no way to invert the hash. Users with non-sanitized custom IDs (e.g. `--session-id WORK`) still see the hint and can resume as before. The codex v7 P1 "preserve legacy daemon session" remains dismissed (no users yet). Closing out this review cycle — 7 codex passes, 6 pass with new legit findings all fixed in-branch, pass 7 only surfaces the already-dismissed P1 plus this filter tweak. Test: cargo test -p agentkeys-daemon -- --test-threads=1 daemon: 13+15 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Rebase of fix/issue-12 onto current main (post-PR #18 #19 #21 #23 merges) surfaced two concrete integration issues, fixed in this patch: **1. Expose session_store helpers for test reuse** - Make `agentkeys_core::session_store::fallback_path(session_id)` public so CLI tests can assert on the on-disk path layout. - Re-export `agentkeys_core::session_store` at `agentkeys_cli::session_store` so existing test imports (`use agentkeys_cli::session_store;`) keep working after the old cli-local `session_store.rs` was deleted upstream. **2. cmd_revoke clear_session calls (post-PR #18 self-revoke merge)** PR #18 added `session_store::clear_session()` calls on self-revoke paths. The new signature takes `session_id`, not 0 args. Pass `&ctx.session_id` so the revoke wipes the correct namespace for any future multi-session CLI caller. **3. Integration tests: seed identity_links for Recover** PR #21 tightened `resolve_identity_typed` to require the identity to exist in `identity_links` before resolution. Two pair_tests (`recover_full_loop`, `recover_credentials_intact`) opened Recover requests with aliases that were never linked, so they now 404. Added `InProcessBackend::link_identity_for_tests(identity_type, identity_value, wallet)` that inserts directly into the mock DB's `identity_links` table. The two failing tests now seed the alias link before the Recover flow. Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -p agentkeys-mock-server -- --test-threads=1 core: 21; daemon: 13 + 15; cli: 30; mock-server: 56 — all passing Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…r_session `try_keyring_delete` was fire-and-forget: spawned a detached thread, didn't wait, `clear_session` returned Ok before the delete ran. On keyring-backed installs `cmd_revoke`/self-revoke would report "session wiped", the process would exit, and the next command could still load the stale session from the OS keychain. Mirror the `try_keyring_save`/`try_keyring_load` pattern: spawn the thread, wait up to 2 seconds via `recv_timeout`, return a bool. `keyring::Error::NoEntry` is treated as success (the intent is "nothing of this name should remain"). On timeout or any other error, `clear_session` now returns an anyhow error so the caller surfaces a clear message instead of a false success. Test: `clear_session_is_synchronous_in_file_mode` asserts the delete happens synchronously — immediately after clear_session returns, load_session must fail without any sleep/retry. The codex v8 P2 "preserve pre-#12 legacy daemon session" remains dismissed per the design call (no users yet). Test: cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1 core: 22 passed; daemon: 13+15 passed; cli: 30 passed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

hanwencheng · 2026-04-15T01:19:10Z

Codex v8 P1 landed (`0e2cc5f`)

P1 — try_keyring_delete was fire-and-forget
Thread was detached with no timeout/join. clear_session returned Ok before the delete ran, so on keyring-backed installs cmd_revoke would report "session wiped" but the next command could still load the stale session from the OS keychain.

Fix mirrors try_keyring_save / try_keyring_load: spawn → recv_timeout(2s) → return bool. keyring::Error::NoEntry is treated as success (intent is "nothing of this name should remain"). On timeout or any other error, clear_session returns an anyhow error so the caller surfaces a clear failure instead of a false success.

Test: clear_session_is_synchronous_in_file_mode — immediately after clear_session returns, load_session must fail without any sleep/retry.

P2 — legacy daemon session remains dismissed (no users yet).

cargo test -p agentkeys-core -p agentkeys-daemon -p agentkeys-cli -- --test-threads=1
core: 22 passed; daemon: 13+15 passed; cli: 30 passed

hanwencheng and others added 11 commits April 15, 2026 02:56

hanwencheng force-pushed the fix/issue-12 branch from d150d87 to 7f33f8f Compare April 14, 2026 19:06

hanwencheng merged commit 0d87cd5 into main Apr 15, 2026

This was referenced Apr 15, 2026

fix(daemon): #14 --parent flag to bind pair requests to a specific master #22

Merged

fix(cli): #15 part 3 (fix-15b) — agentkeys scope command #29

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: #12 daemon OS keychain support (multi-platform, wallet-based namespacing)#24

fix: #12 daemon OS keychain support (multi-platform, wallet-based namespacing)#24
hanwencheng merged 12 commits intomainfrom
fix/issue-12

hanwencheng commented Apr 14, 2026 •

edited

Loading

Uh oh!

hanwencheng commented Apr 14, 2026

Uh oh!

hanwencheng commented Apr 14, 2026

Uh oh!

hanwencheng commented Apr 14, 2026

Uh oh!

hanwencheng commented Apr 14, 2026

Uh oh!

hanwencheng commented Apr 14, 2026

Uh oh!

hanwencheng commented Apr 14, 2026

Uh oh!

hanwencheng commented Apr 14, 2026

Uh oh!

hanwencheng commented Apr 14, 2026

Uh oh!

hanwencheng commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hanwencheng commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changed

Test plan

Follow-ups

Issue

Uh oh!

hanwencheng commented Apr 14, 2026

Verdict

Uh oh!

hanwencheng commented Apr 14, 2026

Combined codex+claude review fixes landed (d5a40de)

Uh oh!

hanwencheng commented Apr 14, 2026

Codex v2 P2s landed (8a84480)

Uh oh!

hanwencheng commented Apr 14, 2026

Codex v3 P2 landed (1c930c5) — P1 dismissed (no-users rationale)

Uh oh!

hanwencheng commented Apr 14, 2026

Codex v4 P2s landed (b9e1621)

Uh oh!

hanwencheng commented Apr 14, 2026

Codex v5 P2s landed (a3a4218)

Uh oh!

hanwencheng commented Apr 14, 2026

Codex v6 P1+P2 landed (74dc814)

Uh oh!

hanwencheng commented Apr 14, 2026

Codex v7 P2 landed (d150d87) — closing out review cycle

Summary of the PR #24 review cycle

Uh oh!

hanwencheng commented Apr 15, 2026

Codex v8 P1 landed (0e2cc5f)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hanwencheng commented Apr 14, 2026 •

edited

Loading

Combined codex+claude review fixes landed (`d5a40de`)

Codex v2 P2s landed (`8a84480`)

Codex v3 P2 landed (`1c930c5`) — P1 dismissed (no-users rationale)

Codex v4 P2s landed (`b9e1621`)

Codex v5 P2s landed (`a3a4218`)

Codex v6 P1+P2 landed (`74dc814`)

Codex v7 P2 landed (`d150d87`) — closing out review cycle

Codex v8 P1 landed (`0e2cc5f`)