Problem
Several integration tests under crates/agentkeys-cli/tests/cli_tests.rs (and similar adjacent test files) mutate the process-global HOME env var via unsafe { std::env::set_var("HOME", ...) } to sandbox session_store::fallback_path() per-test. Because the cli_tests binary runs under tokio::test(flavor = "multi_thread"), multiple tests can be in flight concurrently. Any test that reads HOME (session_store::fallback_path, cmd_init, etc.) can observe the wrong value mid-execution → flaky/nondeterministic CI.
Codex flagged this on PR #18 and the same pattern was added in PRs #19 (fix-15), #20 (fix-16), #27 (fix-11).
Affected tests (current accounting)
Proposed fix
Add the serial_test crate as a dev-dep in crates/agentkeys-cli/Cargo.toml and annotate every test that touches HOME (or AGENTKEYS_SESSION_STORE, or AGENTKEYS_BIOMETRIC) with #[serial]:
use serial_test::serial;
#[tokio::test(flavor = "multi_thread")]
#[serial]
async fn cmd_revoke_self_clears_local_session() {
unsafe { std::env::set_var("AGENTKEYS_SESSION_STORE", "file") };
// ...
}
Per-test serialization is sufficient — these tests are fast (~ms) and the count is small (< 20 across the affected PRs).
Alternative considered: thread an explicit fallback_path parameter through session_store so tests don't need env mutation. More invasive (changes the public API of session_store) and not worth the cost vs the serial_test annotation.
Workaround
Until this lands, run with cargo test -- --test-threads=1 (already documented in PR bodies for affected branches).
Cross-references
Problem
Several integration tests under
crates/agentkeys-cli/tests/cli_tests.rs(and similar adjacent test files) mutate the process-globalHOMEenv var viaunsafe { std::env::set_var("HOME", ...) }to sandboxsession_store::fallback_path()per-test. Because thecli_testsbinary runs undertokio::test(flavor = "multi_thread"), multiple tests can be in flight concurrently. Any test that reads HOME (session_store::fallback_path,cmd_init, etc.) can observe the wrong value mid-execution → flaky/nondeterministic CI.Codex flagged this on PR #18 and the same pattern was added in PRs #19 (fix-15), #20 (fix-16), #27 (fix-11).
Affected tests (current accounting)
fix/issue-17:cmd_revoke_self_clears_local_session,cmd_revoke_with_agent_calls_revoke_by_wallet,cmd_revoke_no_session_errors_cleanly.agentkeys runfor master sessions +--envoverride #19fix/issue-15:cmd_run_master_session_injects_all_credentials, others (look forunsafe { set_varin cli_tests.rs).fix/issue-16:cmd_*_defaults_to_session_wallet,cmd_store_resolves_alias,cmd_read_unknown_identity_errors_cleanly.fix/issue-11:cmd_approve/revoke/teardown_skips_biometric_in_test_mode.Proposed fix
Add the
serial_testcrate as a dev-dep incrates/agentkeys-cli/Cargo.tomland annotate every test that touchesHOME(orAGENTKEYS_SESSION_STORE, orAGENTKEYS_BIOMETRIC) with#[serial]:Per-test serialization is sufficient — these tests are fast (~ms) and the count is small (< 20 across the affected PRs).
Alternative considered: thread an explicit
fallback_pathparameter throughsession_storeso tests don't need env mutation. More invasive (changes the public API of session_store) and not worth the cost vs theserial_testannotation.Workaround
Until this lands, run with
cargo test -- --test-threads=1(already documented in PR bodies for affected branches).Cross-references
agentkeys runfor master sessions +--envoverride #19, fix(cli): #16 wallet-optional + identity aliases #20, fix(cli): #11 biometric gate for high-security master CLI actions (macOS) #27 — same pattern, tracked here