feat(cli): add codex auth why-selected and verify --paths commands#410

Merged
ndycode merged 2 commits into main from feat/cli-why-selected-and-verify-paths
Apr 17, 2026

Conversation

@ndycode (Owner) commented Apr 17, 2026

Implements PR-P from Phase 1 roadmap (docs/audits/MASTER_AUDIT.md §9 F1, F2).

Summary

Two new diagnostic CLI commands:

  1. codex auth why-selected [--now|--last] [--json] — surfaces live per-candidate scoring breakdown of the hybrid account selection algorithm
  2. codex auth verify [--paths|--flagged|--all] [--json] — self-tests the storage path resolution chain and the PR-A resolvePath sandbox guard

Design Decision: Trace Mode Over Persistent Tracker

No persistent selection tracker exists in the codebase. Rather than building a new ring buffer file, why-selected extends selectHybridAccount with a non-mutating selectHybridAccountTraced sibling that mirrors the production scoring logic exactly (same weights, same availability gating, same PID offset, same capability boost) and emits the per-candidate breakdown. A parity test asserts both functions agree on the winning index.
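
The trace-mode idea can be illustrated with a minimal sketch. It is not the code in lib/rotation.ts: the `Candidate` shape, the `WEIGHTS` values, and the function names `selectAccount` / `selectAccountTraced` are hypothetical stand-ins, and the PID offset and capability boost are left out. It only shows how a traced sibling can reuse the same scoring function and stay side-effect free, so that a parity check on the winning index is meaningful.

```typescript
// Hypothetical, simplified candidate shape (the real AccountWithMetrics differs).
interface Candidate {
  index: number;
  isAvailable: boolean;
  health: number; // assumed range 0..100
  tokens: number; // assumed range 0..100
  lastUsed: number; // epoch milliseconds
}

interface CandidateTrace {
  index: number;
  isAvailable: boolean;
  score: number;
}

// Assumed weights, for illustration only.
const WEIGHTS = { health: 2, tokens: 5, freshnessPerHour: 0.1 };

// Single scoring function shared by both selectors, so they cannot drift apart.
function scoreCandidate(c: Candidate, now: number): number {
  const hoursIdle = (now - c.lastUsed) / 3_600_000;
  return (
    c.health * WEIGHTS.health +
    c.tokens * WEIGHTS.tokens +
    hoursIdle * WEIGHTS.freshnessPerHour
  );
}

// Production-style selector: returns only the winning index, or null.
function selectAccount(candidates: Candidate[], now: number): number | null {
  let bestIndex: number | null = null;
  let bestScore = -Infinity;
  for (const c of candidates) {
    if (!c.isAvailable) continue; // availability gating
    const score = scoreCandidate(c, now);
    if (score > bestScore) {
      bestScore = score;
      bestIndex = c.index;
    }
  }
  return bestIndex;
}

// Traced sibling: same gating and scoring, but also reports every candidate.
// It reads its inputs only and mutates nothing.
function selectAccountTraced(
  candidates: Candidate[],
  now: number,
): { selectedIndex: number | null; trace: CandidateTrace[] } {
  const trace: CandidateTrace[] = candidates.map((c) => ({
    index: c.index,
    isAvailable: c.isAvailable,
    score: scoreCandidate(c, now),
  }));
  let selectedIndex: number | null = null;
  let bestScore = -Infinity;
  for (const t of trace) {
    if (t.isAvailable && t.score > bestScore) {
      bestScore = t.score;
      selectedIndex = t.index;
    }
  }
  return { selectedIndex, trace };
}
```

A parity test in this style would feed both functions the same candidate list and assert that `selectAccountTraced(...).selectedIndex` equals `selectAccount(...)`.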

--last consults the existing loadPersistedRuntimeObservabilitySnapshot for a generated-at timestamp but falls back to live recomputation; a user-visible note explains this in text mode.

Verify --paths Coverage

Walks the full path chain: process.cwd → findProjectRoot → resolveProjectStorageIdentityRoot → getProjectStorageKey → getProjectConfigDir → getProjectGlobalConfigDir.

Sandbox probes:

  • sandbox-accept-home — home-dir path accepted
  • sandbox-accept-tmp — tmpdir path accepted
  • sandbox-reject-escape — out-of-sandbox path rejected (UNC on Windows, /etc/shadow on unix)

Confirms the PR-A lookalike-prefix fix is live.
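
As a rough illustration of what these probes exercise, here is a sketch of a containment check and three probes. It is not the `resolvePath` from lib/storage/paths.ts: `isInsideRoot`, `resolveSandboxed`, and `runSandboxProbes` are hypothetical names, the example roots are made up, and it assumes "lookalike prefix" means a sibling directory that merely shares a string prefix with an allowed root. POSIX path semantics are used throughout to keep the example deterministic.

```typescript
import { posix as path } from "node:path";

// True when `candidate` is `root` itself or nested below it.
// Comparing via path.relative avoids the lookalike-prefix trap that a plain
// string startsWith check has ("/home/alice-evil" starts with "/home/alice").
function isInsideRoot(root: string, candidate: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(candidate));
  if (rel === "") return true;
  if (rel === ".." || rel.startsWith("../")) return false;
  return !path.isAbsolute(rel);
}

// Hypothetical sandboxed resolver: throws for anything outside the allowed roots.
function resolveSandboxed(input: string, allowedRoots: string[]): string {
  const resolved = path.resolve(input);
  if (!allowedRoots.some((root) => isInsideRoot(root, resolved))) {
    throw new Error(`Access denied: ${resolved}`);
  }
  return resolved;
}

interface ProbeResult {
  name: string;
  ok: boolean;
}

// Runs two accept probes and one reject probe against the resolver.
function runSandboxProbes(home: string, tmp: string): ProbeResult[] {
  const roots = [home, tmp];
  const accepts = (p: string): boolean => {
    try {
      resolveSandboxed(p, roots);
      return true;
    } catch {
      return false;
    }
  };
  return [
    { name: "sandbox-accept-home", ok: accepts(path.join(home, ".codex", "probe")) },
    { name: "sandbox-accept-tmp", ok: accepts(path.join(tmp, "probe")) },
    { name: "sandbox-reject-escape", ok: !accepts("/etc/shadow") },
  ];
}
```

With roots such as `/home/alice` and `/tmp`, all three probes report `ok: true`, and a lookalike path such as `/home/alice-evil/x` is rejected by `isInsideRoot`.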

Backward Compatibility

  • selectHybridAccount signature unchanged; selectHybridAccountTraced is a new non-mutating sibling
  • Top-level verify-flagged verb kept as alias (delegates to runRepairVerifyFlagged)
  • --all runs both --paths and --flagged in sequence
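
The flag rules for `verify` can be sketched as a small parser. This is not the repository's `parseVerifyArgs`: the name `parseVerifyArgsSketch`, the result shape, and the behavior when no mode flag is given are assumptions, and the `--dry-run` / `--no-restore` options of the flagged mode are omitted for brevity.

```typescript
type VerifyMode = "paths" | "flagged" | "all";

type VerifyParseResult =
  | { ok: true; mode: VerifyMode; json: boolean }
  | { ok: false; message: string };

// Hypothetical parser: one mode flag at a time, with --all covering both modes.
function parseVerifyArgsSketch(args: string[]): VerifyParseResult {
  let paths = false;
  let flagged = false;
  let all = false;
  let json = false;

  for (const arg of args) {
    switch (arg) {
      case "--paths":
        paths = true;
        break;
      case "--flagged":
        flagged = true;
        break;
      case "--all":
        all = true;
        break;
      case "--json":
      case "-j":
        json = true;
        break;
      default:
        return { ok: false, message: `Unknown option: ${arg}` };
    }
  }

  if (paths && flagged) {
    return {
      ok: false,
      message: "--paths and --flagged cannot be combined; use --all to run both",
    };
  }
  if (all) return { ok: true, mode: "all", json };
  if (paths) return { ok: true, mode: "paths", json };
  if (flagged) return { ok: true, mode: "flagged", json };

  // Assumption: the PR does not state the default, so this sketch asks for a mode.
  return { ok: false, message: "Specify one of --paths, --flagged, or --all" };
}
```

A dispatcher built on this result would run the path checks for `paths`, delegate to the flagged verification for `flagged`, and run both in sequence for `all`.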

Tests

  • 14 tests in test/codex-manager-why-selected-command.test.ts
  • 18 tests in test/codex-manager-verify-command.test.ts
  • 6 parity + trace tests in test/rotation.test.ts
  • 2 dispatcher cases added to test/codex-manager-cli.test.ts
  • Full suite: 227 files / 3458 tests green (up from 3418)
  • npm run typecheck, npm run lint, npm run build: clean

Hands-on QA

Verified that the shipping binary commands codex-multi-auth auth why-selected --json and codex-multi-auth auth verify --paths --json produce structured JSON output matching the documented schema. Bad flags return a clear error followed by the usage text.

Closes

Phase 1 §9 F1 (why-selected) and F2 (verify --paths).

See docs/audits/MASTER_AUDIT.md §9.

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

adds two diagnostic CLI commands: why-selected surfaces per-candidate hybrid scoring via a non-mutating selectHybridAccountTraced sibling, and verify --paths walks the full storage path chain plus sandbox probes. the pickEscapeProbePath hardening correctly handles the pathological cwd=/ edge case flagged in the earlier audit, and the parity test between the traced and production selectors is solid.

Confidence Score: 5/5

safe to merge — all remaining findings are P2 style/documentation issues with no correctness impact

all three comments are P2: a missing --all entry in the per-command help, a dead resolveActiveIndex dep in the why-selected interface, and a fragile string match for sandbox rejection detection. none affect runtime behavior. prior P1 concerns from the previous thread are not addressed but were already noted; no new regressions introduced.

lib/codex-manager/commands/verify.ts (printVerifyUsage missing --all) and lib/codex-manager/commands/why-selected.ts (resolveActiveIndex unused in interface)

Important Files Changed

| Filename | Overview |
| --- | --- |
| lib/codex-manager/commands/why-selected.ts | new command implementing live scoring breakdown; resolveActiveIndex declared in deps interface but never called (dead coupling) |
| lib/codex-manager/commands/verify.ts | new verify command with path-chain walk and sandbox probes; --all missing from printVerifyUsage; sandbox rejection identified by fragile error-message regex |
| lib/rotation.ts | adds selectHybridAccountTraced as a non-mutating sibling to production selection; parity test added; no tracker side-effects confirmed |
| lib/codex-manager.ts | wires dispatch for why-selected and verify commands; buildSelectAccountTraced adapter correctly maps storage to AccountWithMetrics |
| test/codex-manager-verify-command.test.ts | 18 tests covering path chain steps, sandbox escape probe (including pathological cwd=/ case), --flagged delegation, and --all |

Sequence Diagram

sequenceDiagram
    participant CLI as codex-multi-auth CLI
    participant CM as codex-manager.ts
    participant WS as why-selected.ts
    participant VF as verify.ts
    participant RT as rotation.ts
    participant SP as storage/paths.ts
    participant ST as storage.ts

    Note over CLI,ST: codex auth why-selected [--now|--last] [--json]
    CLI->>CM: runCodexMultiAuthCli(["auth","why-selected",...])
    CM->>CM: buildSelectAccountTraced() → adapter fn
    CM->>WS: runWhySelectedCommand(args, deps)
    WS->>ST: loadAccounts()
    ST-->>WS: AccountStorageV3
    WS->>RT: selectHybridAccountTraced({accounts, healthTracker, tokenTracker})
    RT-->>WS: HybridSelectionTraceResult (candidates + scores, no mutation)
    WS-->>CLI: JSON or human-readable output

    Note over CLI,ST: codex auth verify --paths [--json]
    CLI->>CM: runCodexMultiAuthCli(["auth","verify","--paths"])
    CM->>VF: runVerifyCommand(args, deps)
    VF->>SP: getCwd → findProjectRoot → resolveProjectStorageIdentityRoot
    SP-->>VF: identity root path
    VF->>SP: getProjectStorageKey → getProjectConfigDir → getProjectGlobalConfigDir
    SP-->>VF: storage key + config dirs
    VF->>SP: resolvePath(home probe) → accept
    VF->>SP: resolvePath(tmp probe) → accept
    VF->>SP: resolvePath(escape probe) → throw Access denied
    SP-->>VF: sandbox confirmed
    VF-->>CLI: VerifyPathsReport (steps + sandboxTests)

Prompt To Fix All With AI
This is a comment left during a code review.
Path: lib/codex-manager/commands/verify.ts
Line: 427-444

Comment:
**`--all` absent from per-command help**

`printVerifyUsage()` lists no usage example or option entry for `--all`, yet `parseVerifyArgs` returns the error "use --all to run both" when someone passes `--paths --flagged`. a user who runs `codex auth verify --help` will never see `--all` mentioned, then hits an error message referencing it — confusing.

```suggestion
export function printVerifyUsage(): void {
	console.log(
		[
			"Usage:",
			"  codex auth verify --paths [--json]",
			"  codex auth verify --flagged [--json] [--dry-run] [--no-restore]",
			"  codex auth verify --all [--json] [--dry-run] [--no-restore]",
			"",
			"Options:",
			"  --paths           Self-test storage path resolution chain and resolvePath sandbox",
			"  --flagged         Verify previously-flagged accounts (delegates to verify-flagged)",
			"  --all             Run both --paths and --flagged in sequence",
			"  --json, -j        Print machine-readable JSON output",
			"",
			"Notes:",
			"  - `codex auth verify-flagged` remains available as a back-compat alias.",
			"  - `--paths` and `--flagged` cannot be combined; use `--all` to run both, or pick one.",
		].join("\n"),
	);
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: lib/codex-manager/commands/why-selected.ts
Line: 23-38

Comment:
**`resolveActiveIndex` declared but never called**

`resolveActiveIndex` is a required field in `WhySelectedCommandDeps` — the dispatcher passes it, tests mock it — but `runWhySelectedCommand` never calls it. any future implementer of this interface will provide a dep that does nothing. if there's no plan to use it (e.g. to annotate the "currently active" account in the candidate list), it should be removed or made optional.

```suggestion
export interface WhySelectedCommandDeps {
	parseWhySelectedArgs: (
		args: string[],
	) => ParsedArgsResult<WhySelectedCliOptions>;
	printWhySelectedUsage: () => void;
	setStoragePath: (path: string | null) => void;
	loadAccounts: () => Promise<AccountStorageV3 | null>;
	selectAccountTraced: (
		storage: AccountStorageV3,
	) => HybridSelectionTraceResult;
	loadRuntimeObservabilitySnapshot?: () => Promise<WhySelectedRuntimeSnapshot | null>;
	sanitizeEmail?: (email: string | undefined) => string | undefined;
	logInfo?: (message: string) => void;
	logError?: (message: string) => void;
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: lib/codex-manager/commands/verify.ts
Line: 267-274

Comment:
**sandbox rejection identified by fragile string match**

`/access denied/i` is coupled to the exact wording of `resolvePath`'s error in `lib/storage/paths.ts`. if that message is ever rephrased (or a different, unrelated exception propagates — e.g. a windows acl error on the probe path), `ok` silently flips to `false`, making the diagnostic report a broken sandbox when the sandbox is fine. consider exporting a typed error class or sentinel from `storage/paths.ts` and catching that instead.
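
One way to follow this suggestion is sketched below. The class name `SandboxAccessDeniedError` and the helper `classifyEscapeProbe` are hypothetical and not exports of `lib/storage/paths.ts`; the point is only that an `instanceof` check separates a genuine sandbox rejection from an unrelated failure, which a message regex cannot do reliably.

```typescript
// Hypothetical typed error that a sandboxed resolver could throw on rejection.
class SandboxAccessDeniedError extends Error {
  readonly attemptedPath: string;

  constructor(attemptedPath: string) {
    super(`Access denied: ${attemptedPath}`);
    this.name = "SandboxAccessDeniedError";
    this.attemptedPath = attemptedPath;
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type ProbeOutcome = "rejected" | "accepted" | "unexpected-error";

// Classifies the escape probe by error type instead of by message wording.
function classifyEscapeProbe(
  resolve: (p: string) => string,
  probePath: string,
): ProbeOutcome {
  try {
    resolve(probePath);
    return "accepted"; // the sandbox let the path through: guard is broken
  } catch (err) {
    return err instanceof SandboxAccessDeniedError
      ? "rejected" // the guard fired as intended
      : "unexpected-error"; // for example a filesystem or ACL failure
  }
}
```

With three outcomes, the report can mark the probe as passed only for `rejected` and surface `unexpected-error` separately instead of reporting a broken sandbox.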

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "fix(cli): harden verify --paths sandbox ..."

Closes: Phase 1 §9 F1, F2 (plan PR-P).

- Extend lib/rotation.ts with selectHybridAccountTraced: non-mutating
  variant that mirrors the hybrid scoring logic of selectHybridAccount
  and returns a per-candidate breakdown (health, tokens, freshness,
  capability boost, pid bonus, score). selectHybridAccount signature is
  unchanged and a parity test asserts both agree on the winning index.

- Add codex auth why-selected [--now|--last] [--json]: live
  recomputation of the current selection decision with per-candidate
  reasons. --last additionally consults the persisted runtime
  observability snapshot when available; no persistent selection
  tracker exists yet (trace-mode approach), so --last falls back to
  live recomputation with a user-visible note.

- Add codex auth verify [--paths|--flagged|--all] [--json]: --paths
  walks the storage path resolution chain (cwd -> findProjectRoot ->
  resolveProjectStorageIdentityRoot -> storage key -> project config
  dir -> global project dir) and self-tests resolvePath() with
  known-good (home, tmp) and known-bad (UNC / /etc/shadow) inputs to
  confirm the PR-A sandbox guard is working. --flagged delegates to
  the existing verify-flagged command; --all runs both. The
  verify-flagged top-level verb is kept as a back-compat alias.

- Update help usage block and wire both verbs into runCodexMultiAuthCli.

- Tests: 40 new tests across rotation, command, parser, and dispatcher
  layers. Full suite: 3458 pass (up from 3418).
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@coderabbitai
Contributor

coderabbitai bot commented Apr 17, 2026

Warning

Rate limit exceeded

@ndycode has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 0 minutes and 58 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 1f6da97 and b3c2945.

📒 Files selected for processing (12)
  • README.md
  • docs/reference/commands.md
  • lib/codex-manager.ts
  • lib/codex-manager/commands/verify.ts
  • lib/codex-manager/commands/why-selected.ts
  • lib/codex-manager/help.ts
  • lib/rotation.ts
  • test/codex-manager-cli.test.ts
  • test/codex-manager-verify-command.test.ts
  • test/codex-manager-why-selected-command.test.ts
  • test/documentation.test.ts
  • test/rotation.test.ts
Comment thread lib/codex-manager.ts
Comment on lines +3252 to +3289
function buildSelectAccountTraced(): (
	storage: AccountStorageV3,
) => ReturnType<typeof selectHybridAccountTraced> {
	return (storage: AccountStorageV3) => {
		const now = Date.now();
		const healthTracker = getHealthTracker();
		const tokenTracker = getTokenTracker();
		const accountsWithMetrics: AccountWithMetrics[] = storage.accounts.map(
			(account, index) => {
				const enabled = account?.enabled !== false;
				const rateLimits = account?.rateLimitResetTimes ?? {};
				let rateLimited = false;
				for (const value of Object.values(rateLimits)) {
					if (typeof value === "number" && value > now) {
						rateLimited = true;
						break;
					}
				}
				const coolingDown =
					typeof account?.coolingDownUntil === "number" &&
					account.coolingDownUntil > now;
				const isAvailable = enabled && !rateLimited && !coolingDown;
				return {
					index,
					trackerKey: account?.accountId ?? index,
					isAvailable,
					lastUsed: account?.lastUsed ?? 0,
				};
			},
		);
		return selectHybridAccountTraced({
			accounts: accountsWithMetrics,
			healthTracker,
			tokenTracker,
		});
	};
}


P1 isAvailable parity gap vs production

buildSelectAccountTraced() computes isAvailable from enabled, rateLimitResetTimes, and coolingDownUntil, but the production path in lib/accounts.ts (getCurrentOrNextForFamilyHybrid) also gates on hasEnabledWorkspaces(account) and isCircuitAvailable(account) (circuit-breaker state), and uses the family-specific isRateLimitedForFamily(account, family, model) rather than scanning all rate-limit keys. A circuit-open account shows as available: true in the diagnostic but would never be selected in production, making the "why-selected" output misleading exactly when it matters most (accounts under failure pressure).
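
The gating gap described above can be sketched as a single predicate that the diagnostic adapter and the production path could both call. Everything here is hypothetical: `AccountSketch` is a simplified stand-in for the stored account shape, `circuitOpen` and `workspaces` stand in for whatever `isCircuitAvailable` and `hasEnabledWorkspaces` actually inspect, and keying rate limits by family name is an assumption about how `isRateLimitedForFamily` works.

```typescript
// Hypothetical, simplified account shape for illustration only.
interface AccountSketch {
  enabled?: boolean;
  coolingDownUntil?: number;
  rateLimitResetTimes?: Record<string, number | undefined>;
  circuitOpen?: boolean; // stand-in for circuit-breaker state
  workspaces?: Array<{ enabled: boolean }>; // stand-in for workspace gating
}

interface AvailabilityReport {
  available: boolean;
  blockedBy: string[];
}

// One predicate covering every gate, returning the reasons for the diagnostic.
function explainAvailability(
  account: AccountSketch,
  family: string,
  now: number,
): AvailabilityReport {
  const blockedBy: string[] = [];

  if (account.enabled === false) blockedBy.push("disabled");

  // Family-specific check: limits recorded for other families are ignored.
  const reset = account.rateLimitResetTimes?.[family];
  if (typeof reset === "number" && reset > now) blockedBy.push("rate-limited");

  if (
    typeof account.coolingDownUntil === "number" &&
    account.coolingDownUntil > now
  ) {
    blockedBy.push("cooling-down");
  }

  if (account.circuitOpen === true) blockedBy.push("circuit-open");

  // Assumption: an account with no workspace list is treated as unrestricted.
  if (
    account.workspaces !== undefined &&
    !account.workspaces.some((w) => w.enabled)
  ) {
    blockedBy.push("no-enabled-workspace");
  }

  return { available: blockedBy.length === 0, blockedBy };
}
```

Returning the list of blocking reasons, rather than a bare boolean, also gives the why-selected output something concrete to print for each excluded candidate.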


Comment on lines +1 to +5
import { describe, expect, it, vi } from "vitest";
import {
	parseWhySelectedArgs,
	printWhySelectedUsage,
	runWhySelectedCommand,

P2 Missing coverage: buildSelectAccountTraced adapter parity

the parity test in rotation.test.ts verifies selectHybridAccountTraced agrees with selectHybridAccount given the same inputs, but there's no test that exercises buildSelectAccountTraced() in codex-manager.ts with a real AccountStorageV3 that has a circuit-breaker-open or workspace-disabled account. that's the gap where the diagnostic diverges from production. adding one integration-style test here (or in a new codex-manager-build-account-traced.test.ts) would catch the parity drift described in the inline comment on buildSelectAccountTraced.


Addresses oracle audit HIGH findings on PR 410:
- verify --paths now resets storage state on entry (setStoragePath(null))
  matching the convention used by doctor/check commands
- sandbox-reject-escape probe no longer false-fails when cwd is
  an ancestor of the probe path (e.g. POSIX cwd=/)
- Added regression test for cwd=/ edge case
- Documented why-selected and verify commands in:
  - docs/reference/commands.md (full flag + JSON output reference)
  - README.md (Advanced and Repair sections)
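
The cwd=/ edge case in the commit message above can be illustrated with a sketch of probe selection. This is not the repository's `pickEscapeProbePath`: the function `pickEscapeProbe`, its signature, and the assumption that the current working directory counts as one of the sandbox's allowed roots are all hypothetical. POSIX path semantics are used to keep the example deterministic.

```typescript
import { posix as path } from "node:path";

// True when `candidate` is `root` itself or nested below it.
function isInsideRoot(root: string, candidate: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(candidate));
  if (rel === "") return true;
  if (rel === ".." || rel.startsWith("../")) return false;
  return !path.isAbsolute(rel);
}

// Picks the first candidate that lies outside every allowed root.
// Returns null when no candidate escapes, for example when an allowed root
// is "/" and therefore contains every absolute path.
function pickEscapeProbe(
  candidates: string[],
  allowedRoots: string[],
): string | null {
  for (const candidate of candidates) {
    if (!allowedRoots.some((root) => isInsideRoot(root, candidate))) {
      return candidate;
    }
  }
  return null;
}
```

When the result is `null`, a diagnostic in this style would report the escape probe as skipped rather than failed, since no path exists that the sandbox ought to reject.
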
@ndycode ndycode merged commit da3d38f into main Apr 17, 2026
1 of 2 checks passed