fix: eliminate security hook trigger patterns in /codex and /autoplan (#1329) #1506

Open
NikhileshNanduri wants to merge 2 commits into garrytan:main from NikhileshNanduri:fix/1329-security-hook-triggers
@NikhileshNanduri
Summary

Fixes the remaining 3 of 4 security hook trigger patterns from issue #1329. PR #1496 already shipped Pattern 2 (eval with tilde path via gstack-paths --get). This PR handles Patterns 1, 3, and 4.

Pattern 1 — source with tilde path (both templates):

  • Before: source ~/.claude/skills/gstack/bin/gstack-codex-probe → triggers PreToolUse hook on tilde-path source
  • After: Direct calls to ~/.claude/skills/gstack/bin/gstack-codex-* standalone binaries

Pattern 3 — bare cd "$_REPO_ROOT" (codex template only):

  • Before: Standalone cd "$_REPO_ROOT" line before codex commands triggers cd + command hook
  • After: -C "$_REPO_ROOT" flag on codex commands; git -C "$_REPO_ROOT" diff for git commands; codex exec resume drops the cd entirely (session context preserves directory; -C is not a supported flag for resume)

Pattern 4 — inline python3 with #-comments (codex template only):

  • Before: Three inline python3 -u -c "..." blocks with Python-style comments trigger multi-line comment hook
  • After: All three blocks pipe to ~/.claude/skills/gstack/bin/gstack-codex-jsonl-parser --mode challenge|consult
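
As a rough illustration of what a standalone streaming parser with a `--mode challenge|consult` switch could look like, here is a minimal sketch. It is not the contents of gstack-codex-jsonl-parser. The event shapes (`thread.started` carrying `thread_id`, `item.completed` wrapping an `item` with `type`, `text`, or `command`, and `turn.completed` carrying `usage`), the function name `parse_events`, and the exact output strings are assumptions made for the example.

```python
import json
from typing import Iterable, Iterator


def parse_events(lines: Iterable[str], mode: str) -> Iterator[str]:
    """Yield printable lines for a stream of JSONL events.

    All event field names here are assumed for illustration and may not
    match what the real parser reads.
    """
    saw_turn_completed = False
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate malformed lines instead of aborting
        if not isinstance(event, dict):
            continue
        kind = event.get("type")
        if kind == "thread.started" and mode == "consult":
            # Only consult mode surfaces a session id for later resume.
            yield f"SESSION_ID:{event.get('thread_id', '')}"
        elif kind == "item.completed":
            item = event.get("item")
            if not isinstance(item, dict):
                continue
            item_type = item.get("type")
            if item_type == "agent_message":
                yield item.get("text", "")
            elif item_type == "reasoning":
                yield f"[codex thinking] {item.get('text', '')}"
            elif item_type == "command_execution":
                yield f"[codex ran] {item.get('command', '')}"
        elif kind == "turn.completed":
            saw_turn_completed = True
            usage = event.get("usage") or {}
            total = usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
            yield f"tokens used: {total}"
    if mode == "challenge" and not saw_turn_completed:
        # Completeness check: the stream ended without a final turn event.
        yield "WARNING: stream ended before turn.completed (possible disconnect)"
```

A command-line entry point would iterate over `sys.stdin` and print each yielded line. Keeping the logic in a file on disk, rather than in an inline `python3 -u -c "..."` string, is what removes the commented multi-line block from the template.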

New files

| File | Purpose |
| --- | --- |
| bin/gstack-codex-auth-probe | Standalone multi-signal auth check |
| bin/gstack-codex-version-check | Warns on known-bad Codex CLI versions |
| bin/gstack-codex-log-event | Telemetry emitter (reads config internally) |
| bin/gstack-codex-log-hang | Operational learning writer on timeout |
| bin/gstack-codex-timeout-wrapper | gtimeout/timeout/unwrapped fallback chain |
| bin/gstack-codex-jsonl-parser | Python JSONL parser for codex --json output |
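
The gtimeout/timeout/unwrapped fallback chain can be pictured with the short sketch below. The wrapper in this PR is a shell binary, so this Python version only illustrates the lookup order. The function name `wrap_with_timeout` and the injectable `which` parameter are hypothetical and exist to make the example testable.

```python
import shutil
from typing import Callable, List, Optional, Sequence


def wrap_with_timeout(
    seconds: int,
    command: Sequence[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Prefix `command` with the first available timeout binary.

    Lookup order: gtimeout (GNU coreutils as commonly installed on macOS),
    then timeout, then no wrapper at all.
    """
    for candidate in ("gtimeout", "timeout"):
        path = which(candidate)
        if path:
            return [path, str(seconds), *command]
    # Neither binary is installed: run the command unwrapped.
    return list(command)
```

The returned list can be passed directly to `subprocess.run`. Falling back to the bare command means a missing timeout binary degrades to "no timeout" rather than a hard failure.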

Testing scenarios

Static validation (free, bun test)

  1. Pattern guard tests. test/codex-hardening.test.ts adds 38 new tests:

    • Pattern 1: asserts no source ~/.*gstack-codex-probe in codex/autoplan templates and generated SKILL.md files
    • Pattern 3: asserts no bare cd "$_REPO_ROOT" as a top-level line in any of the 4 files
    • Pattern 4: asserts no $PYTHON_CMD.*-u\s+-c\s+" pattern (inline python) in any of the 4 files
  2. Standalone binary tests (test/codex-hardening.test.ts):

    • All 6 binaries exist and are executable
    • Bash binaries are syntax-valid (bash -n)
    • Python parser is syntax-valid (python3 -c "ast.parse(...)")
    • gstack-codex-auth-probe: AUTH_OK on CODEX_API_KEY, AUTH_FAILED with no auth, AUTH_OK on auth.json
    • gstack-codex-timeout-wrapper: executes directly without timeout binary, prefers gtimeout
    • gstack-codex-jsonl-parser: extracts agent_message, SESSION_ID in consult mode, no SESSION_ID in challenge mode, disconnect warning in challenge mode, tokens from turn.completed, [codex thinking] for reasoning, [codex ran] for command_execution, tolerates malformed JSON
  3. Updated skill-validation test — Python discovery test updated to check for gstack-codex-jsonl-parser binary invocation instead of old $PYTHON_CMD inline pattern

  4. Full free suite: bun test passes with only the 2 pre-existing AGENTS.md/docs/skills.md doc-inventory failures (confirmed on clean main)
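
The pattern guards described in item 1 amount to regex scans over the template and generated files. The real tests are written in TypeScript and run under bun test. The Python sketch below only illustrates the idea: its regular expressions are modeled on the descriptions above and are not copied from the test file, and the names `GUARDS` and `find_violations` are hypothetical.

```python
import re
from typing import Dict, List

# Assumed regexes, one per guarded pattern.
GUARDS: Dict[str, re.Pattern] = {
    # Pattern 1: sourcing the probe script via a tilde path.
    "pattern-1": re.compile(r"source\s+~/.*gstack-codex-probe"),
    # Pattern 3: a bare cd "$_REPO_ROOT" on a line of its own.
    "pattern-3": re.compile(r'^\s*cd "\$_REPO_ROOT"\s*$', re.MULTILINE),
    # Pattern 4: inline Python passed via $PYTHON_CMD ... -u -c "...".
    "pattern-4": re.compile(r'\$PYTHON_CMD.*-u\s+-c\s+"'),
}


def find_violations(text: str) -> List[str]:
    """Return the names of guarded patterns that still occur in `text`."""
    return [name for name, rx in GUARDS.items() if rx.search(text)]
```

A guard test would read each of the four files and assert that `find_violations` returns an empty list, so a regression that reintroduces any of the three patterns fails the free suite.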

Verification command

bun test test/codex-hardening.test.ts test/skill-validation.test.ts test/gen-skill-docs.test.ts
# Expected: 779 pass, 2 fail (pre-existing doc inventory)

🤖 Generated with Claude Code

NikhileshNanduri and others added 2 commits May 15, 2026 03:03
…toplan (issue garrytan#1329)

Extracts the logic from gstack-codex-probe bash functions and the inline Python
JSONL streaming parser into individually-executable binaries so skill templates
can call them directly without triggering Claude Code PreToolUse security hooks.

New binaries (all in bin/):
- gstack-codex-auth-probe: multi-signal auth check (exit 0=ok, 1=failed)
- gstack-codex-version-check: warns on known-bad Codex CLI versions
- gstack-codex-log-event: telemetry emitter (reads config internally, no _TEL env)
- gstack-codex-log-hang: operational learning writer on codex timeout
- gstack-codex-timeout-wrapper: gtimeout/timeout/unwrapped fallback chain
- gstack-codex-jsonl-parser: Python script parsing codex --json streaming output
  with --mode challenge (completeness check) and --mode consult (SESSION_ID extract)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…an (issue garrytan#1329)

Pattern 1 — source with tilde path:
  Replace `source ~/.claude/skills/gstack/bin/gstack-codex-probe` + function calls
  with direct `~/.claude/skills/gstack/bin/gstack-codex-*` binary invocations in
  both codex/SKILL.md.tmpl and autoplan/SKILL.md.tmpl.

Pattern 3 — bare cd "$_REPO_ROOT":
  Replace bare `cd "$_REPO_ROOT"` lines with `-C "$_REPO_ROOT"` flag on codex commands
  (review bare path, exec custom path) and drop the cd entirely for exec resume
  (session context preserves directory; -C is not a supported flag for resume).

Pattern 4 — inline python3 -u -c with #-comments:
  Replace all three inline JSONL parser blocks (Challenge, Consult new-session,
  Consult resume) with pipe to `~/.claude/skills/gstack/bin/gstack-codex-jsonl-parser`.

Also regenerates .kiro/.cursor/.openclaw etc host-specific SKILL.md files via
`bun run scripts/gen-skill-docs.ts --host all`.

Tests: 38 new tests in codex-hardening.test.ts guarding all three patterns and
verifying standalone binary behaviour. Updated skill-validation.test.ts to check
for the jsonl-parser binary instead of the old $PYTHON_CMD inline pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>