Skip to content

Refactor: General code quality improvements and fail-closed handling#10

Merged
SihaoLiu merged 17 commits into
mainfrom
general-refactor-and-review
Jan 16, 2026
Merged

Refactor: General code quality improvements and fail-closed handling#10
SihaoLiu merged 17 commits into
mainfrom
general-refactor-and-review

Conversation

@SihaoLiu
Copy link
Copy Markdown
Contributor

Summary

  • Complete fail-closed handling and fix test harness signing
  • Fix remaining fail-open patterns and complete state parsing migration
  • Section-scoped placeholder detection using awk extraction
  • Add comprehensive test coverage for validators, plan file hooks, todo checker, and template loader

Bug Fixes (Phase 1):
- Fix sed patterns for Goal/AC extraction using POSIX [[:space:]]
- Add numeric validation for line counts before comparison
- Re-validate Codex model/effort at execution time
- Add plan file readability check (-r)
- Validate plan content has non-blank lines

Test Coverage (Phase 2):
- Add template edge case tests (empty vars, unicode)
- Add command injection pattern tests (new test file)
- Add plan file content validation tests
- Add Python todo checker error handling tests (new test file)

Code Simplification (Phase 3):
- Add parse_state_file() shared function
- Extract magic strings to constants (FIELD_*, MARKER_*, EXIT_*)
- Cache git status output to avoid duplicate calls

Edge Case Handling (Phase 4):
- Improve goal tracker placeholder detection with generic regex
- Validate existing git timeout wrapper (portable-timeout.sh)
- Validate existing Python todo checker graceful error handling

Documentation (Phase 5):
- Add template variable syntax documentation
- Document exit reasons with constants

All tests pass. Version bumped to 1.1.5.
- Wire parse_state_file into all hooks (loop-codex-stop-hook.sh,
  loop-plan-file-validator.sh, loop-bash-validator.sh)
- Replace hardcoded magic strings with constants (MARKER_COMPLETE,
  MARKER_STOP, EXIT_* constants)
- Add run_with_timeout for git operations in stop hook
- Implement exit code 2 on parse errors in todo checker
- Fix plan content validation to exclude comment-only plans
- Update tests to reflect new exit code behavior
- Add test for HTML-comment-only plan files

All tests pass: template-loader (41), bash-validator-patterns (48),
todo-checker (16), plan-file-validation (26), template-references (74),
state-exit-naming (13)
Phase 3.1/3.2 State parsing standardization:
- Replace get_current_round with parse_state_file in read/write/edit validators
- Update get_current_round to use FIELD_CURRENT_ROUND constant
- Apply FIELD_* constants in loop-plan-file-validator.sh required field list

Phase 4.2 Git timeout wrapper completion:
- Add run_with_timeout to all git calls in setup-rlcr-loop.sh
- Add run_with_timeout to all git calls in loop-plan-file-validator.sh
- Make stop hook block on git timeout failure instead of empty fallback

Phase 4.3 Todo checker error handling:
- Handle exit code 2 in stop hook (block and surface parse error)

Phase 1.5/2.3 Plan content validation:
- Update validation to treat # lines as non-content (per original plan)
- Add #-comment-only plan test (Test 9.9.2)

Phase 2.1 Unicode test coverage:
- Fix Unicode tests to use actual non-ASCII characters (Tests 38.1, 39.1)

AC6 Test runner:
- Create tests/run-all-tests.sh for full test suite verification

All tests pass:
- test-template-loader.sh: 43/43
- test-plan-file-validation.sh: 27/27
- test-todo-checker.sh: 16/16
- test-bash-validator-patterns.sh: 48/48
- test-template-references.sh: 74/74
- test-state-exit-naming.sh: 13/13
…ng migration

- hooks/loop-plan-file-validator.sh: Use || EXIT=$? pattern for git ls-files
  to prevent set -e from triggering on expected non-zero exit codes
- scripts/setup-rlcr-loop.sh: Same fix for git status/ls-files commands
- hooks/loop-bash-validator.sh: Replace get_current_round with parse_state_file
  for consistent state parsing across all validators
- tests/run-all-tests.sh: Add missing test suites (test-templates-comprehensive.sh,
  test-plan-file-hooks.sh, test-error-scenarios.sh)

All 9 test suites pass (221+ tests total).
- hooks/loop-plan-file-validator.sh: Apply || EXIT=$? pattern to ALL
  run_with_timeout calls (branch check, PLAN_TRACKED=true ls-files/status)
  to prevent set -e from aborting before JSON block output
- scripts/setup-rlcr-loop.sh: Fix HTML comment detection by checking
  single-line comments (<!-- ... -->) BEFORE multi-line start detection
- tests/test-plan-file-validation.sh: Add Test 9.10.1 regression test for
  single-line HTML comment + valid content acceptance

AC7 verified: Version 1.1.5 in .claude-plugin/plugin.json,
.claude-plugin/marketplace.json, and README.md.

All 9 test suites pass (28 plan-file-validation, 28 plan-file-hooks, etc.)
- hooks/loop-codex-stop-hook.sh: Apply || EXIT=$? pattern to branch check
  to prevent set -e from aborting before JSON block output
- hooks/loop-plan-file-validator.sh: Handle non-timeout git errors (exit
  codes other than 0, 1, 124) by blocking with explicit error message
- tests/test-plan-file-validation.sh: Add -c commit.gpgsign=false to all
  git commit calls to work in signing-enforced environments
- tests/test-plan-file-hooks.sh: Same fix for git commit signing
- tests/test-state-exit-naming.sh: Same fix for git commit signing

All git operations now handle exit codes:
- 0: Success (tracked/clean)
- 1: Expected failure (untracked file for ls-files)
- 124: Timeout (block with timeout message)
- Other: Unexpected error (block with error message)

All 9 test suites pass (28 plan-file-validation, 28 plan-file-hooks, etc.)
When CLAUDE_PROJECT_DIR is set (e.g., in an RLCR loop), setup-rlcr-loop.sh
uses it instead of pwd. This caused tests to resolve plan file paths
against the wrong directory (repo root instead of temp test repo).

Fix: Unset CLAUDE_PROJECT_DIR at the start of the test file so the setup
script uses pwd (the temp test repo) for each test case.

All 9 test suites now pass in all environments (AC6 verified).
Changed placeholder detection from regex patterns that overlapped across
sections to section-specific fixed string patterns using grep -qF. This
ensures a placeholder in one section doesn't incorrectly flag other sections.

Added 4 tests (14.1-14.4) to verify each placeholder is detected
independently and all three are reported when all are present.
Changed placeholder detection from fixed string matching across the entire
file to section-scoped detection using awk to extract each section block.
Uses generic placeholder pattern `[To be [a-z]` within each section only.

- Ultimate Goal: extracts lines after `### Ultimate Goal` until next `##`
- Acceptance Criteria: extracts lines after `### Acceptance Criteria` until next `##`
- Active Tasks: extracts lines after `#### Active Tasks` until next `##`

Tightened tests to assert exact missing-item lines (e.g., `**Ultimate Goal**:
Still contains placeholder text`) rather than just section names.
@claude
Copy link
Copy Markdown

claude Bot commented Jan 16, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3f45375368

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread hooks/loop-codex-stop-hook.sh Outdated
Comment on lines +306 to +308
if command -v git &>/dev/null && run_with_timeout "$GIT_TIMEOUT" git rev-parse --git-dir &>/dev/null 2>&1; then
GIT_IS_REPO=true
GIT_STATUS_CACHED=$(run_with_timeout "$GIT_TIMEOUT" git status --porcelain 2>/dev/null || echo "")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fail closed when git status times out

If git status --porcelain times out or fails, the || echo "" fallback leaves GIT_STATUS_CACHED empty, so both the large-file check and the “git clean” gate treat the repo as clean and allow exit. In large repos or when git is unhealthy, this means the loop can finish even with uncommitted changes or oversized files, undermining the enforcement those checks are meant to provide. Consider blocking on a non‑zero run_with_timeout exit (e.g., 124) rather than swallowing the failure.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, please check

Previously, git status --porcelain failures were swallowed with || echo "",
treating timeouts or errors as "clean repo" and allowing exit. This could
bypass large-file and uncommitted-changes checks in slow/unhealthy repos.

Now captures exit code and blocks exit on non-zero, with templated error msg.
@SihaoLiu
Copy link
Copy Markdown
Contributor Author

@codex

@SihaoLiu
Copy link
Copy Markdown
Contributor Author

@claude

@claude
Copy link
Copy Markdown

claude Bot commented Jan 16, 2026

Claude finished @SihaoLiu's task in 44s —— View job


Codex Feedback Fix Verified

  • Understand the trigger comment context
  • Review the Codex feedback about fail-closed handling
  • Verify the fix implementation
  • Provide review feedback

Codex Issue: The original code used || echo "" fallback for git status --porcelain, which would fail-open when git timed out or failed, leaving GIT_STATUS_CACHED empty and allowing checks to pass incorrectly.

Fix Applied (lines 299-326 in hooks/loop-codex-stop-hook.sh):

The fix properly implements fail-closed behavior:

# IMPORTANT: Fail-closed on git failures to prevent bypassing checks.
GIT_STATUS_CACHED=""
GIT_IS_REPO=false

if command -v git &>/dev/null && run_with_timeout "$GIT_TIMEOUT" git rev-parse --git-dir &>/dev/null 2>&1; then
    GIT_IS_REPO=true
    # Capture exit code to detect timeout/failure - do NOT use || echo "" which would fail-open
    GIT_STATUS_EXIT=0
    GIT_STATUS_CACHED=$(run_with_timeout "$GIT_TIMEOUT" git status --porcelain 2>/dev/null) || GIT_STATUS_EXIT=$?

    if [[ $GIT_STATUS_EXIT -ne 0 ]]; then
        # Git status failed or timed out - fail-closed by blocking exit
        FALLBACK="# Git Status Failed
        ..."
        REASON=$(load_and_render_safe "$TEMPLATE_DIR" "block/git-status-failed.md" "$FALLBACK" \
            "GIT_STATUS_EXIT=$GIT_STATUS_EXIT")
        jq -n --arg reason "$REASON" --arg msg "Loop: Blocked - git status failed (exit $GIT_STATUS_EXIT)" \
            '{"decision": "block", "reason": $reason, "systemMessage": $msg}'
        exit 0
    fi
fi

Verification:

  1. The exit code is now explicitly captured via || GIT_STATUS_EXIT=$? instead of being swallowed by || echo ""
  2. Non-zero exit codes (including timeout exit code 124) cause the hook to block and fail-closed
  3. A new template prompt-template/block/git-status-failed.md provides a clear error message explaining the failure
  4. The comment explicitly states "do NOT use || echo "" which would fail-open" to document the intent

The fix correctly addresses Codex's concern. The loop will now block exit when git status times out or fails, preventing checks from being bypassed in large repos or when git is unhealthy.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🚀

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

The run-all-tests.sh script was reporting "0 tests passed" because the
regex did not properly handle ANSI escape codes in the test output.

The issue: Test output contains "Passed: \x1b[0;32m43\x1b[0m" but the
old regex looked for literal "[0-9;m" patterns without the ESC character.

The fix: Strip ANSI escape codes using sed before parsing the numbers.
This ensures the test runner correctly reports the total test count
(currently 357 tests across 9 suites).
Changes:
1. Fixed ANSI escape code stripping in run-all-tests.sh to be portable
   across GNU (Linux) and BSD (macOS) sed implementations.
   - Changed from \x1b (GNU-specific) to $'\033' (ANSI-C quoting)
   - Reused output_stripped variable for efficiency
   - Removed redundant ${var:-0} fallback (|| echo "0" already handles it)

2. Added test-ansi-parsing.sh with 8 regression tests covering:
   - Basic ANSI color stripping
   - Multiple colors in one line
   - Passed/Failed count extraction from colored output
   - Zero count handling
   - Complex multi-line output parsing
   - Plain text (no ANSI codes) handling
   - Bold+color combined codes

3. Added test-ansi-parsing.sh to the test suite list.

Test results: 365 tests passed (357 original + 8 new), 0 failed.
Changed from exploratory/documentation script to proper assertion-based
tests with pass/fail counting:

- Added 12 structured tests covering error scenarios
- Tests now use pass/fail functions matching other test files
- Reports test counts in standard format for run-all-tests.sh parsing
- Removed redundant set +e/set -e pairs (script uses set -uo pipefail)
- Applied code-simplifier suggestions for cleaner conditionals

Test coverage now includes:
1. Template file not found
2. Template directory not found
3. load_and_render with missing template
4. render_template with empty content
5. Special regex characters in values
6. Strict mode (set -euo pipefail) behavior
7-8. load_and_render_safe fallback behavior
9-10. validate_template_dir validation
11-12. Edge cases (empty/unclosed placeholders)

Test results: 377 tests passed (365 previous + 12 new), 0 failed.
The script uses `set -uo pipefail` (without -e), but Test 6 was using
`set +e` before and `set -e` after a subshell, incorrectly enabling
errexit which was never originally enabled.

Changes:
- Replaced set +e/set -e with `|| true` after command substitution
- Removed unused EXIT_CODE variable (|| true always returns 0)
- Simplified test condition to only check for SCRIPT_COMPLETED output
- Added clarifying comment about the || true purpose

The test verifies that scripts using strict mode can safely call
template-loader functions without crashing.

Test results: 377 tests passed, 0 failed.
Add is_allowlisted_file() helper to permit specific round files
that are required by Codex review process:
- round-1-todos.md, round-2-todos.md: Persisted todos lists
- round-0-summary.md, round-1-summary.md: Historical accuracy updates

Update all validator hooks (read, write, edit, bash) to check allowlist
before blocking access to these specific files.

Files modified:
- hooks/lib/loop-common.sh: Add is_allowlisted_file() function
- hooks/loop-read-validator.sh: Check allowlist for todos and summaries
- hooks/loop-write-validator.sh: Check allowlist for todos and summaries
- hooks/loop-edit-validator.sh: Check allowlist for todos and summaries
- hooks/loop-bash-validator.sh: Check allowlist for round-1/2-todos.md
Improvements:
- Bash validator: Require active loop dir path for round-1/2-todos.md
  (fixes permissive allowlist that allowed any path)
- Add test-allowlist-validators.sh with 24 tests covering:
  - is_allowlisted_file() function
  - Read/Write/Edit validator allowlist behavior
  - Bash validator path-restricted allowlist
- Add test-allowlist-validators.sh to run-all-tests.sh

Test results: 401 tests passed across 11 suites

Files modified:
- hooks/loop-bash-validator.sh: Tighten allowlist path check
- tests/run-all-tests.sh: Add test-allowlist-validators.sh
- tests/test-allowlist-validators.sh: New test file (24 tests)
- Changed bash validator to use full ACTIVE_LOOP_DIR path instead of
  basename-only check to prevent same-basename bypass from different roots
- Added regex escaping for special characters in paths
- Added Test 25 for same-basename different-root security scenario
- Code-simplifier: consolidated comments, fixed regex escape pattern
@SihaoLiu
Copy link
Copy Markdown
Contributor Author

@claude @codex review this PR as a whole

@claude
Copy link
Copy Markdown

claude Bot commented Jan 16, 2026

Claude finished @SihaoLiu's task in 2m 31s —— View job


PR Review Complete

  • Understand PR scope and changes
  • Review code quality and patterns
  • Check for bugs and security issues
  • Verify CLAUDE.md compliance
  • Provide comprehensive review feedback

Code Review Summary

This is a substantial refactoring PR with 2440 additions and 195 deletions across 24 files. The changes focus on fail-closed security patterns, state parsing migration, and comprehensive test coverage.

Positive Aspects

1. Security - Fail-Closed Handling (Excellent)

The most critical fix addresses the Codex feedback about git status timeout handling. The implementation in hooks/loop-codex-stop-hook.sh:299-326 correctly captures exit codes and blocks on failure:

GIT_STATUS_EXIT=0
GIT_STATUS_CACHED=$(run_with_timeout "$GIT_TIMEOUT" git status --porcelain 2>/dev/null) || GIT_STATUS_EXIT=$?

if [[ $GIT_STATUS_EXIT -ne 0 ]]; then
    # Git status failed or timed out - fail-closed by blocking exit
    ...
    jq -n --arg reason "$REASON" --arg msg "Loop: Blocked - git status failed (exit $GIT_STATUS_EXIT)" \
        '{"decision": "block", "reason": $reason, "systemMessage": $msg}'
    exit 0
fi

This replaces the previous fail-open pattern (|| echo "") that would allow checks to be bypassed.

2. Code Quality - Shared Library (hooks/lib/loop-common.sh)

The centralization of common functions is well-done:

  • Constants for state field names and markers
  • Reusable parse_state_file() function
  • Helper functions like is_round_file_type(), extract_round_number(), is_allowlisted_file()
  • Standardized message functions with template fallbacks

3. Template System (hooks/lib/template-loader.sh)

The single-pass template rendering prevents placeholder injection attacks where user content containing {{...}} patterns could be substituted. The awk-based implementation at lines 74-129 is robust.

4. Comprehensive Test Coverage

New test files added:

  • test-allowlist-validators.sh (391 lines) - Tests file access controls
  • test-todo-checker.sh (313 lines) - Tests Python todo checker
  • test-ansi-parsing.sh (178 lines) - Tests ANSI escape handling
  • test-bash-validator-patterns.sh (214 lines) - Tests bash command blocking

5. Input Validation

Good security practices in scripts/setup-rlcr-loop.sh:

  • Path validation rejects absolute paths, symlinks, and shell metacharacters (lines 228-248)
  • Branch name validation for YAML safety (lines 436-443)
  • Codex model/effort validation (lines 447-461)

CLAUDE.md Compliance

  • English only: All code and comments are in English
  • No emoji: No emoji found in code files (only test references for testing purposes)
  • Version consistency: Version 1.1.5 is consistent across plugin.json, marketplace.json, and README.md
  • Version format: 1.1.5 follows the required X.Y.Z format

Minor Observations (Not Issues)

  1. Git timeout constant is set to 30 seconds in multiple files (GIT_TIMEOUT=30). Consider extracting to a shared constant if this becomes a maintenance burden.

  2. Allowlist approach in loop-common.sh:172-186 hardcodes specific filenames. The current approach is acceptable for security-critical code, but the list should be documented if it changes.

CI Status

All 8 CI checks are passing:

  • Version Bump Check: success
  • Template Loader Tests: success
  • Plan File Validation Tests: success
  • Shell Syntax Check: success
  • Claude Code Review: success

Verdict: This PR implements critical security improvements with proper fail-closed semantics, adds comprehensive test coverage, and maintains good code quality. The changes are well-structured and follow the project's guidelines. Approved for merge.


@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🚀

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@SihaoLiu SihaoLiu merged commit 3776988 into main Jan 16, 2026
8 checks passed
SihaoLiu added a commit that referenced this pull request Jan 17, 2026
Fixes based on Round 0 Codex review:

HIGH - Haiku sub-agent invocation (AC-2.3):
- Changed from generic "general-purpose" subagent to explicit
  "draft-relevance-checker" agent reference
- Added clear Task tool parameters documentation
- Referenced the custom agent in agents/draft-relevance-checker.md

HIGH - Missing positive/negative test descriptions (AC-6):
- Added TDD-style test format to Acceptance Criteria template
- Each AC now includes Positive Tests (expected to PASS) and
  Negative Tests (expected to FAIL) sections
- Added Generation Rule #8 enforcing TDD-style tests

MEDIUM - Path boundary phrasing (AC-6.3):
- Changed from negative phrasing ("What should NOT be done") to
  affirmative descriptions ("Maximum Acceptable Scope")
- Added examples showing affirmative descriptions
- Added Generation Rule #9 for affirmative path boundaries

MEDIUM - Narrow path rule for deterministic drafts (AC-6.3):
- Added "Note on Deterministic Designs" explaining when upper and
  lower bounds should converge
- Added Generation Rule #10 for respecting fixed approaches
SihaoLiu added a commit that referenced this pull request Jan 17, 2026
…ns (v1.3.0) (#17)

* Feature: Add gen-plan command to transform drafts into structured plans (v1.2.4)

Add a new slash command /humanize:gen-plan that transforms user draft documents
into well-structured implementation plans with acceptance criteria (AC-X format).

New files:
- commands/gen-plan.md: Slash command entry point
- skills/gen-plan/SKILL.md: Main skill implementation with opus model
- scripts/validate-gen-plan-io.sh: IO path validation script
- agents/draft-relevance-checker.md: Haiku-based relevance checker

Workflow:
1. Validates input/output paths with clear error messages
2. Uses haiku sub-agent to check draft relevance to repository
3. Analyzes draft for clarity, consistency, completeness, functionality
4. Engages user via AskUserQuestion to resolve issues
5. Generates plan.md with goals, ACs, path boundaries, and suggestions

Additional changes:
- Unified model names to use short aliases (opus, sonnet, haiku) across all skills
- Updated test validation to accept short model aliases

Updated version to 1.2.4 in plugin.json, marketplace.json, and README.md.

* Fix: Address Codex review findings for gen-plan skill

Fixes based on Round 0 Codex review:

HIGH - Haiku sub-agent invocation (AC-2.3):
- Changed from generic "general-purpose" subagent to explicit
  "draft-relevance-checker" agent reference
- Added clear Task tool parameters documentation
- Referenced the custom agent in agents/draft-relevance-checker.md

HIGH - Missing positive/negative test descriptions (AC-6):
- Added TDD-style test format to Acceptance Criteria template
- Each AC now includes Positive Tests (expected to PASS) and
  Negative Tests (expected to FAIL) sections
- Added Generation Rule #8 enforcing TDD-style tests

MEDIUM - Path boundary phrasing (AC-6.3):
- Changed from negative phrasing ("What should NOT be done") to
  affirmative descriptions ("Maximum Acceptable Scope")
- Added examples showing affirmative descriptions
- Added Generation Rule #9 for affirmative path boundaries

MEDIUM - Narrow path rule for deterministic drafts (AC-6.3):
- Added "Note on Deterministic Designs" explaining when upper and
  lower bounds should converge
- Added Generation Rule #10 for respecting fixed approaches

* Simplify: Clean up gen-plan implementation (Finalize Phase)

Code simplification after Codex review approval:

commands/gen-plan.md:
- Simplified trailing description to single sentence
- Consistent with other command files in the project

scripts/validate-gen-plan-io.sh:
- Removed unnecessary empty echo statements
- Consolidated error messages for conciseness
- 11 lines removed, functionality unchanged

skills/gen-plan/SKILL.md:
- Simplified Phase 6 from numbered list to concise paragraph
- 4 lines removed, same information preserved

All 49 tests pass. Validation script tested and works correctly.

* Bump version to 1.3.0 for gen-plan feature release

Version bump from 1.2.4 to 1.3.0 for the new gen-plan command feature.

* Fix: Validate argument values before shift to ensure correct exit codes

Bug fix:
- When --input or --output is passed without a value, shift 2 would fail
  under set -e and exit with code 1 (mapped to "input not found") instead
  of exit code 6 ("invalid arguments")
- Added checks for missing argument values before shift 2
- Also catches flags followed by another flag (e.g., --input --output)

Test enhancement:
- Added 10 comprehensive tests for validate-gen-plan-io.sh exit codes
- Tests cover: missing values, unknown options, all validation exit codes
- Total tests now: 59 (was 49)

* Fix: Model validation requires exact match for short aliases

The regex was treating opus/sonnet/haiku as prefixes, accepting invalid
values like opus-v2, haiku123, or sonnet-fast.

Fix:
- Short aliases (opus, sonnet, haiku) now require exact match via ^...$
- Full model IDs (claude-*, gpt-*, etc.) still use prefix matching

Added tests:
- NT-6d: Rejects opus-v2 (partial match)
- NT-6e: Rejects haiku123 (partial match)
- NT-6f: Rejects sonnet-fast (partial match)

Total tests: 62

* Fix: Reject directory paths for --output parameter

Bug fix:
- If --output pointed to an existing directory (e.g., /tmp/ instead of
  /tmp/plan.md), validation passed but Write would fail at runtime
- Now rejects any existing path at output location (file, directory, symlink)
- Added specific error message when output is a directory with suggestion

Changes:
- Check 4 now uses -d to detect directories with clear error message
- Check 4 uses -e to catch any existing path (not just regular files)

Added test:
- validate-gen-plan-io: output is directory exits 4

Total tests: 63
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant