Refactor: General code quality improvements and fail-closed handling#10
Conversation
Bug Fixes (Phase 1): - Fix sed patterns for Goal/AC extraction using POSIX [[:space:]] - Add numeric validation for line counts before comparison - Re-validate Codex model/effort at execution time - Add plan file readability check (-r) - Validate plan content has non-blank lines Test Coverage (Phase 2): - Add template edge case tests (empty vars, unicode) - Add command injection pattern tests (new test file) - Add plan file content validation tests - Add Python todo checker error handling tests (new test file) Code Simplification (Phase 3): - Add parse_state_file() shared function - Extract magic strings to constants (FIELD_*, MARKER_*, EXIT_*) - Cache git status output to avoid duplicate calls Edge Case Handling (Phase 4): - Improve goal tracker placeholder detection with generic regex - Validate existing git timeout wrapper (portable-timeout.sh) - Validate existing Python todo checker graceful error handling Documentation (Phase 5): - Add template variable syntax documentation - Document exit reasons with constants All tests pass. Version bumped to 1.1.5.
- Wire parse_state_file into all hooks (loop-codex-stop-hook.sh, loop-plan-file-validator.sh, loop-bash-validator.sh) - Replace hardcoded magic strings with constants (MARKER_COMPLETE, MARKER_STOP, EXIT_* constants) - Add run_with_timeout for git operations in stop hook - Implement exit code 2 on parse errors in todo checker - Fix plan content validation to exclude comment-only plans - Update tests to reflect new exit code behavior - Add test for HTML-comment-only plan files All tests pass: template-loader (41), bash-validator-patterns (48), todo-checker (16), plan-file-validation (26), template-references (74), state-exit-naming (13)
Phase 3.1/3.2 State parsing standardization: - Replace get_current_round with parse_state_file in read/write/edit validators - Update get_current_round to use FIELD_CURRENT_ROUND constant - Apply FIELD_* constants in loop-plan-file-validator.sh required field list Phase 4.2 Git timeout wrapper completion: - Add run_with_timeout to all git calls in setup-rlcr-loop.sh - Add run_with_timeout to all git calls in loop-plan-file-validator.sh - Make stop hook block on git timeout failure instead of empty fallback Phase 4.3 Todo checker error handling: - Handle exit code 2 in stop hook (block and surface parse error) Phase 1.5/2.3 Plan content validation: - Update validation to treat # lines as non-content (per original plan) - Add #-comment-only plan test (Test 9.9.2) Phase 2.1 Unicode test coverage: - Fix Unicode tests to use actual non-ASCII characters (Tests 38.1, 39.1) AC6 Test runner: - Create tests/run-all-tests.sh for full test suite verification All tests pass: - test-template-loader.sh: 43/43 - test-plan-file-validation.sh: 27/27 - test-todo-checker.sh: 16/16 - test-bash-validator-patterns.sh: 48/48 - test-template-references.sh: 74/74 - test-state-exit-naming.sh: 13/13
…ng migration - hooks/loop-plan-file-validator.sh: Use || EXIT=$? pattern for git ls-files to prevent set -e from triggering on expected non-zero exit codes - scripts/setup-rlcr-loop.sh: Same fix for git status/ls-files commands - hooks/loop-bash-validator.sh: Replace get_current_round with parse_state_file for consistent state parsing across all validators - tests/run-all-tests.sh: Add missing test suites (test-templates-comprehensive.sh, test-plan-file-hooks.sh, test-error-scenarios.sh) All 9 test suites pass (221+ tests total).
- hooks/loop-plan-file-validator.sh: Apply || EXIT=$? pattern to ALL run_with_timeout calls (branch check, PLAN_TRACKED=true ls-files/status) to prevent set -e from aborting before JSON block output - scripts/setup-rlcr-loop.sh: Fix HTML comment detection by checking single-line comments (<!-- ... -->) BEFORE multi-line start detection - tests/test-plan-file-validation.sh: Add Test 9.10.1 regression test for single-line HTML comment + valid content acceptance AC7 verified: Version 1.1.5 in .claude-plugin/plugin.json, .claude-plugin/marketplace.json, and README.md. All 9 test suites pass (28 plan-file-validation, 28 plan-file-hooks, etc.)
- hooks/loop-codex-stop-hook.sh: Apply || EXIT=$? pattern to branch check to prevent set -e from aborting before JSON block output - hooks/loop-plan-file-validator.sh: Handle non-timeout git errors (exit codes other than 0, 1, 124) by blocking with explicit error message - tests/test-plan-file-validation.sh: Add -c commit.gpgsign=false to all git commit calls to work in signing-enforced environments - tests/test-plan-file-hooks.sh: Same fix for git commit signing - tests/test-state-exit-naming.sh: Same fix for git commit signing All git operations now handle exit codes: - 0: Success (tracked/clean) - 1: Expected failure (untracked file for ls-files) - 124: Timeout (block with timeout message) - Other: Unexpected error (block with error message) All 9 test suites pass (28 plan-file-validation, 28 plan-file-hooks, etc.)
When CLAUDE_PROJECT_DIR is set (e.g., in an RLCR loop), setup-rlcr-loop.sh uses it instead of pwd. This caused tests to resolve plan file paths against the wrong directory (repo root instead of temp test repo). Fix: Unset CLAUDE_PROJECT_DIR at the start of the test file so the setup script uses pwd (the temp test repo) for each test case. All 9 test suites now pass in all environments (AC6 verified).
Changed placeholder detection from regex patterns that overlapped across sections to section-specific fixed string patterns using grep -qF. This ensures a placeholder in one section doesn't incorrectly flag other sections. Added 4 tests (14.1-14.4) to verify each placeholder is detected independently and all three are reported when all are present.
Changed placeholder detection from fixed string matching across the entire file to section-scoped detection using awk to extract each section block. Uses generic placeholder pattern `[To be [a-z]` within each section only. - Ultimate Goal: extracts lines after `### Ultimate Goal` until next `##` - Acceptance Criteria: extracts lines after `### Acceptance Criteria` until next `##` - Active Tasks: extracts lines after `#### Active Tasks` until next `##` Tightened tests to assert exact missing-item lines (e.g., `**Ultimate Goal**: Still contains placeholder text`) rather than just section names.
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3f45375368
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if command -v git &>/dev/null && run_with_timeout "$GIT_TIMEOUT" git rev-parse --git-dir &>/dev/null 2>&1; then | ||
| GIT_IS_REPO=true | ||
| GIT_STATUS_CACHED=$(run_with_timeout "$GIT_TIMEOUT" git status --porcelain 2>/dev/null || echo "") |
There was a problem hiding this comment.
Fail closed when git status times out
If git status --porcelain times out or fails, the || echo "" fallback leaves GIT_STATUS_CACHED empty, so both the large-file check and the “git clean” gate treat the repo as clean and allow exit. In large repos or when git is unhealthy, this means the loop can finish even with uncommitted changes or oversized files, undermining the enforcement those checks are meant to provide. Consider blocking on a non‑zero run_with_timeout exit (e.g., 124) rather than swallowing the failure.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
fixed, please check
Previously, git status --porcelain failures were swallowed with || echo "", treating timeouts or errors as "clean repo" and allowing exit. This could bypass large-file and uncommitted-changes checks in slow/unhealthy repos. Now captures exit code and blocks exit on non-zero, with templated error msg.
|
Claude finished @SihaoLiu's task in 44s —— View job Codex Feedback Fix Verified
Codex Issue: The original code used Fix Applied (lines 299-326 in The fix properly implements fail-closed behavior: # IMPORTANT: Fail-closed on git failures to prevent bypassing checks.
GIT_STATUS_CACHED=""
GIT_IS_REPO=false
if command -v git &>/dev/null && run_with_timeout "$GIT_TIMEOUT" git rev-parse --git-dir &>/dev/null 2>&1; then
GIT_IS_REPO=true
# Capture exit code to detect timeout/failure - do NOT use || echo "" which would fail-open
GIT_STATUS_EXIT=0
GIT_STATUS_CACHED=$(run_with_timeout "$GIT_TIMEOUT" git status --porcelain 2>/dev/null) || GIT_STATUS_EXIT=$?
if [[ $GIT_STATUS_EXIT -ne 0 ]]; then
# Git status failed or timed out - fail-closed by blocking exit
FALLBACK="# Git Status Failed
..."
REASON=$(load_and_render_safe "$TEMPLATE_DIR" "block/git-status-failed.md" "$FALLBACK" \
"GIT_STATUS_EXIT=$GIT_STATUS_EXIT")
jq -n --arg reason "$REASON" --arg msg "Loop: Blocked - git status failed (exit $GIT_STATUS_EXIT)" \
'{"decision": "block", "reason": $reason, "systemMessage": $msg}'
exit 0
fi
fiVerification:
The fix correctly addresses Codex's concern. The loop will now block exit when git status times out or fails, preventing checks from being bypassed in large repos or when git is unhealthy. |
|
Codex Review: Didn't find any major issues. 🚀 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
The run-all-tests.sh script was reporting "0 tests passed" because the regex did not properly handle ANSI escape codes in the test output. The issue: Test output contains "Passed: \x1b[0;32m43\x1b[0m" but the old regex looked for literal "[0-9;m" patterns without the ESC character. The fix: Strip ANSI escape codes using sed before parsing the numbers. This ensures the test runner correctly reports the total test count (currently 357 tests across 9 suites).
Changes:
1. Fixed ANSI escape code stripping in run-all-tests.sh to be portable
across GNU (Linux) and BSD (macOS) sed implementations.
- Changed from \x1b (GNU-specific) to $'\033' (ANSI-C quoting)
- Reused output_stripped variable for efficiency
- Removed redundant ${var:-0} fallback (|| echo "0" already handles it)
2. Added test-ansi-parsing.sh with 8 regression tests covering:
- Basic ANSI color stripping
- Multiple colors in one line
- Passed/Failed count extraction from colored output
- Zero count handling
- Complex multi-line output parsing
- Plain text (no ANSI codes) handling
- Bold+color combined codes
3. Added test-ansi-parsing.sh to the test suite list.
Test results: 365 tests passed (357 original + 8 new), 0 failed.
Changed from exploratory/documentation script to proper assertion-based tests with pass/fail counting: - Added 12 structured tests covering error scenarios - Tests now use pass/fail functions matching other test files - Reports test counts in standard format for run-all-tests.sh parsing - Removed redundant set +e/set -e pairs (script uses set -uo pipefail) - Applied code-simplifier suggestions for cleaner conditionals Test coverage now includes: 1. Template file not found 2. Template directory not found 3. load_and_render with missing template 4. render_template with empty content 5. Special regex characters in values 6. Strict mode (set -euo pipefail) behavior 7-8. load_and_render_safe fallback behavior 9-10. validate_template_dir validation 11-12. Edge cases (empty/unclosed placeholders) Test results: 377 tests passed (365 previous + 12 new), 0 failed.
The script uses `set -uo pipefail` (without -e), but Test 6 was using `set +e` before and `set -e` after a subshell, incorrectly enabling errexit which was never originally enabled. Changes: - Replaced set +e/set -e with `|| true` after command substitution - Removed unused EXIT_CODE variable (|| true always returns 0) - Simplified test condition to only check for SCRIPT_COMPLETED output - Added clarifying comment about the || true purpose The test verifies that scripts using strict mode can safely call template-loader functions without crashing. Test results: 377 tests passed, 0 failed.
Add is_allowlisted_file() helper to permit specific round files that are required by Codex review process: - round-1-todos.md, round-2-todos.md: Persisted todos lists - round-0-summary.md, round-1-summary.md: Historical accuracy updates Update all validator hooks (read, write, edit, bash) to check allowlist before blocking access to these specific files. Files modified: - hooks/lib/loop-common.sh: Add is_allowlisted_file() function - hooks/loop-read-validator.sh: Check allowlist for todos and summaries - hooks/loop-write-validator.sh: Check allowlist for todos and summaries - hooks/loop-edit-validator.sh: Check allowlist for todos and summaries - hooks/loop-bash-validator.sh: Check allowlist for round-1/2-todos.md
Improvements: - Bash validator: Require active loop dir path for round-1/2-todos.md (fixes permissive allowlist that allowed any path) - Add test-allowlist-validators.sh with 24 tests covering: - is_allowlisted_file() function - Read/Write/Edit validator allowlist behavior - Bash validator path-restricted allowlist - Add test-allowlist-validators.sh to run-all-tests.sh Test results: 401 tests passed across 11 suites Files modified: - hooks/loop-bash-validator.sh: Tighten allowlist path check - tests/run-all-tests.sh: Add test-allowlist-validators.sh - tests/test-allowlist-validators.sh: New test file (24 tests)
- Changed bash validator to use full ACTIVE_LOOP_DIR path instead of basename-only check to prevent same-basename bypass from different roots - Added regex escaping for special characters in paths - Added Test 25 for same-basename different-root security scenario - Code-simplifier: consolidated comments, fixed regex escape pattern
|
Claude finished @SihaoLiu's task in 2m 31s —— View job PR Review Complete
Code Review SummaryThis is a substantial refactoring PR with 2440 additions and 195 deletions across 24 files. The changes focus on fail-closed security patterns, state parsing migration, and comprehensive test coverage. Positive Aspects1. Security - Fail-Closed Handling (Excellent) The most critical fix addresses the Codex feedback about git status timeout handling. The implementation in GIT_STATUS_EXIT=0
GIT_STATUS_CACHED=$(run_with_timeout "$GIT_TIMEOUT" git status --porcelain 2>/dev/null) || GIT_STATUS_EXIT=$?
if [[ $GIT_STATUS_EXIT -ne 0 ]]; then
# Git status failed or timed out - fail-closed by blocking exit
...
jq -n --arg reason "$REASON" --arg msg "Loop: Blocked - git status failed (exit $GIT_STATUS_EXIT)" \
'{"decision": "block", "reason": $reason, "systemMessage": $msg}'
exit 0
fiThis replaces the previous fail-open pattern ( 2. Code Quality - Shared Library ( The centralization of common functions is well-done:
3. Template System ( The single-pass template rendering prevents placeholder injection attacks where user content containing 4. Comprehensive Test Coverage New test files added:
5. Input Validation Good security practices in
CLAUDE.md Compliance
Minor Observations (Not Issues)
CI StatusAll 8 CI checks are passing:
Verdict: This PR implements critical security improvements with proper fail-closed semantics, adds comprehensive test coverage, and maintains good code quality. The changes are well-structured and follow the project's guidelines. Approved for merge. |
|
Codex Review: Didn't find any major issues. 🚀 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Fixes based on Round 0 Codex review: HIGH - Haiku sub-agent invocation (AC-2.3): - Changed from generic "general-purpose" subagent to explicit "draft-relevance-checker" agent reference - Added clear Task tool parameters documentation - Referenced the custom agent in agents/draft-relevance-checker.md HIGH - Missing positive/negative test descriptions (AC-6): - Added TDD-style test format to Acceptance Criteria template - Each AC now includes Positive Tests (expected to PASS) and Negative Tests (expected to FAIL) sections - Added Generation Rule #8 enforcing TDD-style tests MEDIUM - Path boundary phrasing (AC-6.3): - Changed from negative phrasing ("What should NOT be done") to affirmative descriptions ("Maximum Acceptable Scope") - Added examples showing affirmative descriptions - Added Generation Rule #9 for affirmative path boundaries MEDIUM - Narrow path rule for deterministic drafts (AC-6.3): - Added "Note on Deterministic Designs" explaining when upper and lower bounds should converge - Added Generation Rule #10 for respecting fixed approaches
…ns (v1.3.0) (#17) * Feature: Add gen-plan command to transform drafts into structured plans (v1.2.4) Add a new slash command /humanize:gen-plan that transforms user draft documents into well-structured implementation plans with acceptance criteria (AC-X format). New files: - commands/gen-plan.md: Slash command entry point - skills/gen-plan/SKILL.md: Main skill implementation with opus model - scripts/validate-gen-plan-io.sh: IO path validation script - agents/draft-relevance-checker.md: Haiku-based relevance checker Workflow: 1. Validates input/output paths with clear error messages 2. Uses haiku sub-agent to check draft relevance to repository 3. Analyzes draft for clarity, consistency, completeness, functionality 4. Engages user via AskUserQuestion to resolve issues 5. Generates plan.md with goals, ACs, path boundaries, and suggestions Additional changes: - Unified model names to use short aliases (opus, sonnet, haiku) across all skills - Updated test validation to accept short model aliases Updated version to 1.2.4 in plugin.json, marketplace.json, and README.md. * Fix: Address Codex review findings for gen-plan skill Fixes based on Round 0 Codex review: HIGH - Haiku sub-agent invocation (AC-2.3): - Changed from generic "general-purpose" subagent to explicit "draft-relevance-checker" agent reference - Added clear Task tool parameters documentation - Referenced the custom agent in agents/draft-relevance-checker.md HIGH - Missing positive/negative test descriptions (AC-6): - Added TDD-style test format to Acceptance Criteria template - Each AC now includes Positive Tests (expected to PASS) and Negative Tests (expected to FAIL) sections - Added Generation Rule #8 enforcing TDD-style tests MEDIUM - Path boundary phrasing (AC-6.3): - Changed from negative phrasing ("What should NOT be done") to affirmative descriptions ("Maximum Acceptable Scope") - Added examples showing affirmative descriptions - Added Generation Rule #9 for affirmative path boundaries MEDIUM - Narrow path rule for deterministic drafts (AC-6.3): - Added "Note on Deterministic Designs" explaining when upper and lower bounds should converge - Added Generation Rule #10 for respecting fixed approaches * Simplify: Clean up gen-plan implementation (Finalize Phase) Code simplification after Codex review approval: commands/gen-plan.md: - Simplified trailing description to single sentence - Consistent with other command files in the project scripts/validate-gen-plan-io.sh: - Removed unnecessary empty echo statements - Consolidated error messages for conciseness - 11 lines removed, functionality unchanged skills/gen-plan/SKILL.md: - Simplified Phase 6 from numbered list to concise paragraph - 4 lines removed, same information preserved All 49 tests pass. Validation script tested and works correctly. * Bump version to 1.3.0 for gen-plan feature release Version bump from 1.2.4 to 1.3.0 for the new gen-plan command feature. * Fix: Validate argument values before shift to ensure correct exit codes Bug fix: - When --input or --output is passed without a value, shift 2 would fail under set -e and exit with code 1 (mapped to "input not found") instead of exit code 6 ("invalid arguments") - Added checks for missing argument values before shift 2 - Also catches flags followed by another flag (e.g., --input --output) Test enhancement: - Added 10 comprehensive tests for validate-gen-plan-io.sh exit codes - Tests cover: missing values, unknown options, all validation exit codes - Total tests now: 59 (was 49) * Fix: Model validation requires exact match for short aliases The regex was treating opus/sonnet/haiku as prefixes, accepting invalid values like opus-v2, haiku123, or sonnet-fast. Fix: - Short aliases (opus, sonnet, haiku) now require exact match via ^...$ - Full model IDs (claude-*, gpt-*, etc.) still use prefix matching Added tests: - NT-6d: Rejects opus-v2 (partial match) - NT-6e: Rejects haiku123 (partial match) - NT-6f: Rejects sonnet-fast (partial match) Total tests: 62 * Fix: Reject directory paths for --output parameter Bug fix: - If --output pointed to an existing directory (e.g., /tmp/ instead of /tmp/plan.md), validation passed but Write would fail at runtime - Now rejects any existing path at output location (file, directory, symlink) - Added specific error message when output is a directory with suggestion Changes: - Check 4 now uses -d to detect directories with clear error message - Check 4 uses -e to catch any existing path (not just regular files) Added test: - validate-gen-plan-io: output is directory exits 4 Total tests: 63
Summary