Skip to content

Pipeline Plan 232

Seth Ford edited this page Mar 9, 2026 · 2 revisions

All three exploration agents have completed. The findings confirm the plan's approach is well-aligned with existing codebase patterns. The plan is complete and saved to .claude/pipeline-artifacts/plan.md. a new library module, called during stage_intake(), that checks test file existence/executability/harness patterns, optionally runs shellcheck, writes a JSON report, and fails the intake stage with actionable errors if validation fails.

Implicit requirements:

  • Must handle projects with NO test files yet (new projects) — the build stage may create them
  • Must distinguish Shipwright's bash test harness from other test frameworks (pytest, jest, etc.)
  • Must degrade gracefully when shellcheck is not installed
  • Must not break existing pipelines that work fine today

Acceptance criteria (from issue): validate_test_infrastructure() function, test file checks (exists, executable, harness patterns), shellcheck validation, mock directory validation, JSON report to pipeline-artifacts/, fail intake on validation failure, --skip-test-validation flag, comprehensive test suite.

Alternatives Considered

Option A: Inline validation in stage_intake() — Add 50-80 lines directly to stage_intake().

  • Pros: Simple, no new files, minimal blast radius
  • Cons: stage_intake() already 119 lines, harder to test independently, mixes concerns
  • Blast radius: Low (one file)

Option B: New pipeline library module — Create scripts/lib/pipeline-test-validation.sh with guard loading.

  • Pros: Follows existing module pattern (pipeline-detection.sh, pipeline-quality-checks.sh), independently testable, clean separation of concerns, reusable from other stages
  • Cons: One more module to source, slightly more files to touch
  • Blast radius: Low (new file + 3 small edits)

Option C: Extend pipeline-detection.sh — Add to existing detection module since it has detect_test_cmd().

  • Pros: Keeps detection logic together
  • Cons: pipeline-detection.sh would grow significantly, mixes detection (what) with validation (is it correct)
  • Blast radius: Medium (modifying a widely-sourced file)

Chosen: Option B — New module follows existing conventions, minimizes risk to existing code, is independently testable.

Risk Analysis

Risk Impact Mitigation
False positives on non-Shipwright projects Blocks valid pipelines Only enforce bash harness checks when test files are .sh; use "warning" vs "error" severity
shellcheck not installed Validation crashes Check command -v shellcheck first; skip with info message
New projects with no tests Intake fails unnecessarily Treat "no test infrastructure found" as warning, not error; only fail if files exist but are broken
Existing working pipelines break Regression --skip-test-validation flag as escape hatch; conservative error thresholds
Mock directory check too strict False positives Only check mock dirs if test file references them; pattern-based, not exact-path

Definition of Done

  1. validate_test_infrastructure() function exists and is callable from intake stage
  2. Running a pipeline with broken test files fails at intake with clear error message
  3. Running a pipeline with valid test files passes validation
  4. Running with --skip-test-validation bypasses all checks
  5. test-validation.json report written to .claude/pipeline-artifacts/
  6. shellcheck integration works when available, degrades gracefully when not
  7. Test suite (sw-test-validation-test.sh) covers all positive and negative cases
  8. Registered in package.json test scripts

Context

Failed pipelines often waste significant time and cost (build duration ~4133s per metrics) because broken or missing test harnesses are only discovered late in the build-test loop. A pre-flight validation gate in the intake stage would fail fast, saving ~$5/pipeline run on average.

The Shipwright test harness convention requires: set -euo pipefail, PASS/FAIL counters, ERR trap, and (for complex tests) mock binary directories. Some tests source scripts/lib/test-helpers.sh which provides these automatically. Both patterns are valid.

Decision

Create scripts/lib/pipeline-test-validation.sh as a new guard-loaded module that validates test infrastructure before the build loop begins. The function runs during stage_intake() after test command auto-detection (step 3) but before branch creation (step 4).

Validation Checks (ordered by severity)

  1. Test command exists (ERROR): If TEST_CMD is set, verify the command binary exists on PATH
  2. Test file discovery (INFO): Find test files matching project conventions (.sh for bash, test directories for other frameworks)
  3. Bash test harness validation (ERROR for .sh test files):
    • File exists at expected path
    • File is executable (-x)
    • Contains set -euo pipefail (or set -e)
    • Contains PASS/FAIL counting (either inline PASS=0/FAIL=0 or source.*test-helpers.sh)
    • Contains ERR trap (trap.*ERR)
    • Basic syntax validation (bash -n)
  4. shellcheck validation (WARNING): Run shellcheck --shell=bash if available
  5. Mock directory structure (WARNING for bash tests): If test creates mock binaries, verify the pattern is valid ($TEST_TEMP_DIR/bin/ or equivalent)

Severity Levels

  • ERROR: Blocks intake stage (returns 1)
  • WARNING: Logged but doesn't block
  • INFO: Informational only

Report Format

{
  "timestamp": "2026-03-09T12:00:00Z",
  "valid": true|false,
  "test_cmd": "npm test",
  "test_files_found": 5,
  "checks": [
    {
      "file": "scripts/sw-hello-test.sh",
      "check": "executable",
      "severity": "error",
      "passed": true,
      "message": "File is executable"
    }
  ],
  "errors": 0,
  "warnings": 1,
  "skipped": false,
  "shellcheck_available": true
}

Implementation Plan

Files to Create

  • scripts/lib/pipeline-test-validation.sh — validation logic (~150 lines)
  • scripts/sw-test-validation-test.sh — test suite (~300 lines)

Files to Modify

  • scripts/lib/pipeline-cli.sh — add --skip-test-validation flag (~2 lines)
  • scripts/lib/pipeline-stages-intake.sh — call validate_test_infrastructure() (~10 lines)
  • scripts/lib/pipeline-stages.sh — source new module (~2 lines)
  • package.json — register test suite (~1 line)

Implementation Steps

  1. Create scripts/lib/pipeline-test-validation.sh:

    • Module guard pattern (_PIPELINE_TEST_VALIDATION_LOADED)
    • find_test_files() — discovers test files based on project type
    • validate_bash_test_harness() — checks a single .sh test file for harness patterns
    • validate_test_infrastructure() — main entry point, orchestrates all checks
    • _write_validation_report() — writes JSON to artifacts dir
    • Uses existing helpers: info(), warn(), error(), save_artifact(), emit_event()
  2. Add --skip-test-validation flag to pipeline-cli.sh:

    • Add SKIP_TEST_VALIDATION="${SKIP_TEST_VALIDATION:-false}" default
    • Add --skip-test-validation) SKIP_TEST_VALIDATION=true; shift ;; to parse_args
  3. Wire into stage_intake() in pipeline-stages-intake.sh:

    • After step 3 (test command detection), before step 4 (branch creation)
    • Check SKIP_TEST_VALIDATION flag
    • Call validate_test_infrastructure()
    • On failure: emit event, return 1
  4. Source new module in pipeline-stages.sh:

    • Add source line for pipeline-test-validation.sh
  5. Create test suite scripts/sw-test-validation-test.sh:

    • Source test-helpers.sh
    • Test valid bash test file passes all checks
    • Test non-executable file fails
    • Test missing PASS/FAIL counters detected
    • Test missing ERR trap detected
    • Test missing set -euo pipefail detected
    • Test test-helpers.sh sourcing accepted as valid
    • Test syntax errors detected (bash -n)
    • Test shellcheck integration (when available)
    • Test skip flag bypasses validation
    • Test JSON report format
    • Test non-bash projects (minimal validation)
    • Test no test files found (warning, not error)
  6. Register in package.json

Task Checklist

  • Task 1: Create scripts/lib/pipeline-test-validation.sh with validate_test_infrastructure(), find_test_files(), validate_bash_test_harness(), and _write_validation_report()
  • Task 2: Add --skip-test-validation flag to scripts/lib/pipeline-cli.sh parse_args
  • Task 3: Wire validation into stage_intake() in scripts/lib/pipeline-stages-intake.sh after test command detection
  • Task 4: Source new module from scripts/lib/pipeline-stages.sh
  • Task 5: Create test suite scripts/sw-test-validation-test.sh with positive and negative test cases
  • Task 6: Register test suite in package.json
  • Task 7: Run the test suite and fix any failures
  • Task 8: Run shellcheck on new files

Testing Approach

Test Pyramid Breakdown:

  • 15+ unit tests covering individual validation functions (find_test_files, validate_bash_test_harness)
  • 3-4 integration tests covering the full validate_test_infrastructure flow
  • 1 negative integration test: broken test file causes intake failure

Coverage Targets: 100% of validation check types (executable, harness patterns, syntax, shellcheck), both pass and fail paths.

Critical Paths:

  • Happy path: valid Shipwright test file passes all checks
  • Error: non-executable test file → ERROR severity → intake fails
  • Error: missing PASS/FAIL counters → ERROR severity
  • Edge: test-helpers.sh sourced → PASS/FAIL provided externally → valid
  • Edge: shellcheck not installed → graceful skip
  • Edge: no test files found → warning only (new project)
  • Edge: --skip-test-validation → bypass all checks

Validation Criteria

  • Pipeline with valid test files passes intake
  • Pipeline with broken test file (not executable) fails at intake with clear message
  • Pipeline with --skip-test-validation bypasses checks
  • test-validation.json written correctly
  • shellcheck runs when available, skipped gracefully when not
  • Test suite passes with 0 failures
  • No regressions in existing pipeline tests (sw-pipeline-test.sh)

Clone this wiki locally