
Pipeline Design 364

ezigus edited this page Apr 13, 2026 · 1 revision

ADR written to .claude/pipeline-artifacts/design.md.

Key architectural decisions:

  1. Per-finding SHA stamp for JSON — matches the existing compound-audit.sh:556 pattern exactly, preserving all .[] | select(...) read sites. Wrapper-object and sidecar approaches were rejected for higher blast radius.

  2. Fail-open everywhere: _pipeline_head_sha() returns "" on error, and pipeline_artifact_is_current() returns 0 (pass-through) when the SHA is absent or extraction fails. The pipeline never blocks on SHA infrastructure issues.

  3. Prefix matching for SHA comparison — handles variable git rev-parse --short lengths (7-12 chars).

  4. 2 files modified, 0 created: pipeline-quality-checks.sh gets the write-side helpers + 5 stamp sites; pipeline-intelligence.sh gets the read-side guards (9 sites) + cycle file cleanup (4 call sites). compound-audit.sh is untouched.

  5. Cleanup at all exit paths: _cleanup_cycle_files() is called at stage entry and at all 3 return statements in stage_compound_quality() to prevent orphaned negative-review-cycle*.md files.

compound-audit.sh:556 already stamps each finding with created_at_commit using jq --arg c "$current_commit" '[.[] | . + {created_at_commit: $c}]'. The new work extends this pattern to all 5 artifact write paths and adds read-side validation.

Constraints:

  • Bash 3.2 compatible (no associative arrays, no ${var,,})
  • set -euo pipefail throughout
  • Artifacts are JSON arrays consumed via .[] | select(...) — wrapper objects break 3+ read sites
  • Must be backward compatible: artifacts without SHA stamps must pass through
  • Existing pipeline_artifact_is_fresh() handles time-based staleness (run epoch); SHA anchoring adds commit-level freshness as a complementary mechanism

Decision

SHA anchoring: Stamp created_at_commit into every quality artifact at write time. Guard every artifact read with a new pipeline_artifact_is_current() validator that compares the stamped SHA against git rev-parse --short HEAD. Stale artifacts are skipped (treated as absent). Artifacts without a SHA stamp pass through (backward compat).

Format choice: Per-finding stamp for JSON (Approach A) — each array element gets created_at_commit: "<sha>" added via jq. This preserves the .[].field access pattern used at 3+ read sites. For markdown and log files, a metadata line is prepended (created_at_commit: <sha> for .md, # created_at_commit: <sha> for .log).
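The per-finding stamp can be illustrated with a minimal sketch. The jq filter is the one quoted in this document; the helper names stamp_json_findings and count_critical, and the severity field, are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: add created_at_commit to every element of a JSON
# array read from stdin, and write the stamped array to stdout.
stamp_json_findings() {
  local sha="$1"
  jq --arg c "$sha" '[.[] | . + {created_at_commit: $c}]'
}

# Hypothetical read site: the artifact is still a top-level array, so a
# .[] | select(...) style filter works on it without modification.
# The "severity" field is an assumption made for this example.
count_critical() {
  jq '[.[] | select(.severity == "critical")] | length'
}

# Usage (requires jq):
#   echo '[{"severity":"critical"},{"severity":"low"}]' \
#     | stamp_json_findings ab12cd3 | count_critical
```

Because the stamp is merged into each element, a reader that ignores created_at_commit sees the same array shape as before, which is the property the wrapper-object alternative would lose.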

Cycle file cleanup: A _cleanup_cycle_files() function removes negative-review-cycle*.md at stage entry and all exit paths of stage_compound_quality().

Fail-open design: Both _pipeline_head_sha() and pipeline_artifact_is_current() fail open — if git is unavailable, the repo is in detached HEAD, or extraction fails, artifacts pass through. The pipeline never blocks on a SHA infrastructure failure.
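A minimal sketch of the fail-open SHA lookup, assuming nothing beyond git being on PATH when available. The function body is an illustration of the described behavior, not the project's actual implementation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: print the short HEAD SHA, or an empty line if git fails for any
# reason (not installed, not a repository, no commits yet).
# The "|| sha=..." guard keeps a git failure from tripping set -e.
_pipeline_head_sha() {
  local sha=""
  sha=$(git rev-parse --short HEAD 2>/dev/null) || sha=""
  printf '%s\n' "$sha"
}
```

Capturing the output and discarding it on failure matters here: without --verify, git rev-parse can echo an unresolvable argument back to stdout, so only the exit status is trusted.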

Component Diagram

┌─────────────────────────────────────────────────────────┐
│              pipeline-quality-checks.sh                  │
│                                                         │
│  _pipeline_head_sha()  ←── git rev-parse --short HEAD   │
│  pipeline_artifact_is_current(file)  ←── SHA comparison │
│                                                         │
│  quality_check_security()     ──→ security-audit.log    │
│  quality_check_adversarial()  ──→ adversarial-review.*  │
│  quality_check_negative()     ──→ negative-review.md    │
│  quality_check_dod()          ──→ dod-audit.md          │
│         ▲ all stamp created_at_commit at write time     │
└───────────────────────┬─────────────────────────────────┘
                        │ sources / calls
┌───────────────────────▼─────────────────────────────────┐
│              pipeline-intelligence.sh                    │
│                                                         │
│  _extract_blocking_items()                              │
│    guards: adversarial-review.json  ─┐                  │
│            adversarial-review.md     │ pipeline_artifact │
│            security-audit.log       ├─ _is_current()    │
│            negative-review.md       │ before reading     │
│            dod-audit.md             │                    │
│            classified-findings.json ─┘                  │
│                                                         │
│  convergence detection (lines 1826-1864)                │
│    guards: adversarial-review.md  ──→ is_current()      │
│            adversarial-review.json ──→ is_current()     │
│            negative-review.md  ──→ is_current()         │
│                                                         │
│  _cleanup_cycle_files()  ──→ rm negative-review-cycle*  │
│  stage_compound_quality()  calls cleanup at entry+exits │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│              compound-audit.sh (UNCHANGED)               │
│  line 556: existing created_at_commit pattern            │
│  (per-finding jq stamp — our JSON format matches this)  │
└─────────────────────────────────────────────────────────┘

Interface Contracts

# Returns HEAD short SHA (length varies, typically 7-12 chars), or "" on failure/detached HEAD/shallow clone.
# Never exits with non-zero. Pure function, no side effects.
_pipeline_head_sha() -> string  # "" | "ab12cd34"

# Returns 0 if artifact's stamped SHA matches current HEAD (prefix match).
# Returns 0 (pass-through) when: no SHA found in artifact, HEAD unavailable, file missing.
# Returns 1 when: SHA found but doesn't match HEAD.
#
# Format detection by file extension:
#   .json  → reads .[0].created_at_commit via jq
#   .md    → reads "created_at_commit: <sha>" from first 5 lines via sed
#   .log   → reads "# created_at_commit: <sha>" from first 3 lines via sed
pipeline_artifact_is_current(file: path) -> 0|1

# Removes all negative-review-cycle*.md from ARTIFACTS_DIR.
# No-op if none exist. Never errors.
_cleanup_cycle_files() -> void
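The validator contract above could be realized roughly as follows. This is a sketch under stated assumptions: the sed expressions and the bidirectional prefix comparison are one possible reading of the contract, and the bundled _pipeline_head_sha is a stand-in so the example is self-contained.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the real helper so this sketch runs on its own.
_pipeline_head_sha() {
  local sha=""
  sha=$(git rev-parse --short HEAD 2>/dev/null) || sha=""
  printf '%s\n' "$sha"
}

# Returns 1 only when a stamp is found and it does not match HEAD.
# Every other outcome (missing file, no HEAD, no stamp, tool failure)
# returns 0 so the caller proceeds as before.
pipeline_artifact_is_current() {
  local file="$1" head_sha="" stamped=""
  [ -f "$file" ] || return 0
  head_sha=$(_pipeline_head_sha) || head_sha=""
  [ -n "$head_sha" ] || return 0

  case "$file" in
    *.json) stamped=$(jq -r '.[0].created_at_commit // empty' "$file" 2>/dev/null) || stamped="" ;;
    *.md)   stamped=$(sed -n '1,5s/^created_at_commit: *//p' "$file" 2>/dev/null | head -n 1) || stamped="" ;;
    *.log)  stamped=$(sed -n '1,3s/^# created_at_commit: *//p' "$file" 2>/dev/null | head -n 1) || stamped="" ;;
  esac
  [ -n "$stamped" ] || return 0

  # Prefix match in either direction, since short SHA lengths can differ.
  case "$head_sha" in "$stamped"*) return 0 ;; esac
  case "$stamped" in "$head_sha"*) return 0 ;; esac
  return 1
}
```

Since the function returns 1 for stale artifacts, callers running under set -e would need to invoke it inside an if or with && / ||, for example: if pipeline_artifact_is_current "$f"; then ...; fi.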

Error contracts:

  • _pipeline_head_sha(): swallows all git errors, returns ""
  • pipeline_artifact_is_current(): returns 0 on any extraction failure (fail-open for backward compat)
  • _cleanup_cycle_files(): rm -f ... 2>/dev/null || true
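The cleanup contract is small enough to sketch in full. The default value given to ARTIFACTS_DIR below is a placeholder assumption for the example; the real pipeline is expected to set it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder default, assumed only for this sketch.
ARTIFACTS_DIR="${ARTIFACTS_DIR:-.claude/pipeline-artifacts}"

# Remove per-cycle negative review files. If the glob matches nothing,
# rm -f receives the literal pattern and silently does nothing.
_cleanup_cycle_files() {
  rm -f "$ARTIFACTS_DIR"/negative-review-cycle*.md 2>/dev/null || true
}
```

The glob is deliberately narrower than negative-review*.md, so the main negative-review.md artifact is left in place.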

Data Flow

WRITE PATH (quality_check_* functions):
  1. quality_check_X() produces artifact content
  2. Calls sha=$(_pipeline_head_sha)
  3. Stamps created_at_commit into artifact:
     - JSON: jq --arg c "$sha" '[.[] | . + {created_at_commit: $c}]'
     - MD:   { echo "created_at_commit: $sha"; cat "$file"; } > tmp && mv tmp "$file"
     - LOG:  { echo "# created_at_commit: $sha"; cat "$file"; } > tmp && mv tmp "$file"

READ PATH (_extract_blocking_items):
  1. Check -f "$artifact_file"
  2. Check pipeline_artifact_is_current "$artifact_file"
  3. If stale (returns 1), skip — treat as absent
  4. If current (returns 0), read and extract findings

CONVERGENCE PATH (lines 1826-1864):
  Same pattern as read path — guard with is_current() before counting issues

CLEANUP PATH:
  stage_compound_quality() entry → _cleanup_cycle_files()
  stage_compound_quality() return 0 → _cleanup_cycle_files()
  stage_compound_quality() return 1 (gate failed, ~2175) → _cleanup_cycle_files()
  stage_compound_quality() return 1 (exhausted, ~2216) → _cleanup_cycle_files()
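
The MD and LOG steps of the write path can be sketched as a single helper. The name _stamp_text_artifact, its argument order, and the choice to skip stamping when the SHA is empty are all assumptions made for this example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: prepend a created_at_commit line to a text artifact.
#   $1 = artifact path
#   $2 = SHA to stamp
#   $3 = line prefix ("" for .md, "# " for .log)
# Writes to a temp file first and renames it, so the artifact is never
# left truncated. Always returns 0 (fail-open).
_stamp_text_artifact() {
  local file="$1" sha="$2" prefix="${3:-}" tmp=""
  [ -f "$file" ] || return 0
  [ -n "$sha" ] || return 0   # sketch choice: no SHA, leave file unstamped
  tmp="${file}.tmp.$$"
  if { printf '%screated_at_commit: %s\n' "$prefix" "$sha"; cat "$file"; } > "$tmp" 2>/dev/null; then
    mv "$tmp" "$file" 2>/dev/null || rm -f "$tmp"
  else
    rm -f "$tmp"
  fi
  return 0
}

# Usage:
#   _stamp_text_artifact "$ARTIFACTS_DIR/negative-review.md" "$sha"
#   _stamp_text_artifact "$ARTIFACTS_DIR/security-audit.log" "$sha" "# "
```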

Error Boundaries

| Component | Error | Handling |
|---|---|---|
| _pipeline_head_sha() | git not available, detached HEAD, shallow clone | Returns ""; downstream treats it as pass-through |
| pipeline_artifact_is_current() | jq fails, sed fails, malformed artifact | Returns 0 (pass-through); never blocks the pipeline on an extraction error |
| pipeline_artifact_is_current() | File doesn't exist | Returns 0; caller already checked -f |
| SHA stamp writes | jq not available | Already a project dependency; guarded with 2>/dev/null |
| _cleanup_cycle_files() | Permission denied, no files | rm -f ... 2>/dev/null \|\| true |

Alternatives Considered

  1. Wrapper-object JSON format ({created_at_commit: "...", findings: [...]}) — Pros: Clean top-level field, single read for SHA. Cons: Breaks all .[] | select(...) reads at 3+ sites (_extract_blocking_items lines 1053, convergence at 1833, compound-audit.sh). Larger blast radius. Rejected.

  2. Sidecar .sha files (e.g., adversarial-review.json.sha) — Pros: Zero format change to existing artifacts. Cons: Doubles file count in artifacts directory; introduces race conditions between artifact write and sidecar write; orphaned sidecar risk on partial failure. Rejected.

  3. Epoch-based freshness only (extend existing pipeline_artifact_is_fresh()) — Pros: Already implemented for time-based staleness. Cons: Time-based checks don't detect code changes within the same pipeline run; a commit between cycles within the same epoch window would be missed. Complementary but insufficient. Rejected as sole mechanism.

Implementation Plan

  • Files to create: None
  • Files to modify:
    • scripts/lib/pipeline-quality-checks.sh — add _pipeline_head_sha() and pipeline_artifact_is_current() after line 74; stamp SHA in quality_check_security (~line 102), quality_check_adversarial (~lines 659, 675, 726), quality_check_negative (~line 815), quality_check_dod (~line 962)
    • scripts/lib/pipeline-intelligence.sh — guard reads in _extract_blocking_items() (lines 1051, 1060, 1070, 1079, 1085, 1094); guard convergence reads (lines 1826, 1831, 1836); add _cleanup_cycle_files(); wire cleanup into stage_compound_quality() at entry and return points (~2175, ~2204, ~2216)
  • Dependencies: None new (jq, git, sed already required)
  • Risk areas:
    • JSON stamp must use the exact jq pattern from compound-audit.sh:556 to maintain consistency
    • MD/LOG prepend must use atomic tmp+mv pattern (not echo > which truncates) per project conventions
    • pipeline_artifact_is_current() must use prefix matching for SHA comparison — git rev-parse --short length can vary (7-12 chars depending on repo size and git version)
    • Convergence detection (lines 1826-1864) has complex cross-cycle dedup logic for negative reviews — the SHA guard goes outside that logic (at the -f check), not inside it
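
The prefix-matching risk can be made concrete with a short sketch. The helper name _sha_prefix_match is hypothetical; it uses only case patterns, which keeps it compatible with Bash 3.2.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed when either SHA is a prefix of the other,
# so a 7-char stamp still matches a 12-char HEAD and vice versa.
# Empty inputs never match; the caller decides how to treat them.
_sha_prefix_match() {
  local a="$1" b="$2"
  [ -n "$a" ] && [ -n "$b" ] || return 1
  case "$a" in "$b"*) return 0 ;; esac
  case "$b" in "$a"*) return 0 ;; esac
  return 1
}

# Usage:
#   _sha_prefix_match ab12cd3 ab12cd34ef56 && echo "same commit (by prefix)"
```

A strict string comparison would report a stale artifact whenever the abbreviation length changed between the write and the read, which is the failure mode this comparison is meant to avoid.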

Validation Criteria

  • _pipeline_head_sha() returns a short SHA (7-12 chars) or an empty string; never exits non-zero
  • pipeline_artifact_is_current() returns 0 for artifacts without SHA field (backward compat)
  • pipeline_artifact_is_current() returns 0 when HEAD is unavailable (empty SHA = pass-through)
  • pipeline_artifact_is_current() returns 1 when artifact SHA differs from HEAD
  • pipeline_artifact_is_current() correctly parses all 3 formats: .json, .md, .log
  • All 5 artifact write paths stamp created_at_commit
  • JSON stamp uses same jq pattern as compound-audit.sh:556
  • _extract_blocking_items() skips stale artifacts (6 read sites guarded)
  • Convergence detection skips stale artifacts (3 read sites guarded)
  • negative-review-cycle*.md files are removed at stage entry and all 3 exit paths
  • ./scripts/sw-pipeline-test.sh passes
  • npm test passes
  • No changes to compound-audit.sh
