Pipeline Plan 324

ezigus edited this page Apr 8, 2026 · 3 revisions

Implementation Plan: Close the Learning Feedback Loop

Overview

This issue closes the feedback loop by invoking ruflo_learn_from_shipwright() at two critical points in the pipeline lifecycle: when validation succeeds (end of stage_validate) and when failure occurs (failure path in stage_monitor). This enables the semantic recall system to learn from actual pipeline outcomes.


Requirements Analysis

Minimum Viable Change

Add two function calls (~10 lines of code per site), one at the end of stage_validate() and one on the failure path of stage_monitor(), each behind fail-open guards.

Implicit Requirements Identified

  • Outcome data (goal, issue, task_type, success/failure) must be passed consistently
  • Event emissions must use existing emit_event infrastructure
  • No blocking behavior — learning failures must not break the pipeline
  • Tests must validate both happy path (success) and error path (failure)
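
The non-blocking requirement can be sketched as a small wrapper. This is a minimal illustration, not code from the repository: `fail_open` and `broken_learner` are hypothetical names, and the sketch assumes the pipeline runs with `set -e`.

```shell
#!/usr/bin/env bash
set -e  # assumption: the pipeline runs with errexit enabled

# Hypothetical helper: run a command so that neither a non-zero status
# nor an explicit `exit` inside it can stop the caller.
fail_open() {
  # The subshell contains any `exit`; `|| true` discards the status.
  ( "$@" ) || true
}

# Stand-in for a learning call that fails in the worst possible way.
broken_learner() {
  echo "learning backend unreachable" >&2
  exit 3
}

fail_open broken_learner 2>/dev/null
echo "pipeline continues"
```

The subshell matters only if the wrapped function may call `exit`; a plain `|| true` is enough for a function that merely returns non-zero. The cost is that variables set inside the subshell do not reach the caller.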

Acceptance Criteria (Explicit)

✅ Call ruflo_learn_from_shipwright() at end of stage_validate() (success path)
✅ Call ruflo_learn_from_shipwright() on failure path in stage_monitor()
✅ Both wrapped with fail-open guards (ruflo_available() + || true)
✅ Event emissions: ruflo.learn_from_shipwright at each call site
✅ Existing tests remain green (sw-ruflo-adapter-test.sh)
✅ New tests: validate-success path + monitor-failure path
✅ Full test suite passes: npm test


Design Alternatives Considered

Alternative 1: Dual Call Sites (CHOSEN)

  • Call ruflo_learn_from_shipwright() at end of stage_validate() (success)
  • Call ruflo_learn_from_shipwright() on failure path in stage_monitor() (failure)

Pros:

  • Explicit success/failure semantics
  • Learning happens immediately when outcome is known
  • Matches issue's explicit requirements
  • Clear separation of concerns

Cons:

  • Two call sites to maintain
  • Must ensure consistent parameter passing

Blast Radius: Minimal (two isolated function calls, fail-open guards)
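
One way to address the "consistent parameter passing" concern is a single wrapper that both call sites share. The sketch below is an assumption-laden illustration: `learn_from_intake` is a hypothetical name, the stub stands in for the real ruflo_learn_from_shipwright, and the argument order is the one this plan expects rather than a confirmed signature.

```shell
#!/usr/bin/env bash
# Stand-in so the sketch runs on its own; the real function lives in
# scripts/lib/ruflo-adapter.sh and its argument order is assumed here.
ruflo_learn_from_shipwright() { printf 'learn:%s|%s|%s|%s|%s\n' "$@"; }

# Hypothetical wrapper: callers pass only the artifact, the outcome and an
# optional error context, so the argument order lives in one place.
learn_from_intake() {
  local artifact="$1" outcome="$2" error_context="${3:-}"
  local goal issue_number task_type
  [ -f "$artifact" ] || return 0   # nothing to learn from; stay fail-open
  # `// empty` yields "" for a missing field instead of the string "null".
  goal="$(jq -r '.goal // empty' "$artifact" 2>/dev/null)" || return 0
  issue_number="$(jq -r '.issue_number // empty' "$artifact" 2>/dev/null)" || return 0
  task_type="$(jq -r '.task_type // empty' "$artifact" 2>/dev/null)" || return 0
  ruflo_learn_from_shipwright "$goal" "$issue_number" "$task_type" \
    "$outcome" "$error_context" || true
}

# Possible use at the two call sites:
#   learn_from_intake "$PIPELINE_ARTIFACTS/intake.json" success
#   learn_from_intake "$PIPELINE_ARTIFACTS/intake.json" failure "$error_context"
```

With this shape the two call sites differ only in the outcome argument, which keeps the dual-call-site design while removing the duplicated jq extraction.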

Alternative 2: Single Unified Call in stage_monitor

Consolidate to one call site with outcome flag.

Pros:

  • Single entry point for learning
  • Easier to maintain

Cons:

  • Delays success-path learning until end of pipeline
  • More complex parameter construction
  • Violates issue's explicit requirement for validate-success call

Decision: Rejected

Alternative 3: Async Background Learning

Queue learning to a background worker to prevent blocking.

Pros:

  • Non-blocking
  • Scales better for expensive learning operations

Cons:

  • Adds complexity for minimal benefit
  • Existing function is lightweight

Decision: Rejected (over-engineering)
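
For comparison, the background variant that this alternative describes might look roughly like the sketch below. `learn_async` and `slow_learner` are hypothetical names used only to illustrate the idea; nothing here is part of the chosen design.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start the learning call in the background and
# return immediately; output is discarded so it cannot block on a pipe.
learn_async() {
  ( "$@" ) >/dev/null 2>&1 &
}

# Stand-in for a slow learning call that writes a marker when finished.
slow_learner() { sleep 1; echo "learned" > "$1"; }

marker="$(mktemp)"
learn_async slow_learner "$marker"
echo "pipeline continues without waiting"
wait            # a real pipeline would have to decide where (or whether) to wait
cat "$marker"
rm -f "$marker"
```

The sketch also shows where the extra complexity comes from: the caller has to decide when to `wait`, and a job still running when the pipeline exits can be lost without any trace.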

Risk Assessment & Mitigation

| Risk | Impact | Mitigation |
| --- | --- | --- |
| ruflo_learn_from_shipwright unavailable | Learning skipped silently; semantic recall stays stale | Guard with ruflo_available() and a trailing \|\| true (fail-open) |
| Outcome artifact missing or malformed | Function fails or receives wrong data | Check that the artifact exists and validate its JSON structure before passing it |
| Learning function exits non-zero or calls exit | Pipeline exits abnormally | Wrap the call in a ( ... ) \|\| true subshell; log the error |
| Event emission fails | Pipeline continues but telemetry is lost | Guard event emission with \|\| true |
| Test execution doesn't cover both paths | Acceptance criteria not met | Test the success and failure paths separately |

Task Decomposition

Phase 1: Code Analysis (Prerequisites)

  • Task 1.1: Read scripts/lib/ruflo-adapter.sh ~line 900 — understand function signature, parameters, return value
  • Task 1.2: Read scripts/lib/pipeline-stages-monitor.sh — locate stage_validate() and stage_monitor() functions
  • Task 1.3: Identify outcome artifact location and structure (stored by which stage, what fields)
  • Task 1.4: Verify ruflo.learn_from_shipwright event is registered in config/event-schema.json
  • Task 1.5: Locate existing fail-open pattern examples (e.g., pipeline-stages-build.sh lines 414–434)
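
Task 1.4 can be approximated without knowing how the schema file is laid out. The sketch below only searches for the quoted event name; `event_is_registered` is a hypothetical helper, and the grep pattern in the trailing comment assumes the stage functions are defined as `name()` at the start of a line.

```shell
#!/usr/bin/env bash
# Hypothetical check that makes no assumption about the schema's layout:
# it only looks for the quoted event name anywhere in the file.
event_is_registered() {
  local event="$1" schema="$2"
  [ -f "$schema" ] && grep -qF "\"$event\"" "$schema"
}

# Possible use against the paths named in this plan:
#   event_is_registered "ruflo.learn_from_shipwright" config/event-schema.json \
#     || echo "event missing from schema" >&2
#
# Locating both stage functions for Task 1.2:
#   grep -nE '^(stage_validate|stage_monitor)\(\)' scripts/lib/pipeline-stages-monitor.sh
```

A plain string match can produce false positives (for example, the name appearing in a description field), so it is a quick pre-check rather than a schema validation.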

Phase 2: Implementation (Call Site 1: Success Path)

  • Task 2.1: Add ruflo_learn_from_shipwright() call at end of stage_validate() with success outcome
  • Task 2.2: Wrap call with ruflo_available() guard + || true fallback
  • Task 2.3: Extract outcome data: goal, issue_number, task_type from intake artifact
  • Task 2.4: Emit ruflo.learn_from_shipwright event (outcome=success)
  • Task 2.5: Test locally: npm test -- sw-ruflo-adapter-test.sh

Phase 3: Implementation (Call Site 2: Failure Path)

  • Task 3.1: Identify failure path in stage_monitor() (rollback/error handling section)
  • Task 3.2: Add ruflo_learn_from_shipwright() call with failure outcome
  • Task 3.3: Wrap with fail-open guards
  • Task 3.4: Emit ruflo.learn_from_shipwright event (outcome=failure, error context)
  • Task 3.5: Test locally: npm test -- sw-ruflo-adapter-test.sh

Phase 4: Testing (Regression + New Tests)

  • Task 4.1: Run existing tests: npm test — verify no regressions
  • Task 4.2: Add test in sw-ruflo-adapter-test.sh: verify stage_validate success path calls learning
  • Task 4.3: Add test in sw-ruflo-adapter-test.sh: verify stage_monitor failure path calls learning
  • Task 4.4: Add test: verify fail-open behavior (learning unavailable → no error)
  • Task 4.5: Add test: verify event emissions at both call sites
  • Task 4.6: Run full test suite: npm test — all green

Phase 5: Verification & Documentation

  • Task 5.1: Create brief comment above each call site explaining why it's there
  • Task 5.2: Verify both calls extract outcome data correctly
  • Task 5.3: Manual walkthrough: trace data flow from pipeline artifact → function parameters
  • Task 5.4: Update any related documentation (if needed)

Testing Approach

Test Pyramid Breakdown

  • Unit Tests (70%):

    • Test ruflo_learn_from_shipwright() function in isolation (existing)
    • Mock ruflo_available() to return true/false
    • Verify parameter passing
    • Count: 4 existing + 2 new = 6 tests
  • Integration Tests (20%):

    • Test stage_validate() calls learning on success
    • Test stage_monitor() calls learning on failure
    • Verify event emissions
    • Mock pipeline artifacts
    • Count: 2 new tests
  • E2E Tests (10%):

    • Run full pipeline with learning enabled
    • Verify semantic recall uses learned outcomes
    • Count: 1 existing (implicit)

Coverage Targets

  • Critical Path 1 (Success): Happy path in stage_validate → learning called with correct outcome
  • Critical Path 2 (Failure): Failure path in stage_monitor → learning called with error context
  • Edge Case 1: ruflo unavailable → pipeline continues, learning skipped
  • Edge Case 2: Outcome artifact malformed → learning fails gracefully, pipeline continues
  • Edge Case 3: Event emission fails → learning still completes

Test Execution Plan

# Phase 1: Unit tests only
npm test -- sw-ruflo-adapter-test.sh

# Phase 2: All tests
npm test

Files to Modify

| File | Change | Lines |
| --- | --- | --- |
| scripts/lib/pipeline-stages-monitor.sh | Add ruflo_learn_from_shipwright() call at end of stage_validate() | TBD (depends on function location) |
| scripts/lib/pipeline-stages-monitor.sh | Add ruflo_learn_from_shipwright() call on failure path in stage_monitor() | TBD |
| scripts/sw-ruflo-adapter-test.sh | Add 3 new tests (validate-success, monitor-failure, fail-open) | ~40 lines |
| config/event-schema.json | Verify event is registered (no changes expected) | n/a |

Implementation Steps (Detailed)

Step 1: Read and Understand Function Signature

# Extract function definition from ruflo-adapter.sh
grep -A 30 "^ruflo_learn_from_shipwright()" scripts/lib/ruflo-adapter.sh

Expected signature (based on issue context):

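ruflo_learn_from_shipwright() {
  # Parameters (assumed order): goal, issue_number, task_type,
  #   outcome (success|failure), error_context (optional)
  # Namespace: learning-<repo_hash>
  :  # body omitted; see scripts/lib/ruflo-adapter.sh for the real definition
}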

Step 2: Add Call Site 1 — Success Path in stage_validate()

In scripts/lib/pipeline-stages-monitor.sh, at the end of stage_validate():

# Call learning function with success outcome (fail-open)
if ruflo_available; then
  local intake_artifact="$PIPELINE_ARTIFACTS/intake.json"
  if [[ -f "$intake_artifact" ]]; then
    # `// empty` passes "" for a missing field instead of the string "null"
    ruflo_learn_from_shipwright \
      "$(jq -r '.goal // empty' "$intake_artifact")" \
      "$(jq -r '.issue_number // empty' "$intake_artifact")" \
      "$(jq -r '.task_type // empty' "$intake_artifact")" \
      "success" \
      || true
    # Telemetry is best-effort too: a failed emission must not fail the stage
    emit_event "ruflo.learn_from_shipwright" "outcome=success" "stage=validate" || true
  fi
fi || true

Step 3: Add Call Site 2 — Failure Path in stage_monitor()

In scripts/lib/pipeline-stages-monitor.sh, on the failure/rollback path:

# Call learning function with failure outcome (fail-open)
if ruflo_available; then
  local intake_artifact="$PIPELINE_ARTIFACTS/intake.json"
  local error_summary="$PIPELINE_ARTIFACTS/error-summary.json"
  if [[ -f "$intake_artifact" ]]; then
    local error_context=""
    if [[ -f "$error_summary" ]]; then
      # A malformed summary degrades to an empty context instead of an error
      error_context="$(jq -c . "$error_summary" 2>/dev/null)" || error_context=""
    fi

    # `// empty` passes "" for a missing field instead of the string "null"
    ruflo_learn_from_shipwright \
      "$(jq -r '.goal // empty' "$intake_artifact")" \
      "$(jq -r '.issue_number // empty' "$intake_artifact")" \
      "$(jq -r '.task_type // empty' "$intake_artifact")" \
      "failure" \
      "$error_context" \
      || true
    # Telemetry is best-effort too: a failed emission must not mask the original failure
    emit_event "ruflo.learn_from_shipwright" "outcome=failure" "stage=monitor" "error_context=$error_context" || true
  fi
fi || true

Step 4: Add Tests

In scripts/sw-ruflo-adapter-test.sh:

test_ruflo_learn_called_on_validate_success() {
  # Mock outcome: stage_validate exits 0
  # Verify: ruflo_learn_from_shipwright was called with outcome=success
  # Verify: event was emitted
  return 1  # TODO: placeholder so the stub parses; fails until assertions exist
}

test_ruflo_learn_called_on_monitor_failure() {
  # Mock outcome: stage_monitor failure path triggered
  # Verify: ruflo_learn_from_shipwright was called with outcome=failure
  # Verify: event was emitted
  return 1  # TODO: placeholder so the stub parses; fails until assertions exist
}

test_ruflo_learn_fail_open_when_unavailable() {
  # Mock: ruflo_available returns 1 (unavailable)
  # Verify: pipeline continues (no error)
  # Verify: no event emitted
  return 1  # TODO: placeholder so the stub parses; fails until assertions exist
}
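
The three stubs above could be filled in with recording mocks along the following lines. This is a self-contained sketch, not the project's test harness: `call_site_under_test` is a hypothetical stand-in for the guarded block, the `LOG` file and `RUFLO_UP` variable are illustration devices, and a real test would source pipeline-stages-monitor.sh and invoke stage_validate or stage_monitor instead.

```shell
#!/usr/bin/env bash
LOG="$(mktemp)"   # every mock appends one line per invocation
RUFLO_UP=0        # exit status returned by the ruflo_available mock

# Recording mocks that replace the real functions for the test run.
ruflo_available()             { return "$RUFLO_UP"; }
ruflo_learn_from_shipwright() { echo "learn outcome=$4" >> "$LOG"; }
emit_event()                  { echo "event $1 $2" >> "$LOG"; }

# Hypothetical stand-in for the guarded block at either call site.
call_site_under_test() {
  if ruflo_available; then
    ruflo_learn_from_shipwright "goal" "324" "feature" "$1" || true
    emit_event "ruflo.learn_from_shipwright" "outcome=$1" || true
  fi
  return 0
}

test_learn_called_on_success() {
  : > "$LOG"; RUFLO_UP=0
  call_site_under_test success || return 1
  grep -qx "learn outcome=success" "$LOG" &&
    grep -qx "event ruflo.learn_from_shipwright outcome=success" "$LOG"
}

test_learn_called_on_failure() {
  : > "$LOG"; RUFLO_UP=0
  call_site_under_test failure || return 1
  grep -qx "learn outcome=failure" "$LOG" &&
    grep -qx "event ruflo.learn_from_shipwright outcome=failure" "$LOG"
}

test_fail_open_when_unavailable() {
  : > "$LOG"; RUFLO_UP=1
  call_site_under_test success || return 1   # pipeline continues
  [ ! -s "$LOG" ]                            # no learning call, no event
}

for t in test_learn_called_on_success test_learn_called_on_failure \
         test_fail_open_when_unavailable; do
  if "$t"; then echo "PASS $t"; else echo "FAIL $t"; fi
done
rm -f "$LOG"
```

Recording to a file rather than a variable keeps the mocks usable even if the code under test runs them inside a subshell, which matters if the call sites adopt the ( ... ) || true form.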

Definition of Done

  • Both ruflo_learn_from_shipwright() calls added to pipeline-stages-monitor.sh
    • Call 1: At end of stage_validate(), outcome=success
    • Call 2: On failure path in stage_monitor(), outcome=failure
  • Both calls are fail-open (guarded, || true fallback)
  • Both calls extract goal, issue_number, task_type from intake artifact
  • Events emitted: ruflo.learn_from_shipwright at each call site
  • New tests added to sw-ruflo-adapter-test.sh:
    • Test: validate-success path calls learning
    • Test: monitor-failure path calls learning
    • Test: fail-open behavior (unavailable → no error)
  • Existing tests pass: npm test -- sw-ruflo-adapter-test.sh
  • Full test suite passes: npm test
  • No regressions in other stages
  • Acceptance criteria met: All 7 criteria satisfied

Next Steps (After Plan Approval)

  1. Execute Phase 1 (Code Analysis) — read functions, understand data flow
  2. Execute Phase 2 (Success Call Site) — add, test, verify
  3. Execute Phase 3 (Failure Call Site) — add, test, verify
  4. Execute Phase 4 (Testing) — run full suite, add tests
  5. Execute Phase 5 (Verification) — walkthrough, documentation
  6. Create PR with both call sites, tests, and verification notes

Context Efficiency Notes

  • Will read ruflo-adapter.sh at line 900+ (targeted section, ~30 lines)
  • Will read pipeline-stages-monitor.sh for both functions (likely < 100 lines each)
  • Will grep for existing fail-open examples in pipeline-stages-build.sh (reference)
  • Will use jq for JSON parsing (consistent with existing patterns)

This plan is ready to execute once approved.
