
Pipeline Plan 324

ezigus edited this page Apr 8, 2026 · 3 revisions

Implementation Plan: Close the Learning Feedback Loop

Analysis & Socratic Refinement

Requirements Clarity

Minimum viable change: Add a call to ruflo_learn_from_shipwright() at pipeline completion, passing the pipeline outcome and metrics. The call must be fail-open: a failed or unavailable learning call never blocks pipeline completion.

Implicit requirements (inferred from codebase context):

  • Function should be called after all pipeline stages complete (success OR failure)
  • Must pass pipeline metadata: outcome (success/failure), issue number, branch, duration, stage failures
  • Must handle ruflo unavailability gracefully (log, don't crash)
  • Should integrate with existing .claude/pipeline-state.md and artifacts in .claude/pipeline-artifacts/

Acceptance criteria:

  1. ✅ ruflo_learn_from_shipwright() is called at pipeline completion (both paths)
  2. ✅ Learning call passes: issue #, goal, outcome, stage results, error summary (if failed)
  3. ✅ Failure path: Non-zero exit codes, error logs properly captured and passed
  4. ✅ Fail-open: Pipeline completion succeeds even if learning call fails
  5. ✅ Tests verify success path, failure path, and unavailable-ruflo edge case
  6. ✅ Existing tests remain green

Design Alternatives Considered

Option 1: Post-Completion Finalize Hook (CHOSEN)

Add finalize_and_learn() function in pipeline orchestrator that:

  • Collects pipeline outcome from pipeline-state.md and artifacts
  • Calls ruflo_learn_from_shipwright() with structured data
  • Logs result, doesn't propagate errors

Pros:

  • Centralized, single point of control
  • Clear lifecycle semantics
  • Easy to test both success and failure
  • Can be called from multiple exit paths (success, failure, timeout)

Cons:

  • Requires modifying core pipeline file
  • Slightly larger scope of changes

Blast radius: Low — adds new function, integrates at 2-3 call sites


Option 2: Event-Driven Listener (Rejected)

Emit pipeline:complete event, external listener calls learning.

Pros: Decoupled, extensible

Cons: Indirect, hard to debug, requires event bus infrastructure

Trade-off: Too much complexity for this feature


Option 3: Shell Hook in sw-pipeline.sh (Rejected)

Call learning function directly in shell after monitor stage.

Pros: Direct, minimal code

Cons: Shell-based, hard to handle errors, hard to pass structured data

Trade-off: Maintenance burden, less testable


Root Cause Analysis (Systematic Debugging)

Previous failures: None recorded; this is the initial plan stage.

Hypotheses to validate during build:

  1. ruflo_learn_from_shipwright() may not exist in Ruflo API — need to verify function signature
  2. Pipeline artifacts directory structure may vary — need robust path resolution
  3. Error handling in shell may suppress ruflo errors — need explicit logging

Risk Assessment & Mitigation

| Risk | Impact | Probability | Mitigation |
|---|---|---|---|
| Ruflo function unavailable or wrong signature | Learning call fails, pipeline blocks | Medium | Wrap in try-catch, log error, don't propagate |
| Pipeline artifacts incomplete when learning called | Missing data for learning | Medium | Call after all writes, validate artifacts exist |
| Learning function call hangs (timeout) | Pipeline blocked indefinitely | Low | Add timeout wrapper (e.g., 5s) |
| Error summary JSON malformed | Learning call fails parsing | Low | Validate JSON before passing, provide fallback |
| Backward compatibility: old pipelines without this | New code ignored | Low | Make learning call optional, check for function existence |
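
The timeout mitigation for a hanging learning call could be sketched as below. This is only an illustration: `withTimeout` is a hypothetical helper name, and the 5s value mentioned in the plan is an example rather than a settled number. Note that the wrapper stops waiting but does not cancel the underlying call.

```typescript
// Hypothetical helper (not part of any existing API): rejects if `promise`
// does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way so it
  // cannot keep the process alive after the call finishes.
  return Promise.race([promise, timeout]).then(
    (value) => {
      clearTimeout(timer);
      return value;
    },
    (err) => {
      clearTimeout(timer);
      throw err;
    },
  );
}
```

Inside a fail-open wrapper, the learning call would then be awaited as `withTimeout(ruflo_learn_from_shipwright(outcome), 5000)`, so a timeout surfaces as an ordinary rejection that the existing catch block logs.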

Files to Modify

  1. src/pipeline.ts (or equivalent orchestrator)

    • Add finalize_and_learn() function
    • Call from pipeline completion handlers (success, failure, timeout)
  2. src/pipeline-state.ts (if separate)

    • Ensure outcome, metrics exported properly
  3. tests/pipeline.test.ts

    • Unit tests for finalize_and_learn()
    • Integration tests for success/failure paths
  4. scripts/sw-pipeline.sh (if shell-based)

    • Add call site for finalize hook (fallback if there is no TypeScript orchestrator)

Implementation Steps

Phase 1: Interface & Data Collection (Tasks 1-3)

Task 1: Define ruflo_learn_from_shipwright() interface

  • Input: { issue: number, goal: string, outcome: 'success'|'failure', duration: number, stages: object, error?: object }
  • Returns: Promise<{ learned: boolean, reason?: string }>
  • Location: src/types/ruflo.ts (new file)
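
The interface in Task 1 might translate into types along these lines. The field names and types follow the plan; `LearnResult`, `RufloLearnFn`, and the `isPipelineOutcome` guard are hypothetical names added for illustration, and the duration unit is an assumption because the plan does not state one.

```typescript
// Field names and types come from the plan; everything else is assumed.
interface PipelineOutcome {
  issue: number;
  goal: string;
  outcome: "success" | "failure";
  duration: number; // unit not specified in the plan; milliseconds assumed
  stages: Record<string, unknown>; // the plan only says "object"
  error?: Record<string, unknown>; // expected only when outcome is "failure"
}

interface LearnResult {
  learned: boolean;
  reason?: string;
}

// Assumed shape of ruflo_learn_from_shipwright(); still to be verified
// against the real Ruflo API.
type RufloLearnFn = (outcome: PipelineOutcome) => Promise<LearnResult>;

// Runtime guard: data assembled from artifact files is not type-checked,
// so validate the shape before handing it to the learning call.
function isPipelineOutcome(value: unknown): value is PipelineOutcome {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["issue"] === "number" &&
    typeof v["goal"] === "string" &&
    (v["outcome"] === "success" || v["outcome"] === "failure") &&
    typeof v["duration"] === "number" &&
    typeof v["stages"] === "object" &&
    v["stages"] !== null &&
    (v["error"] === undefined ||
      (typeof v["error"] === "object" && v["error"] !== null))
  );
}
```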

Task 2: Create pipeline outcome collector

  • Function: collectPipelineOutcome(): PipelineOutcome
  • Reads: pipeline-state.md, error-summary.json, stage artifacts
  • Returns: structured outcome object matching ruflo interface
  • Location: src/utils/outcome-collector.ts
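
A minimal sketch of the collector's fallback behavior follows, under several stated assumptions. Unlike the zero-argument signature in the plan, this variant takes an injected `ArtifactReader` (a hypothetical type) so it can be exercised without touching the filesystem. The default values, and the rule that a present error summary implies a failed run, are illustrative guesses. The format of pipeline-state.md is not specified in the plan, so it is not parsed here.

```typescript
// Hypothetical reader: returns file contents, or undefined if the file is missing.
type ArtifactReader = (relativePath: string) => string | undefined;

interface PipelineOutcome {
  issue: number;
  goal: string;
  outcome: "success" | "failure";
  duration: number;
  stages: Record<string, unknown>;
  error?: Record<string, unknown>;
}

// Validate the error summary before passing it on; malformed JSON becomes
// a small placeholder object instead of an exception.
function parseErrorSummary(raw: string | undefined): Record<string, unknown> | undefined {
  if (raw === undefined) return undefined;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
    return { message: "error summary was not a JSON object" };
  } catch {
    return { message: "error summary could not be parsed" };
  }
}

function collectPipelineOutcome(
  read: ArtifactReader,
  known: Partial<PipelineOutcome> = {},
): PipelineOutcome {
  const error = parseErrorSummary(read("error-summary.json"));
  return {
    // Defaults below are assumptions, chosen only so that missing data never throws.
    issue: known.issue ?? 0,
    goal: known.goal ?? "",
    // Assumption: an error summary on disk means the run failed.
    outcome: known.outcome ?? (error !== undefined ? "failure" : "success"),
    duration: known.duration ?? 0,
    stages: known.stages ?? {},
    ...(error !== undefined ? { error } : {}),
  };
}
```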

Task 3: Add ruflo availability check

  • Function: isRufloAvailable(): boolean
  • Checks: whether ruflo_learn_from_shipwright exists in runtime
  • Safe fallback: returns false if unavailable, doesn't throw
  • Location: src/utils/ruflo-check.ts
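
One way the availability check could look is sketched below. How Ruflo actually exposes ruflo_learn_from_shipwright is not established in this plan, so the lookup on a global-like object is purely an assumption. The `runtime` parameter is a hypothetical addition that lets tests pass in a stand-in object.

```typescript
const LEARN_FN_NAME = "ruflo_learn_from_shipwright";

// Assumption: the learning function, when present, is reachable as a
// property of some runtime object (globalThis by default).
function isRufloAvailable(
  runtime: Record<string, unknown> = globalThis as unknown as Record<string, unknown>,
): boolean {
  try {
    return typeof runtime[LEARN_FN_NAME] === "function";
  } catch {
    // A throwing getter or proxy is treated as "unavailable", never as an error.
    return false;
  }
}
```

Returning false from the catch block keeps the "doesn't throw" guarantee even when the lookup itself misbehaves.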

Phase 2: Core Implementation (Tasks 4-6)

Task 4: Implement finalize_and_learn() function (named finalizeAndLearn in the TypeScript code)

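// Assumes logger, isRufloAvailable (Task 3), and ruflo_learn_from_shipwright (Task 1) are in scope.
async function finalizeAndLearn(outcome: PipelineOutcome): Promise<void> {
  try {
    if (!isRufloAvailable()) {
      logger.info('Ruflo not available, skipping learning');
      return;
    }
    const result = await ruflo_learn_from_shipwright(outcome);
    logger.info('Pipeline learning complete', { learned: result.learned });
  } catch (err) {
    // In strict TypeScript a caught value is `unknown`, so narrow before reading .message.
    const message = err instanceof Error ? err.message : String(err);
    logger.warn('Learning call failed (non-blocking)', { error: message });
    // Do NOT rethrow: fail-open behavior
  }
}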
  • Location: src/pipeline.ts
  • Error handling: wrapped in try-catch, logs warning, returns normally

Task 5: Integrate into pipeline success path

  • Call finalize_and_learn(outcome) after all stages complete successfully
  • Ensure call happens AFTER all artifact writes
  • Location: pipeline completion handler in src/pipeline.ts

Task 6: Integrate into pipeline failure path

  • Call finalize_and_learn(outcome) after failure is detected
  • Include error summary in outcome object
  • Location: error handler in src/pipeline.ts
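
Tasks 5 and 6 could also be served by a single shared call site, as in the rough sketch below. This is a design variation, not the plan's stated approach of separate success and failure handlers: routing both paths through one call makes "exactly once per run" hold by construction. `runPipeline`, `runStages`, the simplified `Outcome` type, and the 0/1 exit codes are all hypothetical.

```typescript
// Simplified stand-in for the full PipelineOutcome type.
type Outcome = { outcome: "success" | "failure"; error?: { message: string } };

async function runPipeline(
  runStages: () => Promise<void>,
  finalizeAndLearn: (outcome: Outcome) => Promise<void>,
): Promise<number> {
  let outcome: Outcome = { outcome: "success" };
  try {
    // All stages, including their artifact writes, finish before learning runs.
    await runStages();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    outcome = { outcome: "failure", error: { message } };
  }
  try {
    // Single call site reached from both the success and the failure path.
    await finalizeAndLearn(outcome);
  } catch {
    // Defensive fail-open: even a misbehaving learning hook cannot change the exit code.
  }
  // Assumed convention: 0 for success, 1 for failure.
  return outcome.outcome === "success" ? 0 : 1;
}
```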

Phase 3: Testing (Tasks 7-10)

Task 7: Unit test - success path

  • Mock ruflo_learn_from_shipwright() to return success
  • Verify called with correct outcome shape
  • Test: tests/pipeline.test.ts::finalizeAndLearn::success

Task 8: Unit test - failure path

  • Mock ruflo_learn_from_shipwright() to return success (the pipeline failed, but the learning call itself succeeds)
  • Pass outcome with error summary
  • Verify error data included
  • Test: tests/pipeline.test.ts::finalizeAndLearn::failure

Task 9: Unit test - ruflo unavailable

  • Mock isRufloAvailable() to return false
  • Verify no error thrown, just logged
  • Test: tests/pipeline.test.ts::finalizeAndLearn::unavailable
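
The three unit tests in Tasks 7-9 might be shaped roughly as follows. To stay framework-neutral, this sketch uses hand-rolled fakes rather than a specific mocking library, and it assumes a variant of finalizeAndLearn that receives its dependencies as an argument. `LearnDeps`, `makeFakes`, and `runChecks` are hypothetical names, and the outcome objects are sample values.

```typescript
interface LearnDeps {
  isAvailable: () => boolean;
  learn: (outcome: unknown) => Promise<{ learned: boolean }>;
  log: (level: "info" | "warn", message: string) => void;
}

// Dependency-injected variant of the Task 4 function (an assumption made for testability).
async function finalizeAndLearn(outcome: unknown, deps: LearnDeps): Promise<void> {
  try {
    if (!deps.isAvailable()) {
      deps.log("info", "Ruflo not available, skipping learning");
      return;
    }
    const result = await deps.learn(outcome);
    deps.log("info", `Pipeline learning complete (learned=${result.learned})`);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    deps.log("warn", `Learning call failed (non-blocking): ${message}`);
  }
}

// Fakes that record what the function under test did.
function makeFakes(available: boolean, learn: LearnDeps["learn"]) {
  const logs: string[] = [];
  const received: unknown[] = [];
  const deps: LearnDeps = {
    isAvailable: () => available,
    learn: async (o) => {
      received.push(o);
      return learn(o);
    },
    log: (level, message) => {
      logs.push(`${level}: ${message}`);
    },
  };
  return { deps, logs, received };
}

// Returns a list of failure descriptions; empty means every check passed.
async function runChecks(): Promise<string[]> {
  const failures: string[] = [];
  const ok = async () => ({ learned: true });

  // Task 7: success path forwards the outcome unchanged.
  const success = makeFakes(true, ok);
  const successOutcome = { issue: 324, outcome: "success" };
  await finalizeAndLearn(successOutcome, success.deps);
  if (success.received[0] !== successOutcome) failures.push("success: outcome not forwarded");

  // Task 8: failed pipeline, learning call succeeds, error summary is included.
  const failure = makeFakes(true, ok);
  const failedOutcome = { issue: 324, outcome: "failure", error: { message: "build failed" } };
  await finalizeAndLearn(failedOutcome, failure.deps);
  if (failure.received[0] !== failedOutcome) failures.push("failure: error summary not forwarded");

  // Task 9: ruflo unavailable, so nothing is called and only an info line is logged.
  const unavailable = makeFakes(false, ok);
  await finalizeAndLearn({}, unavailable.deps);
  if (unavailable.received.length !== 0) failures.push("unavailable: learn was called");
  if (!unavailable.logs[0]?.startsWith("info:")) failures.push("unavailable: no info log");

  // Fail-open: a rejecting learning call is logged as a warning, not thrown.
  const rejecting = makeFakes(true, async () => {
    throw new Error("boom");
  });
  await finalizeAndLearn({}, rejecting.deps);
  if (rejecting.logs[0] !== "warn: Learning call failed (non-blocking): boom") {
    failures.push("fail-open: warning not logged");
  }
  return failures;
}
```

In a real test file, each block inside `runChecks` would become its own test case in whatever framework the repository already uses.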

Task 10: Integration test - end-to-end

  • Run short pipeline (mock build)
  • Verify learning call captured and logged
  • Test: tests/e2e/pipeline-learning.test.ts

Phase 4: Validation (Tasks 11-12)

Task 11: Regression test - existing pipeline tests

  • Run full test suite: npm test
  • Ensure all existing tests still pass
  • No changes to test expectations

Task 12: Manual validation

  • Run real pipeline with verbose logging
  • Verify: learning call logged, no pipeline delays
  • Check: outcome data correctly structured

Task Checklist

  • Task 1: Define ruflo_learn_from_shipwright() interface in src/types/ruflo.ts
  • Task 2: Create collectPipelineOutcome() in src/utils/outcome-collector.ts
  • Task 3: Add isRufloAvailable() check in src/utils/ruflo-check.ts
  • Task 4: Implement finalize_and_learn() with fail-open error handling
  • Task 5: Integrate learning call into success completion path
  • Task 6: Integrate learning call into failure completion path
  • Task 7: Unit test - success path (finalize_and_learn)
  • Task 8: Unit test - failure path with error summary
  • Task 9: Unit test - ruflo unavailable (no crash)
  • Task 10: Integration test - end-to-end pipeline learning
  • Task 11: Run full test suite, verify regression-free
  • Task 12: Manual validation with verbose logging

Testing Approach

Test Pyramid Breakdown

Unit Tests (70%) — 8 tests

  • finalizeAndLearn(): success, failure, unavailable (3)
  • collectPipelineOutcome(): complete data, partial data, missing artifacts (3)
  • isRufloAvailable(): available, unavailable (2)

Integration Tests (20%) — 2 tests

  • Full pipeline success → learning call verified
  • Full pipeline failure → learning call with error summary verified

E2E Tests (10%) — 1 test

  • Real pipeline execution with learning feedback captured

Total: 11 tests

Coverage Targets

  • Critical paths (100%): success path, failure path, unavailable ruflo
  • Error handling (100%): all catch blocks, logging
  • Data passing (95%): outcome shape, error summary structure
  • Overall target: 85%+ line coverage for new code

Critical Paths to Test

Happy Path: Pipeline completes → finalize_and_learn called → ruflo receives outcome → no errors logged

Error Path 1 (Pipeline Failure): Pipeline hits error → outcome collected with error summary → finalize_and_learn called → ruflo receives failure data → pipeline exits with proper code

Error Path 2 (Ruflo Unavailable): finalize_and_learn detects ruflo unavailable → logs info message → returns normally → pipeline completion unaffected

Edge Case 1 (Missing Artifacts): collectPipelineOutcome called with missing artifact files → graceful fallback with default values → no crash

Edge Case 2 (Timeout in Learning Call): ruflo_learn_from_shipwright hangs for 10s → timeout wrapper catches → logs warning → returns normally


Definition of Done

This change is complete when ALL of the following are true:

  • ✅ ruflo_learn_from_shipwright() is called exactly once per pipeline run
  • ✅ Call happens after all pipeline stages complete (success OR failure)
  • ✅ Outcome data includes: issue #, goal, outcome, duration, stages, error (if failed)
  • ✅ Function gracefully handles ruflo unavailability (no crash, logged info)
  • ✅ Function gracefully handles learning call failure (logs warning, doesn't block)
  • ✅ All 11 tests pass (unit, integration, e2e)
  • ✅ Full test suite passes (npm test with 0 failures)
  • ✅ Manual verification: real pipeline logs learning call with correct data
  • ✅ Code review: changes reviewed and approved
  • ✅ No blocking issues from CI/CD

Next Steps

The plan is ready to proceed to the design stage. The design phase will:

  1. Identify exact files in this codebase to modify (need code exploration)
  2. Map pipeline orchestration code structure
  3. Locate artifact collection points
  4. Define exact call sites for learning integration

