Pipeline Plan 327
I now have all the context I need. Here's the implementation plan:
- scripts/lib/pipeline-stages-build.sh: stage_test() function (lines 596–726)
- scripts/sw-ruflo-adapter-test.sh: add new test sections at end (before print_test_results)
Insert a block after info "Running tests: ..." (line 618) but before bash -c "$test_cmd" (line 620). Pattern follows stage_test_first lines 21–31:
    # ── Recall historical flakiness patterns from ruflo ──────────────────
    local _ruflo_flakiness_ctx=""
    if declare -f ruflo_recall >/dev/null 2>&1 && \
       declare -f ruflo_available >/dev/null 2>&1 && \
       ruflo_available; then
        _ruflo_flakiness_ctx=$(ruflo_recall "test flakiness patterns failures" \
            "pipeline-${SHIPWRIGHT_PIPELINE_ID:-unknown}" 2>/dev/null || true)
        _ruflo_flakiness_ctx=$(printf '%.2000s' "${_ruflo_flakiness_ctx:-}")
        if [[ -n "$_ruflo_flakiness_ctx" ]]; then
            info "Ruflo recall: historical test patterns found"
            info "${DIM}${_ruflo_flakiness_ctx}${RESET}"
        fi
    fi

Key points:
- declare -f guards per memory feedback (feedback_ruflo_declare_f_guard.md)
- printf '%.2000s' truncation per memory feedback (feedback_ruflo_recall_plain_text.md)
- Namespace pipeline-<PIPELINE_ID> per acceptance criteria
- Logged for human visibility, does not gate execution
- Fail-open: || true on recall, empty string fallback
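The guard, fail-open, and truncation points above can be tried in isolation. The following is a minimal sketch, not the pipeline's code: ruflo_available and ruflo_recall are hypothetical stand-ins defined inline, recall_ctx is an illustrative wrapper name, and the cap is shortened from 2000 to 20 characters so the truncation is visible.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the real adapter functions (names taken from the plan).
ruflo_available() { return 0; }
# Simulates a recall that prints output but exits non-zero.
ruflo_recall() { printf 'flaky: test_login (3 of 10 runs)\n'; return 1; }

# Illustrative wrapper (assumed name) around the guarded recall.
recall_ctx() {
  local ctx=""
  # Call into ruflo only if both functions are defined and the adapter reports ready.
  if declare -f ruflo_recall >/dev/null 2>&1 && \
     declare -f ruflo_available >/dev/null 2>&1 && \
     ruflo_available; then
    # Fail-open: a non-zero exit from recall is swallowed by "|| true".
    ctx=$(ruflo_recall "test flakiness patterns failures" "pipeline-demo" 2>/dev/null || true)
    # Cap the context length (20 here for illustration; the plan uses 2000).
    ctx=$(printf '%.20s' "${ctx:-}")
  fi
  printf '%s' "$ctx"
}
```

With the stubs defined, recall_ctx returns the first 20 characters of the recalled text even though the stub exits non-zero. After `unset -f ruflo_recall`, the guard short-circuits and the result is an empty string, which is the fallback the plan relies on.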
Current problem: ruflo_store at line 719 only fires on pass, because the failure path hits return 1 at line 665 before reaching it. Fix by:
- Adding a store call inside the failure branch (before return 1 at line 665)
- Enriching the existing pass-side store (line 719–722) with more data
Failure path — insert before return 1 at line 665:
    # Store failed test result in ruflo for flakiness tracking
    if declare -f ruflo_store >/dev/null 2>&1 && \
       declare -f ruflo_available >/dev/null 2>&1 && \
       ruflo_available; then
        local _fail_test_count
        # grep -c prints 0 and exits 1 when nothing matches, so "|| echo 0" would
        # produce "0" twice; swallow the exit status and default below instead.
        _fail_test_count=$(grep -cE 'PASS|FAIL|✓|✗|ok [0-9]' "$test_log" 2>/dev/null || true)
        ruflo_store "stage-test-result" \
            "Tests FAILED (exit $test_exit). Count: ${_fail_test_count:-0}. Cmd: ${test_cmd}. Coverage: 0%. Time: $(date -u +"%Y-%m-%dT%H:%M:%SZ" 2>/dev/null || echo unknown)." \
            "pipeline-${SHIPWRIGHT_PIPELINE_ID:-unknown}" \
            "test,stage_test,failed" 2>/dev/null || true
    fi

Pass path: replace lines 718–723 with enriched store:
    # Store test results in ruflo for cross-stage context and flakiness tracking
    if declare -f ruflo_store >/dev/null 2>&1 && \
       declare -f ruflo_available >/dev/null 2>&1 && \
       ruflo_available; then
        local _pass_test_count
        # grep -c prints 0 and exits 1 when nothing matches, so "|| echo 0" would
        # produce "0" twice; swallow the exit status and default below instead.
        _pass_test_count=$(grep -cE 'PASS|FAIL|✓|✗|ok [0-9]' "$test_log" 2>/dev/null || true)
        ruflo_store "stage-test-result" \
            "Tests PASSED. Count: ${_pass_test_count:-0}. Cmd: ${test_cmd}. Coverage: ${_cov_pct:-0}%. Time: $(date -u +"%Y-%m-%dT%H:%M:%SZ" 2>/dev/null || echo unknown)." \
            "pipeline-${SHIPWRIGHT_PIPELINE_ID:-unknown}" \
            "test,stage_test,passed" 2>/dev/null || true
    fi

Insert before print_test_results (line 2531). Add 4 test sections:
- stage_test recall: ruflo_recall called at start. Mock ruflo_recall, verify it's called with correct namespace and query.
- stage_test recall: output logged for visibility. Verify recall output appears in info log.
- stage_test store: ruflo_store called on pass. Mock test pass, verify store called with passed tag and correct data (coverage, count, cmd, timestamp).
- stage_test store: ruflo_store called on fail. Mock test fail, verify store called with failed tag.
Each test stubs ruflo_available, ruflo_recall, ruflo_store as local functions that log args to temp files, then asserts against those logs. Pattern matches existing stage_test_first tests at lines 2400–2528.
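The stub-and-assert pattern described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the existing test code: fake_stage_test is a hypothetical stand-in for the store branch of stage_test(), the pipe-delimited log format is invented for the example, and the pipeline ID "demo" is a placeholder.

```bash
#!/usr/bin/env bash
# Temp file that records every ruflo_store call, one call per line.
call_log=$(mktemp)

# Stubs: report ruflo as available and log store arguments joined by '|'.
ruflo_available() { return 0; }
ruflo_store() { ( IFS='|'; printf '%s\n' "$*" ) >> "$call_log"; }

# Hypothetical stand-in for the store logic in stage_test(); the real
# function is not reproduced here.
fake_stage_test() {
  local exit_code="$1"
  local tag="passed"
  if [ "$exit_code" -ne 0 ]; then tag="failed"; fi
  if ruflo_available; then
    ruflo_store "stage-test-result" "Tests ${tag}" \
      "pipeline-${SHIPWRIGHT_PIPELINE_ID:-unknown}" "test,stage_test,${tag}" || true
  fi
  return "$exit_code"
}

SHIPWRIGHT_PIPELINE_ID=demo   # placeholder ID for the example
fake_stage_test 1 || true     # simulate a failing test run
```

A test can then grep the log for the expected namespace and tag, for example `grep -q 'pipeline-demo|test,stage_test,failed$' "$call_log"`. Logging to a file rather than a variable keeps the record intact even when the stub is invoked from a subshell.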
- Task 1: Add ruflo recall block at start of stage_test(): query pipeline-<PIPELINE_ID> for historical test/flakiness patterns
- Task 2: Log recall results via info for human visibility (no gating)
- Task 3: Add ruflo store in the failure path (before return 1) with pass/fail, coverage, test count, cmd, timestamp
- Task 4: Enrich the existing pass-path ruflo store with test count, cmd, timestamp, and tags
- Task 5: Add declare -f guards + ruflo_available checks on all new ruflo calls
- Task 6: Add test (recall-before-test): verify ruflo_recall invoked with correct namespace
- Task 7: Add test: recall output logged for human visibility
- Task 8: Add test (store-after-test, pass path): verify store args contain passed, coverage, cmd
- Task 9: Add test (store-after-test, fail path): verify store called with failed tag
- Task 10: Run npm test and confirm all existing tests pass
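The test count stored in Tasks 3 and 4 comes from grep -c over the test log. One detail worth checking when implementing it: grep -c prints 0 and also exits non-zero when nothing matches, so a `|| echo 0` fallback would append a second 0 to the captured value. A minimal sketch of a counting helper that avoids this, where count_test_lines is a hypothetical name and the sample log lines are invented:

```bash
#!/usr/bin/env bash
# Hypothetical helper; the pattern is the one used in the plan's store blocks.
count_test_lines() {
  local log="$1"
  local n
  # Swallow grep's exit status instead of echoing a fallback value.
  n=$(grep -cE 'PASS|FAIL|✓|✗|ok [0-9]' "$log" 2>/dev/null || true)
  # Empty output means grep could not read the file; default to 0.
  printf '%s\n' "${n:-0}"
}

# Invented sample log: two result lines and one unrelated line.
log=$(mktemp)
printf 'ok 1 - parses config\nnot ok 2 - retries\nsummary line\n' > "$log"
```

Note that the pattern counts result lines of either outcome, so the stored number is the total number of reported tests, not the number that passed.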
Test Pyramid Breakdown:
- Unit tests (4 new): All 4 tests are unit-level, mocking ruflo_recall/ruflo_store/ruflo_available as stub functions that log their arguments to temp files. Assertions check those logs.
- Integration tests (0): Not needed, since the ruflo adapter integration is already covered by existing tests.
- E2E tests (0): Not applicable; pipeline-level E2E would be a separate concern.
Coverage Targets:
- 100% of the new code paths (recall-at-start, store-on-pass, store-on-fail)
- Both the "ruflo available" and "ruflo unavailable" branches are covered (the unavailable branch is implicitly tested by existing guard tests)
Critical Paths to Test:
- Happy path: Ruflo available, recall returns history, tests pass, store called with pass data
- Error case 1: Tests fail — store still called with failure data (the main gap this issue fixes)
- Error case 2: Ruflo unavailable — recall returns empty, store skipped, test execution unaffected
- Edge case 1: SHIPWRIGHT_PIPELINE_ID unset, so the namespace falls back to pipeline-unknown
- Edge case 2: Recall returns empty string, so no flakiness info is logged and execution proceeds normally
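The two edge cases can be exercised with a few lines of shell. This is a sketch only: ruflo_namespace and log_recall are hypothetical helper names (the plan inlines both expressions inside stage_test()), and printf stands in for the pipeline's info function.

```bash
#!/usr/bin/env bash
# Hypothetical helper wrapping the namespace expansion used in the plan.
ruflo_namespace() {
  # ${VAR:-default} covers both an unset and an empty variable.
  printf 'pipeline-%s\n' "${SHIPWRIGHT_PIPELINE_ID:-unknown}"
}

# Hypothetical helper mirroring the "log only when non-empty" check.
log_recall() {
  local ctx="$1"
  # Nothing is printed for an empty recall result.
  if [ -n "$ctx" ]; then
    printf 'Ruflo recall: historical test patterns found\n'
  fi
}
```

Because the expansion uses `:-` rather than `-`, an ID that is set but empty also falls back to pipeline-unknown, which keeps the namespace from ending in a bare hyphen.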
- ruflo_recall() called at start of stage_test() with pipeline-<PIPELINE_ID> namespace
- Recall results logged via info but do not gate test execution
- ruflo_store() called after BOTH pass and fail outcomes with: pass/fail, coverage %, test count, test cmd, timestamp
- Storage namespace: pipeline-<PIPELINE_ID>
- All ruflo calls guarded with declare -f + ruflo_available (fail-open)
- 4 new tests in sw-ruflo-adapter-test.sh covering recall and store on pass/fail
- npm test passes with no regressions
- Endpoint Specification: N/A. No new API endpoints; this is shell-to-CLI integration.
- Error Codes: N/A. All ruflo calls are fail-open (|| true).
- Rate Limiting: N/A. The ruflo CLI has a built-in circuit-breaker timeout.
- Versioning: N/A. No API versioning change.
- Threat Model (STRIDE): N/A. No new attack surface; ruflo data is local, no secrets stored.
- Auth Flow: N/A. No authentication changes.
- Input Validation Points: The test_cmd value stored in ruflo is already validated/auto-detected by the existing pipeline config parsing. The recall query is a hardcoded string.
- Security Checklist: No secrets in stored data (only test counts/coverage/cmd), no user input flows into ruflo keys without existing sanitization.