Skip to content

Rectify: CI Event Discrimination — Core Scope Extension + Post-Fetch Validation#665

Merged
Trecek merged 7 commits intointegrationfrom
ci-event-discrimination-pipeline-misidentifies-push-event-fa/662
Apr 8, 2026
Merged

Rectify: CI Event Discrimination — Core Scope Extension + Post-Fetch Validation#665
Trecek merged 7 commits intointegrationfrom
ci-event-discrimination-pipeline-misidentifies-push-event-fa/662

Conversation

@Trecek
Copy link
Copy Markdown
Collaborator

@Trecek Trecek commented Apr 8, 2026

Summary

The CI watcher system has two architectural weaknesses that together allow a passing pull_request run to mask a failing push run:

  1. Missing scope axis: CIRunScope carries workflow and head_sha but not event, so the CI watcher cannot distinguish runs triggered by different GitHub events.
  2. No post-fetch validation: The watcher blindly selects completed[0] or active[0] from the API response with zero verification that the returned run matches the requested scope. Even existing scope axes (head_sha, workflow) are trusted entirely to server-side API filtering with no client-side cross-check.

The architectural fix in this part introduces scope-completeness validation — a post-fetch assertion layer that verifies every returned run matches all requested scope fields. This makes the system self-healing: if a future scope axis is added to CIRunScope but the API param wiring is forgotten, the validation layer catches the mismatch immediately in tests and at runtime.

Architecture Impact

Process Flow Diagram

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;

    %% TERMINALS %%
    START([START])
    DONE([DONE])
    NO_RUNS([NO_RUNS])
    TIMED_OUT([TIMED_OUT])
    ERR([ERROR])

    subgraph ConfigLayer ["● Config Layer — defaults.yaml / settings.py / context.py"]
        direction LR
        CIConf["● CIConfig<br/>━━━━━━━━━━<br/>ci.workflow: null<br/>ci.event: null"]
        DefScope["● default_ci_scope<br/>━━━━━━━━━━<br/>CIRunScope(workflow, event)<br/>built in ToolContext"]
        CIConf --> DefScope
    end

    subgraph ToolLayer ["● MCP Tool — wait_for_ci (tools_ci.py)"]
        GateCheck{"Gate<br/>enabled?"}
        WatchCheck{"ci_watcher<br/>configured?"}
        SHAInfer["Infer head_sha<br/>━━━━━━━━━━<br/>git rev-parse HEAD"]
        ScopeBuilt["● Build CIRunScope<br/>━━━━━━━━━━<br/>event = arg OR config.event<br/>workflow = arg OR config.workflow<br/>head_sha = inferred or None"]
        RepoRes["Resolve repo<br/>━━━━━━━━━━<br/>infer_repo_from_remote()"]
    end

    subgraph Phase1 ["Phase 1 — Look-back (DefaultCIWatcher.wait)"]
        FetchComp["● _fetch_completed_runs()<br/>━━━━━━━━━━<br/>GitHub API: status=completed<br/>scope.event forwarded as param"]
        TimeWin{"Within<br/>lookback_seconds<br/>(120s) window?"}
        Validate1["● _validate_run_matches_scope()<br/>━━━━━━━━━━<br/>if scope.event: run[event] must match<br/>if scope.head_sha: SHA must match"]
        P1Dec{"Valid completed<br/>run found?"}
        ScopeMis["Log: ci_watcher_scope_mismatch<br/>━━━━━━━━━━<br/>completed found but scope invalid<br/>→ fall through to Phase 2"]
        FetchFailed1["_fetch_failed_jobs()<br/>━━━━━━━━━━<br/>if conclusion in FAILED_CONCLUSIONS"]
    end

    subgraph Phase2 ["Phase 2 — Active Run Polling (deadline loop)"]
        P2Loop{"Deadline<br/>exceeded?"}
        FetchAct["● _fetch_active_runs()<br/>━━━━━━━━━━<br/>GitHub API: status≠completed<br/>scope.event forwarded as param"]
        Validate2["● _validate_run_matches_scope()<br/>━━━━━━━━━━<br/>event + head_sha checks"]
        ActDec{"Valid active<br/>run found?"}
        ReCheckComp["Re-check completed<br/>━━━━━━━━━━<br/>_fetch_completed_runs()<br/>+ _validate_run_matches_scope()"]
        CompMidDec{"Valid completed<br/>appeared?"}
        FetchFailed2["_fetch_failed_jobs()"]
        Backoff2["Jittered backoff<br/>━━━━━━━━━━<br/>min(exp_cap=30s, remaining)<br/>attempt counter increments"]
    end

    subgraph Phase3 ["Phase 3 — Wait for Run Completion (deadline loop)"]
        PollStatus["_poll_run_status()<br/>━━━━━━━━━━<br/>GET /runs/{run_id}"]
        CompCheck{"status ==<br/>completed?"}
        FetchFailed3["_fetch_failed_jobs()<br/>━━━━━━━━━━<br/>if FAILED_CONCLUSIONS"]
        P3Dec{"Deadline<br/>exceeded?"}
        Backoff3["Jittered backoff<br/>━━━━━━━━━━<br/>min(exp_cap=30s, remaining)"]
    end

    subgraph RulesLayer ["● Recipe Semantic Validation — validate_recipe / run_semantic_rules"]
        direction LR
        RuleES["● ci-missing-event-scope<br/>━━━━━━━━━━<br/>WARNING if event key absent<br/>from wait_for_ci with_args"]
        RuleHW["● ci-hardcoded-workflow<br/>━━━━━━━━━━<br/>WARNING if literal workflow<br/>name (not template expr)"]
        RuleCyc["● unbounded-cycle<br/>━━━━━━━━━━<br/>4-tier severity: silent /<br/>WARN (retry re-enters) /<br/>WARN (cond exit only) /<br/>ERROR (no exit)"]
        RuleCFG["ci-failure-missing-conflict-gate<br/>━━━━━━━━━━<br/>ERROR if resolve-failures<br/>reachable without git gate"]
        RuleFind["RuleFinding aggregation<br/>━━━━━━━━━━<br/>feeds compute_recipe_validity()<br/>ERROR findings → recipe invalid"]
        RuleES --> RuleFind
        RuleHW --> RuleFind
        RuleCyc --> RuleFind
        RuleCFG --> RuleFind
    end

    %% MAIN FLOW %%
    START --> GateCheck
    START --> ConfigLayer

    DefScope -.->|"event default"| ScopeBuilt

    GateCheck -->|"disabled"| ERR
    GateCheck -->|"enabled"| WatchCheck
    WatchCheck -->|"missing"| ERR
    WatchCheck -->|"ok"| SHAInfer
    SHAInfer --> ScopeBuilt
    ScopeBuilt --> RepoRes
    RepoRes -->|"repo resolved"| FetchComp
    RepoRes -->|"no repo"| NO_RUNS

    %% Phase 1 flow %%
    FetchComp -->|"HTTP/network error"| ERR
    FetchComp --> TimeWin
    TimeWin -->|"outside window"| P2Loop
    TimeWin -->|"within window"| Validate1
    Validate1 --> P1Dec
    P1Dec -->|"found + valid"| FetchFailed1
    FetchFailed1 --> DONE
    P1Dec -->|"none valid"| ScopeMis
    ScopeMis --> P2Loop

    %% Phase 2 flow %%
    P2Loop -->|"exceeded"| NO_RUNS
    P2Loop -->|"time left"| FetchAct
    FetchAct -->|"HTTP/network error"| ERR
    FetchAct --> Validate2
    Validate2 --> ActDec
    ActDec -->|"found + valid"| PollStatus
    ActDec -->|"none"| ReCheckComp
    ReCheckComp --> CompMidDec
    CompMidDec -->|"found"| FetchFailed2
    FetchFailed2 --> DONE
    CompMidDec -->|"none"| Backoff2
    Backoff2 --> P2Loop

    %% Phase 3 flow %%
    PollStatus --> CompCheck
    CompCheck -->|"completed"| FetchFailed3
    FetchFailed3 --> DONE
    CompCheck -->|"in_progress"| P3Dec
    P3Dec -->|"exceeded"| TIMED_OUT
    P3Dec -->|"time left"| Backoff3
    Backoff3 --> PollStatus

    %% Recipe validation feeds config guidance %%
    RuleFind -.->|"recipe validity gate<br/>guides event config"| DefScope

    %% CLASS ASSIGNMENTS %%
    class START,DONE,NO_RUNS,TIMED_OUT,ERR terminal;
    class CIConf,DefScope stateNode;
    class GateCheck,WatchCheck,TimeWin,P1Dec,ActDec,CompMidDec,CompCheck,P2Loop,P3Dec detector;
    class SHAInfer,ScopeBuilt,RepoRes,FetchComp,Validate1,FetchFailed1,FetchAct,Validate2,ReCheckComp,FetchFailed2,PollStatus,FetchFailed3 handler;
    class ScopeMis,Backoff2,Backoff3 phase;
    class RuleES,RuleHW,RuleCyc,RuleCFG detector;
    class RuleFind output;
Loading

State Lifecycle Diagram

%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 65, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    START([CALL: wait_for_ci / get_ci_status])

    %% ── LAYER 1: Config Resolution (INIT_PRESERVE) ──
    subgraph ConfigLayer ["CONFIG LIFECYCLE — INIT_PRESERVE (stable for session lifetime)"]
        direction LR
        DefaultsYAML["● defaults.yaml<br/>━━━━━━━━━━<br/>ci.event: null<br/>ci.workflow: null"]
        ProjectConfig["project config.yaml<br/>━━━━━━━━━━<br/>ci.event override<br/>ci.workflow override"]
        ConfigSchemaGate["● validate_layer_keys<br/>━━━━━━━━━━<br/>Rejects unknown ci.* keys<br/>GATE: ConfigSchemaError"]
        CIConfigObj["● CIConfig<br/>━━━━━━━━━━<br/>workflow: str | None<br/>event: str | None"]
    end

    %% ── LAYER 2: Scope Construction (INIT_ONLY) ──
    subgraph ScopeLayer ["SCOPE CONSTRUCTION — INIT_ONLY (frozen=True, immutable)"]
        direction LR
        DefaultScope["● default_ci_scope<br/>━━━━━━━━━━<br/>DERIVED property<br/>CIRunScope(workflow, event)<br/>from CIConfig"]
        FallbackLogic{"caller param<br/>present?"}
        CallSiteScope["● CIRunScope<br/>━━━━━━━━━━<br/>workflow: str | None<br/>head_sha: str | None<br/>event: str | None<br/>frozen=True"]
    end

    %% ── LAYER 3: Design-Time Validation Gates ──
    subgraph DesignGates ["DESIGN-TIME VALIDATION GATES (recipe authoring)"]
        direction LR
        SemanticRule["● ci-missing-event-scope<br/>━━━━━━━━━━<br/>wait_for_ci with no 'event'<br/>→ Severity.WARNING"]
        HardcodedRule["● ci-hardcoded-workflow<br/>━━━━━━━━━━<br/>workflow: hardcoded string<br/>→ Severity.WARNING"]
        ConflictGate["ci-failure-missing-conflict-gate<br/>━━━━━━━━━━<br/>CI failure → resolve-failures<br/>without merge-base gate<br/>→ Severity.ERROR"]
    end

    %% ── LAYER 4: Runtime Execution (MUTABLE local state) ──
    subgraph RuntimePhases ["RUNTIME EXECUTION — MUTABLE local variables"]
        direction TB
        Phase1["Phase 1: Look-back<br/>━━━━━━━━━━<br/>API params: branch, event, head_sha<br/>Cutoff: updated_at ≥ lookback window<br/>Short-circuits if run found"]
        ClientValidate1["● _validate_run_matches_scope<br/>━━━━━━━━━━<br/>event: run.event == scope.event<br/>head_sha: run.head_sha == scope.head_sha<br/>Defense-in-depth client filter"]
        Phase2["Phase 2: Poll<br/>━━━━━━━━━━<br/>Exponential backoff + jitter<br/>found_run: dict | None (MUTABLE)<br/>attempt: int (MUTABLE)<br/>deadline: float (MUTABLE)"]
        ClientValidate2["● _validate_run_matches_scope<br/>━━━━━━━━━━<br/>Same gate applied to active runs<br/>Prevents wrong-event selection"]
        Phase3["Phase 3: Wait<br/>━━━━━━━━━━<br/>Poll found_run by run_id<br/>Status: in_progress → completed<br/>Conclusion extracted"]
        FailedJobs["_fetch_failed_jobs<br/>━━━━━━━━━━<br/>APPEND_ONLY result<br/>Only on FAILED_CONCLUSIONS<br/>failure / timed_out / cancelled"]
    end

    %% ── LAYER 5: Output (DERIVED) ──
    Result["● wait_for_ci result<br/>━━━━━━━━━━<br/>run_id / conclusion / failed_jobs<br/>+ head_sha echo-back (new)<br/>Caller verifies SHA matches HEAD"]

    %% ── CONNECTIONS ──
    START --> DefaultsYAML
    START --> ProjectConfig
    DefaultsYAML --> ConfigSchemaGate
    ProjectConfig --> ConfigSchemaGate
    ConfigSchemaGate -->|valid keys only| CIConfigObj
    CIConfigObj --> DefaultScope

    DefaultScope --> FallbackLogic
    FallbackLogic -->|yes: use caller param| CallSiteScope
    FallbackLogic -->|no: use config default| CallSiteScope

    CallSiteScope -.->|recipe lint| SemanticRule
    CallSiteScope -.->|recipe lint| HardcodedRule
    CallSiteScope -.->|recipe lint| ConflictGate

    CallSiteScope --> Phase1
    Phase1 --> ClientValidate1
    ClientValidate1 -->|scope mismatch: warn + skip| Phase2
    ClientValidate1 -->|scope match: return early| Result
    Phase2 --> ClientValidate2
    ClientValidate2 -->|scope match| Phase3
    ClientValidate2 -->|scope mismatch: warn + skip| Phase2
    Phase3 --> FailedJobs
    FailedJobs --> Result

    %% ── CLASS ASSIGNMENTS ──
    class START terminal;
    class DefaultsYAML,ProjectConfig cli;
    class ConfigSchemaGate,ClientValidate1,ClientValidate2 detector;
    class CIConfigObj,DefaultScope,CallSiteScope stateNode;
    class FallbackLogic phase;
    class SemanticRule,HardcodedRule,ConflictGate gap;
    class Phase1,Phase2,Phase3 handler;
    class FailedJobs phase;
    class Result output;
Loading

Error/Resilience Diagram

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    subgraph Config ["● CONFIG VALIDATION (settings.py / defaults.yaml)"]
        CFG_CI["● ci.event / ci.workflow<br/>━━━━━━━━━━<br/>defaults: null<br/>falsy value → None coercion"]
        CFG_SCHEMA["● Config Schema Validator<br/>━━━━━━━━━━<br/>Unknown key → ConfigSchemaError<br/>did-you-mean hint via difflib"]
        CI_SCOPE["● default_ci_scope<br/>━━━━━━━━━━<br/>ToolContext property<br/>→ CIRunScope(workflow, event)"]
    end

    subgraph RecipeRules ["● RECIPE VALIDATION GATES (static, pre-execution)"]
        R_EVENT["● ci-missing-event-scope<br/>━━━━━━━━━━<br/>wait_for_ci missing event<br/>in with_args → WARNING"]
        R_WORKFLOW["● ci-hardcoded-workflow<br/>━━━━━━━━━━<br/>workflow not a template expr<br/>→ WARNING"]
        R_CONFLICT["ci-failure-missing-conflict-gate<br/>━━━━━━━━━━<br/>BFS: failure path reaches<br/>resolve-failures w/o merge-base gate<br/>→ ERROR"]
        R_CYCLE["● unbounded-cycle<br/>━━━━━━━━━━<br/>DFS + outer-loop re-entry detection<br/>no exit → ERROR<br/>conditional / re-entry → WARN"]
        R_DEADPARAM["● dead-with-param<br/>━━━━━━━━━━<br/>event now in _TOOL_PARAMS<br/>unknown with_args key → WARNING"]
    end

    subgraph ToolGates ["● TOOL HANDLER GATES (tools_ci.py)"]
        H_ENABLED{"● _require_enabled()<br/>feature disabled?"}
        H_WATCHER{"● ci_watcher<br/>is None?"}
        H_GIT["git rev-parse HEAD<br/>━━━━━━━━━━<br/>try/except Exception<br/>fail → head_sha = None"]
        H_SCOPE["● Build CIRunScope<br/>━━━━━━━━━━<br/>explicit args<br/>+ default_ci_scope fallback"]
        H_BOUNDARY["try/except Exception<br/>━━━━━━━━━━<br/>Last-resort tool boundary<br/>finally: timing.record()"]
    end

    subgraph CIWatcher ["● CI WATCHER — ci.py  (contract: never raises)"]
        W_REPO{"_resolve_repo()<br/>resolved?"}
        W_PH1["Phase 1: _fetch_completed_runs<br/>━━━━━━━━━━<br/>ValueError (timestamp) → skip run<br/>Returns completed run list"]
        W_SCOPE{"● _validate_run_matches_scope<br/>━━━━━━━━━━<br/>scope.event + head_sha filter<br/>false → log warning"}
        W_PH2["Phase 2: Poll active runs<br/>━━━━━━━━━━<br/>_fetch_active_runs()<br/>deadline-clamped loop"]
        DL2{"Phase 2<br/>deadline<br/>exhausted?"}
        W_PH3["Phase 3: Poll run status<br/>━━━━━━━━━━<br/>_poll_run_status()<br/>deadline-clamped loop"]
        DL3{"Phase 3<br/>deadline<br/>exhausted?"}
        W_JOBS["_fetch_failed_jobs<br/>━━━━━━━━━━<br/>only when FAILED_CONCLUSIONS<br/>returns job details list"]
    end

    subgraph Recovery ["RETRY / BACKOFF  (_jittered_sleep)"]
        SLEEP_P2["Backoff — Phase 2<br/>━━━━━━━━━━<br/>Full-jitter exponential<br/>Base 5s · Cap 30s<br/>Clamped to remaining deadline"]
        SLEEP_P3["Backoff — Phase 3<br/>━━━━━━━━━━<br/>Full-jitter exponential<br/>Base 5s · Cap 30s<br/>Clamped to remaining deadline"]
    end

    %% TERMINAL STATES %%
    T_SCHEMA(["ConfigSchemaError raised"])
    T_RWARN(["RuleFinding: WARNING"])
    T_RERR(["RuleFinding: ERROR"])
    T_GATEFAST(["Fast-fail: error JSON returned"])
    T_BOUND(["success:false — tool boundary error"])
    T_NORUNS(["conclusion: no_runs"])
    T_TIMEOUT(["conclusion: timed_out"])
    T_HTTP(["conclusion: error — HTTP failure"])
    T_NET(["conclusion: error — network failure"])
    T_DONE(["conclusion: success / failure<br/>+ failed_jobs list"])

    %% CONFIG FLOW %%
    CFG_CI --> CI_SCOPE
    CFG_SCHEMA -->|"unknown key"| T_SCHEMA

    %% RECIPE VALIDATION FINDINGS %%
    R_EVENT -->|"event absent"| T_RWARN
    R_WORKFLOW -->|"hardcoded string"| T_RWARN
    R_DEADPARAM -->|"unknown param"| T_RWARN
    R_CYCLE -->|"no exit"| T_RERR
    R_CYCLE -->|"re-entry / conditional"| T_RWARN
    R_CONFLICT -->|"unguarded path"| T_RERR

    %% TOOL HANDLER GATES %%
    H_ENABLED -->|"yes: disabled"| T_GATEFAST
    H_ENABLED -->|"no: enabled"| H_WATCHER
    H_WATCHER -->|"yes: null"| T_GATEFAST
    H_WATCHER -->|"no: configured"| H_GIT
    H_GIT --> H_SCOPE
    CI_SCOPE --> H_SCOPE
    H_SCOPE --> H_BOUNDARY
    H_BOUNDARY -->|"Exception"| T_BOUND
    H_BOUNDARY -->|"calls wait()"| W_REPO

    %% CI WATCHER FLOW %%
    W_REPO -->|"no"| T_NORUNS
    W_REPO -->|"yes"| W_PH1
    W_PH1 -->|"completed runs found"| W_SCOPE
    W_PH1 -->|"none found"| W_PH2
    W_SCOPE -->|"match"| W_JOBS
    W_SCOPE -->|"no match: warn"| W_PH2

    %% PHASE 2 RETRY LOOP %%
    W_PH2 --> DL2
    DL2 -->|"yes"| T_NORUNS
    DL2 -->|"no"| SLEEP_P2
    SLEEP_P2 -->|"retry"| W_PH2
    W_PH2 -->|"run found"| W_PH3

    %% PHASE 3 RETRY LOOP %%
    W_PH3 --> DL3
    DL3 -->|"yes"| T_TIMEOUT
    DL3 -->|"no"| SLEEP_P3
    SLEEP_P3 -->|"retry"| W_PH3
    W_PH3 -->|"completed"| W_JOBS

    %% OUTPUT %%
    W_JOBS --> T_DONE

    %% HTTP / NETWORK ERRORS (phases 1–3) %%
    W_PH1 -->|"HTTPStatusError"| T_HTTP
    W_PH2 -->|"HTTPStatusError"| T_HTTP
    W_PH3 -->|"HTTPStatusError"| T_HTTP
    W_PH1 -->|"RequestError"| T_NET
    W_PH2 -->|"RequestError"| T_NET
    W_PH3 -->|"RequestError"| T_NET

    %% CLASS ASSIGNMENTS %%
    class CFG_CI,CFG_SCHEMA detector;
    class CI_SCOPE,H_SCOPE stateNode;
    class R_EVENT,R_WORKFLOW,R_CONFLICT,R_CYCLE,R_DEADPARAM detector;
    class H_ENABLED,H_WATCHER,W_REPO,W_SCOPE detector;
    class H_GIT,H_BOUNDARY,W_PH1,W_PH2,W_PH3,W_JOBS handler;
    class SLEEP_P2,SLEEP_P3 output;
    class DL2,DL3 phase;
    class T_SCHEMA,T_RWARN,T_RERR,T_GATEFAST,T_BOUND,T_NORUNS,T_TIMEOUT,T_HTTP,T_NET,T_DONE terminal;
Loading

Closes #662

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/remediation-20260407-185819-949835/.autoskillit/temp/rectify/rectify_ci-event-discrimination_2026-04-07_191500_part_a.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

Step uncached output cache_read cache_write count time
investigate 27 8.6k 608.8k 45.3k 1 7m 20s
rectify 33 24.8k 1.1M 75.4k 1 10m 59s
dry_walkthrough 49 47.3k 2.3M 156.6k 2 15m 8s
implement 143 48.3k 9.9M 192.8k 2 19m 37s
assess 75 23.1k 2.6M 101.3k 2 15m 7s
prepare_pr 13 7.9k 267.6k 32.0k 1 1m 52s
run_arch_lenses 1.9k 46.5k 721.7k 135.4k 3 15m 12s
compose_pr 13 18.8k 376.3k 38.4k 1 4m 32s
Total 2.3k 225.4k 17.8M 777.2k 1h 29m

Trecek and others added 5 commits April 7, 2026 21:06
Adds `event: str | None` to `CIRunScope`, wires it through config
(`CIConfig`, `defaults.yaml`, `default_ci_scope`), injects it into
`_fetch_completed_runs` and `_fetch_active_runs` API params, and
introduces `_validate_run_matches_scope` for client-side defense-in-depth
validation of returned runs. Adds `event` param to `wait_for_ci` and
`get_ci_status` MCP tool handlers. Regression test covers the core
bug scenario where a passing `pull_request` run masked a failing `push`
run (GitHub issue #662).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…i-hardcoded-workflow

- test_cycle_with_retry_exit_but_success_reenters_is_warning: A→B→C→A cycle
  where B has retry exit but success path stays in cycle → WARNING
- Update test_cycle_with_retry_exit_is_clean: now expects WARNING (outer loop
  unbounded when retrying step's success re-enters cycle)
- test_wait_for_ci_without_event_is_warning / test_wait_for_ci_with_event_is_clean
- test_wait_for_ci_hardcoded_workflow_is_warning / test_wait_for_ci_no_workflow_is_clean
- test_bundled_recipes_have_no_ci_hardcoded_workflow: guards all bundled recipes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ules; update recipes

Cycle validator (rules_graph.py):
- _check_unbounded_cycles now checks if the retrying step's success path stays
  inside the cycle; if it does, the retry budget resets each iteration → WARNING
  instead of the previous false-clean result

New semantic rules (rules_ci.py):
- ci-missing-event-scope: warns when wait_for_ci lacks event filtering, which
  allows pull_request runs to mask failing push runs
- ci-hardcoded-workflow: warns when wait_for_ci hardcodes workflow: "tests.yml",
  blocking the config-level ci.workflow fallback

Recipe parameterization (all four bundled recipes):
- Remove hardcoded workflow: "tests.yml" from all wait_for_ci steps; config
  fallback now activates via tool_ctx.default_ci_scope.workflow
- Add event: "push" to all wait_for_ci steps for push-event discrimination
- Update diagnose-ci invocations: - - tests.yml → - - - push (event as 5th arg)

Skill updates:
- diagnose-ci SKILL.md: document event as 5th positional arg; update Step 2
  to pass --event to gh run list when provided
- resolve-failures SKILL.md: add Step 2a to reproduce CI environment (pre-test
  artifact generation) before running the test suite

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…coping

- test_rules_structure.py: rename test_bounded_cycle_with_retries_does_not_warn
  to test_cycle_with_retries_warns_when_success_stays_in_cycle; update assertion
  to expect WARNING when retrying step's success path stays inside cycle
- test_ci_dev_config.py: replace test_ci_watch_steps_carry_workflow_field with
  test_ci_watch_steps_carry_event_field; recipes now use event: push instead of
  hardcoded workflow: tests.yml per ci-hardcoded-workflow semantic rule
Copy link
Copy Markdown
Collaborator Author

@Trecek Trecek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoSkillit PR Review — Verdict: changes_requested. Found 11 blocking issues (warning severity). See inline comments for details.

server-side, but this client-side validation catches any discrepancy.
Each scope field is only checked when it is not None (i.e., was requested).
"""
if scope.event and run.get("event") != scope.event:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] defense: if scope.event uses truthiness — an empty-string event would silently skip the check. Use if scope.event is not None to correctly handle all non-None values.

"""
if scope.event and run.get("event") != scope.event:
return False
if scope.head_sha and run.get("head_sha") != scope.head_sha:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] defense: if scope.head_sha uses truthiness — an empty-string head_sha would silently skip validation. Use if scope.head_sha is not None for consistent behavior.

@@ -89,6 +103,8 @@ async def _fetch_completed_runs(
params["workflow_id"] = scope.workflow
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] cohesion: Asymmetric filtering strategy — workflow is pre-fetch only while event is both pre-fetch and post-fetch validated via _validate_run_matches_scope. Either add workflow to the post-fetch scope validation or document this intentional asymmetry.

r for r in completed if _validate_run_matches_scope(r, scope)
]
if not valid_completed:
_log.warning(
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] defense: ci_watcher_scope_mismatch log omits the actual event values on the mismatched runs. Add events=[r.get("event") for r in completed] to make diagnosis actionable.

scope = CIRunScope(
workflow=workflow or tool_ctx.default_ci_scope.workflow,
head_sha=head_sha,
event=event or tool_ctx.default_ci_scope.event,
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] defense: event or tool_ctx.default_ci_scope.event uses falsy coalescing — explicit event="" silently falls through to config default. Prefer event if event is not None else tool_ctx.default_ci_scope.event.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid observation — flagged for design decision. The event or fallback pattern is technically more permissive than event if event is not None else fallback for a str | None typed parameter — event="" would silently fall through with or but not with is not None. However: (1) the identical or pattern for workflow on the same line was not flagged, so a consistent fix would need to address both fields in both functions; (2) event="" has no semantic meaning for a GitHub event filter and is not a realistic MCP caller input. Deferring for human review to decide whether defensive is not None consistency is worth the churn.

if s in recipe.steps
)
if has_retry_exit:
# Check whether the success path of the retrying step stays inside
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] slop: Block comments at L62-L64 narrate adjacent code rather than explaining why. Remove or condense to a single intent statement.

### Step 2a: Reproduce CI Environment (before running tests)

Check if the project has artifact generation steps that CI runs before tests:
1. Read `.github/workflows/tests.yml` (or the CI workflow file) if it exists
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] cohesion: Step 2a hardcodes tests.yml — the exact filename ci-hardcoded-workflow rule discourages. Use a generic placeholder like {ci_workflow_file}.

{
"ci": RecipeStep(
tool="wait_for_ci",
with_args={"branch": "main", "workflow": "tests.yml", "timeout_seconds": 300},
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] tests: Identical recipe setup duplicated between test_wait_for_ci_without_event_is_warning and test_wait_for_ci_hardcoded_workflow_is_warning. Extract a shared _make_wait_for_ci_recipe(with_args=...) helper.

ci.workflow defaults; event discrimination must be explicit in each recipe step.
"""
recipes_dir = REPO_ROOT / "src" / "autoskillit" / "recipes"
for recipe_path in recipes_dir.glob("*.yaml"):
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] tests: Overlapping coverage with test_bundled_recipes_have_no_ci_hardcoded_workflow in test_bundled_recipes.py. Consider documenting the intended scope difference or consolidating.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Investigated — this is intentional. test_ci_watch_steps_carry_event_field (test_ci_dev_config.py) asserts that every wait_for_ci step has an event key in with: — it enforces presence of the event discriminator. test_bundled_recipes_have_no_ci_hardcoded_workflow (test_bundled_recipes.py:2210) asserts that the ci-hardcoded-workflow semantic rule fires zero findings — it enforces absence of a hardcoded workflow key. These guard orthogonal properties and are complementary, not overlapping. No change needed.

return findings


@semantic_rule(
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] cohesion: ci-hardcoded-workflow fires as WARNING for any explicit workflow name. Repos needing workflow+event together would get spurious warnings. Confirm severity is appropriate or add a suppression mechanism.

Copy link
Copy Markdown
Collaborator Author

@Trecek Trecek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoSkillit review found 11 blocking issues. See inline comments for details.

@Trecek Trecek added this pull request to the merge queue Apr 8, 2026
Merged via the queue into integration with commit 8c3ebc4 Apr 8, 2026
2 checks passed
@Trecek Trecek deleted the ci-event-discrimination-pipeline-misidentifies-push-event-fa/662 branch April 8, 2026 05:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant