Skip to content

Rectify: Advisory Step Context-Limit Routing — Remaining Steps — PART B ONLY#483

Merged
Trecek merged 9 commits intointegrationfrom
remediation-recipe-review-step-lacks-on-context-limit-retrie/481
Mar 22, 2026
Merged

Rectify: Advisory Step Context-Limit Routing — Remaining Steps — PART B ONLY#483
Trecek merged 9 commits intointegrationfrom
remediation-recipe-review-step-lacks-on-context-limit-retrie/481

Conversation

@Trecek
Copy link
Copy Markdown
Collaborator

@Trecek Trecek commented Mar 22, 2026

Summary

Part A added the advisory-step-missing-context-limit semantic rule and fixed the review step in three recipes. After Part A, the rule still emitted WARNINGs for three other advisory steps (audit_impl, open_pr_step, ci_conflict_fix) in each of the three main recipes. Part B resolves those remaining WARNINGs by adding explicit on_context_limit routes, then tightens the test gate so that zero advisory-step WARNINGs are permitted in bundled recipes going forward.

Architecture Impact

Process Flow Diagram

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 55, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;

    START([START])
    ESCALATE([ESCALATE_STOP])
    RELEASE_FAIL([RELEASE_ISSUE_FAILURE])

    subgraph AdvisoryGate ["Advisory Step Gate (skip_when_false)"]
        direction TB
        REVIEW["● review<br/>━━━━━━━━━━<br/>skip_when_false: review_approach<br/>retries: 1<br/>on_context_limit: dry_walkthrough"]
        AUDIT["● audit_impl<br/>━━━━━━━━━━<br/>skip_when_false: inputs.audit<br/>on_context_limit: escalate_stop"]
        OPEN_PR["● open_pr_step<br/>━━━━━━━━━━<br/>skip_when_false: inputs.open_pr<br/>on_context_limit: release_issue_failure"]
        CI_FIX["● ci_conflict_fix<br/>━━━━━━━━━━<br/>skip_when_false: detect_ci_conflicts<br/>on_context_limit: release_issue_failure"]
    end

    subgraph SemanticRule ["● advisory-step-missing-context-limit Rule"]
        direction LR
        RULE["advisory-step-missing-context-limit<br/>━━━━━━━━━━<br/>WARNING: run_skill + skip_when_false<br/>but no on_context_limit"]
        RULECHECK{"on_context_limit<br/>set?"}
    end

    subgraph NormalFlow ["Normal Execution Path"]
        direction TB
        DRY["dry_walkthrough<br/>━━━━━━━━━━<br/>retries: 3"]
        TEST["test"]
        MERGE["merge"]
        PUSH["push"]
        CI_WATCH["ci_watch"]
    end

    subgraph AuditVerdict ["Audit Verdict Routing"]
        direction LR
        VERDICT{"audit verdict"}
        REMEDIATE["remediate → make_plan"]
    end

    START --> REVIEW
    REVIEW -->|"on_success"| DRY
    REVIEW -->|"on_context_limit (●new)"| DRY
    REVIEW -->|"on_failure"| RELEASE_FAIL

    DRY --> TEST
    TEST --> AUDIT

    AUDIT -->|"on_result: GO"| MERGE
    AUDIT --> VERDICT
    VERDICT -->|"NO GO"| REMEDIATE
    VERDICT -->|"error"| ESCALATE
    AUDIT -->|"on_context_limit (●new)"| ESCALATE
    AUDIT -->|"on_failure"| ESCALATE
    AUDIT -->|"skipped"| MERGE

    REMEDIATE --> DRY

    MERGE --> PUSH
    PUSH --> OPEN_PR
    OPEN_PR -->|"on_success"| CI_WATCH
    OPEN_PR -->|"on_context_limit (●new)"| RELEASE_FAIL
    OPEN_PR -->|"on_failure"| RELEASE_FAIL
    OPEN_PR -->|"skipped"| CI_WATCH

    CI_WATCH -->|"on_success"| MERGE
    CI_WATCH -->|"on_failure → detect_ci_conflict"| CI_FIX
    CI_FIX -->|"on_success"| PUSH
    CI_FIX -->|"on_context_limit (●new)"| RELEASE_FAIL
    CI_FIX -->|"on_failure"| RELEASE_FAIL

    RULE --> RULECHECK
    RULECHECK -->|"missing"| RULE
    RULECHECK -->|"present"| RULE

    %% CLASS ASSIGNMENTS %%
    class START terminal;
    class ESCALATE,RELEASE_FAIL terminal;
    class REVIEW,AUDIT,OPEN_PR,CI_FIX handler;
    class DRY,TEST,MERGE,PUSH,CI_WATCH phase;
    class VERDICT stateNode;
    class REMEDIATE detector;
    class RULE,RULECHECK output;
Loading

Color Legend:

Color Category Description
Dark Blue Terminal START, ESCALATE_STOP, RELEASE_ISSUE_FAILURE
Orange Handler ● Modified advisory steps (review, audit_impl, open_pr_step, ci_conflict_fix)
Purple Phase Normal execution steps (dry_walkthrough, test, merge, push, ci_watch)
Teal State Decision/routing nodes
Red Detector Remediation loop
Dark Teal Output Semantic rule nodes

Error/Resilience Diagram

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 55, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;

    subgraph SemanticGate ["● Semantic Validation Gate (rules_worktree.py)"]
        direction LR
        RULE["● advisory-step-missing-context-limit<br/>━━━━━━━━━━<br/>run_skill + skip_when_false<br/>but no on_context_limit → WARNING"]
        RCHECK{"on_context_limit<br/>declared?"}
        RULE_OK["Rule passes<br/>━━━━━━━━━━<br/>step is compliant"]
    end

    subgraph ReviewResilience ["● review step (skip_when_false: review_approach, retries: 1)"]
        direction TB
        REVIEW_RUN["run_skill: review-approach<br/>━━━━━━━━━━<br/>advisory; can be skipped"]
        REVIEW_STALE["context exhausted / stale"]
        REVIEW_RETRY{"retries<br/>remaining?"}
        REVIEW_SKIP["● on_context_limit: dry_walkthrough<br/>━━━━━━━━━━<br/>skip to next step (safe)"]
    end

    subgraph AuditResilience ["● audit_impl step (skip_when_false: inputs.audit) — merge gate"]
        direction TB
        AUDIT_RUN["run_skill: audit-impl<br/>━━━━━━━━━━<br/>merge gate; GO/NO GO verdict"]
        AUDIT_FAIL["context exhausted<br/>━━━━━━━━━━<br/>verdict unavailable"]
        AUDIT_LIMIT["● on_context_limit: escalate_stop<br/>━━━━━━━━━━<br/>abort — unapproved merge unsafe"]
    end

    subgraph PRResilience ["● open_pr_step + ci_conflict_fix (skip_when_false guarded)"]
        direction TB
        OPENPR_RUN["run_skill: open-pr<br/>━━━━━━━━━━<br/>skip_when_false: inputs.open_pr"]
        OPENPR_LIMIT["● on_context_limit: release_issue_failure<br/>━━━━━━━━━━<br/>PR state unknown; safe release"]
        CIFIX_RUN["run_skill: ci-conflict-fix<br/>━━━━━━━━━━<br/>skip_when_false: detect_ci_conflicts<br/>retries: 1"]
        CIFIX_LIMIT["● on_context_limit: release_issue_failure<br/>━━━━━━━━━━<br/>incomplete fix; push unsafe"]
    end

    T_SKIP([dry_walkthrough — continue])
    T_ABORT([ESCALATE_STOP])
    T_RELEASE([RELEASE_ISSUE_FAILURE])

    %% SEMANTIC GATE %%
    RULE --> RCHECK
    RCHECK -->|"missing"| RULE
    RCHECK -->|"present"| RULE_OK

    %% REVIEW RESILIENCE %%
    REVIEW_RUN -->|"stale / context limit"| REVIEW_STALE
    REVIEW_STALE --> REVIEW_RETRY
    REVIEW_RETRY -->|"yes (retries: 1)"| REVIEW_RUN
    REVIEW_RETRY -->|"exhausted"| REVIEW_SKIP
    REVIEW_SKIP --> T_SKIP

    %% AUDIT RESILIENCE %%
    AUDIT_RUN -->|"context exhausted"| AUDIT_FAIL
    AUDIT_FAIL --> AUDIT_LIMIT
    AUDIT_LIMIT --> T_ABORT

    %% PR RESILIENCE %%
    OPENPR_RUN -->|"context exhausted"| OPENPR_LIMIT
    OPENPR_LIMIT --> T_RELEASE
    CIFIX_RUN -->|"context exhausted<br/>(after retries)"| CIFIX_LIMIT
    CIFIX_LIMIT --> T_RELEASE

    %% CLASS ASSIGNMENTS %%
    class RULE,RCHECK detector;
    class RULE_OK output;
    class REVIEW_RUN,AUDIT_RUN,OPENPR_RUN,CIFIX_RUN handler;
    class REVIEW_STALE,AUDIT_FAIL gap;
    class REVIEW_RETRY stateNode;
    class REVIEW_SKIP,AUDIT_LIMIT,OPENPR_LIMIT,CIFIX_LIMIT output;
    class T_SKIP,T_ABORT,T_RELEASE terminal;
Loading

Color Legend:

Color Category Description
Red Semantic Gate Validation rule that detects missing on_context_limit
Dark Teal Output Recovery routing destinations (● new on_context_limit routes)
Orange Handler Advisory run_skill steps that were modified
Yellow Failed Context-exhaustion failure states
Teal Circuit Retry-remaining decision
Dark Blue Terminal Final routing destinations

Closes #481

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/remediation-20260322-120142-791819/temp/rectify/rectify_review-step-context-limit-routing_2026-03-22_120142_part_b.md

🤖 Generated with Claude Code via AutoSkillit

Trecek and others added 5 commits March 22, 2026 12:40
…-step rule

- Add test_remediation_review_step_has_on_context_limit/retries to TestInvestigateFirstStructure
- Add test_implementation_review_step_has_on_context_limit/retries to TestImplementationPipelineStructure
- Add test_ig_review_step_has_on_context_limit/retries to TestImplementationGroupsStructure
- Add 5 unit tests for advisory-step-missing-context-limit semantic rule in test_rules_worktree.py

All new tests are RED — implementation follows in next commit.
…ipes

Adds the advisory-step-missing-context-limit semantic rule and fixes
the review step routing gap in remediation.yaml, implementation.yaml,
and implementation-groups.yaml.

Changes:
- rules_worktree.py: new advisory-step-missing-context-limit WARNING rule
  fires for any run_skill step with skip_when_false but no on_context_limit
- remediation.yaml review step: add retries:1 and on_context_limit:dry_walkthrough
- implementation.yaml review step: add retries:1 and on_context_limit:verify
- implementation-groups.yaml review step: add retries:1 and on_context_limit:verify

The review step is advisory (skip_when_false). Without on_context_limit,
context exhaustion fell through to on_failure -> release_issue_failure,
aborting the pipeline. Now it routes to the next main step as intended.
… on_context_limit

Adds 9 per-step assertions (3 recipes × 3 steps) inside the appropriate
test classes, plus two cross-recipe coverage tests:
- test_bundled_recipes_emit_no_advisory_context_limit_warnings (standalone)
- test_all_advisory_run_skill_steps_have_on_context_limit (parametrized)

All tests are RED until the recipe YAML fixes land in Phase 2.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…x, create_summary

Adds explicit on_context_limit routing to all advisory run_skill steps
that were missing it across all five bundled recipes:

- audit_impl (remediation, implementation, implementation-groups): escalate_stop
  A context-exhausted audit cannot provide a valid verdict; aborting prevents
  unapproved code from bypassing the merge gate.

- audit_impl (merge-prs): cleanup_failure
  Same rationale as the three main recipes.

- open_pr_step (remediation, implementation, implementation-groups): release_issue_failure
  PR state is unknown after a context limit; releasing the issue is the safe path.

- ci_conflict_fix (remediation, implementation, implementation-groups): release_issue_failure
  An incomplete conflict fix cannot be safely pushed; abort and release.

- create_summary (smoke-test): done
  Terminal reporting step; if context is exhausted, the pipeline completed
  successfully — route to done (clean stop) rather than escalate (failure stop).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…x, direct_merge_conflict_fix, immediate_merge_conflict_fix in all three main recipes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
"the skip intent is already in the note: field but not schema-enforced"
)

def test_implementation_review_step_has_on_context_limit(self, recipe) -> None:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] tests: Per-recipe review-step on_context_limit tests (test_implementation_review_step_has_on_context_limit here, test_ig_* at L765, test_remediation_* at L1112) are structurally identical across three test classes. The parametrized test at test_all_advisory_run_skill_steps_have_on_context_limit already covers presence for all recipes. If kept for exact-value checking, extract a shared helper to avoid triple-maintenance drift.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid observation — flagged for design decision. The per-recipe tests (L465, L765, L1112) add exact on_context_limit value checks (e.g. == 'verify', == 'dry_walkthrough') that test_all_advisory_run_skill_steps_have_on_context_limit (L1709) does not provide. Whether to consolidate into a parametrized exact-value helper or keep per-recipe tests for explicit regression coverage requires a human decision on test granularity vs. maintenance.

Copy link
Copy Markdown
Collaborator Author

@Trecek Trecek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoSkillit review found 7 blocking issues (changes_requested). See inline comments.

Summary of findings:

  • rules_worktree.py:117 [cohesion] Rule is in wrong file — covers all advisory steps, not just worktree steps
  • test_bundled_recipes.py:1510 [tests] Potentially redundant test coverage (semantic rule vs direct field check for same invariant) — requires human decision
  • test_bundled_recipes.py:465 [tests] Copy-pasted per-recipe tests across 3 classes — extract shared helper
  • test_bundled_recipes.py:1125 [cohesion] Inconsistent test name prefixes in remediation class (test_remediation_ vs test_if_)
  • test_rules_worktree.py:311 [slop] Redundant inline comment
  • test_rules_worktree.py:319 [tests] Severity pre-filter obscures assertion intent
  • test_rules_worktree.py:359 [slop] Redundant inline comment

Unpostable finding:

  • test_rules_worktree.py [cohesion] Advisory-step rule tests in wrong file (should mirror source module)

Trecek and others added 4 commits March 22, 2026 14:25
…ypass.py

Rule covers any advisory run_skill step — not worktree-specific logic.
rules_bypass.py already owns skip_when_false routing contracts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ontext_limit_warnings

test_all_advisory_run_skill_steps_have_on_context_limit (L1709) already enforces
the same invariant via a parametrized direct-field check per recipe. Keeping both
creates dual-mechanism overlap with no additional coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
test_remediation_review_step_has_on_context_limit and _has_retries used an
inconsistent prefix; renamed to test_if_review_step_* to match the rest of
the TestRemediationPipelineStructure class.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… assertion

- Remove '# on_context_limit deliberately absent' (test name + docstring convey intent)
- Remove '# skip_when_false absent — non-advisory step' (same reason)
- Assert on rule name across all findings first, then verify severity separately
  so a severity change doesn't silently discard the finding before any() runs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Trecek Trecek added this pull request to the merge queue Mar 22, 2026
Merged via the queue into integration with commit e76f38a Mar 22, 2026
2 checks passed
@Trecek Trecek deleted the remediation-recipe-review-step-lacks-on-context-limit-retrie/481 branch March 22, 2026 21:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant