Rectify: Advisory Step Context-Limit Routing — Remaining Steps — PART B ONLY#483
Conversation
…-step rule - Add test_remediation_review_step_has_on_context_limit/retries to TestInvestigateFirstStructure - Add test_implementation_review_step_has_on_context_limit/retries to TestImplementationPipelineStructure - Add test_ig_review_step_has_on_context_limit/retries to TestImplementationGroupsStructure - Add 5 unit tests for advisory-step-missing-context-limit semantic rule in test_rules_worktree.py All new tests are RED — implementation follows in next commit.
…ipes Adds the advisory-step-missing-context-limit semantic rule and fixes the review step routing gap in remediation.yaml, implementation.yaml, and implementation-groups.yaml. Changes: - rules_worktree.py: new advisory-step-missing-context-limit WARNING rule fires for any run_skill step with skip_when_false but no on_context_limit - remediation.yaml review step: add retries:1 and on_context_limit:dry_walkthrough - implementation.yaml review step: add retries:1 and on_context_limit:verify - implementation-groups.yaml review step: add retries:1 and on_context_limit:verify The review step is advisory (skip_when_false). Without on_context_limit, context exhaustion fell through to on_failure -> release_issue_failure, aborting the pipeline. Now it routes to the next main step as intended.
… on_context_limit Adds 9 per-step assertions (3 recipes × 3 steps) inside the appropriate test classes, plus two cross-recipe coverage tests: - test_bundled_recipes_emit_no_advisory_context_limit_warnings (standalone) - test_all_advisory_run_skill_steps_have_on_context_limit (parametrized) All tests are RED until the recipe YAML fixes land in Phase 2. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…x, create_summary Adds explicit on_context_limit routing to all advisory run_skill steps that were missing it across all five bundled recipes: - audit_impl (remediation, implementation, implementation-groups): escalate_stop A context-exhausted audit cannot provide a valid verdict; aborting prevents unapproved code from bypassing the merge gate. - audit_impl (merge-prs): cleanup_failure Same rationale as the three main recipes. - open_pr_step (remediation, implementation, implementation-groups): release_issue_failure PR state is unknown after a context limit; releasing the issue is the safe path. - ci_conflict_fix (remediation, implementation, implementation-groups): release_issue_failure An incomplete conflict fix cannot be safely pushed; abort and release. - create_summary (smoke-test): done Terminal reporting step; if context is exhausted, the pipeline completed successfully — route to done (clean stop) rather than escalate (failure stop). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…x, direct_merge_conflict_fix, immediate_merge_conflict_fix in all three main recipes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| "the skip intent is already in the note: field but not schema-enforced" | ||
| ) | ||
|
|
||
| def test_implementation_review_step_has_on_context_limit(self, recipe) -> None: |
There was a problem hiding this comment.
[warning] tests: Per-recipe review-step on_context_limit tests (test_implementation_review_step_has_on_context_limit here, test_ig_* at L765, test_remediation_* at L1112) are structurally identical across three test classes. The parametrized test at test_all_advisory_run_skill_steps_have_on_context_limit already covers presence for all recipes. If kept for exact-value checking, extract a shared helper to avoid triple-maintenance drift.
There was a problem hiding this comment.
Valid observation — flagged for design decision. The per-recipe tests (L465, L765, L1112) add exact on_context_limit value checks (e.g. == 'verify', == 'dry_walkthrough') that test_all_advisory_run_skill_steps_have_on_context_limit (L1709) does not provide. Whether to consolidate into a parametrized exact-value helper or keep per-recipe tests for explicit regression coverage requires a human decision on test granularity vs. maintenance.
Trecek
left a comment
There was a problem hiding this comment.
AutoSkillit review found 7 blocking issues (changes_requested). See inline comments.
Summary of findings:
rules_worktree.py:117[cohesion] Rule is in wrong file — covers all advisory steps, not just worktree stepstest_bundled_recipes.py:1510[tests] Potentially redundant test coverage (semantic rule vs direct field check for same invariant) — requires human decisiontest_bundled_recipes.py:465[tests] Copy-pasted per-recipe tests across 3 classes — extract shared helpertest_bundled_recipes.py:1125[cohesion] Inconsistent test name prefixes in remediation class (test_remediation_ vs test_if_)test_rules_worktree.py:311[slop] Redundant inline commenttest_rules_worktree.py:319[tests] Severity pre-filter obscures assertion intenttest_rules_worktree.py:359[slop] Redundant inline comment
Unpostable finding:
test_rules_worktree.py[cohesion] Advisory-step rule tests in wrong file (should mirror source module)
…ypass.py Rule covers any advisory run_skill step — not worktree-specific logic. rules_bypass.py already owns skip_when_false routing contracts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ontext_limit_warnings test_all_advisory_run_skill_steps_have_on_context_limit (L1709) already enforces the same invariant via a parametrized direct-field check per recipe. Keeping both creates dual-mechanism overlap with no additional coverage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
test_remediation_review_step_has_on_context_limit and _has_retries used an inconsistent prefix; renamed to test_if_review_step_* to match the rest of the TestRemediationPipelineStructure class. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… assertion - Remove '# on_context_limit deliberately absent' (test name + docstring convey intent) - Remove '# skip_when_false absent — non-advisory step' (same reason) - Assert on rule name across all findings first, then verify severity separately so a severity change doesn't silently discard the finding before any() runs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Part A added the
advisory-step-missing-context-limitsemantic rule and fixed thereviewstep in three recipes. After Part A, the rule still emitted WARNINGs for three other advisory steps (audit_impl,open_pr_step,ci_conflict_fix) in each of the three main recipes. Part B resolves those remaining WARNINGs by adding expliciton_context_limitroutes, then tightens the test gate so that zero advisory-step WARNINGs are permitted in bundled recipes going forward.Architecture Impact
Process Flow Diagram
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 55, 'curve': 'basis'}}}%% flowchart TB %% CLASS DEFINITIONS %% classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff; classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff; classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff; classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff; START([START]) ESCALATE([ESCALATE_STOP]) RELEASE_FAIL([RELEASE_ISSUE_FAILURE]) subgraph AdvisoryGate ["Advisory Step Gate (skip_when_false)"] direction TB REVIEW["● review<br/>━━━━━━━━━━<br/>skip_when_false: review_approach<br/>retries: 1<br/>on_context_limit: dry_walkthrough"] AUDIT["● audit_impl<br/>━━━━━━━━━━<br/>skip_when_false: inputs.audit<br/>on_context_limit: escalate_stop"] OPEN_PR["● open_pr_step<br/>━━━━━━━━━━<br/>skip_when_false: inputs.open_pr<br/>on_context_limit: release_issue_failure"] CI_FIX["● ci_conflict_fix<br/>━━━━━━━━━━<br/>skip_when_false: detect_ci_conflicts<br/>on_context_limit: release_issue_failure"] end subgraph SemanticRule ["● advisory-step-missing-context-limit Rule"] direction LR RULE["advisory-step-missing-context-limit<br/>━━━━━━━━━━<br/>WARNING: run_skill + skip_when_false<br/>but no on_context_limit"] RULECHECK{"on_context_limit<br/>set?"} end subgraph NormalFlow ["Normal Execution Path"] direction TB DRY["dry_walkthrough<br/>━━━━━━━━━━<br/>retries: 3"] TEST["test"] MERGE["merge"] PUSH["push"] CI_WATCH["ci_watch"] end subgraph AuditVerdict ["Audit Verdict Routing"] direction LR VERDICT{"audit verdict"} REMEDIATE["remediate → make_plan"] end START --> REVIEW REVIEW -->|"on_success"| DRY REVIEW -->|"on_context_limit (●new)"| DRY REVIEW -->|"on_failure"| RELEASE_FAIL DRY --> TEST TEST --> AUDIT AUDIT -->|"on_result: GO"| MERGE AUDIT --> VERDICT VERDICT -->|"NO GO"| REMEDIATE VERDICT -->|"error"| ESCALATE AUDIT -->|"on_context_limit (●new)"| ESCALATE AUDIT -->|"on_failure"| ESCALATE AUDIT -->|"skipped"| MERGE REMEDIATE --> DRY MERGE --> PUSH PUSH --> OPEN_PR OPEN_PR -->|"on_success"| CI_WATCH OPEN_PR -->|"on_context_limit (●new)"| RELEASE_FAIL OPEN_PR -->|"on_failure"| RELEASE_FAIL OPEN_PR -->|"skipped"| CI_WATCH CI_WATCH -->|"on_success"| MERGE CI_WATCH -->|"on_failure → detect_ci_conflict"| CI_FIX CI_FIX -->|"on_success"| PUSH CI_FIX -->|"on_context_limit (●new)"| RELEASE_FAIL CI_FIX -->|"on_failure"| RELEASE_FAIL RULE --> RULECHECK RULECHECK -->|"missing"| RULE RULECHECK -->|"present"| RULE %% CLASS ASSIGNMENTS %% class START terminal; class ESCALATE,RELEASE_FAIL terminal; class REVIEW,AUDIT,OPEN_PR,CI_FIX handler; class DRY,TEST,MERGE,PUSH,CI_WATCH phase; class VERDICT stateNode; class REMEDIATE detector; class RULE,RULECHECK output;Color Legend:
Error/Resilience Diagram
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 55, 'curve': 'basis'}}}%% flowchart TB %% CLASS DEFINITIONS %% classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff; classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff; classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff; classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000; subgraph SemanticGate ["● Semantic Validation Gate (rules_worktree.py)"] direction LR RULE["● advisory-step-missing-context-limit<br/>━━━━━━━━━━<br/>run_skill + skip_when_false<br/>but no on_context_limit → WARNING"] RCHECK{"on_context_limit<br/>declared?"} RULE_OK["Rule passes<br/>━━━━━━━━━━<br/>step is compliant"] end subgraph ReviewResilience ["● review step (skip_when_false: review_approach, retries: 1)"] direction TB REVIEW_RUN["run_skill: review-approach<br/>━━━━━━━━━━<br/>advisory; can be skipped"] REVIEW_STALE["context exhausted / stale"] REVIEW_RETRY{"retries<br/>remaining?"} REVIEW_SKIP["● on_context_limit: dry_walkthrough<br/>━━━━━━━━━━<br/>skip to next step (safe)"] end subgraph AuditResilience ["● audit_impl step (skip_when_false: inputs.audit) — merge gate"] direction TB AUDIT_RUN["run_skill: audit-impl<br/>━━━━━━━━━━<br/>merge gate; GO/NO GO verdict"] AUDIT_FAIL["context exhausted<br/>━━━━━━━━━━<br/>verdict unavailable"] AUDIT_LIMIT["● on_context_limit: escalate_stop<br/>━━━━━━━━━━<br/>abort — unapproved merge unsafe"] end subgraph PRResilience ["● open_pr_step + ci_conflict_fix (skip_when_false guarded)"] direction TB OPENPR_RUN["run_skill: open-pr<br/>━━━━━━━━━━<br/>skip_when_false: inputs.open_pr"] OPENPR_LIMIT["● on_context_limit: release_issue_failure<br/>━━━━━━━━━━<br/>PR state unknown; safe release"] CIFIX_RUN["run_skill: ci-conflict-fix<br/>━━━━━━━━━━<br/>skip_when_false: detect_ci_conflicts<br/>retries: 1"] CIFIX_LIMIT["● on_context_limit: release_issue_failure<br/>━━━━━━━━━━<br/>incomplete fix; push unsafe"] end T_SKIP([dry_walkthrough — continue]) T_ABORT([ESCALATE_STOP]) T_RELEASE([RELEASE_ISSUE_FAILURE]) %% SEMANTIC GATE %% RULE --> RCHECK RCHECK -->|"missing"| RULE RCHECK -->|"present"| RULE_OK %% REVIEW RESILIENCE %% REVIEW_RUN -->|"stale / context limit"| REVIEW_STALE REVIEW_STALE --> REVIEW_RETRY REVIEW_RETRY -->|"yes (retries: 1)"| REVIEW_RUN REVIEW_RETRY -->|"exhausted"| REVIEW_SKIP REVIEW_SKIP --> T_SKIP %% AUDIT RESILIENCE %% AUDIT_RUN -->|"context exhausted"| AUDIT_FAIL AUDIT_FAIL --> AUDIT_LIMIT AUDIT_LIMIT --> T_ABORT %% PR RESILIENCE %% OPENPR_RUN -->|"context exhausted"| OPENPR_LIMIT OPENPR_LIMIT --> T_RELEASE CIFIX_RUN -->|"context exhausted<br/>(after retries)"| CIFIX_LIMIT CIFIX_LIMIT --> T_RELEASE %% CLASS ASSIGNMENTS %% class RULE,RCHECK detector; class RULE_OK output; class REVIEW_RUN,AUDIT_RUN,OPENPR_RUN,CIFIX_RUN handler; class REVIEW_STALE,AUDIT_FAIL gap; class REVIEW_RETRY stateNode; class REVIEW_SKIP,AUDIT_LIMIT,OPENPR_LIMIT,CIFIX_LIMIT output; class T_SKIP,T_ABORT,T_RELEASE terminal;Color Legend:
Closes #481
Implementation Plan
Plan file:
/home/talon/projects/autoskillit-runs/remediation-20260322-120142-791819/temp/rectify/rectify_review-step-context-limit-routing_2026-03-22_120142_part_b.md🤖 Generated with Claude Code via AutoSkillit