Implementation Plan: P2-WP3 Create research-review.yaml Sub-Recipe #1973

Merged. Trecek merged 8 commits into develop from p2-wp3-create-research-review-yaml-sub-recipe/1702 on May 6, 2026.

Conversation

@Trecek Trecek commented May 6, 2026

Summary

Create src/autoskillit/recipes/research-review.yaml as a standalone sub-recipe containing the 22-step PR/review phase extracted from research.yaml. The recipe receives campaign-injected hidden ingredients (worktree_path, research_dir, report_path, experiment_plan, experiment_results, experiment_type, scope_report, visualization_plan_path) and lifts all review steps from research.yaml, either verbatim or with adaptations. It replaces the archival phase with dual terminal stops (review_pr_complete for PR mode, review_local_complete for local mode) and corrects routing targets so that they terminate within this sub-recipe rather than routing to archival or non-existent steps.

Closes #1702

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-20260505-184855-274993/.autoskillit/temp/make-plan/p2_wp3_create_research_review_yaml_plan_2026-05-05_185500.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

| Step | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|---|---|---|---|---|---|---|---|---|
| plan | 1 | 72 | 10.4k | 782.6k | 65.1k | 58 | 47.5k | 7m 3s |
| verify | 1 | 46 | 26.0k | 1.2M | 88.0k | 101 | 75.0k | 11m 41s |
| implement | 1 | 5.3M | 43.6k | 2.2M | 31.9k | 236 | 148.9k | 20m 47s |
| prepare_pr | 1 | 52 | 3.7k | 153.4k | 35.7k | 15 | 23.3k | 1m 3s |
| compose_pr | 1 | 59 | 2.1k | 158.9k | 25.9k | 15 | 12.9k | 43s |
| review_pr | 3 | 1.1k | 147.5k | 2.9M | 105.2k | 154 | 275.2k | 35m 44s |
| resolve_review | 3 | 999 | 91.5k | 7.5M | 102.9k | 294 | 227.3k | 33m 45s |
| ci_conflict_fix | 2 | 303 | 11.8k | 1.4M | 49.3k | 87 | 71.3k | 3m 53s |
| diagnose_ci | 1 | 92 | 2.8k | 344.4k | 45.0k | 24 | 32.6k | 1m 2s |
| resolve_ci | 1 | 118 | 4.2k | 520.6k | 46.8k | 36 | 33.8k | 4m 4s |
| Total | | 5.3M | 343.6k | 17.1M | 105.2k | | 947.7k | 1h 59m |

Token Efficiency

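| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|---|---|---|---|---|
| plan | 0 | | | |
| verify | 0 | | | |
| implement | 681 | 3298.1 | 218.7 | 64.0 |
| prepare_pr | 0 | | | |
| compose_pr | 0 | | | |
| review_pr | 0 | | | |
| resolve_review | 61 | 123140.0 | 3726.3 | 1500.1 |
| ci_conflict_fix | 3143 | 445.7 | 22.7 | 3.8 |
| diagnose_ci | 0 | | | |
| resolve_ci | 1 | 520633.0 | 33773.0 | 4237.0 |
| Total | 3886 | 4410.5 | 243.9 | 88.4 |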
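
The per-LoC columns appear to be each step's token count divided by the lines of code that step changed, left blank when a step changed nothing. A minimal sketch of that arithmetic, assuming this is how the ratios are derived. The function name `per_loc` is hypothetical, and the raw counts are read off the resolve_ci row, where LoC Changed is 1 and each ratio therefore equals the underlying count.

```python
from typing import Optional


def per_loc(tokens: int, loc_changed: int) -> Optional[float]:
    """Return tokens per changed line, or None when no lines changed."""
    if loc_changed == 0:
        # Steps such as plan or verify change no code, so the ratio is undefined.
        return None
    return round(tokens / loc_changed, 1)


# resolve_ci changed 1 line, so its ratios equal its raw token counts.
resolve_ci = {"cache_read": 520_633, "cache_write": 33_773, "output": 4_237}
ratios = {name: per_loc(count, 1) for name, count in resolve_ci.items()}
```

Because the Token Usage Summary rounds its counts (for example 7.5M), dividing those rounded figures by LoC only approximates the values in this table.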

@Trecek Trecek left a comment

AutoSkillit PR Review — Verdict: changes_requested

Comment thread src/autoskillit/recipes/research-review.yaml Outdated
Comment thread src/autoskillit/recipes/research-review.yaml Outdated
Comment thread src/autoskillit/recipes/research-review.yaml
Comment thread src/autoskillit/recipes/research-review.yaml
guard_pr_url:
  action: route
  on_result:
    - when: "${{ context.pr_url }}"

[warning] defense: guard_pr_url routes the missing-pr_url fallback to review_pr_complete (the success terminal stop) rather than escalate_stop. If compose_research_pr captured an empty pr_url, the recipe silently terminates as if it succeeded. Consider routing the empty-pr_url branch to escalate_stop to surface the failure explicitly.

Valid observation — flagged for design decision. The fallthrough to review_pr_complete is intentional graceful degradation (step note: 'Guard — skip review if pr_url is empty'). test_guard_pr_url_fallback_to_review_pr_complete asserts this routing. Whether to hard-fail on empty pr_url instead is a product-owner decision.

step_name: review_research_pr
capture:
review_verdict: "${{ result.verdict }}"
skip_when_false: inputs.review_pr

[warning] defense: review_research_pr uses skip_when_false but has no explicit on_success — only on_result. Confirm that the recipe engine treats skip_when_false as advancing via on_result's unconditional route when the condition is false, otherwise a skipped step may leave the recipe in an undefined state.

Valid observation — flagged for design decision. The step note documents 'All paths (skip, success, failure, context-limit) route to audit_claims'. Whether skip_when_false advances via on_result's unconditional route needs explicit verification from the recipe engine team.

step_name: audit_claims
capture:
audit_verdict: "${{ result.verdict }}"
skip_when_false: inputs.audit_claims

[warning] defense: audit_claims uses skip_when_false but has no explicit on_success — only on_result/on_failure/on_exhausted/on_context_limit. Same skip_when_false advancement ambiguity as review_research_pr. Verify the engine handles the skipped-step advancement case explicitly.
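
One way to make this class of finding mechanical is a static check over the parsed steps. The sketch below is only an illustration: it assumes steps are available as plain dictionaries keyed by step name, which may not match AutoSkillit's actual recipe model, and the function name `steps_skippable_without_on_success` is hypothetical. It does not answer how the engine advances a skipped step; it only lists the steps where that question arises.

```python
from typing import Any, Dict, List


def steps_skippable_without_on_success(steps: Dict[str, Dict[str, Any]]) -> List[str]:
    """List steps that declare skip_when_false but no explicit on_success."""
    return sorted(
        name
        for name, step in steps.items()
        if "skip_when_false" in step and "on_success" not in step
    )


# Hypothetical, heavily trimmed stand-in for the parsed recipe steps.
steps = {
    "review_research_pr": {"skip_when_false": "inputs.review_pr", "on_result": []},
    "audit_claims": {"skip_when_false": "inputs.audit_claims", "on_result": []},
    "re_push_research": {"on_success": "finalize_bundle"},
}
flagged = steps_skippable_without_on_success(steps)
```

With the stand-in data, both review_research_pr and audit_claims are flagged, matching the two steps raised in this review.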

Valid observation — flagged for design decision. Same skip_when_false advancement question as review_research_pr. Step note documents 'All paths (skip, success, failure) route to route_review_resolve'. Engine behavior verification needed from recipe engine team.

stale_threshold: 2400
idle_output_timeout: 0
optional_context_refs: [worktree_path]
with:

[warning] defense: re_run_experiment on_context_limit routes to re_push_research, bypassing re_generate_report and re_stage_bundle. On context-limit, the report and bundle will be stale relative to the re-run results. Consider routing to re_generate_report instead.

Valid observation — flagged for design decision. Current on_context_limit → re_push_research is best-effort: push code fixes even if re-experiment didn't complete. Routing to re_generate_report risks hitting another context limit. Design decision for orchestration team.

cmd: "cd '${{ context.worktree_path }}' && git push"
cwd: "${{ context.worktree_path }}"
step_name: re_push_research
on_success: finalize_bundle_render

[warning] cohesion: re_push_research routes on_failure to escalate_stop, diverging from the best-effort fallthrough pattern used by all other re-validation steps (re_run_experiment, re_generate_report, re_stage_bundle all route on_failure to the next step). Confirm whether push failure should hard-stop or fall through to finalize_bundle_render.

Valid observation — flagged for design decision. Push failure hard-stops while other re-validation steps fall through. Without a successful push the PR won't reflect re-validation results, so hard-stop may be correct. Design decision needed.

Comment thread src/autoskillit/recipes/research-review.yaml
Trecek added a commit that referenced this pull request May 6, 2026
- Fix re_push_research on_success: routes to finalize_bundle instead of
  finalize_bundle_render to ensure context.report_path_after_finalize is
  populated before finalize_bundle_render runs
- Remove duplicate LENS ITERATION note from run_experiment_lenses step
  (kitchen_rules already carries this directive)
- Fix review_pr_complete stop message: use context.research_dir instead
  of inputs.research_dir (consistent with all other step references)
- Document intentional ingredient shadowing in re_run_experiment and
  re_generate_report notes (experiment_results and report_path)
- Fix test_prepare_research_pr_uses_context_worktree_path: use recipe
  fixture and step.with_args assertions instead of raw YAML string count
- Update test_re_push_research_routes_to_finalize_bundle to match new routing
- Remove narrating inline comments in test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@Trecek Trecek left a comment

AutoSkillit PR Review -- Verdict: changes_requested

Comment thread src/autoskillit/recipes/research-review.yaml
Comment thread src/autoskillit/recipes/research-review.yaml
Comment thread src/autoskillit/recipes/research-review.yaml
model: ""
optional_context_refs: [worktree_path, research_dir, report_path, experiment_plan, experiment_results, experiment_type, scope_report, visualization_plan_path]
with:
skill_command: "/autoskillit:prepare-research-pr ${{ context.report_path }} ${{ context.experiment_plan }} ${{ context.worktree_path }} ${{ inputs.base_branch }}"

[warning] defense: prepare_research_pr skill_command interpolates context.report_path and context.experiment_plan without shell quoting. Both are listed as optional_context_refs (so either may be absent) and both are path-like values that may contain spaces, causing argument splitting or silent empty-argument injection. Wrap in quotes: "${{ context.report_path }}" "${{ context.experiment_plan }}".
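
The word-splitting risk can be reproduced with Python's standard shlex module. This sketch assumes the interpolated command is eventually split using POSIX shell rules, which the source does not confirm, and the example path is made up.

```python
import shlex

report_path = "/tmp/research run/report.md"  # hypothetical path containing a space

# Unquoted interpolation: the space splits the path into two arguments.
unquoted = shlex.split(f"/autoskillit:prepare-research-pr {report_path}")

# Quoted interpolation: shlex.quote keeps the path as a single argument.
quoted = shlex.split(f"/autoskillit:prepare-research-pr {shlex.quote(report_path)}")

# An empty value vanishes entirely when unquoted, shifting later arguments left.
empty_unquoted = shlex.split("cmd  next-arg")
empty_quoted = shlex.split(f"cmd {shlex.quote('')} next-arg")
```

Quoting preserves both the argument count and the argument positions, which matters here because the skill receives its inputs positionally.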

Valid observation — flagged for design decision. prepare_research_pr skill_command interpolates context.report_path and context.experiment_plan as unquoted comma-separated path lists; shell word-splitting would affect paths containing spaces.

capture:
html_path: "${{ result.html_path }}"
on_success: route_archive_or_export
on_failure: route_archive_or_export

[warning] arch: export_local_bundle calls autoskillit.recipe._cmd_rpc.export_local_bundle via run_python, accessing a private submodule by dotted path string. The prior review round noted this is the established pattern for run_python callables (verdict=REJECT). Raising for awareness only: if _cmd_rpc is intended as the stable recipe-layer RPC surface, promoting the callable to a public name would make that contract explicit and prevent silent breakage on rename.

Valid observation — flagged for design decision. export_local_bundle accesses autoskillit.recipe._cmd_rpc (private submodule) via run_python. This pattern was previously investigated in comment 3192708589 (thread resolved as intentional: _cmd_rpc is the recipe-layer callable interface explicitly tested in tests/recipe/test_cmd_rpc.py).

def test_requires_packs(self, recipe) -> None:
    assert set(recipe.requires_packs) == {"research", "exp-lens", "vis-lens"}

def test_no_autoskillit_version(self) -> None:

[warning] tests: test_no_autoskillit_version bypasses the class fixture and reads the YAML file directly via path.read_text(). This continues the raw-file-access pattern resolved at L119. The assertion can be done on the loaded recipe model instead of raw text, keeping all structural assertions on the model.

Valid observation — flagged for design decision. test_no_autoskillit_version reads the YAML file directly via path.read_text() rather than using the class-scoped recipe fixture. Flagged for consistent fixture usage review.

    content = path.read_text()
    assert "patch_token_summary" not in content

def test_no_begin_archival_references(self) -> None:

[warning] tests: test_no_patch_token_summary_references (L176) and test_no_begin_archival_references (L181) both bypass the class fixture and do raw substring searches on the YAML file. The negative-guard intent is already fully covered by test_no_archival_steps (L162) which asserts against the parsed step name set. These raw-text tests are redundant with L162 and conflate YAML serialisation detail with semantic recipe state.
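
The difference between the two styles of negative guard can be shown with a toy recipe. In this sketch the raw text and the parsed dictionary are hand-written stand-ins (no YAML parser is involved, and the note wording is invented): a substring search fails because a note mentions the removed step, while a check on parsed step names passes.

```python
# Hand-written stand-ins for the YAML file text and its parsed form.
raw_text = """
steps:
  finalize_bundle:
    note: "Replaces the old begin_archival hand-off."
"""
parsed = {
    "steps": {
        "finalize_bundle": {"note": "Replaces the old begin_archival hand-off."},
    }
}

# Raw substring guard: trips on the note even though no such step exists.
raw_guard_passes = "begin_archival" not in raw_text

# Structural guard: inspects only the set of step names.
step_names = set(parsed["steps"])
structural_guard_passes = "begin_archival" not in step_names
```

A guard on step names alone would not catch a routing target that still points at a removed step, so a separate check over routing fields may still be worthwhile.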

Valid observation — flagged for design decision. test_no_patch_token_summary_references (L176) and test_no_begin_archival_references (L181) both bypass the class fixture with raw path.read_text() calls. Flagged for consistent fixture usage review.

@Trecek Trecek left a comment

AutoSkillit review: 1 blocking finding detected (critical/bugs). See inline comments for details — changes required before merge.

Comment thread src/autoskillit/recipes/research-review.yaml Outdated
Comment thread tests/recipe/test_research_review_recipe.py
Comment thread tests/recipe/test_research_review_recipe.py
Comment thread tests/recipe/test_research_review_recipe.py Outdated
Comment thread tests/recipe/test_research_review_recipe.py
Comment thread src/autoskillit/recipes/research-review.yaml
Comment thread src/autoskillit/recipes/research-review.yaml
Comment thread src/autoskillit/recipes/research-review.yaml
Comment thread src/autoskillit/recipes/research-review.yaml
Comment thread src/autoskillit/recipes/research-review.yaml
Comment thread src/autoskillit/recipes/research-review.yaml Outdated
Comment thread src/autoskillit/recipes/research-review.yaml Outdated

@Trecek Trecek left a comment

AutoSkillit review: 2 critical findings detected — changes required. (Self-authored PR: REQUEST_CHANGES downgraded to COMMENT.)

Blocking (critical):

  • tests/recipe/test_research_review_recipe.py L133: Duplicate method test_finalize_bundle_failure_routes_to_escalate_stop — first definition asserts on_success (wrong) and is silently shadowed.
  • tests/recipe/test_research_review_recipe.py L137: Second duplicate definition replaces first; remove the dead first definition.
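
The shadowing behind both findings is standard Python: a second `def` with the same name in a class body rebinds the attribute, so the first definition is never collected or run. A minimal sketch with hypothetical class and method names:

```python
class RoutingChecks:
    def check_failure_route(self) -> str:
        return "first definition"

    def check_failure_route(self) -> str:  # same name: rebinds the attribute
        return "second definition"


# Only one attribute with that name survives in the class namespace.
surviving = RoutingChecks().check_failure_route()
method_count = sum(1 for name in vars(RoutingChecks) if name == "check_failure_route")
```

No error is raised at import time, which is why a wrong assertion in the first definition can go unnoticed. Linters that report redefinitions (Pyflakes and Ruff use the code F811 for this, as far as I know) would normally catch it.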

⚠️ Outside Diff Range

tests/recipe/test_research_review_recipe.py

  • L198 [warning/tests]: test_review_local_complete_sentinel_fields only asserts 'local_bundle_path' is present, while the parallel review_pr_complete test (L191-L196) checks four fields. If the local-complete sentinel carries additional context fields, they are untested.

Trecek force-pushed the p2-wp3-create-research-review-yaml-sub-recipe/1702 branch from c8d23c7 to 612f525 on May 6, 2026 04:01
Trecek and others added 8 commits May 5, 2026 21:15
Add src/autoskillit/recipes/research-review.yaml — a standalone
22-step PR/review sub-recipe extracted from research.yaml. Receives
campaign-injected hidden ingredients (worktree_path, research_dir,
report_path, etc.), runs the review pipeline, and terminates with
sentinel emission at review_pr_complete or review_local_complete.

Key adaptations vs research.yaml:
- Removes all archival steps (begin_archival, capture_experiment_branch,
  create_artifact_branch, open_artifact_pr, tag_experiment_branch,
  close_experiment_pr, patch_token_summary, research_complete)
- Replaces archival routing with dual terminal stops:
  review_pr_complete (PR mode) and review_local_complete (local mode)
- finalize_bundle.on_success → finalize_bundle_render (not re_push_research)
- route_archive_or_export PR fall-through → review_pr_complete
- export_local_bundle → review_local_complete on both success/failure
- re_push_research → finalize_bundle_render on success
- Uses optional_context_refs for campaign-injected context variables

Also adds tests/recipe/test_research_review_recipe.py with 32 tests
covering header, ingredients, steps, routing adaptations, negative
guards, terminal stop sentinels, and kitchen_rules.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…view

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Fix re_push_research on_success: routes to finalize_bundle instead of
  finalize_bundle_render to ensure context.report_path_after_finalize is
  populated before finalize_bundle_render runs
- Remove duplicate LENS ITERATION note from run_experiment_lenses step
  (kitchen_rules already carries this directive)
- Fix review_pr_complete stop message: use context.research_dir instead
  of inputs.research_dir (consistent with all other step references)
- Document intentional ingredient shadowing in re_run_experiment and
  re_generate_report notes (experiment_results and report_path)
- Fix test_prepare_research_pr_uses_context_worktree_path: use recipe
  fixture and step.with_args assertions instead of raw YAML string count
- Update test_re_push_research_routes_to_finalize_bundle to match new routing
- Remove narrating inline comments in test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…d_pr_url path

review_pr_complete is reachable via guard_pr_url fallback when pr_url is empty,
bypassing finalize_bundle. In that path, context.report_path_after_finalize is
never captured and interpolates to empty string. Add a note to the stop message
documenting this known-absent case so the LLM orchestrator emits empty string
rather than silently misrepresenting the sentinel field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Comment decomposing 15 = 7 + 8 creates false precision since the
filter does not distinguish the two groups; plain count is sufficient.

Addresses reviewer comment 3192847593.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…t ref

- Remove LOCAL MODE OUTPUT kitchen_rule: file-layout documentation
  already in export_local_bundle step note (comment 3192848609)
- Remove SENTINEL EMISSION kitchen_rule: restates terminal stop
  messages which already contain the sentinel field list (comment 3192848715)
- Remove redundant cd in re_push_research cmd: cwd field already sets
  working directory; sibling steps use cwd-only (comment 3192848495)
- Add all_diagram_paths to finalize_bundle_render optional_context_refs:
  capture_list output may be absent if no lenses ran (comment 3192848381)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Ruff E501 violations introduced by develop commit 2577763 — two docstrings
exceeded the 99-char line limit after rebasing onto upstream/develop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Trecek force-pushed the p2-wp3-create-research-review-yaml-sub-recipe/1702 branch from c437b07 to 9b52626 on May 6, 2026 04:17
@Trecek Trecek added this pull request to the merge queue May 6, 2026
Merged via the queue into develop with commit 4f91b54 May 6, 2026
2 checks passed
@Trecek Trecek deleted the p2-wp3-create-research-review-yaml-sub-recipe/1702 branch May 6, 2026 04:32
Trecek added a commit that referenced this pull request May 8, 2026
…1973)

## Summary

Create `src/autoskillit/recipes/research-review.yaml` as a standalone
sub-recipe containing the 22-step PR/review phase extracted from
`research.yaml`. The recipe receives campaign-injected hidden
ingredients (`worktree_path`, `research_dir`, `report_path`,
`experiment_plan`, `experiment_results`, `experiment_type`,
`scope_report`, `visualization_plan_path`), lifts all review steps
verbatim-and-adapted, replaces the archival phase with dual terminal
stops (`review_pr_complete` for PR mode, `review_local_complete` for
local mode), and corrects routing targets to terminate within this
sub-recipe rather than routing to archival or non-existent steps.

Closes #1702

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-184855-274993/.autoskillit/temp/make-plan/p2_wp3_create_research_review_yaml_plan_2026-05-05_185500.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>