Skip to content

feat: multi-pass skill pipeline#144

Open
gricha wants to merge 12 commits intomainfrom
feat/multi-pass-pipeline
Open

feat: multi-pass skill pipeline#144
gricha wants to merge 12 commits intomainfrom
feat/multi-pass-pipeline

Conversation

@gricha
Copy link
Member

@gricha gricha commented Feb 13, 2026

Summary

  • Multi-pass pipeline: skills run in phases, with later phases receiving findings from earlier ones
  • Two skill scopes: diff (default, per-hunk analysis) and report (single SDK call on all prior findings)
  • Report-scoped skills skip the file/hunk loop entirely, running once with all prior findings serialized into the prompt
  • scope = "report" implies last phase automatically (no need to set phase manually)
  • phase field still available for ordering between diff-scoped skills

Changes

  • Config: scope and phase fields on SkillConfigSchema, threaded through ResolvedTrigger
  • Pipeline: groupByPhase() generic utility groups items by phase, sorted ascending. Report-scoped triggers forced to phase = Infinity
  • Prompt: buildReportUserPrompt() for report-scoped skills, serializeAllPriorFindings() (all files), serializePriorFindings() (per-file). Extracted shared formatPRContext() helper
  • Analysis: analyzeReport() function for single-call report analysis. Early returns in runSkill() and runSkillTask() for report scope
  • SDK types: scope and priorReports on SkillRunnerOptions
  • CLI: Phase-aware loops in runConfigMode() and runSkills(), scope threaded to runner options
  • Action: Phase-aware executeAllTriggers() in PR workflow, scope threaded through executor
  • warden.toml: warden-lint-judge uses scope = "report" with remote from getsentry/skills

Test plan

  • pnpm lint && pnpm build && pnpm test (1044 tests pass)
  • loader.test.ts: scope schema validation, report-scoped triggers forced to last phase
  • prompt.test.ts: buildReportUserPrompt includes all findings, changed files, handles empty findings
  • prompt.test.ts: serializeAllPriorFindings includes findings from all files and skills
  • Manual: run against gricha/perry#162, verify single SDK call for report-scoped skill, correct findings
  • Verify phase-1-only configs are unaffected (no behavioral change when no scope/phase field set)

@vercel
Copy link

vercel bot commented Feb 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
warden Ready Ready Preview, Comment Feb 14, 2026 1:22am

Request Review

Copy link
Contributor

@sentry-warden sentry-warden bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [PAX-3A8] Missing phase property when skill specified via --skill CLI flag (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill flag, line 244 does not include phase: match?.phase, so phase-2 skills invoked directly via CLI will always run in phase 1 (the default), ignoring their configured phase.

Identified by Warden via notseer

Copy link
Contributor

@sentry-warden sentry-warden bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [LUG-YU5] Missing phase property when skill specified via --skill CLI flag (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill CLI flag, line 244 creates a SkillToRun without the phase property, while line 257 correctly includes it. This means phase-2 skills run via --skill flag will be treated as phase-1 and won't receive prior findings.

Identified by Warden via notseer

Copy link
Contributor

@sentry-warden sentry-warden bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 [BXF-6WZ] Missing phase property when skill is specified via CLI flag (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill CLI flag, the phase property from config is not carried forward. Line 244 includes remote and filters from the matched config but omits phase. A phase-2 skill invoked via CLI will run as phase 1 and won't receive priorReports in its context.

Identified by Warden via notseer

Copy link
Contributor

@sentry-warden sentry-warden bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 [2FX-L5J] Missing phase property when skill is specified via CLI flag (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill, the phase property is not included from the matched config. This causes phase-2 skills to run in phase 1 when invoked directly, and they won't receive prior findings from phase-1 skills.

Identified by Warden via notseer

Copy link
Contributor

@sentry-warden sentry-warden bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 [HC5-8YX] Missing phase property when skill specified via CLI (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill flag, line 244 extracts remote and filters from the matched config but omits phase. Skills configured as phase 2 in warden.toml will incorrectly run in phase 1 when invoked via CLI.

Identified by Warden via notseer

Copy link
Contributor

@sentry-warden sentry-warden bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [R86-6M3] Missing phase property when explicit skill is specified via CLI (src/cli/main.ts:244) (high confidence)

On line 244, when constructing SkillToRun for an explicit --skill flag, the phase property from match is not included. This causes phase-2 skills to be incorrectly grouped as phase-1 and not receive prior findings. Line 257 correctly includes phase: t.phase.

Identified by Warden via notseer

Copy link
Contributor

@sentry-warden sentry-warden bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [78K-KEM] Missing phase property when skill specified via --skill CLI flag (src/cli/main.ts:244) (high confidence)

Line 244 creates a SkillToRun object but omits the phase property from match. Line 257 correctly includes phase: t.phase for config-based skills. When a user explicitly runs a phase-2 skill via --skill my-skill, the phase will be undefined and default to 1 (line 286), causing the skill to run without receiving priorReports from phase-1 skills.

Identified by Warden via notseer

Copy link
Contributor

@sentry-warden sentry-warden bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 [M47-LKS] Explicit skill CLI option drops phase and scope from config (src/cli/main.ts:245) (high confidence)

When a skill is specified via --skill CLI flag (line 245), the phase and scope fields from the matched config are not included in the skillsToRun object, unlike the config-based path (line 258). This causes phase-2 or report-scoped skills to run incorrectly as phase-1 diff-scoped when invoked explicitly.

🟠 [ZJP-6KG] Report-scoped skill silently ignores analysis failure (src/cli/output/tasks.ts:200) (high confidence)

When analyzeReport() returns failed: true (due to SDK error, auth error, or exception), the code ignores the failed flag and treats it as a successful run with zero findings. The file-scoped path tracks failedHunks in the report, but report-scoped path does not track or surface failure.

⚠️ [88N-N89] Report-scoped findings lose file path - overwritten with empty string (src/sdk/analyze.ts:678) (high confidence)

validateFindings unconditionally overwrites location.path with the filename parameter (line 293 of extract.ts). When analyzeReport passes empty string for filename (line 678), any valid path the LLM provided gets overwritten with empty string. The comment claims it 'validates/fills location from the finding itself' but the code does the opposite.

Identified by Warden via notseer

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.


// Report-scoped findings don't have a single filename context.
// Pass empty string — parseHunkOutput validates/fills location from the finding itself.
const parseResult = await parseHunkOutput(resultMessage, '', apiKey);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Report-scoped findings get paths overwritten to empty string

High Severity

analyzeReport passes an empty string as the filename argument to parseHunkOutput, which internally calls validateFindings(findings, ''). That function unconditionally overwrites every finding's location.path with the supplied filename — so all report-scoped findings end up with path: ''. The comment claims parseHunkOutput preserves the finding's own path, but validateFindings does the opposite: it force-sets path to the provided filename at both mutation (line 284 in extract.ts) and construction (line 293). This breaks inline PR comments, deduplication keys, and all location rendering for report-scoped skills.

Fix in Cursor Fix in Web

gricha and others added 6 commits February 13, 2026 17:21
Skills can declare `phase = 2` in warden.toml to run after phase-1 skills
complete. Phase-2 skills receive all phase-1 SkillReport[] serialized into
their prompt context, enabling meta-analysis skills like a linter rule judge.

- Add `phase` field to SkillConfigSchema and ResolvedTrigger
- Add groupByPhase() utility in src/pipeline/
- Inject prior findings into buildHunkUserPrompt() between PR context and diff
- Thread priorReports through SkillRunnerOptions → analyzeHunk → prompt
- Phase-aware execution loops in CLI (runConfigMode, runSkills) and Action
- Example linter-rule-judge skill as proof-of-concept phase-2 skill

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rewrite skill to only report when it can generate an actual lint rule
config or implementation as a suggestedFix. Skip silently otherwise.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Skill now must detect the project's actual linter system before proposing
anything. Only reports when it can produce a concrete rule with high
precision. Silent when nothing qualifies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sharpen the confidence bar: rules must match specific syntactic patterns
in the AST with zero/near-zero false positives. Explicitly distinguish
deterministic checks (ban eval(), require ===) from heuristic guesses
(looks like user input, probably a bug). Skip anything that isn't the
former.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Findings target linter config/plugin files, not source code. Setting
location to the original finding's source line causes nonsensical
inline comments (oxlint config snippets suggested on loader.ts lines).
Omitting location makes them top-level review comments instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…r local --fix only

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gricha and others added 6 commits February 13, 2026 17:21
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The summary now includes the skill's description from SKILL.md frontmatter,
giving phase-2 skills like linter-rule-judge a proper framing paragraph
in PR comments instead of just stats.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Description explains GitHub limitations and directs users to
copy-paste each finding as a prompt to their local coding agent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The runSkillTask path in tasks.ts was the second call site that
wasn't passing skill.description, so the summary never included
the framing text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Report-scoped skills run once with all prior findings instead of
per-hunk. Single SDK call, no diff. Implicitly ordered after all
diff-scoped skills via phase = Infinity.

- Add `scope` field to SkillConfigSchema and ResolvedTrigger
- Add `buildReportUserPrompt` and `serializeAllPriorFindings`
- Add `analyzeReport` function, branch in `runSkill`/`runSkillTask`
- Thread scope through CLI and action callers
- Use `scope = "report"` for warden-lint-judge in warden.toml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Now using remote warden-lint-judge from getsentry/skills.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gricha gricha force-pushed the feat/multi-pass-pipeline branch from 45ef09c to 57dd6d3 Compare February 14, 2026 01:21
gricha added a commit to getsentry/dotagents that referenced this pull request Feb 14, 2026
Add GitHub Actions workflow and update warden.toml to use the current
skills schema. Configures notseer and security-review skills on PR
events targeting src/**/*.ts. warden-lint-judge is commented out
pending getsentry/warden#144.

Co-Authored-By: Claude <noreply@anthropic.com>
gricha added a commit to getsentry/dotagents that referenced this pull request Feb 14, 2026
* ci: Set up Warden PR review

Add GitHub Actions workflow and update warden.toml to use the current
skills schema. Configures notseer and security-review skills on PR
events targeting src/**/*.ts. warden-lint-judge is commented out
pending getsentry/warden#144.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(ci): Use warden@v0 (v1 not yet available)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(ci): Pass WARDEN_ANTHROPIC_API_KEY to warden action

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments