feat: multi-pass skill pipeline by gricha · Pull Request #144 · getsentry/warden

gricha · 2026-02-13T20:46:23Z

Summary

Multi-pass pipeline: skills run in phases, with later phases receiving findings from earlier ones
Two skill scopes: diff (default, per-hunk analysis) and report (single SDK call on all prior findings)
Report-scoped skills skip the file/hunk loop entirely, running once with all prior findings serialized into the prompt
scope = "report" implies last phase automatically (no need to set phase manually)
phase field still available for ordering between diff-scoped skills
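As a rough illustration of "all prior findings serialized into the prompt", here is a minimal sketch. The `Finding` and `SkillReport` shapes and the output format are assumptions made for this example; the actual `serializeAllPriorFindings()` in the PR may differ.

```typescript
// Hypothetical, simplified shapes; the real SkillReport type likely has more fields.
interface Finding {
  title: string;
  severity: string;
  path?: string;
  line?: number;
}

interface SkillReport {
  skill: string;
  findings: Finding[];
}

// Flatten every finding from every prior report into one text block,
// suitable for embedding in a single report-scoped prompt.
function serializeAllPriorFindingsSketch(reports: SkillReport[]): string {
  const lines: string[] = [];
  for (const report of reports) {
    for (const f of report.findings) {
      const where = f.path
        ? `${f.path}${f.line !== undefined ? `:${f.line}` : ''}`
        : '(no location)';
      lines.push(`- [${report.skill}] ${f.severity}: ${f.title} (${where})`);
    }
  }
  // An explicit marker keeps the prompt well-formed when nothing was found.
  return lines.length > 0 ? lines.join('\n') : 'No prior findings.';
}
```

Because the serializer walks all reports rather than one file's hunks, a report-scoped skill would need only one call regardless of how many files changed.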

Changes

Config: scope and phase fields on SkillConfigSchema, threaded through ResolvedTrigger
Pipeline: groupByPhase() generic utility groups items by phase, sorted ascending. Report-scoped triggers forced to phase = Infinity
Prompt: buildReportUserPrompt() for report-scoped skills, serializeAllPriorFindings() (all files), serializePriorFindings() (per-file). Extracted shared formatPRContext() helper
Analysis: analyzeReport() function for single-call report analysis. Early returns in runSkill() and runSkillTask() for report scope
SDK types: scope and priorReports on SkillRunnerOptions
CLI: Phase-aware loops in runConfigMode() and runSkills(), scope threaded to runner options
Action: Phase-aware executeAllTriggers() in PR workflow, scope threaded through executor
warden.toml: warden-lint-judge uses scope = "report" with remote from getsentry/skills
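The grouping rule described under Pipeline can be sketched as follows. This is an assumption-laden illustration, not the PR's `groupByPhase()`: the `Phased` interface, the helper names, and the return shape are invented here. The default of phase 1 and the `Infinity` rule for report scope come from the description and review comments.

```typescript
// Hypothetical item shape: only `phase` and `scope` matter for grouping.
interface Phased {
  phase?: number;
  scope?: 'diff' | 'report';
}

// Report-scoped items are forced to the last phase; everything else defaults to phase 1.
function effectivePhase(item: Phased): number {
  if (item.scope === 'report') return Infinity;
  return item.phase ?? 1;
}

// Group items by effective phase and return the groups in ascending phase order.
function groupByPhaseSketch<T extends Phased>(
  items: T[],
): Array<{ phase: number; items: T[] }> {
  const groups = new Map<number, T[]>();
  for (const item of items) {
    const phase = effectivePhase(item);
    const bucket = groups.get(phase);
    if (bucket) bucket.push(item);
    else groups.set(phase, [item]);
  }
  return Array.from(groups.entries())
    // Compare instead of subtracting so Infinity never produces NaN.
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([phase, grouped]) => ({ phase, items: grouped }));
}
```

Using `Infinity` as the phase key means a report-scoped skill sorts after any finite phase number without the config author having to pick one.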

Test plan

pnpm lint && pnpm build && pnpm test (1044 tests pass)
loader.test.ts: scope schema validation, report-scoped triggers forced to last phase
prompt.test.ts: buildReportUserPrompt includes all findings, changed files, handles empty findings
prompt.test.ts: serializeAllPriorFindings includes findings from all files and skills
Manual: run against gricha/perry#162, verify single SDK call for report-scoped skill, correct findings
Verify phase-1-only configs are unaffected (no behavioral change when no scope/phase field set)
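The last check in the test plan can be reasoned about with a small sketch of a phase-aware loop. Everything here is hypothetical (`PhasedSkill`, `SkillReportLite`, `runInPhasesSketch`), and the runner is synchronous for brevity, whereas a real runner would await SDK calls. The point is only that a config with no `phase` or `scope` set collapses to a single phase whose skills all see an empty list of prior reports.

```typescript
// Hypothetical minimal shapes for illustration only.
interface SkillReportLite {
  skill: string;
  findings: string[];
}

interface PhasedSkill {
  name: string;
  phase?: number;
}

type SyncRunner = (skill: PhasedSkill, priorReports: SkillReportLite[]) => SkillReportLite;

// Run skills phase by phase; each phase receives the reports of all earlier phases.
function runInPhasesSketch(skills: PhasedSkill[], run: SyncRunner): SkillReportLite[] {
  const phases = Array.from(new Set(skills.map((s) => s.phase ?? 1))).sort((a, b) => a - b);
  const completed: SkillReportLite[] = [];
  for (const phase of phases) {
    const batch = skills.filter((s) => (s.phase ?? 1) === phase);
    // Snapshot so skills within one phase never see each other's output.
    const prior = [...completed];
    const reports = batch.map((s) => run(s, prior));
    completed.push(...reports);
  }
  return completed;
}
```

With only phase-1 skills, `prior` is always empty, which matches the expectation of no behavioral change for existing configs.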

vercel · 2026-02-13T20:46:24Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
warden	Ready	Preview, Comment	Feb 14, 2026 1:22am

sentry-warden

⚠️ [PAX-3A8] Missing phase property when skill specified via --skill CLI flag (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill flag, line 244 does not include phase: match?.phase, so phase-2 skills invoked directly via CLI will always run in phase 1 (the default), ignoring their configured phase.

Identified by Warden via notseer

sentry-warden

⚠️ [LUG-YU5] Missing phase property when skill specified via --skill CLI flag (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill CLI flag, line 244 creates a SkillToRun without the phase property, while line 257 correctly includes it. This means phase-2 skills run via --skill flag will be treated as phase-1 and won't receive prior findings.

Identified by Warden via notseer

sentry-warden

🟠 [BXF-6WZ] Missing phase property when skill is specified via CLI flag (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill CLI flag, the phase property from config is not carried forward. Line 244 includes remote and filters from the matched config but omits phase. A phase-2 skill invoked via CLI will run as phase 1 and won't receive priorReports in its context.

Identified by Warden via notseer

sentry-warden

🟠 [2FX-L5J] Missing phase property when skill is specified via CLI flag (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill, the phase property is not included from the matched config. This causes phase-2 skills to run in phase 1 when invoked directly, and they won't receive prior findings from phase-1 skills.

Identified by Warden via notseer

sentry-warden

🟠 [HC5-8YX] Missing phase property when skill specified via CLI (src/cli/main.ts:244) (high confidence)

When a skill is explicitly specified via --skill flag, line 244 extracts remote and filters from the matched config but omits phase. Skills configured as phase 2 in warden.toml will incorrectly run in phase 1 when invoked via CLI.

Identified by Warden via notseer

sentry-warden

⚠️ [R86-6M3] Missing phase property when explicit skill is specified via CLI (src/cli/main.ts:244) (high confidence)

On line 244, when constructing SkillToRun for an explicit --skill flag, the phase property from match is not included. This causes phase-2 skills to be incorrectly grouped as phase-1 and not receive prior findings. Line 257 correctly includes phase: t.phase.

Identified by Warden via notseer

src/sdk/analyze.ts

sentry-warden

⚠️ [78K-KEM] Missing phase property when skill specified via --skill CLI flag (src/cli/main.ts:244) (high confidence)

Line 244 creates a SkillToRun object but omits the phase property from match. Line 257 correctly includes phase: t.phase for config-based skills. When a user explicitly runs a phase-2 skill via --skill my-skill, the phase will be undefined and default to 1 (line 286), causing the skill to run without receiving priorReports from phase-1 skills.

Identified by Warden via notseer
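The repeated finding above comes down to one missing property when the `--skill` path builds its entry. A minimal sketch of the fix, assuming simplified shapes: `ConfiguredSkill` and `skillFromCliFlag` are hypothetical names, and the fields of `SkillToRun` shown here are a guess based on the properties the comments mention. Carrying `scope` the same way is likewise an assumption.

```typescript
// Hypothetical shapes modelled on the names used in the review comments.
interface ConfiguredSkill {
  name: string;
  remote?: string;
  filters?: string[];
  phase?: number;
  scope?: 'diff' | 'report';
}

interface SkillToRun {
  skill: string;
  remote?: string;
  filters?: string[];
  phase?: number;
  scope?: 'diff' | 'report';
}

// Build the entry for an explicit --skill invocation, copying ordering
// fields from the matched config instead of silently dropping them.
function skillFromCliFlag(skillName: string, configured: ConfiguredSkill[]): SkillToRun {
  const match = configured.find((s) => s.name === skillName);
  return {
    skill: skillName,
    remote: match?.remote,
    filters: match?.filters,
    phase: match?.phase, // the property the findings report as missing
    scope: match?.scope, // assumed to need the same treatment
  };
}
```

When no config entry matches, both fields stay `undefined`, so the skill falls back to the default phase as before.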

sentry-warden

🟠 [M47-LKS] Explicit skill CLI option drops phase and scope from config (src/cli/main.ts:245) (high confidence)

When a skill is specified via --skill CLI flag (line 245), the phase and scope fields from the matched config are not included in the skillsToRun object, unlike the config-based path (line 258). This causes phase-2 or report-scoped skills to run incorrectly as phase-1 diff-scoped when invoked explicitly.

🟠 [ZJP-6KG] Report-scoped skill silently ignores analysis failure (src/cli/output/tasks.ts:200) (high confidence)

When analyzeReport() returns failed: true (due to SDK error, auth error, or exception), the code ignores the failed flag and treats it as a successful run with zero findings. The file-scoped path tracks failedHunks in the report, but report-scoped path does not track or surface failure.

⚠️ [88N-N89] Report-scoped findings lose file path - overwritten with empty string (src/sdk/analyze.ts:678) (high confidence)

validateFindings unconditionally overwrites location.path with the filename parameter (line 293 of extract.ts). When analyzeReport passes empty string for filename (line 678), any valid path the LLM provided gets overwritten with empty string. The comment claims it 'validates/fills location from the finding itself' but the code does the opposite.

Identified by Warden via notseer
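For the ZJP-6KG finding, one way to avoid reporting a failed run as "zero findings" is to branch on the `failed` flag before summarizing. This is a sketch under assumptions: only `failed` is named in the finding, while `ReportAnalysisResult`, `TaskOutcome`, and `summarizeReportRun` are hypothetical.

```typescript
// Hypothetical result shape; only `failed` is named in the review comment.
interface ReportAnalysisResult {
  findings: string[];
  failed: boolean;
  error?: string;
}

interface TaskOutcome {
  status: 'ok' | 'failed';
  findingCount: number;
  message: string;
}

// Distinguish "ran and found nothing" from "did not run successfully".
function summarizeReportRun(result: ReportAnalysisResult): TaskOutcome {
  if (result.failed) {
    return {
      status: 'failed',
      findingCount: 0,
      message: result.error ?? 'report analysis failed',
    };
  }
  return {
    status: 'ok',
    findingCount: result.findings.length,
    message: `${result.findings.length} finding(s)`,
  };
}
```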

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-02-14T01:20:56Z

src/sdk/analyze.ts

+
+        // Report-scoped findings don't have a single filename context.
+        // Pass empty string — parseHunkOutput validates/fills location from the finding itself.
+        const parseResult = await parseHunkOutput(resultMessage, '', apiKey);


Report-scoped findings get paths overwritten to empty string

High Severity

analyzeReport passes an empty string as the filename argument to parseHunkOutput, which internally calls validateFindings(findings, ''). That function unconditionally overwrites every finding's location.path with the supplied filename — so all report-scoped findings end up with path: ''. The comment claims parseHunkOutput preserves the finding's own path, but validateFindings does the opposite: it force-sets path to the provided filename at both mutation (line 284 in extract.ts) and construction (line 293). This breaks inline PR comments, deduplication keys, and all location rendering for report-scoped skills.
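One possible shape for a fix, sketched under assumptions: treat an empty filename as "no single file context" and keep the path the finding already carries. The types and `validateFindingsSketch` below are hypothetical and far simpler than the real `validateFindings` in extract.ts. Dropping the location when no usable path exists follows the commit note that locationless findings become top-level review comments.

```typescript
// Hypothetical, simplified shapes for illustration.
interface RawFinding {
  title: string;
  location?: { path?: string; startLine?: number };
}

interface ValidFinding {
  title: string;
  location?: { path: string; startLine?: number };
}

function validateFindingsSketch(findings: RawFinding[], filename: string): ValidFinding[] {
  return findings.map((f): ValidFinding => {
    // Hunk-scoped: pin the path to the analyzed file.
    // Report-scoped (empty filename): trust the finding's own path.
    const path = filename !== '' ? filename : f.location?.path?.trim();
    // No usable path: omit the location rather than emit path: ''.
    if (!f.location || !path) return { title: f.title };
    return { title: f.title, location: { path, startLine: f.location.startLine } };
  });
}
```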

Skills can declare `phase = 2` in warden.toml to run after phase-1 skills complete. Phase-2 skills receive all phase-1 SkillReport[] serialized into their prompt context, enabling meta-analysis skills like a linter rule judge.

- Add `phase` field to SkillConfigSchema and ResolvedTrigger
- Add groupByPhase() utility in src/pipeline/
- Inject prior findings into buildHunkUserPrompt() between PR context and diff
- Thread priorReports through SkillRunnerOptions → analyzeHunk → prompt
- Phase-aware execution loops in CLI (runConfigMode, runSkills) and Action
- Example linter-rule-judge skill as proof-of-concept phase-2 skill

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Skill now must detect the project's actual linter system before proposing anything. Only reports when it can produce a concrete rule with high precision. Silent when nothing qualifies. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Sharpen the confidence bar: rules must match specific syntactic patterns in the AST with zero/near-zero false positives. Explicitly distinguish deterministic checks (ban eval(), require ===) from heuristic guesses (looks like user input, probably a bug). Skip anything that isn't the former. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Findings target linter config/plugin files, not source code. Setting location to the original finding's source line causes nonsensical inline comments (oxlint config snippets suggested on loader.ts lines). Omitting location makes them top-level review comments instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The summary now includes the skill's description from SKILL.md frontmatter, giving phase-2 skills like linter-rule-judge a proper framing paragraph in PR comments instead of just stats. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Report-scoped skills run once with all prior findings instead of per-hunk. Single SDK call, no diff. Implicitly ordered after all diff-scoped skills via phase = Infinity.

- Add `scope` field to SkillConfigSchema and ResolvedTrigger
- Add `buildReportUserPrompt` and `serializeAllPriorFindings`
- Add `analyzeReport` function, branch in `runSkill`/`runSkillTask`
- Thread scope through CLI and action callers
- Use `scope = "report"` for warden-lint-judge in warden.toml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add GitHub Actions workflow and update warden.toml to use the current skills schema. Configures notseer and security-review skills on PR events targeting src/**/*.ts. warden-lint-judge is commented out pending getsentry/warden#144. Co-Authored-By: Claude <noreply@anthropic.com>

* ci: Set up Warden PR review

  Add GitHub Actions workflow and update warden.toml to use the current skills schema. Configures notseer and security-review skills on PR events targeting src/**/*.ts. warden-lint-judge is commented out pending getsentry/warden#144.

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix(ci): Use warden@v0 (v1 not yet available)

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix(ci): Pass WARDEN_ANTHROPIC_API_KEY to warden action

  Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>

vercel bot deployed to Preview February 13, 2026 20:46 View deployment

sentry-warden bot reviewed Feb 13, 2026

vercel bot deployed to Preview February 13, 2026 21:28 View deployment

sentry-warden bot reviewed Feb 13, 2026

vercel bot deployed to Preview February 13, 2026 21:33 View deployment

vercel bot deployed to Preview February 13, 2026 21:34 View deployment

vercel bot deployed to Preview February 13, 2026 21:49 View deployment

sentry-warden bot reviewed Feb 13, 2026

vercel bot deployed to Preview February 13, 2026 23:43 View deployment

sentry-warden bot reviewed Feb 13, 2026

vercel bot deployed to Preview February 13, 2026 23:52 View deployment

sentry-warden bot reviewed Feb 13, 2026

vercel bot deployed to Preview February 13, 2026 23:58 View deployment

sentry-warden bot reviewed Feb 14, 2026

vercel bot deployed to Preview February 14, 2026 00:04 View deployment

sentry-warden bot reviewed Feb 14, 2026

vercel bot deployed to Preview February 14, 2026 00:16 View deployment

sentry-warden bot reviewed Feb 14, 2026

gricha mentioned this pull request Feb 14, 2026

feat: add warden-lint-judge skill getsentry/skills#44

Merged

vercel bot deployed to Preview February 14, 2026 01:14 View deployment

gricha marked this pull request as ready for review February 14, 2026 01:14

sentry-warden bot reviewed Feb 14, 2026

cursor bot reviewed Feb 14, 2026

gricha and others added 6 commits February 13, 2026 17:21

refine linter-rule-judge: produce rules, not commentary

c110caa

Rewrite skill to only report when it can generate an actual lint rule config or implementation as a suggestedFix. Skip silently otherwise. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

linter-rule-judge: description is the primary output, suggestedFix fo…

f621f97

…r local --fix only Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

render skill summary as header in review body for locationless findings

9233626

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

linter-rule-judge: frame findings as agent-promptable instructions

a72eefb

Description explains GitHub limitations and directs users to copy-paste each finding as a prompt to their local coding agent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: pass skill description to generateSummary in tasks.ts

0b704b9

The runSkillTask path in tasks.ts was the second call site that wasn't passing skill.description, so the summary never included the framing text. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

remove local linter-rule-judge skill

57dd6d3

Now using remote warden-lint-judge from getsentry/skills. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gricha force-pushed the feat/multi-pass-pipeline branch from 45ef09c to 57dd6d3 Compare February 14, 2026 01:21

vercel bot deployed to Preview February 14, 2026 01:22 View deployment

gricha mentioned this pull request Feb 14, 2026

ci: Set up Warden PR review getsentry/dotagents#4

Merged

dcramer mentioned this pull request Feb 17, 2026

Constrain skill findings to diff hunk line range #150

Closed

gricha wants to merge 12 commits into main from feat/multi-pass-pipeline
