refactor(ci): adds more detail to the fix-dependabot claude prompt #336
umair-ably merged 1 commit into main
…smarter fixing

Rewrites the Fix Dependabot PRs workflow from a single job that duplicated build/lint/test internally to a two-job architecture that waits for all CI workflows to complete and uses Claude to fix failures with full context.

Structure:
- Job 1 (regen-lockfile): guard + regen pnpm-lock.yaml + push
- Job 2 (fix-failures): poll check runs API for all CI workflows, collect failure logs, invoke Claude with detailed fix instructions

Key improvements:
- Captures failures from ALL CI workflows (unit tests, E2E CLI, Web CLI Playwright E2E, security audit) instead of only internal build/lint/test
- Claude researches changelogs before fixing (ordered: GitHub Releases, CHANGELOG.md, npm, web search) and cross-references against codebase
- Explicit ban on reverting/downgrading dependency bumps
- Proactive migration concern checklist (peer deps, type changes, config files, module format, React/bundler compat, monorepo impact)
- Early exit with detailed PR comment if fix is too complex
- Structured assessment comment on every run
- 50 turns instead of 30 for complex migrations
- Randomised heredoc delimiters to prevent log content collision
- Concurrency group to prevent duplicate polling
- Fails explicitly on polling timeout with no data
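The "randomised heredoc delimiters" point can be illustrated with a minimal sketch. It assumes the failure logs are written to the `$GITHUB_OUTPUT` file using GitHub's `name<<DELIMITER` multi-line syntax; the helper name `format_multiline_output` and the `EOF_` prefix are hypothetical and not taken from the workflow itself.

```python
import secrets


def format_multiline_output(name: str, value: str) -> str:
    """Render `value` as a multi-line entry for the $GITHUB_OUTPUT file.

    Hypothetical helper: the real workflow may build this in shell instead.
    """
    lines = value.splitlines()
    while True:
        # A fresh random delimiter makes it very unlikely that a line of
        # log output terminates the block early.
        delimiter = f"EOF_{secrets.token_hex(16)}"
        # Regenerate in the (extremely unlikely) case of a collision.
        if delimiter not in lines:
            break
    return f"{name}<<{delimiter}\n{value}\n{delimiter}\n"


if __name__ == "__main__":
    # A log that happens to contain a fixed delimiter such as "EOF" is safe.
    print(format_multiline_output("failure_logs", "step failed\nEOF\nmore output"))
```

A fixed delimiter such as `EOF` would truncate the output at the first log line equal to `EOF`; a random one avoids that without having to escape the log content.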
Review: refactor(ci): rewrite fix-dependabot to capture all CI failures with smarter fixing
This PR rewrites the Claude prompt in the fix-failures job (the polling infrastructure and two-job architecture appear to have been introduced in a previous commit on this branch). The actual diff is a prompt engineering improvement plus adding WebSearch/WebFetch tools and raising --max-turns from 30 to 50.
What changed (actual diff)
- Replaced the minimal 5-instruction prompt with a structured 6-step guide
- Added explicit "never revert/downgrade" rule (directly fixes the regression from #332)
- Added changelog research steps (GitHub Releases → CHANGELOG.md → npm → web search)
- Added proactive migration concern checklist (peer deps, types, config files, ESM, React compat, monorepo)
- Added structured assessment comment template — gives reviewers an audit trail on every run
- Added "stop and comment" decision gate for complex migrations
- Raised --max-turns from 30 to 50 and enabled the WebSearch/WebFetch tools for research
Issues found
None that would cause bugs or breakage.
One thing worth noting for awareness: failure logs are interpolated directly into the Claude prompt. The logs from failed CI runs land verbatim in the prompt via ${{ steps.wait-for-checks.outputs.failure_logs }}. A compromised npm package could embed prompt-injection text in its build output. The allowed_bots: "dependabot[bot]" gate limits exposure to genuine Dependabot PRs, but the threat surface exists at the dependency supply chain layer. This is inherent to this design pattern and pre-existing — no change needed unless you want to document it.
Minor observations (not issues)
- The "Step 2b" label is slightly awkward — looks like it was an afterthought insertion. Purely cosmetic.
- EXPECTED_CHECKS includes "setup" (Web CLI E2E build prep) but not the downstream Playwright jobs. If build prep passes but Playwright fails independently, it's still collected as a failed_check via the conclusion filter; just confirming the intent is correct.
- The 500-line truncation per workflow run (tail -n 500) is pragmatic. Very verbose test suites could lose the actual error, but it's a reasonable prompt-size trade-off.
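The two observations above can be sketched as follows. This is an illustration under assumptions, not the workflow's actual code: check runs are modelled as dicts with the `name`, `status` and `conclusion` fields returned by the GitHub check runs API, while the set of conclusions treated as failures, the function names, and the `playwright` check name are hypothetical.

```python
from typing import Iterable

# Assumption: which conclusions count as a failure worth sending to Claude.
FAILED_CONCLUSIONS = {"failure", "timed_out", "cancelled"}


def summarise_checks(check_runs: Iterable[dict], expected: Iterable[str]):
    """Return (pending expected checks, names of all failed checks)."""
    runs = list(check_runs)
    by_name = {run["name"]: run for run in runs}
    # Polling keeps waiting while any *expected* check has not completed.
    pending = [
        name for name in expected
        if by_name.get(name, {}).get("status") != "completed"
    ]
    # Failures are collected from *every* completed run, expected or not,
    # which is why a downstream job can still show up as failed.
    failed = [
        run["name"] for run in runs
        if run.get("status") == "completed"
        and run.get("conclusion") in FAILED_CONCLUSIONS
    ]
    return pending, failed


def tail_lines(log: str, limit: int = 500) -> str:
    """Keep only the last `limit` lines, mirroring `tail -n 500`."""
    lines = log.splitlines()
    return "\n".join(lines[-limit:] if limit > 0 else [])


if __name__ == "__main__":
    runs = [
        {"name": "setup", "status": "completed", "conclusion": "success"},
        {"name": "playwright", "status": "completed", "conclusion": "failure"},
    ]
    print(summarise_checks(runs, expected=["setup"]))
```

Keeping the tail rather than the head reflects that test runners usually print the failure summary last; an error followed by more than 500 lines of output would still be cut off.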
Verdict
The changes are solid. The structured prompt with explicit research steps, the "fix forward" mandate, and the "stop and comment" gate are meaningful improvements over the original. The assessment comment template gives reviewers a useful audit trail. Looks good to merge.
claude_args: |
-   --max-turns 30
+   --max-turns 50
--max-turns 50 seems like a lot. We'll be reverting it to a minimal value after testing, right?
yeah still tinkering with it... I had a PR that hit the 30 limit and it only cost a dollar!
Summary
Rewrites the Claude prompt in the Fix Dependabot PRs workflow to produce smarter, more reliable fixes. Builds on the infrastructure from #333 (two-job architecture, check run polling, failure log collection).
Problem
The previous prompt was too basic — Claude received raw CI error logs with minimal guidance. This led to:
What this PR changes (prompt only)
Fix-forward enforcement:
Changelog research before fixing:
Proactive migration concern checklist:
- … @types/* packages)
- … pnpm why react)
Dependency scope classification:
Complexity assessment with early exit:
Structured assessment comment on every run:
Other:
Test plan