refactor(ci): adds more detail to the fix-dependabot claude prompt #336

Merged: umair-ably merged 1 commit into main from refactor/dependabot-workflow-v2 on Apr 16, 2026

umair-ably (Collaborator) commented Apr 16, 2026

Summary

Rewrites the Claude prompt in the Fix Dependabot PRs workflow to produce smarter, more reliable fixes. Builds on the infrastructure from #333 (two-job architecture, check run polling, failure log collection).

Problem

The previous prompt was too basic — Claude received raw CI error logs with minimal guidance. This led to:

  • Claude reverting the dependency bump instead of fixing our code to work with the new version
  • No upstream research — Claude debugged blind without knowing what the new version actually changed
  • No structured output — hard to review what Claude did and why

What this PR changes (prompt only)

Fix-forward enforcement:

  • Explicit rule: NEVER revert or downgrade the dependency bump
  • Claude must adapt our code to work with the new version

Changelog research before fixing:

  • Ordered lookup: GitHub Releases → CHANGELOG.md → npm page → web search
  • Cross-references each notable change against our actual codebase usage
  • Maps CI failures to specific upstream changes before attempting fixes
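
The ordered lookup above amounts to a first-hit fallback chain. A minimal Python sketch of that idea follows. It is an illustration only: the prompt expresses this order in prose, and `find_changelog`, `no_result`, `changelog_file`, and `example-pkg` are hypothetical names, with stub functions standing in for real network lookups.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each source is a (label, lookup) pair; lookup returns changelog text or None.
Source = Tuple[str, Callable[[str, str], Optional[str]]]


def find_changelog(
    package: str, version: str, sources: Sequence[Source]
) -> Optional[Tuple[str, str]]:
    """Try each source in priority order; return (label, text) for the first hit."""
    for label, lookup in sources:
        text = lookup(package, version)
        if text:  # skip None and empty strings, fall through to the next source
            return label, text
    return None


# Hypothetical stubs standing in for real lookups (no network access here).
def no_result(package: str, version: str) -> Optional[str]:
    return None


def changelog_file(package: str, version: str) -> Optional[str]:
    return f"{package} {version}: placeholder changelog entry"


# Same order as the PR description: Releases, CHANGELOG.md, npm page, web search.
ORDERED_SOURCES = [
    ("GitHub Releases", no_result),
    ("CHANGELOG.md", changelog_file),
    ("npm page", no_result),
    ("web search", no_result),
]
```

Keeping the sources in one ordered list means a cheaper, more authoritative source is always consulted before a web search.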

Proactive migration concern checklist:

  • Peer dependency conflicts
  • Type/export changes (especially @types/* packages)
  • Config file compatibility
  • Module format changes (ESM/CJS)
  • React/bundler duplicate-instance detection (pnpm why react)
  • Monorepo impact across workspace packages

Dependency scope classification:

  • Identifies whether the dep is runtime/devDependency/build tool/type definitions
  • Parses dependabot title patterns including group bumps
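
As a rough illustration of the title parsing and scope classification described above, here is a Python sketch. The regular expressions cover the common "bump X from A to B" and "bump the G group with N updates" title shapes and are an assumption about what the prompt asks Claude to recognise. `parse_title`, `classify_scope`, and `Bump` are hypothetical names, and build-tool detection is left out because it would need a project-specific list.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Single bump, e.g. "chore(deps): bump foo from 1.0.0 to 2.0.0".
SINGLE = re.compile(
    r"bump (?P<name>\S+) from (?P<old>\S+) to (?P<new>\S+)", re.IGNORECASE
)
# Group bump, e.g. "Bump the dev-dependencies group with 3 updates".
GROUP = re.compile(
    r"bump the (?P<group>\S+) group\b.* with (?P<count>\d+) updates?", re.IGNORECASE
)


@dataclass
class Bump:
    kind: str                    # "single" or "group"
    name: str                    # package name, or group name for group bumps
    old: Optional[str] = None
    new: Optional[str] = None
    count: Optional[int] = None  # number of updates in a group bump


def parse_title(title: str) -> Optional[Bump]:
    """Extract bump details from a PR title, or None if it is not a bump."""
    group = GROUP.search(title)
    if group:
        return Bump("group", group["group"], count=int(group["count"]))
    single = SINGLE.search(title)
    if single:
        return Bump("single", single["name"], single["old"], single["new"])
    return None


def classify_scope(name: str, manifest: dict) -> str:
    """Classify a package using a parsed package.json dict (assumed input)."""
    if name.startswith("@types/"):
        return "type definitions"
    if name in manifest.get("dependencies", {}):
        return "runtime"
    if name in manifest.get("devDependencies", {}):
        return "devDependency"
    return "unknown"
```

For a group bump the individual package names are not in the title, so they would have to come from the PR body or the diff.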

Complexity assessment with early exit:

  • If the fix requires architectural changes, design decisions, or Claude isn't confident → stop early
  • Post a detailed PR comment explaining what broke, what changed upstream, and what a human needs to do

Structured assessment comment on every run:

  • Package info, scope, workspace location
  • What changed upstream (with changelog links)
  • Migration concerns checked
  • What broke and what was fixed
  • Verification results (build/lint/test)
  • Notes for reviewer
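
To make the shape of that comment concrete, here is a small Python sketch that renders the six sections listed above as Markdown. It is not the template from the workflow, which lives in the prompt itself. The `Assessment` dataclass, `render_assessment`, and the heading wording are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assessment:
    package: str    # e.g. "example-pkg 1.0.0 -> 2.0.0" (hypothetical)
    scope: str      # runtime / devDependency / build tool / type definitions
    workspace: str  # workspace package the dependency belongs to
    upstream_changes: List[str] = field(default_factory=list)
    concerns_checked: List[str] = field(default_factory=list)
    broke_and_fixed: List[str] = field(default_factory=list)
    verification: Dict[str, str] = field(default_factory=dict)
    reviewer_notes: str = ""


def _bullets(items: List[str]) -> str:
    """Render items as a Markdown list, with a placeholder when empty."""
    return "\n".join(f"- {item}" for item in items) if items else "- None recorded"


def render_assessment(a: Assessment) -> str:
    """Build the comment body, one heading per section, in a fixed order."""
    checks = [f"{step}: {result}" for step, result in a.verification.items()]
    sections = [
        ("Package", [f"{a.package} ({a.scope}), workspace: {a.workspace}"]),
        ("What changed upstream", a.upstream_changes),
        ("Migration concerns checked", a.concerns_checked),
        ("What broke and what was fixed", a.broke_and_fixed),
        ("Verification results", checks),
        ("Notes for reviewer", [a.reviewer_notes] if a.reviewer_notes else []),
    ]
    return "\n\n".join(f"### {title}\n{_bullets(items)}" for title, items in sections)
```

A fixed section order keeps comments comparable across runs, so a reviewer can scan for the same heading on every Dependabot PR.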

Other:

  • 50 max turns (up from 30) for complex migrations
  • WebSearch + WebFetch added to allowed tools for changelog/migration guide research

Test plan

  • Verify Claude researches changelogs before attempting fixes
  • Verify Claude does NOT revert dependency bumps
  • Verify Claude posts a structured assessment comment
  • Verify Claude stops early and comments if fix is too complex
  • Verify the additional tools (WebSearch, WebFetch) work in the action context

…smarter fixing

Rewrites the Fix Dependabot PRs workflow from a single job that
duplicated build/lint/test internally to a two-job architecture that
waits for all CI workflows to complete and uses Claude to fix failures
with full context.

Structure:
- Job 1 (regen-lockfile): guard + regen pnpm-lock.yaml + push
- Job 2 (fix-failures): poll check runs API for all CI workflows,
  collect failure logs, invoke Claude with detailed fix instructions

Key improvements:
- Captures failures from ALL CI workflows (unit tests, E2E CLI,
  Web CLI Playwright E2E, security audit) instead of only internal
  build/lint/test
- Claude researches changelogs before fixing (ordered: GitHub Releases,
  CHANGELOG.md, npm, web search) and cross-references against codebase
- Explicit ban on reverting/downgrading dependency bumps
- Proactive migration concern checklist (peer deps, type changes,
  config files, module format, React/bundler compat, monorepo impact)
- Early exit with detailed PR comment if fix is too complex
- Structured assessment comment on every run
- 50 turns instead of 30 for complex migrations
- Randomised heredoc delimiters to prevent log content collision
- Concurrency group to prevent duplicate polling
- Fails explicitly on polling timeout with no data
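
The polling behaviour of Job 2 can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the workflow's actual step (which is not shown in this conversation). `wait_for_checks` is a hypothetical name, the timeout and interval defaults are invented, and `fetch` stands in for a call to the check runs API. The `status` and `conclusion` fields follow the GitHub check run object, while the set of conclusions treated as failures is an assumption.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional

CheckRun = Dict[str, Optional[str]]

# Assumption: which conclusions count as failures worth handing to Claude.
FAILED_CONCLUSIONS = {"failure", "timed_out", "cancelled"}


def wait_for_checks(
    fetch: Callable[[], Iterable[CheckRun]],
    expected: List[str],
    timeout_s: float = 1800,   # assumed default
    interval_s: float = 30,    # assumed default
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[CheckRun]:
    """Poll until every expected check has completed, then return failed runs.

    Raises TimeoutError naming the pending checks if the deadline passes.
    """
    deadline = clock() + timeout_s
    while True:
        runs = {run["name"]: run for run in fetch()}
        # A check that has not appeared yet counts as pending.
        pending = [
            name for name in expected
            if runs.get(name, {}).get("status") != "completed"
        ]
        if not pending:
            return [
                run for run in runs.values()
                if run.get("conclusion") in FAILED_CONCLUSIONS
            ]
        if clock() >= deadline:
            raise TimeoutError(f"checks still pending: {', '.join(pending)}")
        sleep(interval_s)
```

Raising on timeout, rather than returning an empty list, mirrors the "fails explicitly on polling timeout with no data" point: an empty result would otherwise look like a clean CI run.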

vercel bot commented Apr 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| cli-web-cli | Ready | Preview, Comment | Apr 16, 2026 2:17pm |


@claude-code-ably-assistant

Walkthrough

This PR rewrites the fix-failures job in the Fix Dependabot PRs workflow to give Claude far better context and guardrails when repairing CI failures after dependency bumps. The previous implementation duplicated CI steps internally (missing failures from separate workflows like E2E and Security Audit) and gave Claude a minimal prompt that led to at least one case of reverting the dependency bump instead of migrating code forward. The new design polls the GitHub check runs API to collect failures from all CI workflows and uses a structured, multi-step prompt that enforces a "fix forward, never revert" policy.

Changes

| Area | Files | Summary |
| --- | --- | --- |
| Config / CI | .github/workflows/dependabot-lockfile.yml | Rewrites fix-failures job: collects failures from all CI workflows via polling, adds WebSearch/WebFetch tools, raises --max-turns from 30 → 50, and replaces the minimal Claude prompt with a 6-step structured prompt |

Review Notes

  • Single-file change — all 167 additions and 17 deletions are confined to the one workflow file; no source code, tests, or docs are touched.
  • Claude prompt is the key change — reviewers should read the new prompt carefully. It introduces a hard rule against reverting/downgrading dependency bumps, structured changelog research steps, a migration-concern checklist (peer deps, ESM format, React deduplication, monorepo impact), a fix-vs-stop decision framework, and a mandatory structured assessment comment template.
  • New tools granted to Claude: WebSearch and WebFetch are added to --allowedTools. This is intentional (changelog research), but reviewers should be comfortable with Claude being able to make outbound web requests during the automated fix run.
  • --max-turns increase — raised from 30 to 50 to accommodate the more thorough research + fix + verification cycle. This may increase Claude API cost per invocation.
  • Polling for check runs — the wait-for-checks step polls the GitHub check runs API until all CI workflows complete before handing results to Claude. The failure path (polling timeout) explicitly fails the job and logs pending checks for debugging.
  • Randomised heredoc delimiters — prevents content-collision in $GITHUB_OUTPUT when log output contains heredoc-like strings; low-risk but worth a sanity-check.
  • Concurrency group — prevents duplicate polling runs when the lockfile-regeneration commit re-triggers the workflow; ensures only one fix-failures run is active per PR branch at a time.
  • No migration or deployment steps needed — this is a CI workflow change only; it takes effect on the next triggered Dependabot PR.
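
The randomised-delimiter note above relies on the multi-line output syntax of `$GITHUB_OUTPUT` (`name<<DELIMITER`, the value, then `DELIMITER` on its own line). The Python sketch below shows the idea. The workflow itself presumably does this in shell, and `format_multiline_output`, `append_output`, and the `ghadelim_` prefix are hypothetical.

```python
import os
import secrets
from typing import Optional


def format_multiline_output(name: str, value: str) -> str:
    """Format a multi-line output entry with a delimiter absent from the value."""
    delimiter = f"ghadelim_{secrets.token_hex(16)}"
    # A collision is astronomically unlikely, but regenerate just in case.
    while delimiter in value:
        delimiter = f"ghadelim_{secrets.token_hex(16)}"
    return f"{name}<<{delimiter}\n{value}\n{delimiter}\n"


def append_output(name: str, value: str, path: Optional[str] = None) -> None:
    """Append the entry to the file named by GITHUB_OUTPUT (or an explicit path)."""
    target = path or os.environ["GITHUB_OUTPUT"]
    with open(target, "a", encoding="utf-8") as handle:
        handle.write(format_multiline_output(name, value))
```

With a fixed delimiter such as `EOF`, a log line consisting of exactly `EOF` would end the value early; a random token makes that kind of collision impractical.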


@claude-code-ably-assistant claude-code-ably-assistant bot left a comment

Review: refactor(ci): rewrite fix-dependabot to capture all CI failures with smarter fixing

This PR rewrites the Claude prompt in the fix-failures job (the polling infrastructure and two-job architecture appear to have been introduced in a previous commit on this branch). The actual diff is a prompt engineering improvement plus adding WebSearch/WebFetch tools and raising --max-turns from 30 to 50.

What changed (actual diff)

  • Replaced the minimal 5-instruction prompt with a structured 6-step guide
  • Added explicit "never revert/downgrade" rule (directly fixes the regression from #332)
  • Added changelog research steps (GitHub Releases → CHANGELOG.md → npm → web search)
  • Added proactive migration concern checklist (peer deps, types, config files, ESM, React compat, monorepo)
  • Added structured assessment comment template — gives reviewers an audit trail on every run
  • Added "stop and comment" decision gate for complex migrations
  • --max-turns 30 → 50 and WebSearch/WebFetch tools enabled for research

Issues found

None that would cause bugs or breakage.

One thing worth noting for awareness: failure logs are interpolated directly into the Claude prompt. The logs from failed CI runs land verbatim in the prompt via ${{ steps.wait-for-checks.outputs.failure_logs }}. A compromised npm package could embed prompt-injection text in its build output. The allowed_bots: "dependabot[bot]" gate limits exposure to genuine Dependabot PRs, but the threat surface exists at the dependency supply chain layer. This is inherent to this design pattern and pre-existing — no change needed unless you want to document it.

Minor observations (not issues)

  • The "Step 2b" label is slightly awkward — looks like it was an afterthought insertion. Purely cosmetic.
  • EXPECTED_CHECKS includes "setup" (Web CLI E2E build prep) but not the downstream Playwright jobs. If build prep passes but Playwright fails independently, it's still collected as a failed_check via the conclusion filter — just confirming the intent is correct.
  • The 500-line truncation per workflow run (tail -n 500) is pragmatic. Very verbose test suites could lose the actual error, but it's a reasonable prompt-size trade-off.
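
The trade-off in the last observation can be seen in a short Python sketch that mimics `tail -n 500`. `tail_lines` is a hypothetical helper, and the log content is synthetic.

```python
from collections import deque


def tail_lines(log: str, limit: int = 500) -> str:
    """Keep only the last `limit` lines of a log, like `tail -n <limit>`."""
    # deque with maxlen drops the oldest lines as newer ones arrive.
    return "\n".join(deque(log.splitlines(), maxlen=limit))


# Synthetic log: one early error followed by 600 lines of later output.
noisy_log = "ERROR: boom\n" + "\n".join(f"noise {i}" for i in range(600))
truncated = tail_lines(noisy_log)
```

Here the early `ERROR: boom` line falls outside the last 500 lines and is lost, which is the failure mode the observation describes for very verbose test suites.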

Verdict

The changes are solid. The structured prompt with explicit research steps, the "fix forward" mandate, and the "stop and comment" gate are meaningful improvements over the original. The assessment comment template gives reviewers a useful audit trail. Looks good to merge.

@umair-ably umair-ably requested a review from sacOO7 April 16, 2026 14:24
```diff
 claude_args: |
-  --max-turns 30
+  --max-turns 50
```
Contributor

--max-turns 50 seems like a lot. We will be reverting it to a minimal value after testing, right?

Collaborator Author

yeah still tinkering with it... I had a PR that hit the 30 limit and it only cost a dollar!

@umair-ably umair-ably changed the title from "refactor(ci): rewrite fix-dependabot to capture all CI failures with smarter fixing" to "refactor(ci): adds more detail to the fix-dependabot claude prompt" on Apr 16, 2026
Contributor

@sacOO7 sacOO7 left a comment

LGTM

@umair-ably umair-ably merged commit 893c51f into main Apr 16, 2026
15 checks passed
@umair-ably umair-ably deleted the refactor/dependabot-workflow-v2 branch April 16, 2026 14:32