feat(advisor): run PR review advisor as focused turns#4281
Conversation
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
📝 WalkthroughWalkthroughReplaces a single monolithic advisor prompt with a multi-turn conversation: session runner now accepts ordered promptTurns and captures per-turn text; PR-review analyzer builds system prompt from a trusted security skill, constructs/writes per-turn prompt artifacts, and invokes the SDK with promptTurns. Tests validate prompt-turns and artifact writing. ChangesMulti-turn advisor system
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
E2E Advisor RecommendationRequired E2E: None Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
|
E2E Scenario Advisor RecommendationRequired scenario E2E: None Full scenario advisor summaryE2E Scenario AdvisorBase: Required scenario E2E
Optional scenario E2E
Relevant changed files
|
PR Review AdvisorFindings: 0 needs attention, 1 worth checking, 0 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Since last review detailsCurrent findings:
This is an automated advisory review. A human maintainer must make the final merge decision. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/pr-review-advisor.test.ts`:
- Around line 232-240: The test builds `written` from
`fs.readdirSync(path.join(tmp, "prompts")).map((file) => \`prompts/${file}\`)`
and then asserts ordered equality, but `fs.readdirSync()` ordering is
non-deterministic; update the test to sort the filenames before asserting (e.g.,
call `.sort()` on the mapped array or sort the result of `readdirSync` first) so
the assertion compares deterministically ordered arrays.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e964446e-794b-4acb-a142-32684f9a0047
📒 Files selected for processing (5)
test/pr-review-advisor.test.tstools/advisors/session.mtstools/e2e-advisor/analyze.mtstools/pr-review-advisor/README.mdtools/pr-review-advisor/analyze.mts
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
|
Addressed CodeRabbit discussion r3307795065 in e135348 by sorting the prompt artifact filenames before asserting ordered equality. Validated with |
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tools/pr-review-advisor/analyze.mts (2)
687-702:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winEmit the actual fallback security rubric when the trusted skill is unavailable.
If
readTrustedSecurityReviewSkill()returns"", this prompt only says to fall back to a built-in 9-category review, but it never includes the categories fromSECURITY_CATEGORIES. That leaves the degraded path running without the structured security rubric it claims to use.Suggested fix
export function buildSystemPrompt(): string { const securityReviewSkill = readTrustedSecurityReviewSkill(); + const embeddedSecurityRubric = securityReviewSkill || [ + "Trusted security review skill was unavailable; use this built-in 9-category security rubric instead:", + ...SECURITY_CATEGORIES.map((category, index) => `${index + 1}. ${category}`), + ].join("\n"); return [ @@ "Trusted security review skill from main checkout:", "```markdown", - securityReviewSkill || "Security review skill was unavailable; fall back to the built-in 9-category security review.", + embeddedSecurityRubric, "```",🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/pr-review-advisor/analyze.mts` around lines 687 - 702, The prompt currently omits the actual fallback rubric when readTrustedSecurityReviewSkill() returns empty; update buildSystemPrompt() to compute an embeddedSecurityRubric (e.g., join SECURITY_CATEGORIES into a markdown list or formatted string) and replace the fallback placeholder securityReviewSkill || "Security review skill was unavailable; fall back to the built-in 9-category security review." with embeddedSecurityRubric so the degraded path includes the full built-in security rubric; refer to buildSystemPrompt and readTrustedSecurityReviewSkill to locate where to build and insert embeddedSecurityRubric (using SECURITY_CATEGORIES or the existing rubric source to generate the text).
742-745:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEscape PR-controlled diff content before fencing it.
diffis untrusted PR input, but it is interpolated verbatim inside a fenced Markdown block. A diff line containing triple backticks can close that fence and surface attacker-controlled text as normal instructions for later turns, which weakens the prompt-injection boundary this advisor depends on.Suggested fix
- Git diff, truncated if large: - \`\`\`diff - ${diff || "<no diff available>"} - \`\`\` + Git diff, truncated if large: + ${fencedBlock(diff || "<no diff available>", "diff")} `,Add a small helper so the fence is always longer than any backtick run in the diff:
function fencedBlock(content: string, language = ""): string { const longestBacktickRun = Math.max(0, ...Array.from(content.matchAll(/`+/g), (match) => match[0].length)); const fence = "`".repeat(Math.max(3, longestBacktickRun + 1)); return `${fence}${language}\n${content}\n${fence}`; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/pr-review-advisor/analyze.mts` around lines 742 - 745, The PR exposes untrusted `diff` into a triple-backticked markdown fence, allowing backtick runs in PR content to break the fence; add a helper named fencedBlock(content: string, language = "") that computes the longest backtick run and builds a fence longer than that, then replace the current inline interpolation (`\`\`\`diff\n${diff || "<no diff available>"}\n\`\`\``) with a call to fencedBlock(diff || "<no diff available>", "diff") so the generated block always uses a safe, longer fence and prevents injected closing fences.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@tools/pr-review-advisor/analyze.mts`:
- Around line 687-702: The prompt currently omits the actual fallback rubric
when readTrustedSecurityReviewSkill() returns empty; update buildSystemPrompt()
to compute an embeddedSecurityRubric (e.g., join SECURITY_CATEGORIES into a
markdown list or formatted string) and replace the fallback placeholder
securityReviewSkill || "Security review skill was unavailable; fall back to the
built-in 9-category security review." with embeddedSecurityRubric so the
degraded path includes the full built-in security rubric; refer to
buildSystemPrompt and readTrustedSecurityReviewSkill to locate where to build
and insert embeddedSecurityRubric (using SECURITY_CATEGORIES or the existing
rubric source to generate the text).
- Around line 742-745: The PR exposes untrusted `diff` into a triple-backticked
markdown fence, allowing backtick runs in PR content to break the fence; add a
helper named fencedBlock(content: string, language = "") that computes the
longest backtick run and builds a fence longer than that, then replace the
current inline interpolation (`\`\`\`diff\n${diff || "<no diff
available>"}\n\`\`\``) with a call to fencedBlock(diff || "<no diff available>",
"diff") so the generated block always uses a safe, longer fence and prevents
injected closing fences.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 41ee63b7-326c-4fe0-a9a3-cabe7f5fc205
📒 Files selected for processing (2)
test/pr-review-advisor.test.tstools/pr-review-advisor/analyze.mts
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
|
Addressed CodeRabbit review 4369142535 in 85eca01: added the built-in 9-category security rubric to the unavailable-skill fallback, and wrapped dynamic prompt content with fences longer than any backtick run in the content. Validated with |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tools/pr-review-advisor/analyze.mts (1)
573-587:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep overlap discovery from breaking the rest of GitHub context.
This
Promise.all(...)makes the optional open-PR scan a single point of failure. If the overlaps path hits a transient API/permission/rate-limit error, we droppullRequest,previousAdvisorReview, and linked-issue extraction too, even when those endpoints were fine. Making the overlap fetch best-effort would preserve the core review context.Suggested direction
- const [pullRequest, issueComments, openPulls] = await Promise.all([ + const [pullRequestResult, issueCommentsResult, openPullsResult] = await Promise.allSettled([ githubRest<unknown>(`repos/${repo}/pulls/${prNumber}`, token), githubRestPaginated<unknown>(`repos/${repo}/issues/${prNumber}/comments`, token, 100), githubRestPaginated<unknown>(`repos/${repo}/pulls?state=open&sort=updated&direction=desc`, token, 100), ]); - context.pullRequest = pullRequest; - context.previousAdvisorReview = extractPreviousAdvisorReview(issueComments); + + const pullRequest = pullRequestResult.status === "fulfilled" ? pullRequestResult.value : undefined; + const issueComments = issueCommentsResult.status === "fulfilled" ? issueCommentsResult.value : undefined; + const openPulls = openPullsResult.status === "fulfilled" ? openPullsResult.value : []; + + if (pullRequest) context.pullRequest = pullRequest; + if (issueComments) context.previousAdvisorReview = extractPreviousAdvisorReview(issueComments); + + if (openPullsResult.status === "rejected") { + context.fetchError = openPullsResult.reason instanceof Error + ? openPullsResult.reason.message + : String(openPullsResult.reason); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/pr-review-advisor/analyze.mts` around lines 573 - 587, The current Promise.all groups the open PR scan with core fetches so any failure in the open-PR path (githubRestPaginated for openPulls / collectOpenPrOverlaps) aborts all context setup; split the calls so pullRequest and issueComments (and extraction of previousAdvisorReview and linked issues) are awaited together but fetch openPulls and run collectOpenPrOverlaps in a separate try/catch. Specifically: keep the existing githubRest(...) and githubRestPaginated(...) calls used to populate pullRequest, issueComments, and openPulls for core context (refs: pullRequest, issueComments, getPath, extractPreviousAdvisorReview, extractIssueRefs, collectLinkedIssue), then call collectOpenPrOverlaps(repo, prNumber, token, openPulls, issueNumbers) inside its own try/catch and on failure assign context.openPrOverlaps = [] (and log a warning) so transient errors in collectOpenPrOverlaps or githubRestPaginated for openPulls don’t prevent setting context.pullRequest, context.previousAdvisorReview, or context.linkedIssues.
🧹 Nitpick comments (1)
tools/pr-review-advisor/analyze.mts (1)
805-813: ⚡ Quick winTrim the PR payload before embedding it into the prompt.
pullRequest: context.github?.pullRequestinjects the full REST PR object into the model context, including surfaces the system prompt explicitly says not to review (mergeability/reviewer-status/CI-related metadata), and it burns tokens on fields the acceptance turn does not need. A small prompt-safe projection here would keep the turn deterministic and better aligned with the rubric.Suggested direction
function buildValidationTurnContext(context: DeterministicReviewContext): Record<string, unknown> { return { testDepth: context.testDepth, localizedPatchSignals: context.localizedPatchSignals, previousAdvisorReview: context.previousAdvisorReview, - pullRequest: context.github?.pullRequest ?? null, + pullRequest: promptSafePullRequest(context.github?.pullRequest), linkedIssues: context.github?.linkedIssues ?? [], githubFetchError: context.github?.fetchError, }; } + +function promptSafePullRequest(pullRequest: unknown): Record<string, unknown> | null { + if (!isRecord(pullRequest)) return null; + return { + title: stringOrDefault(pullRequest.title, ""), + body: stringOrDefault(pullRequest.body, ""), + headRef: stringOrUndefined(getPath<unknown>(pullRequest, ["head", "ref"])) ?? null, + labels: recordItems(getPath<unknown>(pullRequest, ["labels"])) + .map((label) => stringOrUndefined(label.name)) + .filter((label): label is string => Boolean(label)), + }; +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/pr-review-advisor/analyze.mts` around lines 805 - 813, The code currently injects the entire GitHub PR object via pullRequest: context.github?.pullRequest which exposes CI/reviewer/mergeability metadata and wastes tokens; in buildValidationTurnContext replace that direct pass-through with a small, explicit projection (e.g. pick number, title, body/description, user.login, state, draft flag, base.ref/head.ref, labels as names, created_at/updated_at, and high-level size metrics like additions/deletions/changed_files) and return null when missing so the prompt receives only safe, deterministic fields; update the buildValidationTurnContext function to construct this trimmed object from context.github?.pullRequest instead of passing the full object.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@tools/pr-review-advisor/analyze.mts`:
- Around line 573-587: The current Promise.all groups the open PR scan with core
fetches so any failure in the open-PR path (githubRestPaginated for openPulls /
collectOpenPrOverlaps) aborts all context setup; split the calls so pullRequest
and issueComments (and extraction of previousAdvisorReview and linked issues)
are awaited together but fetch openPulls and run collectOpenPrOverlaps in a
separate try/catch. Specifically: keep the existing githubRest(...) and
githubRestPaginated(...) calls used to populate pullRequest, issueComments, and
openPulls for core context (refs: pullRequest, issueComments, getPath,
extractPreviousAdvisorReview, extractIssueRefs, collectLinkedIssue), then call
collectOpenPrOverlaps(repo, prNumber, token, openPulls, issueNumbers) inside its
own try/catch and on failure assign context.openPrOverlaps = [] (and log a
warning) so transient errors in collectOpenPrOverlaps or githubRestPaginated for
openPulls don’t prevent setting context.pullRequest,
context.previousAdvisorReview, or context.linkedIssues.
---
Nitpick comments:
In `@tools/pr-review-advisor/analyze.mts`:
- Around line 805-813: The code currently injects the entire GitHub PR object
via pullRequest: context.github?.pullRequest which exposes
CI/reviewer/mergeability metadata and wastes tokens; in
buildValidationTurnContext replace that direct pass-through with a small,
explicit projection (e.g. pick number, title, body/description, user.login,
state, draft flag, base.ref/head.ref, labels as names, created_at/updated_at,
and high-level size metrics like additions/deletions/changed_files) and return
null when missing so the prompt receives only safe, deterministic fields; update
the buildValidationTurnContext function to construct this trimmed object from
context.github?.pullRequest instead of passing the full object.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7bf92c06-43df-401a-8149-36da7249bb2c
📒 Files selected for processing (2)
test/pr-review-advisor.test.tstools/pr-review-advisor/analyze.mts
🚧 Files skipped from review as they are similar to previous changes (1)
- test/pr-review-advisor.test.ts
Summary
This PR changes the PR Review Advisor from a single large Pi prompt into a multi-turn conversation within one Pi session. It also writes each prompt as a separate ordered artifact so maintainers can inspect the system prompt and each review turn directly.
Changes
prompts/00-system.mdthroughprompts/04-synthesize-json.mdinstead of one combined prompt file.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Signed-off-by: Carlos Villela cvillela@nvidia.com
Summary by CodeRabbit
New Features
Documentation
Behavior
Tests