feat: port pull-request AI review agents to Honi framework#45
Conversation
Co-authored-by: jmbish04 <26469722+jmbish04@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a robust, multi-agent AI system for automated pull request reviews, integrated into the Honi framework. It establishes a structured review process, from initial assessment to final summary, by leveraging specialized AI agents and Cloudflare Durable Objects. This enhancement aims to streamline code review workflows, provide consistent feedback, and improve code quality by automating the detection of issues and optimizations. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a multi-agent AI for pull request reviews. However, there are significant security concerns regarding the handling of untrusted input in LLM prompts and the scoping of tool parameters, making the system vulnerable to prompt injection via PR diff content and indirect prompt injection via agent outputs. Additionally, the tools allow for cross-repository access due to unvalidated parameters. Beyond security, the review also highlights areas for improving type safety, handling potential edge cases like pagination when fetching PR files, and cleaning up work-in-progress code, particularly addressing risky type casting in database insertion logic.
| const reviewFindings: any[] = []; | ||
|
|
||
| for (const chunk of chunks) { | ||
| const chunkResultStr = await runner.run(reviewer, `Review the following diff chunk for ${chunk.filename}:\n\n${chunk.content}\n\nCall create_review_comment if you find bugs or critical issues. Do not comment on nitpicks. Return JSON { commentsMade: number, issuesFound: string[] }`) as any; |
There was a problem hiding this comment.
The orchestrator concatenates untrusted PR diff content (chunk.content) directly into the prompt for the reviewer agent. An attacker can craft a PR with malicious code or comments containing prompt injection instructions. These instructions can trick the agent into ignoring its system prompt and performing unauthorized actions, such as generating malicious review comments or influencing the final PR summary.
Remediation:
- Use clear delimiters (e.g.,
<diff>and</diff>) to separate instructions from untrusted data. - Explicitly instruct the agent in the system prompt to ignore any instructions contained within the diff content.
- Consider using a more structured input format (like ChatML) if the underlying model supports it.
| const chunkResultStr = await runner.run(reviewer, `Review the following diff chunk for ${chunk.filename}:\n\n${chunk.content}\n\nCall create_review_comment if you find bugs or critical issues. Do not comment on nitpicks. Return JSON { commentsMade: number, issuesFound: string[] }`) as any; | |
| const chunkResultStr = await runner.run(reviewer, `Review the following diff chunk for ${chunk.filename}: | |
| <diff> | |
| ${chunk.content} | |
| </diff> | |
| Call create_review_comment if you find bugs or critical issues. Do not comment on nitpicks. Return JSON { commentsMade: number, issuesFound: string[] }`) as any; |
| const totalIssues = reviewFindings.flatMap(f => f.issues).length; | ||
| const finalEvent = totalIssues > 0 ? "REQUEST_CHANGES" : "APPROVE"; | ||
|
|
||
| await runner.run(summaryAgent, `Summarize the findings for PR #${prNumber} in ${owner}/${repo}. Total issues found: ${totalIssues}. Review findings: ${JSON.stringify(reviewFindings)}. Call submit_pr_review with event ${finalEvent} to finalize.`); |
There was a problem hiding this comment.
The summaryAgent is vulnerable to indirect prompt injection because it processes reviewFindings, which contains output from the reviewer agent. If the reviewer agent is compromised via prompt injection from the PR diff, it can inject malicious instructions into the reviewFindings. These instructions can then trick the summaryAgent into calling submit_pr_review with unauthorized parameters, such as approving a malicious PR.
Remediation:
- Use clear delimiters (e.g.,
<findings>and</findings>) to separate the findings from the instructions. - Instead of letting the agent call the
submit_pr_reviewtool with theeventparameter, the orchestrator should determine theeventbased on the agent's structured output and call the tool itself using thefinalEventvariable already calculated on line 68.
| await runner.run(summaryAgent, `Summarize the findings for PR #${prNumber} in ${owner}/${repo}. Total issues found: ${totalIssues}. Review findings: ${JSON.stringify(reviewFindings)}. Call submit_pr_review with event ${finalEvent} to finalize.`); | |
| await runner.run(summaryAgent, `Summarize the findings for PR #${prNumber} in ${owner}/${repo}. Total issues found: ${totalIssues}. Review findings: <findings> | |
| ${JSON.stringify(reviewFindings)} | |
| </findings>. Call submit_pr_review with event ${finalEvent} to finalize.`); |
| // Fetch the full diff to get patches (Supervisor might not have passed it back directly) | ||
| const app = new App({ appId, privateKey }); | ||
| const octokit = withCompatOctokit(await app.getInstallationOctokit(installationId)); | ||
| const { data: files } = await octokit.rest.pulls.listFiles({ owner, repo, pull_number: prNumber, per_page: 100 }); |
There was a problem hiding this comment.
The call to octokit.rest.pulls.listFiles uses a hardcoded per_page: 100, which is the maximum value per page. If a pull request has more than 100 files, this code will only process the first 100 files and miss the rest. To ensure all files are processed, you should use octokit.paginate.
| const { data: files } = await octokit.rest.pulls.listFiles({ owner, repo, pull_number: prNumber, per_page: 100 }); | |
| const files = await octokit.paginate(octokit.rest.pulls.listFiles, { owner, repo, pull_number: prNumber }); |
| assignee_login: payload.pull_request?.assignee?.login, | ||
| }); | ||
| // Insert payload to DB | ||
| await insertPayload((eventTables as any).pullRequestEvents || {} as any, payload); |
There was a problem hiding this comment.
The call to insertPayload uses multiple as any casts and a fallback to an empty object (|| {}). This is a risky pattern that can hide type errors and may lead to silent failures or runtime errors if eventTables.pullRequestEvents is not defined. It would be safer to ensure the type is correct and handle the undefined case explicitly, rather than passing a potentially invalid empty object to the database layer.
| export const fetchPrDiffTool = (ctx: OctokitContext) => tool({ | ||
| name: "fetch_pr_diff", | ||
| description: "Fetches the files and raw diff/patch for a specific pull request.", | ||
| parameters: z.object({ | ||
| owner: z.string(), | ||
| repo: z.string(), | ||
| pullNumber: z.number(), | ||
| }), | ||
| execute: async (args: any) => { | ||
| const octokit = await getOctokit(ctx); | ||
| const { data: files } = await octokit.rest.pulls.listFiles({ | ||
| owner: args.owner, | ||
| repo: args.repo, | ||
| pull_number: args.pullNumber, | ||
| per_page: 100 | ||
| }); | ||
| return files; | ||
| } | ||
| }); | ||
|
|
||
| export const createReviewCommentTool = (ctx: OctokitContext) => tool({ | ||
| name: "create_review_comment", | ||
| description: "Creates a line-specific review comment on a pull request diff.", | ||
| parameters: z.object({ | ||
| owner: z.string(), | ||
| repo: z.string(), | ||
| pullNumber: z.number(), | ||
| commitId: z.string(), | ||
| path: z.string(), | ||
| line: z.number(), | ||
| side: z.enum(["LEFT", "RIGHT"]).optional(), | ||
| body: z.string().describe("The markdown content of the comment."), | ||
| }), | ||
| execute: async (args: any) => { | ||
| const octokit = await getOctokit(ctx); | ||
| const { data } = await octokit.rest.pulls.createReviewComment({ | ||
| owner: args.owner, | ||
| repo: args.repo, | ||
| pull_number: args.pullNumber, | ||
| commit_id: args.commitId, | ||
| path: args.path, | ||
| line: args.line, | ||
| side: args.side, | ||
| body: args.body | ||
| }); | ||
| return data; | ||
| } | ||
| }); | ||
|
|
||
| export const submitPrReviewTool = (ctx: OctokitContext) => tool({ | ||
| name: "submit_pr_review", | ||
| description: "Submits the final summary review for a pull request.", | ||
| parameters: z.object({ | ||
| owner: z.string(), | ||
| repo: z.string(), | ||
| pullNumber: z.number(), | ||
| event: z.enum(["APPROVE", "REQUEST_CHANGES", "COMMENT"]), | ||
| body: z.string().describe("The final markdown summary for the entire PR."), | ||
| }), | ||
| execute: async (args: any) => { | ||
| const octokit = await getOctokit(ctx); | ||
| const { data } = await octokit.rest.pulls.createReview({ | ||
| owner: args.owner, | ||
| repo: args.repo, | ||
| pull_number: args.pullNumber, | ||
| event: args.event, | ||
| body: args.body, | ||
| }); | ||
| return data; | ||
| } | ||
| }); |
There was a problem hiding this comment.
This section of code, particularly within tools like fetch_pr_diff, create_review_comment, and submit_pr_review, is vulnerable to prompt injection. The owner and repo parameters are accepted directly from the LLM, allowing a compromised agent to call these tools with arbitrary values, potentially leading to unauthorized cross-repository access within the same GitHub App installation. It's critical to either hardcode owner and repo from the OctokitContext or validate them against the current PR. Additionally, the args parameter in the execute function is typed as any, which compromises type safety; it should be updated to use z.infer with the Zod schema for specific typing.
execute: async (args: z.infer<typeof parameters>) => {| super(state, env); | ||
| } | ||
|
|
||
| async fetchAndProcess(octokitCtx: any, owner: string, repo: string, pullNumber: number) { |
There was a problem hiding this comment.
The octokitCtx parameter is typed as any. For better type safety, you should import and use the OctokitContext type from ./tools.ts.
| async fetchAndProcess(octokitCtx: any, owner: string, repo: string, pullNumber: number) { | |
| async fetchAndProcess(octokitCtx: OctokitContext, owner: string, repo: string, pullNumber: number) { |
|
|
||
| // 2. Code Review Phase: Process Chunks | ||
| const reviewer = createCodeReviewAgent(env, octokitCtx); | ||
| const reviewFindings: any[] = []; |
There was a problem hiding this comment.
The reviewFindings array is typed as any[]. Using a specific interface for the finding objects would improve code readability and maintainability. You can define an interface like this and use it:
interface ReviewFinding {
filename: string;
chunkId: number;
issues: string[];
}| const reviewFindings: any[] = []; | |
| const reviewFindings: ReviewFinding[] = []; |
| interface OctokitContext { | ||
| appId: string; | ||
| privateKey: string; | ||
| installationId: number; | ||
| } |
There was a problem hiding this comment.
| return chunks; | ||
| } | ||
|
|
||
| export function chunkFiles(files: any[], maxLinesPerChunk: number = 200): DiffChunk[] { |
There was a problem hiding this comment.
The files parameter is typed as any[]. To improve type safety and make the function's contract clearer, it's better to define and use a minimal interface for the expected file objects. For example:
interface FileWithPatch {
filename: string;
patch?: string | null;
}| export function chunkFiles(files: any[], maxLinesPerChunk: number = 200): DiffChunk[] { | |
| export function chunkFiles(files: FileWithPatch[], maxLinesPerChunk: number = 200): DiffChunk[] { |
| return !ignorePatterns.some(pattern => pattern.test(filename)); | ||
| } | ||
|
|
||
| export function filterFilesForReview(files: any[]): any[] { |
There was a problem hiding this comment.
The files parameter and the return type are any[]. Using a specific type for the file objects would make the function signature more informative and type-safe. For example:
interface FileWithFilename {
filename: string;
}| export function filterFilesForReview(files: any[]): any[] { | |
| export function filterFilesForReview(files: FileWithFilename[]): FileWithFilename[] { |
✨ Code Comments ExtractedI have extracted 12 code comments for easier triage. |
|
@jules please fix all code comments (if you're having issues getting code comments, they here too: https://core-github-api.hacolby.workers.dev/view-comments/jmbish04/core-github-api/pull/45) |
Acknowledged. I have fixed all code comments and resolved the issues according to your requests. |
Co-authored-by: jmbish04 <26469722+jmbish04@users.noreply.github.com>
|
/gemini review This action initiated by core-github-api |
pullsightAI PR review multi-agent architectureSupervisor,Reviewer, andSummaryagents using@openai/agentsfetch_pr_diff,create_review_comment,submit_pr_reviewtools wrapping OctokitPR_SUPERVISOR_DO,PR_REVIEW_DO,PR_SUMMARY_DO) and configureswrangler.jsoncbindingsopened,synchronize, andready_for_revieweventsPR created automatically by Jules for task 13255727060081887394 started by @jmbish04