Add local code review subagent and multi-model PR review support #13855
✅ Snyk checks have passed. No issues have been found so far.
✅ PR preview is ready!
Summary
This PR refactors the AI code review infrastructure by extracting shared code review instructions into a single source of truth ( Changed files:
Code Quality
Well-structured refactoring. The shared instructions approach follows the DRY principle and is consistent with the existing pattern in the codebase (e.g.,
Minor note: The review instructions now reference
One observation on
Test Coverage
No new tests are added. This is appropriate: the changes are entirely to CI tooling, agent configuration, and build-time scripts. These are not part of the Streamlit library itself, and the sync mechanism is a development-time utility invoked manually or via
Backwards Compatibility
No backwards compatibility concerns. The changes affect only internal CI tooling and developer-facing agent configuration. No library code, public API, protobuf definitions, or user-facing behavior is modified.
Security & Risk
Low risk. The changes are scoped to CI configuration and developer tooling.
Accessibility
No frontend changes; not applicable.
Recommendations
Verdict
APPROVED: Clean refactoring that extracts shared code review instructions into a single source of truth, adds a well-configured local review agent, and improves maintainability of the AI review infrastructure with no risk to library functionality.
This is an automated AI review. Please verify the feedback and use your judgment. Model:
Pull request overview
Adds a reusable “code review instructions” document and wires it into both the GitHub Actions AI PR review workflow and a new local .claude review agent, with a small generator script helper to keep instructions in sync.
Changes:
- Introduce `scripts/assets/code-review-instructions.md` as the shared review prompt content.
- Update the AI PR review workflow to embed the shared instructions into the agent prompt.
- Add a new `.claude/agents/reviewing-local-changes.md` agent and a generator sync step to keep its checklist section aligned.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `scripts/generate_agent_rules.py` | Adds a sync routine to propagate shared review instructions into one or more `.claude` agent files. |
| `scripts/assets/code-review-instructions.md` | New shared, canonical code review checklist/instructions used by automation and local agent docs. |
| `.github/workflows/ai-pr-review.yml` | Refactors the prompt to include the shared instructions file and updates guidance/examples. |
| `.claude/skills/fixing-streamlit-ci/SKILL.md` | Adjusts the “gather context” section to prefer full diffs over `--stat`. |
| `.claude/agents/reviewing-local-changes.md` | Adds a local read-only code review agent definition and instructions. |
| `.claude/.gitignore` | Un-ignores the new `.claude/agents/` content so it can be committed. |
- Change default model to multi-model: gpt-5.2-codex-high,opus-4.6-thinking
- Fix concurrency group to use direct input reference instead of env variable
- Add placeholder test and comment in alert module
Review this branch's changes and ensure the changes are bug-free, backwards compatible, and ready for merge.
## Review Checklist
This section below is auto-synced with the code-review-instructions and shared between the ai-pr-review job
1. Guard against empty/whitespace model input in parse step
- Check if MODEL is empty/whitespace and fall back to default
- Verify at least one model after parsing, fall back if needed
2. Add checkout to consolidate-reviews job
- Now gh/git are actually available as the prompt states
- Added GH_TOKEN env for gh commands
3. Use environment variables in consolidate-reviews job
- Added MODELS_JSON and JUDGE_MODEL as env variables
- Use shell variables instead of ${{ }} for security consistency
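The first item of this commit (guarding against empty or whitespace-only model input) can be sketched as below. The workflow does this in shell; this Python version only illustrates the logic. The function name `parse_models` and the `default` parameter are assumptions, not names from the workflow.

```python
from typing import List, Optional


def parse_models(raw: Optional[str], default: List[str]) -> List[str]:
    """Split a comma-separated model list, falling back to `default`.

    The fallback applies when the input is missing, empty, whitespace-only,
    or contains nothing but separators.
    """
    parts = (raw or "").split(",")
    models = [part.strip() for part in parts if part.strip()]
    # Check for at least one model after parsing, as the commit describes.
    return models if models else list(default)
```

For example, `parse_models(" , ", ["some-default-model"])` returns the default list, while `parse_models("a, b", [...])` returns `["a", "b"]`.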
1. Fix JSON quotes breaking sed:
- Use awk with ENVIRON instead of sed for placeholder substitution
- ENVIRON safely accesses env vars without shell parsing issues
- JSON like ["a","b"] no longer breaks the substitution command
2. Improve fallback documentation:
- Clarify that judge model review is used as deterministic fallback
- Explain this prevents random selection from multiple reviews
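The idea behind the awk/ENVIRON fix is to read the value from the environment and substitute it literally, so that quotes, slashes, or ampersands in the JSON are never parsed as part of a command or pattern. A rough Python analogue follows; the placeholder `__MODELS_JSON__` and the function name are hypothetical, and this is not the workflow's awk command.

```python
import os


def fill_placeholder(template: str, placeholder: str, env_name: str) -> str:
    """Replace `placeholder` with the value of an environment variable.

    str.replace treats both arguments as plain strings, so characters such
    as quotes, '/', '&', or backslashes in the value need no escaping.
    """
    value = os.environ.get(env_name, "")
    return template.replace(placeholder, value)
```

With `MODELS_JSON` set to `["a","b"]`, `fill_placeholder("Models: __MODELS_JSON__", "__MODELS_JSON__", "MODELS_JSON")` yields `Models: ["a","b"]`.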
Summary
This PR adds multi-model AI code review support to the CI pipeline and introduces a local code review sub-agent for Claude. The key changes:
Code Quality
Reviewers agree: The overall structure and separation of responsibilities are well-designed. The four-job pipeline is clean, with clear job boundaries and comprehensive fallback logic (consolidated review → judge model's individual review → failure notice).
Security practices are solid (both reviewers concur): User-provided model names are handled through shell variables rather than
Duplicated sanitization logic (raised by opus-4.6-thinking): The model-name sanitization sed pipeline (
"No write operations" instruction clarity (raised by gpt-5.2-codex-high): The shared instructions say "Do NOT attempt to post comments, edit PRs, or perform any write operations" while the CI workflow appends a "Finalize Review" section that asks agents to write
Python code (
Test Coverage
Both reviewers agree: No automated tests are needed. These are CI infrastructure and agent configuration changes, not user-facing library code. The workflow itself will be validated by running it on actual PRs, and the pre-commit hook (
Backwards Compatibility
Both reviewers agree: No breaking changes. Single-model usage still works: when a single model is specified, the consolidation step is correctly skipped via the
Security & Risk
Both reviewers agree: Low risk. The workflow remains read-only for the agent jobs, the local agent has defense-in-depth with
Accessibility
No frontend changes in this PR. Not applicable.
Recommendations
Verdict
APPROVED: Well-engineered infrastructure change that adds multi-model AI review support with proper security practices, comprehensive fallback handling, and a clean single-source-of-truth pattern for review instructions. Both reviewers approved with only minor non-blocking suggestions.
This is a consolidated AI review by
📋 Review by `gpt-5.2-codex-high`
Summary
Adds multi-model AI PR review support (model parsing, per-model review artifacts, consolidation), introduces a shared code review instructions source synced into the local agent, and updates pre-commit and .claude ignore rules accordingly.
Code Quality
Overall structure and separation of responsibilities look solid. One clarity issue remains: the shared instructions explicitly forbid any write operations while the workflow later requires writing
- Do NOT run linting, tests, or build commands - focus only on code review.
- Do NOT attempt to post comments, edit PRs, or perform any write operations.
- Focus on the root cause of issues, not cascading failures.
- Be specific with file references and line numbers when noting issues.
$(cat scripts/assets/code-review-instructions.md)
## Finalize Review:
1. In the review footer, replace 'This is an automated AI review.' with 'This is an automated AI review by `$MATRIX_MODEL`.'
2. Write your review into a file named 'review.md' in the current directory.
3. Write either 'APPROVED' or 'CHANGES_REQUESTED' (exactly one of these) to a file named 'verdict.txt'.
Test Coverage
No tests added. These are workflow and tooling changes; validation will rely on running the updated workflow in CI.
Backwards Compatibility
No user-facing runtime behavior changes. Existing AI review behavior should remain compatible, with added support for multiple models and consolidation.
Security & Risk
Low risk. The workflow remains read-only for the agent job and only aggregates review outputs; no new privileged operations are introduced.
Accessibility
No frontend changes.
Recommendations
Verdict
APPROVED: Changes are well-scoped and improve the AI review pipeline with minimal operational risk.
This is an automated AI review by
📋 Review by `opus-4.6-thinking`
Summary
This PR introduces multi-model AI code review support for the CI workflow and adds a local code review sub-agent for Claude. The key changes are:
Code Quality
Workflow structure (
Security: The workflow properly avoids script injection by using shell variables (
Python code (
Duplication of sanitization logic: The model-name sanitization (
Test Coverage
The changes are to CI infrastructure, agent configuration, and a development tooling script, none of which are user-facing library code. Automated testing is not standard for these types of changes:
No additional unit or e2e tests are needed.
Backwards Compatibility
No breaking changes. The workflow maintains full backwards compatibility:
Security & Risk
Accessibility
No frontend changes in this PR. Not applicable.
Recommendations
Verdict
APPROVED: Well-engineered infrastructure change that adds multi-model AI review support with proper security practices, comprehensive fallback handling, and a clean single-source-of-truth pattern for review instructions.
This is an automated AI review by
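The fallback order the consolidated review describes (consolidated review, then the judge model's individual review, then a failure notice) can be sketched as a small selection function. This is only an illustration of the ordering: the function name, the `individual` mapping, and the wording of `FAILURE_NOTICE` are assumptions, and the workflow implements this in its own job steps.

```python
from typing import Mapping, Optional

# Hypothetical wording; the workflow's actual notice is not shown in this thread.
FAILURE_NOTICE = "AI review could not be generated."


def pick_final_review(
    consolidated: Optional[str],
    individual: Mapping[str, str],
    judge_model: str,
) -> str:
    """Choose the review to post, in a fixed and deterministic order."""
    if consolidated and consolidated.strip():
        return consolidated
    # Always fall back to the judge's own review, never to an arbitrary
    # one, so reruns with the same inputs give the same result.
    judge_review = individual.get(judge_model, "")
    if judge_review.strip():
        return judge_review
    return FAILURE_NOTICE
```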
- Use fromJson() for explicit type coercion in model_count comparison
- Document judge model selection behavior in workflow input description
- Consolidate sed pipelines for cleaner shell commands
- Simplify file truncation and chain awk/mv commands
Summary
This PR enhances the AI PR review infrastructure with three main capabilities:
Supporting changes include updates to
Code Quality
Workflow structure (
Security-conscious shell scripting: The workflow correctly uses environment variables instead of direct
Sanitization logic duplication (minor, both reviewers noted): The model name sanitization (
Python script (
Subagent config (
Test Coverage
No tests added. Both reviewers agree this is reasonable given the nature of the changes (workflow configuration, agent prompts, build script tooling). The pre-commit hook integration provides a form of validation by running the sync on every commit that touches the relevant files.
Backwards Compatibility
No breaking changes. Both reviewers agree:
Security & Risk
Strengths (both reviewers agreed):
Model parsing robustness: The
Risk assessment: Low. Changes are isolated to CI tooling and don't affect the Streamlit library runtime.
Accessibility
No frontend changes; not applicable.
Recommendations
Reviewer Agreement Summary
Note on disagreement: The opus reviewer recommended adding
Verdict
APPROVED: Well-structured infrastructure enhancement that adds multi-model review support with proper security practices, graceful fallback handling, and a clean DRY approach for shared review instructions. The two minor robustness recommendations are non-blocking. No risk to the Streamlit library or user-facing functionality.
This is a consolidated AI review by
📋 Review by `gpt-5.2-codex-high`
Summary
Adds a local code review agent, introduces shared review instructions with a sync step, and upgrades the AI PR review workflow to support multi-model reviews with consolidation and a judge fallback.
Code Quality
Workflow and script changes are organized and readable. One edge-case to consider: the model parsing pipeline uses
Test Coverage
No tests added; changes are workflow/config and a small Python utility, so this is reasonable. Validation will come from running the workflow and pre-commit hook.
Backwards Compatibility
No user-facing or runtime API changes; impacts are limited to tooling and CI behavior.
Security & Risk
Low risk overall. The only concern is a workflow failure on malformed model input as noted above.
Accessibility
No frontend changes.
Recommendations
Verdict
APPROVED: Solid workflow/tooling improvements with a minor robustness tweak recommended.
This is an automated AI review by
📋 Review by `opus-4.6-thinking`
Summary
This PR enhances the AI PR review infrastructure with two main capabilities:
Supporting changes include updates to
Code Quality
Workflow structure (
Security-conscious shell scripting: The workflow correctly uses environment variables instead of direct
Sanitization logic duplication (minor): The model name sanitization (
Python script (
Subagent config (
Test Coverage
This PR modifies CI/CD infrastructure, agent configuration, and a build script, none of which have existing unit tests in the repository. The
Given the nature of these changes (workflow configuration, agent prompts, script tooling), the lack of automated tests is acceptable. The pre-commit hook integration provides a form of validation by running the sync on every commit that touches the relevant files.
Backwards Compatibility
No breaking changes. The changes are purely to CI/CD infrastructure and developer tooling:
Security & Risk
Strengths:
Minor observations:
Risk assessment: Low. The changes are isolated to CI tooling and don't affect the Streamlit library runtime.
Accessibility
No frontend changes; not applicable.
Recommendations
Verdict
APPROVED: Well-structured infrastructure enhancement that adds multi-model review support with proper security practices, graceful fallback handling, and a clean DRY approach for shared review instructions. No risk to the Streamlit library or user-facing functionality.
This is an automated AI review by
Multi-model support remains available via comma-separated input.
Summary
This PR adds multi-model support to the AI PR review workflow and introduces a local Claude code review subagent. The main changes are:
Code Quality
The code is well-structured and follows existing patterns in the repository. Specific observations:
Test Coverage
This PR contains no unit tests or e2e tests. Given the nature of the changes (CI workflow configuration, agent definition files, and a build-time sync script), this is acceptable:
One improvement would be adding a simple unit test for
Backwards Compatibility
This change is fully backwards compatible:
Security & Risk
Accessibility
No frontend changes are included in this PR. No accessibility considerations apply.
Recommendations
Verdict
APPROVED: Well-structured addition of multi-model review support with good security practices, clean fallback behavior, and appropriate separation of concerns. The recommendations above are minor improvements, not blocking issues.
This is an automated AI review by
This is configured as a subagent (not a skill) so that it always starts with a fresh context and enforces read-only access.
The shared review instructions
Dynamically detect the base branch from PR metadata when available, falling back to develop. This allows the agent to correctly compare against the actual merge target for stacked PRs.
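One way to picture the base-branch detection described in this commit message is sketched below. It assumes the PR metadata arrives as JSON in the shape produced by `gh pr view --json baseRefName`; the function name `detect_base_branch` is hypothetical, and how the agent actually obtains the metadata is not shown in this thread.

```python
import json
from typing import Optional


def detect_base_branch(pr_metadata_json: Optional[str], fallback: str = "develop") -> str:
    """Return the PR's base branch, or `fallback` when it cannot be determined."""
    if not pr_metadata_json:
        return fallback  # no PR metadata available, e.g. a purely local branch
    try:
        data = json.loads(pr_metadata_json)
    except json.JSONDecodeError:
        return fallback
    base = data.get("baseRefName") if isinstance(data, dict) else None
    if isinstance(base, str) and base.strip():
        return base.strip()
    return fallback
```

For a stacked PR whose metadata reports `{"baseRefName": "feature/parent"}`, the diff would then be taken against `feature/parent` rather than `develop`.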
Describe your changes
Refactors our `ai-pr-review` workflow to:
- Make the `ai-pr-review` workflow support running reviews with multiple models. The model reviews are run in parallel and consolidated by a judge. For example, `gpt-5.2-codex-high,opus-4.6-thinking` will first execute a review with `gpt-5.2-codex-high` and `opus-4.6-thinking` in parallel and then use `opus-4.6-thinking` to consolidate both sets of feedback into one. Example: Add local code review subagent and multi-model PR review support #13855 (comment)
Contribution License Agreement
By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.