Problem
Copilot code review drip-feeds ~2 comments per pass. When the responder fixes only the flagged instances, Copilot re-reviews and finds more of the same class in the next round. The implementer also ships code with patterns the reviewer predictably catches (stale docstrings, weak assertions, dead references).
Evidence
Analysis of 32 pipeline PRs:
Top repeated issue classes
| Class | PRs affected | Agent |
| --- | --- | --- |
| Stale docstrings/comments | #736, #781, #719, #706, #768 | Both |
| Weak/misleading test assertions | #654, #726, #760 | Both |
| Same bug in parallel code paths | #736, #727 | Responder |
| Dead code references | #699 | Both |
| Naming mismatches | #654, #719 | Implementer |
Solution
Responder: fix-forward scan (step 5)
After addressing each review comment, the responder identifies the class of issue and scans ALL changed files for other instances of that class. This keeps the reviewer from flagging the same class again in the next round.
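As a rough illustration of the fix-forward scan, the sketch below searches every changed file for one issue class expressed as a regular expression. The function name `scan_for_issue_class`, the `Hit` type, the file contents, and the pattern are all hypothetical; in practice the list of changed files might come from something like `git diff --name-only`, and many issue classes (stale docstrings, for example) need judgment rather than a regex.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    path: str
    line_no: int
    text: str


def scan_for_issue_class(changed_files: dict[str, str], pattern: str) -> list[Hit]:
    """Return every line in the changed files that matches the issue-class pattern."""
    regex = re.compile(pattern)
    hits = []
    for path, content in sorted(changed_files.items()):
        for line_no, line in enumerate(content.splitlines(), start=1):
            if regex.search(line):
                hits.append(Hit(path, line_no, line.strip()))
    return hits


# Hypothetical scenario: the reviewer flagged one substring assertion, so the
# responder looks for the whole class across all changed files, not just the
# flagged line.
changed = {
    "tests/test_a.py": "def test_a():\n    assert 'ok' in result\n",
    "tests/test_b.py": "def test_b():\n    assert 'done' in output\n    assert x == 1\n",
}
hits = scan_for_issue_class(changed, r"assert\s+'[^']*'\s+in\s")
for hit in hits:
    print(f"{hit.path}:{hit.line_no}: {hit.text}")
```

The point of the sketch is the scope of the search: it covers every changed file, so instances the reviewer has not yet commented on are found in the same pass.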
Implementer: pre-push self-review
Before committing, the implementer audits its own changes for the reviewer's most common findings:
- Docstring accuracy (does it still match new behavior?)
- Test assertion strength (exact match vs substring)
- Dead references (renamed/removed → grep for stale refs)
- Parallel code paths (same bug/gap in sibling functions)
- Naming consistency (test names match behavior)
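To make the "exact match vs substring" item concrete, here is a minimal sketch of why a substring assertion is weak. The function `format_status` and both check helpers are hypothetical and exist only for illustration.

```python
def format_status(code: int) -> str:
    """Hypothetical function under test."""
    return "ok" if code == 0 else "not ok"


def weak_check(value: str) -> bool:
    # Substring check: also true for "not ok", so it cannot catch a wrong status.
    return "ok" in value


def strong_check(value: str) -> bool:
    # Exact match: true only for the expected value.
    return value == "ok"


failure = format_status(1)
weak_passes = weak_check(failure)      # the weak assertion accepts the failure output
strong_passes = strong_check(failure)  # the exact assertion rejects it
print(weak_passes, strong_passes)
```

A test built on the substring check would stay green even if the function returned the failure string, which is the kind of assertion the reviewer tends to flag.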
Investigation findings
See #782 for the full research into Copilot review behavior (comment limits, configuration options, platform changes, workarounds).
Key facts:
- No configurable comment limit exists — hardcoded in the product
- GitHub's blog warns against "be more thorough" instructions — adds noise
- Average 5.1 comments/review, 29% produce zero
- Instructions have a 4,000-character limit for code review
- May 2025: "80% more comments per PR" (platform change, not configurable)
- Jul 2025: Lifted file count limits for large PRs
… parse_data() and EventData from public API (#670) #674), the reviewer caught 2 instances of the same class in round 1 that the implementer could have caught before pushing.