@@ -142,7 +142,7 @@ After the reviewer batch finishes, launch `ReviewJudge` with:
- the same review target
- the full reviewer outputs from every reviewer that ran, including timeout/cancel/failure notes
- if file splitting was used, include outputs from **all** same-role instances and label each by group (e.g. "Security Reviewer [group 1/3]")
-- an instruction to validate, reject, merge, or downgrade findings, and to deduplicate any overlapping findings from same-role instances
+- an instruction to validate, reject, merge, or downgrade findings from a **third-party perspective** — the judge primarily examines reviewer reports for logical consistency and evidence quality, and only uses code inspection tools for targeted spot-checks when a specific claim needs verification

If the execution policy says `judge_timeout_seconds > 0`, pass `timeout_seconds` with that value to the judge Task call.

@@ -152,6 +152,7 @@ The judge must explicitly call out:

- likely false positives
- optimization advice that is too risky or directionally wrong
+- findings where the reviewer's evidence does not support their conclusion
- which findings should survive into the final report

### Phase 4: Report and wait for user approval
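The launch rules in the two hunks above can be sketched in TypeScript. This is only a minimal sketch under stated assumptions: `buildJudgeTask`, `labelReviewer`, and the three interfaces are hypothetical names that do not appear in the PR. Only `ReviewJudge`, `judge_timeout_seconds`, `timeout_seconds`, and the `[group 1/3]` label format are taken from the diff.

```typescript
// Hypothetical shapes for illustration; the real orchestrator types are not shown in the PR.
interface ExecutionPolicy {
  judge_timeout_seconds: number;
}

interface ReviewerOutput {
  role: string; // e.g. "Security Reviewer"
  group?: string; // e.g. "1/3" when file splitting was used
  status: "ok" | "timeout" | "cancelled" | "failed";
  text: string; // full reviewer output, or the failure note
}

interface JudgeTaskArgs {
  subagent: "ReviewJudge";
  prompt: string;
  timeout_seconds?: number;
}

// Label same-role instances by group so the judge can tell them apart.
function labelReviewer(o: ReviewerOutput): string {
  return o.group ? `${o.role} [group ${o.group}]` : o.role;
}

function buildJudgeTask(
  target: string,
  outputs: ReviewerOutput[],
  policy: ExecutionPolicy
): JudgeTaskArgs {
  // Every reviewer that ran is included, even those that timed out or failed.
  const sections = outputs.map(
    (o) => `## ${labelReviewer(o)} (${o.status})\n${o.text}`
  );
  const prompt = [
    `Review target: ${target}`,
    ...sections,
    "Validate, reject, merge, or downgrade each finding from a third-party perspective.",
  ].join("\n\n");

  const args: JudgeTaskArgs = { subagent: "ReviewJudge", prompt };
  // Forward the timeout only when the policy sets a positive value.
  if (policy.judge_timeout_seconds > 0) {
    args.timeout_seconds = policy.judge_timeout_seconds;
  }
  return args;
}
```

With `judge_timeout_seconds` set to `0`, the returned object has no `timeout_seconds` key at all, which matches the "if ... > 0" wording of the instruction.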
@@ -2,7 +2,7 @@ You are the **Review Quality Inspector** for BitFun deep reviews.

{LANGUAGE_PREFERENCE}

-You are not another broad reviewer. Your job is to validate the outputs from the specialist reviewers and prevent false positives, low-signal nitpicks, or directionally-wrong optimization advice from reaching the final report.
+Your primary role is an independent third-party arbiter that validates the **reports submitted by other reviewers**. You do not perform a broad independent code review from scratch. Instead, you examine each reviewer's findings from a logical and evidentiary standpoint, and use code inspection tools **only when necessary** to verify specific claims made by reviewers.

## Inputs

@@ -18,20 +18,24 @@ You will receive:
For every candidate finding from the reviewers:

1. decide whether it is **validated**, **downgraded**, or **rejected**
-2. verify it against the code/diff when needed
-3. check whether the suggested fix is actually safe and directionally correct
-4. if multiple same-role instances reported overlapping or duplicate findings, **merge them into a single finding** with the strongest severity and evidence
+2. evaluate the **internal consistency** of the reviewer's reasoning — does the evidence they cited actually support their conclusion?
+3. when a finding's validity is unclear from the reviewer's report alone, use read-only tools to **spot-check the specific code location** the reviewer referenced
+4. check whether the suggested fix direction is **logically sound** and **safe in principle**
+5. if multiple same-role instances reported overlapping or duplicate findings, **merge them into a single finding** with the strongest severity and evidence

+**Important**: Your code inspection should be targeted and minimal. Do not broadly re-review the codebase. Only inspect specific lines or files when a reviewer's claim needs verification or when you suspect a false positive / false negative.
+
Be especially skeptical of:

- speculative bugs with no evidence
- "optimize this" advice without meaningful impact
- recommendations that would widen scope or add risk without strong payoff
- duplicated findings reported by multiple reviewers or multiple same-role instances
+- findings where the stated evidence does not logically lead to the stated conclusion

## Tools

-Use only read-only investigation:
+Use read-only investigation when needed:

- `GetFileDiff`
- `Read`
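The merge step in the prompt above ("merge them into a single finding with the strongest severity and evidence") could be approximated as follows. This is a rough sketch, not the judge's actual behavior: the judge is a prompted agent, and the `Finding` shape, the severity scale, and the `location` + `claim` dedup key are all assumptions made for illustration.

```typescript
// Hypothetical finding shape; the PR does not define a structured format.
type Severity = "low" | "medium" | "high" | "critical";

interface Finding {
  reviewer: string; // e.g. "Security Reviewer [group 1/3]"
  location: string; // assumed "path:line" reference cited by the reviewer
  claim: string;
  severity: Severity;
  evidence: string[];
}

const SEVERITY_RANK: Record<Severity, number> = {
  low: 0,
  medium: 1,
  high: 2,
  critical: 3,
};

// Collapse findings that share a location and a (normalized) claim.
// The merged finding keeps the strongest severity and the union of evidence.
function mergeDuplicates(findings: Finding[]): Finding[] {
  const merged = new Map<string, Finding>();
  for (const f of findings) {
    const key = `${f.location}::${f.claim.trim().toLowerCase()}`;
    const prev = merged.get(key);
    if (prev === undefined) {
      // Copy so the caller's objects are never mutated.
      merged.set(key, { ...f, evidence: f.evidence.slice() });
      continue;
    }
    if (SEVERITY_RANK[f.severity] > SEVERITY_RANK[prev.severity]) {
      prev.severity = f.severity;
    }
    for (const e of f.evidence) {
      if (prev.evidence.indexOf(e) === -1) prev.evidence.push(e);
    }
    prev.reviewer = `${prev.reviewer}, ${f.reviewer}`;
  }
  return Array.from(merged.values());
}
```

Exact-match keys are deliberately crude. Two reviewers rarely phrase the same issue identically, which is presumably why the PR leaves deduplication to the judge's reasoning rather than to string comparison.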
@@ -35,7 +35,7 @@ define_readonly_subagent!(
ReviewJudgeAgent,
REVIEW_JUDGE_AGENT_TYPE,
"Review Quality Inspector",
-r#"Independent read-only quality inspector that validates reviewer findings, removes false positives, checks whether optimization advice is directionally correct, and decides what should appear in the final deep-review report."#,
+r#"Independent third-party arbiter that validates reviewer reports for logical consistency and evidence quality. It spot-checks specific code locations only when a claim needs verification, rather than re-reviewing the codebase from scratch."#,
"review_quality_gate_agent",
&["Read", "Grep", "Glob", "LS", "GetFileDiff", "Git"]
);
76 changes: 66 additions & 10 deletions src/web-ui/src/app/scenes/agents/components/ReviewTeamPage.scss
@@ -38,7 +38,7 @@
width: 100%;
padding: 16px;
border: 0;
-border-radius: 0;
+border-radius: inherit;
background: transparent;
color: var(--color-text-primary);
text-align: left;
@@ -79,7 +79,7 @@

&__policy-metrics {
display: grid;
-grid-template-columns: repeat(5, minmax(86px, 1fr));
+grid-template-columns: repeat(auto-fit, minmax(86px, 1fr));
gap: 8px;
min-width: 0;
}
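The switch from `repeat(5, ...)` to `repeat(auto-fit, ...)` in the hunk above changes how many columns the metrics grid gets in narrow containers. The arithmetic can be sketched in TypeScript. The helper names are hypothetical, and the sketch assumes the width passed in is the grid's content-box width, with the `86px` minimum and `8px` gap taken from the rule shown in the diff.

```typescript
// Width needed to hold `columns` tracks at their minimum size, including gaps.
function minWidthForColumns(columns: number, minTrackPx = 86, gapPx = 8): number {
  return columns * minTrackPx + (columns - 1) * gapPx;
}

// Number of tracks `repeat(auto-fit, minmax(minTrackPx, 1fr))` can create
// without overflowing: the largest n with n*min + (n-1)*gap <= width,
// and never fewer than one track.
function autoFitTrackCount(containerWidthPx: number, minTrackPx = 86, gapPx = 8): number {
  const fit = Math.floor((containerWidthPx + gapPx) / (minTrackPx + gapPx));
  return Math.max(1, fit);
}
```

Under these assumptions, a fixed five-column grid needs `minWidthForColumns(5)` = 462px before it stops overflowing, while with `auto-fit` a 300px container drops to `autoFitTrackCount(300)` = 3 columns. In wide containers `auto-fit` collapses unused tracks, so the existing items still stretch across the row.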
@@ -113,6 +113,14 @@
}
}

+/* Sections that render their own cards should not show the default body frame. */
+&__section--no-body-frame .bitfun-config-page-section__body {
+background: transparent;
+border: 0;
+border-radius: 0;
+overflow: visible;
+}
+
&__policy-action {
display: inline-flex;
align-items: center;
@@ -283,16 +291,12 @@

&__member-card {
display: flex;
align-items: flex-start;
gap: $size-gap-3;
flex-direction: column;
width: 100%;
min-height: 148px;
padding: 14px;
border-radius: 8px;
border: 1px solid var(--border-subtle);
background: color-mix(in srgb, var(--element-bg-soft) 78%, transparent);
text-align: left;
cursor: pointer;
transition:
transform $motion-fast $easing-standard,
border-color $motion-fast $easing-standard,
@@ -310,6 +314,30 @@
0 0 0 1px color-mix(in srgb, var(--member-accent, #64748b) 30%, transparent),
0 12px 24px color-mix(in srgb, var(--shadow-color, #0f172a) 10%, transparent);
}

+&.is-expanded {
+grid-column: 1 / -1;
+}
+}
+
+&__member-card-header {
+display: flex;
+align-items: flex-start;
+gap: $size-gap-3;
+width: 100%;
+min-height: 148px;
+padding: 14px;
+border: 0;
+border-radius: inherit;
+background: transparent;
+color: var(--color-text-primary);
+text-align: left;
+cursor: pointer;
+
+&:focus-visible {
+outline: none;
+box-shadow: inset 0 0 0 2px color-mix(in srgb, var(--member-accent, #64748b) 50%, transparent);
+}
+}

&__member-card-icon {
@@ -385,11 +413,29 @@
color: color-mix(in srgb, #f59e0b 82%, var(--color-text-muted));
}

+&__member-card-chevron {
+display: inline-flex;
+align-items: center;
+align-self: flex-start;
+color: var(--color-text-muted);
+margin-top: 2px;
+}
+
+&__member-card-detail {
+overflow: hidden;
+animation: review-team-card-expand $motion-base $easing-standard forwards;
+border-top: 1px solid var(--border-subtle);
+}
+
+&__member-card-detail-inner {
+padding: 16px;
+}
+
&__detail-hero {
display: flex;
align-items: flex-start;
gap: $size-gap-4;
-padding: 16px;
+padding: 0 0 16px;
border-bottom: 1px solid var(--border-subtle);
}

@@ -454,8 +500,7 @@
display: flex;
flex-direction: column;
gap: 10px;
-padding: 16px;
-border-bottom: 1px solid var(--border-subtle);
+padding: 16px 0 0;
}

&__block-label {
@@ -510,6 +555,17 @@
}
}

+@keyframes review-team-card-expand {
+from {
+opacity: 0;
+max-height: 0;
+}
+to {
+opacity: 1;
+max-height: 800px;
+}
+}
+
@media (max-width: 960px) {
.review-team-page {
&__summary-grid,
@@ -299,7 +299,7 @@ describeWithJsdom('ReviewTeamPage', () => {
memberButton!.dispatchEvent(new dom.window.MouseEvent('click', { bubbles: true }));
});

-expect(container.textContent).toContain('Member Detail');
+expect(container.textContent).toContain('Responsibilities');
expect(container.textContent).toContain('Logic');
});
});