feat: synthesize review findings from Alcove transcripts (#186) by decko · Pull Request #213 · decko/raki

decko · 2026-04-25T10:34:27Z

Summary

Extracts ReviewFinding objects from Alcove/bridge session transcripts (test failures, lint errors)
Adds finding_source: Literal["review", "synthesized"] | None field to ReviewFinding model
Knowledge metrics and self_correction_rate exclude synthesized findings (different quality signal)
CLI summary and HTML report distinguish synthesized vs real findings
New test fixture alcove-with-failures.json for end-to-end testing
741 new lines across 14 files

Test plan

uv run pytest tests/ -v -m "not slow" — 1164 passed
Synthesized findings extracted from transcript failures
finding_source field correctly set on all findings
Knowledge metrics skip synthesized findings
self_correction_rate only counts real review findings
CLI/HTML distinguish finding sources

🤖 Generated with SODA + Claude Code

Add `finding_source: Literal["review", "synthesized"] | None = None` to ReviewFinding so callers can distinguish human review findings from findings synthesized from transcript tool failures. Default is None for backward compatibility with existing serialized data. Add comprehensive tests for the new field including roundtrip serialization.
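A minimal sketch of the backward-compatible field described above. It assumes a plain dataclass rather than the project's actual model class. The `issue`, `severity` and `reviewer` field names and the `roundtrip` helper are hypothetical; only `finding_source` and its allowed values come from the commit message.

```python
import json
from dataclasses import asdict, dataclass
from typing import Literal, Optional

# Allowed values taken from the commit message; None marks legacy data.
FindingSource = Optional[Literal["review", "synthesized"]]


@dataclass
class ReviewFinding:
    # Hypothetical fields, for illustration only.
    issue: str
    severity: str = "major"
    reviewer: Optional[str] = None
    # Defaults to None so payloads serialized before this change still load.
    finding_source: FindingSource = None


def roundtrip(finding: ReviewFinding) -> ReviewFinding:
    """Serialize to JSON and back (hypothetical helper)."""
    return ReviewFinding(**json.loads(json.dumps(asdict(finding))))


# A payload written before the field existed has no finding_source key.
legacy = ReviewFinding(**json.loads('{"issue": "missing null check", "severity": "minor"}'))
restored = roundtrip(ReviewFinding(issue="test_login failed", finding_source="synthesized"))
```

Here `legacy.finding_source` is `None`, while `restored.finding_source` keeps the value `"synthesized"` through the roundtrip.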

Add AlcoveAdapter._synthesize_findings() that walks the tool sequence and creates ReviewFinding objects (finding_source='synthesized', severity='major', reviewer='synthesized') from every testing-phase tool call that ended in a failure. Also set finding_source='review' on all explicitly parsed findings from the JSON 'findings' array so consumers can distinguish the two kinds. Deduplication: same failure text across repeated test runs → 1 finding. Synthesis only runs when no explicit findings are present in the JSON.
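The synthesis rules above (testing-phase failures only, deduplication by failure text, no synthesis when explicit findings exist) can be sketched roughly as follows. This is not the body of `AlcoveAdapter._synthesize_findings()`: the `ToolCall` shape, its field names, and the dict-based finding representation are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToolCall:
    # Hypothetical shape of one entry in the tool sequence.
    phase: str
    failed: bool
    output: str


def synthesize_findings(tool_calls: List[ToolCall], explicit_findings: List[dict]) -> List[Dict[str, str]]:
    """Create one finding per distinct testing-phase failure."""
    # Synthesis only runs when the JSON carried no explicit findings.
    if explicit_findings:
        return []

    seen = set()
    findings = []
    for call in tool_calls:
        if call.phase != "testing" or not call.failed:
            continue
        text = call.output.strip()
        # The same failure text across repeated test runs yields one finding.
        if text in seen:
            continue
        seen.add(text)
        findings.append({
            "issue": text,
            "severity": "major",
            "reviewer": "synthesized",
            "finding_source": "synthesized",
        })
    return findings


calls = [
    ToolCall("testing", True, "FAILED tests/test_auth.py::test_login"),
    ToolCall("testing", True, "FAILED tests/test_auth.py::test_login"),  # repeated run
    ToolCall("testing", False, "3 passed"),
    ToolCall("implementation", True, "command not found"),  # not a testing-phase call
]
synthesized = synthesize_findings(calls, explicit_findings=[])
```

With this input, `synthesized` holds a single finding, because the repeated failure is deduplicated and the other two calls do not qualify.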

…thesized findings

Skip findings with finding_source='synthesized' in three metric computations:
- knowledge_gap_rate._compute_with_doc_chunks(): synthesized findings contain raw tool output that matches doc chunks too broadly, polluting the gap rate.
- knowledge_miss_rate._compute_with_doc_chunks(): same reason, same guard.
- self_correction_rate.compute(): synthesized findings are not actionable review feedback; excluding them from the denominator prevents repeated test failures from artificially lowering the correction rate.

Severity metric intentionally unchanged: synthesized findings still count toward review_severity_distribution (real code quality signal).

Knowledge metrics legacy path (context_source='synthesized') was already guarded since ticket #183; this completes the doc_chunks path.
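To illustrate why the denominator matters, here is a rough sketch of the guard. The real definition of `self_correction_rate` is not shown in this PR text, so the formula below (resolved review findings divided by all review findings) and the `resolved` key are assumptions; only the exclusion of `finding_source='synthesized'` comes from the commit message.

```python
from typing import List, Optional


def real_review_findings(findings: List[dict]) -> List[dict]:
    """Drop findings synthesized from transcript tool failures."""
    return [f for f in findings if f.get("finding_source") != "synthesized"]


def self_correction_rate(findings: List[dict]) -> Optional[float]:
    """Assumed formula: resolved / total, over real review findings only."""
    reviewed = real_review_findings(findings)
    if not reviewed:
        return None  # nothing to measure
    resolved = sum(1 for f in reviewed if f.get("resolved"))
    return resolved / len(reviewed)


findings = [
    {"finding_source": "review", "resolved": True},
    {"finding_source": None, "resolved": False},  # legacy finding, treated as real
    {"finding_source": "synthesized", "resolved": False},
    {"finding_source": "synthesized", "resolved": False},
    {"finding_source": "synthesized", "resolved": False},
]
rate = self_correction_rate(findings)  # 1 resolved out of 2 real findings
```

Under this assumed formula the rate is 0.5. Counting the three synthesized findings as well would give 1/5, which is the artificial lowering the commit message describes.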

…report

CLI summary sentence (generate_summary_sentence):
- When both review and synthesized findings exist, appends '(N synthesized from test failures)' as a parenthetical note.
- When only synthesized findings exist, uses a distinct sentence 'N issue(s) synthesized from test/lint failures'.
- Synthesized findings are excluded from the per-severity count so that 'Reviewers found X major issues' stays accurate.

HTML report (report.html.j2):
- Synthesized findings show a grey 'synthesized' badge beside the severity badge so they are visually distinguishable.
- A footnote under the findings list explains what synthesized findings are and which metrics exclude them.
- Reviewer name is suppressed for synthesized findings (shows 'synthesized' already via badge).
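The three CLI rules above could look roughly like this. It is a sketch, not the project's `generate_summary_sentence`: the function name `summary_sentence`, the dict-based findings, and the exact wording outside the quoted fragments are assumptions.

```python
from collections import Counter
from typing import List


def summary_sentence(findings: List[dict]) -> str:
    """Build a one-line summary that separates review and synthesized findings."""
    synthesized = [f for f in findings if f.get("finding_source") == "synthesized"]
    reviewed = [f for f in findings if f.get("finding_source") != "synthesized"]

    if not findings:
        return "No issues found."
    if not reviewed:
        # Only synthesized findings: use the distinct sentence.
        return f"{len(synthesized)} issue(s) synthesized from test/lint failures."

    # Per-severity counts cover real review findings only.
    counts = Counter(f["severity"] for f in reviewed)
    parts = ", ".join(f"{n} {severity}" for severity, n in sorted(counts.items()))
    sentence = f"Reviewers found {parts} issue(s)"
    if synthesized:
        sentence += f" ({len(synthesized)} synthesized from test failures)"
    return sentence + "."


mixed = [
    {"severity": "major", "finding_source": "review"},
    {"severity": "major", "finding_source": None},
    {"severity": "major", "finding_source": "synthesized"},
]
mixed_sentence = summary_sentence(mixed)
only_synth_sentence = summary_sentence([mixed[2]])
```

Keeping synthesized findings out of the `Counter` is what preserves the accuracy of the per-severity figure: the mixed input reports 2 major issues, not 3.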

decko added 5 commits April 25, 2026 07:10

chore(#186): add towncrier fragment changes/186.feature

0627a64

decko merged commit 0d9d886 into main Apr 25, 2026
4 checks passed

decko deleted the soda/186 branch April 25, 2026 11:02

decko mentioned this pull request Apr 25, 2026

feat: synthesize review findings from Alcove transcripts #186

Closed

decko merged 5 commits into main from soda/186
