Wrap <section class='worst-sessions'> in {% if worst_sessions %} so the
entire section is omitted from the HTML output when no retrieval metrics
produce a session ranking.
Previously the section was always rendered, showing 'Worst 0 Sessions'
and an empty-state paragraph. This was visual noise when the report was
generated without --judge.
Also updates test_no_needs_attention_when_all_pass to use a direct
element-presence assertion instead of relying on the now-removed
'Worst 0 Sessions' heading text.
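An element-presence assertion of the kind described above could be sketched with the standard library alone. This is only an illustration: the helper name has_element and the parser class are hypothetical and are not taken from the project's test suite, which may check the rendered HTML in a simpler way.

```python
from html.parser import HTMLParser


class _ClassFinder(HTMLParser):
    """Records whether a start tag with a given CSS class was seen."""

    def __init__(self, tag: str, css_class: str) -> None:
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names before calling this hook.
        if tag != self.tag:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.found = True


def has_element(html: str, tag: str, css_class: str) -> bool:
    """Hypothetical helper: True if <tag class="... css_class ..."> occurs."""
    finder = _ClassFinder(tag, css_class)
    finder.feed(html)
    finder.close()
    return finder.found
```

Checking for the element itself keeps a test independent of heading wording such as 'Worst 0 Sessions', so the test does not break when the heading text changes or disappears.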
…ing and ≥3 session threshold (#284)
Replace the exact-text-match grouping with a greedy Jaccard-similarity clustering algorithm:
* Issues whose token sets have Jaccard similarity ≥ 0.60 are merged into the same cluster (cross-file fuzzy matching).
* Raise the session threshold from >1 to ≥3, eliminating single-pair noise from the Recurring Failures table.
* Tokenisation reuses raki.metrics.knowledge._common.tokenize() for consistent stop-word removal and lower-casing.
* Internal _IssueCluster dataclass replaces the untyped dict accumulator.
New tests cover: 3-session pass, 2-session exclusion, Jaccard merging, Jaccard non-merging, severity escalation, and count-descending sort.
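The clustering described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the tokenize() here is a stand-in for raki.metrics.knowledge._common.tokenize() (whose real stop-word list is not shown in this PR), IssueCluster is a simplified guess at what _IssueCluster might hold, and comparing each issue against the first member of a cluster is one possible reading of "greedy". Severity escalation is left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field

JACCARD_THRESHOLD = 0.60  # similarity needed to merge, per the commit message
MIN_SESSIONS = 3          # a cluster must span at least this many sessions

# Assumption: a tiny stop-word list standing in for the real tokenizer's.
_STOP_WORDS = {"the", "a", "an", "in", "of", "to", "is"}


def tokenize(text: str) -> frozenset[str]:
    """Stand-in tokenizer: lower-case, split on whitespace, drop stop words."""
    return frozenset(w for w in text.lower().split() if w not in _STOP_WORDS)


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """|a ∩ b| / |a ∪ b|, treating two empty sets as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class IssueCluster:
    """Hypothetical, simplified cluster record."""
    label: str
    tokens: frozenset[str]
    sessions: set[str] = field(default_factory=set)


def cluster_issues(issues: list[tuple[str, str]]) -> list[IssueCluster]:
    """Greedily cluster (session_id, issue_text) pairs.

    Each issue joins the first existing cluster whose representative
    token set is similar enough; otherwise it starts a new cluster.
    """
    clusters: list[IssueCluster] = []
    for session_id, text in issues:
        tokens = tokenize(text)
        for cluster in clusters:
            if jaccard(tokens, cluster.tokens) >= JACCARD_THRESHOLD:
                cluster.sessions.add(session_id)
                break
        else:
            clusters.append(IssueCluster(text, tokens, {session_id}))
    # Keep only clusters seen in enough sessions, most frequent first.
    recurring = [c for c in clusters if len(c.sessions) >= MIN_SESSIONS]
    recurring.sort(key=lambda c: len(c.sessions), reverse=True)
    return recurring
```

With this stand-in tokenizer, "SQL injection in user query handler" and "SQL injection in user query builder" share 4 of 6 distinct tokens (similarity about 0.67), so they land in one cluster, while an issue seen in only two sessions is filtered out by the threshold.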
Wrap the entire <section> containing the Recurring Failures table in an
{% if recurring_failures %} guard so the section is completely omitted
from the rendered HTML when no issue cluster meets the ≥3 session
threshold.
Previously the section was always shown with an empty-state paragraph,
which was misleading noise at the top of the page.
Adds test_recurring_failures_section_hidden_when_empty to verify the
section element does not appear in reports with no qualifying failures.
…ing (#284)
Three test updates caused by the new ≥3 session threshold and hidden sections:
* test_recurring_failures_section: switched from _make_report_with_samples (SQL injection in 2 sessions, below threshold) to the new _make_report_with_recurring_failures fixture (3 sessions) and updated the assertion to check for the actual section heading element.
* test_no_recurring_failures_shows_empty_state: behaviour changed from 'show empty-state paragraph' to 'hide section entirely'; now asserts that <h2>Recurring Failures</h2> is absent from the HTML.
* test_no_worst_sessions_shows_empty_state: behaviour changed from 'show empty-state paragraph' to 'hide section entirely'; now asserts that class="worst-sessions" is absent from the HTML.
Also adds _make_report_with_recurring_failures() helper for tests that need a report where the recurring-failures threshold is met.
Summary
Hide the Worst N Sessions and Recurring Failures sections in the HTML report when they have no data to display. Both sections are now omitted entirely rather than rendered as empty placeholders.
The recurring-failure detector was also improved: exact-text grouping was replaced with Jaccard-similarity clustering (token-set similarity ≥ 0.60) so near-duplicate findings across different files are merged into one row. The minimum session threshold was raised from 2 to 3 to filter out one-off coincidences.
Acceptance Criteria
* {% if worst_sessions %}, entire <section> omitted when list is empty
* tokenize() from _common.py
* {% if recurring_failures %}, entire <section> omitted
* changes/284.fix present with clear description
Review Results
All 1617 tests passed (4 skipped, 4 deselected).
ruff check, ruff format, and ty check all clean.
Refs #284
Assisted-by: Claude Opus 4.6 (1M context) noreply@anthropic.com
Assigned-by: decko