feat: add pre-debate hallucination filter (#428) #433
Validates evidence documents against the actual diff before the L2 debate:
- Remove findings referencing files not in the diff
- Remove findings with line ranges outside diff hunks
- Penalize confidence for fabricated code quotes

Zero model cost. Catches obvious hallucinations such as references to bundled dependency files or non-existent code patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 Walkthrough
A new hallucination filter module validates evidence documents against the actual diff content before pipeline confidence computation. The filter removes findings when their filePath or lineRange don't match the diff, and penalizes confidence for code quotes not present in the diff. The filter is integrated into the pipeline orchestrator after learned-pattern dismissal.
Sequence Diagram
sequenceDiagram
participant Pipeline
participant HallucinationFilter
participant DiffParser
participant DocumentValidator
Pipeline->>HallucinationFilter: filterHallucinations(allEvidenceDocs, diffContent)
HallucinationFilter->>DiffParser: Parse diff content
DiffParser-->>HallucinationFilter: filePaths set, hunkMap
loop For each EvidenceDocument
HallucinationFilter->>DocumentValidator: Validate filePath in diff
DocumentValidator-->>HallucinationFilter: exists or not
alt filePath not in diff
HallucinationFilter->>HallucinationFilter: Mark as removed
else filePath exists
HallucinationFilter->>DocumentValidator: Validate lineRange overlap
alt lineRange overlaps hunk
HallucinationFilter->>DocumentValidator: Check code quotes in diff
alt quotes found
HallucinationFilter->>HallucinationFilter: Keep with original confidence
else quotes not found
HallucinationFilter->>HallucinationFilter: Halve confidence, keep
end
else lineRange no overlap
HallucinationFilter->>HallucinationFilter: Mark as removed
end
end
end
HallucinationFilter-->>Pipeline: FilterResult {filtered, removed}
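The flow in the diagram can be sketched as one function. This is a minimal illustration under stated assumptions, not the PR's code: the `Finding` and `FilterOutcome` types, the `sketchFilter` name, and the pre-parsed `diffFiles` and `hunkMap` arguments are hypothetical. Only the three checks, the quote regex, and the halving penalty come from this page. Unlike the quoted PR code, the sketch applies no hunk tolerance and removes a document whose file has no recorded hunks.

```typescript
// Hypothetical types for illustration; the real EvidenceDocument has more fields.
interface Finding {
  filePath: string;
  lineRange: [number, number];
  problem: string;
  confidence: number;
}

interface FilterOutcome {
  filtered: Finding[];
  removed: Finding[];
}

type HunkRange = [number, number];

function sketchFilter(
  docs: Finding[],
  diffFiles: Set<string>,
  hunkMap: Map<string, HunkRange[]>,
  diffContent: string,
): FilterOutcome {
  const filtered: Finding[] = [];
  const removed: Finding[] = [];

  for (const doc of docs) {
    // Check 1: the file must appear in the diff.
    if (!diffFiles.has(doc.filePath)) {
      removed.push(doc);
      continue;
    }

    // Check 2: the line range must overlap at least one hunk (no tolerance here).
    const hunks = hunkMap.get(doc.filePath) ?? [];
    const overlaps = hunks.some(
      ([start, end]) => doc.lineRange[0] <= end && doc.lineRange[1] >= start,
    );
    if (!overlaps) {
      removed.push(doc);
      continue;
    }

    // Check 3: halve confidence when most backtick-quoted snippets are absent from the diff.
    const quotes: string[] = doc.problem.match(/`([^`]{10,})`/g) ?? [];
    const missing = quotes.filter((q) => !diffContent.includes(q.slice(1, -1)));
    const penalized = quotes.length > 0 && missing.length > quotes.length / 2;
    filtered.push(
      penalized ? { ...doc, confidence: Math.round(doc.confidence * 0.5) } : doc,
    );
  }

  return { filtered, removed };
}
```

Returning a copy for penalized documents keeps the caller's input unchanged, which makes the function easier to test in isolation.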
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~35 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
CodeAgora Review
📋 Triage: 4 must-fix · 4 verify · 5 ignore
Verdict: 🔴 REJECT · 4 critical · 9 warning
All three CRITICAL findings have high confidence (75–90%) and survived debate; the strongest (d002) shows an undefined `computeValue()` call that will break compilation, while d003 documents a stray backtick that will also break syntax, and d001 identifies a logic flaw that can silently drop evidence. None of these were debunked (only one reviewer dissented on d001 and d003, without producing counter-evidence), so the change is unsafe to merge.
Blocking Issues
| Severity | File | Line | Issue | Confidence |
|---|---|---|---|---|
| 🔴 CRITICAL | packages/core/src/pipeline/hallucination-filter.ts | 54–61 | Incorrect Handling of Line Ranges in Hallucination Filter | 🟡 64% |
| 🔴 CRITICAL | src/utils.ts | 10–12 | Undefined computeValue in utils.ts | 🟢 82% |
| 🔴 CRITICAL | src/index.ts | 7–7 | Syntax Error – stray backtick in index.ts | 🟢 93% |
| 🔴 CRITICAL | src/index.ts | 2–2 | Unused / Potentially Missing Import newModule in index.ts | 🟢 87% |
9 warning(s)
| Severity | File | Line | Issue | Confidence |
|---|---|---|---|---|
| 🟡 WARNING | packages/core/src/pipeline/hallucination-filter.ts | 73 | Inconsistent Confidence Calculation for Fabricated Code Quotes | 🟡 44% |
| 🟡 WARNING | packages/core/src/pipeline/hallucination-filter.ts | 47 | Potential Null Pointer Exception in Hallucination Filter | 🟡 54% |
| 🟡 WARNING | packages/core/src/pipeline/hallucination-filter.ts | 42 | Inconsistent Handling of Unknown File Paths | 🟡 48% |
| 🟡 WARNING | packages/core/src/pipeline/orchestrator.ts | 743 | Potential Insecure Use of console.log in Orchestrator | 🟡 56% |
| 🟡 WARNING | packages/core/src/pipeline/hallucination-filter.ts | 24 | Missing Sanitization of Shell Args in Hallucination Filter | 🟡 78% |
| 🟡 WARNING | packages/core/src/pipeline/hallucination-filter.ts | 28 | Potential Resource Leak in Hallucination Filter | 🟡 72% |
| 🟡 WARNING | packages/core/src/pipeline/hallucination-filter.ts | 54 | Potential Off-by-One Error in Hallucination Filter | 🟡 46% |
| 🟡 WARNING | packages/core/src/pipeline/hallucination-filter.ts | 81 | No Confidence Penalty for Non-Hallucinated Documents | 🔴 32% |
| 🟡 WARNING | packages/core/src/pipeline/hallucination-filter.ts | 47 | No Error Handling for File Existence Check | 🟡 42% |
Issue distribution (4 file(s))
| File | Issues |
|---|---|
| packages/core/src/pipeline/hallucination-filter.ts | ████████████ 9 |
| src/index.ts | ███ 2 |
| src/utils.ts | █ 1 |
| packages/core/src/pipeline/orchestrator.ts | █ 1 |
Agent consensus log (3 discussion(s))
✅ d001 — 1 round(s), consensus → CRITICAL
Verdict: CRITICAL — Majority consensus (2/3 agree)
✅ d002 — 1 round(s), consensus → CRITICAL
Verdict: CRITICAL — All supporters agreed on the issue
✅ d003 — 1 round(s), consensus → CRITICAL
Verdict: CRITICAL — Majority consensus (2/3 agree)
CodeAgora · Session: 2026-04-01/001
    // Check 2: Line range overlaps with diff hunks
    if (doc.filePath !== 'unknown' && doc.lineRange[0] > 0) {
      const hunks = hunkMap.get(doc.filePath);
🔴 CRITICAL — Incorrect Handling of Line Ranges in Hallucination Filter
Confidence: 🟡 64%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:54-61
The current implementation of the hallucination filter has a logical flaw in how it handles line ranges. Specifically, it checks if doc.lineRange[0] > 0 before verifying if the line range overlaps with any diff hunks. This condition is unnecessary and could potentially lead to incorrect filtering of evidence documents.
Evidence:
- The condition `doc.lineRange[0] > 0` is not required because the subsequent check `hunks.some(([start, end]) => ...)` will correctly handle all line ranges, including those that start at or below line 0.
- The current implementation may incorrectly remove evidence documents that have line ranges starting at 0 or below, which could be valid.
Suggested change:
    -      const hunks = hunkMap.get(doc.filePath);
    +    // Check 2: Line range overlaps with diff hunks
    +    if (doc.filePath !== 'unknown') {
    +      const hunks = hunkMap.get(doc.filePath);
    +      if (hunks && hunks.length > 0) {
    +        const HUNK_TOLERANCE = 10; // Allow some tolerance for context lines
    +        const overlaps = hunks.some(([start, end]) =>
    +          doc.lineRange[0] <= end + HUNK_TOLERANCE &&
    +          doc.lineRange[1] >= start - HUNK_TOLERANCE
    +        );
    +        if (!overlaps) {
    +          removed.push(doc);
    +          continue;
    +        }
    +      }
    +    }
🔍 Individual Reviews (2 reviewers)
r-scout 💬 meta-llama/llama-4-scout-17b-16e-instruct (🔴 CRITICAL)
Problem: In packages/core/src/pipeline/hallucination-filter.ts:54-61 The current implementation of the hallucination filter has a logical flaw in how it handles line ranges. Specifically, it checks if `doc....
Evidence:
- The condition `doc.lineRange[0] > 0` is not required because the subsequent check `hunks.some(([start, end]) => ......
- The current implementation may incorrectly remove evidence documents that have line ranges starting at 0 or below, which could be valid.
Suggestion: Remove the unnecessary condition `doc.lineRange[0] > 0` to ensure that all line ranges are properly checked for overlap with diff hunks....
r-llama31 💬 llama-3.1-8b-instant (🟡 WARNING)
Problem: In packages/core/src/pipeline/hallucination-filter.ts:54-65 The hallucination filter function uses the `hunkMap....
Evidence:
- Lack of null check for the get() function result.
- Potential off-by-one error if the get() function returns undefined.
Suggestion: Add a null check for the get() function result to prevent potential off-by-one errors.
Flagged by: r-scout, r-llama31 | CodeAgora
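The overlap test quoted in this comment is a closed-interval intersection widened by a tolerance on both sides. A minimal sketch follows; the `overlapsAnyHunk` name and signature are hypothetical, while the comparison and the tolerance of 10 are taken from the quoted code.

```typescript
type HunkRange = [number, number];

// Tolerance value taken from the quoted code (context lines around a hunk).
const HUNK_TOLERANCE = 10;

// Returns true when the finding's line range intersects any hunk,
// after widening each hunk by `tolerance` lines on both sides.
function overlapsAnyHunk(
  lineRange: [number, number],
  hunks: HunkRange[],
  tolerance: number = HUNK_TOLERANCE,
): boolean {
  return hunks.some(
    ([start, end]) =>
      lineRange[0] <= end + tolerance && lineRange[1] >= start - tolerance,
  );
}
```

With a single hunk covering lines 50 to 60, a finding at lines 65 to 70 passes because 65 is within 60 + 10, while a finding at lines 71 to 75 does not.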
    const codeQuotes = doc.problem.match(/`([^`]{10,})`/g);
    if (codeQuotes && codeQuotes.length > 0) {
      let fabricatedCount = 0;
      for (const quote of codeQuotes) {
🟡 WARNING — Inconsistent Confidence Calculation for Fabricated Code Quotes
Confidence: 🟡 44%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:73-84
The confidence calculation for fabricated code quotes seems inconsistent. It checks if more than half of the code quotes are fabricated, and if so, it reduces the confidence by half. However, this approach may not be suitable for all cases.
Evidence:
- The current implementation only checks if more than half of the code quotes are fabricated, which may not be sufficient to determine the overall confidence.
- The reduction of confidence by half may be too harsh, as it does not take into account the severity of the fabricated code quotes.
- There is no consideration for the case where all code quotes are fabricated, which should likely result in a significant reduction in confidence.
Suggested change:
    -      for (const quote of codeQuotes) {
    +      if (fabricatedCount > 0) {
    +        const confidenceReduction = (fabricatedCount / codeQuotes.length) * 0.5;
    +        doc.confidence = Math.round((doc.confidence ?? 50) * (1 - confidenceReduction));
    +      }
Flagged by: r-llama33 | CodeAgora
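The two penalty rules discussed in this comment give different numbers for the same inputs. The sketch below puts them side by side; the function names are hypothetical, and the formulas are the ones quoted on this page (halving on a strict majority in the PR, a proportional reduction capped at 50% in the reviewer's suggestion).

```typescript
// Rule quoted from the PR: halve confidence only when a strict majority of quotes is fabricated.
function majorityPenalty(confidence: number, fabricated: number, total: number): number {
  return fabricated > total / 2 ? Math.round(confidence * 0.5) : confidence;
}

// Rule from the reviewer's suggestion: scale the reduction by the fabricated share, up to 50%.
function proportionalPenalty(confidence: number, fabricated: number, total: number): number {
  if (total === 0 || fabricated === 0) return confidence;
  const reduction = (fabricated / total) * 0.5;
  return Math.round(confidence * (1 - reduction));
}
```

For a starting confidence of 80 and four quotes: with one fabricated quote the majority rule leaves 80 and the proportional rule gives 70; with three fabricated quotes the results are 40 and 50; with all four fabricated both rules give 40.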
    }

    // Check 1: File exists in diff
    if (doc.filePath !== 'unknown' && !diffFiles.has(doc.filePath)) {
🟡 WARNING — Potential Null Pointer Exception in Hallucination Filter
Confidence: 🟡 54%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:47-51
The doc.filePath is checked for existence in the diffFiles set, but it is not checked for null or undefined. If doc.filePath is null or undefined, a null pointer exception may occur.
Evidence:
- The `doc.filePath` is not explicitly checked for null or undefined before being used in the `diffFiles.has()` method.
- The `diffFiles` set is created from the `extractFileListFromDiff()` function, which may return null or undefined if the diff content is invalid.
Suggested change:
    -    if (doc.filePath !== 'unknown' && !diffFiles.has(doc.filePath)) {
    +    if (doc.filePath !== null && doc.filePath !== undefined && !diffFiles.has(doc.filePath)) {
    +      removed.push(doc);
    +      continue;
    +    }
🔍 Individual Reviews (2 reviewers)
r-llama33 💬 llama-3.3-70b-versatile (🟡 WARNING)
Problem: In packages/core/src/pipeline/hallucination-filter.ts:47-51 The `doc.filePath` is checked for existence in the `diffFiles` set, but it is not checked for null or undefined. If `doc....
Evidence:
- The `doc.filePath` is not explicitly checked for null or undefined before being used in the `diffFiles.has()` method.
- The `diffFiles` set is created from the `extractFileListFromDiff()` function, which may return null or undefined if the diff content is invalid.
Suggestion: Add a null check for `doc.filePath` before using it in the `diffFiles.has()` method: `if (doc.filePath !== null && doc.filePath !== undefined && !diffFiles.has(doc.filePath)) { removed....`
r-llama31 💬 llama-3.1-8b-instant (🟡 WARNING)
Problem: In packages/core/src/pipeline/hallucination-filter.ts:47-50 The hallucination filter function does not handle errors when checking file existence....
Evidence:
- Lack of error handling for file existence check.
- Potential unhandled errors and unexpected behavior.
Suggestion: Add error handling for file existence check to handle potential errors and ensure expected behavior. --- No issues found.
Flagged by: r-llama33, r-llama31 | CodeAgora
    for (const doc of docs) {
      // Skip rule-based findings (they come from static analysis, not LLM)
      if (doc.source === 'rule') {
        filtered.push(doc);
🟡 WARNING — Inconsistent Handling of Unknown File Paths
Confidence: 🟡 48%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:42-44
The hallucination filter checks if the doc.source is 'rule' and if so, it skips the file existence check. However, it does not handle the case where the doc.filePath is 'unknown' consistently.
Evidence:
- The `doc.filePath` is checked for existence in the `diffFiles` set, but if it is 'unknown', the check is skipped.
- The `makeDoc()` function in the test cases creates evidence documents with `filePath` set to 'unknown', but the hallucination filter does not handle this case consistently.
Suggested change:
    -        filtered.push(doc);
    +        if (doc.filePath === 'unknown') {
    +          filtered.push(doc);
    +          continue;
    +        }
Flagged by: r-llama33 | CodeAgora
    }

    // === HALLUCINATION FILTER: Remove findings referencing non-existent code (#428) ===
    const { filterHallucinations } = await import('./hallucination-filter.js');
🟡 WARNING — Potential Insecure Use of console.log in Orchestrator
Confidence: 🟡 56%
Problem: In packages/core/src/pipeline/orchestrator.ts:743-748
The orchestrator function logs sensitive information (e.g., removed findings) to the console using console.log. This could potentially leak sensitive information to unauthorized parties if the console output is exposed.
Evidence:
- Logging sensitive information (e.g., removed findings) to the console without proper redaction or sanitization.
- Lack of secure logging practices in the orchestrator function.
Suggestion: Use a secure logging library (e.g., bunyan, loglevel) to log sensitive information and ensure proper redaction and sanitization. Alternatively, use a secure logging framework to prevent sensitive information from being exposed.
Flagged by: r-llama31 | CodeAgora
     * 3. Code quote: inline code in problem text should exist in diff
     */
    export function filterHallucinations(
      docs: EvidenceDocument[],
🟡 WARNING — Missing Sanitization of Shell Args in Hallucination Filter
Confidence: 🟡 78%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:24-26
The hallucination filter function uses the diffContent parameter, which is not sanitized for shell args. This could potentially lead to shell injection vulnerabilities if the diffContent string contains shell metacharacters.
Evidence:
- Lack of sanitization for shell args (e.g., no use of `sanitizeShellArg()` or similar).
- Potential for shell injection vulnerabilities if `diffContent` contains shell metacharacters.
Suggestion: Use a sanitization function (e.g., sanitizeShellArg()) to properly sanitize the diffContent string for shell args.
Flagged by: r-llama31 | CodeAgora
      diffContent: string,
    ): FilterResult {
      const diffFiles = new Set(extractFileListFromDiff(diffContent));
      const diffRanges = parseDiffFileRanges(diffContent);
🟡 WARNING — Potential Resource Leak in Hallucination Filter
Confidence: 🟡 72%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:28-36
The hallucination filter function creates a Map of file -> hunk ranges, but the Map is not cleared or disposed of after use. This could potentially lead to a resource leak if the function is called multiple times without clearing the Map.
Evidence:
- Creation of a Map without proper disposal or clearing.
- Potential resource leak if the function is called multiple times.
Suggestion: Clear or dispose of the Map after use to prevent potential resource leaks. Alternatively, use a weak Map or a garbage-collected Map to allow the Map to be collected by the garbage collector.
✅ Discussion d001 — 1 round(s), consensus
Verdict: CRITICAL — Majority consensus (2/3 agree)
Flagged by: r-llama31 | CodeAgora
    // Check 2: Line range overlaps with diff hunks
    if (doc.filePath !== 'unknown' && doc.lineRange[0] > 0) {
      const hunks = hunkMap.get(doc.filePath);
🟡 WARNING — Potential Off-by-One Error in Hallucination Filter
Confidence: 🟡 46%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:54-65
The hallucination filter function uses the hunkMap.get() function to get the hunks for a file, but it does not check if the get() function returns undefined. This could potentially lead to an off-by-one error if the get() function returns undefined.
Evidence:
- Lack of null check for the get() function result.
- Potential off-by-one error if the get() function returns undefined.
Suggestion: Add a null check for the get() function result to prevent potential off-by-one errors.
Flagged by: r-scout, r-llama31 | CodeAgora
      }
      // If majority of code quotes are fabricated, penalize confidence
      if (fabricatedCount > codeQuotes.length / 2) {
        doc.confidence = Math.round((doc.confidence ?? 50) * 0.5);
🟡 WARNING — No Confidence Penalty for Non-Hallucinated Documents
Confidence: 🔴 32%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:81-83
The hallucination filter function does not penalize confidence for non-hallucinated documents. This could potentially lead to incorrect confidence scores being assigned to non-hallucinated documents.
Evidence:
- Lack of confidence penalty for non-hallucinated documents.
- Potential incorrect confidence scores being assigned to non-hallucinated documents.
Suggestion: Add a confidence penalty for non-hallucinated documents to ensure accurate confidence scores are assigned.
Flagged by: r-llama31 | CodeAgora
    }

    // Check 1: File exists in diff
    if (doc.filePath !== 'unknown' && !diffFiles.has(doc.filePath)) {
🟡 WARNING — No Error Handling for File Existence Check
Confidence: 🟡 42%
Problem: In packages/core/src/pipeline/hallucination-filter.ts:47-50
The hallucination filter function does not handle errors when checking file existence. This could potentially lead to unhandled errors and unexpected behavior.
Evidence:
- Lack of error handling for file existence check.
- Potential unhandled errors and unexpected behavior.
Suggestion: Add error handling for file existence check to handle potential errors and ensure expected behavior.
No issues found.
Flagged by: r-llama33, r-llama31 | CodeAgora
🔴 CRITICAL — Undefined `computeValue` in `utils.ts`
Confidence: 🟢 82%
Problem: In src/utils.ts:10-12 The function ...
Evidence:
✅ Discussion d002 — 1 round(s), consensus
Verdict: CRITICAL — All supporters agreed on the issue
Flagged by: r-gpt-oss | CodeAgora
🔴 CRITICAL — Syntax Error – stray backtick in `index.ts`
Confidence: 🟢 93%
Problem: In src/index.ts:7 A stray backtick ( ...
Evidence:
Flagged by: r-gpt-oss | CodeAgora
🔴 CRITICAL — Unused / Potentially Missing Import `newModule` in `index.ts`
Confidence: 🟢 87%
Problem: In src/index.ts:2
Evidence:
✅ Discussion d003 — 1 round(s), consensus
Verdict: CRITICAL — Majority consensus (2/3 agree)
Flagged by: r-gpt-oss | CodeAgora
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/core/src/pipeline/hallucination-filter.ts`:
- Around line 54-67: The current logic skips overlap validation when
hunkMap.get(doc.filePath) is missing, so treat missing or empty hunk lists as
"no overlap" and remove the doc; modify the block around
hunkMap.get(doc.filePath) (referencing hunkMap, doc, HUNK_TOLERANCE, removed) to
explicitly check for a missing/empty hunks array and push the document into
removed with continue, and only perform the tolerance-based overlaps check when
hunks is a non-empty array.
- Around line 28-29: The diff-file parsers may both return empty results and the
current logic then removes all non-`unknown` LLM findings; change the logic in
hallucination-filter.ts so that if both extractFileListFromDiff(diffContent) and
parseDiffFileRanges(diffContent) produce no metadata (diffFiles is empty AND
diffRanges is empty), you skip/disable the diff-based filtering step that
removes non-`unknown` findings rather than applying it. Locate references to
diffFiles and diffRanges and the removal code handling non-`unknown` LLM
findings, and add an early guard that only runs that heuristic when at least one
of the parser results is non-empty.
- Around line 69-86: The code only checks doc.problem for backtick-wrapped code
and mutates doc.confidence in place; update the logic in hallucination-filter.ts
to scan all relevant text fields (doc.problem, doc.issueTitle, doc.suggestion,
and each item in doc.evidence[]) for backtick-wrapped code snippets using the
existing regex, aggregate fabricatedCode counts across all those sources, and if
majority fabricated reduce confidence; do not mutate the original doc — create a
shallow copy (e.g., newDoc = { ...doc }) and set newDoc.confidence before
pushing (replace filtered.push(doc) with filtered.push(newDoc)) so the helper
remains pure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 79538f20-db48-48aa-9c10-cf296b444231
📒 Files selected for processing (3)
- packages/core/src/pipeline/hallucination-filter.ts
- packages/core/src/pipeline/orchestrator.ts
- packages/core/src/tests/hallucination-filter.test.ts
      const diffFiles = new Set(extractFileListFromDiff(diffContent));
      const diffRanges = parseDiffFileRanges(diffContent);
Fail open when diff parsing returns no file metadata.
If extractFileListFromDiff() and parseDiffFileRanges() both fail to extract anything from a non-empty diff, Line 48 removes every non-unknown LLM finding. A parser miss should disable this heuristic, not wipe the whole L1 signal.
💡 Minimal guard
     export function filterHallucinations(
       docs: EvidenceDocument[],
       diffContent: string,
     ): FilterResult {
       const diffFiles = new Set(extractFileListFromDiff(diffContent));
       const diffRanges = parseDiffFileRanges(diffContent);
    +  if (docs.length > 0 && diffContent.trim() && diffFiles.size === 0 && diffRanges.length === 0) {
    +    return { filtered: [...docs], removed: [] };
    +  }
Also applies to: 47-50
    if (doc.filePath !== 'unknown' && doc.lineRange[0] > 0) {
      const hunks = hunkMap.get(doc.filePath);
      if (hunks && hunks.length > 0) {
        const HUNK_TOLERANCE = 10; // Allow some tolerance for context lines
        const overlaps = hunks.some(([start, end]) =>
          doc.lineRange[0] <= end + HUNK_TOLERANCE &&
          doc.lineRange[1] >= start - HUNK_TOLERANCE
        );
        if (!overlaps) {
          removed.push(doc);
          continue;
        }
      }
    }
Handle files with no parsed hunk ranges explicitly.
When hunkMap.get(doc.filePath) is missing, Lines 55-66 skip the overlap check and keep the document. That means a line-based finding is treated as “validated” even though no hunk overlap was checked at all.
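One way to make the missing-hunks case explicit, as this comment asks, is sketched below. It is an illustration only: `lineRangeVerdict` and `RangeVerdict` are hypothetical names, the 'unknown' file path and `lineRange[0] > 0` guards from the quoted code are left out for brevity, and the tolerance of 10 comes from the quoted code.

```typescript
type HunkRange = [number, number];
type RangeVerdict = "keep" | "remove";

const HUNK_TOLERANCE = 10; // value taken from the quoted code

function lineRangeVerdict(
  filePath: string,
  lineRange: [number, number],
  hunkMap: Map<string, HunkRange[]>,
): RangeVerdict {
  const hunks = hunkMap.get(filePath);
  // Missing or empty hunk list: there is nothing to validate against,
  // so the finding is treated as "no overlap" instead of silently passing.
  if (!hunks || hunks.length === 0) return "remove";

  const overlaps = hunks.some(
    ([start, end]) =>
      lineRange[0] <= end + HUNK_TOLERANCE && lineRange[1] >= start - HUNK_TOLERANCE,
  );
  return overlaps ? "keep" : "remove";
}
```

Whether a file without parsed hunks should be removed or kept is a policy choice; returning a named verdict keeps that choice visible at the call site.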
    // Check 3: Code quote verification
    // Extract inline code from problem text (backtick-wrapped)
    const codeQuotes = doc.problem.match(/`([^`]{10,})`/g);
    if (codeQuotes && codeQuotes.length > 0) {
      let fabricatedCount = 0;
      for (const quote of codeQuotes) {
        const code = quote.slice(1, -1); // Remove backticks
        if (!diffContent.includes(code)) {
          fabricatedCount++;
        }
      }
      // If majority of code quotes are fabricated, penalize confidence
      if (fabricatedCount > codeQuotes.length / 2) {
        doc.confidence = Math.round((doc.confidence ?? 50) * 0.5);
      }
    }

    filtered.push(doc);
Scan all finding text for fabricated code quotes.
Only doc.problem is inspected here. Backticks in issueTitle, evidence[], or suggestion keep full confidence even though they ship downstream with the finding. While touching this, return a copied document instead of mutating doc.confidence in place so the helper stays pure.
🔎 Expand quote sources without mutating the input
    -    const codeQuotes = doc.problem.match(/`([^`]{10,})`/g);
    +    const quoteSources = [doc.issueTitle, doc.problem, ...doc.evidence, doc.suggestion];
    +    const codeQuotes = [...new Set(
    +      quoteSources.flatMap((text) => text.match(/`([^`]{10,})`/g) ?? [])
    +    )];
    +    const nextDoc = { ...doc };
    +
         if (codeQuotes && codeQuotes.length > 0) {
           let fabricatedCount = 0;
           for (const quote of codeQuotes) {
             const code = quote.slice(1, -1); // Remove backticks
             if (!diffContent.includes(code)) {
               fabricatedCount++;
             }
           }
           // If majority of code quotes are fabricated, penalize confidence
           if (fabricatedCount > codeQuotes.length / 2) {
    -        doc.confidence = Math.round((doc.confidence ?? 50) * 0.5);
    +        nextDoc.confidence = Math.round((nextDoc.confidence ?? 50) * 0.5);
           }
         }
    -    filtered.push(doc);
    +    filtered.push(nextDoc);
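A self-contained variant of the quote check from this comment is sketched below, split into an extraction step and a pure penalty step. The `QuotedFinding` interface and both function names are hypothetical; the field names, the regex, the strict-majority rule, and the default confidence of 50 come from the code quoted on this page.

```typescript
// Hypothetical subset of the finding fields named in the comment above.
interface QuotedFinding {
  issueTitle: string;
  problem: string;
  evidence: string[];
  suggestion: string;
  confidence?: number;
}

const CODE_QUOTE = /`([^`]{10,})`/g; // regex from the PR: inline code of 10+ characters

// Collects unique backtick-quoted snippets from every text field of the finding.
function collectCodeQuotes(doc: QuotedFinding): string[] {
  const sources = [doc.issueTitle, doc.problem, ...doc.evidence, doc.suggestion];
  const seen = new Set<string>();
  for (const text of sources) {
    const matches: string[] = text.match(CODE_QUOTE) ?? [];
    for (const quote of matches) {
      seen.add(quote.slice(1, -1)); // strip the surrounding backticks
    }
  }
  return Array.from(seen);
}

// Returns the same object when no penalty applies, or a penalized copy otherwise.
function penalizeFabricatedQuotes(doc: QuotedFinding, diffContent: string): QuotedFinding {
  const quotes = collectCodeQuotes(doc);
  const fabricated = quotes.filter((code) => !diffContent.includes(code)).length;
  if (quotes.length === 0 || fabricated <= quotes.length / 2) return doc;
  return { ...doc, confidence: Math.round((doc.confidence ?? 50) * 0.5) };
}
```

Because snippets are deduplicated before counting, a fabricated identifier repeated in several fields counts once toward the majority.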
Summary
Changes
- `packages/core/src/pipeline/hallucination-filter.ts` - three-stage validation logic (file existence, line range, code quotes)
- `packages/core/src/pipeline/orchestrator.ts` - inserts the hallucination filter after the LEARNING stage and before the CONFIDENCE computation
- `packages/core/src/tests/hallucination-filter.test.ts` - 9 test cases
Test plan
Closes #428
🤖 Generated with Claude Code