Skip to content

feat(duplicate): hybrid reasoning step — typed related issues, configurable top-N, confidence fix (closes #56)#100

Merged
gh-simili-bot merged 9 commits intomainfrom
feat/hybrid-reasoning-step-issue-56
Mar 5, 2026
Merged

feat(duplicate): hybrid reasoning step — typed related issues, configurable top-N, confidence fix (closes #56)#100
gh-simili-bot merged 9 commits intomainfrom
feat/hybrid-reasoning-step-issue-56

Conversation

@Kavirubc
Copy link
Copy Markdown
Contributor

@Kavirubc Kavirubc commented Mar 5, 2026

Summary

  • Configurable LLM candidate window — new defaults.duplicate_candidates (default: 5) replaces the hardcoded top-3 limit across both the pipeline step and pr-duplicate CLI
  • Typed RelatedIssues []RelatedIssueRef — replaces the opaque json.RawMessage SimilarIssues field so downstream code can parse the LLM's per-candidate classification (duplicate / related / distinct) without manual JSON munging
  • Updated prompt — guides the LLM to classify every candidate in related_issues and explicitly forbids marking related issues as is_duplicate: true, closing the 0.80–0.84 ambiguity gap
  • Confidence threshold aligned — default raised 0.80 → 0.85 to match the prompt's stated scale, preventing related issues being falsely flagged as low-confidence duplicates
  • Related issues surfaced in comment[!WARNING] block appends "Also related to: #N" inline; when not a duplicate but related issues exist a softer [!NOTE] block is emitted
  • pr-duplicate top-k parity — LLM candidate limit now respects --top-k flag instead of hardcoded 3

Files changed

File Change
internal/core/config/config.go Add DuplicateCandidates, fix threshold default 0.80 → 0.85, wire mergeConfigs
internal/integrations/ai/llm.go Add RelatedIssueRef struct, replace SimilarIssues with RelatedIssues
internal/integrations/ai/prompts.go Add related_issues schema, DISTINCT category, anti-hedging instruction
internal/steps/duplicate_detector.go Configurable top-N, duplicateDetectorLLM interface, metadata storage, threshold 0.85
internal/steps/response_builder.go Surface related issues in [!WARNING] and [!NOTE] blocks
cmd/simili/commands/pr_duplicate.go Use prDupTopK for LLM candidate limit
internal/steps/duplicate_detector_test.go New — 5 unit tests via fakeLLM interface
internal/core/config/config_test.go Add TestDuplicateCandidatesDefault and TestDuplicateCandidatesYAML
tests/integration/e2e_test.go Explicitly wire DuplicateCandidates: 5 in E2E config
.github/simili.yaml + DOCS/examples/ Document duplicate_candidates field

Test plan

  • go build ./... — clean
  • go vet ./... — clean
  • go test ./... — 12 packages, 0 failures
  • All 9 commits signed off (Signed-off-by: Kavirubc)
  • Manual: trigger triage on an issue with related-but-not-duplicate candidates — verify [!NOTE] block appears
  • Manual: simili pr-duplicate --top-k 7 — verify 7 candidates forwarded to LLM

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added configurable duplicate candidate limit (default 5) for LLM-based duplicate analysis
    • Enhanced related issues display in duplicate detection output
  • Improvements

    • Increased default duplicate confidence threshold from 0.80 to 0.85 for more conservative detection
    • Improved related issues formatting and visibility in results
  • Tests

    • Added comprehensive test coverage for duplicate detection configuration and behavior

Kavirubc and others added 9 commits March 5, 2026 14:01
- Add `duplicate_candidates` to DefaultsConfig (default: 5) for
  configurable LLM candidate limit per issue #56
- Raise DuplicateConfidenceThreshold default from 0.8 → 0.85 to align
  with the prompt's confidence scale and prevent related issues being
  falsely flagged as duplicates
- Wire DuplicateCandidates through mergeConfigs for inheritance support

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
…plicateResult

Add RelatedIssueRef struct with number, title, and relationship fields.
Replace json.RawMessage SimilarIssues with []RelatedIssueRef RelatedIssues
so downstream code can parse and surface the LLM's per-candidate
classification (duplicate/related/distinct) without manual JSON munging.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
…es schema

- Add DISTINCT relationship category alongside DUPLICATE and RELATED
- Add explicit instruction to classify ALL candidates in related_issues
- Instruct LLM not to set is_duplicate for "related" issues to prevent
  related issues being falsely flagged as low-confidence duplicates
- Update JSON schema to use related_issues array instead of similar_issues

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
…stable LLM interface

- Replace hardcoded maxSimilar=3 with Config.Defaults.DuplicateCandidates
  (fallback 5) for per-repo tunability
- Extract duplicateDetectorLLM interface to enable unit tests without a
  live LLM connection
- Store result.RelatedIssues in ctx.Metadata["related_issues"] so the
  response builder can surface related-but-not-duplicate issues
- Update in-code threshold fallback from 0.8 → 0.85 to match config default

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
…ateSection

- When IsDuplicate=true: append "Also related to: #N (title)" inline
- When not a duplicate but related issues exist: emit a [!NOTE] block
  instead of [!WARNING] to distinguish related-but-distinct from actual
  duplicates
- Extract buildRelatedRefList helper for formatting related refs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
The LLM candidate window was hardcoded to 3 independent of --top-k.
Replace the literal with prDupTopK so the flag controls both the
vector search result count and the number of candidates forwarded to
the LLM for duplicate analysis.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
Five tests covering:
- UsesConfigDuplicateCandidates: DuplicateCandidates=2 sends exactly 2 to LLM
- DefaultsToFiveWhenZero: DuplicateCandidates=0 falls back to 5
- ConfidenceGateBlocks: 0.75 confidence blocked by 0.8 threshold
- RelatedIssuesStoredInMetadata: related_issues persisted to ctx.Metadata
- RelatedNotMarkedDuplicate: 0.82 confidence blocked by 0.85 threshold,
  related issues still appear in metadata

Uses fakeLLM implementing duplicateDetectorLLM interface to avoid
live LLM dependency in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
Replace C-style for loop with idiomatic `for i := range n` in
newTestCtx helper per Go 1.22 range-over-integer feature.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
…ests

- Add duplicate_candidates: 5 with explanatory comment to .github/simili.yaml
  and both DOCS example configs so users can discover and tune the field
- Wire DuplicateCandidates into the E2E test config struct explicitly
- Add TestDuplicateCandidatesDefault: verifies applyDefaults sets 5 and
  DuplicateConfidenceThreshold defaults to 0.85
- Add TestDuplicateCandidatesYAML: verifies field round-trips through YAML

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 5, 2026

📝 Walkthrough

Walkthrough

This PR introduces a configurable duplicate_candidates parameter (default 5) that controls the number of similar issues sent to the LLM for duplicate analysis, replacing a hardcoded value. It also refactors the LLM duplicate detection result structure from raw JSON (SimilarIssues) to a typed array (RelatedIssues with RelatedIssueRef objects), updating associated configurations, prompts, logic, and response formatting.

Changes

Cohort / File(s) Summary
Configuration files
.github/simili.yaml, DOCS/examples/multi-repo/simili.yaml, DOCS/examples/single-repo/simili.yaml
Added new duplicate_candidates: 5 configuration key under defaults to control the maximum number of LLM candidates.
Config structure and logic
internal/core/config/config.go, internal/core/config/config_test.go
Added DuplicateCandidates int field to DefaultsConfig struct with default value 5; updated defaults application and merging logic; raised DuplicateConfidenceThreshold from 0.80 to 0.85. Added two new test functions to verify config parsing and defaults application.
Command layer
cmd/simili/commands/pr_duplicate.go
Replaced hardcoded top-k constant 3 with configurable prDupTopK for controlling the number of similar candidates collected.
LLM integration
internal/integrations/ai/llm.go, internal/integrations/ai/prompts.go
Introduced RelatedIssueRef public type; refactored DuplicateResult to replace SimilarIssues: json.RawMessage with RelatedIssues: []RelatedIssueRef. Updated duplicate detection prompt to use related_issues field and guidance for populating candidates.
Duplicate detection pipeline
internal/steps/duplicate_detector.go, internal/steps/duplicate_detector_test.go
Introduced duplicateDetectorLLM interface; updated DuplicateDetector to use configurable candidate limit from context (default 5), with context-based safeguards. Added comprehensive unit test suite covering config usage, defaults, confidence gating, and related issue metadata storage.
Response formatting
internal/steps/response_builder.go
Refactored buildDuplicateSection to surface related-but-not-duplicate issues; added buildRelatedRefList helper to format RelatedIssueRef entries; now emits warnings for duplicates and notes for related issues based on new RelatedIssues structure.
E2E tests
tests/integration/e2e_test.go
Added DuplicateCandidates: 5 to test configuration.

Sequence Diagram

sequenceDiagram
    participant Config as Configuration
    participant Detector as DuplicateDetector
    participant LLM as LLM Client
    participant Builder as ResponseBuilder

    Config->>Detector: Provide DuplicateCandidates=5
    Detector->>Detector: Collect up to 5 similar issues from context
    Detector->>LLM: DetectDuplicate(context, input with 5 candidates)
    LLM->>LLM: Analyze candidates, populate RelatedIssues array
    LLM-->>Detector: DuplicateResult with RelatedIssues: []RelatedIssueRef
    Detector->>Detector: Apply confidence threshold (0.85)
    Detector->>Detector: Store RelatedIssues in context metadata
    Detector-->>Builder: Context with duplicate result & related issues
    Builder->>Builder: Format RelatedIssueRef entries
    Builder->>Builder: Emit duplicate warning or related-issues note
    Builder-->>Response: Formatted response section
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

enhancement

Poem

🐰 A configurable hop through duplicates we go,
Five candidates now dance where three would show,
RelatedIssues gleam in typed array grace,
No more raw JSON in this rabbit's space! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.89% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(duplicate): hybrid reasoning step — typed related issues, configurable top-N, confidence fix (closes #56)' directly and comprehensively describes the main changes: introducing typed related issues, making the candidate window configurable, and fixing the confidence threshold.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/hybrid-reasoning-step-issue-56

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gh-simili-bot
Copy link
Copy Markdown
Contributor

Simili Triage Report

Note

Quality Score: 8.8/10 (Good)
The issue could be improved. See suggestions below.

Classification

Category Value
Labels
Quality Improvements
  • Body content is truncated, potentially missing minor details
  • Ensure the full description is provided to avoid any ambiguity
Similar Threads
Similarity Type Thread Status
86% 🔀 #36 Similar Issue
86% 🔀 #40 Similar Issue
85% 🔀 #36 Similar Issue

Warning

Possible Duplicate (Confidence: 90%)
This pull request might be a duplicate of #36.
Reason: The current issue directly implements and refines the core objectives outlined in Similar Issue 3 (Issue #36). Both issues focus on improving the distinction between 'DUPLICATE' and 'RELATED' issues, aligning the confidence threshold to 0.85, and enhancing the data structure for similar issues in results. The current issue's 'Typed RelatedIssues' is an evolution of the 'json.RawMessage SimilarIssues' concept mentioned in Similar Issue 3, indicating a direct progression or completion of the same feature request. Fixing the current issue would fully resolve the intent of Similar Issue 3.
Also related to: #36 (Similar Issue), #40 (Similar Issue)

This pull request will be automatically closed in 72 hours if no objections are raised. If you believe this is not a duplicate, please leave a comment explaining why.


Generated by Simili Bot

@gh-simili-bot
Copy link
Copy Markdown
Contributor

🧪 E2E Test

Bot responded: yes

| Auto-closer (dry-run) | processed: 0 closed: 0 grace: 0 human: 0 |

Test repo → gh-simili-bot/simili-e2e-22709371361
Run → logs

Auto-generated by E2E pipeline

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cmd/simili/commands/pr_duplicate.go (1)

197-209: ⚠️ Potential issue | 🔴 Critical

Guard --top-k before slice allocation to prevent runtime panic.

If prDupTopK is negative, make([]ai.SimilarIssueInput, 0, prDupTopK) panics. Also, cutoff should use >= for defensive bounds handling.

🛡️ Proposed fix
-			similar := make([]ai.SimilarIssueInput, 0, prDupTopK)
+			limit := prDupTopK
+			if limit <= 0 {
+				limit = 1
+			}
+			similar := make([]ai.SimilarIssueInput, 0, limit)
 			for _, c := range out.Candidates {
 				if c.Type != "issue" {
 					continue
 				}
 				similar = append(similar, ai.SimilarIssueInput{
 					Number:     c.Number,
 					Title:      c.Title,
 					URL:        c.URL,
 					Similarity: c.Score,
 				})
-				if len(similar) == prDupTopK {
+				if len(similar) >= limit {
 					break
 				}
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/simili/commands/pr_duplicate.go` around lines 197 - 209, Guard against
negative prDupTopK before allocating similar slice and tighten the cutoff check:
ensure prDupTopK is clamped to a non-negative value (e.g., set to 0 if negative)
before calling make([]ai.SimilarIssueInput, 0, prDupTopK) in the code around the
loop that builds similar, and change the exit condition from if len(similar) ==
prDupTopK { break } to if len(similar) >= prDupTopK { break } so the loop
defensively handles bounds; reference the prDupTopK variable and the similar
slice construction and the loop that appends ai.SimilarIssueInput in
pr_duplicate.go.
internal/steps/duplicate_detector.go (1)

95-103: ⚠️ Potential issue | 🟠 Major

Guard against nil DuplicateResult before metadata writes.

At Line 102, a (nil, nil) return from DetectDuplicate would panic on dereference.

Proposed fix
 	result, err := s.llm.DetectDuplicate(ctx.Ctx, input)
 	if err != nil {
 		log.Printf("[duplicate_detector] Failed to detect duplicates: %v (non-blocking)", err)
 		return nil // Graceful degradation
 	}
+	if result == nil {
+		log.Printf("[duplicate_detector] Empty duplicate result, skipping (non-blocking)")
+		return nil
+	}
 
 	// Store full result and related issues in context.
 	ctx.Metadata["duplicate_result"] = result
 	ctx.Metadata["related_issues"] = result.RelatedIssues
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/steps/duplicate_detector.go` around lines 95 - 103, The code assumes
s.llm.DetectDuplicate returns a non-nil *DuplicateResult; add a nil-check after
calling s.llm.DetectDuplicate to avoid panics before writing ctx.Metadata.
Specifically, after result, err := s.llm.DetectDuplicate(ctx.Ctx, input) and the
err check, verify result != nil (and if nil, log a non-blocking message and
return or skip storing related issues) before assigning
ctx.Metadata["duplicate_result"] = result and ctx.Metadata["related_issues"] =
result.RelatedIssues so you never dereference a nil DuplicateResult.
🧹 Nitpick comments (2)
internal/steps/duplicate_detector.go (1)

105-109: Harden threshold fallback for invalid configuration values.

Current fallback only handles 0; negative or >1 values can silently mis-gate duplicate decisions.

Suggested hardening
 	// Get threshold from config (default 0.85 to align with prompt guidance).
 	threshold := ctx.Config.Transfer.DuplicateConfidenceThreshold
-	if threshold == 0 {
+	if threshold <= 0 || threshold > 1 {
 		threshold = 0.85
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/steps/duplicate_detector.go` around lines 105 - 109, The current
fallback only treats a zero value as invalid; update the threshold handling
around ctx.Config.Transfer.DuplicateConfidenceThreshold so any out-of-range
value (<= 0 or > 1) uses the default 0.85 instead of being accepted, and
optionally emit a warning via the existing logger (e.g., ctx.Logger or process
logger) when you replace an invalid config value; modify the block that declares
threshold (variable name threshold) to validate the value and set threshold =
0.85 for invalid cases.
internal/integrations/ai/llm.go (1)

326-329: Normalize Relationship values after JSON parse for resilience.

Downstream logic relies on exact string matches (related/duplicate/distinct). Normalizing here avoids missing classifications due to casing/whitespace drift in LLM output.

Suggested normalization
 	// Ensure non-nil RelatedIssues
 	if result.RelatedIssues == nil {
 		result.RelatedIssues = []RelatedIssueRef{}
 	}
+	for i := range result.RelatedIssues {
+		rel := strings.ToLower(strings.TrimSpace(result.RelatedIssues[i].Relationship))
+		switch rel {
+		case "duplicate", "related", "distinct":
+			result.RelatedIssues[i].Relationship = rel
+		default:
+			result.RelatedIssues[i].Relationship = "distinct"
+		}
+	}
 
 	return &result, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/integrations/ai/llm.go` around lines 326 - 329, After ensuring
result.RelatedIssues is non-nil, normalize each RelatedIssueRef.Relationship
value to a canonical form (e.g., trim whitespace and lower-case) so downstream
exact string checks (like "related", "duplicate", "distinct") won't break on
casing/spacing; update the loop that populates result.RelatedIssues (or add a
small post-parse pass) to set ref.Relationship =
strings.ToLower(strings.TrimSpace(ref.Relationship)) and optionally map known
synonyms to your canonical tokens.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/core/config/config.go`:
- Around line 313-315: Update the inline comment for the
DuplicateConfidenceThreshold field to match the runtime default of 0.85: locate
the Transfer struct definition or the DuplicateConfidenceThreshold field
declaration (referencing DuplicateConfidenceThreshold and c.Transfer) and change
the comment that currently says "0.8" to "0.85" so the inline documentation
matches the runtime default set in the initialization block.
- Around line 400-402: In mergeConfigs, change the override condition for
DuplicateCandidates to only accept positive values: instead of copying
child.Defaults.DuplicateCandidates when it's non-zero, copy it only when
child.Defaults.DuplicateCandidates > 0 so negative or zero child values do not
override the parent; update the branch that sets
result.Defaults.DuplicateCandidates accordingly (refer to mergeConfigs,
child.Defaults.DuplicateCandidates, result.Defaults.DuplicateCandidates) to
match applyDefaults' expectations.

In `@internal/steps/response_builder.go`:
- Around line 296-327: The duplicate-warning block should use the final gated
decision (ctx.Result.IsDuplicate) instead of the raw duplicateResult.IsDuplicate
so below-threshold predictions don't render warnings; update the condition to if
ctx.Result.IsDuplicate while still using duplicateResult for confidence and
reasoning display (keep references to duplicateResult.Confidence,
duplicateResult.DuplicateOf, duplicateResult.Reasoning), and leave the
relatedRefs branch intact (refs built via buildRelatedRefList) and the
auto-close gracePeriod logic unchanged.

---

Outside diff comments:
In `@cmd/simili/commands/pr_duplicate.go`:
- Around line 197-209: Guard against negative prDupTopK before allocating
similar slice and tighten the cutoff check: ensure prDupTopK is clamped to a
non-negative value (e.g., set to 0 if negative) before calling
make([]ai.SimilarIssueInput, 0, prDupTopK) in the code around the loop that
builds similar, and change the exit condition from if len(similar) == prDupTopK
{ break } to if len(similar) >= prDupTopK { break } so the loop defensively
handles bounds; reference the prDupTopK variable and the similar slice
construction and the loop that appends ai.SimilarIssueInput in pr_duplicate.go.

In `@internal/steps/duplicate_detector.go`:
- Around line 95-103: The code assumes s.llm.DetectDuplicate returns a non-nil
*DuplicateResult; add a nil-check after calling s.llm.DetectDuplicate to avoid
panics before writing ctx.Metadata. Specifically, after result, err :=
s.llm.DetectDuplicate(ctx.Ctx, input) and the err check, verify result != nil
(and if nil, log a non-blocking message and return or skip storing related
issues) before assigning ctx.Metadata["duplicate_result"] = result and
ctx.Metadata["related_issues"] = result.RelatedIssues so you never dereference a
nil DuplicateResult.

---

Nitpick comments:
In `@internal/integrations/ai/llm.go`:
- Around line 326-329: After ensuring result.RelatedIssues is non-nil, normalize
each RelatedIssueRef.Relationship value to a canonical form (e.g., trim
whitespace and lower-case) so downstream exact string checks (like "related",
"duplicate", "distinct") won't break on casing/spacing; update the loop that
populates result.RelatedIssues (or add a small post-parse pass) to set
ref.Relationship = strings.ToLower(strings.TrimSpace(ref.Relationship)) and
optionally map known synonyms to your canonical tokens.

In `@internal/steps/duplicate_detector.go`:
- Around line 105-109: The current fallback only treats a zero value as invalid;
update the threshold handling around
ctx.Config.Transfer.DuplicateConfidenceThreshold so any out-of-range value (<= 0
or > 1) uses the default 0.85 instead of being accepted, and optionally emit a
warning via the existing logger (e.g., ctx.Logger or process logger) when you
replace an invalid config value; modify the block that declares threshold
(variable name threshold) to validate the value and set threshold = 0.85 for
invalid cases.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1955cb8a-429d-4b0d-a432-dbbf893e4168

📥 Commits

Reviewing files that changed from the base of the PR and between 93e8d90 and 2ee7f63.

📒 Files selected for processing (12)
  • .github/simili.yaml
  • DOCS/examples/multi-repo/simili.yaml
  • DOCS/examples/single-repo/simili.yaml
  • cmd/simili/commands/pr_duplicate.go
  • internal/core/config/config.go
  • internal/core/config/config_test.go
  • internal/integrations/ai/llm.go
  • internal/integrations/ai/prompts.go
  • internal/steps/duplicate_detector.go
  • internal/steps/duplicate_detector_test.go
  • internal/steps/response_builder.go
  • tests/integration/e2e_test.go

Comment on lines 313 to 315
if c.Transfer.DuplicateConfidenceThreshold == 0 {
c.Transfer.DuplicateConfidenceThreshold = 0.8
c.Transfer.DuplicateConfidenceThreshold = 0.85
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Update stale default documentation for duplicate confidence threshold.

Runtime default is now 0.85, but the inline field comment still states 0.8, which can mislead config authors.

📝 Proposed doc fix
-	DuplicateConfidenceThreshold float64          `yaml:"duplicate_confidence_threshold,omitempty"` // Default: 0.8
+	DuplicateConfidenceThreshold float64          `yaml:"duplicate_confidence_threshold,omitempty"` // Default: 0.85
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/config/config.go` around lines 313 - 315, Update the inline
comment for the DuplicateConfidenceThreshold field to match the runtime default
of 0.85: locate the Transfer struct definition or the
DuplicateConfidenceThreshold field declaration (referencing
DuplicateConfidenceThreshold and c.Transfer) and change the comment that
currently says "0.8" to "0.85" so the inline documentation matches the runtime
default set in the initialization block.

Comment on lines +400 to +402
if child.Defaults.DuplicateCandidates != 0 {
result.Defaults.DuplicateCandidates = child.Defaults.DuplicateCandidates
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use positive-only override when merging duplicate_candidates.

mergeConfigs currently overrides parent when child value is any non-zero integer, including negative values. That conflicts with applyDefaults (which treats non-positive as invalid), and can unexpectedly replace a valid parent value before falling back to 5.

🔧 Proposed fix
-	if child.Defaults.DuplicateCandidates != 0 {
+	if child.Defaults.DuplicateCandidates > 0 {
 		result.Defaults.DuplicateCandidates = child.Defaults.DuplicateCandidates
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if child.Defaults.DuplicateCandidates != 0 {
result.Defaults.DuplicateCandidates = child.Defaults.DuplicateCandidates
}
if child.Defaults.DuplicateCandidates > 0 {
result.Defaults.DuplicateCandidates = child.Defaults.DuplicateCandidates
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/config/config.go` around lines 400 - 402, In mergeConfigs,
change the override condition for DuplicateCandidates to only accept positive
values: instead of copying child.Defaults.DuplicateCandidates when it's
non-zero, copy it only when child.Defaults.DuplicateCandidates > 0 so negative
or zero child values do not override the parent; update the branch that sets
result.Defaults.DuplicateCandidates accordingly (refer to mergeConfigs,
child.Defaults.DuplicateCandidates, result.Defaults.DuplicateCandidates) to
match applyDefaults' expectations.

Comment on lines +296 to +327
if duplicateResult.IsDuplicate {
confidencePct := int(duplicateResult.Confidence * 100)

parts = append(parts, "> [!WARNING]")
parts = append(parts, fmt.Sprintf("> **Possible Duplicate** (Confidence: %d%%)", confidencePct))
parts = append(parts, fmt.Sprintf("> This %s might be a duplicate of #%d.", subject, duplicateResult.DuplicateOf))

if duplicateResult.Reasoning != "" {
parts = append(parts, fmt.Sprintf("> _Reason: %s_", duplicateResult.Reasoning))
}

// Surface related issues inline with the duplicate warning.
if len(relatedRefs) > 0 {
refs := buildRelatedRefList(relatedRefs)
parts = append(parts, fmt.Sprintf("> _Also related to: %s_", refs))
}

// Auto-close grace period warning
gracePeriod := ctx.Config.AutoClose.GracePeriodHours
if gracePeriod <= 0 {
gracePeriod = 72
}
parts = append(parts, ">")
parts = append(parts, fmt.Sprintf("> ⏳ **This %s will be automatically closed in %d hours** if no objections are raised. "+
"If you believe this is **not** a duplicate, please leave a comment explaining why.", subject, gracePeriod))
} else if len(relatedRefs) > 0 {
// Not a duplicate but related issues were found — emit a softer NOTE block.
refs := buildRelatedRefList(relatedRefs)
parts = append(parts, "> [!NOTE]")
parts = append(parts, "> **Related Issues Found**")
parts = append(parts, fmt.Sprintf("> This %s is not a duplicate, but is related to %s.", subject, refs))
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use the gated decision for duplicate warning rendering.

This branch uses raw duplicateResult.IsDuplicate, so below-threshold results can still show a duplicate warning and auto-close message. It should follow the final gated decision (ctx.Result.IsDuplicate).

Proposed fix
-	if duplicateResult.IsDuplicate {
-		confidencePct := int(duplicateResult.Confidence * 100)
+	if ctx.Result.IsDuplicate {
+		confidencePct := int(ctx.Result.DuplicateConfidence * 100)
 
 		parts = append(parts, "> [!WARNING]")
 		parts = append(parts, fmt.Sprintf("> **Possible Duplicate** (Confidence: %d%%)", confidencePct))
-		parts = append(parts, fmt.Sprintf("> This %s might be a duplicate of #%d.", subject, duplicateResult.DuplicateOf))
+		parts = append(parts, fmt.Sprintf("> This %s might be a duplicate of #%d.", subject, ctx.Result.DuplicateOf))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/steps/response_builder.go` around lines 296 - 327, The
duplicate-warning block should use the final gated decision
(ctx.Result.IsDuplicate) instead of the raw duplicateResult.IsDuplicate so
below-threshold predictions don't render warnings; update the condition to if
ctx.Result.IsDuplicate while still using duplicateResult for confidence and
reasoning display (keep references to duplicateResult.Confidence,
duplicateResult.DuplicateOf, duplicateResult.Reasoning), and leave the
relatedRefs branch intact (refs built via buildRelatedRefList) and the
auto-close gracePeriod logic unchanged.

@gh-simili-bot gh-simili-bot merged commit 3002181 into main Mar 5, 2026
9 checks passed
@gh-simili-bot gh-simili-bot deleted the feat/hybrid-reasoning-step-issue-56 branch March 5, 2026 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

e2e enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants