
fix: KG search via pure SQL path — no embedding model required (R69)#191

Merged
EtanHey merged 1 commit into main from fix/kg-search-fallback
Apr 3, 2026

EtanHey commented Apr 3, 2026

Summary

Root cause of KG results never appearing in brain_search: the entire KG path required loading bge-large-en-v1.5 (~650MB), and any failure silently fell back to text-only search via except Exception.

Fix: Two-path architecture:

  • Path 1 (always runs): _kg_facts_sql() — pure SQL lookup against kg_relations, no embeddings needed. Returns typed relations excluding co_occurs_with.
  • Path 2 (optional): Full kg_hybrid_search with vector similarity. If it fails, Path 1 results still shown with degradation notice.
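Path 1 can be illustrated with a minimal sketch. The actual `kg_relations` schema and the body of `_kg_facts_sql()` are not shown in this PR, so the column names (`source_name`, `relation_type`, `target_name`, `confidence`), the case-insensitive match on either side of the relation, and the names `kg_facts_sql_sketch` and `build_demo_db` are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: the real kg_relations columns are not shown in this PR.
SCHEMA = """
CREATE TABLE kg_relations (
    source_name   TEXT,
    relation_type TEXT,
    target_name   TEXT,
    confidence    REAL
);
"""


def kg_facts_sql_sketch(conn, entity_name, limit=20):
    """Return typed relations for an entity using plain SQL only (no embeddings)."""
    rows = conn.execute(
        """
        SELECT source_name, relation_type, target_name, confidence
        FROM kg_relations
        WHERE (source_name = ? COLLATE NOCASE OR target_name = ? COLLATE NOCASE)
          AND relation_type != 'co_occurs_with'
        ORDER BY confidence DESC
        LIMIT ?
        """,
        (entity_name, entity_name, limit),
    ).fetchall()
    return [
        {"source": s, "relation": r, "target": t, "confidence": c}
        for s, r, t, c in rows
    ]


def build_demo_db():
    """Create an in-memory database with a few made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO kg_relations VALUES (?, ?, ?, ?)",
        [
            ("anthropic", "created", "Claude Code", 0.9),
            ("anthropic", "co_occurs_with", "Claude Code", 0.99),
            ("Josh", "affiliated_with", "Cantaloupe AI", 0.8),
        ],
    )
    return conn
```

Because the lookup is a single `SELECT`, it does not depend on the embedding model being loadable, which is the property the fix relies on.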

Before/After

| Query | Before | After |
| --- | --- | --- |
| brain_search('anthropic created claude code') | Text chunks only | KG: anthropic --created--> Claude Code + text |
| brain_search('who works at Cantaloupe AI') | Text chunks only | KG: Josh --affiliated_with--> Cantaloupe AI + text |
| Any query matching entity name | Silent fallback on embedding failure | Always shows SQL KG facts |

What changed

  • _kg_facts_sql(): new pure-SQL KG lookup (no embeddings, always works)
  • Exception handler: except Exception → specific RuntimeError, OSError, MemoryError + logger.warning
  • Degradation notice when Path 2 fails: "⚠ KG search degraded — showing SQL-only results"
  • KG facts always returned when entity is detected
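The narrowed exception handling and the degradation notice might look roughly like the sketch below. It is not the code from `search_handler.py`: the function name `search_with_kg`, its callable parameters, and the response keys other than `kg_degraded` are hypothetical, chosen only to show the control flow described above.

```python
import logging

logger = logging.getLogger("kg_search_sketch")

# Notice text as quoted in this PR description.
DEGRADED_NOTICE = "⚠ KG search degraded — showing SQL-only results"


def search_with_kg(query, sql_facts_fn, hybrid_search_fn):
    """Run the SQL path unconditionally, then try the optional hybrid path."""
    facts = sql_facts_fn(query)  # Path 1: no embedding model involved
    chunks, degraded = [], False
    try:
        chunks = hybrid_search_fn(query)  # Path 2: may need the embedding model
    except (RuntimeError, OSError, MemoryError) as exc:
        # Only the listed failure types are swallowed; anything else propagates,
        # so programming errors are no longer hidden by a broad handler.
        logger.warning("KG hybrid search failed, using SQL facts only: %s", exc)
        degraded = True
    response = {"facts": facts, "chunks": chunks, "kg_degraded": degraded}
    if degraded:
        response["notice"] = DEGRADED_NOTICE
    return response
```

Compared with a blanket `except Exception`, an unexpected error such as a `ValueError` in the hybrid path now surfaces instead of silently turning into a text-only result.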

Test plan

  • 42 tests pass
  • _kg_facts_sql('anthropic') → created → Claude Code
  • _kg_facts_sql('Etan Heyman') → 9 relations, 0 co_occurs_with
  • Entity detection finds "anthropic" + "Claude Code" in query

🤖 Generated with Claude Code

Note

Fix KG search in _brain_search to work without an embedding model via pure SQL fallback

  • Adds _kg_facts_sql in search_handler.py, a pure SQL helper that looks up KG relations for an entity by name, excluding co_occurs_with relations, ordered by confidence (up to 20 results).
  • Updates _brain_search to always attempt SQL-based fact retrieval first, then attempt kg_hybrid_search for chunk retrieval separately.
  • When hybrid search fails due to embedding/model errors (RuntimeError, OSError, MemoryError), a kg_degraded flag is set and the response still returns SQL facts with a warning notice appended to formatted text.
  • Behavioral Change: facts are now sourced from SQL (up to 20) rather than sliced from hybrid search results (previously capped at 8); co_occurs_with relations are excluded.

Macroscope summarized 54a71c1.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved search reliability when embedding or model services are unavailable; search now gracefully degrades to return available results instead of failing completely.
    • Enhanced entity detection and knowledge graph result retrieval for search queries.
  • Refactor

    • Optimized search routing for entity-aware queries to improve performance and result accuracy.

Root cause: brain_search's KG path required embedding model loading
(~650MB, slow) and wrapped everything in `except Exception` that
silently fell back to text-only search. Users never saw KG results.

Fix: Two-path architecture:
- Path 1 (always runs): _kg_facts_sql() does pure SQL lookup against
  kg_relations — no embeddings, no vector search, just SELECT.
  Returns typed relations excluding co_occurs_with.
- Path 2 (optional): Full kg_hybrid_search with embedding model.
  If it fails, Path 1 results still shown with degradation notice.

Also:
- Replaced broad `except Exception` with specific handlers
  (RuntimeError, OSError, MemoryError) + warning-level logging
- Added "KG degraded" notice to MCP response when Path 2 fails
- KG facts always appear when entity is detected, regardless
  of embedding model availability

Eval results (local):
- brain_search('anthropic created claude code') → KG: created→Claude Code ✓
- brain_entity('Etan Heyman') → 9 relations, 0 co_occurs_with ✓
- Entity detection: finds "anthropic" + "Claude Code" ✓

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

EtanHey commented Apr 3, 2026

@coderabbitai review


coderabbitai Bot commented Apr 3, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai Bot commented Apr 3, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Added a new _kg_facts_sql() function to perform pure SQL-based Knowledge Graph lookups for detected entities. Modified _brain_search() to invoke this function when entities are detected and no active filters exist, with conditional hybrid search fallback wrapped in exception handling to ensure graceful degradation.
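The routing condition in the walkthrough (entities detected, no active filters) can be sketched as a small gate in front of the per-entity lookup. The helper name `collect_kg_facts`, the fact dictionary keys, and the de-duplication step are assumptions for illustration; the PR does not show how `_brain_search()` merges facts across entities.

```python
def collect_kg_facts(entities, active_filters, facts_fn):
    """Gather SQL facts per detected entity; skip the KG path when filters are set."""
    if not entities or active_filters:
        return []
    facts = []
    seen = set()
    for name in entities:
        for fact in facts_fn(name):
            key = (fact["source"], fact["relation"], fact["target"])
            # Assumption: the same relation reached via two entities is kept once.
            if key not in seen:
                seen.add(key)
                facts.append(fact)
    return facts
```

With a query such as 'anthropic created claude code', both "anthropic" and "Claude Code" are detected, so a relation linking the two would otherwise be returned twice.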

Changes

Cohort / File(s) Summary
Search Handler KG Routing
src/brainlayer/mcp/search_handler.py
Added _kg_facts_sql() function for SQL-based entity-to-relation queries with JSON parsing and scoring. Updated _brain_search() entity routing to prioritize SQL facts, attempt hybrid search with try/except fallback, and degrade gracefully with kg_degraded flag on hybrid failures.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Search Request
    participant Entity as Entity Detection
    participant KGSQL as KG SQL Facts
    participant Hybrid as Hybrid Search
    participant Struct as Structured Results
    participant Resp as Response Builder

    Client->>Entity: Detect entities (no active filters)
    Entity-->>Client: Entities found
    Client->>KGSQL: _kg_facts_sql() for each entity
    KGSQL-->>Client: fact_items (20 relations max)
    
    Client->>Hybrid: Attempt kg_hybrid_search
    alt Hybrid Success
        Hybrid-->>Client: embedding results
    else Hybrid Fails
        Hybrid-->>Client: Exception caught
    end
    
    Client->>Struct: Process chunk results
    Struct-->>Client: structured_results
    
    Client->>Resp: Combine fact_items + structured_results
    alt Either path succeeded
        Resp-->>Client: Return combined results
        Resp->>Client: Add kg_degraded flag if hybrid failed
    else Both empty
        Resp-->>Client: Empty results
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through facts so fine,
SQL paths and hybrid align,
When search stumbles, grace won't fail,
For facts and fallbacks tell the tale! 🌿✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: implementing a pure SQL path for KG search that does not require the embedding model. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@EtanHey EtanHey merged commit 6a9137d into main Apr 3, 2026
5 of 6 checks passed
@EtanHey EtanHey deleted the fix/kg-search-fallback branch April 3, 2026 10:34