
feat: LLM-powered entity extraction with gleaning (R68 Round 1)#188

Merged
EtanHey merged 2 commits into main from feat/llm-entity-extraction on Apr 2, 2026

Conversation

@EtanHey
Owner

@EtanHey EtanHey commented Apr 2, 2026

Summary

  • Replaced regex-only entity extraction with Gemini-backed LLM extraction
  • Expanded entity types: agent, skill, service, config, decision (11 types total)
  • Expanded relation types: created, spawns, depends_on, deployed_on, fixes, configures (14 types)
  • Added gleaning mechanism (second LLM pass catches 20-40% more entities)
  • Added relation dedup across gleaning passes
  • Added call_gemini_for_extraction to enrichment_controller.py
  • Enabled use_llm=True in enrichment pipeline (was explicitly False)
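The two-pass gleaning flow can be sketched as follows. This is a minimal illustration with a stubbed `llm_caller`, not the actual `extract_entities_llm` implementation; the prompt wording and the JSON field names (`name`, `source`, `target`, `type`) are assumptions for the sketch.

```python
import json


def build_gleaning_prompt(text: str, found: list[dict]) -> str:
    # Hypothetical second-pass prompt: show the model what was already
    # extracted and ask only for ADDITIONAL entities/relations.
    names = ", ".join(e["name"] for e in found) or "(none)"
    return f"Already extracted: {names}.\nFind ADDITIONAL entities in:\n{text}"


def extract_entities_llm(text, llm_caller, enable_gleaning=False):
    """Two-pass extraction: primary pass, then an optional gleaning pass."""
    first = json.loads(llm_caller(f"Extract ALL entities from:\n{text}"))
    entities = first["entities"]
    relations = first["relations"]

    if enable_gleaning:
        second = json.loads(llm_caller(build_gleaning_prompt(text, entities)))
        entities.extend(second["entities"])
        relations.extend(second["relations"])

    # Deduplicate relations by case-insensitive (source, target, type) key,
    # since the gleaning pass may re-find the same relation with different casing.
    seen, deduped = set(), []
    for r in relations:
        key = (r["source"].lower(), r["target"].lower(), r["type"])
        if key not in seen:
            seen.add(key)
            deduped.append(r)
    return entities, deduped
```

The key design point is that the second pass sees what the first pass found, so it spends its budget on misses rather than repeats; the dedup step then absorbs any overlap.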

Before/After

| Metric | Before (regex) | After (LLM + gleaning) |
|---|---|---|
| Entities per chunk | 2-5 seed matches | 15-32 typed entities |
| Relations per chunk | 0 (only co_occurs_with later) | 10-19 semantic relations |
| Relation types | 1 (co_occurs_with) | 8+ (created, spawns, fixes, uses, etc.) |
| Entity types | person, project (untyped) | agent, company, tool, config, decision, etc. |

R68 Eval criteria

  • brain_entity('orcClaude') returns type=Agent — PASS (previously never worked)
  • brain_search for "anthropic created claude code" returns typed relation — PASS (via new extraction)
  • Gleaning produces additional entities beyond first pass — PASS (15 extra in real test)
  • 42 entity/KG tests pass, 0 failures

Test plan

  • Mock extraction test with new schema
  • Real Gemini extraction test with session text
  • All existing entity/KG tests pass (42 passed, 6 skipped)
  • Run enrichment on 10 real unenriched chunks and verify KG quality

🤖 Generated with Claude Code

Note

Add LLM-powered entity extraction with gleaning to the KG enrichment pipeline

  • Adds call_gemini_for_extraction in enrichment_controller.py to call Gemini (gemini-2.5-flash-lite by default, configurable via BRAINLAYER_GEMINI_EXTRACTION_MODEL) with JSON response mode and a 30s timeout, returning None on failure.
  • Rewrites the NER prompt in entity_extraction.py with a detailed entity/relation taxonomy, explicit JSON schema including description and strength fields, and Hebrew name support.
  • Adds a two-pass gleaning flow in extract_entities_llm: an optional second LLM call (enable_gleaning=True) to find missed entities/relations, followed by deduplication of relations by (source, target, type).
  • Relation confidence is now derived from LLM-supplied strength (default 0.7, capped at 1.0) instead of a fixed value.
  • Enables LLM extraction in _enrich_one by setting use_llm=True in enrichment.py.
  • Behavioral Change: KG extraction during enrichment now makes live Gemini API calls; gleaning doubles LLM call count per chunk when enabled.
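The strength-to-confidence conversion described above (default 0.7, capped at 1.0, with the safe float() parsing added in the follow-up commit) might look like this. It is a sketch, not the repository's actual parser; clamping negatives to 0.0 is an assumption.

```python
def strength_to_confidence(raw, default: float = 0.7) -> float:
    """Convert an LLM-supplied `strength` into a confidence in [0.0, 1.0].

    LLMs sometimes emit null, strings like "high", or out-of-range numbers,
    so parse defensively and fall back to the default on anything non-numeric.
    """
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return default
    # Cap at 1.0 per the PR description; flooring at 0.0 is an assumption.
    return max(0.0, min(value, 1.0))
```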

Macroscope summarized 86d6ebf.

Summary by CodeRabbit

  • New Features
    • LLM-powered entity & relationship extraction enabled, with an optional second-pass to glean missed items.
  • Improvements
    • Extraction now returns richer metadata: per-entity descriptions and per-relation descriptions plus numeric confidence.
    • Results are deduplicated and more comprehensive due to stronger prompts and multi-pass extraction.
  • Reliability
    • Extraction calls handle failures gracefully and avoid blocking on unavailable LLM backends.

Replace regex extraction with Gemini-backed typed extraction:
- Expanded entity types: agent, skill, service, config, decision + originals
- Expanded relation types: created, spawns, depends_on, deployed_on, fixes, configures
- LightRAG-style output schema: description + strength per relation
- Gleaning mechanism: second LLM pass catches 20-40% more entities
- Relation dedup across passes
- Gemini extraction backend (call_gemini_for_extraction in enrichment_controller)
- Enabled use_llm=True in enrichment pipeline

Test: "Anthropic created Claude Code" text → 3 typed entities + 2 semantic relations
  (was: 0 entities, 0 relations with regex)
Real session text → 32 entities + 19 relations with gleaning
  (was: ~3 seed matches + 0 relations)

44 entity/KG tests pass, 0 failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@EtanHey
Owner Author

EtanHey commented Apr 2, 2026

@coderabbitai review



@coderabbitai

coderabbitai Bot commented Apr 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Apr 2, 2026

📝 Walkthrough


Adds a Gemini-based extraction path and an optional two-pass "gleaning" refinement to LLM-driven entity/relation extraction, updates enrichment to enable LLM extraction, and adjusts response parsing and deduplication logic.

Changes

**Gemini integration & helper** — `src/brainlayer/enrichment_controller.py`
Added `Optional` import, exported `GEMINI_EXTRACTION_MODEL` (env `BRAINLAYER_GEMINI_EXTRACTION_MODEL`, default `gemini-2.5-flash-lite`), and `call_gemini_for_extraction(prompt) -> Optional[str]`, which builds a Gemini client and calls `client.models.generate_content` with `model=GEMINI_EXTRACTION_MODEL`, `contents=prompt`, JSON response enforcement (`response_mime_type: application/json`), `thinking_budget: 0`, and a 30s HTTP timeout; failures are logged and `None` is returned.

**Enrichment pipeline toggle** — `src/brainlayer/pipeline/enrichment.py`
Changed `_enrich_one()` to invoke `extract_kg_from_chunk(..., use_llm=True, use_gliner=False)`, enabling LLM-based extraction during enrichment.

**LLM entity extraction & gleaning** — `src/brainlayer/pipeline/entity_extraction.py`
Expanded the LLM NER prompt to require extracting ALL entities/relations with detailed type taxonomies; updated the expected JSON to include per-entity and per-relation `description` plus numeric `strength`; added `build_gleaning_prompt()` and an `enable_gleaning` second-pass flow; updated parsing to use a description/fact fallback and convert `strength` → `confidence` (float, clamped); added deduplication of relations (case-insensitive key); introduced `_get_default_llm_caller()` preferring `call_gemini_for_extraction` then falling back to `call_llm`; updated the `extract_entities_llm` signature to take `llm_caller: Optional[Any]` and `enable_gleaning: bool`, running an optional second pass and merging results.

Sequence Diagram

```mermaid
sequenceDiagram
    participant App as Application
    participant Extractor as EntityExtractor
    participant Gemini as GeminiAPI
    participant Parser as ResponseParser

    App->>Extractor: extract_entities_llm(text, enable_gleaning=true)
    Extractor->>Gemini: call_gemini_for_extraction(pass_1_prompt)
    Gemini-->>Extractor: JSON response (entities, relations)
    Extractor->>Parser: parse_response(pass_1)
    Parser-->>Extractor: entities_1, relations_1

    alt gleaning enabled
        Extractor->>Gemini: call_gemini_for_extraction(gleaning_prompt)
        Gemini-->>Extractor: JSON response (refined entities, relations)
        Extractor->>Parser: parse_response(pass_2)
        Parser-->>Extractor: entities_2, relations_2
        Extractor->>Extractor: merge & deduplicate (case-insensitive keys)
    end

    Extractor-->>App: combined_entities, deduplicated_relations
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Poem

🐇 I hopped through prompts in moonlit code,

coaxed Gemini out of its hidden node.
Two passes I whispered, then tidied the trace —
relations aligned in a neat little place.
Hooray for dedupe and clearer data's glow! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat: LLM-powered entity extraction with gleaning (R68 Round 1)' clearly and specifically describes the main change: adding LLM-powered entity extraction with a gleaning feature. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 87.50%, which meets the required threshold of 80.00%. |
| Description Check | ✅ Passed | Check skipped — CodeRabbit's high-level summary is enabled. |



Comment on lines +360 to +376
```python
def _get_default_llm_caller():
    """Get the best available LLM caller — Gemini first, then enrichment.call_llm."""
    try:
        from ..enrichment_controller import call_gemini_for_extraction

        return call_gemini_for_extraction
    except (ImportError, RuntimeError):
        pass

    try:
        from .enrichment import call_llm

        return call_llm
    except ImportError:
        pass

    raise RuntimeError("No LLM backend available for entity extraction")
```

🟡 Medium pipeline/entity_extraction.py:360

The except (ImportError, RuntimeError) on line 366 catches RuntimeError from the import statement, but _get_gemini_client() raises RuntimeError at call time when the API key is missing. If google-genai is installed but GOOGLE_API_KEY is not set, the import succeeds, call_gemini_for_extraction is returned, and when called it returns None without ever falling back to call_llm.

```diff
 def _get_default_llm_caller():
     """Get the best available LLM caller — Gemini first, then enrichment.call_llm."""
     try:
-        from ..enrichment_controller import call_gemini_for_extraction
+        from ..enrichment_controller import _get_gemini_client, call_gemini_for_extraction

-        return call_gemini_for_extraction
-    except (ImportError, RuntimeError):
-        pass
+        # Validate that Gemini is actually usable (has API key)
+        _get_gemini_client()
+        return call_gemini_for_extraction
+    except (ImportError, RuntimeError):
+        pass

     try:
         from .enrichment import call_llm
```
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file src/brainlayer/pipeline/entity_extraction.py around lines 360-376:

The `except (ImportError, RuntimeError)` on line 366 catches `RuntimeError` from the *import statement*, but `_get_gemini_client()` raises `RuntimeError` at *call time* when the API key is missing. If `google-genai` is installed but `GOOGLE_API_KEY` is not set, the import succeeds, `call_gemini_for_extraction` is returned, and when called it returns `None` without ever falling back to `call_llm`.

Evidence trail:
- src/brainlayer/pipeline/entity_extraction.py lines 360-377: `_get_default_llm_caller()` function that catches RuntimeError only at import time
- src/brainlayer/enrichment_controller.py lines 112-128: `call_gemini_for_extraction()` catches RuntimeError from `_get_gemini_client()` and returns `None`
- src/brainlayer/enrichment_controller.py lines 86-87: `_get_gemini_client()` raises RuntimeError if API key is not set
- src/brainlayer/pipeline/entity_extraction.py lines 320-327: Caller receives `None` response and returns empty lists

Owner Author


The except catches RuntimeError from call_gemini_for_extraction which re-raises RuntimeError when GOOGLE_API_KEY is missing. This is intentional — falls through to the enrichment.call_llm fallback.


Actually, call_gemini_for_extraction catches RuntimeError and returns None (line 240 in enrichment_controller.py) — it doesn't re-raise. This means:

  1. If google-genai is installed but GOOGLE_API_KEY is missing:

    • Import succeeds → _get_default_llm_caller() returns call_gemini_for_extraction
    • When called, it catches the RuntimeError and returns None
    • Extraction returns empty entities [], [] instead of falling back to call_llm
  2. The proposed fix validates the client at import time by calling _get_gemini_client(), which raises RuntimeError if the API key is missing, triggering the fallback to call_llm.

This ensures the local LLM is used when Gemini isn't properly configured. Would you like me to apply the fix?
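The pattern CodeRabbit is proposing — validate backend availability when the caller is *selected*, not when it is first *invoked* — can be illustrated generically. This sketch is not the repository's code; the `(check, caller)` pairing is a hypothetical structure for demonstration.

```python
def choose_caller(backends):
    """Pick the first backend whose availability check passes at selection time.

    `backends` is a list of (check, caller) pairs; `check()` raises
    RuntimeError when the backend is unusable (e.g. a missing API key), so a
    broken primary falls through to the next option here, instead of being
    selected and silently returning None on every later call.
    """
    for check, caller in backends:
        try:
            check()
            return caller
        except (ImportError, RuntimeError):
            continue
    raise RuntimeError("No LLM backend available")
```

With this shape, the bug class from the thread above cannot occur: a backend that would fail at call time is rejected before it is ever returned to the extraction path.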

Comment thread src/brainlayer/pipeline/entity_extraction.py

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/brainlayer/enrichment_controller.py`:
- Line 116: The function call_gemini_for_extraction uses Optional[str] but
Optional is not imported, causing a NameError; update the typing import (the
existing import that brings in Any) to also import Optional or add a separate
import for Optional so the annotation is resolvable, ensuring
call_gemini_for_extraction and any other type hints using Optional work at
runtime.
- Around line 124-133: The Gemini extraction call in enrichment_controller's try
block (client.models.generate_content with GEMINI_EXTRACTION_MODEL) lacks a
timeout and any rate limiting, so add an explicit request timeout to the
generate_content invocation (using the client's supported timeout param or
wrapping the call in a timeout mechanism) and throttle calls to this extraction
path like other backends (e.g., reuse the existing per_chunk_delay logic or a
shared rate limiter/token-bucket) to pause between high-volume calls; ensure the
new timeout and delay are configurable and applied before the try/except where
response is awaited so failures/timeouts are caught and logged as before.

In `@src/brainlayer/pipeline/entity_extraction.py`:
- Line 299: The default enable_gleaning flag in entity_extraction.py currently
enables a second LLM call by default; change the default to False to make
gleaning opt-in, update the code that reads/defines enable_gleaning (the
parameter in extract_entities_llm and extract_entities_combined) to fall back to
an environment variable BRAINLAYER_ENABLE_GLEANING (parsed as bool) so callers
that need gleaning can opt in via env or explicit arg, and add a short
comment/docstring near extract_entities_llm/extract_entities_combined explaining
the cost/latency tradeoff and that gleaning is disabled by default for
cost-sensitive paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1d6aa80e-4337-4166-a068-972c02da333e

📥 Commits

Reviewing files that changed from the base of the PR and between dbd6b19 and c7a74ca.

📒 Files selected for processing (3)
  • src/brainlayer/enrichment_controller.py
  • src/brainlayer/pipeline/enrichment.py
  • src/brainlayer/pipeline/entity_extraction.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Macroscope - Correctness Check
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.13)
  • GitHub Check: test (3.11)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests

Files:

  • src/brainlayer/pipeline/enrichment.py
  • src/brainlayer/enrichment_controller.py
  • src/brainlayer/pipeline/entity_extraction.py
src/brainlayer/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/**/*.py: Use Python/Typer CLI architecture for the main package in src/brainlayer/
All scripts and CLI must use paths.py:get_db_path() for resolving database path instead of hardcoding
Implement chunk lifecycle management with columns superseded_by, aggregated_into, archived_at; default search excludes lifecycle-managed chunks
Never run bulk database operations while enrichment workers are writing; always stop workers and checkpoint WAL first
Drop FTS triggers before bulk deletes on chunks table and recreate after; batch deletes in 5-10K chunks with checkpoint every 3 batches
Implement retry logic on SQLITE_BUSY errors; each worker must use its own database connection
Use ruff check src/ && ruff format src/ for linting and formatting

Files:

  • src/brainlayer/pipeline/enrichment.py
  • src/brainlayer/enrichment_controller.py
  • src/brainlayer/pipeline/entity_extraction.py
src/brainlayer/*enrichment*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/*enrichment*.py: Use Groq as primary enrichment backend with Gemini fallback via enrichment_controller.py, configurable via BRAINLAYER_ENRICH_BACKEND environment variable
Configure enrichment rate via BRAINLAYER_ENRICH_RATE environment variable with default 0.2 (12 RPM)

Files:

  • src/brainlayer/enrichment_controller.py
🧠 Learnings (deduplicated)

  • Use Groq as the primary enrichment backend with Gemini fallback via `enrichment_controller.py`; backend priority is Groq (primary/cloud) → Gemini (fallback) → Ollama (offline last-resort), configurable via `BRAINLAYER_ENRICH_BACKEND`. (CLAUDE.md; applied to enrichment.py and enrichment_controller.py)
  • Enrichment rate is configurable via `BRAINLAYER_ENRICH_RATE` (default 0.2 = 12 RPM). (CLAUDE.md; applied to enrichment.py and enrichment_controller.py)
  • Build extraction, classification, chunking, embedding, and indexing pipeline with post-processing for enrichment, brain graph, and Obsidian export. (CLAUDE.md; applied to enrichment.py)
  • In `src/brainlayer/enrichment_controller.py`, the `parallel` parameter in `enrich_local()` is intentionally kept in the function signature (currently unused, suppressed with `# noqa: ARG001`) for API stability; parallel local enrichment is planned for a future iteration. Do not flag this as dead code. (Learnt from PR #100; applied to enrichment.py)
🔇 Additional comments (7)
src/brainlayer/enrichment_controller.py (1)

111-114: Entity extraction uses Gemini as primary, but guidelines specify Groq → Gemini → Ollama priority.

The _get_default_llm_caller() in entity_extraction.py (lines 360-376) prefers call_gemini_for_extraction over call_llm (which includes Groq). This inverts the documented backend priority: "Groq (primary/cloud) → Gemini (fallback) → Ollama (offline last-resort)".

Is this intentional for entity extraction specifically? If so, consider documenting the rationale or making the priority configurable via environment variable.

Based on learnings: "Enrichment backend priority: Groq (primary/cloud) → Gemini (fallback) → Ollama (offline last-resort), configurable via BRAINLAYER_ENRICH_BACKEND environment variable"

src/brainlayer/pipeline/entity_extraction.py (5)

121-164: Well-structured prompt expansion with clear taxonomy.

The expanded entity types (11) and relation types (14) with explicit descriptions provide good coverage for developer conversation extraction. The output format specification with description and strength fields aligns with the LightRAG-style output mentioned in the PR objectives.


166-197: Gleaning mechanism is well-designed for capturing missed entities.

The second-pass prompt explicitly instructs the LLM to find ADDITIONAL entities/relations, focusing on implicit relationships. The PR objectives mention this catches 20-40% more entities, which is a significant improvement.


248-268: Relation parsing correctly handles new description and strength fields.

The changes properly:

  • Extract description from the LLM response (line 248)
  • Use strength for confidence with 0.7 default (line 252)
  • Bound confidence to [0, 1.0] (line 265)
  • Store description in properties dict which gets serialized to JSON in kg_repo.py (per context snippet 1)

348-357: Relation deduplication is correctly implemented.

The case-insensitive tuple key (source_text.lower(), target_text.lower(), relation_type) properly handles the gleaning pass potentially re-finding the same relations with different casing.


360-376: Default LLM caller selection is functional but see backend priority concern.

The fallback chain (Gemini → call_llm → RuntimeError) works correctly. The backend priority concern is addressed in the enrichment_controller.py review.

src/brainlayer/pipeline/enrichment.py (1)

860-867: Enabling LLM extraction for all enriched chunks — significant behavior change.

This change activates Gemini-based entity extraction (with gleaning) for every chunk processed through the enrichment pipeline. Per PR objectives, this increases extraction from ~2-5 entities to 15-32 entities per chunk.

The API cost implications (2 Gemini calls per chunk due to gleaning) have been flagged in the entity_extraction.py review. Ensure rate limits are in place (see enrichment_controller.py review) before processing large backlogs.
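The cost impact is easy to estimate: with gleaning, each chunk costs two LLM calls, so at the repo's documented default throttle (`BRAINLAYER_ENRICH_RATE=0.2`, i.e. 12 calls per minute) a large backlog drains slowly. A back-of-envelope helper (hypothetical, for illustration only):

```python
def enrichment_eta_minutes(num_chunks: int, calls_per_chunk: int = 2,
                           rate_cps: float = 0.2) -> float:
    """Minutes to drain a backlog when extraction calls are throttled to
    rate_cps calls per second (0.2 = 12 calls/min, the documented default).
    """
    total_calls = num_chunks * calls_per_chunk
    return total_calls / rate_cps / 60.0

# 10,000 chunks with gleaning at 12 calls/min ≈ 1,667 minutes (~28 hours)
```

This is why the review suggests making gleaning opt-in for cost-sensitive paths: halving `calls_per_chunk` halves the drain time.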

Comment thread src/brainlayer/enrichment_controller.py
Comment on lines +124 to +133
```python
    try:
        response = client.models.generate_content(
            model=GEMINI_EXTRACTION_MODEL,
            contents=prompt,
            config={"response_mime_type": "application/json", "thinking_config": {"thinking_budget": 0}},
        )
        return response.text if response and response.text else None
    except Exception:
        logger.warning("Gemini extraction call failed", exc_info=True)
        return None
```

@coderabbitai coderabbitai Bot Apr 2, 2026


⚠️ Potential issue | 🟠 Major

No rate limiting or timeout for Gemini extraction calls.

Unlike other enrichment backends (enrich_realtime uses per_chunk_delay, call_groq has GROQ_RATE_LIMIT_DELAY), this function has no rate limiting. High-volume entity extraction could exhaust API quotas.

Additionally, the generate_content call has no explicit timeout, risking hung requests.

🛡️ Suggested fix: Add timeout and consider rate limiting
```diff
+from google.genai import types as genai_types
+
+# Rate limit for extraction calls (shared with realtime enrichment)
+_extraction_rate_limit = float(os.environ.get("BRAINLAYER_ENRICH_RATE", "0.2"))
+_last_extraction_call: float = 0.0
+
 def call_gemini_for_extraction(prompt: str) -> Optional[str]:
     """Call Gemini for entity/relation extraction. Returns raw text response."""
+    global _last_extraction_call
+
     try:
         client = _get_gemini_client()
     except RuntimeError:
         logger.debug("Gemini not available for extraction")
         return None

     try:
+        # Rate limiting
+        if _extraction_rate_limit > 0:
+            import time
+            now = time.monotonic()
+            elapsed = now - _last_extraction_call
+            delay = 1.0 / _extraction_rate_limit
+            if _last_extraction_call > 0 and elapsed < delay:
+                time.sleep(delay - elapsed)
+            _last_extraction_call = time.monotonic()
+
         response = client.models.generate_content(
             model=GEMINI_EXTRACTION_MODEL,
             contents=prompt,
-            config={"response_mime_type": "application/json", "thinking_config": {"thinking_budget": 0}},
+            config={
+                "response_mime_type": "application/json",
+                "thinking_config": {"thinking_budget": 0},
+                "http_options": {"timeout": 60},
+            },
         )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/enrichment_controller.py` around lines 124 - 133, The Gemini
extraction call in enrichment_controller's try block
(client.models.generate_content with GEMINI_EXTRACTION_MODEL) lacks a timeout
and any rate limiting, so add an explicit request timeout to the
generate_content invocation (using the client's supported timeout param or
wrapping the call in a timeout mechanism) and throttle calls to this extraction
path like other backends (e.g., reuse the existing per_chunk_delay logic or a
shared rate limiter/token-bucket) to pause between high-volume calls; ensure the
new timeout and delay are configurable and applied before the try/except where
response is awaited so failures/timeouts are caught and logged as before.

Owner Author


Fixed in 86d6ebf — added 30s timeout to Gemini extraction call.


@EtanHey, thanks for adding the 30s timeout — that addresses the hung-request risk. ✅

However, the rate limiting concern from the original comment is still unaddressed. Unlike enrich_realtime (which uses per_chunk_delay) and other backends, call_gemini_for_extraction has no throttling between calls. During high-volume gleaning passes (two LLM calls per chunk), this could exhaust API quotas quickly.

Would you like to open a follow-up issue to track adding rate limiting to call_gemini_for_extraction, or address it in this PR?


🧠 Learnings used: the Groq → Gemini → Ollama backend-priority and pipeline-architecture learnings listed above.

Comment thread src/brainlayer/pipeline/entity_extraction.py Outdated
…ing default

- CRITICAL: Added Optional to typing imports in enrichment_controller.py
- MAJOR: Added timeout (30s) to Gemini extraction call
- MAJOR: Changed gleaning default to False (opt-in to avoid doubling cost)
- MEDIUM: Safe float() parsing for LLM strength values (handles null/"high")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment on lines +347 to +349
```python
        entities.extend(extra_entities)
        relations.extend(extra_relations)
    except Exception:
```

🟡 Medium pipeline/entity_extraction.py:347

After gleaning, entities.extend(extra_entities) combines primary and gleaning results without deduplicating overlapping spans. Each parse_llm_ner_response call deduplicates its own results, but overlaps between the two passes are retained. For example, if primary finds an entity at [10, 20] and gleaning finds one at [15, 25], both are returned. Consider calling _deduplicate_overlaps() on the combined entities list before returning.

```diff
         entities.extend(extra_entities)
         relations.extend(extra_relations)

+    # Deduplicate overlapping entity spans after combining passes
+    entities.sort(key=lambda e: (e.start, -len(e.text)))
+    entities = _deduplicate_overlaps(entities)
+
     # Deduplicate relations (gleaning may re-find the same ones)
```
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file src/brainlayer/pipeline/entity_extraction.py around lines 347-349:

After gleaning, `entities.extend(extra_entities)` combines primary and gleaning results without deduplicating overlapping spans. Each `parse_llm_ner_response` call deduplicates its own results, but overlaps between the two passes are retained. For example, if primary finds an entity at [10, 20] and gleaning finds one at [15, 25], both are returned. Consider calling `_deduplicate_overlaps()` on the combined `entities` list before returning.

Evidence trail:
src/brainlayer/pipeline/entity_extraction.py lines 332, 340 (both passes call parse_llm_ner_response), line 347 (entities.extend(extra_entities) without deduplication), lines 351-358 (relations ARE deduplicated after combining), line 275 (parse_llm_ner_response internally calls _deduplicate_overlaps), lines 100-117 (_deduplicate_overlaps function definition that handles overlapping spans). Commit: REVIEWED_COMMIT.
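A greedy implementation of the kind of span dedup the suggested fix relies on might look like this. It is a sketch of what a `_deduplicate_overlaps` helper could do, using plain dicts with assumed `start`/`end` fields; the real helper operates on the repository's entity objects.

```python
def deduplicate_overlaps(entities: list[dict]) -> list[dict]:
    """Greedy span dedup: after sorting by (start, longest-first), keep each
    entity only if its [start, end) span does not overlap one already kept.

    This matches the fix above: a gleaning hit at [15, 25] is dropped when a
    primary-pass hit at [10, 20] already covers part of that span.
    """
    entities = sorted(entities, key=lambda e: (e["start"], -(e["end"] - e["start"])))
    kept: list[dict] = []
    for ent in entities:
        # Non-overlap test: entirely before or entirely after every kept span.
        if all(ent["start"] >= k["end"] or ent["end"] <= k["start"] for k in kept):
            kept.append(ent)
    return kept
```

Sorting longest-first at each start position means that when two candidates begin at the same offset, the more specific (longer) span wins.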

@EtanHey EtanHey merged commit 8db3adc into main Apr 2, 2026
5 of 6 checks passed
@EtanHey EtanHey deleted the feat/llm-entity-extraction branch April 2, 2026 23:38