feat: Phase 3 — Brain Digest + Brain Entity tools #32
Conversation
Add `brain_digest` MCP tool for structured content ingestion (transcripts, documents, articles) and `brain_entity` for KG entity lookup with evidence.

- digest.py: `digest_content()` creates chunk, extracts entities (Phase 2), analyzes sentiment (Phase 6), extracts action items/decisions/questions
- `entity_lookup()` searches entities via FTS + semantic fallback, returns relations and evidence chunks
- `user_verified` column on kg_entities and kg_relations tables
- CLI: `brainlayer digest` command (text or `--file` input)
- 17 new tests, 434 total pass, lint clean

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough

Introduces the Phase 3 Brain Digest feature with a digest pipeline module that extracts structured knowledge from content (entities, relations, actions, decisions, questions, sentiment), adds a CLI digest command and the brain_digest/brain_entity MCP tools, updates the KG schema with a user_verified column, and establishes comprehensive tests.

Changes
Sequence Diagram(s)

sequenceDiagram
actor User
participant CLI
participant VectorStore
participant EmbedModel
participant Phase2Extractor
participant SentimentAnalyzer
participant KG as Knowledge Graph
User->>CLI: digest --content "text" --title "Title" --participants "Alice,Bob"
CLI->>VectorStore: Initialize store
CLI->>EmbedModel: Load embedding model
CLI->>Phase2Extractor: digest_content(content, store, embed_fn)
Phase2Extractor->>VectorStore: Create digest chunk
Phase2Extractor->>Phase2Extractor: Build seed entities from participants
Phase2Extractor->>Phase2Extractor: Extract entities & relations (Phase 2)
Phase2Extractor->>SentimentAnalyzer: Analyze sentiment (Phase 6)
Phase2Extractor->>Phase2Extractor: Extract actions, decisions, questions (regex)
Phase2Extractor->>EmbedModel: Embed chunk content
Phase2Extractor->>VectorStore: Upsert chunk with embeddings
Phase2Extractor->>KG: Update chunk sentiment
Phase2Extractor->>Phase2Extractor: Compute confidence tiers
Phase2Extractor-->>CLI: Return DigestResult
CLI->>VectorStore: Close store
CLI-->>User: Print digest ID, summary, sentiment, stats, entities, actions
sequenceDiagram
actor User
participant MCP
participant VectorStore
participant EmbedModel
participant KG as Knowledge Graph
User->>MCP: brain_entity(query="Alice", entity_type="person")
MCP->>VectorStore: Initialize store
MCP->>EmbedModel: Load embedding model
MCP->>VectorStore: Full-text search for "Alice"
alt FTS results found
VectorStore-->>MCP: Return FTS matches
else FTS empty
MCP->>EmbedModel: Embed query
MCP->>VectorStore: Semantic search (embeddings)
VectorStore-->>MCP: Return semantic matches
end
MCP->>KG: Retrieve entity data & relations
MCP->>KG: Fetch evidence chunks
MCP-->>User: Return entity with relations and evidence
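The alt branch in the diagram above (FTS first, semantic search only when FTS comes back empty) can be sketched as a small helper. The toy `fts`/`sem` backends below are illustrative stand-ins, not the store's real search methods:

```python
from typing import Callable, Optional

def lookup_entity(
    query: str,
    fts_search: Callable[[str], list],
    semantic_search: Callable[[str], list],
) -> Optional[dict]:
    """FTS first; fall back to semantic search only when FTS returns nothing."""
    matches = fts_search(query)
    if not matches:
        matches = semantic_search(query)
    return matches[0] if matches else None

# Toy backends standing in for the store's real search methods.
def fts(q: str) -> list:
    return [{"name": "Alice", "via": "fts"}] if q == "Alice" else []

def sem(q: str) -> list:
    return [{"name": q, "via": "semantic"}]

print(lookup_entity("Alice", fts, sem)["via"])   # exact FTS hit -> "fts"
print(lookup_entity("Alise", fts, sem)["via"])   # typo -> "semantic" fallback
```

The point of the ordering is cost: exact full-text matches are cheap and precise, so the embedding path only runs for queries FTS cannot resolve.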
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/brainlayer/vector_store.py (1)
355-372: 🧹 Nitpick | 🔵 Trivial
`user_verified` should be declared in the `CREATE TABLE` DDL, not only via migration

Because `user_verified` is absent from both `CREATE TABLE IF NOT EXISTS kg_entities` and `CREATE TABLE IF NOT EXISTS kg_relations`, every fresh database init always executes `PRAGMA table_info` + `ALTER TABLE` for both tables. The migration guard was designed for upgrading existing databases, not for seeding new ones. Including the column in the DDL (and keeping the migration guard) is idempotent, self-documenting, and avoids the unnecessary overhead on fresh installs.
💡 Proposed fix
```diff
 cursor.execute("""
     CREATE TABLE IF NOT EXISTS kg_entities (
         id TEXT PRIMARY KEY,
         entity_type TEXT NOT NULL,
         name TEXT NOT NULL,
         metadata TEXT DEFAULT '{}',
         created_at TEXT DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')),
         updated_at TEXT DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')),
+        user_verified INTEGER DEFAULT 0,
         UNIQUE(entity_type, name)
     )
 """)

 cursor.execute("""
     CREATE TABLE IF NOT EXISTS kg_relations (
         id TEXT PRIMARY KEY,
         source_id TEXT NOT NULL,
         target_id TEXT NOT NULL,
         relation_type TEXT NOT NULL,
         properties TEXT DEFAULT '{}',
         confidence REAL DEFAULT 1.0,
         created_at TEXT DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ','now')),
+        user_verified INTEGER DEFAULT 0,
         UNIQUE(source_id, target_id, relation_type)
     )
 """)
```

The existing migration guards at lines 370-372 and 392-394 remain and continue to handle old databases correctly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/brainlayer/vector_store.py` around lines 355 - 372, The CREATE TABLE DDL for kg_entities (and similarly for kg_relations) must include the user_verified column so new DBs don't always run the migration: update the CREATE TABLE IF NOT EXISTS kg_entities statement to declare "user_verified INTEGER DEFAULT 0" (and do the same for kg_relations' CREATE TABLE) while keeping the existing migration guard that checks PRAGMA table_info and the ALTER TABLE in functions/blocks where those statements live; this ensures idempotent schema creation for fresh installs and still upgrades old databases.
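The guard-plus-DDL pattern this comment describes can be sketched as follows. This uses the stdlib `sqlite3` and a hypothetical `ensure_column` helper purely for illustration; the project itself uses APSW:

```python
import sqlite3

def ensure_column(conn, table, ddl="user_verified INTEGER DEFAULT 0"):
    """Idempotent migration guard: add the column only when an older DB lacks it."""
    name = ddl.split()[0]
    cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if name not in cols:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {ddl}")

conn = sqlite3.connect(":memory:")

# Fresh install: the DDL already declares the column, so the guard is a no-op.
conn.execute("""
    CREATE TABLE IF NOT EXISTS kg_entities (
        id TEXT PRIMARY KEY,
        entity_type TEXT NOT NULL,
        name TEXT NOT NULL,
        user_verified INTEGER DEFAULT 0,
        UNIQUE(entity_type, name)
    )
""")
ensure_column(conn, "kg_entities")  # no ALTER TABLE fires

# Old database: the table predates the column, so the guard upgrades it.
conn.execute("CREATE TABLE kg_relations (id TEXT PRIMARY KEY)")
ensure_column(conn, "kg_relations")  # ALTER TABLE fires here

rel_cols = {row[1] for row in conn.execute("PRAGMA table_info(kg_relations)")}
print("user_verified" in rel_cols)  # True
```

Declaring the column in the DDL and keeping the guard means both fresh and upgraded databases converge on the same schema without redundant `ALTER TABLE` work on new installs.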
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/plans/2026-02-25-phase-3-brain-digest.md`:
- Line 13: The "### Task N:" headings jump from H1 to H3 causing markdownlint
MD001; change each "### Task N:" (e.g., "### Task 1: Add user_verified column to
KG tables") to an H2 ("## Task 1: ...") so headings increment H1→H2→H3
consistently, apply this change to all occurrences of "### Task" in the document
and re-run markdownlint to verify no MD001 warnings remain.
In `@src/brainlayer/cli/__init__.py`:
- Around line 364-431: The VectorStore instance created in digest (store =
VectorStore(DEFAULT_DB_PATH)) can leak if an exception is raised before the
happy-path store.close(); ensure the store is always closed by moving creation
outside the try or by introducing a finally block that calls store.close() when
store is not None. Update the digest function in src/brainlayer/cli/__init__.py
to wrap the core logic (calls to get_embedding_model, digest_content, and
console output) in try/finally (or set store = None before try and close in
finally) so that store.close() runs on both success and error paths; reference
VectorStore, DEFAULT_DB_PATH, store.close(), get_embedding_model, and
digest_content to locate the changes.
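The try/finally shape this prompt asks for can be sketched like so; `Store` and `digest_command` here are toy stand-ins for illustration, not the project's actual `VectorStore` or CLI code:

```python
from typing import Optional

class Store:
    """Toy stand-in for VectorStore that records whether close() ran."""
    def __init__(self) -> None:
        self.closed = False
    def close(self) -> None:
        self.closed = True

def digest_command(content: str) -> Store:
    store: Optional[Store] = None
    try:
        store = Store()
        if not content:
            raise ValueError("empty content")  # any mid-pipeline failure
        # ... embedding, digest_content(), console output would go here ...
        return store
    finally:
        if store is not None:
            store.close()  # runs on success and error paths alike

print(digest_command("hello transcript").closed)  # True
```

Initializing `store` to `None` before the `try` lets the `finally` distinguish "construction itself failed" from "a later step failed", so `close()` is never called on an object that does not exist.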
In `@src/brainlayer/mcp/__init__.py`:
- Around line 945-959: The timeout wrapper is ineffective because _brain_entity
does synchronous blocking I/O on the event loop, so asyncio.wait_for
(_with_timeout) cannot interrupt it; instead, run the blocking work off the loop
(use asyncio.get_running_loop().run_in_executor or asyncio.to_thread) and apply
_with_timeout to that task. Concretely, change the caller that returns await
_with_timeout(_brain_entity(...)) to schedule the blocking portion of
_brain_entity in an executor (or refactor the blocking logic into a sync helper
and call it via run_in_executor/to_thread) and await _with_timeout on the
resulting future; keep function names _brain_entity and _with_timeout referenced
so you update the correct call sites.
- Around line 1078-1107: The _brain_digest handler currently calls the
synchronous digest_content directly, which blocks the event loop; modify
_brain_digest to offload the blocking work to a thread executor (e.g., use
asyncio.get_running_loop().run_in_executor or asyncio.to_thread) when invoking
digest_content, passing store from _get_vector_store(), embed_fn=model.embed
from _get_embedding_model(), and the normalized project via
_normalize_project_name; preserve the existing try/except behavior and return
types (CallToolResult with TextContent on success, _error_result on exceptions)
so all embedding/DB/entity extraction/sentiment work runs off the MCP event
loop.
- Around line 1110-1134: _brain_entity blocks the event loop because it calls
the synchronous entity_lookup (which touches the DB and may call model.embed)
and lacks error handling; wrap the call to entity_lookup inside an executor
(e.g., use asyncio.get_running_loop().run_in_executor or asyncio.to_thread) and
pass store/_get_vector_store and model/_get_embedding_model as before, then
surround that awaited offloaded call with try/except to catch any exceptions,
log or convert the exception to a safe text message, and return a CallToolResult
containing an error TextContent instead of letting the traceback propagate;
refer to symbols _brain_entity, entity_lookup, _get_vector_store,
_get_embedding_model, and CallToolResult to locate and update the code.
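The offloading pattern these prompts describe — running the synchronous work in a thread so the timeout can actually fire — can be sketched as follows, with `_slow_lookup` as a stand-in for the blocking `entity_lookup`:

```python
import asyncio
import time

def _slow_lookup(query: str) -> dict:
    """Stand-in for synchronous DB + embedding work (blocks its thread)."""
    time.sleep(0.05)  # blocks the worker thread, not the event loop
    return {"entity": query, "relations": []}

async def brain_entity(query: str, timeout: float = 2.0) -> dict:
    # to_thread moves the blocking call into the default executor, so
    # wait_for can time out the awaiting task instead of being starved
    # by a synchronous call running directly on the event loop.
    return await asyncio.wait_for(asyncio.to_thread(_slow_lookup, query), timeout)

result = asyncio.run(brain_entity("Alice"))
print(result)
```

Wrapping `asyncio.wait_for` around a coroutine that does blocking I/O inline is a no-op: the loop never regains control to enforce the deadline. Only once the work is in an executor does the timeout become meaningful.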
In `@src/brainlayer/pipeline/digest.py`:
- Around line 91-103: The _classify_confidence function currently collapses
medium and low confidences into "needs_review", making
MEDIUM_CONFIDENCE_THRESHOLD unused; update the function to preserve a distinct
medium bucket by adding a third counter (e.g., medium = 0), increment medium
when conf >= MEDIUM_CONFIDENCE_THRESHOLD and < HIGH_CONFIDENCE_THRESHOLD, leave
low to the else branch, and return {"high_confidence": high,
"medium_confidence": medium, "needs_review": low}; reference
_classify_confidence, HIGH_CONFIDENCE_THRESHOLD, and MEDIUM_CONFIDENCE_THRESHOLD
when making the change.
- Around line 28-45: ACTION_PATTERNS[3] is too broad and will generate many
false positives; replace or remove it — either restrict it to
first‑person/imperative forms (e.g. anchors like ^\s*(?:I|we)\s+(?:will|need
to|should)\b or require the phrase to start a line/paragraph or be prefixed by
TODO/ACTION) or drop it and rely on the structured list patterns; also fix
ACTION_PATTERNS[1] which currently uses re.S and a lazy dot that can span the
whole document — remove the re.S flag and change the capture to
per-line/non-newline matching (e.g. use [^\n]+ or apply re.M with an explicit
line-based pattern) so a numbered item only captures its own line/block instead
of to the document end.
In `@tests/test_phase3_digest.py`:
- Around line 31-55: Multiple tests create VectorStore(tmp_path / "test.db")
inline without closing it, leaking APSW connections; fix by adding a shared
pytest fixture that yields a VectorStore and calls store.close() on teardown
(follow the pattern used in tests/test_kg_schema.py) and update affected tests
to accept that fixture (or alternatively ensure each test calls store.close() or
uses a context manager around VectorStore); reference the VectorStore
constructor usage and the store.close() method when making the change so all
inline creations are replaced or closed.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (7)
- docs/plans/2026-02-25-phase-3-brain-digest.md
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
- src/brainlayer/pipeline/digest.py
- src/brainlayer/vector_store.py
- tests/test_kg_schema.py
- tests/test_phase3_digest.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test (3.13)
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
🧰 Additional context used
📓 Path-based instructions (5)
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Run tests using `pytest` from the project root

Files:
- tests/test_kg_schema.py
- tests/test_phase3_digest.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use `ruff check src/` for linting and `ruff format src/` for code formatting

Files:
- src/brainlayer/vector_store.py
- src/brainlayer/pipeline/digest.py
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
src/brainlayer/vector_store.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use sqlite-vec with APSW for vector storage, WAL mode, and `PRAGMA busy_timeout = 5000` for concurrent multi-process safety

Files:
- src/brainlayer/vector_store.py
src/brainlayer/cli/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Enable project-specific indexing with `brainlayer index --project <project_name>` and incremental indexing with `brainlayer index-fast`

Files:
- src/brainlayer/cli/__init__.py
src/brainlayer/mcp/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names
Files:
src/brainlayer/mcp/__init__.py
🧠 Learnings (2)
📚 Learning: 2026-02-23T16:51:38.317Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/mcp/**/*.py : Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names
Applied to files:
- docs/plans/2026-02-25-phase-3-brain-digest.md
- src/brainlayer/mcp/__init__.py
📚 Learning: 2026-02-23T16:51:38.317Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/pipeline/enrichment.py : Enrich chunks with 10 metadata fields: summary, tags, importance (1-10), intent, primary_symbols, resolved_query, epistemic_level, version_scope, debt_impact, and external_deps
Applied to files:
src/brainlayer/pipeline/digest.py
🧬 Code graph analysis (3)
src/brainlayer/cli/__init__.py (3)
- src/brainlayer/embeddings.py (1): get_embedding_model (109-114)
- src/brainlayer/pipeline/digest.py (1): digest_content (106-222)
- src/brainlayer/vector_store.py (2): VectorStore (72-2534), close (2523-2528)
tests/test_phase3_digest.py (4)
- src/brainlayer/vector_store.py (2): VectorStore (72-2534), upsert_chunks (490-549)
- tests/test_kg_schema.py (1): store (23-28)
- src/brainlayer/pipeline/digest.py (2): digest_content (106-222), entity_lookup (225-286)
- src/brainlayer/mcp/__init__.py (1): list_tools (527-827)
src/brainlayer/mcp/__init__.py (1)
- src/brainlayer/pipeline/digest.py (2): digest_content (106-222), entity_lookup (225-286)
🪛 markdownlint-cli2 (0.21.0)
docs/plans/2026-02-25-phase-3-brain-digest.md
[warning] 13-13: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
🔇 Additional comments (6)
tests/test_kg_schema.py (1)
74-82: LGTM — column set assertions correctly extended for `user_verified`

Both exact-set comparisons (`kg_entities` at line 77 and `kg_relations` at line 82) now include `user_verified`, matching the Phase 3 schema migrations. Using exact-set equality rather than `in`-checks provides good regression protection.

src/brainlayer/pipeline/digest.py (2)
225-286: LGTM — two-stage entity lookup is well-structured

FTS-first with semantic fallback is the right pattern here. The relation hydration correctly handles both outgoing (`target_name`/`target_type`) and incoming (`source_name`/`source_type`) directions, and the evidence truncation at 300 chars is a safe default for MCP response size.
155-168: No issue found — `store_extraction_result` returns a `Dict[str, str]` mapping entity text → entity_id (as documented and implemented in batch_extraction.py line 95: `entity_ids[entity.text] = entity_id`). The code at line 162 correctly uses `ext_entity.text` as the lookup key, so all `.get()` calls will succeed for extracted entities that were stored.

Likely an incorrect or invalid review comment.
tests/test_phase3_digest.py (1)
299-308: The assertion on `entities_found >= 2` is robust and does not need mocking

This test relies on seed entity matching (deterministic string matching), not Phase 2 NER. Both participants are explicitly mentioned in the content and will be found by seed matching with high confidence. No mocking or refactoring is needed.
src/brainlayer/mcp/__init__.py (2)
50-66: LGTM — Server instructions updated correctly for the two new tools.

Tool count is accurate (5 tools), and the new tool descriptions are clear and consistent with the existing format.
766-826: Tool definitions look well-structured and consistent with existing tools.

Both schemas follow the established patterns. `brain_digest` correctly uses `_WRITE` annotations (it creates chunks), and `brain_entity` correctly uses `_READ_ONLY`.
> ### Task 1: Add user_verified column to KG tables
Fix heading-level jump (MD001)
Line 13 opens at ### (H3) directly beneath the # (H1) title, skipping H2. markdownlint flags this as MD001 heading-increment.
🛠 Proposed fix
```diff
-### Task 1: Add user_verified column to KG tables
+## Task 1: Add user_verified column to KG tables
```

Apply consistently to all ### Task N: headings in the file.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-### Task 1: Add user_verified column to KG tables
+## Task 1: Add user_verified column to KG tables
```
```python
def test_user_verified_column_on_kg_entities(tmp_path):
    """kg_entities has user_verified column."""
    store = VectorStore(tmp_path / "test.db")
    cursor = store.conn.cursor()
    cols = {row[1] for row in cursor.execute("PRAGMA table_info(kg_entities)")}
    assert "user_verified" in cols


def test_user_verified_column_on_kg_relations(tmp_path):
    """kg_relations has user_verified column."""
    store = VectorStore(tmp_path / "test.db")
    cursor = store.conn.cursor()
    cols = {row[1] for row in cursor.execute("PRAGMA table_info(kg_relations)")}
    assert "user_verified" in cols


def test_user_verified_defaults_to_false(tmp_path):
    """user_verified defaults to 0 (false) on new entities."""
    store = VectorStore(tmp_path / "test.db")
    eid = store.upsert_entity("test-ent-1", "person", "Test Person")
    cursor = store.conn.cursor()
    row = list(cursor.execute(
        "SELECT user_verified FROM kg_entities WHERE id = ?", [eid]
    ))[0]
    assert row[0] == 0
```
VectorStore instances are never closed — resource leak across all inline-created stores
Every test function in this file that creates VectorStore(tmp_path / "test.db") inline (lines 33, 41, 49, 65, 91, 115, 134, 151, 170, 183, 252, 275, 289) never calls store.close(). This leaks open APSW connections and WAL file handles. tests/test_kg_schema.py avoids this via a store fixture that properly yields and closes the instance.
The simplest fix is to add a shared store fixture (matching the pattern in test_kg_schema.py) and use it in each test, or to call store.close() / use a with-statement where a fixture isn't practical.
🛠 Proposed fixture + usage pattern

```diff
+import pytest
+from brainlayer.vector_store import VectorStore
+
+@pytest.fixture
+def store(tmp_path):
+    s = VectorStore(tmp_path / "test.db")
+    yield s
+    s.close()
```

Then each test that previously created its own store inline:

```diff
-def test_user_verified_column_on_kg_entities(tmp_path):
-    store = VectorStore(tmp_path / "test.db")
-    cursor = store.conn.cursor()
+def test_user_verified_column_on_kg_entities(store):
+    cursor = store.conn.cursor()
```

Tests that need the embed_fn can still accept both store and the mock_embedding fixture.
- Remove unused imports (json, patch) in test file
- Replace lambda assignments with def function (E731)
- Fix import sorting (I001)
- Update tool count test: 3 → 5 (brain_digest + brain_entity added)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Run MCP handlers via loop.run_in_executor to avoid blocking event loop
- Add error handling to brain_entity handler
- Fix VectorStore resource leak in CLI digest command (try/finally)
- Remove overly broad modal-verb action item pattern (false positives)
- Separate low_confidence tier from needs_review in confidence stats

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
```python
lambda: digest_content(
    content=content,
    store=store,
    embed_fn=model.embed,
```
model.embed attribute doesn't exist on EmbeddingModel
High Severity
EmbeddingModel has embed_query and embed_chunks methods but no embed method. Passing model.embed as embed_fn will raise an AttributeError at runtime when brain_digest, brain_entity, or the CLI digest command is invoked. Tests pass because they use a _dummy_embed function that bypasses the real model entirely.
Additional Locations (2)
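One way to address this finding (a sketch, assuming `embed_query(text)` returns a single vector) is to pass an adapter over the methods that do exist rather than the missing `model.embed`. The `EmbeddingModel` below is a toy stand-in with a placeholder vector, not the real class:

```python
from typing import Callable, Sequence

class EmbeddingModel:
    """Toy stand-in exposing the two methods the real model is said to have."""
    def embed_query(self, text: str) -> Sequence[float]:
        return [float(len(text))]  # placeholder vector for illustration

    def embed_chunks(self, texts: list) -> list:
        return [self.embed_query(t) for t in texts]

def make_embed_fn(model: EmbeddingModel) -> Callable[[str], Sequence[float]]:
    # Adapter: digest_content expects a plain text -> vector callable,
    # so hand it the method that actually exists instead of model.embed.
    return model.embed_query

embed_fn = make_embed_fn(EmbeddingModel())
print(embed_fn("hello"))  # [5.0]
```

A unit test that exercises the real model's attribute (even just `callable(embed_fn)` against the production class) would have caught the `AttributeError` that the `_dummy_embed` test double masked.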
```python
store = _get_vector_store()
model = _get_embedding_model()
loop = asyncio.get_event_loop()
```
Inconsistent use of deprecated get_event_loop API
Low Severity
_brain_digest and _brain_entity use asyncio.get_event_loop(), while the rest of the codebase (e.g., _brain_search at line 1401) consistently uses asyncio.get_running_loop(). get_event_loop() is deprecated in Python 3.10+ for this use case and may be removed in future versions.
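For reference, the non-deprecated form inside a coroutine looks like this (a minimal sketch; `blocking_work` is a placeholder for the handler's sync portion):

```python
import asyncio

def blocking_work() -> str:
    return "ok".upper()  # stand-in for synchronous DB/embedding work

async def handler() -> str:
    loop = asyncio.get_running_loop()  # preferred inside coroutines on 3.10+
    return await loop.run_in_executor(None, blocking_work)

print(asyncio.run(handler()))  # OK
```

`get_running_loop()` raises if called outside a running loop, which is exactly the guarantee a handler wants; `get_event_loop()` may silently create a new loop in some contexts, which is why it was deprecated for this use.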


Summary
- New chunks are ingested with `source="digest"`
- `user_verified` column on `kg_entities` and `kg_relations` for human confirmation flags
- `brainlayer digest` command for terminal usage
- `src/brainlayer/pipeline/digest.py` module integrating Phase 2 (entity extraction) + Phase 6 (sentiment analysis)

Changes

- `src/brainlayer/pipeline/digest.py` — `digest_content()` + `entity_lookup()` + regex extractors for action items/decisions/questions
- `src/brainlayer/mcp/__init__.py` — `brain_digest` + `brain_entity` tools (schema + handlers), update server instructions (3→5 tools)
- `src/brainlayer/vector_store.py` — `user_verified` column migration for kg_entities + kg_relations
- `src/brainlayer/cli/__init__.py` — `brainlayer digest` command with `--file`, `--title`, `--project`, `--participants`
- `tests/test_phase3_digest.py`
- `tests/test_kg_schema.py` — `user_verified` column

Test plan

- `ruff check src/` clean
- `brainlayer digest --help` works

🤖 Generated with Claude Code
Summary by CodeRabbit
- `digest` CLI command to analyze text and extract structured insights including entities, sentiment, actions, decisions, and questions
- `brain_digest` and `brain_entity` MCP tools for content processing and entity lookup

Note
Medium Risk
Introduces new write paths into the DB (new chunk ingestion + KG writes) and applies schema migrations on startup, so correctness and compatibility with existing databases matter; changes are localized and covered by new tests.
Overview
Adds Phase 3 ingestion and lookup capabilities:
`brain_digest` writes a new `source="digest"` chunk and runs entity/relation extraction + sentiment + basic action/decision/question parsing, while `brain_entity` performs KG entity lookup (FTS then semantic) and returns relations and evidence chunks.

Extends the KG schema by adding a `user_verified` column to `kg_entities` and `kg_relations` with lightweight migrations, exposes digestion via a new `brainlayer digest` CLI command, and updates/extends tests to cover the new schema, tools, and end-to-end digest→lookup flow (including MCP tool count from 3→5).

Written by Cursor Bugbot for commit 6d0dc39. This will update automatically on new commits.