…ity-tagged store
4 features for 6PM trial exercise (Avi Simon meeting scheduling):
1. Person profile schema convention on kg_entities metadata JSON:
{ hard_constraints, preferences, contact_info }
2. Per-person memory scoping: brain_search(entity_id=...) filters
both semantic (KNN) and FTS5 to entity-linked chunks only.
Added entity_id param to search(), hybrid_search(), _search(),
_brain_search(). Bypasses routing rules, bumps k-value 10x.
3. brain_get_person composite MCP tool: one call returns profile +
constraints + relations + scoped memories. Designed for copilot
agents that need full person context.
4. Entity-tagged brain_store(entity_id=...): stores chunk then
auto-links to entity via link_entity_chunk().
19 new tests (525 total passing). Zero breaking changes.
Source: avi-trial (6PM scheduling startup trial exercise)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
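A minimal sketch of the profile convention from item 1, assuming the three keys live in one JSON object stored in the `metadata` field of `kg_entities`. Only the three top-level key names come from the commit message; the value shapes (lists and a dict), the sample values, and both helper functions are hypothetical.

```python
import json

# The three top-level keys named in the commit message.
PROFILE_KEYS = ("hard_constraints", "preferences", "contact_info")


def make_profile(hard_constraints=None, preferences=None, contact_info=None):
    """Build a metadata dict that follows the three-key convention.

    Hypothetical helper: the value types are an assumption, not the
    project's documented schema.
    """
    return {
        "hard_constraints": list(hard_constraints or []),
        "preferences": list(preferences or []),
        "contact_info": dict(contact_info or {}),
    }


def missing_profile_keys(metadata_json):
    """Return the convention keys absent from a stored metadata JSON string."""
    meta = json.loads(metadata_json)
    return [key for key in PROFILE_KEYS if key not in meta]


profile = make_profile(
    hard_constraints=["no meetings before 09:00"],
    preferences=["prefers afternoons"],
    contact_info={"email": "person@example.com"},
)
metadata_json = json.dumps(profile)
```

Because the convention sits inside an existing JSON field, entities written before this change simply report all three keys as missing rather than failing to load.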
📝 Walkthrough
Introduces entity-scoped memory management by adding the
Changes
Sequence Diagram
sequenceDiagram
participant Client
participant MCP as MCP Layer
participant KG as KG / Entity<br/>Lookup
participant VS as Vector Store
participant Store as Memory Store
Client->>MCP: call_tool("brain_get_person",<br/>name, context, num_memories)
MCP->>KG: entity_lookup(name)
KG-->>MCP: entity_id, profile, relations
alt context provided
MCP->>VS: hybrid_search(context,<br/>entity_id=entity_id)
VS-->>MCP: ranked memories (entity-scoped)
else no context
MCP->>VS: search(entity_id=entity_id)
VS-->>MCP: entity-linked chunks
end
MCP-->>Client: JSON(entity_id, name, profile,<br/>relations, memories, count)
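The flow in the diagram can be sketched in plain Python. This is an illustration under stated assumptions, not the project's handler: `FakeKG` and `FakeVectorStore` are in-memory stand-ins, their method signatures are hypothetical, ranking is naive word overlap rather than KNN plus FTS5, and the error payload for an unknown name is invented for the example.

```python
import json


class FakeKG:
    """Stand-in for the KG / entity lookup participant (hypothetical)."""

    def __init__(self, entities):
        self._entities = entities  # lowercased name -> entity dict

    def entity_lookup(self, name):
        return self._entities.get(name.lower())


class FakeVectorStore:
    """Stand-in for the vector store; ranks by word overlap, not KNN."""

    def __init__(self, chunks):
        self._chunks = chunks  # list of (entity_id, text) pairs

    def search(self, entity_id, n_results):
        # "no context" branch: return entity-linked chunks in stored order.
        linked = [text for eid, text in self._chunks if eid == entity_id]
        return linked[:n_results]

    def hybrid_search(self, query, entity_id, n_results):
        # "context provided" branch: rank entity-linked chunks by overlap.
        words = set(query.lower().split())
        linked = self.search(entity_id, len(self._chunks))
        ranked = sorted(linked, key=lambda t: -len(words & set(t.lower().split())))
        return ranked[:n_results]


def get_person(kg, vs, name, context=None, num_memories=5):
    """Compose profile, relations, and scoped memories into one JSON reply."""
    entity = kg.entity_lookup(name)
    if entity is None:
        # Assumed error shape; the diagram does not show the failure path.
        return json.dumps({"error": f"no person entity named {name!r}"})
    if context:
        memories = vs.hybrid_search(context, entity_id=entity["id"], n_results=num_memories)
    else:
        memories = vs.search(entity_id=entity["id"], n_results=num_memories)
    return json.dumps({
        "entity_id": entity["id"],
        "name": entity["name"],
        "profile": entity.get("profile", {}),
        "relations": entity.get("relations", []),
        "memories": memories,
        "count": len(memories),
    })
```

The point of the composite shape is that a copilot agent makes one call and gets the same six fields the diagram lists, regardless of which branch produced the memories.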
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/brainlayer/mcp/__init__.py (1)
2271-2301: ⚠️ Potential issue | 🟠 Major
DB-busy fallback drops `entity_id`, so entity links can be silently lost.
Immediate writes pass `entity_id`, but queued writes/replay do not. If the DB is busy, `brain_store(..., entity_id=...)` will store memory without person linkage.
🛠️ Proposed fix
@@ def _flush_pending_stores(store, embed_fn) -> int:
     store_memory(
         store=store,
         embed_fn=embed_fn,
         content=item["content"],
         memory_type=item["memory_type"],
         project=item.get("project"),
         tags=item.get("tags"),
         importance=item.get("importance"),
         confidence_score=item.get("confidence_score"),
         outcome=item.get("outcome"),
         reversibility=item.get("reversibility"),
         files_changed=item.get("files_changed"),
+        entity_id=item.get("entity_id"),
     )
@@ async def _store(...):
     _queue_store(
         {
             "content": content,
             "memory_type": memory_type,
             "project": _normalize_project_name(project),
             "tags": tags,
             "importance": importance,
             "confidence_score": confidence_score,
             "outcome": outcome,
             "reversibility": reversibility,
             "files_changed": files_changed,
+            "entity_id": entity_id,
         }
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/brainlayer/mcp/__init__.py` around lines 2271 - 2301, The DB-busy fallback path is dropping entity_id when buffering/replaying writes so person links are lost; ensure the buffering and replay code includes entity_id and passes it through to store_memory (and any call sites like brain_store) exactly as the immediate path does. Update the code that serializes memories to the JSONL buffer to include the entity_id field, and update the replay/path that reconstructs calls to call store_memory(..., entity_id=entity_id) (or forward entity_id into the lambda used with loop.run_in_executor) so queued writes preserve and pass entity_id through end-to-end.
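The failure mode is easy to reproduce in isolation. The sketch below assumes a JSONL-style buffer as the comment describes; `serialize_pending` and `replay_pending` are hypothetical names, and `store_fn` stands in for `store_memory` with a reduced argument list.

```python
import json


def serialize_pending(content, memory_type, entity_id=None):
    """Serialize one queued write as a JSON line, keeping entity_id."""
    return json.dumps({
        "content": content,
        "memory_type": memory_type,
        "entity_id": entity_id,  # the field the fallback path used to drop
    })


def replay_pending(lines, store_fn):
    """Replay buffered lines, forwarding entity_id to the store call."""
    replayed = 0
    for line in lines:
        item = json.loads(line)
        store_fn(
            content=item["content"],
            memory_type=item["memory_type"],
            # .get() keeps lines written before the fix replayable.
            entity_id=item.get("entity_id"),
        )
        replayed += 1
    return replayed
```

Reading the field with `.get()` rather than indexing means a buffer file written by an older build still drains cleanly, just without a link.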
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/brainlayer/mcp/__init__.py`:
- Around line 717-751: Update the module-level metadata string that still
references the old number of MCP tools so it reflects the current public surface
of 6 tools; locate the explanatory text near the Tool definitions (e.g., around
the Tool entries like name="brain_get_person") and change the count in that
metadata/docstring or metadata variable to "6" (and, if present, make the text
generic or auto-derived so future additions don't get out of sync).
In `@src/brainlayer/store.py`:
- Around line 142-149: Validate entity_id before calling
store.link_entity_chunk: ensure entity_id is non-empty and corresponds to an
existing entity in the store (e.g., call a lookup like
store.get_entity(entity_id) or store.entity_exists(entity_id) and only call
store.link_entity_chunk with chunk_id when that check returns true); if the
lookup fails, skip linking and log or return a warning to avoid creating
dangling kg_entity_chunks rows.
In `@src/brainlayer/vector_store.py`:
- Around line 628-633: The current calculation of effective_k (effective_k =
n_results * 10 if entity_id else n_results) can grow unbounded and cause
expensive KNN scans; add a sane cap by defining a MAX_K_FANOUT constant (e.g.
500 or another sensible limit) and replace effective_k with min(effective_k,
MAX_K_FANOUT) before building params; update the variable used in params =
[query_bytes, effective_k] + filter_params so entity-scoped fan-out is limited
while still over-fetching for entity_id cases.
---
Outside diff comments:
In `@src/brainlayer/mcp/__init__.py`:
- Around line 2271-2301: The DB-busy fallback path is dropping entity_id when
buffering/replaying writes so person links are lost; ensure the buffering and
replay code includes entity_id and passes it through to store_memory (and any
call sites like brain_store) exactly as the immediate path does. Update the code
that serializes memories to the JSONL buffer to include the entity_id field, and
update the replay/path that reconstructs calls to call store_memory(...,
entity_id=entity_id) (or forward entity_id into the lambda used with
loop.run_in_executor) so queued writes preserve and pass entity_id through
end-to-end.
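A sketch of the capped fan-out suggested for `effective_k`, assuming the 10x multiplier from the commit message. The function name and signature are hypothetical. The cap value is a parameter here: the review suggests "e.g. 500", while the follow-up commit on this PR states 1000. One deviation from the literal suggestion is labeled in the code: the result never drops below `n_results`, so a large explicit request is not truncated by the cap.

```python
MAX_K_FANOUT = 1000  # illustrative default; the review comment suggests e.g. 500


def effective_k(n_results, entity_id=None, fanout=10, cap=MAX_K_FANOUT):
    """Return how many KNN candidates to fetch before entity filtering."""
    if not entity_id:
        # Unscoped searches keep their requested size.
        return n_results
    # Over-fetch for entity-scoped queries, but bound the scan size.
    # Assumption beyond the review text: never fetch fewer than n_results.
    return max(n_results, min(n_results * fanout, cap))
```

Over-fetching exists because the entity filter is applied to KNN candidates, so a scoped query needs more candidates than it returns; the cap bounds the cost of that when `n_results` is large.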
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (5)
src/brainlayer/mcp/__init__.py
src/brainlayer/store.py
src/brainlayer/vector_store.py
tests/test_6pm_entity_upgrades.py
tests/test_think_recall_integration.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
- GitHub Check: Cursor Bugbot
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Run tests using `pytest` from the project root
Files:
tests/test_think_recall_integration.py
tests/test_6pm_entity_upgrades.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use `ruff check src/` for linting and `ruff format src/` for code formatting
Files:
src/brainlayer/vector_store.py
src/brainlayer/store.py
src/brainlayer/mcp/__init__.py
src/brainlayer/vector_store.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use sqlite-vec with APSW for vector storage, WAL mode, and `PRAGMA busy_timeout = 5000` for concurrent multi-process safety
Files:
src/brainlayer/vector_store.py
src/brainlayer/mcp/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names
Files:
src/brainlayer/mcp/__init__.py
🧠 Learnings (1)
📚 Learning: 2026-02-23T16:51:38.317Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/mcp/**/*.py : Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names
Applied to files:
tests/test_think_recall_integration.py
src/brainlayer/mcp/__init__.py
🧬 Code graph analysis (4)
tests/test_think_recall_integration.py (1)
src/brainlayer/mcp/__init__.py (1)
list_tools(527-870)
tests/test_6pm_entity_upgrades.py (2)
src/brainlayer/vector_store.py (7)
VectorStore(72-2562)
serialize_f32(67-69)
upsert_entity(2131-2170)
link_entity_chunk(2205-2223)
get_entity(2225-2244)
search(551-750)
get_entity_chunks(2335-2364)
src/brainlayer/store.py (1)
store_memory(33-154)
src/brainlayer/store.py (1)
src/brainlayer/vector_store.py (1)
link_entity_chunk(2205-2223)
src/brainlayer/mcp/__init__.py (2)
src/brainlayer/pipeline/digest.py (1)
entity_lookup(231-292)
src/brainlayer/vector_store.py (2)
hybrid_search(868-1080)
get_entity_chunks(2335-2364)
🔇 Additional comments (4)
src/brainlayer/vector_store.py (1)
585-587: Entity scoping is applied consistently across semantic, text, and FTS branches.
Nice implementation: all three retrieval paths now enforce `entity_id` filtering with parameterized SQL.
Also applies to: 652-654, 909-910, 926-930
tests/test_think_recall_integration.py (1)
249-255: Tool discovery assertion update is correct and useful.
This protects against accidental removal of `brain_get_person` from the registered MCP tools list.
tests/test_6pm_entity_upgrades.py (1)
201-282: Excellent coverage for entity scoping and isolation paths.
These tests meaningfully protect the new `entity_id` behavior across search, store linking, and multi-person separation.
Also applies to: 287-367, 443-505
src/brainlayer/mcp/__init__.py (1)
1313-1354: Entity-scoped routing behavior is well-implemented.
Bypassing heuristic routing and forwarding `entity_id` directly to `_search` makes per-person queries deterministic and avoids accidental route mismatches.
Also applies to: 1594-1614
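For readers unfamiliar with the pattern being praised, here is a sketch of entity scoping through a bridge table with parameterized SQL. It uses the standard-library `sqlite3` module and a `LIKE` match as a stand-in for the project's APSW, sqlite-vec, and FTS5 paths. The `chunks` and `kg_entity_chunks` table names appear in this PR; the column names and the `text_search` function are assumptions.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with an assumed two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE chunks (id TEXT PRIMARY KEY, content TEXT);
        CREATE TABLE kg_entity_chunks (entity_id TEXT, chunk_id TEXT);
        """
    )
    return conn


def text_search(conn, term, entity_id=None, limit=10):
    """Search chunk text, optionally restricted to one entity's chunks."""
    sql = "SELECT id FROM chunks WHERE content LIKE ?"
    params = [f"%{term}%"]
    if entity_id is not None:
        # The entity filter is bound as a parameter, never interpolated.
        sql += " AND id IN (SELECT chunk_id FROM kg_entity_chunks WHERE entity_id = ?)"
        params.append(entity_id)
    sql += " ORDER BY id LIMIT ?"
    params.append(limit)
    return [row[0] for row in conn.execute(sql, params)]
```

Appending the filter clause and its parameter together keeps the unscoped query byte-for-byte unchanged when `entity_id` is omitted, which matches the additive behavior the review describes.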
When brain_store(entity_id=...) hit a DB lock and fell back to queue, the entity_id was dropped from the queued payload. On flush, the entity-chunk linkage was silently lost. Found by CodeRabbit on PR #33.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Validate entity_id exists before creating kg_entity_chunks link (prevents dangling rows)
- Update metadata text from "5 tools" to "6 tools"
- Cap entity-scoped KNN effective_k at 1000 to bound query latency
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
    chunk_id=chunk_id,
    relevance=1.0,
    context=f"Stored via brain_store: {memory_type}",
)
Entity validation after chunk commit creates orphaned data
Medium Severity
The entity_id validation (store.get_entity(entity_id)) happens after the chunk and its embedding have already been written to the database. Since the project uses apsw, which auto-commits each statement, both the INSERT INTO chunks and INSERT INTO chunk_vectors are permanently committed before the entity check runs. If the entity doesn't exist, ValueError is raised, the caller returns a "Validation error" to the user, but the chunk is already persisted — orphaned and unlinked. The user, believing the store failed, may retry and create a duplicate.
Moving the entity existence check before the chunk insert would fix the atomicity issue.
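The reordering can be sketched as follows. This uses the standard-library `sqlite3` module rather than APSW, and the table layout, column names, and `store_with_entity` function are assumptions for illustration. The only point demonstrated is the ordering: the existence check runs before any row is written, so a bad `entity_id` leaves nothing behind.

```python
import sqlite3


def make_store_db():
    """Create an in-memory database with an assumed three-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE kg_entities (id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE chunks (id TEXT PRIMARY KEY, content TEXT);
        CREATE TABLE kg_entity_chunks (entity_id TEXT, chunk_id TEXT);
        """
    )
    return conn


def store_with_entity(conn, chunk_id, content, entity_id=None):
    """Validate the entity first, then write the chunk and its link."""
    if entity_id is not None:
        found = conn.execute(
            "SELECT 1 FROM kg_entities WHERE id = ?", (entity_id,)
        ).fetchone()
        if found is None:
            # Fail before any write, so a retry cannot create a duplicate.
            raise ValueError(f"unknown entity_id: {entity_id}")
    conn.execute("INSERT INTO chunks (id, content) VALUES (?, ?)", (chunk_id, content))
    if entity_id is not None:
        conn.execute(
            "INSERT INTO kg_entity_chunks (entity_id, chunk_id) VALUES (?, ?)",
            (entity_id, chunk_id),
        )
    conn.commit()
```

Checking first avoids needing a multi-statement transaction just to undo the chunk insert, which matters when each statement commits on its own.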


Summary
- `brain_search(entity_id=...)` filters results to a specific person's linked chunks via the `kg_entity_chunks` bridge table. Works for both vector and FTS5 hybrid search.
- `brain_get_person` composite tool: new MCP tool that returns a person's full profile (entity metadata, hard_constraints, preferences, contact_info) plus their relevant memories in one call.
- `brain_store`: `brain_store(entity_id=...)` auto-links newly stored chunks to a person entity via `link_entity_chunk()`.
- Person profile schema stored in the `kg_entities.metadata` JSON field; no migration needed.
Context
These upgrades enable the 6PM meeting orchestration trial exercise. Each copilot needs per-person context (profile + memories) to extract scheduling constraints. The entity system now supports scoped search, composite retrieval, and tagged storage — all accessible via MCP tools.
Test plan
- `tests/test_6pm_entity_upgrades.py` covering all 4 features
- (`pytest -x`)
- `test_tool_count` assertion (5 → 6 tools)
🤖 Generated with Claude Code
Note
Medium Risk
Touches core search and store paths by adding `entity_id` filtering (including SQL changes and altered default source scoping), which could impact relevance/performance or raise new validation errors if misused. Mitigated by added coverage and mostly additive behavior when `entity_id` is omitted.
Overview
Adds per-person memory scoping end-to-end: `brain_search` and the underlying `VectorStore.search`/`hybrid_search` now accept `entity_id` to restrict results to chunks linked via `kg_entity_chunks`, with special-casing to bypass routing and broaden candidate retrieval for vector KNN.
Extends writes to support entity-tagged memories: `brain_store`/`store_memory` accept optional `entity_id` and auto-link the new chunk to the entity (validating the entity exists). Introduces a new MCP tool `brain_get_person` that looks up a `person` entity and returns its profile/relations plus scoped memories (optionally ranked by a provided context), and updates/extends tests (including MCP tool count 5→6).
Written by Cursor Bugbot for commit 8285cff.