feat(6pm): entity upgrades — per-person search, brain_get_person, entity-tagged store#33

Merged
EtanHey merged 4 commits into main from feature/6pm-entity-upgrades on Feb 26, 2026

Conversation

@EtanHey
Owner

@EtanHey EtanHey commented Feb 26, 2026

Summary

  • Per-person memory scoping: brain_search(entity_id=...) filters results to a specific person's linked chunks via the kg_entity_chunks bridge table. Works for both the vector and the FTS5 side of hybrid search.
  • brain_get_person composite tool: New MCP tool that returns a person's full profile (entity metadata, hard_constraints, preferences, contact_info) + their relevant memories in one call.
  • Entity-tagged brain_store: brain_store(entity_id=...) auto-links newly stored chunks to a person entity via link_entity_chunk().
  • Person profile schema convention: Structured metadata on existing kg_entities.metadata JSON field — no migration needed.
  • 19 new tests (525 total passing, 0 failures)
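
The per-person scoping in the summary above can be pictured with a small, self-contained sketch. Only the table name kg_entity_chunks and the entity_id parameter come from this PR; the column names, the LIKE-based text match, and the search_chunks helper are assumptions made for illustration, and the stdlib sqlite3 module stands in for the project's real storage layer.

```python
import sqlite3

# Hypothetical, simplified schema. Only the table name kg_entity_chunks is
# taken from the PR; the columns are assumed for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chunks (id TEXT PRIMARY KEY, content TEXT);
    CREATE TABLE kg_entity_chunks (entity_id TEXT, chunk_id TEXT, relevance REAL);
    """
)
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("c1", "Person A prefers morning meetings"),
        ("c2", "Person B cannot meet on Fridays"),
        ("c3", "General notes about meetings"),
    ],
)
conn.executemany(
    "INSERT INTO kg_entity_chunks VALUES (?, ?, ?)",
    [("person-a", "c1", 1.0), ("person-b", "c2", 1.0)],
)


def search_chunks(conn, text, entity_id=None):
    """Return matching chunk ids, optionally restricted to one entity's chunks."""
    sql = "SELECT id FROM chunks WHERE content LIKE ?"
    params = [f"%{text}%"]
    if entity_id is not None:
        # The bridge table decides which chunks belong to the person.
        sql += " AND id IN (SELECT chunk_id FROM kg_entity_chunks WHERE entity_id = ?)"
        params.append(entity_id)
    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]


unscoped = search_chunks(conn, "meet")
scoped = search_chunks(conn, "meet", entity_id="person-a")
print(unscoped, scoped)
```

When entity_id is omitted the query is unchanged, which matches the "mostly additive" behaviour the risk note below describes.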

Context

These upgrades enable the 6PM meeting orchestration trial exercise. Each copilot needs per-person context (profile + memories) to extract scheduling constraints. The entity system now supports scoped search, composite retrieval, and tagged storage — all accessible via MCP tools.

Test plan

  • 19 new tests in tests/test_6pm_entity_upgrades.py covering all 4 features
  • 525 total tests passing (pytest -x)
  • Updated test_tool_count assertion (5 → 6 tools)
  • CodeRabbit review

🤖 Generated with Claude Code


Note

Medium Risk
Touches core search and store paths by adding entity_id filtering (including SQL changes and altered default source scoping), which could impact relevance/performance or raise new validation errors if misused. Mitigated by added coverage and mostly additive behavior when entity_id is omitted.

Overview
Adds per-person memory scoping end-to-end: brain_search and the underlying VectorStore.search/hybrid_search now accept entity_id to restrict results to chunks linked via kg_entity_chunks, with special-casing to bypass routing and broaden candidate retrieval for vector KNN.

Extends writes to support entity-tagged memories: brain_store/store_memory accept optional entity_id and auto-link the new chunk to the entity (validating the entity exists). Introduces a new MCP tool brain_get_person that looks up a person entity and returns its profile/relations plus scoped memories (optionally ranked by a provided context), and updates/extends tests (including MCP tool count 5→6).

Written by Cursor Bugbot for commit 8285cff. This will update automatically on new commits.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced "Get Person" tool to retrieve person profiles and associated memories with contextual ranking
    • Extended search to support person-scoped filtering, enabling targeted memory retrieval within a specific person's context
    • Memory storage now supports linking to specific people for organized recall
  • Tests

    • Added comprehensive tests for entity upgrades, person-scoped search, and memory linking functionality

…ity-tagged store

4 features for 6PM trial exercise (Avi Simon meeting scheduling):

1. Person profile schema convention on kg_entities metadata JSON:
   { hard_constraints, preferences, contact_info }

2. Per-person memory scoping: brain_search(entity_id=...) filters
   both semantic (KNN) and FTS5 to entity-linked chunks only.
   Added entity_id param to search(), hybrid_search(), _search(),
   _brain_search(). Bypasses routing rules, bumps k-value 10x.

3. brain_get_person composite MCP tool: one call returns profile +
   constraints + relations + scoped memories. Designed for copilot
   agents that need full person context.

4. Entity-tagged brain_store(entity_id=...): stores chunk then
   auto-links to entity via link_entity_chunk().
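
As a rough illustration of item 1, the profile convention could be written to and read from the metadata JSON field like this. The three key names come from the commit message; the value shapes (a list for hard_constraints, dicts for the other two) and both helper functions are assumptions, not the project's actual code.

```python
import json

# Key names are from the commit message; everything else is hypothetical.
PROFILE_KEYS = ("hard_constraints", "preferences", "contact_info")


def make_person_metadata(hard_constraints=None, preferences=None, contact_info=None):
    """Serialize a person profile for storage in kg_entities.metadata."""
    return json.dumps(
        {
            "hard_constraints": hard_constraints or [],
            "preferences": preferences or {},
            "contact_info": contact_info or {},
        }
    )


def read_person_profile(metadata_json):
    """Parse the metadata field, tolerating entities without a profile."""
    raw = json.loads(metadata_json) if metadata_json else {}
    return {key: raw.get(key) for key in PROFILE_KEYS}


stored = make_person_metadata(
    hard_constraints=["no meetings on Friday"],
    preferences={"time_of_day": "morning"},
)
profile = read_person_profile(stored)
empty_profile = read_person_profile(None)
```

Because the convention lives inside an existing JSON column, entities created before this change simply read back with missing keys instead of requiring a migration.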

19 new tests (525 total passing). Zero breaking changes.

Source: avi-trial (6PM scheduling startup trial exercise)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Feb 26, 2026

📝 Walkthrough


Introduces entity-scoped memory management by adding the brain_get_person composite tool, extending brain_search and memory storage with entity_id parameter for per-entity filtering and linking, implementing entity-aware retrieval through vector store and knowledge graph linkage, and adding comprehensive tests for entity isolation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core MCP Tool Implementation: `src/brainlayer/mcp/__init__.py` | Added brain_get_person composite tool (entity lookup → memory retrieval), extended brain_search with entity_id filtering, updated call_tool dispatch logic, enhanced _brain_search and _store to accept and propagate entity_id, and registered new tool in list_tools. |
| Vector Store Entity Scoping: `src/brainlayer/vector_store.py` | Extended search and hybrid_search methods with optional entity_id parameter; added WHERE clauses filtering chunks through kg_entity_chunks table; adjusted KNN over-fetching and FTS fusion logic to respect per-entity constraints. |
| Memory Storage Entity Linking: `src/brainlayer/store.py` | Added optional entity_id parameter to store_memory; after inserting a chunk, links it to the entity via kg_entity_chunks with relevance 1.0 when entity_id is provided. |
| Test Coverage: `tests/test_6pm_entity_upgrades.py`, `tests/test_think_recall_integration.py` | New test module covering person profile schema validation, per-person memory scoping, entity-tagged store behavior, brain_get_person composite logic, and multi-person isolation; updated tool count assertion in integration tests (5 → 6 tools). |

Sequence Diagram

sequenceDiagram
    participant Client
    participant MCP as MCP Layer
    participant KG as KG / Entity<br/>Lookup
    participant VS as Vector Store
    participant Store as Memory Store
    
    Client->>MCP: call_tool("brain_get_person",<br/>name, context, num_memories)
    MCP->>KG: entity_lookup(name)
    KG-->>MCP: entity_id, profile, relations
    
    alt context provided
        MCP->>VS: hybrid_search(context,<br/>entity_id=entity_id)
        VS-->>MCP: ranked memories (entity-scoped)
    else no context
        MCP->>VS: search(entity_id=entity_id)
        VS-->>MCP: entity-linked chunks
    end
    
    MCP-->>Client: JSON(entity_id, name, profile,<br/>relations, memories, count)
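
The flow in the sequence diagram can be sketched in plain Python. This is a toy under stated assumptions: FakeStore, its data, and every method signature here are hypothetical stand-ins, and the no-context branch uses a get_entity_chunks-style lookup where the diagram shows a scoped search call. Only the tool name, its three arguments, and the fields of the returned JSON are taken from the diagram.

```python
import json


class FakeStore:
    """Hypothetical in-memory stand-in for the knowledge graph and vector store."""

    def __init__(self):
        self.entities = {
            "Example Person": {
                "id": "person-1",
                "profile": {"hard_constraints": ["no Fridays"], "preferences": {}, "contact_info": {}},
                "relations": [],
            }
        }
        self.linked_chunks = {"person-1": ["prefers morning meetings", "out of office on Friday"]}

    def entity_lookup(self, name):
        return self.entities.get(name)

    def hybrid_search(self, query, entity_id, n_results):
        # Toy ranking: count query words that occur in each entity-linked chunk.
        words = query.lower().split()
        linked = self.linked_chunks.get(entity_id, [])
        ranked = sorted(linked, key=lambda text: -sum(w in text.lower() for w in words))
        return ranked[:n_results]

    def get_entity_chunks(self, entity_id, limit):
        return self.linked_chunks.get(entity_id, [])[:limit]


def brain_get_person(store, name, context=None, num_memories=5):
    """Entity lookup first, then scoped memories, returned as one JSON payload."""
    entity = store.entity_lookup(name)
    if entity is None:
        return json.dumps({"error": f"No person entity found for {name!r}"})
    if context:
        memories = store.hybrid_search(context, entity_id=entity["id"], n_results=num_memories)
    else:
        memories = store.get_entity_chunks(entity["id"], limit=num_memories)
    return json.dumps(
        {
            "entity_id": entity["id"],
            "name": name,
            "profile": entity["profile"],
            "relations": entity["relations"],
            "memories": memories,
            "count": len(memories),
        }
    )


ranked = json.loads(brain_get_person(FakeStore(), "Example Person", context="friday", num_memories=1))
unranked = json.loads(brain_get_person(FakeStore(), "Example Person"))
missing = json.loads(brain_get_person(FakeStore(), "Nobody"))
```

The point of the composite shape is that a copilot agent makes one call instead of a lookup followed by a separate scoped search.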

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 A curious rabbit hops through memory lanes,
Tracking each person's thoughts and refrains,
Entity by entity, memory by name,
Brain_get_person brings scoped knowledge to the game!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and comprehensively summarizes the main changes: adding entity upgrades including per-person search, the new brain_get_person composite tool, and entity-tagged storage. It is specific, concise, and clearly conveys the primary objectives. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/brainlayer/mcp/__init__.py (1)

2271-2301: ⚠️ Potential issue | 🟠 Major

DB-busy fallback drops entity_id, so entity links can be silently lost.

Immediate writes pass entity_id, but queued writes/replay do not. If the DB is busy, brain_store(..., entity_id=...) will store memory without person linkage.

🛠️ Proposed fix
@@ def _flush_pending_stores(store, embed_fn) -> int:
             store_memory(
                 store=store,
                 embed_fn=embed_fn,
                 content=item["content"],
                 memory_type=item["memory_type"],
                 project=item.get("project"),
                 tags=item.get("tags"),
                 importance=item.get("importance"),
                 confidence_score=item.get("confidence_score"),
                 outcome=item.get("outcome"),
                 reversibility=item.get("reversibility"),
                 files_changed=item.get("files_changed"),
+                entity_id=item.get("entity_id"),
             )
@@ async def _store(...):
             _queue_store(
                 {
                     "content": content,
                     "memory_type": memory_type,
                     "project": _normalize_project_name(project),
                     "tags": tags,
                     "importance": importance,
                     "confidence_score": confidence_score,
                     "outcome": outcome,
                     "reversibility": reversibility,
                     "files_changed": files_changed,
+                    "entity_id": entity_id,
                 }
             )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/mcp/__init__.py` around lines 2271 - 2301, The DB-busy
fallback path is dropping entity_id when buffering/replaying writes so person
links are lost; ensure the buffering and replay code includes entity_id and
passes it through to store_memory (and any call sites like brain_store) exactly
as the immediate path does. Update the code that serializes memories to the
JSONL buffer to include the entity_id field, and update the replay/path that
reconstructs calls to call store_memory(..., entity_id=entity_id) (or forward
entity_id into the lambda used with loop.run_in_executor) so queued writes
preserve and pass entity_id through end-to-end.
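
A stripped-down sketch of the failure mode and its fix, assuming a list of JSON strings in place of the real JSONL buffer. The function names mirror the ones in the proposed diff, but their signatures and the recorder used in place of store_memory are hypothetical.

```python
import json


def queue_store(buffer, item):
    """Buffer a write while the DB is busy; one JSON document per entry."""
    buffer.append(json.dumps(item))


def flush_pending_stores(buffer, store_memory):
    """Replay buffered writes, forwarding entity_id so the link is not lost."""
    flushed = 0
    while buffer:
        item = json.loads(buffer.pop(0))
        store_memory(
            content=item["content"],
            memory_type=item["memory_type"],
            # .get() keeps payloads queued before this change replayable.
            entity_id=item.get("entity_id"),
        )
        flushed += 1
    return flushed


calls = []


def recording_store_memory(**kwargs):
    # Stand-in for the real store_memory: just records what it was given.
    calls.append(kwargs)


pending = []
queue_store(pending, {"content": "prefers mornings", "memory_type": "note", "entity_id": "person-1"})
queue_store(pending, {"content": "legacy payload", "memory_type": "note"})
flushed_count = flush_pending_stores(pending, recording_store_memory)
```

Both halves matter: the key has to be written into the queued payload and read back on replay, otherwise the chunk is stored but never linked.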
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/brainlayer/mcp/__init__.py`:
- Around line 717-751: Update the module-level metadata string that still
references the old number of MCP tools so it reflects the current public surface
of 6 tools; locate the explanatory text near the Tool definitions (e.g., around
the Tool entries like name="brain_get_person") and change the count in that
metadata/docstring or metadata variable to "6" (and, if present, make the text
generic or auto-derived so future additions don't get out of sync).

In `@src/brainlayer/store.py`:
- Around line 142-149: Validate entity_id before calling
store.link_entity_chunk: ensure entity_id is non-empty and corresponds to an
existing entity in the store (e.g., call a lookup like
store.get_entity(entity_id) or store.entity_exists(entity_id) and only call
store.link_entity_chunk with chunk_id when that check returns true); if the
lookup fails, skip linking and log or return a warning to avoid creating
dangling kg_entity_chunks rows.

In `@src/brainlayer/vector_store.py`:
- Around line 628-633: The current calculation of effective_k (effective_k =
n_results * 10 if entity_id else n_results) can grow unbounded and cause
expensive KNN scans; add a sane cap by defining a MAX_K_FANOUT constant (e.g.
500 or another sensible limit) and replace effective_k with min(effective_k,
MAX_K_FANOUT) before building params; update the variable used in params =
[query_bytes, effective_k] + filter_params so entity-scoped fan-out is limited
while still over-fetching for entity_id cases.
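
The fan-out cap from the vector_store.py comment might look like the following. The 10x multiplier comes from the commit message and the value 1000 from the follow-up commit (the review itself floated 500); scoped_knn and its post-filtering step are assumptions about how over-fetching interacts with the entity filter, not the project's code.

```python
FANOUT = 10          # entity-scoped searches over-fetch 10x, per the commit message
MAX_K_FANOUT = 1000  # cap chosen in the follow-up commit


def effective_k(n_results, entity_id=None):
    """How many KNN candidates to request before entity filtering."""
    if not entity_id:
        return n_results
    return min(n_results * FANOUT, MAX_K_FANOUT)


def scoped_knn(distances, linked_ids, n_results, entity_id=None):
    """Toy KNN: take the k nearest chunk ids, then keep only entity-linked ones."""
    k = effective_k(n_results, entity_id)
    nearest = sorted(distances, key=distances.get)[:k]
    if entity_id:
        nearest = [chunk_id for chunk_id in nearest if chunk_id in linked_ids]
    return nearest[:n_results]


# 50 chunks at increasing distance; the person's chunks are not among the closest.
distances = {f"c{i}": float(i) for i in range(50)}
linked = {"c12", "c15"}
hits = scoped_knn(distances, linked, n_results=2, entity_id="person-1")
```

Without over-fetching, k would be 2 and neither linked chunk would survive the filter; the cap only bounds how far that compensation can grow for large n_results.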

---

Outside diff comments:
In `@src/brainlayer/mcp/__init__.py`:
- Around line 2271-2301: The DB-busy fallback path is dropping entity_id when
buffering/replaying writes so person links are lost; ensure the buffering and
replay code includes entity_id and passes it through to store_memory (and any
call sites like brain_store) exactly as the immediate path does. Update the code
that serializes memories to the JSONL buffer to include the entity_id field, and
update the replay/path that reconstructs calls to call store_memory(...,
entity_id=entity_id) (or forward entity_id into the lambda used with
loop.run_in_executor) so queued writes preserve and pass entity_id through
end-to-end.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c5db6f and 2fe796f.

📒 Files selected for processing (5)
  • src/brainlayer/mcp/__init__.py
  • src/brainlayer/store.py
  • src/brainlayer/vector_store.py
  • tests/test_6pm_entity_upgrades.py
  • tests/test_think_recall_integration.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.13)
  • GitHub Check: Cursor Bugbot
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Run tests using pytest from the project root

Files:

  • tests/test_think_recall_integration.py
  • tests/test_6pm_entity_upgrades.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use ruff check src/ for linting and ruff format src/ for code formatting

Files:

  • src/brainlayer/vector_store.py
  • src/brainlayer/store.py
  • src/brainlayer/mcp/__init__.py
src/brainlayer/vector_store.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use sqlite-vec with APSW for vector storage, WAL mode, and PRAGMA busy_timeout = 5000 for concurrent multi-process safety

Files:

  • src/brainlayer/vector_store.py
src/brainlayer/mcp/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names

Files:

  • src/brainlayer/mcp/__init__.py
🧠 Learnings (1)
📚 Learning: 2026-02-23T16:51:38.317Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-23T16:51:38.317Z
Learning: Applies to src/brainlayer/mcp/**/*.py : Implement MCP server with brain_search, brain_store, and brain_recall tools, maintaining backward compatibility with old brainlayer_* tool names

Applied to files:

  • tests/test_think_recall_integration.py
  • src/brainlayer/mcp/__init__.py
🧬 Code graph analysis (4)
tests/test_think_recall_integration.py (1)
src/brainlayer/mcp/__init__.py (1)
  • list_tools (527-870)
tests/test_6pm_entity_upgrades.py (2)
src/brainlayer/vector_store.py (7)
  • VectorStore (72-2562)
  • serialize_f32 (67-69)
  • upsert_entity (2131-2170)
  • link_entity_chunk (2205-2223)
  • get_entity (2225-2244)
  • search (551-750)
  • get_entity_chunks (2335-2364)
src/brainlayer/store.py (1)
  • store_memory (33-154)
src/brainlayer/store.py (1)
src/brainlayer/vector_store.py (1)
  • link_entity_chunk (2205-2223)
src/brainlayer/mcp/__init__.py (2)
src/brainlayer/pipeline/digest.py (1)
  • entity_lookup (231-292)
src/brainlayer/vector_store.py (2)
  • hybrid_search (868-1080)
  • get_entity_chunks (2335-2364)
🔇 Additional comments (4)
src/brainlayer/vector_store.py (1)

585-587: Entity scoping is applied consistently across semantic, text, and FTS branches.

Nice implementation: all three retrieval paths now enforce entity_id filtering with parameterized SQL.

Also applies to: 652-654, 909-910, 926-930

tests/test_think_recall_integration.py (1)

249-255: Tool discovery assertion update is correct and useful.

This protects against accidental removal of brain_get_person from the registered MCP tools list.

tests/test_6pm_entity_upgrades.py (1)

201-282: Excellent coverage for entity scoping and isolation paths.

These tests meaningfully protect the new entity_id behavior across search, store linking, and multi-person separation.

Also applies to: 287-367, 443-505

src/brainlayer/mcp/__init__.py (1)

1313-1354: Entity-scoped routing behavior is well-implemented.

Bypassing heuristic routing and forwarding entity_id directly to _search makes per-person queries deterministic and avoids accidental route mismatches.

Also applies to: 1594-1614

EtanHey and others added 2 commits February 26, 2026 15:13
When brain_store(entity_id=...) hit a DB lock and fell back to queue,
the entity_id was dropped from the queued payload. On flush, the
entity-chunk linkage was silently lost.

Found by CodeRabbit on PR #33.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Validate entity_id exists before creating kg_entity_chunks link (prevents dangling rows)
- Update metadata text from "5 tools" to "6 tools"
- Cap entity-scoped KNN effective_k at 1000 to bound query latency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@EtanHey EtanHey merged commit 3ebe5df into main Feb 26, 2026
6 checks passed
@EtanHey EtanHey deleted the feature/6pm-entity-upgrades branch February 26, 2026 13:16

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Comment thread src/brainlayer/store.py
chunk_id=chunk_id,
relevance=1.0,
context=f"Stored via brain_store: {memory_type}",
)

Entity validation after chunk commit creates orphaned data

Medium Severity

The entity_id validation (store.get_entity(entity_id)) happens after the chunk and its embedding have already been written to the database. Since the project uses apsw, which auto-commits each statement, both the INSERT INTO chunks and INSERT INTO chunk_vectors are permanently committed before the entity check runs. If the entity doesn't exist, ValueError is raised, the caller returns a "Validation error" to the user, but the chunk is already persisted — orphaned and unlinked. The user, believing the store failed, may retry and create a duplicate.

Moving the entity existence check before the chunk insert would fix the atomicity issue.
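
A minimal sketch of the suggested ordering, using the stdlib sqlite3 module rather than APSW and a simplified, hypothetical schema (only the table name kg_entity_chunks and the relevance value 1.0 are taken from this PR). The store_memory signature here is illustrative and omits embeddings entirely.

```python
import sqlite3


def store_memory(conn, content, entity_id=None):
    """Validate the entity first, so a bad entity_id never leaves an orphaned chunk."""
    if entity_id is not None:
        found = conn.execute("SELECT 1 FROM kg_entities WHERE id = ?", (entity_id,)).fetchone()
        if found is None:
            # Nothing has been written yet, so the caller can safely retry.
            raise ValueError(f"Unknown entity_id: {entity_id}")
    cursor = conn.execute("INSERT INTO chunks (content) VALUES (?)", (content,))
    chunk_id = cursor.lastrowid
    if entity_id is not None:
        conn.execute(
            "INSERT INTO kg_entity_chunks (entity_id, chunk_id, relevance) VALUES (?, ?, ?)",
            (entity_id, chunk_id, 1.0),
        )
    conn.commit()
    return chunk_id


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE kg_entities (id TEXT PRIMARY KEY);
    CREATE TABLE chunks (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE kg_entity_chunks (entity_id TEXT, chunk_id INTEGER, relevance REAL);
    INSERT INTO kg_entities VALUES ('person-1');
    """
)

try:
    store_memory(conn, "should not be stored", entity_id="missing")
    rejected = False
except ValueError:
    rejected = True

chunks_after_reject = conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
linked_id = store_memory(conn, "prefers mornings", entity_id="person-1")
links = conn.execute("SELECT entity_id, chunk_id, relevance FROM kg_entity_chunks").fetchall()
```

Checking before the first write sidesteps the auto-commit concern entirely, since there is nothing to roll back when validation fails.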

