feat: add parent_id to kg_entities + expand canonical relation types by EtanHey · Pull Request #219 · EtanHey/brainlayer

EtanHey · 2026-04-06T09:48:56Z

Summary

Added parent_id column to kg_entities table with index for instance-level hierarchy
Added KGMixin methods: get_entity_parent(), get_entity_children(), set_entity_parent()
Expanded CANONICAL_RELATION_TYPES from 8 to 14: added depends_on, spawns, created, lives_in, leads, freelances_for
Added _RELATION_TYPE_ALIASES mapping: ceo_of→leads, cto_of→leads, worked_at→works_at, framework_for→depends_on, etc.
Wired parent/children info into brain_entity MCP output

Context

P2 #7 from brainlayer-r75-r78-unimplemented.md

Test plan

pytest tests/test_entity_parent_relations.py -v
ruff check src/ tests/ clean
CodeRabbit review addressed

🤖 Generated with Claude Code

Note

Add `parent_id` to kg_entities and expand canonical relation types

Adds a parent_id column to the kg_entities table (with index) via schema migration in vector_store.py, enabling hierarchical entity relationships.
Extends KGMixin in kg_repo.py with get_entity_children, get_entity_parent, and set_entity_parent methods, and updates upsert_entity/get_entity/get_entity_by_name to include parent_id.
The entity lookup MCP handler in entity_handler.py now attaches parent and children to results, and _format.py renders them in output.
Adds new canonical relation types (depends_on, spawns, created, lives_in, leads, freelances_for) and alias mappings (e.g. ceo_of→leads, worked_at→works_at) in kg_extraction.py.
Risk: get_entity_parent contains malformed SQL (stray + characters), so any entity lookup with a parent_id will raise a SQL error at runtime.

Macroscope summarized 31f275e.

Summary by CodeRabbit

Release Notes

New Features
- Added entity hierarchy support—entities can now have parent-child relationships.
- Enhanced entity display to show parent and child entity information when available.
- Added new relation types: depends_on, spawns, created, lives_in, leads, and freelances_for.
- Automatic normalization of legacy relation type aliases to their canonical forms.

coderabbitai · 2026-04-06T09:49:04Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4ac74dcf-55f2-44c2-9f40-b1d5723b5d81

📥 Commits

Reviewing files that changed from the base of the PR and between 429d43f and 31f275e.

📒 Files selected for processing (7)

src/brainlayer/kg_repo.py
src/brainlayer/mcp/_format.py
src/brainlayer/mcp/entity_handler.py
src/brainlayer/pipeline/kg_extraction.py
src/brainlayer/vector_store.py
tests/test_entity_parent_relations.py
tests/test_kg_schema.py

📝 Walkthrough

Walkthrough

The PR adds entity hierarchy support to the knowledge graph system by introducing a parent_id column to kg_entities, implementing parent/child retrieval methods, normalizing additional relation types in the extraction pipeline, and enriching MCP entity responses with hierarchical information.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Entity Hierarchy Storage & Retrieval `src/brainlayer/kg_repo.py`, `src/brainlayer/vector_store.py` | Added `parent_id` parameter to `upsert_entity`, updated `get_entity` and `get_entity_by_name` to return `parent_id`, and introduced three new methods: `get_entity_children`, `get_entity_parent`, and `set_entity_parent`. Schema migration adds `parent_id TEXT` column and index to `kg_entities`. |
| Entity Hierarchy Formatting & Enrichment `src/brainlayer/mcp/entity_handler.py`, `src/brainlayer/mcp/_format.py` | Entity handler now fetches and attaches parent and children data to lookup results. Format function conditionally renders "Parent" and "Children" sections in entity output when hierarchical data is present. |
| Relation Type Normalization `src/brainlayer/pipeline/kg_extraction.py` | Expanded canonical relation types to include `depends_on`, `spawns`, `created`, `lives_in`, `leads`, `freelances_for`. Added `_RELATION_TYPE_ALIASES` and normalization logic in `validate_extraction_result` to rewrite legacy types (`ceo_of`, `worked_at`, `framework_for`, etc.) to their canonical equivalents. |
| Test Coverage `tests/test_entity_parent_relations.py`, `tests/test_kg_schema.py` | New test file validates parent/child persistence, retrieval, and relation type normalization. Schema test updated to expect `parent_id` column in `kg_entities`. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant MCP as MCP Handler
    participant KGRepo as KG Repository
    participant DB as SQLite DB
    participant Format as Format Service

    User->>MCP: Request entity details
    MCP->>KGRepo: entity_lookup(entity_name)
    KGRepo->>DB: SELECT entity data
    DB-->>KGRepo: entity record
    
    alt Entity has parent
        MCP->>KGRepo: get_entity_parent(entity_id)
        KGRepo->>DB: SELECT parent entity
        DB-->>KGRepo: parent record
        KGRepo-->>MCP: parent dict
    end
    
    MCP->>KGRepo: get_entity_children(entity_id)
    KGRepo->>DB: SELECT children (parent_id = ?)
    DB-->>KGRepo: children records
    KGRepo-->>MCP: children list
    
    MCP->>Format: format_entity_simple(enriched_entity)
    Format->>Format: Render parent section (if exists)
    Format->>Format: Render children section (if exists)
    Format-->>User: Formatted entity with hierarchy

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

PR #29: Adds the foundational kg_entities table and entity CRUD APIs (upsert_entity, get_entity) that this PR directly extends with parent/child hierarchy methods and schema columns.
PR #47: Modifies the KG extraction pipeline (kg_extraction.py) to define relation types and validation; this PR updates the same module to add new canonical relation types and normalization aliases.
PR #218: Modifies the KG repository and MCP entity handler to expand entity-facing functionality; this PR adds parent/child hierarchy methods and MCP enrichment in the same classes.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the two main changes: adding parent_id support to kg_entities and expanding canonical relation types. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 95.45% which is sufficient. The required threshold is 80.00%. |

macroscopeapp · 2026-04-06T09:51:36Z

🟡 Medium

brainlayer/src/brainlayer/mcp/entity_handler.py

Line 27 in 6bdf895

try:

The new database calls store.get_entity(), store.get_entity_parent(), and store.get_entity_children() on lines 44-50 are outside the try/except block that wraps entity_lookup. If any of these raise an exception, it propagates unhandled and crashes the handler instead of returning _error_result. Consider wrapping lines 44-52 in the existing try/except or adding a separate error handler.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file src/brainlayer/mcp/entity_handler.py around line 27: The new database calls `store.get_entity()`, `store.get_entity_parent()`, and `store.get_entity_children()` on lines 44-50 are outside the `try/except` block that wraps `entity_lookup`. If any of these raise an exception, it propagates unhandled and crashes the handler instead of returning `_error_result`. Consider wrapping lines 44-52 in the existing try/except or adding a separate error handler. Evidence trail: src/brainlayer/mcp/entity_handler.py lines 1-70 at REVIEWED_COMMIT: try/except block spans lines 26-35 (wrapping entity_lookup), database calls store.get_entity() at line 42, store.get_entity_parent() at line 44, store.get_entity_children() at line 48 are all outside this try/except block.
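One way to address this finding is to route the hierarchy lookups through a single guarded helper. The sketch below is illustrative only: `attach_hierarchy`, `FailingStore`, and the shape returned by this stand-in `_error_result` are assumptions, since the real handler's signature and error format are not shown in this thread.

```python
from typing import Any


def _error_result(message: str) -> dict[str, Any]:
    # Stand-in for the handler's real error helper; its actual shape is assumed.
    return {"is_error": True, "message": message}


def attach_hierarchy(store: Any, entity_id: str, result: dict[str, Any]) -> dict[str, Any]:
    """Attach parent/children to result, turning store failures into an error result."""
    try:
        record = store.get_entity(entity_id)
        # Only look up the parent when the entity record actually has one.
        if record and record.get("parent_id"):
            parent = store.get_entity_parent(entity_id)
            if parent:
                result["parent"] = parent
        children = store.get_entity_children(entity_id)
        if children:
            result["children"] = children
    except Exception as exc:  # broad by design: any store failure becomes an error result
        return _error_result(f"Entity hierarchy lookup failed: {exc}")
    return result


class FailingStore:
    """Hypothetical store whose first call fails, used to exercise the error path."""

    def get_entity(self, entity_id: str) -> dict[str, Any]:
        raise RuntimeError("database is locked")


outcome = attach_hierarchy(FailingStore(), "entity-1", {"name": "example"})
print(outcome["is_error"])
```

With this shape the handler returns a structured error instead of letting the exception escape, which is the behavior the comment asks for.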

macroscopeapp · 2026-04-06T09:51:36Z

+        """Set the parent of an entity."""
+        cursor = self.conn.cursor()
+        cursor.execute(
+            "UPDATE kg_entities SET parent_id = ?, updated_at = strftime('%Y-%m-%dT%H:%M:%fZ','now') WHERE id = ?",


🟢 Low brainlayer/kg_repo.py:351

The updated_at timestamp format differs based on which method updates the entity. upsert_entity uses Python's strftime("%Y-%m-%dT%H:%M:%S.%fZ") producing 6-digit microseconds, while set_entity_parent uses SQLite's strftime('%Y-%m-%dT%H:%M:%fZ','now') where %f is seconds with 3-digit milliseconds. This creates inconsistent timestamp formats in the same table.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file src/brainlayer/kg_repo.py around line 351: The `updated_at` timestamp format differs based on which method updates the entity. `upsert_entity` uses Python's `strftime("%Y-%m-%dT%H:%M:%S.%fZ")` producing 6-digit microseconds, while `set_entity_parent` uses SQLite's `strftime('%Y-%m-%dT%H:%M:%fZ','now')` where `%f` is seconds with 3-digit milliseconds. This creates inconsistent timestamp formats in the same table. Evidence trail: 1. src/brainlayer/kg_repo.py line 44: `now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")` (Python strftime with %f = 6-digit microseconds) 2. src/brainlayer/kg_repo.py line 351: `strftime('%Y-%m-%dT%H:%M:%fZ','now')` (SQLite strftime where %f = SS.SSS format) 3. SQLite documentation (https://sqlite.org/lang_datefunc.html): "%f fractional seconds: SS.SSS" confirms SQLite %f includes seconds with 3-digit milliseconds 4. Python documentation: %f is microseconds as 6 digits (000000-999999)
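The difference in precision can be reproduced with the two format strings quoted above. This is a minimal sketch using the standard-library `sqlite3` module; `fraction_digits` and `utc_now_millis` are hypothetical helpers, and the millisecond-precision variant is only one possible way to align the two writers, not necessarily what the project adopted.

```python
import sqlite3
from datetime import datetime, timezone


def fraction_digits(timestamp: str) -> int:
    """Count the digits between the last '.' and the trailing 'Z'."""
    return len(timestamp.rsplit(".", 1)[1].rstrip("Z"))


def utc_now_millis() -> str:
    """UTC timestamp with millisecond precision, matching the width of SQLite's %f."""
    now = datetime.now(timezone.utc)
    return now.isoformat(timespec="milliseconds").replace("+00:00", "Z")


# Format used by upsert_entity according to the review comment (Python strftime).
python_ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")

# Format used by set_entity_parent according to the review comment (SQLite strftime).
conn = sqlite3.connect(":memory:")
sqlite_ts = conn.execute("SELECT strftime('%Y-%m-%dT%H:%M:%fZ', 'now')").fetchone()[0]

print(fraction_digits(python_ts))         # 6 (microseconds)
print(fraction_digits(sqlite_ts))         # 3 (milliseconds)
print(fraction_digits(utc_now_millis()))  # 3
```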

- Add parent_id column to kg_entities with index
- Add get_entity_parent(), get_entity_children(), set_entity_parent() to KGMixin
- Expand CANONICAL_RELATION_TYPES: depends_on, spawns, created, lives_in, leads, freelances_for
- Add _RELATION_TYPE_ALIASES mapping (ceo_of→leads, worked_at→works_at, etc.)
- Wire parent/children into brain_entity output format
- Tests for schema, parent/child queries, relation aliases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/brainlayer/pipeline/kg_extraction.py`:
- Around line 52-60: Update the extraction prompts so they enumerate the current
canonical relation types (including "leads" and "freelances_for") and remove
obsolete types ("deployed_on", "fixes", "configures") referenced in the prompt
text, ensuring the prompt language matches the normalization map
_RELATION_TYPE_ALIASES; specifically edit the prompt strings used in the
entity_extraction module (the entity extraction prompt around where
entities/relations are built) and the prompt in kg_extraction_groq to list only
the canonical relation types and include the new ones so the model emits the
canonical names rather than falling back to "related_to".

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e9cc5430-0307-44c2-9715-ff69d848f4e4

📥 Commits

Reviewing files that changed from the base of the PR and between d673c51 and 429d43f.

📒 Files selected for processing (7)

src/brainlayer/kg_repo.py
src/brainlayer/mcp/_format.py
src/brainlayer/mcp/entity_handler.py
src/brainlayer/pipeline/kg_extraction.py
src/brainlayer/vector_store.py
tests/test_entity_parent_relations.py
tests/test_kg_schema.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: test (3.12)
GitHub Check: test (3.11)
GitHub Check: test (3.13)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests

**/*.py: Use paths.py:get_db_path() for all database path resolution; all scripts and CLI must use this function rather than hardcoding paths
When performing bulk database operations: stop enrichment workers first, checkpoint WAL before and after, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks, and checkpoint every 3 batches

Files:

tests/test_kg_schema.py
src/brainlayer/vector_store.py
src/brainlayer/mcp/_format.py
src/brainlayer/mcp/entity_handler.py
tests/test_entity_parent_relations.py
src/brainlayer/pipeline/kg_extraction.py
src/brainlayer/kg_repo.py

src/brainlayer/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/**/*.py: Use retry logic on SQLITE_BUSY errors; each worker must use its own database connection to handle concurrency safely
Classification must preserve ai_code, stack_trace, and user_message verbatim; skip noise entries entirely and summarize build_log and dir_listing entries (structure only)
Use AST-aware chunking via tree-sitter; never split stack traces; mask large tool output
For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via enrichment_controller.py, and Ollama as offline last-resort; allow override via BRAINLAYER_ENRICH_BACKEND env var
Configure enrichment rate via BRAINLAYER_ENRICH_RATE environment variable (default 0.2 = 12 RPM)
Implement chunk lifecycle columns: superseded_by, aggregated_into, archived_at on chunks table; exclude lifecycle-managed chunks from default search; allow include_archived=True to show history
Implement brain_supersede with safety gate for personal data (journals, notes, health/finance); use soft-delete for brain_archive with timestamp
Add supersedes parameter to brain_store for atomic store-and-replace operations
Run linting and formatting with: ruff check src/ && ruff format src/
Run tests with pytest
Use PRAGMA wal_checkpoint(FULL) before and after bulk database operations to prevent WAL bloat

Files:

src/brainlayer/vector_store.py
src/brainlayer/mcp/_format.py
src/brainlayer/mcp/entity_handler.py
src/brainlayer/pipeline/kg_extraction.py
src/brainlayer/kg_repo.py

🧠 Learnings (4)

📚 Learning: 2026-03-29T23:19:50.743Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-29T23:19:50.743Z
Learning: Applies to src/brainlayer/{vector_store,search}*.py : Chunk lifecycle: implement columns `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search

Applied to files:

src/brainlayer/vector_store.py

📚 Learning: 2026-03-29T23:19:50.743Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-29T23:19:50.743Z
Learning: Applies to src/brainlayer/vector_store.py : Use sqlite-vec with APSW for vector storage and retrieval

Applied to files:

src/brainlayer/vector_store.py

📚 Learning: 2026-04-01T01:24:44.281Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T01:24:44.281Z
Learning: Applies to src/brainlayer/mcp/*.py : MCP tools include: brain_search, brain_store, brain_recall, brain_entity, brain_expand, brain_update, brain_digest, brain_get_person, brain_tags, brain_supersede, brain_archive (legacy brainlayer_* aliases still supported)

Applied to files:

src/brainlayer/mcp/entity_handler.py

📚 Learning: 2026-03-14T02:20:54.656Z

Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-14T02:20:54.656Z
Learning: Treat retrieval correctness, write safety, and MCP stability as critical-path concerns in BrainLayer reviews

Applied to files:

src/brainlayer/mcp/entity_handler.py

🔇 Additional comments (15)

tests/test_kg_schema.py (1)

92-92: LGTM!

The test expectation correctly updated to include the new parent_id column, aligning with the schema migration in vector_store.py.

src/brainlayer/vector_store.py (1)

724-726: LGTM!

The migration correctly adds the parent_id column conditionally and creates an index. The pattern is consistent with other migrations in the file and is idempotent.
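For readers unfamiliar with the pattern, an idempotent column migration in SQLite can look roughly like the sketch below. It uses the standard-library `sqlite3` module rather than APSW, and the function name `ensure_parent_id_column` and index name `idx_kg_entities_parent` are assumptions, not identifiers taken from `vector_store.py`.

```python
import sqlite3


def ensure_parent_id_column(conn: sqlite3.Connection) -> None:
    """Add kg_entities.parent_id and its index if missing; safe to call repeatedly."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(kg_entities)")}
    if "parent_id" not in columns:
        conn.execute("ALTER TABLE kg_entities ADD COLUMN parent_id TEXT")
    # IF NOT EXISTS makes the index creation a no-op on later runs.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_kg_entities_parent ON kg_entities(parent_id)"
    )


demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE kg_entities (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
ensure_parent_id_column(demo)
ensure_parent_id_column(demo)  # second call changes nothing
print([row[1] for row in demo.execute("PRAGMA table_info(kg_entities)")])
```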

src/brainlayer/mcp/entity_handler.py (1)

43-52: LGTM — enrichment logic is sound.

The handler defensively checks entity_record and parent_id before fetching parent, and correctly attaches children when non-empty. The lambda captures are safe since entity_id doesn't change during execution.

Note: The PR objectives mention a SQL bug in get_entity_parent. I'll verify this in the kg_repo.py review.

src/brainlayer/mcp/_format.py (1)

194-203: LGTM!

The formatting logic is defensive with .get() and isinstance() checks, consistent with the file's patterns and safely handles callers that don't provide parent or children keys.
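As an illustration of that defensive style, a renderer for the hierarchy section might be sketched as follows. The function name `render_hierarchy` and the exact output strings are hypothetical; the actual layout produced by `_format.py` is not shown in this review.

```python
from typing import Any


def render_hierarchy(entity: dict[str, Any]) -> list[str]:
    """Return display lines for parent/children, tolerating missing or odd values."""
    lines: list[str] = []

    parent = entity.get("parent")
    if isinstance(parent, dict) and parent.get("name"):
        lines.append(f"Parent: {parent['name']}")

    children = entity.get("children")
    if isinstance(children, list) and children:
        # Skip entries that are not dicts or have no usable name.
        names = [c["name"] for c in children if isinstance(c, dict) and c.get("name")]
        if names:
            lines.append("Children: " + ", ".join(names))

    return lines


print(render_hierarchy({"name": "brainlayer"}))  # no hierarchy keys at all
print(render_hierarchy({"parent": {"name": "golems"}, "children": [{"name": "mcp"}]}))
```

Callers that never set `parent` or `children` get an empty list back, so older code paths keep their existing output.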

src/brainlayer/pipeline/kg_extraction.py (3)

41-46: LGTM — canonical types expanded correctly.

The new relation types align with the PR objectives and are properly integrated into the validation logic.

70-75: Direction rules correctly defined for new types.

The source/target type constraints are sensible for the new relation types.

122-122: Alias normalization correctly applied before canonical check.

The order is correct: normalize aliases first, then validate against canonical types.
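The ordering can be sketched as below. The alias entries are the ones named in this PR; `CANONICAL_SUBSET` is deliberately partial (the six new types plus `works_at`, which appears as an alias target), because the full canonical list is not reproduced on this page. The helper name `normalize_relation_type` and the choice to return `None` for unknown types are assumptions.

```python
from typing import Optional

# Partial stand-in for CANONICAL_RELATION_TYPES: only types named in this PR.
CANONICAL_SUBSET = frozenset(
    {"depends_on", "spawns", "created", "lives_in", "leads", "freelances_for", "works_at"}
)

# Alias entries quoted in the PR description.
RELATION_TYPE_ALIASES = {
    "ceo_of": "leads",
    "cto_of": "leads",
    "worked_at": "works_at",
    "framework_for": "depends_on",
}


def normalize_relation_type(raw: str) -> Optional[str]:
    """Map aliases to canonical names first, then check canonical membership."""
    key = raw.strip().lower()
    key = RELATION_TYPE_ALIASES.get(key, key)  # step 1: alias rewrite
    return key if key in CANONICAL_SUBSET else None  # step 2: canonical check


print(normalize_relation_type("ceo_of"))     # leads
print(normalize_relation_type("worked_at"))  # works_at
print(normalize_relation_type("befriends"))  # None
```

If the canonical check ran first, `ceo_of` would be rejected before the alias table had a chance to rewrite it, which is why the order matters.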

tests/test_entity_parent_relations.py (1)

1-130: Comprehensive test coverage for the new hierarchy features.

The tests cover schema verification, CRUD operations, edge cases (empty children, no parent), ordering behavior, and relation type alias normalization. Well-structured test suite.

src/brainlayer/kg_repo.py (7)

29-29: LGTM — parent_id parameter added to upsert_entity.

The new optional parameter follows the existing pattern of keyword-only arguments with sensible defaults.

39-74: LGTM — SQL updated correctly for parent_id persistence.

The INSERT includes parent_id and the conflict clause uses COALESCE(excluded.parent_id, kg_entities.parent_id) to preserve existing values when the new value is NULL — consistent with how group_id, valid_from, and valid_until are handled.
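The effect of the `COALESCE` clause can be seen in a reduced example. This sketch assumes a three-column table and a conflict target on `id`; the real `upsert_entity` has more columns and its conflict target is not shown here.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kg_entities (id TEXT PRIMARY KEY, name TEXT NOT NULL, parent_id TEXT)"
)


def upsert_entity(entity_id: str, name: str, parent_id: Optional[str] = None) -> None:
    """Insert or update an entity; a NULL parent_id never overwrites a stored one."""
    conn.execute(
        "INSERT INTO kg_entities (id, name, parent_id) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET "
        "name = excluded.name, "
        "parent_id = COALESCE(excluded.parent_id, kg_entities.parent_id)",
        (entity_id, name, parent_id),
    )


def stored_parent(entity_id: str) -> Optional[str]:
    row = conn.execute(
        "SELECT parent_id FROM kg_entities WHERE id = ?", (entity_id,)
    ).fetchone()
    return row[0] if row else None


upsert_entity("child", "Child", parent_id="parent-1")
upsert_entity("child", "Child renamed")  # parent_id omitted, so it stays parent-1
print(stored_parent("child"))
```

A consequence of this design is that an upsert cannot clear a parent by passing `None`; clearing would have to go through a separate update such as `set_entity_parent`.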

174-204: LGTM — get_entity correctly returns parent_id.

The SELECT and return dict now include parent_id at index 13.

206-236: LGTM — get_entity_by_name correctly returns parent_id.

Consistent with get_entity — both methods now include parent_id in the returned entity dict.

315-328: LGTM — get_entity_children implementation is correct.

The query filters by parent_id = ?, includes status check for active entities, orders by importance DESC, name ASC, and respects the limit. The returned dict structure matches what's expected by the formatter.
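A reduced version of such a query is sketched below, assuming a `status` column whose active value is the string `'active'` and a numeric `importance` column; those column details, the default limit, and the returned keys are assumptions rather than the code in `kg_repo.py`.

```python
import sqlite3
from typing import Any

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kg_entities ("
    "id TEXT PRIMARY KEY, name TEXT NOT NULL, importance REAL, "
    "status TEXT DEFAULT 'active', parent_id TEXT)"
)
conn.executemany(
    "INSERT INTO kg_entities (id, name, importance, status, parent_id) VALUES (?, ?, ?, ?, ?)",
    [
        ("p", "Parent", 1.0, "active", None),
        ("c1", "Zeta", 0.9, "active", "p"),
        ("c2", "Beta", 0.5, "active", "p"),
        ("c3", "Alpha", 0.5, "active", "p"),
        ("c4", "Gone", 0.99, "archived", "p"),  # filtered out by the status check
    ],
)


def get_entity_children(parent_id: str, limit: int = 20) -> list[dict[str, Any]]:
    """Active children of parent_id, most important first, ties broken by name."""
    rows = conn.execute(
        "SELECT id, name, importance FROM kg_entities "
        "WHERE parent_id = ? AND status = 'active' "
        "ORDER BY importance DESC, name ASC LIMIT ?",
        (parent_id, limit),
    ).fetchall()
    return [{"id": r[0], "name": r[1], "importance": r[2]} for r in rows]


print([child["name"] for child in get_entity_children("p")])  # ['Zeta', 'Alpha', 'Beta']
```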

347-353: LGTM — set_entity_parent is correct.

Simple UPDATE that sets parent_id and updates updated_at timestamp. Uses write cursor (self.conn.cursor()) appropriately.

330-345: SQL is syntactically correct — no bugs in this method.

The code shows clean SQL with a proper self-join on kg_entities and no stray characters or syntax errors. The PR objectives may reference an older issue that was already fixed, or they may be outdated.
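For reference, a parent lookup via a self-join on `kg_entities` can be sketched as follows. The table is reduced to three columns and the returned dict keys are assumptions; the simplified `set_entity_parent` here also omits the `updated_at` update that the real method performs.

```python
import sqlite3
from typing import Any, Optional

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kg_entities (id TEXT PRIMARY KEY, name TEXT NOT NULL, parent_id TEXT)"
)
conn.executemany(
    "INSERT INTO kg_entities (id, name, parent_id) VALUES (?, ?, ?)",
    [("org", "Organization", None), ("team", "Team", None)],
)


def set_entity_parent(entity_id: str, parent_id: Optional[str]) -> None:
    """Point entity_id at parent_id (simplified: no updated_at handling)."""
    conn.execute(
        "UPDATE kg_entities SET parent_id = ? WHERE id = ?", (parent_id, entity_id)
    )


def get_entity_parent(entity_id: str) -> Optional[dict[str, Any]]:
    """Return the parent row of entity_id, or None when it has no parent."""
    row = conn.execute(
        "SELECT p.id, p.name FROM kg_entities AS c "
        "JOIN kg_entities AS p ON p.id = c.parent_id "
        "WHERE c.id = ?",
        (entity_id,),
    ).fetchone()
    return {"id": row[0], "name": row[1]} if row else None


print(get_entity_parent("team"))  # None before a parent is set
set_entity_parent("team", "org")
print(get_entity_parent("team"))
```

Because the join is an inner join, an entity whose `parent_id` is NULL or points at a missing row simply yields no result rather than an error.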

greptile-apps Bot reviewed Apr 6, 2026

macroscopeapp Bot reviewed Apr 6, 2026

EtanHey and others added 2 commits April 6, 2026 13:15

fix: add parent_id to kg_entities column test

31f275e

EtanHey force-pushed the feat/entity-parent-id-relations branch from 429d43f to 31f275e Compare April 6, 2026 10:15

coderabbitai Bot reviewed Apr 6, 2026

Comment thread src/brainlayer/pipeline/kg_extraction.py

EtanHey merged commit a62b023 into main Apr 6, 2026
6 checks passed

EtanHey merged 2 commits into main from feat/entity-parent-id-relations
