docs: update README — 1,454 tests, 12 MCP tools, entity contracts#120

Merged
EtanHey merged 2 commits into main from docs/march-28-sweep
Mar 28, 2026

Conversation


@EtanHey EtanHey commented Mar 28, 2026

Summary

  • Updated test count: 1,204 -> 1,454 Python tests
  • Updated MCP tool count: 11 -> 12 (added brain_enrich)
  • Updated chunk count: 281K+ -> 284K+
  • Added entity contracts and health scoring to feature list
  • Added MLX as default enrichment backend
  • Added brain_enrich tool to MCP tools table

Test plan

  • Verify badge renders correctly on GitHub
  • Confirm all numbers match pytest --co -q output

Generated with Claude Code

Note

Add entity health scoring, contracts, and new DB schema tables

  • Introduces contracts/entity-types.yaml defining field requirements, relationship expectations, and health criteria for entity types (agent, person, tool, project, concept).
  • Adds scripts/score_entity_health.py which computes a weighted completeness score (required/expected fields, relationship density, chunk density, recency decay) per entity and upserts results into a new entity_health table.
  • Extends VectorStore._init_db in vector_store.py to create entity_contracts, entity_health, and entity_type_hierarchy tables, and adds entity_subtype/status columns to kg_entities and relation_tier/weight to kg_entity_chunks.
  • Updates entity_lookup in digest.py to include completeness_score and health_level from the entity_health table when available.
  • Updates README to reflect 12 MCP tools, 1,454 tests, and documents the new enrichment and entity contracts features.
  • Risk: schema alterations run via ALTER TABLE on every DB init; existing databases will be migrated silently with new default-valued columns.
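As a rough illustration of the weighted completeness score described above, here is a minimal sketch. The weights come from the commit message (required 40%, expected 25%, relationships 15%, chunks 10%, recency 10%, with a 90-day half-life); the function and parameter names are hypothetical, not the script's actual API.

```python
WEIGHTS = {"required": 0.40, "expected": 0.25, "relations": 0.15,
           "chunks": 0.10, "recency": 0.10}
HALF_LIFE_DAYS = 90

def completeness(required_have, required_total, expected_have, expected_total,
                 relation_count, relation_target, chunk_count, chunk_target,
                 days_since_update):
    """Weighted 0-1 completeness score across five dimensions (sketch)."""
    req = required_have / required_total if required_total else 1.0
    exp = expected_have / expected_total if expected_total else 1.0
    rel = min(relation_count / relation_target, 1.0) if relation_target else 1.0
    chk = min(chunk_count / chunk_target, 1.0) if chunk_target else 1.0
    rec = 0.5 ** (days_since_update / HALF_LIFE_DAYS)  # exponential recency decay
    return (WEIGHTS["required"] * req + WEIGHTS["expected"] * exp
            + WEIGHTS["relations"] * rel + WEIGHTS["chunks"] * chk
            + WEIGHTS["recency"] * rec)
```

A fully populated entity updated today scores 1.0; the same entity untouched for one half-life (90 days) loses half of the 10% recency component.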
📊 Macroscope summarized 1ad9a62. 6 files reviewed, 3 issues evaluated, 1 issue filtered, 1 comment posted

🗂️ Filtered Issues

scripts/score_entity_health.py — 0 comments posted, 2 evaluated, 1 filtered
  • line 42: If the YAML file is empty or contains only whitespace/comments, yaml.safe_load(f) returns None. The subsequent data.get("entity_types", {}) will raise AttributeError: 'NoneType' object has no attribute 'get'. [ Skipped comment generation ]
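A minimal guard for the failure mode flagged here — `yaml.safe_load` returning `None` for an empty or comment-only file — might look like this. The helper name is illustrative; the real script's structure may differ.

```python
def load_entity_types(parsed):
    """parsed is whatever yaml.safe_load(f) returned; None for an empty file."""
    data = parsed or {}  # coalesce None to an empty mapping before .get()
    return data.get("entity_types", {})

# Empty YAML file -> safe_load gives None -> no AttributeError, just {}
assert load_entity_types(None) == {}
assert load_entity_types({"entity_types": {"agent": {}}}) == {"agent": {}}
```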

Summary by CodeRabbit

  • New Features

    • Entity health scoring system now tracks completeness and validates entity data integrity.
    • Entity type contracts framework provides structured validation for agent, person, tool, project, and concept entities.
    • New enrichment controller tool (brain_enrich) available.
  • Documentation

    • Updated capabilities and tool count in project documentation.
  • Tests

    • Added comprehensive test suite validating entity contract functionality and health scoring.

EtanHey and others added 2 commits March 28, 2026 20:10
…rarchy

Add entity contract system for evaluating knowledge graph entity completeness:

- SQL schema: entity_contracts, entity_health, entity_type_hierarchy tables
- ALTER: kg_entities +entity_subtype +status; kg_entity_chunks +relation_tier +weight
- ALTER: kg_entity_aliases +valid_from +valid_to
- contracts/entity-types.yaml: YAML contracts for agent, person, tool, project, concept
- scripts/score_entity_health.py: 5-dimension weighted scoring (required 40%, expected 25%,
  relationships 15%, chunks 10%, recency 10%) with Wikidata 5-tier health levels
- brain_entity response now includes completeness_score and health_level
- Type hierarchy seeded with core types + subtypes (golem->agent, platform->tool, etc.)
- 39 new tests (TDD: RED first, all GREEN), 0 regressions on existing suite

R49 research: brainbar-b917124c-913

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tures

Tests: 1,204 -> 1,454 Python. MCP tools: 11 -> 12 (brain_enrich added).
Chunks: 281K+ -> 284K+. Added entity contracts, health scoring, MLX backend.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Mar 28, 2026

📝 Walkthrough


This PR implements R49 Phase 1 entity contract and health scoring infrastructure. It introduces an entity type contracts schema (contracts/entity-types.yaml), a health scoring CLI script that computes completeness metrics, database migrations for new entity-related tables and columns, and extends the entity lookup API with health metadata. Comprehensive tests validate the end-to-end implementation.

Changes

  • Documentation & Contracts — README.md, contracts/entity-types.yaml: Updated README to reflect 12 MCP tools (was 11), 1,454 tests (was 1,204), the new brain_enrich tool, the MLX backend addition, and the "Entity contracts" feature. Added a new versioned R49 entity type contracts file defining required/expected/optional fields, relationship expectations, and health criteria for five entity types.
  • Health Scoring Implementation — scripts/score_entity_health.py: New CLI script that loads entity contracts, iterates entities, computes weighted completeness scores based on field coverage, relationship density, chunk density, and recency (90-day half-life), classifies health levels (1–5), and persists results to the entity_health table via upsert.
  • Core Database & Logic Updates — src/brainlayer/vector_store.py, src/brainlayer/pipeline/digest.py: Database initialization adds three new tables (entity_contracts, entity_health, entity_type_hierarchy) with seeded hierarchy data and conditional DDL migrations to kg_entities (adds entity_subtype, status), kg_entity_chunks (adds relation_tier, weight), and kg_entity_aliases (adds valid_from, valid_to). Entity lookup is enriched with a conditional fetch of completeness_score and health_level from entity_health.
  • Test Coverage — tests/test_entity_contracts.py, tests/test_kg_schema.py: Comprehensive 540-line end-to-end test suite validating schema initialization, contracts loading, health scoring behavior, entity count verification, health classification boundaries, persistence and deduplication, and enriched entity lookup fields. Updated KG schema assertions to include the new columns on kg_entities and kg_entity_chunks.
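The 1–5 health classification mentioned for score_entity_health.py can be sketched as a simple threshold map. The thresholds below are illustrative, not the script's actual cutoffs.

```python
def health_level(score: float) -> int:
    """Map a 0-1 completeness score to a 5-tier, Wikidata-style health level.
    Thresholds are hypothetical placeholders for the sketch."""
    for level, threshold in ((5, 0.9), (4, 0.75), (3, 0.5), (2, 0.25)):
        if score >= threshold:
            return level
    return 1
```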

Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant digest as entity_lookup<br/>(pipeline)
    participant store as VectorStore<br/>(DB)
    participant health as entity_health<br/>(table)
    participant scorer as score_entity_health<br/>(script)

    User->>digest: entity_lookup(entity_name)
    digest->>store: query FTS/semantic search
    store-->>digest: entity results + relations
    digest->>store: fetch health metadata
    store->>health: SELECT completeness_score,<br/>health_level by entity_name
    health-->>store: score data (or NULL)
    store-->>digest: completeness_score, health_level
    digest-->>User: enriched entity result

    User->>scorer: run score_entity_health.py
    scorer->>store: load contracts from YAML
    scorer->>store: iterate kg_entities
    loop per entity
        scorer->>store: fetch fields, relationships, chunks, updated_at
        scorer->>scorer: compute weighted completeness<br/>(fields + relations + chunks + recency)
        scorer->>scorer: classify health level (1-5)
    end
    scorer->>store: upsert entity_health records
    store-->>scorer: persist (commit)
    scorer-->>User: print health distribution counts
```
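The upsert step at the end of the diagram can be sketched with SQLite's `ON CONFLICT` clause. The table columns follow the PR description; the script's real SQL may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity_health ("
    "entity_name TEXT PRIMARY KEY, completeness_score REAL, health_level INTEGER)"
)
# Writing the same entity twice updates in place instead of duplicating rows
for score, level in ((0.4, 2), (0.9, 5)):
    conn.execute(
        "INSERT INTO entity_health (entity_name, completeness_score, health_level) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT(entity_name) DO UPDATE SET "
        "completeness_score = excluded.completeness_score, "
        "health_level = excluded.health_level",
        ("coachClaude", score, level),
    )
rows = conn.execute("SELECT * FROM entity_health").fetchall()
# rows holds a single, updated record for the entity
```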

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 A rabbit hops through entity contracts divine,
Scoring health with chunks and bonds aligned,
Completeness blossoms in the KG store—
R49 contracts open up the door! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning — Docstring coverage is 46.55%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title clearly and specifically summarizes the main documentation updates: test count increase, new MCP tool, and entity contracts feature addition.


Comment on lines +395 to +403
```python
def test_missing_required_detected(self, populated_store, contracts_path):
    """coachClaude doesn't have all required fields populated — detect gaps."""
    from scripts.score_entity_health import score_entity, load_contracts

    contracts = load_contracts(str(contracts_path))
    result = score_entity(
        populated_store, "ent-coach", "agent", contracts["agent"]
    )
    # capabilities is partially there in metadata, memory_domains is missing
    assert isinstance(result["missing_required"], list)
```

🟡 Medium tests/test_entity_contracts.py:395

test_missing_required_detected only asserts that missing_required is a list, which passes even when empty. The docstring and comment claim memory_domains should be detected as missing, but the test never verifies this field (or any field) actually appears in the list. Consider adding an assertion that specific expected fields are present in missing_required.

```diff
         assert isinstance(result["missing_required"], list)
+        # Verify specific missing fields are detected
+        assert "memory_domains" in result["missing_required"]
```
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file tests/test_entity_contracts.py around lines 395-403:

`test_missing_required_detected` only asserts that `missing_required` is a list, which passes even when empty. The docstring and comment claim `memory_domains` should be detected as missing, but the test never verifies this field (or any field) actually appears in the list. Consider adding an assertion that specific expected fields are present in `missing_required`.

Evidence trail:
tests/test_entity_contracts.py lines 395-403 at REVIEWED_COMMIT: The test function `test_missing_required_detected` has docstring claiming detection of missing required fields, comment specifically mentioning `memory_domains is missing`, but the only assertion is `assert isinstance(result["missing_required"], list)` with no check for specific field presence.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
README.md (2)

94-110: ⚠️ Potential issue | 🟡 Minor

Inconsistency: Update architecture diagram to 12 tools.

The Mermaid diagram label still shows "11 tools" while the rest of the document reflects 12.

📝 Proposed fix
```diff
-    A["Claude Code / Cursor / Zed"] -->|MCP| B["BrainLayer MCP Server<br/>11 tools"]
+    A["Claude Code / Cursor / Zed"] -->|MCP| B["BrainLayer MCP Server<br/>12 tools"]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 94 - 110, Update the Mermaid diagram node B that
currently reads "BrainLayer MCP Server<br/>11 tools" to reflect "12 tools" so
the architecture diagram matches the rest of the doc; locate the graph LR block
and change the label for node B (symbol B / "BrainLayer MCP Server") to show 12
tools.

3-3: ⚠️ Potential issue | 🟡 Minor

Inconsistency: Update tagline to 12 MCP tools.

The opening tagline still mentions "11 MCP tools" but the rest of the README (badge, stats line, section header, comparison table) now reflects 12 tools.

📝 Proposed fix
```diff
-> Persistent memory and knowledge graph for AI agents — 11 MCP tools, real-time JSONL watcher, Axiom telemetry, and a native macOS daemon for always-on recall across every conversation.
+> Persistent memory and knowledge graph for AI agents — 12 MCP tools, real-time JSONL watcher, Axiom telemetry, and a native macOS daemon for always-on recall across every conversation.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` at line 3, Update the README opening tagline to reflect "12 MCP
tools" instead of "11 MCP tools" so it matches the badges, stats line, section
header, and comparison table; locate the tagline string in the README (the first
paragraph/heading that currently reads "Persistent memory and knowledge graph
for AI agents — 11 MCP tools, real-time JSONL watcher, Axiom telemetry, and a
native macOS daemon for always-on recall across every conversation.") and
replace "11 MCP tools" with "12 MCP tools".

📥 Commits

Reviewing files that changed from the base of the PR and between d0bcac9 and 1ad9a62.

📒 Files selected for processing (7)
  • README.md
  • contracts/entity-types.yaml
  • scripts/score_entity_health.py
  • src/brainlayer/pipeline/digest.py
  • src/brainlayer/vector_store.py
  • tests/test_entity_contracts.py
  • tests/test_kg_schema.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.13)
🔇 Additional comments (7)
README.md (1)

9-10: LGTM — Documentation updates are consistent.

The test count (1,454), MCP tool count (12), chunk count (284K+), entity contracts feature, MLX backend addition, and brain_enrich tool documentation all align with the PR objectives and commit messages.

Also applies to: 15-15, 125-125, 128-128, 149-154, 183-183, 187-190, 201-201, 216-218, 289-289

contracts/entity-types.yaml (1)

1-162: LGTM — Well-structured entity type contracts.

The contract specification is well-documented with clear requirement levels and consistent health criteria across all entity types. The hierarchical structure (required/expected/optional) provides flexibility for progressive entity completeness.

tests/test_kg_schema.py (1)

74-95: LGTM — Schema assertions updated for R49 columns.

The expected column sets correctly reflect the new entity_subtype, status, relation_tier, and weight columns added by the R49 migrations in vector_store.py.

Also applies to: 118-122

src/brainlayer/vector_store.py (1)

586-662: LGTM — R49 schema additions are idempotent and well-structured.

The schema changes follow best practices:

  • Table creation uses IF NOT EXISTS
  • Column additions check existence via PRAGMA table_info before ALTER TABLE
  • Type hierarchy seeding uses INSERT OR IGNORE
  • All changes are safe for repeated _init_db() calls
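The PRAGMA-before-ALTER pattern praised here can be illustrated with a minimal sqlite3 sketch (the helper name is hypothetical, not taken from vector_store.py):

```python
import sqlite3

def add_column_if_missing(conn, table, column, ddl):
    """Idempotent migration: only ALTER TABLE when the column is absent."""
    cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in cols:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {ddl}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kg_entities (id TEXT PRIMARY KEY)")
add_column_if_missing(conn, "kg_entities", "status", "status TEXT DEFAULT 'active'")
add_column_if_missing(conn, "kg_entities", "status", "status TEXT DEFAULT 'active'")  # safe no-op
```

Calling it twice is harmless, which is exactly what makes repeated `_init_db()` runs safe.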
src/brainlayer/pipeline/digest.py (1)

847-872: LGTM — Health data integration with graceful fallback.

The implementation correctly:

  • Uses _read_cursor() for thread-safe reads
  • Queries by entity_name matching the entity_health table's primary key
  • Falls back to None when the table doesn't exist or entity is unscored

The bare except Exception is appropriate here since it needs to handle both OperationalError (table doesn't exist) and empty results, and the fallback behavior (return None for both fields) is the same.
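A stripped-down version of that fallback pattern, for illustration (the function name and exact query are assumptions; the exception is narrowed to OperationalError here for the sketch, whereas the real code catches Exception):

```python
import sqlite3

def fetch_health(conn, entity_name):
    """Return (completeness_score, health_level), or (None, None) when the
    entity_health table is missing or the entity has not been scored."""
    try:
        row = conn.execute(
            "SELECT completeness_score, health_level FROM entity_health "
            "WHERE entity_name = ?",
            (entity_name,),
        ).fetchone()
        return row if row else (None, None)
    except sqlite3.OperationalError:  # e.g. no such table: entity_health
        return (None, None)

conn = sqlite3.connect(":memory:")  # fresh DB: entity_health doesn't exist
result = fetch_health(conn, "coachClaude")
```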

scripts/score_entity_health.py (1)

1-22: LGTM — Well-documented scoring implementation.

The scoring algorithm is clearly documented with:

  • 5-dimension weighted scoring (40% required, 25% expected, 15% relationships, 10% chunks, 10% recency)
  • Wikidata-style 5-tier health classification
  • Exponential decay for recency with 90-day half-life
  • Proper use of get_db_path() for DB resolution per coding guidelines
  • Clean resource management with finally block

Also applies to: 34-42, 45-63, 129-219, 286-318

tests/test_entity_contracts.py (1)

30-128: LGTM — Comprehensive R49 test coverage.

The test suite thoroughly covers:

  • Schema creation (3 new tables, altered columns with defaults)
  • Contract YAML validation (structure, required fields, health criteria)
  • Scoring algorithm (5-dimension weights, tier boundaries, edge cases)
  • Persistence (upsert behavior, idempotency)
  • Lookup integration (health fields present/absent)
  • Type hierarchy seeding

Good use of the populated_store fixture to set up realistic test data with entities, relations, and chunks.

Also applies to: 134-178, 184-249, 254-265, 271-317, 323-431, 436-463, 469-508, 514-539

Comment on lines +240 to +245
```python
            parent_row = list(cursor.execute(
                "SELECT parent_type FROM entity_type_hierarchy WHERE child_type = ?",
                (entity_type,),
            ))
            if parent_row:
                contract = contracts.get(parent_row[0][1] if len(parent_row[0]) > 1 else parent_row[0][0])
```

⚠️ Potential issue | 🟡 Minor

Bug: Incorrect tuple indexing for parent type lookup.

The query SELECT parent_type FROM entity_type_hierarchy returns a single-column tuple, so parent_row[0] is (parent_type,). The condition len(parent_row[0]) > 1 will never be true for a single-column result, making this always fall through to parent_row[0][0].

The ternary is unnecessary and confusing.

🐛 Proposed fix

```diff
             parent_row = list(cursor.execute(
                 "SELECT parent_type FROM entity_type_hierarchy WHERE child_type = ?",
                 (entity_type,),
             ))
             if parent_row:
-                contract = contracts.get(parent_row[0][1] if len(parent_row[0]) > 1 else parent_row[0][0])
+                contract = contracts.get(parent_row[0][0])
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/score_entity_health.py` around lines 240 - 245, The tuple indexing
for the parent type lookup is incorrect/overly complex: the SELECT returns
single-column rows so parent_row[0] is like (parent_type,) and the ternary with
len(parent_row[0]) > 1 is useless; change the logic that follows the
cursor.execute call to extract the parent_type as parent_row[0][0] (or assign
parent_type = parent_row[0][0]) and then use contracts.get(parent_type) instead
of the current conditional expression so the lookup is correct and readable
(refer to variables parent_row, cursor.execute and contracts in this change).
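To see why the `len(parent_row[0]) > 1` branch can never fire, a self-contained sqlite3 demo of single-column row shape (the golem→agent pairing follows the commit message's seeded hierarchy):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity_type_hierarchy (child_type TEXT, parent_type TEXT)")
conn.execute("INSERT INTO entity_type_hierarchy VALUES ('golem', 'agent')")
parent_row = list(conn.execute(
    "SELECT parent_type FROM entity_type_hierarchy WHERE child_type = ?",
    ("golem",),
))
# A single-column SELECT yields 1-tuples: parent_row[0] == ('agent',)
parent_type = parent_row[0][0]  # the straightforward extraction
```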

Comment on lines +17 to +20
```python
import subprocess
import sys
from pathlib import Path
from unittest.mock import patch
```

🧹 Nitpick | 🔵 Trivial

Minor: Unused imports.

subprocess and patch are imported but not used in this test file.

🧹 Proposed cleanup

```diff
 import json
 import math
 import os
-import subprocess
-import sys
 from pathlib import Path
-from unittest.mock import patch

 import pytest
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_entity_contracts.py` around lines 17 - 20, The file imports unused
symbols subprocess and patch; remove these unused imports from the top of
tests/test_entity_contracts.py so only needed imports (sys, Path) remain, or
alternatively use them in tests if intended; specifically edit the import
line(s) to drop subprocess and from unittest.mock import patch unless patch is
actually referenced by functions or fixtures like any test_* functions.

@EtanHey EtanHey merged commit 3df38b8 into main Mar 28, 2026
2 of 6 checks passed