Conversation
📝 Walkthrough

The changes add post-enrichment KG extraction to cloud backfill processing, introduce input validation for brain search operations, and establish comprehensive test coverage, including a reusable project fixture and regression tests for both cloud backfill and search validation workflows.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller as Cloud Backfill<br/>import_results
    participant Parser as Enrichment<br/>Parser
    participant KGExtractor as extract_kg_from_chunk
    participant Handler as Exception<br/>Handler
    Caller->>Parser: parse enrichment result
    Parser-->>Caller: enrichment data
    Caller->>KGExtractor: extract_kg_from_chunk(chunk,<br/>seed_entities=DEFAULT_SEED_ENTITIES,<br/>use_llm=False, use_gliner=False)
    alt KG Extraction Success
        KGExtractor-->>Caller: extraction complete
        Caller->>Caller: continue import flow
    else KG Extraction Failure
        KGExtractor-->>Handler: exception raised
        Handler->>Handler: print warning
        Handler-->>Caller: error handled (non-blocking)
        Caller->>Caller: continue import flow
    end
```
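The non-blocking extraction step in the diagram can be sketched as follows. `extract_kg_from_chunk` and `DEFAULT_SEED_ENTITIES` mirror names from the PR, but the bodies below are illustrative stubs, not the project's implementation:

```python
# Illustrative sketch of the non-blocking KG extraction step from the diagram.
# The stub seed set and extractor body are assumptions for demonstration only.
DEFAULT_SEED_ENTITIES = {"brainlayer", "sqlite", "mcp"}

def extract_kg_from_chunk(chunk, seed_entities, use_llm=False, use_gliner=False):
    """Seed-only extraction: match known entities against the chunk text."""
    if chunk.get("text") is None:
        raise ValueError("chunk has no text")
    return [e for e in seed_entities if e in chunk["text"].lower()]

def import_chunk(chunk):
    """Import one chunk; KG extraction failures warn but do not block."""
    try:
        extract_kg_from_chunk(chunk, seed_entities=DEFAULT_SEED_ENTITIES,
                              use_llm=False, use_gliner=False)
    except Exception as exc:
        print(f"  WARNING: KG extraction failed for {chunk.get('id')}: {exc}")
    return True  # import flow continues either way
```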
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
@codex review
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_search_validation.py`:
- Around line 29-41: Add a new unit test mirroring
test_num_results_over_100_returns_error to verify the lower bound: call
_brain_search with num_results=0 (imported from brainlayer.mcp.search_handler),
patch _get_vector_store, _detect_entities, and _search (AsyncMock) the same way,
then assert mock_search.assert_not_called(), result.isError is True, and that
the error message in result.content[0].text contains "must be between 1 and
100"; name the test e.g. test_num_results_zero_returns_error to make the
lower-bound validation explicit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 331d6332-42a6-438b-a346-4c12feb499b7
📒 Files selected for processing (5)
- scripts/cloud_backfill.py
- src/brainlayer/mcp/search_handler.py
- tests/conftest.py
- tests/test_cloud_backfill.py
- tests/test_search_validation.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
- GitHub Check: test (3.11)
🧰 Additional context used
📓 Path-based instructions (2)
src/brainlayer/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`src/brainlayer/**/*.py`: Python package structure should follow the layout: `src/brainlayer/` for package code, with separate modules for `vector_store.py`, `embeddings.py`, `daemon.py`, `dashboard/`, and `mcp/` for different concerns
Use `paths.py:get_db_path()` for all database path resolution instead of hardcoding paths; support environment variable overrides and canonical path fallback (`~/.local/share/brainlayer/brainlayer.db`)
Lint and format Python code using `ruff check src/` and `ruff format src/`
Preserve verbatim content for `ai_code`, `stack_trace`, and `user_message` message types during classification and chunking; skip `noise` content entirely; summarize `build_log` content; extract structure-only for `dir_listing`
Use AST-aware chunking with tree-sitter; never split stack traces; mask large tool output during chunking
Handle SQLite concurrency by implementing retry logic on `SQLITE_BUSY` errors; ensure each worker uses its own database connection
Prioritize MLX (Qwen2.5-Coder-14B-Instruct-4bit) on Apple Silicon (port 8080) as the enrichment backend; fall back to Ollama (`glm-4.7-flash` on port 11434) after 3 consecutive MLX failures; support backend override via `BRAINLAYER_ENRICH_BACKEND` environment variable
Brain graph API must expose endpoints: `/brain/graph`, `/brain/node/{node_id}` (FastAPI)
Backlog API must support endpoints: `/backlog/items` with GET, POST, PATCH, DELETE operations (FastAPI)
Provide `brainlayer brain-export` command to export brain graph as JSON for dashboard consumption
Provide `brainlayer export-obsidian` command to export as Markdown vault with backlinks and tags
For bulk database operations: stop enrichment workers first, checkpoint WAL before and after operations, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks with checkpoint every 3 batches, never delete from `chunks` while FTS trigger is active
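The bulk-delete guideline above (checkpoint WAL before and after, batch deletes, checkpoint every few batches) can be sketched with the stdlib `sqlite3` module. Dropping FTS triggers and stopping enrichment workers are assumed to happen outside this helper:

```python
import sqlite3

def bulk_delete_chunks(db_path, ids, batch_size=5000, checkpoint_every=3):
    """Delete rows from chunks in batches, checkpointing the WAL
    before the operation, every `checkpoint_every` batches, and after."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    for i, start in enumerate(range(0, len(ids), batch_size), 1):
        batch = ids[start:start + batch_size]
        qmarks = ",".join("?" * len(batch))
        conn.execute(f"DELETE FROM chunks WHERE id IN ({qmarks})", batch)
        conn.commit()
        if i % checkpoint_every == 0:
            conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    conn.close()
```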
Files:
src/brainlayer/mcp/search_handler.py
src/brainlayer/mcp/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
MCP tools must implement:
`brain_search`, `brain_store`, `brain_recall`, `brain_entity`, `brain_expand`, `brain_update`, `brain_digest`, `brain_get_person` with legacy `brainlayer_*` aliases for backward compatibility
Files:
src/brainlayer/mcp/search_handler.py
🧠 Learnings (2)
📚 Learning: 2026-03-12T14:22:54.798Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T14:22:54.798Z
Learning: Applies to src/brainlayer/mcp/**/*.py : MCP tools must implement: `brain_search`, `brain_store`, `brain_recall`, `brain_entity`, `brain_expand`, `brain_update`, `brain_digest`, `brain_get_person` with legacy `brainlayer_*` aliases for backward compatibility
Applied to files:
src/brainlayer/mcp/search_handler.py
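The alias requirement in this learning could be wired up as follows; the registry shape and handler stubs are hypothetical, only the tool names come from the learning:

```python
# Hypothetical tool registry illustrating legacy brainlayer_* aliases
# for the brain_* MCP tools named in the learning above.
def _todo(name):
    return lambda **kwargs: f"{name} not implemented in this sketch"

TOOLS = {name: _todo(name) for name in (
    "brain_search", "brain_store", "brain_recall", "brain_entity",
    "brain_expand", "brain_update", "brain_digest", "brain_get_person",
)}

# Register legacy aliases: brain_search -> brainlayer_search, etc.,
# pointing at the same handler objects for backward compatibility.
TOOLS.update({
    "brainlayer_" + name.removeprefix("brain_"): fn
    for name, fn in list(TOOLS.items())
})
```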
📚 Learning: 2026-03-12T14:22:54.798Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T14:22:54.798Z
Learning: Run tests using `pytest` for the project
Applied to files:
tests/conftest.py
🔇 Additional comments (8)

tests/conftest.py (1)

17-21: LGTM! The `eval_project` fixture correctly generates a unique project name per test case using a UUID hex prefix. Function scope (the default) ensures test isolation.

src/brainlayer/mcp/search_handler.py (2)

12-14: LGTM! Constants are well-defined, using `frozenset` for O(1) membership testing and clear boundary values for `num_results` validation.

130-136: LGTM! Input validation correctly rejects invalid `detail` values and out-of-range `num_results` before any expensive operations. Error messages are descriptive and include valid options/ranges.

scripts/cloud_backfill.py (2)

37-45: LGTM! New imports correctly reference internal modules for seed-only KG extraction. The import of `DEFAULT_SEED_ENTITIES` provides domain-specific entities for fast, deterministic extraction without LLM overhead.

515-524: LGTM! KG extraction is appropriately wrapped in try/except to ensure failures don't block the import flow. The warning message includes both `chunk_id` and exception details for debugging. Parameters correctly disable LLM/GLiNER for fast seed-only extraction during bulk backfill.

tests/test_cloud_backfill.py (1)

1-102: LGTM! Well-structured regression test that:
- Creates an isolated VectorStore with `tmp_path`
- Inserts a minimal unenriched chunk matching the DB schema
- Uses `monkeypatch` correctly with `raising=False` for imported symbols
- Verifies both return counts and KG extraction call parameters
- Properly cleans up with a `finally` block

tests/test_search_validation.py (2)

12-57: LGTM! Comprehensive validation tests with correct patch targeting at the module namespace level. Tests cover both rejection paths (invalid `detail`, `num_results > 100`) and the valid path (`detail="compact"` delegates to `_search`).

60-121: LGTM! The `TestEvalProjectIsolation` class effectively demonstrates project isolation:
- The `xfail` test documents the contamination issue when sharing projects
- The passing test proves unique project IDs prevent cross-case leakage
- The fixture validation test ensures `eval_project` follows the expected format

The `_embed` helper provides deterministic embeddings suitable for testing without external dependencies.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6820286609
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```python
except Exception as exc:
    print(f" WARNING: KG extraction failed for {chunk_id}: {exc}")
success += 1
```
Treat KG extraction failure as failed import
When extract_kg_from_chunk throws, this path only logs a warning and still increments success. In import_results, chunks with non-null enriched_at are skipped on future runs, so any transient extraction error here leaves a chunk permanently marked as imported but missing KG links with no retry path. Count this as a failure (or otherwise preserve retryability) so reruns can repair KG coverage.
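One way to implement this suggestion, sketched with stub names (`extract_kg_from_chunk` is the real symbol; the `mark_enriched` callback and return convention are illustrative assumptions):

```python
def import_chunk_with_kg(chunk, extract_kg_from_chunk, mark_enriched):
    """Count a KG extraction error as a failed import so enriched_at
    stays NULL and the chunk is retried on the next backfill run."""
    try:
        extract_kg_from_chunk(chunk)
    except Exception as exc:
        print(f"  WARNING: KG extraction failed for {chunk['id']}: {exc}")
        return False  # do NOT mark enriched; a rerun can repair KG coverage
    mark_enriched(chunk["id"])
    return True
```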
Force-pushed: f995465 to e73a7ea
Summary
- Validate `brain_search` `detail` and public `num_results` at the MCP boundary
- Add an `eval_project` fixture so eval tests can isolate seeded data by project
- Add KG extraction to `cloud_backfill` imports

Testing
- `PYTHONPATH=src pytest -q tests/test_search_quality.py tests/test_search_chunk_id.py tests/test_search_validation.py tests/test_cloud_backfill.py`

@codex review
Summary by CodeRabbit

Bug Fixes
- Search input validation: `detail` must be "compact" or "full", and `num_results` must be between 1 and 100.

Tests
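The validation rules summarized above can be sketched as follows; the constant and function names are assumptions, not necessarily the PR's identifiers:

```python
# Hypothetical boundary validation mirroring the rules in the summary.
VALID_DETAIL_LEVELS = frozenset({"compact", "full"})  # O(1) membership test
MIN_NUM_RESULTS, MAX_NUM_RESULTS = 1, 100

def validate_search_args(detail, num_results):
    """Return an error message string, or None if the arguments are valid."""
    if detail not in VALID_DETAIL_LEVELS:
        return f"detail must be one of {sorted(VALID_DETAIL_LEVELS)}"
    if not MIN_NUM_RESULTS <= num_results <= MAX_NUM_RESULTS:
        return (f"num_results must be between {MIN_NUM_RESULTS} "
                f"and {MAX_NUM_RESULTS}")
    return None
```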