
feat: Codex session transcript ingestion #83

Merged
EtanHey merged 1 commit into main from feature/phase-3b-codex-ingestion
Mar 14, 2026

Conversation


@EtanHey EtanHey commented Mar 14, 2026

Summary

  • New Codex JSONL adapter: parse → classify → chunk → embed → store
  • CLI: brainlayer ingest-codex with dedup
  • 39 tests, 4,129 real chunks validated

Test plan

  • 39 tests passing
  • Real-data validation on 20 sessions

🤖 Generated with Claude Code

- New adapter: src/brainlayer/ingest/codex.py — parse, classify, chunk, embed, store
- CLI command: brainlayer ingest-codex <file|dir>
- Smart mapping: function_call_output → file_read/stack_trace/git_diff
- Batch ingest with dedup (skips already-indexed sessions)
- Fixed source field propagation in index_new.py
- 39 tests, validated on real data: 4,129 chunks from 20 sessions
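
The parse and classify steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual src/brainlayer/ingest/codex.py implementation: the JSONL record shape, field names, and the classification heuristics are all assumptions.

```python
import json
from pathlib import Path


def classify_output(text: str) -> str:
    """Map a function_call_output payload to a chunk category (heuristics assumed)."""
    if text.startswith("diff --git"):
        return "git_diff"
    if "Traceback (most recent call last)" in text:
        return "stack_trace"
    return "file_read"


def parse_session(path: Path) -> list[dict]:
    """Parse one Codex JSONL transcript into classified chunks."""
    chunks = []
    for line in path.read_text().splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("type") == "function_call_output":
            output = record.get("output", "")
            chunks.append({"category": classify_output(output), "content": output})
    return chunks
```

In the real adapter these chunks would then flow into the embed and store stages; here we stop at classification.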

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented Mar 14, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 322265a and f516ce8.

📒 Files selected for processing (6)
  • scripts/cloud_backfill.py
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/index_new.py
  • src/brainlayer/ingest/__init__.py
  • src/brainlayer/ingest/codex.py
  • tests/test_ingest_codex.py

@EtanHey EtanHey merged commit 31e669c into main Mar 14, 2026
4 of 5 checks passed
@EtanHey EtanHey deleted the feature/phase-3b-codex-ingestion branch March 14, 2026 13:18
EtanHey added a commit that referenced this pull request Mar 17, 2026
Remove unused `import sys` and fix import sort order.
Pre-existing on main, not from BrainBar changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EtanHey added a commit that referenced this pull request Mar 17, 2026
* feat: BrainBar Swift daemon — MCP server over Unix socket

New Swift menu bar daemon that owns the BrainLayer SQLite database
and serves MCP tools over /tmp/brainbar.sock. Eliminates 10 Python
processes (931MB RAM) with a single native daemon (~40MB est.).

Components (28 tests, all passing):
- MCPFraming: Content-Length parser/encoder (7 tests)
- MCPRouter: JSON-RPC dispatch for 8 BrainLayer tools (7 tests)
- BrainDatabase: SQLite3 C API with FTS5, WAL, PRAGMAs (10 tests)
- BrainBarServer: POSIX Unix socket + E2E integration (4 tests)

Architecture: direct SQLite3 C bindings (zero external deps),
single-writer (eliminates all SQLITE_BUSY errors), 209KB binary.

Includes: build-app.sh, Info.plist, LaunchAgent plist,
.mcp.json.example updated with socat config.

Part of Three-Daemon Sprint (Phase 3).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
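
The MCPFraming component described above parses LSP-style `Content-Length` frames. A minimal Python sketch of that kind of incremental parser, assuming the standard `Content-Length: N\r\n\r\n<body>` framing and the 10MB payload cap mentioned in a later commit (class and method names are ours, not the Swift API):

```python
MAX_PAYLOAD = 10 * 1024 * 1024  # payload cap, per the DoS fix in the commit message


class FrameParser:
    """Incremental Content-Length frame parser (illustrative sketch)."""

    def __init__(self) -> None:
        self.buf = b""

    def feed(self, data: bytes) -> list[bytes]:
        """Append bytes; return any complete JSON-RPC payloads now available."""
        self.buf += data
        out = []
        while True:
            sep = self.buf.find(b"\r\n\r\n")
            if sep == -1:
                return out  # headers not complete yet
            length = None
            for line in self.buf[:sep].decode("ascii").split("\r\n"):
                name, _, value = line.partition(":")
                if name.strip().lower() == "content-length":
                    length = int(value.strip())
            if length is None or length > MAX_PAYLOAD:
                raise ValueError("missing or oversized Content-Length")
            start = sep + 4
            if len(self.buf) < start + length:
                return out  # body not complete yet
            out.append(self.buf[start:start + length])
            self.buf = self.buf[start + length:]
```

A parsed payload would then be handed to the JSON-RPC router (MCPRouter in the Swift layout).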

* fix: address security and robustness issues from self-review

Pre-CodeRabbit fixes for 5 MAJOR + 4 MEDIUM issues:

MAJOR fixes:
- SQL injection in pragma(): whitelist allowed pragma names
- JSON injection in tool handlers: use JSONSerialization, not string interpolation
- Socket path buffer overflow: add length check before memcpy
- Content-Length DoS: cap at 10MB max payload
- FULLMUTEX retained: needed for WAL concurrent reads + close() race safety

MEDIUM fixes:
- FTS5 sanitization: strip * and handle empty queries
- Listen backlog: 5 → 16 for connection burst handling
- sendResponse: handle EAGAIN on non-blocking sockets
- Database exposes isOpen for server startup validation

28/28 tests still GREEN.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: lint errors in codex.py from PR #83

Remove unused `import sys` and fix import sort order.
Pre-existing on main, not from BrainBar changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: ruff format codex.py and test_ingest_codex.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address CodeRabbit review Round 1 — 14 actionable comments

MAJOR fixes:
- FTS5 backfill: rebuild index when opening existing DB (critical for 312K chunk DB)
- Stub handlers: return notImplemented error instead of fake success
- brain_search: clamp num_results to max 100

MEDIUM fixes:
- MCPFraming: add 16MB total buffer limit (DoS protection)
- build-app.sh: allow BRAINBAR_APP_DIR env override (no sudo needed)
- Test: replace force unwrap with XCTUnwrap
- LaunchAgent: add ThrottleInterval for restart storms

Design decisions (commented on PR):
- /tmp/brainbar.sock intentional (not brainlayer.sock) — avoids conflict during migration
- brain_tags replaces brain_get_person — per Phase B spec

28/28 tests GREEN.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
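
The FTS5 backfill and the `num_results` clamp above can be sketched in Python. `'rebuild'` is the standard FTS5 special command; the table name and the clamp bounds are assumptions:

```python
import sqlite3


def backfill_fts(conn: sqlite3.Connection, fts_table: str = "chunks_fts") -> None:
    """Rebuild the FTS5 index from the content table when opening an existing DB."""
    conn.execute(f"INSERT INTO {fts_table}({fts_table}) VALUES('rebuild')")
    conn.commit()


def clamp_num_results(n: int, max_results: int = 100) -> int:
    """Keep brain_search's num_results within a sane range."""
    return max(1, min(n, max_results))
```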

* fix: align BrainDatabase schema with production DB (312K chunks)

Root cause of "SQLite prepare failed: 1":
- BrainBar used `chunk_id` — production uses `id`
- BrainBar used `session_id` — production uses `conversation_id`
- BrainBar created its own FTS5 schema — production uses
  (content, summary, tags, resolved_query, chunk_id UNINDEXED)
- Production requires source_file NOT NULL, metadata NOT NULL

Fix:
- All SQL now uses production column names (id, conversation_id, etc.)
- ensureSchema() skips creation on existing DB (production already has schema)
- New/test DBs get production-compatible schema
- FTS5 triggers match production (include resolved_query, chunk_id)

Tested: brain_search via socat returns real results from 312K chunk DB.
28/28 Swift tests GREEN.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
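
A production-compatible schema along the lines described above might look like the following sketch. The column names (`id`, `conversation_id`, the FTS5 columns with `chunk_id UNINDEXED`, `source_file`/`metadata` NOT NULL) come from the commit message; the remaining column types and the trigger body are assumptions:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS chunks (
    id TEXT PRIMARY KEY,
    conversation_id TEXT,
    content TEXT,
    summary TEXT,
    tags TEXT,
    resolved_query TEXT,
    source_file TEXT NOT NULL,
    metadata TEXT NOT NULL
);
CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(
    content, summary, tags, resolved_query, chunk_id UNINDEXED
);
CREATE TRIGGER IF NOT EXISTS chunks_ai AFTER INSERT ON chunks BEGIN
    INSERT INTO chunks_fts(content, summary, tags, resolved_query, chunk_id)
    VALUES (new.content, new.summary, new.tags, new.resolved_query, new.id);
END;
"""


def ensure_schema(conn: sqlite3.Connection) -> None:
    """Create the production-compatible tables and triggers if absent."""
    conn.executescript(SCHEMA)
```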

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
