fix: BrainBar FTS5 JOIN uses chunk_id instead of rowid#94
Conversation
Root cause: production FTS5 trigger (Python) doesn't set explicit rowid, so FTS5 auto-assigned rowids diverge from chunks.rowid after rebuilds. The f.rowid = c.rowid JOIN mapped to wrong chunks, making importance_min, project, and tag filters check the wrong chunk's data. Fix: JOIN on c.id = f.chunk_id (stored as UNINDEXED column in FTS5). Also: importance now read as Double (was rounded to Int via column_int), test triggers updated to match production (no explicit rowid). 39 tests pass including new testImportanceFilterWorksWithDivergedRowids that reproduces the exact production bug (FTS5 rebuild + rowid divergence). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
@coderabbitai review |
📝 WalkthroughWalkthroughModified FTS (Full-Text Search) synchronization triggers in BrainDatabase to replace explicit rowid usage with chunk_id as the primary join key across insert, update, and delete operations. Adjusted FTS query joins from rowid matching to chunk_id-based joins. Changed importance field mapping from integer to double and elevated exec() method visibility from private to internal. Added tests validating importance-based filtering under edge cases. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@codex review |
✅ Actions performedReview triggered.
|
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@brain-bar/Sources/BrainBar/BrainDatabase.swift`:
- Line 283: The test-accessible database helper function exec(_ sql: String)
currently relies on Swift's default internal visibility; make this explicit by
adding the internal access modifier to the function declaration (change func
exec(_ sql: String) to internal func exec(_ sql: String)) so tests can call
db.exec() and the intent is documented and protected from accidental visibility
changes.
In `@brain-bar/Tests/BrainBarTests/DatabaseTests.swift`:
- Around line 206-210: The tests currently check importance as either Int or
Double, but BrainDatabase.search() always returns a Double (it uses
sqlite3_column_double); remove the dead as? Int branches and simplify assertions
to treat importance as Double only (e.g., cast/guard let importance as? Double
and XCTAssertGreaterThanOrEqual(importance, 7.0, ...)) in the test methods
testImportanceFilterWorksWithDivergedRowids and
testAllResultsRespectImportanceMin so the tests fail fast if the return type
changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8f4a7db4-6691-4efa-bb8e-a3a2aaff0a52
📒 Files selected for processing (2)
brain-bar/Sources/BrainBar/BrainDatabase.swiftbrain-bar/Tests/BrainBarTests/DatabaseTests.swift
📜 Review details
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-14T02:20:54.656Z
Learning: Request codex review, cursor review, and bugbot review for BrainLayer PRs
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T14:22:54.809Z
Learning: Applies to src/brainlayer/**/*.py : For bulk database operations: stop enrichment workers first, checkpoint WAL before and after operations, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks with checkpoint every 3 batches, never delete from `chunks` while FTS trigger is active
📚 Learning: 2026-03-12T14:22:54.809Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-12T14:22:54.809Z
Learning: Applies to src/brainlayer/**/*.py : For bulk database operations: stop enrichment workers first, checkpoint WAL before and after operations, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks with checkpoint every 3 batches, never delete from `chunks` while FTS trigger is active
Applied to files:
brain-bar/Sources/BrainBar/BrainDatabase.swift
📚 Learning: 2026-03-18T00:12:08.774Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 87
File: brain-bar/Sources/BrainBar/BrainBarServer.swift:118-129
Timestamp: 2026-03-18T00:12:08.774Z
Learning: In Swift files under brain-bar/Sources/BrainBar, enforce that when a critical dependency like the database is nil due to startup ordering (socket before DB), any tool handler that accesses the database must throw an explicit error (e.g., ToolError.noDatabase) instead of returning a default/empty value. Do not allow silent defaults (e.g., guard let db else { return ... }). Flag patterns that silently return defaults when db is nil, as this masks startup timing issues. This guidance applies broadly to similar Swift files in the BrainBar module, not just this one location.
Applied to files:
brain-bar/Sources/BrainBar/BrainDatabase.swift
🔇 Additional comments (4)
brain-bar/Sources/BrainBar/BrainDatabase.swift (3)
87-107: LGTM! Triggers correctly use chunk_id linkage.The updated triggers properly:
- Insert without explicit rowid, storing
new.idin thechunk_idUNINDEXED column- Delete by
chunk_id = old.idinstead of rowid- Update via delete-then-insert pattern using chunk_id
This matches the production trigger behavior and fixes the rowid divergence issue.
199-199: LGTM! Core fix for the rowid divergence bug.Joining on
c.id = f.chunk_iduses the stable UNINDEXED column that stores the actual chunk ID, rather than FTS5's auto-assigned rowid which diverges after table rebuilds. This ensures filters (importance_min, project, tag) check the correct chunk data.
233-233: LGTM! Correct type for importance column.Using
sqlite3_column_doublealigns with the schema whereimportanceis defined asREAL(line 73). This prevents unintended truncation that would occur withsqlite3_column_int.brain-bar/Tests/BrainBarTests/DatabaseTests.swift (1)
160-211: Good test that reproduces the production bug scenario.The test correctly simulates rowid divergence by:
- Inserting chunks with synced rowids via triggers
- Dropping and recreating FTS5 without explicit rowids
- Re-populating in reverse order to guarantee mismatch
- Verifying the importance filter returns correct data
This provides excellent regression coverage for the fix.
| // MARK: - Helpers | ||
|
|
||
| private func exec(_ sql: String) { | ||
| func exec(_ sql: String) { |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider adding internal explicitly for clarity.
The change from private to default (internal) visibility is needed for tests to call db.exec(). While Swift's default is internal, making it explicit documents the intent and prevents accidental changes.
💡 Optional: Make visibility explicit
- func exec(_ sql: String) {
+ internal func exec(_ sql: String) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@brain-bar/Sources/BrainBar/BrainDatabase.swift` at line 283, The
test-accessible database helper function exec(_ sql: String) currently relies on
Swift's default internal visibility; make this explicit by adding the internal
access modifier to the function declaration (change func exec(_ sql: String) to
internal func exec(_ sql: String)) so tests can call db.exec() and the intent is
documented and protected from accidental visibility changes.
| if let imp = importance as? Int { | ||
| XCTAssertGreaterThanOrEqual(imp, 7, "Returned importance must be >= 7") | ||
| } else if let imp = importance as? Double { | ||
| XCTAssertGreaterThanOrEqual(imp, 7.0, "Returned importance must be >= 7") | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Simplify importance type assertion—it's always Double.
Since BrainDatabase.search() uses sqlite3_column_double (line 233), the importance value is always a Double. The as? Int branches are dead code and could mask a regression if the type accidentally changes back.
♻️ Proposed fix for testImportanceFilterWorksWithDivergedRowids
// Also verify the returned importance is correct (not from wrong row)
- let importance = filtered.first?["importance"]
- XCTAssertNotNil(importance, "importance should be present")
- // Should be 9, not 2 (which would happen if rowids mapped to wrong chunk)
- if let imp = importance as? Int {
- XCTAssertGreaterThanOrEqual(imp, 7, "Returned importance must be >= 7")
- } else if let imp = importance as? Double {
- XCTAssertGreaterThanOrEqual(imp, 7.0, "Returned importance must be >= 7")
- }
+ let importance = filtered.first?["importance"] as? Double
+ XCTAssertNotNil(importance, "importance should be present as Double")
+ // Should be 9, not 2 (which would happen if rowids mapped to wrong chunk)
+ XCTAssertGreaterThanOrEqual(importance!, 7.0, "Returned importance must be >= 7")♻️ Proposed fix for testAllResultsRespectImportanceMin
for result in results {
- if let imp = result["importance"] as? Int {
- XCTAssertGreaterThanOrEqual(imp, 7, "Every result must have importance >= 7")
- } else if let imp = result["importance"] as? Double {
- XCTAssertGreaterThanOrEqual(imp, 7.0, "Every result must have importance >= 7")
- }
+ guard let imp = result["importance"] as? Double else {
+ XCTFail("importance should be present as Double")
+ continue
+ }
+ XCTAssertGreaterThanOrEqual(imp, 7.0, "Every result must have importance >= 7")
}Also applies to: 222-226
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@brain-bar/Tests/BrainBarTests/DatabaseTests.swift` around lines 206 - 210,
The tests currently check importance as either Int or Double, but
BrainDatabase.search() always returns a Double (it uses sqlite3_column_double);
remove the dead as? Int branches and simplify assertions to treat importance as
Double only (e.g., cast/guard let importance as? Double and
XCTAssertGreaterThanOrEqual(importance, 7.0, ...)) in the test methods
testImportanceFilterWorksWithDivergedRowids and
testAllResultsRespectImportanceMin so the tests fail fast if the return type
changes.
Summary
chunks.rowid. Thef.rowid = c.rowidJOIN mapped to wrong chunks — importance_min, project, and tag filters checked the wrong chunk's data.JOIN chunks c ON c.id = f.chunk_idinstead ofON f.rowid = c.rowid. Thechunk_idUNINDEXED column in FTS5 stores the actual chunk ID, immune to rowid divergence.Double(was rounded toIntviasqlite3_column_int), test triggers updated to match production (no explicit rowid).Verified via socat
Test plan
testImportanceFilterWorksWithDivergedRowidsreproduces exact production bug (RED before fix, GREEN after)testAllResultsRespectImportanceMinverifies all results >= threshold🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests