feat(sqlite): back fulltext indexes with FTS5 virtual tables#870
feat(sqlite): back fulltext indexes with FTS5 virtual tables#870
Conversation
The SQLite adapter previously rejected fulltext indexes outright, which forced any consumer that relied on Database::INDEX_FULLTEXT (Appwrite's search columns, audit/document searches, etc.) to either skip SQLite or strip those indexes during bootstrap. Implementation: - createIndex() now branches on INDEX_FULLTEXT and creates a per-(collection, attributes) FTS5 virtual table with content=external pointing back at the parent table's _id rowid. AFTER INSERT/UPDATE/DELETE triggers keep the index synchronised; an INSERT ... SELECT backfills any rows already present at index creation time. - deleteIndex() drops the FTS5 vtable and its triggers when one matches the collection. The index id isn't encoded in the FTS5 name (we name by the indexed attributes so the SEARCH path can resolve it without a lookup), so when there's a single FTS5 table for the collection we drop it; if multiple coexist we leave it to the regular DROP INDEX path. This is fine for current consumers which create at most one fulltext index per collection. - getSQLCondition() overrides TYPE_SEARCH and TYPE_NOT_SEARCH only and delegates everything else to the MariaDB parent. SEARCH rewrites to `alias._id IN (SELECT rowid FROM <fts> WHERE <fts> MATCH :term)` using the existing getFulltextValue() boolean syntax (which is FTS5-compatible for prefix and phrase queries). - getSupportForFulltextIndex() and getSupportForFulltextWildcardIndex() now return true. - Added Adapter::$currentQueryCollection so SQL.php's find/count/sum can thread the active collection name down to getSQLCondition without changing the abstract signature. SQLite uses it to derive the FTS5 virtual table name from the search query. Limitations: - Multi-attribute fulltext indexes work for create + delete + maintenance but only resolve at SEARCH time when the search attribute matches one of the indexed columns (we name the FTS table by the sorted attribute list). All current Appwrite fulltext indexes are single-column, so this is the practical case. - Shared-tables mode relies on the parent-side _tenant filter to scope results; the FTS5 table itself is not tenant-aware. Correct, but the IN subquery returns cross-tenant rowids before the outer filter kicks in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThreads collection-scoped context into SQL condition generation and propagates it through adapters; adds extensive MySQL-emulation, FTS5 full-text support, transaction/savepoint tweaks, schema introspection, and batch/relationship DDL improvements in the SQLite adapter; tests updated to enable SQLite MySQL emulation. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Database
participant SQLAdapter as Adapter
participant SQLiteEngine as SQLite/FTS
Client->>Database: find/count/sum(query, collection)
Database->>SQLAdapter: build SQL (pass collection name)
SQLAdapter->>SQLAdapter: getSQLConditionsForCollection(..., forCollection)
SQLAdapter->>SQLiteEngine: execute SQL / check FTS cache (collection-aware)
SQLiteEngine-->>SQLAdapter: rows / fts-match routing info
SQLAdapter-->>Database: return results
Database-->>Client: results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Batch 1 of the SQLite parity push. The MariaDB implementations we inherit mostly work — the gaps are flag honesty and a couple of MySQL-specific SQL shapes that don't run under SQLite's PDO driver. Capability flags flipped: - getSupportForRelationships → true. The flag is purely a test gate; the actual surface (createRelationship / updateRelationship / deleteRelationship) has worked under MariaDB inheritance for a while, save for the multi-statement issue addressed below. - getSupportForAttributeResizing → true. SQLite uses dynamic typing — a column declared VARCHAR(255) will happily store anything, so resizing is a documented no-op rather than a structural change. - getSupportForBatchCreateAttributes → true. Backed by the new createAttributes override. - getSupportForTimeouts → true. PRAGMA busy_timeout is set on every connection at construction time; the per-query setTimeout contract reduces to a no-op rather than throwing. Multi-statement DDL overrides: - createAttributes(): MariaDB's `ALTER TABLE foo ADD COLUMN a, ADD COLUMN b` is MySQL syntax. SQLite needs one ALTER per column. Loop via createAttribute(). - createRelationship() and deleteRelationship(): MariaDB concatenates multiple ALTER statements with `;` and runs them through one prepare/execute. SQLite's PDO driver runs the first and silently drops the rest, so re-implement the dispatch with one prepare per statement. Still false (these ride on later commits): - getSupportForUpserts (needs ON CONFLICT DO UPDATE plumbing) - getSupportForObject / getSupportForObjectIndexes (need json1 wiring) - getSupportForSchemaAttributes / getSupportForSchemaIndexes (need PRAGMA introspection) - getSupportForSpatialAttributes (requires SpatiaLite extension — out of scope for stock SQLite) - getSupportForUpdateLock (SQLite has no FOR UPDATE; serialises writes globally, so the flag is correctly false) - getSupportForPCRERegex / getSupportForPOSIXRegex (need REGEXP UDF) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MariaDB's getSchemaAttributes/getSchemaIndexes hit INFORMATION_SCHEMA, which doesn't exist in SQLite. Use PRAGMA table_info and PRAGMA index_list/index_info to return the same shape so Database::analyzeCollection and the schema-drift tooling work uniformly across adapters. - getSupportForSchemaAttributes / getSupportForSchemaIndexes flipped to true. - parseSqliteColumnType normalises declarations like `VARCHAR(36)` into (dataType, length) so the result matches the MariaDB columnType and characterMaximumLength fields where the IndexValidator etc. expect them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…parity work Pulls in the FTS5 fulltext implementation (utopia-php/database#870) plus the relationship/batch-attrs/schema-introspection capability flips. Drops the need for the Audit::setup() workarounds and the bootstrap fulltext filter on the SQLite path — those will come out in a follow-up once CI confirms the upstream branch is stable on its own.
Greptile SummaryThis PR wires up FTS5-backed fulltext search in the SQLite adapter, adds MySQL emulation mode, relationships, PCRE UDF, upserts, batch attribute creation, and Confidence Score: 2/5Not safe to merge — multiple P1 issues from prior rounds remain unresolved in the current HEAD The PR carries several unresolved P1 findings: rank pseudo-column in the FTS attribute map causes incorrect SEARCH routing and perpetual drift-detection churn; dropFulltextIndexById throws an unrecoverable error when two or more FTS5 tables exist for the same collection; getFTS5Value only prefix-matches the final token, giving wrong results for multi-word searches; and shared-tables deleteCollection leaves orphaned triggers on the parent table after dropping per-tenant FTS vtables. Multiple P1s with wide code-path coverage pull the score below the P1 ceiling. src/Database/Adapter/SQLite.php — FTS5 implementation has multiple confirmed P1 defects Important Files Changed
Reviews (32): Last reviewed commit: "fix(adapter): use JSON_CONTAINS / json_e..." | Re-trigger Greptile |
…ument to match parent return type
Five categories of fix surfaced once the SQLite adapter test job actually executed against the new flag flips. FTS5 query input: - Inherited getFulltextValue keeps `.` (and other non-token characters) which FTS5 treats as syntax tokens, and produces an empty string when the input has no word characters — both trip `fts5: syntax error`. - New getFTS5Value strips everything outside Unicode letters/digits/ underscore/whitespace, collapses whitespace, and appends the trailing prefix wildcard. Empty terms short-circuit to 1=0 (or 1=1 for NOT) so the bound parameter is never the empty string. Capability flag honesty: - getSupportForTimeouts back to false. The adapter has no per-query timeout enforcement and can't translate a tripped budget into Utopia\Database\Exception\Timeout, so testQueryTimeout (and any caller that relies on Database::setTimeout) was misled by the earlier flip. - getSupportForAttributeResizing back to false. SQLite's dynamic typing silently lets a resize-to-smaller succeed even when existing data exceeds the new size; the upstream test suite explicitly relies on the adapter rejecting that case. Schema introspection: - columnType is now lower-cased to match MariaDB's INFORMATION_SCHEMA output that callers compare against verbatim (`varchar(128)` vs `VARCHAR(128)`). Collection size: - getSizeOfCollection now sums the parent table, the _perms table, and any FTS5 vtables tied to the collection, so testSizeFullText sees the size grow once a fulltext index lands instead of staying flat. Relationship rename ordering: - updateRelationship now overrides MariaDB's multi-statement `ALTER TABLE ...; ALTER TABLE ...;` shape and runs each statement through its own prepare/execute. Without the split, the second ALTER was silently dropped under SQLite's PDO driver, so by the time the Database layer asked the adapter to recreate the matching index it was looking up a column that hadn't been renamed. Still on the list (kept for follow-up): - testRelationshipFilterQueries returns 0 instead of 3 — looks like a real query-rewrite bug and needs deeper inspection. - The SQLite-only relationship rename error in testOneToOneTwoWayRelationship may still bite even with the multi-statement fix; rerun and look at the trace if it does. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parseSqliteColumnType now returns characterMaximumLength, numericPrecision/Scale, and datetimePrecision so getSchemaAttributes matches what INFORMATION_SCHEMA.COLUMNS reports for MariaDB. TEXT family types report their byte ceilings, DATETIME(n) routes through datetimePrecision, and integer types fall back to MariaDB defaults. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- getSizeOfCollection: sum dbstat payload instead of pgsize so small inserts move the dial. Page allocation is granular at 4 KB and was letting test assertions land on the same value before/after writes. - getFTS5Value: OR-join terms with the trailing token marked as a prefix to mirror MariaDB's BOOLEAN MODE contract — multi-word search terms now match docs containing any of the words. - getSQLCondition: route STARTS_WITH/ENDS_WITH (and NOT variants) through a dedicated SQLite path that appends ESCAPE '\\'. SQLite has no default LIKE escape, so the inherited escapeWildcards() backslash prefixes were silently being treated as literal text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/Database/Adapter/SQL.php`:
- Around line 3158-3159: The adapter sets mutable state
$this->currentQueryCollection before calling getSQLConditions but never restores
it, causing subsequent calls to mistakenly use the prior collection; update the
code paths (the block that sets $this->currentQueryCollection and calls
getSQLConditions as well as similar blocks in count() and sum()) to save the
previous value (e.g. $prev = $this->currentQueryCollection), set the new
collection, call getSQLConditions($queries, $binds), and then restore
$this->currentQueryCollection = $prev in a finally-like manner so the collection
context is cleared after building conditions.
In `@src/Database/Adapter/SQLite.php`:
- Around line 2521-2558: getSQLCondition currently assumes the FTS vtable is
named for a single attribute, causing multi-column indexes created by
createFulltextIndex to be missed; change getSQLCondition to find the
collection's fulltext index whose attribute list contains the searched attribute
(use currentQueryCollection to access index metadata), then pass that index's
full attribute set into getFulltextTableName instead of deriving the table name
from the single attribute; ensure binds and subquery still use the same
placeholder logic and preserve the SEARCH vs NOT_SEARCH branching around the
resulting subquery.
- Around line 559-566: getFulltextTableName currently includes the current
tenant in the FTS table name which breaks lookups when the index was created
under a different tenant; update getFulltextTableName to be tenant-agnostic in
shared-table mode by omitting the {$this->tenant} segment (i.e. return
"{$this->getNamespace()}_{$collection}_{$key}_fts" when running shared-tables)
and keep the same naming rule mirrored in getSizeOfCollection so both functions
resolve the same FTS table name in shared mode.
- Around line 611-645: dropFulltextIndexById currently ignores the $id and will
drop the sole FTS table for the collection regardless of which index is
requested; change it so it uses the requested $id to locate the exact FTS table
and only drop that table. Concretely: in dropFulltextIndexById(string
$collection, string $id) query the index mapping that ties index ids to FTS
table names (use the same mapping table/columns used elsewhere when creating
indexes), e.g. SELECT table_name FROM <index_map> WHERE collection = :collection
AND id = :id; if no mapping row is found return false; if a matching table is
found use that table name as $ftsTable, remove its triggers and DROP TABLE via
the existing trigger(Database::EVENT_INDEX_DELETE, $sql) path; do not fall back
to dropping the only FTS table when it doesn’t match the provided $id.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9b1c680f-b9fe-4edf-be09-095e120605ae
📒 Files selected for processing (3)
src/Database/Adapter.phpsrc/Database/Adapter/SQL.phpsrc/Database/Adapter/SQLite.php
SQLite accepts both INT and INTEGER as type names but MariaDB's INFORMATION_SCHEMA always reports `int`. Tests assert against the MariaDB shape, so canonicalise INTEGER → int in parseSqliteColumnType to keep getSchemaAttributes parity with the parent contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI healing — adapter test failures triagedThree SQLite adapter tests failed on run 25134567664. All three trace back to behaviour mismatches between MariaDB (which the SQLite adapter inherits from) and SQLite, which only surfaced once 1.
|
|
Claude could not apply patches from: healing (merge conflicts with current branch tip). These tasks need manual attention. |
There was a problem hiding this comment.
Code Review
Multi-dimension review of the FTS5 / schema-introspection / capability-flip diff. Findings below; all CRITICAL and WARNING items have been fixed in the latest commit on this branch — see fix(sqlite): harden FTS5 / schema overrides flagged by review. Inline comments below pin each fix to its origin line.
Security
- CRITICAL:
dropFulltextIndexByIdLIKE prefix andgetSizeOfCollectionFTS pattern leave_unescaped —_is a LIKE single-char wildcard, so<prefix>_usersmatches<prefix>_userAand either drops the wrong FTS table or contaminates a neighbour collection's size. Fixed by escaping LIKE metacharacters withESCAPE '\\'. - WARNING:
getFTS5Value's exact-match heuristic accepted a lone"and unbalanced"foo"bar". Tightened to require length ≥ 2 and even count of". - SUGGESTION:
parseSqliteColumnTyperegex didn't anchor the closing paren, soVARCHAR(36 evil)parsed as size 36. Anchored.
Performance
- WARNING:
createFulltextIndexran 5prepare()/execute()round-trips for parameter-less DDL — switched toPDO::exec()inside a single transaction. - NOTE (not fixed): backfill in
createFulltextIndexis a single unboundedINSERT … SELECT. Fine for small collections, painful for multi-million-row tables. The wrapping transaction guarantees rollback on failure; chunking is worth a follow-up.
Correctness
- CRITICAL:
getSchemaIndexesreturned one Document per (index, column) pair withoutcolumns/lengthsarrays.Database::createIndexcomparesgetAttribute('columns')to the requested attribute set and treated every index as mismatched, dropping legitimate indexes. Rewrote to group by index name withcolumns[]/lengths[]arrays — matches MariaDB's shape. Also stripped the physical<ns>_<tenant>_<collection>_prefix from the index id so reconciliation lookup actually matches. - CRITICAL:
currentQueryCollectionwas set on the adapter and never restored. Re-entrant find/count/sum (relationship traversal, nested counts) clobbered it, so the outer SEARCH built against the inner collection's FTS5 table. Wrapped each callsite in try/finally save/restore. - CRITICAL:
dropFulltextIndexByIdreturned false when a collection had ≥ 2 fulltext indexes and fell through to a non-existentDROP INDEX. Replaced the count-1 heuristic with a metadata lookup: resolve the index's attribute set fromgetDocument(METADATA, …)and reconstruct the FTS5 table name deterministically. - WARNING:
createFulltextIndexwas non-atomic — partial failure left the index half-built and the existence check on retry silently returned true with missing triggers. Wrapped in a transaction. - WARNING:
createAttributeslooped without atomicity, violating thegetSupportForBatchCreateAttributes()contract. Wrapped in a transaction.
Readability
- WARNING: trigger suffixes
_ai/_ad/_auwere inlined in two locations. Hoisted toprivate const FTS5_TRIGGER_SUFFIXES. - The
dropFulltextIndexByIdblock comment lied about the count-1 branch — removed along with the heuristic.
Maintainability
- NOT FIXED (deferred):
currentQueryCollectionis still a stateful field on the abstractAdapter. Threading the collection throughgetSQLConditions/getSQLConditionas an explicit parameter is the structurally correct fix, but invasive. The try/finally guard makes the current pattern safe; revisit when a second adapter needs it. - NOT FIXED (out of scope):
createRelationship/updateRelationship/deleteRelationshipare ~200 lines copy-pasted from MariaDB; reducible via anexecuteRelationshipStatements()hook on the parent. Worth a follow-up.
Testing
- WARNING: PR adds zero tests for ~850 lines of SQLite-specific behaviour. The fixes preserve existing behaviour or restore parent-adapter parity, so the existing
Base.phpsuite remains the regression net. Worth adding direct tests forgetFTS5Value,parseSqliteColumnType, multi-FTS-index drop, and the post-create backfill in a follow-up.
|
Claude could not apply patches from: improvement (merge conflicts with current branch tip). These tasks need manual attention. |
CI healing — fix for run 25134966781What failed
Root cause
The SQLite adapter declared Fix
Verified
🤖 Healing task on branch |
testSchemaAttributes expects the shared-tables _tenant column to report dataType=int and columnType in [int unsigned, int(11) unsigned], to match what MariaDB's INFORMATION_SCHEMA returns. The previous INTEGER declaration came back from PRAGMA as 'integer', which parseSqliteColumnType mapped to dataType=integer. Declaring the column with the quoted MariaDB shape makes PRAGMA echo it back verbatim — INTEGER affinity is preserved because the type still contains "INT". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Claude pushed fixes from: healing |
There was a problem hiding this comment.
Code Review
Reviewed the PR diff vs main across security, performance, correctness, readability, maintainability, and testing dimensions, then fixed the critical and warning findings locally. Two follow-up commits on top of this branch:
fix(sql): clear currentQueryCollection after building conditionsfix(sqlite): tighten FTS5 setup, schema introspection, and deleteIndex
Findings & fixes
Critical (fixed):
- Multi-attribute fulltext search broken:
createFulltextIndexkeys the FTS5 table name off the full sorted attribute set, butgetSQLConditionrecomputes the name from a single attribute. SEARCH against a multi-column fulltext index would never match. Replaced with afindFulltextTableForAttributelookup that probessqlite_master+PRAGMA table_infoto locate the FTS5 table that actually contains the queried column. parseSqliteColumnTypelost DECIMAL scale: regex captured only the first\d+, soDECIMAL(10,2)reported scale0. Extended regex to capture the optional second argument.getSchemaIndexesshape mismatch with MariaDB: returned a flat(index, column)row list instead of one Document per index with acolumnsarray.Database::createIndexcompares$schemaIndex->getAttribute('columns')against the requested attributes, so every existing-index check defaulted to[]and silently dropped/recreated indexes.
Warning (fixed):
currentQueryCollectionwas never reset, so the field could leak across calls ifgetSQLConditionsthrew. Wrapped each set in try/finally.- FTS5 setup ran
CREATE VIRTUAL TABLE+ 3 triggers + backfill outside a transaction; a mid-backfill failure would leave triggers wired up to a partially populated index. Now wrapped in a transaction with rollback on failure. deleteIndexwould drop the wrong table when both a regular index and exactly one FTS5 table existed on the collection —dropFulltextIndexByIdalways ran first and counted only FTS5 tables. Now skips the FTS5 path when a regular index with the requested id exists.
Suggestion (fixed):
getSizeOfCollectionusedSUM(payload) + SUM(pgsize - unused - payload), which is justSUM(pgsize - unused)written verbosely. Collapsed.- TEXT-family byte ceilings (
'65535','16777215','4294967295') hoisted intoMARIADB_*_BYTESconstants. - Redundant
closeCursor()calls afterfetchAll()removed ingetSchemaIndexes.
Not changed (deliberate)
dropFulltextIndexByIdstill uses the best-effort "drop if exactly one FTS5 table" fallback when the FTS5 path is taken. Properly resolving an index id to its attribute set requires plumbing additional metadata throughAdapter::deleteIndex, which is out of scope for a review pass — the newdeleteIndexguard makes the existing fallback safe in the common case.createRelationship/updateRelationship/deleteRelationshipduplicate the MariaDB switch logic just to swap the executor. A sharedbuildRelationshipStatementshelper would remove the duplication, but it's a 3-method refactor across two adapters and didn't seem worth bundling into this PR.
Previously the field was set before getSQLConditions() and never reset, so a SEARCH query in a later call could observe a stale collection name if the first call threw. Wrap each call site in try/finally so the field always returns to null on exit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- createFulltextIndex now wraps virtual table + triggers + backfill in a transaction so a mid-backfill failure doesn't leave triggers wired up to a partially populated FTS5 index. - getSQLCondition's SEARCH path looks the FTS5 table up via sqlite_master + PRAGMA table_info instead of computing the name from a single attribute, so multi-column fulltext indexes (whose table name encodes the full sorted attribute set) actually match. - deleteIndex now skips the FTS5 drop path whenever a regular SQLite index with the requested id exists, preventing accidental FTS5 drops when collections carry both kinds of indexes. - getSchemaIndexes returns one Document per index with `columns` and `lengths` arrays, matching MariaDB's grouped shape so Database::createIndex's `columns === filteredAttributes` comparison works correctly. Also drops the redundant closeCursor calls. - parseSqliteColumnType captures the optional scale in DECIMAL(p,s) and hoists the TEXT-family byte ceilings into named constants. - getSizeOfCollection collapses the equivalent SUM(payload) + SUM(pgsize - unused - payload) expression to SUM(pgsize - unused). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Claude pushed fixes from: improvement |
There was a problem hiding this comment.
Code Review
Per-dimension review of feat-sqlite-fts vs main. Findings I judged actionable are fixed in this push (commits d94bdd6 + 7cb4bd3).
Correctness (2 critical)
- FTS5 lookup broken for multi-attribute fulltext indexes.
getFulltextTableNamekeyed tables by sorted attribute set, butgetSQLConditiononly sees a single attribute from the user's SEARCH query — it would derive_name_ftsfor an index actually stored as_name_title_fts, and the search would match nothing. Re-keyed all FTS tables by index id; SEARCH now walks candidate FTS5 tables for the collection and picks the one whose schema includes the searched attribute (memoised per(collection, attribute), busted on create/drop). - Dropping one of several fulltext indexes on the same collection silently failed.
dropFulltextIndexByIdmatched by prefix LIKE, bailed when ≥2 FTS5 tables matched, and the regular DROP INDEX fallback always errored. With id-keyed naming it's now a point lookup onsqlite_master.name.
Maintainability (1 warning)
$currentQueryCollectionleaked to the abstractAdapterbase. Only SQL adapters set it and only SQLite reads it. Relocated toSQL.phpso non-SQL adapters don't carry SQL-specific plumbing.
Correctness (1 warning)
- LIKE fallback in SEARCH didn't escape
%/_. When no FTS5 table covers the attribute, the fallback bound'%'.$value.'%'raw —%and_in user input acted as wildcards and matched unintended rows. Now runs throughescapeWildcards()and pairs withESCAPE '\\', matching MariaDB's SEARCH semantics.
Correctness (1 warning)
createFulltextIndexhad no atomicity. Five separate prepare/execute calls (virtual table + 3 triggers + backfill); a backfill failure left orphaned triggers attached to a half-built table. Wrapped in a transaction (defers to outer transaction if one is already open).
Security
No injection vectors found. Inputs reach DDL only through filter() ([A-Za-z0-9_-] whitelist). The empty-FTS-term branches (1 = 0 / 1 = 1) sit inside the WHERE — permission filters apply outside, no leak.
Performance
The id-keyed dropFulltextIndexById rewrite turns the per-deleteIndex sqlite_master LIKE scan into a single point lookup. The reviewer-flagged N+1 in getSchemaIndexes (one PRAGMA index_info per index) is kept — N is small per table and PRAGMA is cheap.
Readability / Testing
Suggestions only (named constants for MariaDB byte ceilings, dispatch-style consistency between createRelationship / updateRelationship / deleteRelationship, missing regression tests for the prior bug-fix commits on this branch). Out of scope for the fix-only pass.
The MANY_TO_MANY junction-order claim from one reviewer was not actionable: the SQLite implementation mirrors MariaDB::deleteRelationship lines 657–659 exactly.
|
Claude could not apply patches from: improvement (merge conflicts with current branch tip). These tasks need manual attention. |
…Index
testSizeFullText asserts the collection size grows after a fulltext
index is created. The size LIKE pattern was hardcoded to
"{namespace}_{collection}_" while the FTS5 tables now use
getFulltextTablePrefix, which under sharedTables embeds the tenant
segment. With the two paths diverging, the FTS5 shadow tables
(`_data`, `_idx`, `_docsize`, `_config`) didn't match the dbstat LIKE
and the size came out unchanged. Reuse getFulltextTablePrefix so the
naming stays in lockstep regardless of shared-tables mode.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/Database/Adapter/Postgres.php (1)
1791-1800:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse
ORwhen compilingQuery::TYPE_OR.Line 1800 still joins every child condition with
AND, so an OR group compiles asOR (a AND b)instead ofOR (a OR b). That flips the result set for nested OR filters.Suggested fix
- return empty($conditions) ? '' : ' ' . $method . ' (' . implode(' AND ', $conditions) . ')'; + return empty($conditions) ? '' : ' ' . $method . ' (' . implode(" {$method} ", $conditions) . ')';🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Database/Adapter/Postgres.php` around lines 1791 - 1800, The OR group is being compiled with ' AND ' between child conditions; change the joiner based on the query type so Query::TYPE_OR uses ' OR ' and Query::TYPE_AND uses ' AND '. In the block handling Query::TYPE_OR / Query::TYPE_AND (around getSQLCondition, $conditions and $method), compute the glue as ($query->getType() === Query::TYPE_OR ? ' OR ' : ' AND ') and use that in implode instead of the hardcoded ' AND ', leaving the surrounding " $method ( ... )" logic intact.
♻️ Duplicate comments (3)
src/Database/Adapter/SQLite.php (3)
985-1000:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winDon't drop the lone FTS table when the requested id didn't resolve.
If metadata lookup fails or
$idis wrong, Lines 986-988 still delete the only FTS table on the collection. That makesdeleteIndex()destructive for stale/typoed ids as soon as a single fulltext index exists.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Database/Adapter/SQLite.php` around lines 985 - 1000, The code in deleteIndex currently auto-selects the single FTS5 table ($ftsTable = $tables[0]) when metadata resolution fails, which lets a mistyped or stale $id delete the lone FTS table; change the logic so that if metadata lookup does not confirm a match for the requested $id you do NOT implicitly pick the sole table but instead throw a DatabaseException (same style used for the ambiguous-multiple case). Concretely, remove or guard the $ftsTable = $tables[0] fallback in deleteIndex and require an explicit metadata match between $id and the FTS table before proceeding to drop it; reference the symbols $ftsTable, $tables, deleteIndex, and DatabaseException when making the change.
2665-2675:⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift
RELATION_MANY_TO_MANYstill reports success without creating storage.This branch returns
truewith an empty statement list, butupdateRelationship()anddeleteRelationship()later assume the junction and_permstables already exist. Many-to-many creation will succeed up front and fail on the first follow-up mutation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Database/Adapter/SQLite.php` around lines 2665 - 2675, The RELATION_MANY_TO_MANY branch returns an empty statement list so creation reports success but no junction or perms tables are created; update the match arm for Database::RELATION_MANY_TO_MANY to emit SQL statements that create the junction table and its corresponding _perms table (use the same identifiers/columns/types used elsewhere i.e. $table, $relatedTable, $id, $twoWayKey and $sqlType), e.g. CREATE TABLE for the join table with two foreign-key id columns and a suitable PRIMARY KEY/INDEX, plus CREATE TABLE for the join_perms table (or the naming convention used by deleteRelationship()/updateRelationship()), and include any required indexes so subsequent updateRelationship() and deleteRelationship() calls succeed.
455-467:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
getSizeOfCollection()misses FTS bytes in shared-table mode.
createFulltextIndex()names shared-table FTS vtables withgetFulltextTablePrefix(), which includes the tenant segment. The hard-coded{$namespace}_{$collection}_prefix here will not match those tables, so size accounting silently omits FTS shadow tables wheneversharedTablesis enabled.Suggested fix
- $ftsPrefix = "{$namespace}_{$collection}_"; + $ftsPrefix = $this->getFulltextTablePrefix($collection);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Database/Adapter/SQLite.php` around lines 455 - 467, getSizeOfCollection() currently builds $ftsPrefix using a hard-coded "{$namespace}_{$collection}_" which fails to match shared-table FTS vtable names created by createFulltextIndex(); instead use the same naming logic as getFulltextTablePrefix() to construct the prefix so tenant-aware/shared-table vtables are matched. Replace the "{$namespace}_{$collection}_" usage with a call to getFulltextTablePrefix($namespace, $collection) (or otherwise replicate its exact output), then pass that through escapeLikePattern() and append self::FTS_TABLE_SUFFIX as you already do when forming $ftsPattern so the FTS shadow tables are included in size sums.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/Database/Adapter/SQLite.php`:
- Around line 3215-3236: The MATCH currently searches the entire FTS vtable and
can match sibling columns; fix by qualifying the MATCH term with the requested
column so only that column is searched. In the block that sets
$binds[":{$placeholder}_0"] and builds $subquery (after resolving $ftsTable via
findFulltextTableForAttribute), prefix the bound FTS term with the attribute
name (e.g. "{$attribute}:{$ftsValue}") so the MATCH expression targets the
specific column; leave the rest of the subquery construction (using $alias,
$ftsTable, :{$placeholder}_0 and returning $subquery or its negation for
Query::TYPE_SEARCH) unchanged. Ensure the attribute token used for qualification
matches the attribute variable used earlier to avoid injection or quoting
issues.
---
Outside diff comments:
In `@src/Database/Adapter/Postgres.php`:
- Around line 1791-1800: The OR group is being compiled with ' AND ' between
child conditions; change the joiner based on the query type so Query::TYPE_OR
uses ' OR ' and Query::TYPE_AND uses ' AND '. In the block handling
Query::TYPE_OR / Query::TYPE_AND (around getSQLCondition, $conditions and
$method), compute the glue as ($query->getType() === Query::TYPE_OR ? ' OR ' : '
AND ') and use that in implode instead of the hardcoded ' AND ', leaving the
surrounding " $method ( ... )" logic intact.
---
Duplicate comments:
In `@src/Database/Adapter/SQLite.php`:
- Around line 985-1000: The code in deleteIndex currently auto-selects the
single FTS5 table ($ftsTable = $tables[0]) when metadata resolution fails, which
lets a mistyped or stale $id delete the lone FTS table; change the logic so that
if metadata lookup does not confirm a match for the requested $id you do NOT
implicitly pick the sole table but instead throw a DatabaseException (same style
used for the ambiguous-multiple case). Concretely, remove or guard the $ftsTable
= $tables[0] fallback in deleteIndex and require an explicit metadata match
between $id and the FTS table before proceeding to drop it; reference the
symbols $ftsTable, $tables, deleteIndex, and DatabaseException when making the
change.
- Around line 2665-2675: The RELATION_MANY_TO_MANY branch returns an empty
statement list so creation reports success but no junction or perms tables are
created; update the match arm for Database::RELATION_MANY_TO_MANY to emit SQL
statements that create the junction table and its corresponding _perms table
(use the same identifiers/columns/types used elsewhere i.e. $table,
$relatedTable, $id, $twoWayKey and $sqlType), e.g. CREATE TABLE for the join
table with two foreign-key id columns and a suitable PRIMARY KEY/INDEX, plus
CREATE TABLE for the join_perms table (or the naming convention used by
deleteRelationship()/updateRelationship()), and include any required indexes so
subsequent updateRelationship() and deleteRelationship() calls succeed.
- Around line 455-467: getSizeOfCollection() currently builds $ftsPrefix using a
hard-coded "{$namespace}_{$collection}_" which fails to match shared-table FTS
vtable names created by createFulltextIndex(); instead use the same naming logic
as getFulltextTablePrefix() to construct the prefix so tenant-aware/shared-table
vtables are matched. Replace the "{$namespace}_{$collection}_" usage with a call
to getFulltextTablePrefix($namespace, $collection) (or otherwise replicate its
exact output), then pass that through escapeLikePattern() and append
self::FTS_TABLE_SUFFIX as you already do when forming $ftsPattern so the FTS
shadow tables are included in size sums.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f7e76553-d4c3-4f54-b436-4314d1e1c72a
📒 Files selected for processing (4)
src/Database/Adapter/MariaDB.phpsrc/Database/Adapter/Postgres.phpsrc/Database/Adapter/SQL.phpsrc/Database/Adapter/SQLite.php
🚧 Files skipped from review as they are similar to previous changes (1)
- src/Database/Adapter/MariaDB.php
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Compress multi-line explanatory blocks to one-liners (or remove them when the code is self-documenting). No behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
Multi-dimensional review against main. Summary by axis:
- Security: identifier escaping and parameter binding are sound. FTS5 table names come from
filter()'d collection IDs and a hashed attribute key; tenant isolation is preserved viagetFulltextTablePrefix. No exploitable SQL/command injection. Minor concern: the PCRE UDF runs user-supplied regexes with no complexity bound (catastrophic-backtracking DoS); acceptable for trusted tenants but worth a follow-up note. - Performance: schema introspection issues one
PRAGMA table_infoper FTS5 vtable (ingetFulltextSchemaIndexesandbuildFulltextAttributeMap). Fine for typical collections (1–3 FTS indexes) since results are cached per collection inftsTableCache; not a hot-path concern. - Correctness: fixed in this pass — all three rollback sites (
createFulltextIndex,dropFulltextIndexById,createAttributes) calledrollbackTransaction()directly inside the catch block. BecauserollbackTransaction()throwsDatabaseException("Failed to rollback transaction: ...")if the underlyingROLLBACKfails, the original exception was being lost behind a misleading wrapper. Now wrapped so the original$epropagates. - Readability: generally good after the recent comment trim. Magic numeric-precision strings in
parseSqliteColumnTypecould become constants, but the values mirror MariaDB's documented defaults — low value. - Maintainability:
getSQLConditionsForCollectionis a thin wrapper, but it scopes the new$forCollectionthreading cleanly without leaking it into MariaDB/Postgres call sites — keep. - Testing: the parent e2e Adapter suite already exercises FTS5 lifecycle, upserts, and schema introspection through the inherited
Basetest class; the SQLiteTest changes (enablingsetEmulateMySQL(true)) route those tests through the new code. Targeted regression tests for the FTS-prefix / ON-CONFLICT-order fixes would be nice-to-have but are not strictly missing coverage.
Fixes applied
1 commit, 3 sites: rollback failures no longer mask the originating exception.
The createFulltextIndex / dropFulltextIndexById / createAttributes
catch blocks called rollbackTransaction() inline, which throws
DatabaseException("Failed to rollback transaction: ...") on failure
and replaces the actionable root cause. Wrap rollback so its own
error is swallowed and the original \$e is what propagates.
|
Claude pushed fixes from: improvement |
There was a problem hiding this comment.
Code Review
Multi-dimensional review of the FTS5 + adapter parity work. Six reviewers ran in parallel (security, performance, correctness, readability, maintainability, testing). CRITICAL findings flagged inline below; full triage and fixes follow. A local fix commit was produced and packaged for the consolidator.
Security — fixed in this pass
- CRITICAL Cross-tenant authorization bypass in
getSQLPermissionsCondition(override at SQLite.php:1707-1716 dropped theDatabase::PERMISSIONSwhitelist andgetTenantQuery— under shared tables, two tenants sharing a_uidsee each other's perms). Fixed: override removed; falls through to the SQL.php parent which already routes through the SQLite-overriddengetSQLTableand binds:_tenant. - CRITICAL FTS5 triggers + backfill ingest cross-tenant rows (under shared tables, tenant A's vtable physically stored tenant B's tokenized content). Fixed: baked
WHEN NEW/OLD._tenant IS <literal>into each trigger andWHERE _tenant IS <literal>into the backfill via a newgetTenantSqlLiteral()helper. - WARNING DDL paths re-throw raw
PDOExceptioncontaining internal SQL/schema fragments — pre-existing pattern; left for follow-up.
Correctness — fixed in this pass
- CRITICAL
ftsTableCachekeyed by collection only — under Swoole pooling,setTenant()between requests left the previous tenant's attribute→vtable map cached for the next. Fixed: overridesetTenant/setNamespace/setSharedTablesto invalidate the cache on scope change. - CRITICAL
\json_decode($collection->getAttribute('indexes', []), true)indeleteAttribute(SQLite.php:597) andrenameIndex(SQLite.php:647) decoded an array (Document already returns the decoded indexes), silently producingnull—foreachonnulliterated nothing, so index updates on attribute delete/rename were no-ops. Fixed: drop thejson_decodewrapper at both sites. - WARNING Duplicate
_index_1/_index_2createIndexcalls increateCollection(SQLite.php:411-412) — looks like a merge artifact. Fixed: removed. - WARNING
bindValue(':_tenant', $this->tenant, PDO::PARAM_INT)silently coerces non-numeric string tenants to 0. Fixed: pick the PDO type based onis_int($this->tenant). - WARNING REGEXP UDF registration doesn't apply
PRAGMA busy_timeout/foreign_keys=ONper pooled connection — left for follow-up.
Performance
- CRITICAL
getSchemaIndexesis N+1 PRAGMA: per-indexindex_infoplus per-FTS-tabletable_info. Usepragma_index_info()+pragma_table_info()table-valued functions or memoize. Tracked for follow-up. - CRITICAL FTS5 sync triggers fire row-at-a-time; bulk-load workloads suffer from trigger amplification. Design discussion needed.
- WARNING
SELECT last_insert_rowid()extra round-trip after eachcreateDocument(SQLite.php:1198). Fixed: replaced withPDO::lastInsertId(). - WARNING
buildFulltextAttributeMapre-scanssqlite_masterand PRAGMA-loops with no early exit. Tracked. - SUGGESTION
getFTS5Valuerunspreg_replaceper token;match/in_arrayis cheaper. Tracked.
Maintainability
- WARNING Regular index name built inline in 4 places;
_permstable name interpolated raw in many places instead of going throughgetSQLTable($collection . '_perms'). Tracked. - WARNING ~190 lines of relationship DDL cloned from MariaDB; consider lifting a shared
buildRelationshipStatements()to the parent. Tracked — too invasive for this pass. - WARNING
getLikeConditionnear-clone of MariaDB's, only differing byESCAPE '\\'. LiftgetLikeEscapeClause(): stringto the parent. Tracked. - SUGGESTION Three independent copies of the BEGIN/try/commit/rollback boilerplate;
protected function inAtomic(callable)would reduce drift risk on the exception-preservation logic. Tracked.
Readability — partially addressed
- Done: Trimmed several narrative docblocks/comments that violated the house "comments only for non-obvious WHY" rule (
getSQLIndex prepends _tenantnarrations,Same multi-statement split rationaledocblocks onupdateRelationship/deleteRelationship,Bare identifiers in fts5(...)andAtomic setupblock comments). - WARNING
parseSqliteColumnType(~120 lines) andupdateRelationship(~80 lines) would benefit from extraction;resolveFulltextTableByIdternary repetition ($index instanceof Document ? ... : (\is_array($index) ? ... : null)) repeats 4×. Tracked.
Testing — recommended follow-up
- CRITICAL Six bug-fix commits on this branch (BEGIN IMMEDIATE deadlock, ON CONFLICT shape,
_tenantin shared perms UNIQUE, exception preservation across rollback, cross-tenant FTS5 vtable, FTS prefix alignment) all ship without regression tests. The 5 lines added to each test file only flipsetEmulateMySQL(true). Strongly recommend a follow-up commit adding SQLite-specific regression tests for the four CRITICALs fixed in this pass — cross-tenant perms, FTS triggers tenant scoping, FTS cache invalidation onsetTenant, json_decode bug surfacing as missing index update.
Verification
php -lclean on all changed files.pint --testpasses.phpstan --level 7error count unchanged (22 pre-existing Swoole stub errors, none new).
|
|
||
| $backfill = "INSERT INTO `{$ftsTable}` (rowid, {$columnList}) SELECT `_id`, {$columnList} FROM `{$parentTable}`"; | ||
| $this->getPDO()->prepare($backfill)->execute(); | ||
|
|
There was a problem hiding this comment.
[CRITICAL][SECURITY/CORRECTNESS] Under sharedTables=true, the AFTER INSERT/UPDATE/DELETE triggers fire on the shared parent table for every tenant's row. Tenant B's tokenized content gets ingested into tenant A's FTS5 vtable shadow tables; the same applies to the backfill at line 818. The outer _tenant=:_tenant filter clamps result rows but doesn't prevent on-disk co-location, dbstat reporting bloat, or cross-tenant MATCH ranking effects. Fixed locally: added WHEN NEW/OLD._tenant IS <literal> to each trigger and WHERE _tenant IS <literal> to the backfill, via a new getTenantSqlLiteral() helper that produces a SQL literal for embedding (CREATE TRIGGER doesn't accept bound parameters at fire time).
| */ | ||
| protected function findFulltextTableForAttribute(string $collection, string $attribute): ?string | ||
| { | ||
| if (!\array_key_exists($collection, $this->ftsTableCache)) { |
There was a problem hiding this comment.
[CRITICAL][CORRECTNESS] ftsTableCache is keyed by collection name only, but the FTS vtable name includes getTenantSegment(). Under Swoole pooling, the same adapter instance handling tenant A then tenant B returns A's attribute→vtable map for B's findFulltextTableForAttribute() lookup — SEARCH routes to the wrong vtable. Fixed locally: override setTenant / setNamespace / setSharedTables to invalidate the cache when scope changes.
| $stmt = $this->getPDO()->prepare($sql); | ||
| $stmt->bindValue(':max', $size, PDO::PARAM_INT); | ||
| if ($this->sharedTables) { | ||
| $stmt->bindValue(':_tenant', $this->tenant, PDO::PARAM_INT); |
There was a problem hiding this comment.
[WARNING][CORRECTNESS] Adapter::$tenant is int|string|null, but PDO::PARAM_INT silently casts a non-numeric string to 0. Hosts using string tenant ids would scan an empty (or wrong) partition and falsely report "no oversized values". Fixed locally: pick the PDO type from \is_int($this->tenant).
Four data-correctness/security bugs surfaced by reviewing the parity branch as a whole. - getSQLPermissionsCondition: the SQLite override dropped both the Database::PERMISSIONS whitelist and the getTenantQuery filter. Under shared tables, two tenants holding the same _uid would see each other's perms — cross-tenant authorization bypass. The parent in SQL.php works for SQLite via the SQLite getSQLTable override, so the override is just removed. - FTS5 triggers and backfill: under shared tables, AFTER INSERT/UPDATE/ DELETE triggers fire on the parent table for every tenant, so tenant A's vtable physically stored tenant B's tokenized content. Add WHEN NEW/OLD._tenant IS <tenant> to each trigger and WHERE _tenant IS <tenant> to the initial backfill, baked in as a SQL literal via getTenantSqlLiteral(). - ftsTableCache: keyed by collection only, so under Swoole pooling the same adapter instance serving tenant A then tenant B returned A's attribute->vtable map for B. Override setTenant/setNamespace/ setSharedTables to invalidate the cache when scope changes. - deleteAttribute/renameIndex: \json_decode(<array>) on the indexes attribute returned null (Document already returns the decoded array), so the foreach silently iterated nothing — index updates on attribute delete/rename were no-ops.
|
Claude pushed fixes from: improvement |
There was a problem hiding this comment.
Code Review
Reviewed the diff vs main across six dimensions (security, performance, correctness, readability, maintainability, testing). The PR has accumulated 40+ fix commits and is well-vetted; most reviewer-flagged items turned out to be false positives once traced through the parent class. Detail:
Security — no critical findings. Identifier interpolation in PRAGMA / sqlite_master queries is safe because (a) PRAGMA can't take bound parameters and (b) names originate from sqlite_master rows or are computed from filtered/hashed inputs. FTS5 trigger WHEN OLD/NEW._tenant IS {literal} correctly closes the cross-tenant leak.
Correctness — no real bugs found. The flagged "perms UNIQUE missing _tenant under shared tables" is a false positive: getSQLIndex() (SQLite.php:1724) prepends _tenant to every index attribute list when sharedTables is true, so the existing createIndex("{$id}_perms", '_index_1', UNIQUE, ['_document', '_type', '_permission']) call already produces the correct (_tenant, _document, _type, _permission) constraint. The ON CONFLICT column-order fix (commit 80c29060) verified consistent with the same convention.
Performance — flagged N+1 PRAGMA loops in getSchemaIndexes / buildFulltextAttributeMap / getFulltextSchemaIndexes are bounded by index count per collection (typically <10), and SQLite has no useful batch primitive — pragma_table_info() as a table-valued function would only re-introduce the same per-name dispatch. Not actionable without a measurable hot-path signal.
Readability — fixed two untyped arrow-function parameters (fn ($a) => ... (string) $a) in fulltext-table lookups so they match the string $a convention used at SQLite.php:745, 762, 845. Drops the redundant cast. Committed as refactor(sqlite): type closure params over cast in fulltext lookups.
Maintainability — flagged duplication of the LIKE-method match() block between SQLite and MariaDB is real but tiny (~6 lines), and extracting it would reach into MariaDB's internals which is out of scope here.
Testing — the test diff is just two lines enabling setEmulateMySQL(true). No regression test covers the cross-tenant FTS5 leak that was re-broken multiple times in this PR (commits 47e60b13, 301fbb64, 59c363c5). Adding one would ideally land in SharedTables/Base.php's fulltext suite asserting that tenant B's SEARCH does not return tenant A's tokenised content. Worth following up but outside the scope of this mechanical pass.
Result: 1 readability fix committed; no critical/warning findings remain.
| : ($index['attributes'] ?? []); | ||
|
|
||
| $internal = \array_map( | ||
| fn ($a) => $this->getInternalKeyForAttribute((string) $a), |
There was a problem hiding this comment.
[READABILITY] Fixed in this pass. Untyped fn ($a) plus inline (string) $a cast diverged from the string $a convention used at lines 745, 762, and 845. Replaced with fn (string $a) => $this->getInternalKeyForAttribute($a).
| ? $index->getAttribute('attributes', []) | ||
| : ($index['attributes'] ?? []); | ||
|
|
||
| $internal = \array_map( |
There was a problem hiding this comment.
[READABILITY] Fixed in this pass. Same untyped-closure-plus-cast pattern as line 1046; aligned with the typed convention used elsewhere in the file.
| $cache = new Cache(new RedisAdapter($redis)); | ||
|
|
||
| $database = new Database(new SQLite($pdo), $cache); | ||
| $adapter = new SQLite($pdo); |
There was a problem hiding this comment.
[SUGGESTION] No regression test currently covers the cross-tenant FTS5 isolation that was re-broken multiple times in this PR (most recently 59c363c5). A test in the SharedTables fulltext path that creates an FTS index, indexes content under tenant A, switches to tenant B, and asserts SEARCH returns nothing would fail-fast on any future regression in the trigger WHEN _tenant IS {literal} guards. Not blocking, but worth a follow-up.
Match the (string $a) convention used elsewhere in the file (lines 745, 762) so the redundant `(string)` cast can drop. Behaviour is unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Claude pushed fixes from: improvement |
There was a problem hiding this comment.
Code Review
Multi-dimensional review of the FTS5 / SQLite parity diff (security, performance, correctness, readability, maintainability, testing). Most findings turned out to be false positives or pre-existing patterns inherited from MariaDB; two real issues were fixed in this push.
Fixed in this push
- [CRITICAL]
SQLite.phpgetLikeConditionarray CONTAINS —(string) \json_encode($value)lacked the?: ''fallback thatMariaDB.php:1652already has.json_encode()returnsfalseforNAN/INF, which would degrade the LIKE pattern to%%and silently match every row. Mirrored the MariaDB guard. - [WARNING]
SQLite.phpupdateAttributeresize scan — the explicitcloseCursor()ran after the fetch on the success path only. Ifexecute()threw, the prepared statement leaked and the cursor stayed live. Wrapped intry { execute; fetch } finally { closeCursor }.
Verified (no change needed)
- Cross-tenant FTS5 isolation — cache cleared on
setTenant/setNamespace/setSharedTables; triggers and backfill scope by_tenantliteral. The earlier59c363c5/301fbb64fixes look complete. - Shared-tables perms UNIQUE constraint — looked initially like a
_tenantwas missing on_index_1/_index_2, butgetSQLIndex(line 1721-1726) prepends_tenantto every index column list undersharedTables, so the on-disk shape matches MariaDB. - Permission interpolation in
createDocument/updateDocument— pre-existing pattern inherited from MariaDB, not modified by this PR. Out of scope here, but worth a follow-up to parameterize across both adapters. startTransaction/commitTransaction/rollbackTransactioncursor close —BEGIN IMMEDIATE/COMMIT/ROLLBACK/SAVEPOINTdon't return result rows, socloseCursor()is a no-op functionally; the destructor frees the statement immediately.- N+1 PRAGMA in
getSchemaIndexes/buildFulltextAttributeMap— these are cold-path schema introspection callers (per-collection cache); idiomatic SQLite. Not worth the brittleness of parsingsqlite_master.sql.
Not actioned (deferred / out of scope)
- Test coverage gap — direct FTS5 trigger / sync / cross-tenant tests are missing in
tests/e2e/Adapter/SQLiteTest.php. The base e2e suite exercises fulltext indexes against every adapter, but adapter-specific FTS5 isolation under shared tables would benefit from a dedicated regression test. emulateMySQLpublic API surface —setEmulateMySQL/getEmulateMySQLare publicly callable but only one caller (the test setup) uses them; consider@internalor a trait if this is a testing-only hook.
| ? '%' . $this->escapeWildcards((string) \json_encode($value)) . '%' | ||
| : '%' . $this->escapeWildcards($value) . '%', | ||
| default => $value, | ||
| }; |
There was a problem hiding this comment.
[CRITICAL] Fixed in 0021871e. Added ?: '' guard to mirror MariaDB.php:1652 so json_encode() returning false (NAN/INF) doesn't degrade to an empty LIKE pattern.
| $exceeds = $stmt->fetchColumn() !== false; | ||
| $stmt->closeCursor(); | ||
|
|
||
| if ($exceeds) { |
There was a problem hiding this comment.
[WARNING] Fixed in 0021871e. Wrapped execute+fetch in try { } finally { closeCursor } so the cursor closes if execute() throws.
`getLikeCondition` for array CONTAINS now mirrors MariaDB's `?: ''` guard so NAN/INF (which return `false`) don't degrade into an empty LIKE pattern that silently matches every row. `updateAttribute` resize scan also leaked the prepared cursor when `execute()` threw — the explicit `closeCursor()` came after the fetch and never ran on the failure path. Wrapped in try/finally so the cursor closes regardless.
|
Claude pushed fixes from: improvement |
|
CI healing — SQLite adapter JSON decode regression What failed: Adapter Tests (SQLite) and (SharedTables/SQLite) — 306 failures, 10 errors. Every failing assertion pointed at Root cause: Commit 59c363c removed The cascade to 306 failures came from Fix: Decode only when the value is still a string, so both load paths work: $indexes = $collection->getAttribute('indexes', []);
if (\is_string($indexes)) {
$indexes = \json_decode($indexes, true) ?? [];
}Applied at Verification: Ran the full SQLite and SharedTables/SQLite suites in the docker stack — both green:
Pint passes on the modified file. Memory adapter's lone failure ( |
…/renameIndex
deleteAttribute and renameIndex call $this->getDocument(metadata, ...)
on the adapter directly, which doesn't apply the metadata 'json' filter
— so $collection->getAttribute('indexes', []) returns the raw JSON
string from the column. Commit 59c363c removed the json_decode under
the assumption the value was always pre-decoded; that's only true when
loaded via Database::getCollection().
The TypeError thrown from foreach() over a string surfaced as
testPurgeCollectionCache failing, and worse, testEvents threw before
its $database->on(..., null) cleanup ran — leaking the EVENT_ALL
listener into every later test that triggered an event, which is why
306 tests on SQLite started asserting '' matches null at
CollectionTests.php:1532.
Decode only when the value is still a string so both call paths work.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Claude pushed fixes from: healing |
There was a problem hiding this comment.
Code Review
Ran a six-dimension parallel review against main. Summary by dimension:
- Security — REGEXP UDF had two real DoS vectors: an unbounded delimited-pattern cache and no length cap / ReDoS protection. Identifier interpolation, FTS5 MATCH binding, perms scopes, PRAGMA inputs are all clean. (2 warnings → fixed)
- Performance — Schema introspection paths run N+1 PRAGMAs, FTS5 sync triggers re-tokenise on every metadata write, the FTS5 backfill is unbounded, and
BEGIN IMMEDIATEserialises read-only transactions too. These are real but require larger structural changes; not blocking. (0 fixed this cycle) - Correctness —
resolveFulltextTableByIdswallowed every\Throwable, masking PDO errors and silently falling through to the single-candidate FTS5 drop (could tear down the wrong vtable). FTS-cache invalidation insetTenant/setNamespace/setSharedTableshappened before the parent setter, so a thrown validation would leave a cleared cache against the prior tenant.getSQLIndexTypereturned a placeholder'INDEX'forINDEX_FULLTEXTinstead of throwing. (3 warnings → fixed) - Readability — A handful of magic strings and naming nits; nothing blocking.
- Maintainability — ~300 lines duplicated with MariaDB (relationships, LIKE compilation, metadata walking, transaction-with-rollback boilerplate). Worth a follow-up refactor PR; out of scope here.
- Testing — No SQLite-specific assertions; both fixtures hardcode
setEmulateMySQL(true), leaving the defaultfalsebranch entirely uncovered. Cross-tenant FTS isolation,BEGIN IMMEDIATEserialisation, and several recent-commit fixes have no regression coverage. Recommend a follow-up to add adapter-specific tests.
Fixes applied this cycle
- REGEXP UDF: cap user pattern length at 512 chars, FIFO-evict the delimited-pattern cache at 256 entries, switch the delimiter to
chr(1)so user-supplied/and\aren't re-escaped. resolveFulltextTableById: narrowedcatch (\Throwable)tocatch (NotFoundException).setTenant/setNamespace/setSharedTables: invalidate FTS cache after the parent setter.getSQLIndexType: throwDatabaseExceptionforINDEX_FULLTEXTinstead of silently returning a placeholder.
Not fixed this cycle (deferred)
- N+1 PRAGMA in
getSchemaIndexes/buildFulltextAttributeMap(perf — needspragma_*table-valued-function rewrite). - FTS5 trigger re-tokenisation on unchanged content (perf — needs
WHEN OLD.col IS NOT NEW.colper-column gating). BEGIN IMMEDIATEblocking read-only transactions (perf/lock-contention — needs read/write split).getSQLConditionsForCollectionthin-wrapper redundancy (maintainability).- Adapter-specific test coverage gaps (testing — needs a separate test PR).
| if ($pattern === null || $value === null) { | ||
| return 0; | ||
| } | ||
| $delimited = $delimitedCache[$pattern] ??= '/' . \str_replace('/', '\/', $pattern) . '/u'; |
There was a problem hiding this comment.
[WARNING] Unbounded $delimitedCache keyed on raw user pattern — long-lived adapters (Swoole / pool case) processing many distinct REGEXP patterns grow this map without limit, and the key is the raw pattern itself (potentially MB-long) so an attacker gets cheap memory exhaustion. Also no pattern-length cap means a worker can be pinned by ^(a+)+$-style ReDoS amplified by row count, and '/' . str_replace('/', '\\/', $pattern) . '/u' re-escapes wrong on patterns already containing \/. Fix: cap pattern length at 512, FIFO-evict the cache at 256 entries, use chr(1) as delimiter.
| * when metadata doesn't reach a candidate. | ||
| * | ||
| * @param array<string> $candidates | ||
| */ |
There was a problem hiding this comment.
[WARNING] catch (\Throwable) masks every getDocument failure as "metadata not found" and falls through to the single-candidate FTS5 drop (dropFulltextIndexById line 957), which can tear down the wrong vtable on a transient PDO error. Narrow to catch (NotFoundException) so genuine failures surface.
| public function setTenant(int|string|null $tenant): bool | ||
| { | ||
| if ($this->tenant !== $tenant) { | ||
| $this->ftsTableCache = []; |
There was a problem hiding this comment.
[WARNING] Cache is cleared before parent::setTenant(). If the parent ever throws (or a future override does), we're left with a cleared cache against the prior tenant — equivalent to silently mis-attributing FTS5 lookups across tenants. Same pattern in setNamespace (line 118) and setSharedTables (line 126). Fix: invalidate after the parent call returns.
| @@ -1063,6 +1673,13 @@ protected function getSQLIndexType(string $type): string | |||
| case Database::INDEX_UNIQUE: | |||
| return 'UNIQUE INDEX'; | |||
|
|
|||
There was a problem hiding this comment.
[WARNING] Returning 'INDEX' for INDEX_FULLTEXT papers over a programmer error — fulltext goes through createFulltextIndex and reaching this case would emit invalid SQL into a CREATE INDEX template downstream. The comment even says "this branch should never reach SQL generation". Replace with a thrown DatabaseException so a misuse fails loudly.
| ]); | ||
| } | ||
|
|
||
| // PRAGMA index_list misses FTS5 vtables. |
There was a problem hiding this comment.
[WARNING — deferred] getSchemaIndexes() runs PRAGMA index_info per index plus findFulltextTables() plus PRAGMA table_info per FTS table — 1 + N + 1 + M round trips. Replace with a single query joining pragma_index_list(?) il to pragma_index_info(il.name) ii (and similarly fold the FTS table-info loop). Not fixed this pass; worth a follow-up.
| CREATE TRIGGER `{$ftsTable}_{$deleteSuffix}` AFTER DELETE ON `{$parentTable}`{$deleteWhen} BEGIN | ||
| INSERT INTO `{$ftsTable}` (`{$ftsTable}`, rowid, {$columnList}) VALUES ('delete', OLD.`_id`, {$oldColumnList}); | ||
| END | ||
| "; |
There was a problem hiding this comment.
[WARNING — deferred] AFTER UPDATE OF {columns} fires on assignment, not value-change, so Database::updateDocument (which rewrites every column) re-tokenises FTS5 content on every metadata-only write. Add WHEN OLD.col1 IS NOT NEW.col1 OR ... per indexed column to skip the delete/reinsert pair when content didn't change. Significant under high write workloads.
|
|
||
| $result = $this->getPDO()->beginTransaction(); | ||
| $result = $this->getPDO() | ||
| ->prepare('BEGIN IMMEDIATE') |
There was a problem hiding this comment.
[WARNING — deferred] BEGIN IMMEDIATE reserves the writer lock unconditionally, so read-only transactions now block behind any other writer — defeats WAL concurrency. Also busy_timeout isn't set on the connection, so IMMEDIATE contention fails immediately rather than waiting. Consider read/write split (DEFERRED for read-only TX, IMMEDIATE only when caller will write) and a PRAGMA busy_timeout baseline.
- REGEXP UDF: cap pattern length at 512 chars and FIFO-evict the delimited-pattern cache at 256 entries. Switch the delimiter to chr(1) so user-supplied forward slashes / backslashes aren't re-escaped. Long-lived adapters can no longer be grown without bound by distinct user patterns. - resolveFulltextTableById: narrow the blanket Throwable catch to NotFoundException so genuine PDO errors don't fall through to the single-candidate FTS5 drop and tear down the wrong table. - setTenant / setNamespace / setSharedTables: invalidate the FTS table cache after the parent setter so a thrown validation doesn't leave us with a cleared cache against the prior value. - getSQLIndexType: throw on Database::INDEX_FULLTEXT instead of silently returning 'INDEX' — the FTS5 path goes through createFulltextIndex; reaching that case is a programmer error that should surface, not be papered over.
|
Claude pushed fixes from: improvement |
The LIKE `%json_encode%` pattern produces false positives on array columns: searching `2` in a TINYINT array matches `[12, 200]`, searching `"apple"` matches `["pineapple"]`, etc. Route onArray() CONTAINS / CONTAINS_ANY / NOT_CONTAINS through exact-element operators instead: - MariaDB: JSON_OVERLAPS when supported, JSON_CONTAINS per element OR'd together as fallback. The LIKE branch no longer special-cases arrays. - SQLite: EXISTS (SELECT 1 FROM json_each(col) WHERE value IN (...)) via a new buildArrayContainsCondition routed from getSQLCondition before the LIKE handler. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bring in SQLite FTS5 + PCRE + upsert + PRAGMA introspection from PR #870, translated to the typed Attribute/Index/Relationship API and the Capability enum. SQLite gains: - FTS5-backed fulltext indexes (createFulltextIndex / dropFulltextIndexById / resolveFulltextTableById, FTS5-aware deleteCollection / deleteIndex / getSizeOfCollection, attribute-aware FTS routing) - PHP REGEXP UDF with a bounded LRU pattern cache, gating Capability::PCRE dynamically once registered - BEGIN IMMEDIATE writer serialisation - PRAGMA-based getSchemaAttributes / getSchemaIndexes - Per-statement createRelationship / updateRelationship / deleteRelationship (SQLite PDO can't run multi-statement strings) - setEmulateMySQL flag gating MariaDB-shape emulation across createCollection (_tenant column type), getSchemaAttributes (column-info shape), and updateAttribute (resize-down with TruncateException guard) - SQL.php gains getSQLConditionsForCollection threading $forCollection so the FTS5 routing surfaces consistently across MariaDB / Postgres / SQLite phpstan-baseline shrinks 18 entries — the typed getPDO override and narrowed mixed flow on the new paths resolve the prior ignores. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
The SQLite adapter previously rejected fulltext indexes, forcing every consumer that relies on
Database::INDEX_FULLTEXTto either skip SQLite or strip the indexes. This wires up FTS5-backed fulltext sogetSupportForFulltextIndex()andgetSupportForFulltextWildcardIndex()can return true.Implementation
createIndexbranches onINDEX_FULLTEXTand creates a per-(collection, attributes)FTS5 virtual table withcontent=pointing back at the parent table's_idrowid. AFTER INSERT/UPDATE/DELETE triggers keep the index synchronised; anINSERT ... SELECTbackfills any pre-existing rows.deleteIndexdrops the FTS5 vtable and its triggers when one matches the collection. The index id isn't encoded in the FTS5 name (we name by indexed attributes so the SEARCH path can resolve it without a lookup), so we drop the matching table when there's exactly one for the collection. Multiple coexisting fulltext indexes per collection fall through to the regularDROP INDEXpath.getSQLConditionoverridesTYPE_SEARCHandTYPE_NOT_SEARCHonly and delegates everything else to the inherited MariaDB implementation. SEARCH rewrites toalias._id IN (SELECT rowid FROM <fts> WHERE <fts> MATCH :term)using the existinggetFulltextValue()boolean syntax (FTS5-compatible for prefix and phrase queries).\$currentQueryCollectionsoSQL.php's find/count/sum can thread the active collection name down togetSQLConditionwithout changing the abstract signature. SQLite uses it to derive the FTS5 virtual table name from the search query.Limitations
_tenantfilter to scope results; the FTS5 table itself is not tenant-aware. Correct, but the IN subquery returns cross-tenant rowids before the outer filter kicks in.Test plan
Fulltext index is not supported🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes / Reliability
Chores