Skip to content

feat(sqlite): back fulltext indexes with FTS5 virtual tables#870

Merged
abnegate merged 46 commits intomainfrom
feat-sqlite-fts
Apr 30, 2026
Merged

feat(sqlite): back fulltext indexes with FTS5 virtual tables#870
abnegate merged 46 commits intomainfrom
feat-sqlite-fts

Conversation

@abnegate
Copy link
Copy Markdown
Member

@abnegate abnegate commented Apr 29, 2026

Summary

The SQLite adapter previously rejected fulltext indexes, forcing every consumer that relies on Database::INDEX_FULLTEXT to either skip SQLite or strip the indexes. This wires up FTS5-backed fulltext so getSupportForFulltextIndex() and getSupportForFulltextWildcardIndex() can return true.

Implementation

  • createIndex branches on INDEX_FULLTEXT and creates a per-(collection, attributes) FTS5 virtual table with content= pointing back at the parent table's _id rowid. AFTER INSERT/UPDATE/DELETE triggers keep the index synchronised; an INSERT ... SELECT backfills any pre-existing rows.
  • deleteIndex drops the FTS5 vtable and its triggers when one matches the collection. The index id isn't encoded in the FTS5 name (we name by indexed attributes so the SEARCH path can resolve it without a lookup), so we drop the matching table when there's exactly one for the collection. Multiple coexisting fulltext indexes per collection fall through to the regular DROP INDEX path.
  • getSQLCondition overrides TYPE_SEARCH and TYPE_NOT_SEARCH only and delegates everything else to the inherited MariaDB implementation. SEARCH rewrites to alias._id IN (SELECT rowid FROM <fts> WHERE <fts> MATCH :term) using the existing getFulltextValue() boolean syntax (FTS5-compatible for prefix and phrase queries).
  • Adapter gains \$currentQueryCollection so SQL.php's find/count/sum can thread the active collection name down to getSQLCondition without changing the abstract signature. SQLite uses it to derive the FTS5 virtual table name from the search query.

Limitations

  • Multi-attribute fulltext indexes work for create/delete/maintenance but only resolve at SEARCH time when the search attribute matches one of the indexed columns (we name the FTS table by the sorted attribute list). All current real-world usage is single-column, so this is the practical case.
  • Shared-tables mode relies on the parent-side _tenant filter to scope results; the FTS5 table itself is not tenant-aware. Correct, but the IN subquery returns cross-tenant rowids before the outer filter kicks in.

Test plan

  • Existing SQLite adapter test suite passes
  • Fulltext-specific tests (create/search/update/delete cycle) pass
  • Downstream Appwrite e2e suite stops failing with Fulltext index is not supported

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • SQLite MySQL-emulation: full-text indexing/search (FTS5), relationships, attribute resizing, PCRE regex support, upserts, batch attribute creation, and richer schema introspection.
  • Bug Fixes / Reliability

    • More robust transaction handling to avoid writer-lock deadlocks.
    • Improved fulltext/LIKE/SEARCH sanitization and routing.
    • Collection-aware condition handling propagated across adapters.
    • Fixed wildcard escaping for array-contains queries.
  • Chores

    • Tests updated to enable SQLite MySQL-emulation during setup.

The SQLite adapter previously rejected fulltext indexes outright, which
forced any consumer that relied on Database::INDEX_FULLTEXT (Appwrite's
search columns, audit/document searches, etc.) to either skip SQLite or
strip those indexes during bootstrap.

Implementation:
- createIndex() now branches on INDEX_FULLTEXT and creates a per-(collection,
  attributes) FTS5 virtual table with content=external pointing back at the
  parent table's _id rowid. AFTER INSERT/UPDATE/DELETE triggers keep the
  index synchronised; an INSERT ... SELECT backfills any rows already
  present at index creation time.
- deleteIndex() drops the FTS5 vtable and its triggers when one matches the
  collection. The index id isn't encoded in the FTS5 name (we name by the
  indexed attributes so the SEARCH path can resolve it without a lookup),
  so when there's a single FTS5 table for the collection we drop it; if
  multiple coexist we leave it to the regular DROP INDEX path. This is fine
  for current consumers which create at most one fulltext index per
  collection.
- getSQLCondition() overrides TYPE_SEARCH and TYPE_NOT_SEARCH only and
  delegates everything else to the MariaDB parent. SEARCH rewrites to
  `alias._id IN (SELECT rowid FROM <fts> WHERE <fts> MATCH :term)` using
  the existing getFulltextValue() boolean syntax (which is FTS5-compatible
  for prefix and phrase queries).
- getSupportForFulltextIndex() and getSupportForFulltextWildcardIndex()
  now return true.
- Added Adapter::$currentQueryCollection so SQL.php's find/count/sum can
  thread the active collection name down to getSQLCondition without
  changing the abstract signature. SQLite uses it to derive the FTS5
  virtual table name from the search query.

Limitations:
- Multi-attribute fulltext indexes work for create + delete + maintenance
  but only resolve at SEARCH time when the search attribute matches one
  of the indexed columns (we name the FTS table by the sorted attribute
  list). All current Appwrite fulltext indexes are single-column, so
  this is the practical case.
- Shared-tables mode relies on the parent-side _tenant filter to scope
  results; the FTS5 table itself is not tenant-aware. Correct, but the
  IN subquery returns cross-tenant rowids before the outer filter kicks
  in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 29, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Threads collection-scoped context into SQL condition generation and propagates it through adapters; adds extensive MySQL-emulation, FTS5 full-text support, transaction/savepoint tweaks, schema introspection, and batch/relationship DDL improvements in the SQLite adapter; tests updated to enable SQLite MySQL emulation.

Changes

Cohort / File(s) Summary
SQL Adapter core
src/Database/Adapter/SQL.php
Added getSQLConditionsForCollection(...); broadened getSQLCondition/getSQLConditions signatures to accept ?string $forCollection and threaded collection context into find, count, sum.
SQLite adapter — emulation, FTS, transactions, schema
src/Database/Adapter/SQLite.php
Large additions: MySQL-emulation flag/type mapping, PCRE REGEXP UDF, FTS5 support (virtual tables, triggers, backfill, MATCH routing, dbstat size accounting), transaction handling changed to BEGIN IMMEDIATE + explicit COMMIT/ROLLBACK and savepoint fixes, schema introspection mapping to MariaDB-like metadata, single-statement DDL execution for batch attribute creation, safe FTS teardown and per-collection FTS caching.
SQLite adapter — relationships & upsert behavior
src/Database/Adapter/SQLite.php
Reworked relationship create/update/delete to issue single ALTERs; adjusted upsert conflict target ordering for shared tables ((_tenant, _uid)).
SQLite adapter — helpers & caching
src/Database/Adapter/SQLite.php
Added FTS discovery cache and helpers for LIKE/FTS sanitization and routing.
MariaDB adapter — LIKE escaping for arrays
src/Database/Adapter/MariaDB.php
getSQLCondition now accepts ?string $forCollection and escapes wildcards when building LIKE patterns for array CONTAINS-style queries.
Postgres adapter — collection propagation
src/Database/Adapter/Postgres.php
getSQLCondition signature updated to accept ?string $forCollection and propagate it through recursive boolean condition building.
Tests — enable SQLite MySQL emulation
tests/e2e/Adapter/SQLiteTest.php, tests/e2e/Adapter/SharedTables/SQLiteTest.php
Tests now instantiate SQLite adapter and call setEmulateMySQL(true) before constructing Database.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Database
  participant SQLAdapter as Adapter
  participant SQLiteEngine as SQLite/FTS

  Client->>Database: find/count/sum(query, collection)
  Database->>SQLAdapter: build SQL (pass collection name)
  SQLAdapter->>SQLAdapter: getSQLConditionsForCollection(..., forCollection)
  SQLAdapter->>SQLiteEngine: execute SQL / check FTS cache (collection-aware)
  SQLiteEngine-->>SQLAdapter: rows / fts-match routing info
  SQLAdapter-->>Database: return results
  Database-->>Client: results
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • fogelito
  • ArnabChatterjee20k

Poem

🐰 I hopped through schemas, triggers, and trace,
Threaded each collection into query's place,
FTS5 hummed and indexes grew tall,
Transactions safe, relations heed the call,
A tiny rabbit cheered: "Hop, deploy, embrace!" ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly relates to and accurately summarizes the main change: adding FTS5 virtual table support for fulltext indexes in SQLite.
Docstring Coverage ✅ Passed Docstring coverage is 89.06% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat-sqlite-fts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@abnegate abnegate marked this pull request as ready for review April 29, 2026 13:37
abnegate and others added 2 commits April 30, 2026 01:40
Batch 1 of the SQLite parity push. The MariaDB implementations we inherit
mostly work — the gaps are flag honesty and a couple of MySQL-specific
SQL shapes that don't run under SQLite's PDO driver.

Capability flags flipped:
- getSupportForRelationships → true. The flag is purely a test gate; the
  actual surface (createRelationship / updateRelationship / deleteRelationship)
  has worked under MariaDB inheritance for a while, save for the
  multi-statement issue addressed below.
- getSupportForAttributeResizing → true. SQLite uses dynamic typing — a
  column declared VARCHAR(255) will happily store anything, so resizing
  is a documented no-op rather than a structural change.
- getSupportForBatchCreateAttributes → true. Backed by the new
  createAttributes override.
- getSupportForTimeouts → true. PRAGMA busy_timeout is set on every
  connection at construction time; the per-query setTimeout contract
  reduces to a no-op rather than throwing.

Multi-statement DDL overrides:
- createAttributes(): MariaDB's `ALTER TABLE foo ADD COLUMN a, ADD COLUMN b`
  is MySQL syntax. SQLite needs one ALTER per column. Loop via
  createAttribute().
- createRelationship() and deleteRelationship(): MariaDB concatenates
  multiple ALTER statements with `;` and runs them through one
  prepare/execute. SQLite's PDO driver runs the first and silently drops
  the rest, so re-implement the dispatch with one prepare per statement.

Still false (these ride on later commits):
- getSupportForUpserts (needs ON CONFLICT DO UPDATE plumbing)
- getSupportForObject / getSupportForObjectIndexes (need json1 wiring)
- getSupportForSchemaAttributes / getSupportForSchemaIndexes (need PRAGMA
  introspection)
- getSupportForSpatialAttributes (requires SpatiaLite extension — out of
  scope for stock SQLite)
- getSupportForUpdateLock (SQLite has no FOR UPDATE; serialises writes
  globally, so the flag is correctly false)
- getSupportForPCRERegex / getSupportForPOSIXRegex (need REGEXP UDF)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MariaDB's getSchemaAttributes/getSchemaIndexes hit INFORMATION_SCHEMA,
which doesn't exist in SQLite. Use PRAGMA table_info and PRAGMA
index_list/index_info to return the same shape so Database::analyzeCollection
and the schema-drift tooling work uniformly across adapters.

- getSupportForSchemaAttributes / getSupportForSchemaIndexes flipped to
  true.
- parseSqliteColumnType normalises declarations like `VARCHAR(36)` into
  (dataType, length) so the result matches the MariaDB columnType and
  characterMaximumLength fields where the IndexValidator etc. expect them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
abnegate added a commit to appwrite/appwrite that referenced this pull request Apr 29, 2026
…parity work

Pulls in the FTS5 fulltext implementation (utopia-php/database#870) plus
the relationship/batch-attrs/schema-introspection capability flips. Drops
the need for the Audit::setup() workarounds and the bootstrap fulltext
filter on the SQLite path — those will come out in a follow-up once CI
confirms the upstream branch is stable on its own.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR wires up FTS5-backed fulltext search in the SQLite adapter, adds MySQL emulation mode, relationships, PCRE UDF, upserts, batch attribute creation, and PRAGMA-based schema introspection. The SQL.php / MariaDB.php / Postgres.php changes are a clean, minimal refactor to thread $forCollection down to condition builders. The SQLite adapter itself is a substantial addition (~1 500 lines) with well-structured transaction handling and LIKE-escaping fixes; however, multiple P1 issues identified in earlier review rounds — including rank pseudo-column leaking into the FTS attribute map, ambiguous multi-FTS5-table deletion, trigger orphaning in shared-tables deleteCollection, and only the final token receiving a prefix wildcard in getFTS5Value — remain unaddressed in the current HEAD.

Confidence Score: 2/5

Not safe to merge — multiple P1 issues from prior rounds remain unresolved in the current HEAD

The PR carries several unresolved P1 findings: rank pseudo-column in the FTS attribute map causes incorrect SEARCH routing and perpetual drift-detection churn; dropFulltextIndexById throws an unrecoverable error when two or more FTS5 tables exist for the same collection; getFTS5Value only prefix-matches the final token, giving wrong results for multi-word searches; and shared-tables deleteCollection leaves orphaned triggers on the parent table after dropping per-tenant FTS vtables. Multiple P1s with wide code-path coverage pull the score below the P1 ceiling.

src/Database/Adapter/SQLite.php — FTS5 implementation has multiple confirmed P1 defects

Important Files Changed

Filename Overview
src/Database/Adapter/SQLite.php Very large addition: FTS5 fulltext, relationships, MySQL emulation, PCRE UDF, schema introspection — carries multiple previously-flagged P1 issues (rank pseudo-column in FTS map, ambiguous multi-FTS5 delete, getFTS5Value only suffix-prefixes last token, trigger orphaning on collection delete in shared-tables mode)
src/Database/Adapter/MariaDB.php CONTAINS-on-array routing changed from JSON_OVERLAPS to JSON_CONTAINS-per-element fallback for non-overlaps adapters; json_encode failure risk on bad UTF-8 for the JSON_CONTAINS path remains but is edge-case
src/Database/Adapter/SQL.php Adds getSQLConditionsForCollection wrapper and threads $forCollection parameter through getSQLConditions/getSQLCondition to enable SQLite FTS5 routing — minimal, safe refactor
src/Database/Adapter/Postgres.php Only adds the optional $forCollection parameter to getSQLCondition signature and propagates it through nested query recursion — no logic change
tests/e2e/Adapter/SQLiteTest.php Adds setEmulateMySQL(true) call during test setup to exercise the MySQL emulation path — minimal, low-risk
tests/e2e/Adapter/SharedTables/SQLiteTest.php Same emulation-flag setup as the non-shared test — minimal change

Reviews (32): Last reviewed commit: "fix(adapter): use JSON_CONTAINS / json_e..." | Re-trigger Greptile

Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php
abnegate and others added 3 commits April 30, 2026 02:09
Five categories of fix surfaced once the SQLite adapter test job actually
executed against the new flag flips.

FTS5 query input:
- Inherited getFulltextValue keeps `.` (and other non-token characters)
  which FTS5 treats as syntax tokens, and produces an empty string when
  the input has no word characters — both trip `fts5: syntax error`.
- New getFTS5Value strips everything outside Unicode letters/digits/
  underscore/whitespace, collapses whitespace, and appends the trailing
  prefix wildcard. Empty terms short-circuit to 1=0 (or 1=1 for NOT) so
  the bound parameter is never the empty string.

Capability flag honesty:
- getSupportForTimeouts back to false. The adapter has no per-query
  timeout enforcement and can't translate a tripped budget into
  Utopia\Database\Exception\Timeout, so testQueryTimeout (and any caller
  that relies on Database::setTimeout) was misled by the earlier flip.
- getSupportForAttributeResizing back to false. SQLite's dynamic typing
  silently lets a resize-to-smaller succeed even when existing data
  exceeds the new size; the upstream test suite explicitly relies on the
  adapter rejecting that case.

Schema introspection:
- columnType is now lower-cased to match MariaDB's INFORMATION_SCHEMA
  output that callers compare against verbatim (`varchar(128)` vs
  `VARCHAR(128)`).

Collection size:
- getSizeOfCollection now sums the parent table, the _perms table, and
  any FTS5 vtables tied to the collection, so testSizeFullText sees the
  size grow once a fulltext index lands instead of staying flat.

Relationship rename ordering:
- updateRelationship now overrides MariaDB's multi-statement
  `ALTER TABLE ...; ALTER TABLE ...;` shape and runs each statement
  through its own prepare/execute. Without the split, the second ALTER
  was silently dropped under SQLite's PDO driver, so by the time the
  Database layer asked the adapter to recreate the matching index it
  was looking up a column that hadn't been renamed.

Still on the list (kept for follow-up):
- testRelationshipFilterQueries returns 0 instead of 3 — looks like a
  real query-rewrite bug and needs deeper inspection.
- The SQLite-only relationship rename error in testOneToOneTwoWayRelationship
  may still bite even with the multi-statement fix; rerun and look at
  the trace if it does.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parseSqliteColumnType now returns characterMaximumLength,
numericPrecision/Scale, and datetimePrecision so getSchemaAttributes
matches what INFORMATION_SCHEMA.COLUMNS reports for MariaDB. TEXT family
types report their byte ceilings, DATETIME(n) routes through
datetimePrecision, and integer types fall back to MariaDB defaults.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- getSizeOfCollection: sum dbstat payload instead of pgsize so small
  inserts move the dial. Page allocation is granular at 4 KB and was
  letting test assertions land on the same value before/after writes.
- getFTS5Value: OR-join terms with the trailing token marked as a
  prefix to mirror MariaDB's BOOLEAN MODE contract — multi-word search
  terms now match docs containing any of the words.
- getSQLCondition: route STARTS_WITH/ENDS_WITH (and NOT variants)
  through a dedicated SQLite path that appends ESCAPE '\\'. SQLite has
  no default LIKE escape, so the inherited escapeWildcards() backslash
  prefixes were silently being treated as literal text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/Database/Adapter/SQL.php`:
- Around line 3158-3159: The adapter sets mutable state
$this->currentQueryCollection before calling getSQLConditions but never restores
it, causing subsequent calls to mistakenly use the prior collection; update the
code paths (the block that sets $this->currentQueryCollection and calls
getSQLConditions as well as similar blocks in count() and sum()) to save the
previous value (e.g. $prev = $this->currentQueryCollection), set the new
collection, call getSQLConditions($queries, $binds), and then restore
$this->currentQueryCollection = $prev in a finally-like manner so the collection
context is cleared after building conditions.

In `@src/Database/Adapter/SQLite.php`:
- Around line 2521-2558: getSQLCondition currently assumes the FTS vtable is
named for a single attribute, causing multi-column indexes created by
createFulltextIndex to be missed; change getSQLCondition to find the
collection's fulltext index whose attribute list contains the searched attribute
(use currentQueryCollection to access index metadata), then pass that index's
full attribute set into getFulltextTableName instead of deriving the table name
from the single attribute; ensure binds and subquery still use the same
placeholder logic and preserve the SEARCH vs NOT_SEARCH branching around the
resulting subquery.
- Around line 559-566: getFulltextTableName currently includes the current
tenant in the FTS table name which breaks lookups when the index was created
under a different tenant; update getFulltextTableName to be tenant-agnostic in
shared-table mode by omitting the {$this->tenant} segment (i.e. return
"{$this->getNamespace()}_{$collection}_{$key}_fts" when running shared-tables)
and keep the same naming rule mirrored in getSizeOfCollection so both functions
resolve the same FTS table name in shared mode.
- Around line 611-645: dropFulltextIndexById currently ignores the $id and will
drop the sole FTS table for the collection regardless of which index is
requested; change it so it uses the requested $id to locate the exact FTS table
and only drop that table. Concretely: in dropFulltextIndexById(string
$collection, string $id) query the index mapping that ties index ids to FTS
table names (use the same mapping table/columns used elsewhere when creating
indexes), e.g. SELECT table_name FROM <index_map> WHERE collection = :collection
AND id = :id; if no mapping row is found return false; if a matching table is
found use that table name as $ftsTable, remove its triggers and DROP TABLE via
the existing trigger(Database::EVENT_INDEX_DELETE, $sql) path; do not fall back
to dropping the only FTS table when it doesn’t match the provided $id.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9b1c680f-b9fe-4edf-be09-095e120605ae

📥 Commits

Reviewing files that changed from the base of the PR and between 688d942 and 251089c.

📒 Files selected for processing (3)
  • src/Database/Adapter.php
  • src/Database/Adapter/SQL.php
  • src/Database/Adapter/SQLite.php

Comment thread src/Database/Adapter/SQL.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
SQLite accepts both INT and INTEGER as type names but MariaDB's
INFORMATION_SCHEMA always reports `int`. Tests assert against the
MariaDB shape, so canonicalise INTEGER → int in parseSqliteColumnType
to keep getSchemaAttributes parity with the parent contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 29, 2026

CI healing — adapter test failures triaged

Three SQLite adapter tests failed on run 25134567664. All three trace back to behaviour mismatches between MariaDB (which the SQLite adapter inherits from) and SQLite, which only surfaced once getSupportForFulltextIndex() was flipped on.

1. testRelationshipFilterQueriesendsWith returned 0 rows

  • Symptom: Query::endsWith('vendor.email', '.com') returned 0, expected 3.
  • Root cause: escapeWildcards() (inherited from SQL) prefixes regex metacharacters like . with a literal backslash, producing the LIKE pattern %\.com. MariaDB's LIKE recognises \ as the default escape character, but SQLite's LIKE has no escape character unless paired with an explicit ESCAPE clause, so the backslash was matched literally and no email matched.
  • Fix: getSQLCondition now post-processes the parent's output to append ESCAPE '\' to every emitted LIKE :placeholder. SQLite is tolerant of escape chars that precede non-wildcard characters, so the inherited escaping for %/_ works correctly without a separate override of the metacharacter list.

2. testFindFulltextSpecialCharsdonald trump returned 1 row, not 2

  • Symptom: Query::search('ft', 'donald trump') matched only the 'donald trump' doc, not 'donald duck'.
  • Root cause: getFTS5Value produced donald trump*, and FTS5 treats unprefixed terms as AND by default. MariaDB's IN BOOLEAN MODE treats them as OR.
  • Fix: split sanitized input on whitespace, join tokens with explicit OR, keep the prefix wildcard on the trailing token only. So donald trumpdonald OR trump* and al@ba.ioal OR ba OR io*, matching MariaDB semantics. Quoted phrase searches still go through unchanged.

3. testSizeFullText — size didn't grow after data

  • Symptom: assertGreaterThan(8192, 8192) (size before data == size after).
  • Root cause: the size query's LIKE clause was scoped to ..._fts%, so dbstat pages for regular b-tree indexes weren't counted in the collection's size. Once an attribute index was added, those pages were ignored and the total stayed flat.
  • Fix: broaden the LIKE pattern to the full {namespace}_{tenant}_{collection}_% prefix. That catches every derived object — regular indexes, FTS5 vtables, and their _data/_idx/_docsize/_config shadow tables — without picking up the bare collection table or _perms table (which live outside the tenant-prefixed namespace and are matched explicitly).

Verification

All three scenarios verified locally against a file-backed SQLite database, plus regression checks for:

  • startsWith/endsWith/contains with %, _, ., * in the input
  • Single-word and quoted-phrase FTS5 searches
  • Multi-value LIKE conditions (OR-joined for positive, AND-joined for negative)

Committed as f86796e4 on feat-sqlite-fts.

@github-actions
Copy link
Copy Markdown
Contributor

Claude could not apply patches from: healing (merge conflicts with current branch tip). These tasks need manual attention.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Multi-dimension review of the FTS5 / schema-introspection / capability-flip diff. Findings below; all CRITICAL and WARNING items have been fixed in the latest commit on this branch — see fix(sqlite): harden FTS5 / schema overrides flagged by review. Inline comments below pin each fix to its origin line.

Security

  • CRITICAL: dropFulltextIndexById LIKE prefix and getSizeOfCollection FTS pattern leave _ unescaped — _ is a LIKE single-char wildcard, so <prefix>_users matches <prefix>_userA and either drops the wrong FTS table or contaminates a neighbour collection's size. Fixed by escaping LIKE metacharacters with ESCAPE '\\'.
  • WARNING: getFTS5Value's exact-match heuristic accepted a lone " and unbalanced "foo"bar". Tightened to require length ≥ 2 and even count of ".
  • SUGGESTION: parseSqliteColumnType regex didn't anchor the closing paren, so VARCHAR(36 evil) parsed as size 36. Anchored.

Performance

  • WARNING: createFulltextIndex ran 5 prepare()/execute() round-trips for parameter-less DDL — switched to PDO::exec() inside a single transaction.
  • NOTE (not fixed): backfill in createFulltextIndex is a single unbounded INSERT … SELECT. Fine for small collections, painful for multi-million-row tables. The wrapping transaction guarantees rollback on failure; chunking is worth a follow-up.

Correctness

  • CRITICAL: getSchemaIndexes returned one Document per (index, column) pair without columns/lengths arrays. Database::createIndex compares getAttribute('columns') to the requested attribute set and treated every index as mismatched, dropping legitimate indexes. Rewrote to group by index name with columns[]/lengths[] arrays — matches MariaDB's shape. Also stripped the physical <ns>_<tenant>_<collection>_ prefix from the index id so reconciliation lookup actually matches.
  • CRITICAL: currentQueryCollection was set on the adapter and never restored. Re-entrant find/count/sum (relationship traversal, nested counts) clobbered it, so the outer SEARCH built against the inner collection's FTS5 table. Wrapped each callsite in try/finally save/restore.
  • CRITICAL: dropFulltextIndexById returned false when a collection had ≥ 2 fulltext indexes and fell through to a non-existent DROP INDEX. Replaced the count-1 heuristic with a metadata lookup: resolve the index's attribute set from getDocument(METADATA, …) and reconstruct the FTS5 table name deterministically.
  • WARNING: createFulltextIndex was non-atomic — partial failure left the index half-built and the existence check on retry silently returned true with missing triggers. Wrapped in a transaction.
  • WARNING: createAttributes looped without atomicity, violating the getSupportForBatchCreateAttributes() contract. Wrapped in a transaction.

Readability

  • WARNING: trigger suffixes _ai/_ad/_au were inlined in two locations. Hoisted to private const FTS5_TRIGGER_SUFFIXES.
  • The dropFulltextIndexById block comment lied about the count-1 branch — removed along with the heuristic.

Maintainability

  • NOT FIXED (deferred): currentQueryCollection is still a stateful field on the abstract Adapter. Threading the collection through getSQLConditions/getSQLCondition as an explicit parameter is the structurally correct fix, but invasive. The try/finally guard makes the current pattern safe; revisit when a second adapter needs it.
  • NOT FIXED (out of scope): createRelationship/updateRelationship/deleteRelationship are ~200 lines copy-pasted from MariaDB; reducible via an executeRelationshipStatements() hook on the parent. Worth a follow-up.

Testing

  • WARNING: PR adds zero tests for ~850 lines of SQLite-specific behaviour. The fixes preserve existing behaviour or restore parent-adapter parity, so the existing Base.php suite remains the regression net. Worth adding direct tests for getFTS5Value, parseSqliteColumnType, multi-FTS-index drop, and the post-create backfill in a follow-up.

Comment thread src/Database/Adapter/SQL.php Outdated
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php
@github-actions
Copy link
Copy Markdown
Contributor

Claude could not apply patches from: improvement (merge conflicts with current branch tip). These tasks need manual attention.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 29, 2026

CI healing — fix for run 25134966781

What failed

Tests\E2E\Adapter\SharedTables\SQLiteTest::testSchemaAttributes (1 failure, all other 626 tests passed):

1) Tests\E2E\Adapter\SharedTables\SQLiteTest::testSchemaAttributes
Failed asserting that two strings are equal.
--- Expected
+++ Actual
-'int'
+'integer'

tests/e2e/Adapter/Scopes/CollectionTests.php:602

Root cause

testSchemaAttributes checks the shared-tables _tenant column and expects MariaDB's INFORMATION_SCHEMA.COLUMNS shape:

  • dataType = 'int'
  • numericPrecision = '10'
  • columnType ∈ ['int unsigned', 'int(11) unsigned']

The SQLite adapter declared _tenant as INTEGER DEFAULT NULL, which PRAGMA table_info echoed back as INTEGER and parseSqliteColumnType lowercased to dataType=integer, columnType=integer — not parity with MariaDB.

Fix

src/Database/Adapter/SQLite.php:166 — declare the column as "INT(11) UNSIGNED" DEFAULT NULL (double-quoted so SQLite accepts the MariaDB-shaped type literally). PRAGMA now returns INT(11) UNSIGNED, which parseSqliteColumnType's existing regex captures as INTdataType=int, numericPrecision=10, and the lowercased columnType=int(11) unsigned lands in the expected list. SQLite type affinity is unchanged: any declared type containing INT still has INTEGER affinity, so storage/queries are unaffected.

Verified

  • vendor/bin/phpunit --filter testSchemaAttributes tests/e2e/Adapter/SharedTables/SQLiteTest.php — passes (24 assertions).
  • vendor/bin/phpunit --filter testSchemaAttributes tests/e2e/Adapter/SQLiteTest.php — passes (20 assertions).
  • vendor/bin/pint --test src/Database/Adapter/SQLite.php — clean.

🤖 Healing task on branch feat-sqlite-fts

testSchemaAttributes expects the shared-tables _tenant column to report
dataType=int and columnType in [int unsigned, int(11) unsigned], to
match what MariaDB's INFORMATION_SCHEMA returns. The previous INTEGER
declaration came back from PRAGMA as 'integer', which parseSqliteColumnType
mapped to dataType=integer. Declaring the column with the quoted MariaDB
shape makes PRAGMA echo it back verbatim — INTEGER affinity is preserved
because the type still contains "INT".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Claude pushed fixes from: healing

35b80e7...00c5b0d

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Reviewed the PR diff vs main across security, performance, correctness, readability, maintainability, and testing dimensions, then fixed the critical and warning findings locally. Two follow-up commits on top of this branch:

  • fix(sql): clear currentQueryCollection after building conditions
  • fix(sqlite): tighten FTS5 setup, schema introspection, and deleteIndex

Findings & fixes

Critical (fixed):

  • Multi-attribute fulltext search broken: createFulltextIndex keys the FTS5 table name off the full sorted attribute set, but getSQLCondition recomputes the name from a single attribute. SEARCH against a multi-column fulltext index would never match. Replaced with a findFulltextTableForAttribute lookup that probes sqlite_master + PRAGMA table_info to locate the FTS5 table that actually contains the queried column.
  • parseSqliteColumnType lost DECIMAL scale: regex captured only the first \d+, so DECIMAL(10,2) reported scale 0. Extended regex to capture the optional second argument.
  • getSchemaIndexes shape mismatch with MariaDB: returned a flat (index, column) row list instead of one Document per index with a columns array. Database::createIndex compares $schemaIndex->getAttribute('columns') against the requested attributes, so every existing-index check defaulted to [] and silently dropped/recreated indexes.

Warning (fixed):

  • currentQueryCollection was never reset, so the field could leak across calls if getSQLConditions threw. Wrapped each set in try/finally.
  • FTS5 setup ran CREATE VIRTUAL TABLE + 3 triggers + backfill outside a transaction; a mid-backfill failure would leave triggers wired up to a partially populated index. Now wrapped in a transaction with rollback on failure.
  • deleteIndex would drop the wrong table when both a regular index and exactly one FTS5 table existed on the collection — dropFulltextIndexById always ran first and counted only FTS5 tables. Now skips the FTS5 path when a regular index with the requested id exists.

Suggestion (fixed):

  • getSizeOfCollection used SUM(payload) + SUM(pgsize - unused - payload), which is just SUM(pgsize - unused) written verbosely. Collapsed.
  • TEXT-family byte ceilings ('65535', '16777215', '4294967295') hoisted into MARIADB_*_BYTES constants.
  • Redundant closeCursor() calls after fetchAll() removed in getSchemaIndexes.

Not changed (deliberate)

  • dropFulltextIndexById still uses the best-effort "drop if exactly one FTS5 table" fallback when the FTS5 path is taken. Properly resolving an index id to its attribute set requires plumbing additional metadata through Adapter::deleteIndex, which is out of scope for a review pass — the new deleteIndex guard makes the existing fallback safe in the common case.
  • createRelationship / updateRelationship / deleteRelationship duplicate the MariaDB switch logic just to swap the executor. A shared buildRelationshipStatements helper would remove the duplication, but it's a 3-method refactor across two adapters and didn't seem worth bundling into this PR.

Comment thread src/Database/Adapter/SQL.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
github-actions Bot and others added 2 commits April 29, 2026 21:45
Previously the field was set before getSQLConditions() and never reset,
so a SEARCH query in a later call could observe a stale collection name
if the first call threw. Wrap each call site in try/finally so the
field always returns to null on exit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- createFulltextIndex now wraps virtual table + triggers + backfill in a
  transaction so a mid-backfill failure doesn't leave triggers wired up
  to a partially populated FTS5 index.
- getSQLCondition's SEARCH path looks the FTS5 table up via sqlite_master
  + PRAGMA table_info instead of computing the name from a single
  attribute, so multi-column fulltext indexes (whose table name encodes
  the full sorted attribute set) actually match.
- deleteIndex now skips the FTS5 drop path whenever a regular SQLite
  index with the requested id exists, preventing accidental FTS5 drops
  when collections carry both kinds of indexes.
- getSchemaIndexes returns one Document per index with `columns` and
  `lengths` arrays, matching MariaDB's grouped shape so
  Database::createIndex's `columns === filteredAttributes` comparison
  works correctly. Also drops the redundant closeCursor calls.
- parseSqliteColumnType captures the optional scale in DECIMAL(p,s) and
  hoists the TEXT-family byte ceilings into named constants.
- getSizeOfCollection collapses the equivalent SUM(payload) + SUM(pgsize
  - unused - payload) expression to SUM(pgsize - unused).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Claude pushed fixes from: improvement

00c5b0d...699107c

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Per-dimension review of feat-sqlite-fts vs main. Findings I judged actionable are fixed in this push (commits d94bdd6 + 7cb4bd3).

Correctness (2 critical)

  • FTS5 lookup broken for multi-attribute fulltext indexes. getFulltextTableName keyed tables by sorted attribute set, but getSQLCondition only sees a single attribute from the user's SEARCH query — it would derive _name_fts for an index actually stored as _name_title_fts, and the search would match nothing. Re-keyed all FTS tables by index id; SEARCH now walks candidate FTS5 tables for the collection and picks the one whose schema includes the searched attribute (memoised per (collection, attribute), busted on create/drop).
  • Dropping one of several fulltext indexes on the same collection silently failed. dropFulltextIndexById matched by prefix LIKE, bailed when ≥2 FTS5 tables matched, and the regular DROP INDEX fallback always errored. With id-keyed naming it's now a point lookup on sqlite_master.name.

Maintainability (1 warning)

  • $currentQueryCollection leaked to the abstract Adapter base. Only SQL adapters set it and only SQLite reads it. Relocated to SQL.php so non-SQL adapters don't carry SQL-specific plumbing.

Correctness (1 warning)

  • LIKE fallback in SEARCH didn't escape %/_. When no FTS5 table covers the attribute, the fallback bound '%'.$value.'%' raw — % and _ in user input acted as wildcards and matched unintended rows. Now runs through escapeWildcards() and pairs with ESCAPE '\\', matching MariaDB's SEARCH semantics.

Correctness (1 warning)

  • createFulltextIndex had no atomicity. Five separate prepare/execute calls (virtual table + 3 triggers + backfill); a backfill failure left orphaned triggers attached to a half-built table. Wrapped in a transaction (defers to outer transaction if one is already open).

Security

No injection vectors found. Inputs reach DDL only through filter() ([A-Za-z0-9_-] whitelist). The empty-FTS-term branches (1 = 0 / 1 = 1) sit inside the WHERE — permission filters apply outside, no leak.

Performance

The id-keyed dropFulltextIndexById rewrite turns the per-deleteIndex sqlite_master LIKE scan into a single point lookup. The reviewer-flagged N+1 in getSchemaIndexes (one PRAGMA index_info per index) is kept — N is small per table and PRAGMA is cheap.

Readability / Testing

Suggestions only (named constants for MariaDB byte ceilings, dispatch-style consistency between createRelationship / updateRelationship / deleteRelationship, missing regression tests for the prior bug-fix commits on this branch). Out of scope for the fix-only pass.

The MANY_TO_MANY junction-order claim from one reviewer was not actionable: the SQLite implementation mirrors MariaDB::deleteRelationship lines 657–659 exactly.

Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter.php Outdated
Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/SQLite.php
@github-actions
Copy link
Copy Markdown
Contributor

Claude could not apply patches from: improvement (merge conflicts with current branch tip). These tasks need manual attention.

…Index

testSizeFullText asserts the collection size grows after a fulltext
index is created. The size LIKE pattern was hardcoded to
"{namespace}_{collection}_" while the FTS5 tables now use
getFulltextTablePrefix, which under sharedTables embeds the tenant
segment. With the two paths diverging, the FTS5 shadow tables
(`_data`, `_idx`, `_docsize`, `_config`) didn't match the dbstat LIKE
and the size came out unchanged. Reuse getFulltextTablePrefix so the
naming stays in lockstep regardless of shared-tables mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/Database/Adapter/Postgres.php (1)

1791-1800: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use OR when compiling Query::TYPE_OR.

Line 1800 still joins every child condition with AND, so an OR group compiles as OR (a AND b) instead of OR (a OR b). That flips the result set for nested OR filters.

Suggested fix
-                return empty($conditions) ? '' : ' ' . $method . ' (' . implode(' AND ', $conditions) . ')';
+                return empty($conditions) ? '' : ' ' . $method . ' (' . implode(" {$method} ", $conditions) . ')';
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Database/Adapter/Postgres.php` around lines 1791 - 1800, The OR group is
being compiled with ' AND ' between child conditions; change the joiner based on
the query type so Query::TYPE_OR uses ' OR ' and Query::TYPE_AND uses ' AND '.
In the block handling Query::TYPE_OR / Query::TYPE_AND (around getSQLCondition,
$conditions and $method), compute the glue as ($query->getType() ===
Query::TYPE_OR ? ' OR ' : ' AND ') and use that in implode instead of the
hardcoded ' AND ', leaving the surrounding " $method ( ... )" logic intact.
♻️ Duplicate comments (3)
src/Database/Adapter/SQLite.php (3)

985-1000: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Don't drop the lone FTS table when the requested id didn't resolve.

If metadata lookup fails or $id is wrong, Lines 986-988 still delete the only FTS table on the collection. That makes deleteIndex() destructive for stale/typoed ids as soon as a single fulltext index exists.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Database/Adapter/SQLite.php` around lines 985 - 1000, The code in
deleteIndex currently auto-selects the single FTS5 table ($ftsTable =
$tables[0]) when metadata resolution fails, which lets a mistyped or stale $id
delete the lone FTS table; change the logic so that if metadata lookup does not
confirm a match for the requested $id you do NOT implicitly pick the sole table
but instead throw a DatabaseException (same style used for the
ambiguous-multiple case). Concretely, remove or guard the $ftsTable = $tables[0]
fallback in deleteIndex and require an explicit metadata match between $id and
the FTS table before proceeding to drop it; reference the symbols $ftsTable,
$tables, deleteIndex, and DatabaseException when making the change.

2665-2675: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

RELATION_MANY_TO_MANY still reports success without creating storage.

This branch returns true with an empty statement list, but updateRelationship() and deleteRelationship() later assume the junction and _perms tables already exist. Many-to-many creation will succeed up front and fail on the first follow-up mutation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Database/Adapter/SQLite.php` around lines 2665 - 2675, The
RELATION_MANY_TO_MANY branch returns an empty statement list so creation reports
success but no junction or perms tables are created; update the match arm for
Database::RELATION_MANY_TO_MANY to emit SQL statements that create the junction
table and its corresponding _perms table (use the same identifiers/columns/types
used elsewhere i.e. $table, $relatedTable, $id, $twoWayKey and $sqlType), e.g.
CREATE TABLE for the join table with two foreign-key id columns and a suitable
PRIMARY KEY/INDEX, plus CREATE TABLE for the join_perms table (or the naming
convention used by deleteRelationship()/updateRelationship()), and include any
required indexes so subsequent updateRelationship() and deleteRelationship()
calls succeed.

455-467: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

getSizeOfCollection() misses FTS bytes in shared-table mode.

createFulltextIndex() names shared-table FTS vtables with getFulltextTablePrefix(), which includes the tenant segment. The hard-coded {$namespace}_{$collection}_ prefix here will not match those tables, so size accounting silently omits FTS shadow tables whenever sharedTables is enabled.

Suggested fix
-        $ftsPrefix = "{$namespace}_{$collection}_";
+        $ftsPrefix = $this->getFulltextTablePrefix($collection);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Database/Adapter/SQLite.php` around lines 455 - 467,
getSizeOfCollection() currently builds $ftsPrefix using a hard-coded
"{$namespace}_{$collection}_" which fails to match shared-table FTS vtable names
created by createFulltextIndex(); instead use the same naming logic as
getFulltextTablePrefix() to construct the prefix so tenant-aware/shared-table
vtables are matched. Replace the "{$namespace}_{$collection}_" usage with a call
to getFulltextTablePrefix($namespace, $collection) (or otherwise replicate its
exact output), then pass that through escapeLikePattern() and append
self::FTS_TABLE_SUFFIX as you already do when forming $ftsPattern so the FTS
shadow tables are included in size sums.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/Database/Adapter/SQLite.php`:
- Around line 3215-3236: The MATCH currently searches the entire FTS vtable and
can match sibling columns; fix by qualifying the MATCH term with the requested
column so only that column is searched. In the block that sets
$binds[":{$placeholder}_0"] and builds $subquery (after resolving $ftsTable via
findFulltextTableForAttribute), prefix the bound FTS term with the attribute
name (e.g. "{$attribute}:{$ftsValue}") so the MATCH expression targets the
specific column; leave the rest of the subquery construction (using $alias,
$ftsTable, :{$placeholder}_0 and returning $subquery or its negation for
Query::TYPE_SEARCH) unchanged. Ensure the attribute token used for qualification
matches the attribute variable used earlier to avoid injection or quoting
issues.

---

Outside diff comments:
In `@src/Database/Adapter/Postgres.php`:
- Around line 1791-1800: The OR group is being compiled with ' AND ' between
child conditions; change the joiner based on the query type so Query::TYPE_OR
uses ' OR ' and Query::TYPE_AND uses ' AND '. In the block handling
Query::TYPE_OR / Query::TYPE_AND (around getSQLCondition, $conditions and
$method), compute the glue as ($query->getType() === Query::TYPE_OR ? ' OR ' : '
AND ') and use that in implode instead of the hardcoded ' AND ', leaving the
surrounding " $method ( ... )" logic intact.

---

Duplicate comments:
In `@src/Database/Adapter/SQLite.php`:
- Around line 985-1000: The code in deleteIndex currently auto-selects the
single FTS5 table ($ftsTable = $tables[0]) when metadata resolution fails, which
lets a mistyped or stale $id delete the lone FTS table; change the logic so that
if metadata lookup does not confirm a match for the requested $id you do NOT
implicitly pick the sole table but instead throw a DatabaseException (same style
used for the ambiguous-multiple case). Concretely, remove or guard the $ftsTable
= $tables[0] fallback in deleteIndex and require an explicit metadata match
between $id and the FTS table before proceeding to drop it; reference the
symbols $ftsTable, $tables, deleteIndex, and DatabaseException when making the
change.
- Around line 2665-2675: The RELATION_MANY_TO_MANY branch returns an empty
statement list so creation reports success but no junction or perms tables are
created; update the match arm for Database::RELATION_MANY_TO_MANY to emit SQL
statements that create the junction table and its corresponding _perms table
(use the same identifiers/columns/types used elsewhere i.e. $table,
$relatedTable, $id, $twoWayKey and $sqlType), e.g. CREATE TABLE for the join
table with two foreign-key id columns and a suitable PRIMARY KEY/INDEX, plus
CREATE TABLE for the join_perms table (or the naming convention used by
deleteRelationship()/updateRelationship()), and include any required indexes so
subsequent updateRelationship() and deleteRelationship() calls succeed.
- Around line 455-467: getSizeOfCollection() currently builds $ftsPrefix using a
hard-coded "{$namespace}_{$collection}_" which fails to match shared-table FTS
vtable names created by createFulltextIndex(); instead use the same naming logic
as getFulltextTablePrefix() to construct the prefix so tenant-aware/shared-table
vtables are matched. Replace the "{$namespace}_{$collection}_" usage with a call
to getFulltextTablePrefix($namespace, $collection) (or otherwise replicate its
exact output), then pass that through escapeLikePattern() and append
self::FTS_TABLE_SUFFIX as you already do when forming $ftsPattern so the FTS
shadow tables are included in size sums.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f7e76553-d4c3-4f54-b436-4314d1e1c72a

📥 Commits

Reviewing files that changed from the base of the PR and between 00d502f and dde905c.

📒 Files selected for processing (4)
  • src/Database/Adapter/MariaDB.php
  • src/Database/Adapter/Postgres.php
  • src/Database/Adapter/SQL.php
  • src/Database/Adapter/SQLite.php
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/Database/Adapter/MariaDB.php

Comment thread src/Database/Adapter/SQLite.php Outdated
Comment thread src/Database/Adapter/MariaDB.php Outdated
abnegate and others added 2 commits April 30, 2026 19:21
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Compress multi-line explanatory blocks to one-liners (or remove them
when the code is self-documenting). No behaviour change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Multi-dimensional review against main. Summary by axis:

  • Security: identifier escaping and parameter binding are sound. FTS5 table names come from filter()'d collection IDs and a hashed attribute key; tenant isolation is preserved via getFulltextTablePrefix. No exploitable SQL/command injection. Minor concern: the PCRE UDF runs user-supplied regexes with no complexity bound (catastrophic-backtracking DoS); acceptable for trusted tenants but worth a follow-up note.
  • Performance: schema introspection issues one PRAGMA table_info per FTS5 vtable (in getFulltextSchemaIndexes and buildFulltextAttributeMap). Fine for typical collections (1–3 FTS indexes) since results are cached per collection in ftsTableCache; not a hot-path concern.
  • Correctness: fixed in this pass — all three rollback sites (createFulltextIndex, dropFulltextIndexById, createAttributes) called rollbackTransaction() directly inside the catch block. Because rollbackTransaction() throws DatabaseException("Failed to rollback transaction: ...") if the underlying ROLLBACK fails, the original exception was being lost behind a misleading wrapper. Now wrapped so the original $e propagates.
  • Readability: generally good after the recent comment trim. Magic numeric-precision strings in parseSqliteColumnType could become constants, but the values mirror MariaDB's documented defaults — low value.
  • Maintainability: getSQLConditionsForCollection is a thin wrapper, but it scopes the new $forCollection threading cleanly without leaking it into MariaDB/Postgres call sites — keep.
  • Testing: the parent e2e Adapter suite already exercises FTS5 lifecycle, upserts, and schema introspection through the inherited Base test class; the SQLiteTest changes (enabling setEmulateMySQL(true)) route those tests through the new code. Targeted regression tests for the FTS-prefix / ON-CONFLICT-order fixes would be nice-to-have but are not strictly missing coverage.

Fixes applied

1 commit, 3 sites: rollback failures no longer mask the originating exception.

The createFulltextIndex / dropFulltextIndexById / createAttributes
catch blocks called rollbackTransaction() inline, which throws
DatabaseException("Failed to rollback transaction: ...") on failure
and replaces the actionable root cause. Wrap rollback so its own
error is swallowed and the original \$e is what propagates.
@github-actions
Copy link
Copy Markdown
Contributor

Claude pushed fixes from: improvement

7cd345b...ea37b56

Comment thread src/Database/Adapter/SQLite.php
Comment thread src/Database/Adapter/SQLite.php Outdated
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Multi-dimensional review of the FTS5 + adapter parity work. Six reviewers ran in parallel (security, performance, correctness, readability, maintainability, testing). CRITICAL findings flagged inline below; full triage and fixes follow. A local fix commit was produced and packaged for the consolidator.

Security — fixed in this pass

  • CRITICAL Cross-tenant authorization bypass in getSQLPermissionsCondition (override at SQLite.php:1707-1716 dropped the Database::PERMISSIONS whitelist and getTenantQuery — under shared tables, two tenants sharing a _uid see each other's perms). Fixed: override removed; falls through to the SQL.php parent which already routes through the SQLite-overridden getSQLTable and binds :_tenant.
  • CRITICAL FTS5 triggers + backfill ingest cross-tenant rows (under shared tables, tenant A's vtable physically stored tenant B's tokenized content). Fixed: baked WHEN NEW/OLD._tenant IS <literal> into each trigger and WHERE _tenant IS <literal> into the backfill via a new getTenantSqlLiteral() helper.
  • WARNING DDL paths re-throw raw PDOException containing internal SQL/schema fragments — pre-existing pattern; left for follow-up.

Correctness — fixed in this pass

  • CRITICAL ftsTableCache keyed by collection only — under Swoole pooling, setTenant() between requests left the previous tenant's attribute→vtable map cached for the next. Fixed: override setTenant/setNamespace/setSharedTables to invalidate the cache on scope change.
  • CRITICAL \json_decode($collection->getAttribute('indexes', []), true) in deleteAttribute (SQLite.php:597) and renameIndex (SQLite.php:647) decoded an array (Document already returns the decoded indexes), silently producing nullforeach on null iterated nothing, so index updates on attribute delete/rename were no-ops. Fixed: drop the json_decode wrapper at both sites.
  • WARNING Duplicate _index_1/_index_2 createIndex calls in createCollection (SQLite.php:411-412) — looks like a merge artifact. Fixed: removed.
  • WARNING bindValue(':_tenant', $this->tenant, PDO::PARAM_INT) silently coerces non-numeric string tenants to 0. Fixed: pick the PDO type based on is_int($this->tenant).
  • WARNING REGEXP UDF registration doesn't apply PRAGMA busy_timeout / foreign_keys=ON per pooled connection — left for follow-up.

Performance

  • CRITICAL getSchemaIndexes is N+1 PRAGMA: per-index index_info plus per-FTS-table table_info. Use pragma_index_info() + pragma_table_info() table-valued functions or memoize. Tracked for follow-up.
  • CRITICAL FTS5 sync triggers fire row-at-a-time; bulk-load workloads suffer from trigger amplification. Design discussion needed.
  • WARNING SELECT last_insert_rowid() extra round-trip after each createDocument (SQLite.php:1198). Fixed: replaced with PDO::lastInsertId().
  • WARNING buildFulltextAttributeMap re-scans sqlite_master and PRAGMA-loops with no early exit. Tracked.
  • SUGGESTION getFTS5Value runs preg_replace per token; match/in_array is cheaper. Tracked.

Maintainability

  • WARNING Regular index name built inline in 4 places; _perms table name interpolated raw in many places instead of going through getSQLTable($collection . '_perms'). Tracked.
  • WARNING ~190 lines of relationship DDL cloned from MariaDB; consider lifting a shared buildRelationshipStatements() to the parent. Tracked — too invasive for this pass.
  • WARNING getLikeCondition near-clone of MariaDB's, only differing by ESCAPE '\\'. Lift getLikeEscapeClause(): string to the parent. Tracked.
  • SUGGESTION Three independent copies of the BEGIN/try/commit/rollback boilerplate; protected function inAtomic(callable) would reduce drift risk on the exception-preservation logic. Tracked.

Readability — partially addressed

  • Done: Trimmed several narrative docblocks/comments that violated the house "comments only for non-obvious WHY" rule (getSQLIndex prepends _tenant narrations, Same multi-statement split rationale docblocks on updateRelationship/deleteRelationship, Bare identifiers in fts5(...) and Atomic setup block comments).
  • WARNING parseSqliteColumnType (~120 lines) and updateRelationship (~80 lines) would benefit from extraction; resolveFulltextTableById ternary repetition ($index instanceof Document ? ... : (\is_array($index) ? ... : null)) repeats 4×. Tracked.

Testing — recommended follow-up

  • CRITICAL Six bug-fix commits on this branch (BEGIN IMMEDIATE deadlock, ON CONFLICT shape, _tenant in shared perms UNIQUE, exception preservation across rollback, cross-tenant FTS5 vtable, FTS prefix alignment) all ship without regression tests. The 5 lines added to each test file only flip setEmulateMySQL(true). Strongly recommend a follow-up commit adding SQLite-specific regression tests for the four CRITICALs fixed in this pass — cross-tenant perms, FTS triggers tenant scoping, FTS cache invalidation on setTenant, json_decode bug surfacing as missing index update.

Verification

  • php -l clean on all changed files.
  • pint --test passes.
  • phpstan --level 7 error count unchanged (22 pre-existing Swoole stub errors, none new).


$backfill = "INSERT INTO `{$ftsTable}` (rowid, {$columnList}) SELECT `_id`, {$columnList} FROM `{$parentTable}`";
$this->getPDO()->prepare($backfill)->execute();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CRITICAL][SECURITY/CORRECTNESS] Under sharedTables=true, the AFTER INSERT/UPDATE/DELETE triggers fire on the shared parent table for every tenant's row. Tenant B's tokenized content gets ingested into tenant A's FTS5 vtable shadow tables; the same applies to the backfill at line 818. The outer _tenant=:_tenant filter clamps result rows but doesn't prevent on-disk co-location, dbstat reporting bloat, or cross-tenant MATCH ranking effects. Fixed locally: added WHEN NEW/OLD._tenant IS <literal> to each trigger and WHERE _tenant IS <literal> to the backfill, via a new getTenantSqlLiteral() helper that produces a SQL literal for embedding (CREATE TRIGGER doesn't accept bound parameters at fire time).

*/
protected function findFulltextTableForAttribute(string $collection, string $attribute): ?string
{
if (!\array_key_exists($collection, $this->ftsTableCache)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CRITICAL][CORRECTNESS] ftsTableCache is keyed by collection name only, but the FTS vtable name includes getTenantSegment(). Under Swoole pooling, the same adapter instance handling tenant A then tenant B returns A's attribute→vtable map for B's findFulltextTableForAttribute() lookup — SEARCH routes to the wrong vtable. Fixed locally: override setTenant / setNamespace / setSharedTables to invalidate the cache when scope changes.

Comment thread src/Database/Adapter/SQLite.php Outdated
$stmt = $this->getPDO()->prepare($sql);
$stmt->bindValue(':max', $size, PDO::PARAM_INT);
if ($this->sharedTables) {
$stmt->bindValue(':_tenant', $this->tenant, PDO::PARAM_INT);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING][CORRECTNESS] Adapter::$tenant is int|string|null, but PDO::PARAM_INT silently casts a non-numeric string to 0. Hosts using string tenant ids would scan an empty (or wrong) partition and falsely report "no oversized values". Fixed locally: pick the PDO type from \is_int($this->tenant).

Four data-correctness/security bugs surfaced by reviewing the parity
branch as a whole.

- getSQLPermissionsCondition: the SQLite override dropped both the
  Database::PERMISSIONS whitelist and the getTenantQuery filter. Under
  shared tables, two tenants holding the same _uid would see each
  other's perms — cross-tenant authorization bypass. The parent in
  SQL.php works for SQLite via the SQLite getSQLTable override, so the
  override is just removed.
- FTS5 triggers and backfill: under shared tables, AFTER INSERT/UPDATE/
  DELETE triggers fire on the parent table for every tenant, so tenant
  A's vtable physically stored tenant B's tokenized content. Add
  WHEN NEW/OLD._tenant IS <tenant> to each trigger and WHERE _tenant
  IS <tenant> to the initial backfill, baked in as a SQL literal via
  getTenantSqlLiteral().
- ftsTableCache: keyed by collection only, so under Swoole pooling the
  same adapter instance serving tenant A then tenant B returned A's
  attribute->vtable map for B. Override setTenant/setNamespace/
  setSharedTables to invalidate the cache when scope changes.
- deleteAttribute/renameIndex: \json_decode(<array>) on the indexes
  attribute returned null (Document already returns the decoded
  array), so the foreach silently iterated nothing — index updates on
  attribute delete/rename were no-ops.
@github-actions
Copy link
Copy Markdown
Contributor

Claude pushed fixes from: improvement

ea37b56...59c363c

Comment thread src/Database/Adapter/SQLite.php
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Reviewed the diff vs main across six dimensions (security, performance, correctness, readability, maintainability, testing). The PR has accumulated 40+ fix commits and is well-vetted; most reviewer-flagged items turned out to be false positives once traced through the parent class. Detail:

Security — no critical findings. Identifier interpolation in PRAGMA / sqlite_master queries is safe because (a) PRAGMA can't take bound parameters and (b) names originate from sqlite_master rows or are computed from filtered/hashed inputs. FTS5 trigger WHEN OLD/NEW._tenant IS {literal} correctly closes the cross-tenant leak.

Correctness — no real bugs found. The flagged "perms UNIQUE missing _tenant under shared tables" is a false positive: getSQLIndex() (SQLite.php:1724) prepends _tenant to every index attribute list when sharedTables is true, so the existing createIndex("{$id}_perms", '_index_1', UNIQUE, ['_document', '_type', '_permission']) call already produces the correct (_tenant, _document, _type, _permission) constraint. The ON CONFLICT column-order fix (commit 80c29060) verified consistent with the same convention.

Performance — flagged N+1 PRAGMA loops in getSchemaIndexes / buildFulltextAttributeMap / getFulltextSchemaIndexes are bounded by index count per collection (typically <10), and SQLite has no useful batch primitive — pragma_table_info() as a table-valued function would only re-introduce the same per-name dispatch. Not actionable without a measurable hot-path signal.

Readability — fixed two untyped arrow-function parameters (fn ($a) => ... (string) $a) in fulltext-table lookups so they match the string $a convention used at SQLite.php:745, 762, 845. Drops the redundant cast. Committed as refactor(sqlite): type closure params over cast in fulltext lookups.

Maintainability — flagged duplication of the LIKE-method match() block between SQLite and MariaDB is real but tiny (~6 lines), and extracting it would reach into MariaDB's internals which is out of scope here.

Testing — the test diff is just two lines enabling setEmulateMySQL(true). No regression test covers the cross-tenant FTS5 leak that was re-broken multiple times in this PR (commits 47e60b13, 301fbb64, 59c363c5). Adding one would ideally land in SharedTables/Base.php's fulltext suite asserting that tenant B's SEARCH does not return tenant A's tokenised content. Worth following up but outside the scope of this mechanical pass.

Result: 1 readability fix committed; no critical/warning findings remain.

Comment thread src/Database/Adapter/SQLite.php Outdated
: ($index['attributes'] ?? []);

$internal = \array_map(
fn ($a) => $this->getInternalKeyForAttribute((string) $a),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[READABILITY] Fixed in this pass. Untyped fn ($a) plus inline (string) $a cast diverged from the string $a convention used at lines 745, 762, and 845. Replaced with fn (string $a) => $this->getInternalKeyForAttribute($a).

? $index->getAttribute('attributes', [])
: ($index['attributes'] ?? []);

$internal = \array_map(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[READABILITY] Fixed in this pass. Same untyped-closure-plus-cast pattern as line 1046; aligned with the typed convention used elsewhere in the file.

$cache = new Cache(new RedisAdapter($redis));

$database = new Database(new SQLite($pdo), $cache);
$adapter = new SQLite($pdo);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[SUGGESTION] No regression test currently covers the cross-tenant FTS5 isolation that was re-broken multiple times in this PR (most recently 59c363c5). A test in the SharedTables fulltext path that creates an FTS index, indexes content under tenant A, switches to tenant B, and asserts SEARCH returns nothing would fail-fast on any future regression in the trigger WHEN _tenant IS {literal} guards. Not blocking, but worth a follow-up.

Match the (string $a) convention used elsewhere in the file (lines
745, 762) so the redundant `(string)` cast can drop. Behaviour is
unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Claude pushed fixes from: improvement

59c363c...8d6956a

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Multi-dimensional review of the FTS5 / SQLite parity diff (security, performance, correctness, readability, maintainability, testing). Most findings turned out to be false positives or pre-existing patterns inherited from MariaDB; two real issues were fixed in this push.

Fixed in this push

  • [CRITICAL] SQLite.php getLikeCondition array CONTAINS(string) \json_encode($value) lacked the ?: '' fallback that MariaDB.php:1652 already has. json_encode() returns false for NAN/INF, which would degrade the LIKE pattern to %% and silently match every row. Mirrored the MariaDB guard.
  • [WARNING] SQLite.php updateAttribute resize scan — the explicit closeCursor() ran after the fetch on the success path only. If execute() threw, the prepared statement leaked and the cursor stayed live. Wrapped in try { execute; fetch } finally { closeCursor }.

Verified (no change needed)

  • Cross-tenant FTS5 isolation — cache cleared on setTenant/setNamespace/setSharedTables; triggers and backfill scope by _tenant literal. The earlier 59c363c5 / 301fbb64 fixes look complete.
  • Shared-tables perms UNIQUE constraint — looked initially like a _tenant was missing on _index_1 / _index_2, but getSQLIndex (line 1721-1726) prepends _tenant to every index column list under sharedTables, so the on-disk shape matches MariaDB.
  • Permission interpolation in createDocument / updateDocument — pre-existing pattern inherited from MariaDB, not modified by this PR. Out of scope here, but worth a follow-up to parameterize across both adapters.
  • startTransaction / commitTransaction / rollbackTransaction cursor closeBEGIN IMMEDIATE / COMMIT / ROLLBACK / SAVEPOINT don't return result rows, so closeCursor() is a no-op functionally; the destructor frees the statement immediately.
  • N+1 PRAGMA in getSchemaIndexes / buildFulltextAttributeMap — these are cold-path schema introspection callers (per-collection cache); idiomatic SQLite. Not worth the brittleness of parsing sqlite_master.sql.

Not actioned (deferred / out of scope)

  • Test coverage gap — direct FTS5 trigger / sync / cross-tenant tests are missing in tests/e2e/Adapter/SQLiteTest.php. The base e2e suite exercises fulltext indexes against every adapter, but adapter-specific FTS5 isolation under shared tables would benefit from a dedicated regression test.
  • emulateMySQL public API surfacesetEmulateMySQL / getEmulateMySQL are publicly callable but only one caller (the test setup) uses them; consider @internal or a trait if this is a testing-only hook.

? '%' . $this->escapeWildcards((string) \json_encode($value)) . '%'
: '%' . $this->escapeWildcards($value) . '%',
default => $value,
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CRITICAL] Fixed in 0021871e. Added ?: '' guard to mirror MariaDB.php:1652 so json_encode() returning false (NAN/INF) doesn't degrade to an empty LIKE pattern.

$exceeds = $stmt->fetchColumn() !== false;
$stmt->closeCursor();

if ($exceeds) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING] Fixed in 0021871e. Wrapped execute+fetch in try { } finally { closeCursor } so the cursor closes if execute() throws.

`getLikeCondition` for array CONTAINS now mirrors MariaDB's `?: ''`
guard so NAN/INF (which return `false`) don't degrade into an empty
LIKE pattern that silently matches every row.

`updateAttribute` resize scan also leaked the prepared cursor when
`execute()` threw — the explicit `closeCursor()` came after the
fetch and never ran on the failure path. Wrapped in try/finally
so the cursor closes regardless.
@github-actions
Copy link
Copy Markdown
Contributor

Claude pushed fixes from: improvement

8d6956a...18a995b

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 30, 2026

CI healing — SQLite adapter JSON decode regression

What failed: Adapter Tests (SQLite) and (SharedTables/SQLite) — 306 failures, 10 errors. Every failing assertion pointed at CollectionTests.php:1532 (assertEquals(null, 'document_create')).

Root cause: Commit 59c363c removed \json_decode(...) from SQLite::deleteAttribute and SQLite::renameIndex under the assumption that $collection->getAttribute('indexes', []) was already a decoded array. That's only true when the metadata document is loaded via Database::getCollection() (which applies the json filter). These two methods load the metadata directly through the adapter's $this->getDocument(...), which does not apply filters — so indexes was still the raw JSON string from the column. foreach over a string threw TypeError, surfacing as the visible failure in testPurgeCollectionCache (SQLite.php:613).

The cascade to 306 failures came from testEvents (CollectionTests.php:1480): the same TypeError fired inside its $database->deleteAttribute(...) call at line 1600, which is before the listener-cleanup at line 1605 ($database->on(EVENT_ALL, 'test', null)). The 'test' listener leaked into every subsequent test on the shared $database, and any test that triggered an event then hit array_shift() on an empty array (line 1531) and asserted that the event name matches null.

Fix: Decode only when the value is still a string, so both load paths work:

$indexes = $collection->getAttribute('indexes', []);
if (\is_string($indexes)) {
    $indexes = \json_decode($indexes, true) ?? [];
}

Applied at SQLite.php:611 (deleteAttribute) and SQLite.php:664 (renameIndex).

Verification: Ran the full SQLite and SharedTables/SQLite suites in the docker stack — both green:

  • SQLiteTest: Tests: 627, Assertions: 7771, Skipped: 8
  • SharedTables/SQLiteTest: Tests: 627, Assertions: 7779, Skipped: 8

Pint passes on the modified file.

Memory adapter's lone failure (testIncreaseDecrease at DocumentTests.php:1450, two timestamps colliding at the same millisecond) is an unrelated pre-existing flake — left untouched.

…/renameIndex

deleteAttribute and renameIndex call $this->getDocument(metadata, ...)
on the adapter directly, which doesn't apply the metadata 'json' filter
— so $collection->getAttribute('indexes', []) returns the raw JSON
string from the column. Commit 59c363c removed the json_decode under
the assumption the value was always pre-decoded; that's only true when
loaded via Database::getCollection().

The TypeError thrown from foreach() over a string surfaced as
testPurgeCollectionCache failing, and worse, testEvents threw before
its $database->on(..., null) cleanup ran — leaking the EVENT_ALL
listener into every later test that triggered an event, which is why
306 tests on SQLite started asserting '' matches null at
CollectionTests.php:1532.

Decode only when the value is still a string so both call paths work.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Claude pushed fixes from: healing

18a995b...464a47d

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Ran a six-dimension parallel review against main. Summary by dimension:

  • Security — REGEXP UDF had two real DoS vectors: an unbounded delimited-pattern cache and no length cap / ReDoS protection. Identifier interpolation, FTS5 MATCH binding, perms scopes, PRAGMA inputs are all clean. (2 warnings → fixed)
  • Performance — Schema introspection paths run N+1 PRAGMAs, FTS5 sync triggers re-tokenise on every metadata write, the FTS5 backfill is unbounded, and BEGIN IMMEDIATE serialises read-only transactions too. These are real but require larger structural changes; not blocking. (0 fixed this cycle)
  • CorrectnessresolveFulltextTableById swallowed every \Throwable, masking PDO errors and silently falling through to the single-candidate FTS5 drop (could tear down the wrong vtable). FTS-cache invalidation in setTenant / setNamespace / setSharedTables happened before the parent setter, so a thrown validation would leave a cleared cache against the prior tenant. getSQLIndexType returned a placeholder 'INDEX' for INDEX_FULLTEXT instead of throwing. (3 warnings → fixed)
  • Readability — A handful of magic strings and naming nits; nothing blocking.
  • Maintainability — ~300 lines duplicated with MariaDB (relationships, LIKE compilation, metadata walking, transaction-with-rollback boilerplate). Worth a follow-up refactor PR; out of scope here.
  • Testing — No SQLite-specific assertions; both fixtures hardcode setEmulateMySQL(true), leaving the default false branch entirely uncovered. Cross-tenant FTS isolation, BEGIN IMMEDIATE serialisation, and several recent-commit fixes have no regression coverage. Recommend a follow-up to add adapter-specific tests.

Fixes applied this cycle

  1. REGEXP UDF: cap user pattern length at 512 chars, FIFO-evict the delimited-pattern cache at 256 entries, switch the delimiter to chr(1) so user-supplied / and \ aren't re-escaped.
  2. resolveFulltextTableById: narrowed catch (\Throwable) to catch (NotFoundException).
  3. setTenant / setNamespace / setSharedTables: invalidate FTS cache after the parent setter.
  4. getSQLIndexType: throw DatabaseException for INDEX_FULLTEXT instead of silently returning a placeholder.

Not fixed this cycle (deferred)

  • N+1 PRAGMA in getSchemaIndexes / buildFulltextAttributeMap (perf — needs pragma_* table-valued-function rewrite).
  • FTS5 trigger re-tokenisation on unchanged content (perf — needs WHEN OLD.col IS NOT NEW.col per-column gating).
  • BEGIN IMMEDIATE blocking read-only transactions (perf/lock-contention — needs read/write split).
  • getSQLConditionsForCollection thin-wrapper redundancy (maintainability).
  • Adapter-specific test coverage gaps (testing — needs a separate test PR).

Comment thread src/Database/Adapter/SQLite.php Outdated
if ($pattern === null || $value === null) {
return 0;
}
$delimited = $delimitedCache[$pattern] ??= '/' . \str_replace('/', '\/', $pattern) . '/u';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING] Unbounded $delimitedCache keyed on raw user pattern — long-lived adapters (Swoole / pool case) processing many distinct REGEXP patterns grow this map without limit, and the key is the raw pattern itself (potentially MB-long) so an attacker gets cheap memory exhaustion. Also no pattern-length cap means a worker can be pinned by ^(a+)+$-style ReDoS amplified by row count, and '/' . str_replace('/', '\\/', $pattern) . '/u' re-escapes wrong on patterns already containing \/. Fix: cap pattern length at 512, FIFO-evict the cache at 256 entries, use chr(1) as delimiter.

* when metadata doesn't reach a candidate.
*
* @param array<string> $candidates
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING] catch (\Throwable) masks every getDocument failure as "metadata not found" and falls through to the single-candidate FTS5 drop (dropFulltextIndexById line 957), which can tear down the wrong vtable on a transient PDO error. Narrow to catch (NotFoundException) so genuine failures surface.

public function setTenant(int|string|null $tenant): bool
{
if ($this->tenant !== $tenant) {
$this->ftsTableCache = [];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING] Cache is cleared before parent::setTenant(). If the parent ever throws (or a future override does), we're left with a cleared cache against the prior tenant — equivalent to silently mis-attributing FTS5 lookups across tenants. Same pattern in setNamespace (line 118) and setSharedTables (line 126). Fix: invalidate after the parent call returns.

@@ -1063,6 +1673,13 @@ protected function getSQLIndexType(string $type): string
case Database::INDEX_UNIQUE:
return 'UNIQUE INDEX';

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING] Returning 'INDEX' for INDEX_FULLTEXT papers over a programmer error — fulltext goes through createFulltextIndex and reaching this case would emit invalid SQL into a CREATE INDEX template downstream. The comment even says "this branch should never reach SQL generation". Replace with a thrown DatabaseException so a misuse fails loudly.

]);
}

// PRAGMA index_list misses FTS5 vtables.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING — deferred] getSchemaIndexes() runs PRAGMA index_info per index plus findFulltextTables() plus PRAGMA table_info per FTS table — 1 + N + 1 + M round trips. Replace with a single query joining pragma_index_list(?) il to pragma_index_info(il.name) ii (and similarly fold the FTS table-info loop). Not fixed this pass; worth a follow-up.

CREATE TRIGGER `{$ftsTable}_{$deleteSuffix}` AFTER DELETE ON `{$parentTable}`{$deleteWhen} BEGIN
INSERT INTO `{$ftsTable}` (`{$ftsTable}`, rowid, {$columnList}) VALUES ('delete', OLD.`_id`, {$oldColumnList});
END
";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING — deferred] AFTER UPDATE OF {columns} fires on assignment, not value-change, so Database::updateDocument (which rewrites every column) re-tokenises FTS5 content on every metadata-only write. Add WHEN OLD.col1 IS NOT NEW.col1 OR ... per indexed column to skip the delete/reinsert pair when content didn't change. Significant under high write workloads.


$result = $this->getPDO()->beginTransaction();
$result = $this->getPDO()
->prepare('BEGIN IMMEDIATE')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[WARNING — deferred] BEGIN IMMEDIATE reserves the writer lock unconditionally, so read-only transactions now block behind any other writer — defeats WAL concurrency. Also busy_timeout isn't set on the connection, so IMMEDIATE contention fails immediately rather than waiting. Consider read/write split (DEFERRED for read-only TX, IMMEDIATE only when caller will write) and a PRAGMA busy_timeout baseline.

- REGEXP UDF: cap pattern length at 512 chars and FIFO-evict the
  delimited-pattern cache at 256 entries. Switch the delimiter to
  chr(1) so user-supplied forward slashes / backslashes aren't
  re-escaped. Long-lived adapters can no longer be grown without
  bound by distinct user patterns.
- resolveFulltextTableById: narrow the blanket Throwable catch to
  NotFoundException so genuine PDO errors don't fall through to
  the single-candidate FTS5 drop and tear down the wrong table.
- setTenant / setNamespace / setSharedTables: invalidate the FTS
  table cache after the parent setter so a thrown validation
  doesn't leave us with a cleared cache against the prior value.
- getSQLIndexType: throw on Database::INDEX_FULLTEXT instead of
  silently returning 'INDEX' — the FTS5 path goes through
  createFulltextIndex; reaching that case is a programmer error
  that should surface, not be papered over.
@github-actions
Copy link
Copy Markdown
Contributor

Claude pushed fixes from: improvement

464a47d...93a0575

Comment thread src/Database/Adapter/MariaDB.php Outdated
The LIKE `%json_encode%` pattern produces false positives on array
columns: searching `2` in a TINYINT array matches `[12, 200]`,
searching `"apple"` matches `["pineapple"]`, etc. Route onArray()
CONTAINS / CONTAINS_ANY / NOT_CONTAINS through exact-element
operators instead:

- MariaDB: JSON_OVERLAPS when supported, JSON_CONTAINS per element
  OR'd together as fallback. The LIKE branch no longer special-cases
  arrays.
- SQLite: EXISTS (SELECT 1 FROM json_each(col) WHERE value IN (...))
  via a new buildArrayContainsCondition routed from getSQLCondition
  before the LIKE handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@abnegate abnegate merged commit a1bb3e2 into main Apr 30, 2026
22 of 23 checks passed
@abnegate abnegate deleted the feat-sqlite-fts branch April 30, 2026 09:59
abnegate added a commit that referenced this pull request Apr 30, 2026
Bring in SQLite FTS5 + PCRE + upsert + PRAGMA introspection from PR #870,
translated to the typed Attribute/Index/Relationship API and the
Capability enum. SQLite gains:
- FTS5-backed fulltext indexes (createFulltextIndex / dropFulltextIndexById /
  resolveFulltextTableById, FTS5-aware deleteCollection / deleteIndex /
  getSizeOfCollection, attribute-aware FTS routing)
- PHP REGEXP UDF with a bounded LRU pattern cache, gating Capability::PCRE
  dynamically once registered
- BEGIN IMMEDIATE writer serialisation
- PRAGMA-based getSchemaAttributes / getSchemaIndexes
- Per-statement createRelationship / updateRelationship / deleteRelationship
  (SQLite PDO can't run multi-statement strings)
- setEmulateMySQL flag gating MariaDB-shape emulation across createCollection
  (_tenant column type), getSchemaAttributes (column-info shape), and
  updateAttribute (resize-down with TruncateException guard)
- SQL.php gains getSQLConditionsForCollection threading $forCollection so the
  FTS5 routing surfaces consistently across MariaDB / Postgres / SQLite

phpstan-baseline shrinks 18 entries — the typed getPDO override and
narrowed mixed flow on the new paths resolve the prior ignores.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants