feat(db): add missing sqlite indices to help with performance #9141

Merged: eliandoran merged 2 commits into main from feat/add-missing-sqlite-indices on Mar 30, 2026

@perfectra1n (Member) commented Mar 22, 2026

Adds 7 new indices and replaces 1 existing index to improve query performance on frequently-accessed code paths. It also fixes a bug where IDX_attachments_utcDateScheduledForErasureSince was missing from schema.sql for fresh installs.

| Index | Table | Columns | New / Replaced |
|---|---|---|---|
| IDX_entity_changes_isSynced_id | entity_changes | (isSynced, id) | New |
| IDX_entity_changes_isErased_entityName | entity_changes | (isErased, entityName) | New |
| IDX_notes_isDeleted_utcDateModified | notes | (isDeleted, utcDateModified) | New |
| IDX_branches_isDeleted_utcDateModified | branches | (isDeleted, utcDateModified) | New |
| IDX_attributes_isDeleted_utcDateModified | attributes | (isDeleted, utcDateModified) | New |
| IDX_attachments_isDeleted_utcDateModified | attachments | (isDeleted, utcDateModified) | New |
| IDX_attachments_utcDateScheduledForErasureSince | attachments | (utcDateScheduledForErasureSince) | Bug fix (schema.sql only) |
| IDX_branches_parentNoteId_isDeleted_notePosition | branches | (parentNoteId, isDeleted, notePosition) | Replaces IDX_branches_parentNoteId |

1. IDX_entity_changes_isSynced_id: entity_changes (isSynced, id)

Impact: High. Sync is a core, frequently executed operation.

The sync push loop is the hottest query path during synchronization. It runs repeatedly in a loop until all changes are pushed, processing 1000 rows at a time. The entity_changes table grows unbounded (every edit appends a row), so full table scans become increasingly expensive over time.

Queries served (5 across 4 files):

| File | Line | Query Pattern |
|---|---|---|
| services/sync.ts | 207 | `SELECT * FROM entity_changes WHERE isSynced = 1 AND id > ? LIMIT 1000` |
| services/sync.ts | 275 | `SELECT EXISTS(SELECT 1 FROM entity_changes WHERE isSynced = 1 AND id > ?)` |
| routes/index.ts | 60 | `SELECT COALESCE(MAX(id), 0) FROM entity_changes WHERE isSynced = 1` |
| routes/api/sync.ts | 60 | `SELECT COALESCE(MAX(id), 0) FROM entity_changes WHERE isSynced = 1` |
| routes/api/login.ts | 119 | `SELECT COALESCE(MAX(id), 0) FROM entity_changes WHERE isSynced = 1` |

Why this column order: The composite (isSynced, id) allows SQLite to jump directly to isSynced = 1 in the B-tree, then range-scan forward from id > ?. This turns a full table scan plus filter into a single contiguous index range scan. The MAX(id) queries benefit similarly: SQLite can read the last entry in the isSynced = 1 segment.
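
The column-order reasoning can be checked with EXPLAIN QUERY PLAN. The following is a minimal sketch, not code from this PR: it uses Python's built-in sqlite3 module and a cut-down, hypothetical entity_changes table that keeps only the columns relevant here. The helper name query_plan is also an assumption made for illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a list of strings."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table: the real entity_changes table has more columns.
conn.execute("""
    CREATE TABLE entity_changes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        entityName TEXT NOT NULL,
        entityId TEXT NOT NULL,
        isSynced INTEGER NOT NULL
    )
""")
conn.execute(
    "CREATE INDEX IDX_entity_changes_isSynced_id ON entity_changes (isSynced, id)"
)

# Sync push loop: equality on isSynced, then a range on id.
push_plan = query_plan(
    conn,
    "SELECT * FROM entity_changes WHERE isSynced = 1 AND id > ? LIMIT 1000",
    (0,),
)

# MAX(id) lookup restricted to the isSynced = 1 segment.
max_plan = query_plan(
    conn,
    "SELECT COALESCE(MAX(id), 0) FROM entity_changes WHERE isSynced = 1",
)

for line in push_plan + max_plan:
    print(line)
```

The exact plan text varies between SQLite versions, and on a real database the planner's choice also depends on statistics gathered by ANALYZE, so the output is worth comparing against an actual Trilium database.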


2. IDX_entity_changes_isErased_entityName: entity_changes (isErased, entityName)

Impact: Medium. Used in consistency checks and entity change maintenance.

The existing unique index (entityName, entityId) does not help when isErased is the leading filter condition. These queries run during consistency checks (periodic integrity validation) and during fillAllEntityChanges (a repair/rebuild operation).

Queries served (4 across 2 files):

| File | Line | Query Pattern |
|---|---|---|
| services/entity_changes.ts | 115–121 | `DELETE FROM entity_changes WHERE isErased = 0 AND entityName = '...' AND entityId NOT IN (...)` |
| services/entity_changes.ts | 181 | `DELETE FROM entity_changes WHERE isErased = 0` |
| services/consistency_checks.ts | 757 | `SELECT ... FROM entity_changes WHERE entity_changes.isErased = 0 AND entity_changes.entityName = '...'` |
| services/consistency_checks.ts | 777 | `SELECT ... FROM entity_changes WHERE entity_changes.isErased = 1 AND entity_changes.entityName = '...'` |

3–6. IDX_{table}_isDeleted_utcDateModified: (isDeleted, utcDateModified) on notes, branches, attributes, attachments

Impact: Medium-High. This is a periodic cleanup that scans all four core tables.

The eraseDeletedEntities() function in services/erase.ts runs every 4 hours and once at startup (after a 5-minute delay). It finds soft-deleted entities older than a configurable threshold and permanently erases them. Without indices, each of these four queries performs a full table scan.

Since isDeleted = 1 typically represents a tiny fraction of rows, the index is highly selective: SQLite jumps to the deleted rows in the B-tree, then range-scans utcDateModified <= ? within that subset.

Queries served (4 in services/erase.ts lines 126–135):

SELECT noteId FROM notes WHERE isDeleted = 1 AND utcDateModified <= ?
SELECT branchId FROM branches WHERE isDeleted = 1 AND utcDateModified <= ?
SELECT attributeId FROM attributes WHERE isDeleted = 1 AND utcDateModified <= ?
SELECT attachmentId FROM attachments WHERE isDeleted = 1 AND utcDateModified <= ?

Note: These indices also partially serve the eraseNotesWithDeleteId() queries (WHERE isDeleted = 1 AND deleteId = ? in services/erase.ts lines 143–152). The isDeleted = 1 prefix narrows to the small deleted set, making the subsequent deleteId filter cheap even without a dedicated (isDeleted, deleteId) composite.
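
The effect on the erase queries, including the partial benefit for the deleteId lookup, can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: the notes table is a hypothetical, reduced version with only the columns the two queries touch, and the query_plan helper and the sample cutoff and deleteId values are made up for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a list of strings."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified notes table (the real one has more columns and indices).
conn.execute("""
    CREATE TABLE notes (
        noteId TEXT NOT NULL PRIMARY KEY,
        isDeleted INTEGER NOT NULL DEFAULT 0,
        deleteId TEXT,
        utcDateModified TEXT NOT NULL
    )
""")

erase_sql = "SELECT noteId FROM notes WHERE isDeleted = 1 AND utcDateModified <= ?"
delete_id_sql = "SELECT noteId FROM notes WHERE isDeleted = 1 AND deleteId = ?"
cutoff = ("2026-01-01 00:00:00.000Z",)  # illustrative cutoff value

# Without the composite index, the planner has to scan the whole table.
plan_before = query_plan(conn, erase_sql, cutoff)

conn.execute(
    "CREATE INDEX IDX_notes_isDeleted_utcDateModified "
    "ON notes (isDeleted, utcDateModified)"
)

# With the index: equality on isDeleted plus a range on utcDateModified.
plan_after = query_plan(conn, erase_sql, cutoff)

# The deleteId query can still use the isDeleted prefix of the same index.
plan_delete_id = query_plan(conn, delete_id_sql, ("abc123",))

for label, plan in [
    ("before", plan_before),
    ("after", plan_after),
    ("deleteId", plan_delete_id),
]:
    print(label, plan)
```

The same pattern applies to branches, attributes, and attachments by swapping the table and key column names.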


7. IDX_attachments_utcDateScheduledForErasureSince: Bug Fix (schema.sql only)

Impact: Correctness fix. This index was missing from fresh database installations.

Migration 219 (which introduced the attachments table) created this index for databases that upgrade. However, the index was accidentally omitted from schema.sql, meaning any database created fresh (new installations) has been missing it. This migration does not re-create the index (migration 219 already handles existing databases); the fix is adding it to schema.sql so new installs get it.

Query served (services/erase.ts line 173):

SELECT attachmentId FROM attachments WHERE utcDateScheduledForErasureSince < ?

This runs hourly to clean up attachments that are no longer referenced in note content.
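
A drift like this between schema.sql and the migrations can be detected by comparing the expected index names against sqlite_master. The sketch below is only an illustration: missing_indices is a hypothetical helper, and the attachments table is reduced to the two columns needed to show the idea.

```python
import sqlite3


def missing_indices(conn, expected):
    """Return the expected index names that are absent from the database."""
    present = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'index'"
        )
    }
    return sorted(set(expected) - present)


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified attachments table, standing in for a fresh install
# whose schema was created without the index.
conn.execute("""
    CREATE TABLE attachments (
        attachmentId TEXT NOT NULL PRIMARY KEY,
        utcDateScheduledForErasureSince TEXT
    )
""")

expected = ["IDX_attachments_utcDateScheduledForErasureSince"]
before = missing_indices(conn, expected)

# IF NOT EXISTS keeps the statement safe on databases that already have the index.
conn.execute("""
    CREATE INDEX IF NOT EXISTS IDX_attachments_utcDateScheduledForErasureSince
    ON attachments (utcDateScheduledForErasureSince)
""")
after = missing_indices(conn, expected)

print("missing before:", before)
print("missing after:", after)
```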


8. IDX_branches_parentNoteId_isDeleted_notePosition: Replaces IDX_branches_parentNoteId

Impact: High. This is the most frequently queried pattern on the branches table (11 queries across 7 files).

The old single-column IDX_branches_parentNoteId index finds rows by parent but then must scan all matching rows to filter isDeleted and sort by notePosition. The new composite (parentNoteId, isDeleted, notePosition) makes all of these operations a single contiguous B-tree range scan with no extra filtering or sorting required.

The old index is dropped in this migration since the new composite index fully subsumes it (any query that used parentNoteId alone will use the leading column of the new composite).

Queries served (11 across 7 files):

| File | Line | Query Pattern |
|---|---|---|
| services/branches.ts | 16 | `SELECT MAX(notePosition) FROM branches WHERE parentNoteId = ? AND isDeleted = 0` |
| services/entity_changes.ts | 63 | `SELECT branchId, notePosition FROM branches WHERE isDeleted = 0 AND parentNoteId = ?` |
| services/sync.ts | 342 | `SELECT branchId, notePosition FROM branches WHERE parentNoteId = ? AND isDeleted = 0` |
| services/sql_init.ts | 144 | `SELECT noteId FROM branches WHERE parentNoteId = 'root' AND isDeleted = 0 ORDER BY notePosition` |
| services/notes.ts | 296 | `UPDATE branches SET notePosition = notePosition + 10 WHERE parentNoteId = ? AND notePosition > ? AND isDeleted = 0` |
| services/notes.ts | 309 | `UPDATE branches SET notePosition = notePosition - 10 WHERE parentNoteId = ? AND notePosition < ? AND isDeleted = 0` |
| services/cloning.ts | 100 | `SELECT branchId FROM branches WHERE noteId = ? AND parentNoteId = ? AND isDeleted = 0` |
| services/cloning.ts | 167 | `UPDATE branches SET notePosition = notePosition + 10 WHERE parentNoteId = ? AND notePosition > ? AND isDeleted = 0` |
| routes/api/branches.ts | 50 | `UPDATE branches SET notePosition = notePosition + 10 WHERE parentNoteId = ? AND notePosition >= ? AND isDeleted = 0` |
| routes/api/branches.ts | 97 | `UPDATE branches SET notePosition = notePosition + 10 WHERE parentNoteId = ? AND notePosition > ? AND isDeleted = 0` |
| share/shaca/shaca_loader.ts | 57 | `SELECT ... FROM branches WHERE isDeleted = 0 AND parentNoteId IN (...) ORDER BY notePosition` |
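
The difference between the old single-column index and the new composite can be seen in the query plan for the ORDER BY notePosition query from the table above. This is a rough sketch under assumptions, not the migration itself: the branches table is a hypothetical, reduced version, query_plan is an illustrative helper, and the DROP/CREATE statements only approximate what a migration step might run.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a list of strings."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified branches table (the real one has more columns and indices).
conn.execute("""
    CREATE TABLE branches (
        branchId TEXT NOT NULL PRIMARY KEY,
        noteId TEXT NOT NULL,
        parentNoteId TEXT NOT NULL,
        notePosition INTEGER NOT NULL,
        isDeleted INTEGER NOT NULL DEFAULT 0
    )
""")

children_sql = (
    "SELECT noteId FROM branches "
    "WHERE parentNoteId = 'root' AND isDeleted = 0 ORDER BY notePosition"
)

# Old state: only the single-column index exists, so sorting needs an extra step.
conn.execute("CREATE INDEX IDX_branches_parentNoteId ON branches (parentNoteId)")
old_plan = query_plan(conn, children_sql)

# New state: drop the old index and create the composite that subsumes it.
conn.execute("DROP INDEX IF EXISTS IDX_branches_parentNoteId")
conn.execute(
    "CREATE INDEX IF NOT EXISTS IDX_branches_parentNoteId_isDeleted_notePosition "
    "ON branches (parentNoteId, isDeleted, notePosition)"
)
new_plan = query_plan(conn, children_sql)

print("old:", old_plan)
print("new:", new_plan)
```

With the old index, SQLite reports a temporary B-tree for the ORDER BY; with the composite, rows already come out of the index in notePosition order, so that sort step disappears from the plan.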

This was also done with the help of olav: https://github.com/orgs/TriliumNext/discussions/9138

@dosubot (bot) added the size:S label ("This PR changes 10-29 lines, ignoring generated files.") on Mar 22, 2026

@gemini-code-assist (bot, Contributor) commented

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing database performance by introducing a set of new SQLite indices. These additions aim to optimize common queries across several key tables, ensuring smoother and more efficient data retrieval. A new database migration has been added to seamlessly integrate these schema changes into existing installations.

Highlights

  • Database Performance: Added several new SQLite indices to the entity_changes, notes, branches, attributes, and attachments tables to improve query performance.
  • Database Migration: Implemented a new database migration (version 235) to ensure these new indices are applied correctly during database updates.
  • Version Update: Incremented the APP_DB_VERSION to reflect the new database schema changes.

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request introduces several new database indices to improve query performance. The changes are well-structured, including updates to the main schema file, a new database migration for existing installations, and an update to the database version constant. I've found one minor issue regarding a redundant index creation in the new migration, which I've detailed in a specific comment.

Comment thread apps/server/src/migrations/migrations.ts
@dosubot (bot) added the size:M label ("This PR changes 30-99 lines, ignoring generated files.") and removed the size:S label ("This PR changes 10-29 lines, ignoring generated files.") on Mar 22, 2026

@eliandoran (Contributor) left a comment

@perfectra1n, what's the size impact of these indices?

@perfectra1n (Member, Author) commented

@eliandoran Measured on a 787 MB production database after applying migration 235:

| Metric | Value |
|---|---|
| Database size | 787 MB |
| Page size | 4,096 bytes |
| Pages before indices | 201,225 |
| Pages after indices | 202,126 |
| Pages added | 901 |
| Total index overhead | ~3.7 MB (0.46% of DB) |

Index Sizes

| Index | Size | Pages |
|---|---|---|
| IDX_entity_changes_isErased_entityName | 1.4 MB | 363 |
| IDX_entity_changes_isSynced_id | 1.0 MB | 257 |
| IDX_attachments_isDeleted_utcDateModified | 533 KB | 133 |
| IDX_branches_isDeleted_utcDateModified | 168 KB | 42 |
| IDX_notes_isDeleted_utcDateModified | 160 KB | 40 |
| IDX_attributes_isDeleted_utcDateModified | 136 KB | 34 |
| IDX_attachments_utcDateScheduledForErasureSince | 132 KB | 33 |
| IDX_branches_parentNoteId_isDeleted_notePosition | 120 KB | 30 |

Table Row Counts

| Table | Total Rows |
|---|---|
| entity_changes | 80,857 |
| attachments | 16,088 |
| branches | 5,000 |
| notes | 4,833 |
| attributes | 4,017 |
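
For anyone wanting an additional data point, a before/after page count like the one in the metrics table can be taken with PRAGMA page_count and PRAGMA page_size. The sketch below is an assumption-laden illustration, not the procedure used for the numbers above: it builds a small synthetic notes table in memory (hypothetical schema and generated rows) and measures how many pages one index adds. The db_pages helper is made up for the example.

```python
import sqlite3


def db_pages(conn):
    """Return (page_count, page_size) for the connected database."""
    (page_count,) = conn.execute("PRAGMA page_count").fetchone()
    (page_size,) = conn.execute("PRAGMA page_size").fetchone()
    return page_count, page_size


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified notes table filled with synthetic rows.
conn.execute("""
    CREATE TABLE notes (
        noteId TEXT NOT NULL PRIMARY KEY,
        isDeleted INTEGER NOT NULL,
        utcDateModified TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?, ?)",
    (
        (f"note{i:06d}", int(i % 50 == 0), f"2026-01-{(i % 28) + 1:02d} 00:00:00.000Z")
        for i in range(20000)
    ),
)
conn.commit()

pages_before, page_size = db_pages(conn)

conn.execute(
    "CREATE INDEX IDX_notes_isDeleted_utcDateModified "
    "ON notes (isDeleted, utcDateModified)"
)
conn.commit()

pages_after, _ = db_pages(conn)
pages_added = pages_after - pages_before
overhead_bytes = pages_added * page_size

print(f"pages added: {pages_added}, overhead: {overhead_bytes} bytes")
```

Applied to the figures reported above, 901 added pages at 4,096 bytes each come to 3,690,496 bytes, which is consistent with the ~3.7 MB overhead.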

@perfectra1n (Member, Author) commented

That was using my own personal Trilium DB. If there's another example or something you'd like as an additional data point, just let me know.

@eliandoran added this to the v0.103.0 milestone on Mar 30, 2026
@dosubot (bot) added the lgtm label ("This PR has been approved by a maintainer") on Mar 30, 2026
@eliandoran merged commit 952d6b9 into main on Mar 30, 2026
10 of 11 checks passed
@eliandoran deleted the feat/add-missing-sqlite-indices branch on March 30, 2026 18:58