Summary
On a large archive (~34 GB, 422k messages), msgvault verify surfaced persistent SQLite corruption:
Tree 26 page 8231140 cell 2: Rowid 421177 out of order
Tree 58 page 8231146 cell 2: Rowid 421177 out of order
malformed inverted index for FTS5 table main.messages_fts
MIME sample verified 100% OK — only the FTS5 index and one message_bodies B-tree page were affected.
Environment
- msgvault: v0.14.0 (corruption also present in v0.12.1)
- OS: macOS 15.1 (Darwin 25.3.0), Apple Silicon
- Archive: 422,009 messages, 34 GB, APFS, SQLite bundled
- Archive grew over ~6 months of continuous sync; no specific incident pinpointed
What didn't work
INSERT INTO messages_fts(messages_fts) VALUES('rebuild') — completes in ~2 min but leaves the same B-tree corruption. Presumably because it regenerates from within the corrupt shadow tables rather than from the source messages/message_bodies/message_recipients/participants tables.
INSERT INTO messages_fts(messages_fts) VALUES('delete-all') — errors: 'delete-all' may only be used with a contentless or external content fts5 table. messages_fts is contentful (no content= clause).
.recover — not feasible with <34 GB free.
What worked
Full manual DROP + recreate + batched backfill using the SQL already present in internal/store/dialect_sqlite.go (FTSBackfillBatchSQL):
- Stop daemon + MCP
DROP TABLE messages_fts;
CREATE VIRTUAL TABLE messages_fts USING fts5(message_id UNINDEXED, subject, body, from_addr, to_addr, cc_addr, tokenize='unicode61 remove_diacritics 1');
- Loop
FTSBackfillBatchSQL over 5000-ID batches (mirrors Store.BackfillFTS)
PRAGMA integrity_check — FTS5 error cleared.
Total time: ~5 min for 422k rows. Peak extra disk: ~3 GB (FTS5 shadow tables for this archive are ~2.5 GB).
The remaining Tree 26 ... Rowid out of order in message_bodies turned out to be a phantom-duplicate row on a corrupt page — resolved via DELETE FROM message_bodies WHERE message_id = <rowid>; INSERT ...; (hit a UNIQUE-constraint error on re-insert, which revealed a shadow duplicate; one copy deleted, page ordering healed).
Final PRAGMA integrity_check: ok. msgvault verify now exits clean.
Suggestions
- Expose a maintenance command such as
msgvault rebuild-fts that runs the same DROP + CREATE + batched backfill — currently users must either know SQLite internals or do a full resync (or .recover, which requires 2× disk).
- Document recovery in
docs/ — the contentful-FTS5 corruption path is non-obvious (the 'rebuild' and 'delete-all' pragmas behave differently than on contentless tables).
- Optional:
serve --skip-db-check flag (mirroring verify --skip-db-check) so a corrupted-but-functional archive can still run the daemon while the user schedules a rebuild.
Happy to PR (1) if you agree with the approach — essentially Store.RebuildFTS() = drop + create-from-schema + call existing Store.BackfillFTS.
Related
Summary
On a large archive (~34 GB, 422k messages),
msgvault verifysurfaced persistent SQLite corruption:MIME sample verified 100% OK — only the FTS5 index and one
message_bodiesB-tree page were affected.Environment
What didn't work
INSERT INTO messages_fts(messages_fts) VALUES('rebuild')— completes in ~2 min but leaves the same B-tree corruption. Presumably because it regenerates from within the corrupt shadow tables rather than from the sourcemessages/message_bodies/message_recipients/participantstables.INSERT INTO messages_fts(messages_fts) VALUES('delete-all')— errors:'delete-all' may only be used with a contentless or external content fts5 table.messages_ftsis contentful (nocontent=clause)..recover— not feasible with <34 GB free.What worked
Full manual DROP + recreate + batched backfill using the SQL already present in
internal/store/dialect_sqlite.go(FTSBackfillBatchSQL):DROP TABLE messages_fts;CREATE VIRTUAL TABLE messages_fts USING fts5(message_id UNINDEXED, subject, body, from_addr, to_addr, cc_addr, tokenize='unicode61 remove_diacritics 1');FTSBackfillBatchSQLover 5000-ID batches (mirrorsStore.BackfillFTS)PRAGMA integrity_check— FTS5 error cleared.Total time: ~5 min for 422k rows. Peak extra disk: ~3 GB (FTS5 shadow tables for this archive are ~2.5 GB).
The remaining
Tree 26 ... Rowid out of orderinmessage_bodiesturned out to be a phantom-duplicate row on a corrupt page — resolved viaDELETE FROM message_bodies WHERE message_id = <rowid>; INSERT ...;(hit a UNIQUE-constraint error on re-insert, which revealed a shadow duplicate; one copy deleted, page ordering healed).Final
PRAGMA integrity_check: ok.msgvault verifynow exits clean.Suggestions
msgvault rebuild-ftsthat runs the same DROP + CREATE + batched backfill — currently users must either know SQLite internals or do a full resync (or.recover, which requires 2× disk).docs/— the contentful-FTS5 corruption path is non-obvious (the 'rebuild' and 'delete-all' pragmas behave differently than on contentless tables).serve --skip-db-checkflag (mirroringverify --skip-db-check) so a corrupted-but-functional archive can still run the daemon while the user schedules a rebuild.Happy to PR (1) if you agree with the approach — essentially
Store.RebuildFTS()= drop + create-from-schema + call existingStore.BackfillFTS.Related
failed to fetch message (nil response)messages #260 (failed to fetch message (nil response)) — hit 11 of these during today's sync but unrelated to the FTS corruption.