Reconciles the fork (172 commits ahead, 37 behind) with upstream wesm/msgvault.
Strategy: accept upstream wholesale for connector code (M365, iMessage,
gvoice, IMAP XOAUTH2) where upstream's implementations are more
battle-tested and already cover the fork's bug fixes. Hand-resolve
store/sync/search/cmd/build files to union both feature sets.
Preserved from fork:
- SQLCipher encryption-at-rest (passphrase, AES-GCM token encryption)
- Advisory file locking (tryLock, lockFile, syscall.Flock)
- AI Archive Intelligence subsystem (internal/embedding/, vec_messages,
pipeline_runs, --semantic search)
- Web UI (React/TypeScript SPA in web/)
- Hot-path search tokenizer (dispatchToken, toLowerFast, parseSizeFast)
- migrateAddContentID, InitVectorTable, content_id attachment column
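For readers unfamiliar with the advisory locking preserved above, here is a minimal Go sketch of a non-blocking exclusive lock built on syscall.Flock. The name tryLock comes from the list, but its signature, the errLocked sentinel, and the lock-file path are assumptions made for illustration and are not the fork's actual code. Because it relies on syscall.Flock, the sketch is Unix-only.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// errLocked is a hypothetical sentinel meaning another holder has the lock.
var errLocked = errors.New("vault is locked by another holder")

// tryLock opens (or creates) path and takes a non-blocking exclusive
// advisory lock on it. The caller keeps the returned file open for as
// long as it needs the lock; closing the file releases the lock.
func tryLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		if errors.Is(err, syscall.EWOULDBLOCK) {
			return nil, errLocked
		}
		return nil, err
	}
	return f, nil
}

func main() {
	const lockPath = "msgvault-sketch.lock" // hypothetical path

	held, err := tryLock(lockPath)
	if err != nil {
		fmt.Println("could not lock:", err)
		return
	}
	defer os.Remove(lockPath)
	defer held.Close()

	// A second attempt through a separate open file fails while the
	// first lock is still held.
	_, err = tryLock(lockPath)
	fmt.Println("second attempt blocked:", errors.Is(err, errLocked))
}
```

The lock is advisory: it only protects against other code that also calls Flock on the same file.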
Adopted from upstream:
- Dialect interface + loggedDB wrapper + structured logging pipeline
(kenn-io#276 PostgreSQL dialect refactor foundation)
- OpenReadOnly() for MCP read-only access
- IsBusyError, SchemaStale helpers
- Unified text import (kenn-io#238) — M365 OAuth (kenn-io#228), iMessage (kenn-io#224),
Google Voice (kenn-io#225) — all wholesale
- Search enhancements: regex, FTS5 snippets, sorting (kenn-io#252),
domain normalization (normalizeAddr, looksLikeDomain, gTLDs)
- rebuild-fts command (kenn-io#287), 8 bug fixes from kenn-io#254
- IMAP date filtering (kenn-io#222), greeting wait (kenn-io#248)
- Vector subsystem (kenn-io#277) — coexists with fork's AI Archive
Intelligence as parallel implementation; future cleanup needed
Build/runtime fixes applied during merge:
- Replaced mattn/go-sqlite3 imports with mutecomm/go-sqlcipher/v4
(drop-in API-compatible) to resolve duplicate symbol linker errors
- Dropped sqlite_vec from default BUILD_TAGS (requires SQLite 3.38+
APIs sqlcipher v4.4.2 does not expose; re-enable when sqlcipher
upgrades)
- safeRowsAffected helper in db_logger.go: defer recover around
RowsAffected() call (sqlcipher returns nil internal Result for
multi-statement DDL)
- Wired normalizeAddr into hot-path tokenizer for from:/to:/cc:/bcc:
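The safeRowsAffected fix above can be sketched as follows. This is an illustration of the defer/recover pattern only: the function name comes from the list, while the signature, the error text, and the two stand-in result types (panickyResult, okResult) are assumptions and do not reproduce db_logger.go.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// panickyResult is a hypothetical stand-in for a driver result whose
// RowsAffected panics, as described for multi-statement DDL.
type panickyResult struct{}

func (panickyResult) LastInsertId() (int64, error) { return 0, nil }
func (panickyResult) RowsAffected() (int64, error) { panic("nil internal result") }

// okResult is a hypothetical well-behaved result.
type okResult struct{ n int64 }

func (r okResult) LastInsertId() (int64, error) { return 0, nil }
func (r okResult) RowsAffected() (int64, error) { return r.n, nil }

// safeRowsAffected calls RowsAffected and converts a panic into an
// ordinary error, so a logging wrapper cannot crash the caller.
func safeRowsAffected(res sql.Result) (n int64, err error) {
	if res == nil {
		return 0, errors.New("nil sql.Result")
	}
	defer func() {
		if r := recover(); r != nil {
			n = 0
			err = fmt.Errorf("RowsAffected panicked: %v", r)
		}
	}()
	return res.RowsAffected()
}

func main() {
	n, err := safeRowsAffected(okResult{n: 3})
	fmt.Println(n, err)

	n, err = safeRowsAffected(panickyResult{})
	fmt.Println(n, err)
}
```

Named return values are what let the deferred function overwrite the result after recovering.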
Stubbed under unreachable build tag (need follow-up decision):
- cmd/msgvault/cmd/sync_gvoice.go — fork's sync API obsolete vs
upstream's import-based gvoice
- cmd/msgvault/cmd/sync_imessage.go — same situation
Verified: go build ./... passing, go vet clean, 45/45 test packages
pass with 0 failures. See MERGE_REPORT.md for file-by-file resolution
notes.
Fix and improve search accuracy and performance across the full-text, fast (Parquet), and aggregate query paths with word-boundary matching, better snippet handling, and consistent result sorting.
Problem
While importing my old 1990s-era email archives, I discovered a number of issues with search.
Proposed Solution
Changes
snippet inclusion, sort parameter, escapeRegex helper
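As a rough illustration of how an escaping helper and word-boundary matching fit together, here is a Go sketch. It assumes escapeRegex only needs to neutralize regex metacharacters and delegates to regexp.QuoteMeta. The wordBoundaryPattern function is hypothetical, and the real query paths may build their patterns differently (for example inside SQL).

```go
package main

import (
	"fmt"
	"regexp"
)

// escapeRegex stands in for the helper named above. Here it simply
// delegates to regexp.QuoteMeta, which is an assumption.
func escapeRegex(term string) string {
	return regexp.QuoteMeta(term)
}

// wordBoundaryPattern (hypothetical) compiles a case-insensitive
// pattern that matches term only as a whole word.
func wordBoundaryPattern(term string) (*regexp.Regexp, error) {
	return regexp.Compile(`(?i)\b` + escapeRegex(term) + `\b`)
}

func main() {
	re, err := wordBoundaryPattern("cat")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(re.MatchString("The Cat sat"))  // whole word
	fmt.Println(re.MatchString("concatenate")) // substring only
}
```

Note that `\b` in Go's regexp package is an ASCII word boundary, so this sketch only behaves as intended for terms that begin and end with ASCII letters, digits, or underscores.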
Testing
substring matches