feat(search): IndexedDB-backed persistent search index for all rooms #881
Open
Just-Insane wants to merge 23 commits into
Split the search room list into encrypted / plaintext buckets. Server search covers plaintext rooms unchanged. Encrypted rooms are searched synchronously against their in-memory live timeline so decrypted content is always available.
Key details:
- partitionRoomsByEncryption() splits the room filter; for global search (rooms=undefined) all joined encrypted rooms are scanned
- In-memory results are merged into the first page only (no pagination token for local results)
- For 'recent' order, groups are interleaved by timestamp; for 'rank' order, server results come first
- An info banner is shown when encrypted rooms were searched so users know coverage is limited to cached messages
- Controlled by features.encryptedSearch in config.json (default true)
- 18 unit tests covering matching, filtering, partitioning, merging
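The partition and merge rules above can be sketched as follows. This is a minimal illustration, not the PR's code: the `RoomInfo` and `ResultGroup` shapes, the `mergeFirstPage` name, and the signature given to `partitionRoomsByEncryption` are assumptions made for the example.

```typescript
// Hypothetical, simplified shapes; the real code works with matrix-js-sdk objects.
interface RoomInfo {
  roomId: string;
  encrypted: boolean;
}

interface ResultGroup {
  roomId: string;
  ts: number; // timestamp of the newest hit in the group
  source: "server" | "local";
}

type SearchOrder = "recent" | "rank";

// Split room IDs into the bucket searched locally and the bucket sent to the server.
function partitionRoomsByEncryption(rooms: RoomInfo[]): { encrypted: string[]; plaintext: string[] } {
  const encrypted: string[] = [];
  const plaintext: string[] = [];
  for (const room of rooms) {
    if (room.encrypted) {
      encrypted.push(room.roomId);
    } else {
      plaintext.push(room.roomId);
    }
  }
  return { encrypted, plaintext };
}

// Merge local results into the first page of server results.
function mergeFirstPage(server: ResultGroup[], local: ResultGroup[], order: SearchOrder): ResultGroup[] {
  const merged = server.concat(local); // new array, inputs are not mutated
  if (order === "recent") {
    // Interleave by timestamp, newest first.
    return merged.sort((a, b) => b.ts - a.ts);
  }
  // 'rank': the server's relevance order is kept and local results follow.
  return merged;
}
```

Because local results carry no pagination token, a caller would apply `mergeFirstPage` only to the first page and pass later server pages through unchanged.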
…arch
- Adds 'Encrypted Room Search' toggle to Settings > Experimental
- Setting defaults to true; operator can hard-disable via config.json features.encryptedSearch = false
- Lock icon shown next to encrypted rooms in the search room picker when the feature is active, indicating local-cache coverage
- useMessageSearch now checks both the operator flag and user setting
Removes the guard that hid the search icon in encrypted room headers. Encrypted rooms now navigate to message search pre-filtered to that room, showing in-memory results when the feature is enabled. Tooltip reads "Search (local cache)" for encrypted rooms.
- Fix DM rooms missing from search: replace useRooms (excludes DMs) with useSelectedRooms+isRoom selector so DM room IDs pass URL param validation; room picker always uses the full allRooms list
- Add SearchHasType (image/file/audio/video/link) to searchEncryptedRooms.ts with mEventMatchesHasTypes filtering in in-memory timeline search
- Add hasTypes to MessageSearchParams; pass contains_url:true for has:link on server requests; post-filter server results by msgtype/URL pattern
- Add HasFilterChips and SelectSenderButton components to SearchFilters; new has: row with Image/File/Audio/Video/Link toggles plus From: sender chips with Matrix ID input popup
- Wire has URL param through MessageSearch: parse, encode, pass to SearchFilters and msgSearchParams; add handleHasTypesChange/handleSendersChange
- Fix mDirects undefined crash in MessageSearch (re-add atom import)
- Allow has: filters to trigger search without a text term
- searchEncryptedRooms: skip body text check when lowerTerm is empty
- useMessageSearch: only early-return when both term and hasTypes are absent
- When no term: skip server search (server requires search_term), in-memory only
- MessageSearch: enable query when hasTypes is set even without a term
- Add DM search page at /direct/search/
- DIRECT_SEARCH_PATH constant in paths.ts
- getDirectSearchPath() helper in pathUtils.ts
- useDirectSearchSelected() hook in useDirectSelected.ts
- DirectSearch component (scoped to DM rooms)
- Route registered in Router.tsx
- 'Message Search' nav item added to Direct Messages panel
- RoomViewHeader: clicking search in a DM navigates to DM search
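The has: filtering described in the two commits above, including the case where no text term is given, might look roughly like this. It is a sketch under assumptions: `matchesHasTypes`, `matchesEvent`, the URL pattern, and the choice to treat several selected has: types as alternatives (OR) are illustrative and not taken from the PR.

```typescript
type SearchHasType = "image" | "file" | "audio" | "video" | "link";

// Hypothetical minimal view of a decrypted message's content.
interface MessageContent {
  msgtype?: string;
  body?: string;
}

const MSGTYPE_BY_HAS_TYPE: Record<Exclude<SearchHasType, "link">, string> = {
  image: "m.image",
  file: "m.file",
  audio: "m.audio",
  video: "m.video",
};

// Assumed URL pattern; the PR only says results are filtered by "msgtype/URL pattern".
const URL_PATTERN = /https?:\/\/\S+/i;

function matchesHasTypes(content: MessageContent, hasTypes: SearchHasType[]): boolean {
  if (hasTypes.length === 0) return true; // no has: chips selected
  return hasTypes.some((hasType) => {
    if (hasType === "link") {
      return URL_PATTERN.test(content.body ?? "");
    }
    return content.msgtype === MSGTYPE_BY_HAS_TYPE[hasType];
  });
}

function matchesEvent(content: MessageContent, lowerTerm: string, hasTypes: SearchHasType[]): boolean {
  // An empty term skips the body check so a has: filter alone can drive the search.
  const textMatches = lowerTerm === "" || (content.body ?? "").toLowerCase().indexOf(lowerTerm) !== -1;
  return textMatches && matchesHasTypes(content, hasTypes);
}
```

A caller would still return early when both the term and the has: list are empty, as the useMessageSearch bullet notes.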
…r for from: filter
…display names; fix DM create button alignment
…o unencrypted rooms
Typing > in the room search modal switches to message search mode. A 'Search messages: <query>' item appears; pressing Enter or clicking it navigates to the context-appropriate message search page with the term pre-filled:
- /direct/ context → DM message search
- /:spaceIdOrAlias/ context → space message search
- /home/ or other → home message search
The hint text is updated to include > for messages. The prefix is disabled when the modal is used for room-picking (forwarding).
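A rough sketch of the prefix handling and context routing described above. The function names, the `SearchMode` type, and the rule for recognising a space path (a first segment starting with `!` or `#` after URL decoding) are assumptions for illustration; the PR may detect context differently.

```typescript
type SearchMode =
  | { kind: "rooms"; query: string }
  | { kind: "messages"; term: string };

type SearchTarget = "direct" | "space" | "home";

// Interpret the modal input; the ">" prefix is ignored when the modal is picking rooms.
function parseModalInput(input: string, allowMessagePrefix: boolean): SearchMode {
  if (allowMessagePrefix && input.charAt(0) === ">") {
    return { kind: "messages", term: input.slice(1).trim() };
  }
  return { kind: "rooms", query: input };
}

// Choose which message search page to open based on the current path.
function messageSearchTarget(pathname: string): SearchTarget {
  const segments = pathname.split("/").filter((segment) => segment.length > 0);
  const first = segments[0] ?? "";
  if (first === "direct") return "direct";
  let decoded = first;
  try {
    decoded = decodeURIComponent(first);
  } catch (_err) {
    // Malformed escape sequence: fall back to the raw segment.
  }
  const sigil = decoded.charAt(0);
  // Assumption: room IDs start with "!" and aliases with "#".
  if (sigil === "!" || sigil === "#") return "space";
  return "home"; // "/home/" or any other context
}
```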
- Install MiniSearch 7.2.0 for TypeScript-native full-text search
- Add idbSearchIndex and searchIndexMessageLimit to settings atom
- Create SearchIndexToggle experimental toggle (second opt-in)
- Add searchWorker.ts Web Worker owning MiniSearch index + IDB persistence
- IDB schema: 'index' store (serialised index + room queues), 'backfill' store
- Multi-tab write safety via navigator.locks
- Debounced flush (5s) + beforeunload flush
- Per-room LRU eviction when queue exceeds 110% of configured limit
- Create useSearchIndex.tsx React context + hook
- Live indexing via RoomEvent.Timeline listener
- Headless EventTimelineSet backfill in idle callbacks
- Query, getStats, clearIndex public API
- Wrap ClientNonUIFeatures in SearchIndexProvider
- Add SearchIndexCache to Developer Tools: stats, per-room limit selector, backfill progress, clear button (auto-refreshes every 5s)
- Wire useMessageSearch to use IDB index when idbSearchIndex is enabled
- Export EMPTY_CONTEXT from searchEncryptedRooms for reuse
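The 110% eviction threshold in the list above could be implemented along these lines. This is only a sketch: `QueuedEvent`, `trimRoomQueue`, and the decision to drop the oldest events by timestamp until the queue is back at the limit are assumptions, since the commit message does not spell out the eviction order.

```typescript
// Hypothetical per-room queue entry.
interface QueuedEvent {
  eventId: string;
  ts: number;
}

interface TrimResult {
  kept: QueuedEvent[];
  evicted: QueuedEvent[];
}

// Trim a room's queue once it exceeds 110% of the configured limit.
// The 10% slack means eviction runs in occasional batches, not on every insert.
function trimRoomQueue(queue: QueuedEvent[], limit: number): TrimResult {
  const threshold = Math.floor(limit * 1.1);
  if (queue.length <= threshold) {
    return { kept: queue, evicted: [] };
  }
  // Oldest first, on a copy so the caller's array is left untouched.
  const byAge = queue.slice().sort((a, b) => a.ts - b.ts);
  const excess = byAge.length - limit;
  return { kept: byAge.slice(excess), evicted: byAge.slice(0, excess) };
}
```

The `evicted` list is what a worker would then remove from both the search index and the persisted store.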
…pted; update UI text
- Remove isRoomEncrypted guard from indexEvent and startBackfill so all non-space rooms are backfilled and live-indexed (not just encrypted ones)
- Add IDB chip-only query path for unencrypted rooms in useMessageSearch (useIdbSearch flag, usedIdbForUnencrypted for accurate inMemoryRoomCount)
- Rename 'Encrypted Search Index' → 'Message Search Index' throughout UI
- Update SearchIndexToggle description to disclose plaintext IDB storage
- Update EncryptedSearch description to clarify in-memory-only (no write)
- Remove stale mx dep from indexEvent useCallback (isRoomEncrypted removed)
- Cap simultaneous room backfills at 2 (MAX_CONCURRENT_BACKFILLS) so the HTTP connection pool is never saturated by pagination requests, keeping the /sync long-poll responsive on mobile.
- Track Matrix sync state in syncStateRef; pause backfill when sync is unhealthy (Error / Reconnecting) and automatically resume via a ClientEvent.Sync listener when it recovers.
- Raise the requestIdleCallback fallback delay from 200 ms to 1 s for environments (iOS Safari) that lack the API.
- Replace the 'schedule all at once' loop in startBackfill with a proper queue (backfillQueueRef) drained by resumeBackfill().
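The queue-plus-cap behaviour in the list above can be modelled without any Matrix dependencies. The class below is an illustrative sketch: `BackfillQueue`, its method names, and the two-value `SyncHealth` type are hypothetical stand-ins for the PR's `backfillQueueRef`, `resumeBackfill()`, and `syncStateRef`.

```typescript
type SyncHealth = "healthy" | "unhealthy";

class BackfillQueue {
  private pending: string[] = [];
  private active: string[] = [];
  private health: SyncHealth = "healthy";
  private readonly maxConcurrent: number;
  private readonly startRoom: (roomId: string) => void;

  constructor(maxConcurrent: number, startRoom: (roomId: string) => void) {
    this.maxConcurrent = maxConcurrent;
    this.startRoom = startRoom;
  }

  // Queue a room for backfill unless it is already queued or running.
  enqueue(roomId: string): void {
    if (this.pending.indexOf(roomId) === -1 && this.active.indexOf(roomId) === -1) {
      this.pending.push(roomId);
    }
    this.resume();
  }

  // Called when a room's backfill completes or is abandoned.
  finish(roomId: string): void {
    this.active = this.active.filter((id) => id !== roomId);
    this.resume();
  }

  // Called from a sync-state listener; recovery drains the queue again.
  setSyncHealth(health: SyncHealth): void {
    this.health = health;
    this.resume();
  }

  // Start queued rooms while sync is healthy and the cap allows.
  private resume(): void {
    if (this.health !== "healthy") return;
    while (this.active.length < this.maxConcurrent && this.pending.length > 0) {
      const next = this.pending.shift() as string;
      this.active.push(next);
      this.startRoom(next);
    }
  }
}
```

Passing `Infinity` as `maxConcurrent` removes the cap while keeping the pause-on-unhealthy behaviour.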
idbEventsToGroups now looks up each event via mx.getRoom().findEventById() and calls toSearchEvent() for full decrypted content (url, file, info). Falls back to msgtype m.text when the event is no longer in memory, preventing BrokenContent from showing 'Broken message: [filename]'. Regression introduced in 544658d which added ev.msgtype to the synthetic event without providing the media fields renderers require.
…results
Extended IndexableEvent with url/file/info/filename fields so media events render correctly from IDB without requiring the live room timeline cache.
Changes:
- types.ts: add optional url/file/info/filename to IndexableEvent
- toIndexableEvent: extract media fields from getContent() for m.image, m.file, m.audio, m.video
- searchWorker: add new fields to storeFields; bump IDB schema to v3 (clears old index so all rooms re-backfill with full media content)
- idbEventsToGroups: reconstruct full content from stored IDB fields; only fall back to m.text for pre-v3 entries that lack media fields
Previously only events still in the live timeline window rendered as images; all older history showed 'Broken message'. After re-backfill, all indexed media events will render with full thumbnails and previews.
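The reconstruction step and its m.text fallback could be sketched as below. The field names follow the commit message, but the exact `IndexableEvent` shape, the `contentFromIndexed` helper, and the test for "lacks media fields" (neither `url` nor `file` present) are assumptions.

```typescript
// Assumed shape; only the optional media fields are named in the commit message.
interface IndexableEvent {
  eventId: string;
  roomId: string;
  sender: string;
  ts: number;
  body: string;
  msgtype: string;
  url?: string;
  file?: unknown;
  info?: unknown;
  filename?: string;
}

const MEDIA_MSGTYPES = ["m.image", "m.file", "m.audio", "m.video"];

// Rebuild message content for rendering from the fields stored in the index.
function contentFromIndexed(ev: IndexableEvent): Record<string, unknown> {
  const isMedia = MEDIA_MSGTYPES.indexOf(ev.msgtype) !== -1;
  const hasMediaSource = ev.url !== undefined || ev.file !== undefined;
  if (isMedia && !hasMediaSource) {
    // Entry indexed before the media fields existed: render as plain text
    // so the renderer does not report a broken message.
    return { msgtype: "m.text", body: ev.body };
  }
  const content: Record<string, unknown> = { msgtype: ev.msgtype, body: ev.body };
  if (ev.url !== undefined) content["url"] = ev.url;
  if (ev.file !== undefined) content["file"] = ev.file;
  if (ev.info !== undefined) content["info"] = ev.info;
  if (ev.filename !== undefined) content["filename"] = ev.filename;
  return content;
}
```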
…uestIdleCallback is available
MAX_CONCURRENT_BACKFILLS is now Infinity on desktop/Android (where the browser's idle scheduler is the natural throttle) and 4 on iOS (where we cap concurrency to protect the HTTP connection pool). Also restores the iOS fallback delay to 150ms (was raised to 1000ms in bf4d8d6, making backfill ~5x slower with no benefit beyond caution).
…useSearchIndex
- MessageSearch.tsx: move VALID_HAS_TYPES to module scope as
Set<SearchHasType> (prefer-set-has); remove redundant !! on isRoom()
call (no-unnecessary-type-conversion).
- useMessageSearch.ts: remove unnecessary 'as IResultContext' on
EMPTY_CONTEXT; remove two unnecessary non-null assertions on searchIndex.
- searchEncryptedRooms.ts: fix no-unsafe-enum-comparison — cast
mEvent.getType() to EventType before comparing with EventType.RoomMessage.
- useSearchIndex.tsx: replace multi-line worker.postMessage with postToWorker()
(eliminates multi-line oxlint-disable-next-line scope issue); add
postToWorker to useEffect dep array; return () => {} in early exits
for consistent-return.
…ing sync begins with an initial window of 100 rooms, so `startBackfill` only sees those 100 when it first runs. Additional rooms are loaded progressively as the list window expands, firing `ClientEvent.Room` on the Matrix client. A new listener for that event enqueues each newly-discovered room (using its persisted backfill state if present, or a fresh default state) so all rooms are eventually indexed, not just the initial 100.
Description
Adds a Phase 2 IndexedDB-backed persistent search index, building on top of the in-memory encrypted-room search (#871). A MiniSearch 7.2.0 web worker owns the index and IDB persistence with multi-tab write safety via `navigator.locks`, debounced flushing, and per-room LRU eviction. Gated behind a second experimental settings toggle ("Message Search Index").
The index covers all rooms (not just encrypted ones), which deserves a note on rationale:
For unencrypted rooms the Matrix SDK's own IndexedDB sync cache already holds plaintext event bodies, so at first glance indexing them again looks redundant. However, the SDK's IDB is an opaque sync store: it isn't queryable for full-text search or chip-filter scans (Has: Image / File / Audio / Video / Link) without loading events page-by-page into memory, which means results are limited to what happens to be in the live timeline. The MiniSearch index buys O(1) chip-filter lookups across full backfilled history regardless of room type. Only a small set of fields is stored (eventId, roomId, sender, msgtype, ts, body), so the overhead is modest, and the result is consistent search depth across encrypted and unencrypted rooms alike.
Fixes #
Type of change
Checklist:
AI disclosure:
The worker owns a MiniSearch instance and IDB object stores for document records and a pending-flush queue. The main thread posts `ADD_EVENTS` messages after timeline updates; the worker debounces these into batches, acquires a `navigator.locks` exclusive write lock, and flushes to IDB. On startup the worker rehydrates MiniSearch by reading all stored document records from IDB. An LRU counter per room tracks total stored events; when the cap is exceeded the worker evicts the room with the oldest last-access timestamp and removes its records from IDB and MiniSearch.
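The debounce-then-flush step described above can be sketched as a small batcher. `FlushBatcher` and its members are hypothetical names; the real worker would perform the lock acquisition and IDB write inside the `flush` callback, which is kept abstract here so the sketch runs without browser APIs.

```typescript
class FlushBatcher<T> {
  private pending: T[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly delayMs: number;
  private readonly flush: (batch: T[]) => void;

  constructor(delayMs: number, flush: (batch: T[]) => void) {
    this.delayMs = delayMs;
    this.flush = flush;
  }

  // Called for each incoming ADD_EVENTS-style message; restarts the debounce window.
  add(items: T[]): void {
    this.pending = this.pending.concat(items);
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flushNow(), this.delayMs);
  }

  // Flush immediately, e.g. when the page is about to unload.
  flushNow(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.flush(batch); // the worker would take the write lock and persist here
  }
}
```

With a 5000 ms delay, a burst of timeline updates produces a single write once the burst has been quiet for five seconds, and `flushNow()` covers the unload case.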