Skip binary indexing during upgrade #446
Reindexing the whole database takes too much time. A possible workaround: skip binary text extraction (at least for documents) and index only metadata during the upgrade, then provide a tool or API for reindexing the files gradually later, while the site is already running.
Options for re-indexing binaries
The chosen solution:
During indexing, we create small tasks in a dedicated table for the documents whose binaries we skipped. After the site restarts, we load and lock these tasks so that they are not executed multiple times, and reindex those documents gradually.
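The task-and-lock idea can be sketched roughly as follows. This is only an illustration in Python with SQLite, not the actual implementation: the table name `reindex_tasks`, the column names, the 60-second lock timeout, and all function names are assumptions made for the example.

```python
import sqlite3
import time

# Hypothetical value: a lock older than this is treated as abandoned
# (for example, the worker that took it crashed) and may be claimed again.
LOCK_TIMEOUT_SECONDS = 60


def open_db(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions below are started and ended explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reindex_tasks ("
        " version_id INTEGER PRIMARY KEY,"
        " locked_at REAL)"
    )
    return conn


def register_skipped(conn, version_id):
    # Called during the upgrade for every document whose binary was skipped.
    conn.execute(
        "INSERT OR IGNORE INTO reindex_tasks (version_id) VALUES (?)",
        (version_id,),
    )


def claim_tasks(conn, batch_size, now=None):
    # Select and lock a batch inside one write transaction so that two
    # workers cannot claim the same task.
    now = time.time() if now is None else now
    stale_before = now - LOCK_TIMEOUT_SECONDS
    conn.execute("BEGIN IMMEDIATE")
    try:
        rows = conn.execute(
            "SELECT version_id FROM reindex_tasks"
            " WHERE locked_at IS NULL OR locked_at < ?"
            " ORDER BY version_id LIMIT ?",
            (stale_before, batch_size),
        ).fetchall()
        ids = [row[0] for row in rows]
        conn.executemany(
            "UPDATE reindex_tasks SET locked_at = ? WHERE version_id = ?",
            [(now, version_id) for version_id in ids],
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return ids


def complete_task(conn, version_id):
    # Remove the task once the document's binary has been reindexed.
    conn.execute("DELETE FROM reindex_tasks WHERE version_id = ?", (version_id,))
```

A background worker on the running site would call `claim_tasks` repeatedly with a small batch size, reindex each claimed document, and call `complete_task` afterwards. The timeout is one possible way to recover tasks whose worker died before finishing.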
During the patch there is no need to save the index document to the index itself. We only have to regenerate the binary serialized document stored in the Versions table, because the format of the index document stored in the Lucene index has not actually changed.
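As a rough sketch of that patch step, the loop below rebuilds the stored copy of each index document and writes it back to the versions table without touching the search index. Everything here is assumed for illustration: the `versions` table layout, the `metadata` and `index_document` columns, the helper names, and the use of JSON as a stand-in for the real binary serialization format.

```python
import json
import sqlite3


def build_index_document(version_id, metadata):
    # Hypothetical stand-in for building an index document from
    # metadata only (no binary text extraction).
    return {"version_id": version_id, "fields": metadata}


def regenerate_serialized_documents(conn):
    # Rewrite only the serialized copy stored next to each version.
    # The search index itself is deliberately not written here.
    rows = conn.execute("SELECT version_id, metadata FROM versions").fetchall()
    with conn:  # commits on success, rolls back on error
        for version_id, metadata_json in rows:
            doc = build_index_document(version_id, json.loads(metadata_json))
            blob = json.dumps(doc, sort_keys=True).encode("utf-8")
            conn.execute(
                "UPDATE versions SET index_document = ? WHERE version_id = ?",
                (blob, version_id),
            )
    return len(rows)
```

The function returns the number of rows it processed, which is handy for logging progress during the patch.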