Reindexing documents

Adam Hooper edited this page Jun 23, 2017 · 11 revisions

When Overview slices and dices your documents, it stores some parts of them in different places. The authoritative data store is Postgres. We also store text in a Lucene index for a speed boost; that index is derived data.

This page explains how to rebuild the Lucene data using the data in Postgres.

Why reindex?

You may wish to reindex:

  • If you had an unexpected failure and you aren't certain your Lucene data is correct
  • If you want to perform an upgrade and this seems like an easy option

Why not reindex?

Reindexing can take a few hours, and it will slow down Overview noticeably.
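
To gauge how much work a full reindex would queue, you could first count the document sets that are not deleted. This is a minimal sketch: it uses only the document_set table and deleted column that appear in the SQL in the next section, and it says nothing about how long each document set would actually take.

```sql
-- Sketch: how many document sets would a full reindex cover?
-- Uses the same filter as the reindex INSERT (WHERE NOT deleted).
SELECT COUNT(*) AS document_sets_to_reindex
FROM document_set
WHERE NOT deleted;
```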

How to reindex: the "do it now" approach

First, run this SQL incantation:

INSERT INTO document_set_reindex_job
  (document_set_id, last_requested_at, started_at, progress)
SELECT id, created_at, NULL, 0.0
FROM document_set
WHERE NOT deleted;

Then restart Overview. (More specifically: restart the Overview worker.)
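
After the restart, you may want to watch the queued jobs. The query below is a sketch that reads back the columns named in the INSERT above. How the worker updates started_at and progress, and whether it removes a row when its job finishes, are assumptions that this page does not document.

```sql
-- Sketch: inspect the reindex job table after restarting the worker.
-- Column names are taken from the INSERT above; the meaning of
-- started_at and progress during a run is assumed, not documented here.
SELECT document_set_id, last_requested_at, started_at, progress
FROM document_set_reindex_job
ORDER BY started_at NULLS LAST, document_set_id;
```

Rows with a NULL started_at are the ones the INSERT created that have presumably not been picked up yet.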