Updating

Vladimir Kotal edited this page Feb 22, 2021 · 11 revisions

Updating OpenGrok from one version to another is usually easy:

  1. extract the distribution archive
  2. deploy the new web application

When updating OpenGrok, it is usually prudent to update Universal ctags as well. That said, it is recommended to update them one at a time to avoid surprises: update OpenGrok first, wait for a couple of reindex runs, then update Universal ctags.

In general, the update should be performed only while no reindex is in progress.

If you are using the Python tools for managing OpenGrok, update them as well.

Sometimes a certain part of the configuration changes between versions. Such cases are treated separately and do not usually require a reindex from scratch.

Reindexing from scratch

However, if the index format has changed, it is necessary to reindex from scratch, i.e. delete all files under the data root and index anew. How do you tell that the index format changed? The release notes will say so. Since reindexing from scratch is usually a costly procedure (in both time and resources), for most production deployments it should be done in the background, to a dedicated data root (to avoid overwriting existing data), using a snapshot of the source root (to avoid indexing data in flux). Once it is finished, the data root can be switched (e.g. by renaming the directories) and the new web application deployed. Because the web application depends on the concrete format of the index, it must be switched only in the last step.
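The directory switch mentioned above can be sketched in Python as two renames. This is only a minimal illustration: the function name and all three paths are hypothetical, and it assumes the directories live on the same file system. Between the two renames the live path briefly does not exist, which is one reason to keep the web application stopped during the switch.

```python
import os
import tempfile


def switch_data_root(live, fresh, retired):
    """Swap a freshly built data root into place using two renames.

    All three paths are hypothetical. They should be on the same file
    system so that each rename is a cheap metadata operation.
    """
    if os.path.exists(retired):
        raise FileExistsError(f"refusing to overwrite {retired}")
    os.rename(live, retired)  # keep the old index around for rollback
    os.rename(fresh, live)    # the new index now sits at the live path


if __name__ == "__main__":
    # Demonstration on throwaway directories, not on a real deployment.
    with tempfile.TemporaryDirectory() as base:
        live = os.path.join(base, "data")
        fresh = os.path.join(base, "data.new")
        retired = os.path.join(base, "data.old")
        os.mkdir(live)
        os.mkdir(fresh)
        switch_data_root(live, fresh, retired)
        print(sorted(os.listdir(base)))
```

Keeping the old data root under a separate name, rather than deleting it right away, leaves a way back if the new web application misbehaves.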

This step can be made easier if you are using a file system that supports snapshots and data sets, such as ZFS.
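As a rough sketch of what that could look like with ZFS, the helper below only assembles command lines for `zfs snapshot`, `zfs clone` and `zfs create`; it does not run them. The function name and the dataset and snapshot names are hypothetical, and the layout (one dataset for the source root, a separate new dataset for the data root) is an assumption rather than something this page prescribes.

```python
import shlex


def zfs_commands(source_dataset, snapshot_name, stable_clone, new_data_dataset):
    """Return zfs command lines for a stable source image and a fresh data root.

    Every dataset name passed in is a placeholder chosen by the caller.
    """
    snapshot = f"{source_dataset}@{snapshot_name}"
    return [
        ["zfs", "snapshot", snapshot],             # freeze the source root
        ["zfs", "clone", snapshot, stable_clone],  # expose the frozen state as a dataset
        ["zfs", "create", new_data_dataset],       # separate dataset for the new index
    ]


if __name__ == "__main__":
    # Hypothetical pool and dataset names, printed for review before running.
    for cmd in zfs_commands("tank/opengrok/src", "reindex",
                            "tank/opengrok/src-stable", "tank/opengrok/data-new"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them against the actual pool layout before anything is executed.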

Example

Here is a basic outline of the steps to perform a reindex from scratch:

  1. make the latest configuration persistent (using the opengrok-projadm Python tool or by getting it via the RESTful API)
  2. create a temporary data root
  3. create a Java logging configuration file that logs everything to a distinct file
  4. extract the distribution from the tarball into a temporary location
  5. if using the Python tools, update them in the temporary location
  6. create a stable image of the source code
  7. if using read-only configuration, change it to reflect any OpenGrok configuration changes between the currently running OpenGrok version and the version being upgraded to (e.g. new/changed configuration options)
  8. reindex, making sure that:
    • the configuration is not sent to the web application (do not use the -U option)
    • Java has a bigger heap (-Xmx)
    • the specific logging configuration is used (the one created above)
    • the configuration file is written (-W)
    • fewer worker threads are used (-T) to impose less load on the system (half the number of CPUs in the system)
    • the source root is set to the temporary/stable location of the input data (-s)
    • the data root is set to the new location of the data root (-d)
  9. check indexer logs
  10. rewire the source and data root directories
  11. change the values of the sourceRoot and dataRoot properties in the configuration written by the indexer
  12. extract the distribution to the original location
  13. upgrade the Python tools in the original location
  14. deploy the new web app
  15. clean up the old source and data roots

It might be a good idea to stop Tomcat for the last few steps, starting with the step where the source and data root directories are rewired (step 10).
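To make the checklist in step 8 more concrete, here is a minimal Python sketch that assembles an indexer command line from the options named there. The function name, the jar path, the directories and the heap size are hypothetical placeholders. The logging configuration is passed through the standard `java.util.logging.config.file` JVM property. A real invocation will most likely carry further options from your existing setup, so treat this as an illustration of the checklist rather than a complete command.

```python
import os


def indexer_command(jar, source_root, data_root, config_out, logging_config,
                    heap="8g", threads=None):
    """Assemble a from-scratch indexer command line (all paths are placeholders)."""
    if threads is None:
        # Half the CPUs, but never fewer than one worker thread.
        threads = max(1, (os.cpu_count() or 2) // 2)
    return [
        "java",
        f"-Xmx{heap}",                                        # bigger heap
        f"-Djava.util.logging.config.file={logging_config}",  # dedicated log file setup
        "-jar", jar,
        "-s", source_root,   # stable image of the source
        "-d", data_root,     # temporary data root
        "-W", config_out,    # write the configuration to a file
        "-T", str(threads),  # fewer worker threads
        # -U is deliberately absent: the running web application
        # must not receive the new configuration yet.
    ]


if __name__ == "__main__":
    # Hypothetical locations, printed so the command can be reviewed first.
    print(" ".join(indexer_command(
        jar="/opengrok-new/lib/opengrok.jar",
        source_root="/opengrok/src-stable",
        data_root="/opengrok/data-new",
        config_out="/opengrok-new/etc/configuration.xml",
        logging_config="/opengrok-new/etc/logging-reindex.properties",
    )))
```

Building the argument list in one place makes it easy to verify, before a multi-day run starts, that -U is missing and that -s and -d point at the temporary locations.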

Notes

There is an important aspect to this: the reindex from scratch might take several days for large input data. During these days, the old indexer and the old web app can keep running just fine, because the new indexer is reading from a stable image (e.g. a snapshot) of the source root and writing to a separate data root. Thus, the old mirroring/indexing and the new indexing processes do not stomp on each other.

However, if the periodic mirroring/indexing keeps running alongside the reindex from scratch, the source root contents will have changed by the time of the switch. This would lead to inconsistency between the new index and the source root. To overcome this, one would stop the periodic mirroring/reindexing and perform a separate reindex using a fresh source root snapshot. Then the switchover can be performed and the periodic mirror/reindex can be enabled once again.

This is similar to what is done, for example, when migrating a virtual machine across physical hosts: the bulk of the memory pages is transferred first, and then incremental updates are transferred just before the machine is stopped and brought back to life at the new destination.

Just to give you an idea about time, the reindex step (number 8) above can take a number of days to complete, while the incremental reindex might take hours or minutes.