Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Update search index restore playbook wrt k8s.
Update the playbook for restoring the search indices from backup snapshots, with regard to the move to Kubernetes. - Remove references to `ssh` and update the example commands. - Remove an unnecessary step (no need to list the available snapshot repos, because there's only one backup repo and its name is fixed). - Add missing steps for how to tell when the restore is done. - Add a version of the procedure for restoring all the indices, so that we can recover from losing the search cluster.
- Loading branch information
Showing
1 changed file
with
64 additions
and
30 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,56 +1,90 @@ | ||
--- | ||
owner_slack: "#govuk-searchandnav" | ||
title: Backup and restore Elasticsearch indices | ||
title: Restore Elasticsearch indices from backup | ||
parent: "/manual.html" | ||
layout: manual_layout | ||
section: Backups | ||
--- | ||
|
||
GOV.UK uses AWS Managed Elasticsearch which takes daily snapshots of | ||
the cluster as part of the managed service. These are stored in a S3 | ||
bucket that is not made available to us. Restoration is done by | ||
making HTTP requests to the `_snapshot` endpoint. | ||
## Background | ||
|
||
To restore a snapshot, follow these steps: | ||
AWS Managed Elasticsearch automatically takes hourly snapshots for backup and | ||
disaster recovery purposes. The snapshot data is stored in an Amazon-owned S3 | ||
bucket that is not directly available to us via S3 but is configured as an | ||
Elasticsearch snapshot repository called `cs-automated-enc`. | ||
|
||
0. SSH to a `search` box: | ||
Restores are done via the Elasticsearch API, by making HTTP requests to the | ||
`_snapshot` endpoint. | ||
|
||
``` | ||
gds govuk connect ssh -e integration search | ||
``` | ||
We also have a `govuk-production` snapshot repository, which is normally only | ||
used for copying indices from production to the non-production environments. | ||
|
||
0. Query the `_snapshot` endpoint of Elasticsearch to get the snapshot | ||
repository name: | ||
## Restore a specific index from a snapshot | ||
|
||
``` | ||
govuk_setenv search-api \ | ||
bash -c 'curl "$ELASTICSEARCH_URI/_snapshot?pretty"' | ||
1. List the available backup snapshots in the `cs-automated-enc` snapshot | ||
respository and identify the snapshot that you want to restore from. | ||
|
||
```sh | ||
k exec deploy/search-api -- \ | ||
sh -c 'curl "$ELASTICSEARCH_URI/_snapshot/cs-automated-enc/_all?pretty"' | ||
``` | ||
|
||
0. Query the `_all` endpoint to identify the available snapshots in | ||
the named repository: | ||
This can take a few seconds. | ||
|
||
``` | ||
govuk_setenv search-api \ | ||
bash -c 'curl "$ELASTICSEARCH_URI/_snapshot/<repository-name>/_all?pretty"' | ||
2. If an index already exists with the same name as the one you want to | ||
restore, delete the existing index. | ||
|
||
```sh | ||
k exec deploy/search-api -- sh -c 'curl -XDELETE "$ELASTICSEARCH_URI/<index-name>"' | ||
``` | ||
|
||
0. If an index already exists with the same name as the one being | ||
restored, delete the existing index: | ||
3. Restore the index from the snapshot. Fill in `<snaphot-id>` and | ||
`<index-name>` as appropriate. | ||
|
||
```sh | ||
k exec deploy/search-api -- \ | ||
sh -c 'curl -XPOST -H 'Content-Type: application/json' "$ELASTICSEARCH_URI/_snapshot/cs-automated-enc/<snapshot-id>/_restore" -d "{\"indices\": \"<index-name>\"}"' | ||
``` | ||
govuk_setenv search-api \ | ||
bash -c 'curl -XDELETE "$ELASTICSEARCH_URI/<index-name>"' | ||
|
||
4. The restore can take a few minutes. The `/_cat/recovery` resource gives an | ||
indication of progress. | ||
|
||
```sh | ||
k exec deploy/search-api -- sh -c 'curl "$ELASTICSEARCH_URI/_cat/recovery"' | ||
``` | ||
|
||
0. Restore the index from the snapshot: | ||
5. Once the restore has finished, [reprocess any content | ||
changes](/manual/fix-out-of-date-search-indices.html) that happened after | ||
the backup. | ||
|
||
> The reprocessing step is necessary in order to bring the restored index up | ||
> to date, because GOV.UK's indexing is incremental only. In other words, | ||
> there is no regular full reindex. | ||
## Restore all indices from a snapshot | ||
|
||
Restoring all indices is a similar procedure to restoring a specific index. | ||
|
||
1. Identify the snapshot to restore. See step 1 above. | ||
|
||
1. Delete all indices. | ||
|
||
```sh | ||
k exec deploy/search-api -- sh -c 'curl -XDELETE "$ELASTICSEARCH_URI/_all"' | ||
``` | ||
govuk_setenv search-api \ | ||
bash -c 'curl -XPOST -H 'Content-Type: application/json' "$ELASTICSEARCH_URI/_snapshot/<repository-name>/<snapshot-id>/_restore" -d "{\"indices\": \"<index-name>\"}"' | ||
|
||
1. Restore all indices from the snapshot. | ||
|
||
```sh | ||
k exec deploy/search-api -- \ | ||
sh -c 'curl -XPOST "$ELASTICSEARCH_URI/_snapshot/cs-automated-enc/<snapshot-id>/_restore"' | ||
``` | ||
|
||
> Further information about Elasticsearch snapshots can be found in the [AWS documentation](https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html) | ||
1. Once the restore has finished, reprocess recent content changes to bring the | ||
indices up to date. See steps 4 and 5 above. | ||
|
||
## Further reading | ||
|
||
After a restore has taken place, you will need to [fix the out-of-date search indices](/manual/fix-out-of-date-search-indices.html) | ||
following the restore, since any changes made in publishing apps since the backup was taken will be missing. | ||
See [Restoring | ||
snapshots](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-snapshots.html#managedomains-snapshot-restore) | ||
in the AWS Managed Elasticsearch/Opensearch documentation. |