#Elasticsearch snapshot and restore

Elasticsearch's snapshot and restore APIs let you create snapshots of individual indices or of an entire cluster in a remote repository. Snapshots can be saved to many repository types, such as a shared file system, shared UNC paths, Amazon S3 (and other cloud providers), HDFS, and more.

In this post I will briefly explain how to take a snapshot of a cluster running on one machine and restore it on another. I will focus on taking the snapshot on an Amazon EC2 instance, using Amazon S3 as the repository, and restoring it on another Amazon EC2 instance.

I have deliberately kept this as simple as possible. If you need more advanced options, you can always refer to the Elasticsearch documentation.

##Prerequisites

  • This post assumes that you have at least two running Amazon EC2 instances
  • Elasticsearch is already installed on each instance
  • The cloud-aws plugin must be installed on each instance. Here is how to install it:
  #From your Elasticsearch installation directory (usually /usr/share/elasticsearch), run
  sudo bin/plugin install cloud-aws
    
  #Restart Elasticsearch afterwards (make sure you know how to do it without harming your cluster)
  service elasticsearch restart
  • For more detailed information on setting up these prerequisites, you may refer to the following article

##REST APIs for cluster snapshot to S3 and restore

Define the snapshot repository in Elasticsearch (in this post the repository is named my_snapshot)

  #Register the snapshot repository. Refer to Elasticsearch documentation for advanced options
  PUT _snapshot/my_snapshot
  {
    "type": "s3",
    "settings": {
      "bucket": "your_predifined_s3_bucket",
      "region": "us-west-1",
      "base_path": "path_under_s3_bucket",
      "access_key": "your_amazon_s3_accesskey",
      "secret_key": "your_amazon_s3_secretkey"
    }
  }
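
If you would rather script this step than type it into a console, the same repository definition can be built with Python's standard library. This is only a minimal sketch: the helper name `build_repository_request` and the `http://localhost:9200` address are assumptions for illustration, and the request is built but not sent.

```python
import json
import urllib.request


def build_repository_request(base_url, repository, bucket, region,
                             base_path, access_key, secret_key):
    """Build (but do not send) the PUT request that registers an S3 repository.

    Hypothetical helper: it only mirrors the JSON body shown above.
    """
    body = {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "region": region,
            "base_path": base_path,
            "access_key": access_key,
            "secret_key": secret_key,
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repository}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed local node address; the other values are the placeholders used above.
request = build_repository_request(
    "http://localhost:9200", "my_snapshot",
    "your_predifined_s3_bucket", "us-west-1", "path_under_s3_bucket",
    "your_amazon_s3_accesskey", "your_amazon_s3_secretkey",
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it to a running node. Building the request separately makes it easy to reuse the exact same definition on the second machine.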

Take a snapshot and give it a name (e.g. snapshot_1)

  #Run the snapshot process
  PUT /_snapshot/my_snapshot/snapshot_1?wait_for_completion=true
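
The request above snapshots the whole cluster. To snapshot only some indices, the same call accepts a JSON body with an `indices` list and an `include_global_state` flag. The sketch below shows how such a call could be assembled; the helper name `build_snapshot_call` and the index names `index_1` and `index_2` are hypothetical.

```python
import json
from urllib.parse import quote


def build_snapshot_call(repository, snapshot, indices=None,
                        include_global_state=True, wait=True):
    """Return (method, path, body) for a create-snapshot request.

    Hypothetical helper: body is None when the whole cluster is snapshotted.
    """
    path = f"/_snapshot/{quote(repository)}/{quote(snapshot)}"
    if wait:
        # Block until the snapshot has finished, as in the request above.
        path += "?wait_for_completion=true"
    body = None
    if indices:
        body = json.dumps({
            "indices": ",".join(indices),  # comma-separated index names
            "include_global_state": include_global_state,
        })
    return "PUT", path, body


# index_1 and index_2 are made-up index names for illustration.
method, path, body = build_snapshot_call(
    "my_snapshot", "snapshot_1",
    indices=["index_1", "index_2"], include_global_state=False,
)
print(method, path)
print(body)
```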

Check the status of the snapshot process (the operation may take time to complete)

  #Get snapshot status
  GET /_snapshot/my_snapshot/_status
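
If you start the snapshot without wait_for_completion, you can poll until it finishes. The sketch below assumes you poll `GET /_snapshot/my_snapshot/snapshot_1` (the per-snapshot information request rather than the `_status` request above) and that its parsed response has the shape `{"snapshots": [{"snapshot": ..., "state": ...}]}`. The function names and the `fetch_info` callable are hypothetical, and `fetch_info` is expected to perform the HTTP call and return the parsed JSON.

```python
import time

# States after which the snapshot will not change any more.
TERMINAL_STATES = {"SUCCESS", "PARTIAL", "FAILED", "INCOMPATIBLE"}


def snapshot_state(info, snapshot):
    """Return the state of `snapshot` from a parsed response, or None if absent."""
    for entry in info.get("snapshots", []):
        if entry.get("snapshot") == snapshot:
            return entry.get("state")
    return None


def wait_for_snapshot(fetch_info, snapshot, interval_seconds=5.0, max_attempts=60):
    """Call fetch_info() repeatedly until the snapshot reaches a terminal state."""
    for _ in range(max_attempts):
        state = snapshot_state(fetch_info(), snapshot)
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval_seconds)
    raise TimeoutError(f"snapshot {snapshot!r} did not finish in time")


# Canned responses stand in for a real cluster in this example.
fake_responses = iter([
    {"snapshots": [{"snapshot": "snapshot_1", "state": "IN_PROGRESS"}]},
    {"snapshots": [{"snapshot": "snapshot_1", "state": "SUCCESS"}]},
])
print(wait_for_snapshot(lambda: next(fake_responses), "snapshot_1", interval_seconds=0))
```

Injecting `fetch_info` keeps the polling logic independent of the HTTP client you choose.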

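Verify the repository (this checks that the repository is accessible from the nodes in the cluster; it does not inspect the contents of a snapshot)

  #Verify repository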
  POST /_snapshot/my_snapshot/_verify

##Reload the snapshot on another machine

Define the same snapshot repository as above:

  #Register the same repository on the other machine
  PUT _snapshot/my_snapshot
  {
    "type": "s3",
    "settings": {
      "bucket": "your_predifined_s3_bucket",
      "region": "us-west-1",
      "base_path": "path_under_s3_bucket",
      "access_key": "your_amazon_s3_accesskey",
      "secret_key": "your_amazon_s3_secretkey"
    }
  }

Run the restore process on the second cluster

  #Run restore
  POST /_snapshot/my_snapshot/snapshot_1/_restore
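
The restore request above restores everything in the snapshot. It also accepts a JSON body to restore only some indices or to rename them on the way in. Here is a minimal sketch of building such a body; the helper name `build_restore_body` and the index names are hypothetical, while `indices`, `rename_pattern`, `rename_replacement` and `include_global_state` are the restore options being illustrated.

```python
import json


def build_restore_body(indices=None, rename_pattern=None,
                       rename_replacement=None, include_global_state=False):
    """Return the JSON body for POST /_snapshot/<repo>/<snapshot>/_restore.

    Hypothetical helper: only the options that are set end up in the body.
    """
    body = {"include_global_state": include_global_state}
    if indices:
        body["indices"] = ",".join(indices)  # comma-separated index names
    if rename_pattern and rename_replacement:
        # Restored indices matching the pattern get the replacement name.
        body["rename_pattern"] = rename_pattern
        body["rename_replacement"] = rename_replacement
    return json.dumps(body, sort_keys=True)


# Restore two made-up indices under a "restored_" prefix.
print(build_restore_body(
    indices=["index_1", "index_2"],
    rename_pattern="index_(.+)",
    rename_replacement="restored_index_$1",
))
```

Renaming on restore is handy when the second cluster already has an index with the same name.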

Run the same checks as above on the second cluster

Please note that you can control many parameters, such as which indices are snapshotted or restored, whether index metadata is included, the snapshot rate, bulk sizes and more. All of them are explained well in the Elasticsearch documentation.

Hope that this post helps :)
