Azure deployment

Create the machine images with Packer

Go to the packer folder and see the README there. Once you have the machine image IDs, return here and continue with the next steps.

Create a key pair or use your own

This deployment is configured to use your default SSH keys as machine credentials. If you want to use other keys, change the path to the keys you want to use (look for key_path in variables.tf). Use this guide to generate new keys if needed.
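If you would rather use a dedicated key pair for this deployment, a minimal sketch with ssh-keygen could look like the following. The file name es_azure_rsa and the key comment are hypothetical choices, and RSA is used on the assumption that it is the most widely accepted key type for Azure Linux VMs. After generating the key, point key_path in variables.tf at it.

```shell
# Hypothetical default location for the new key; change it to suit your setup.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/es_azure_rsa}"

# Generate a 4096-bit RSA key pair without a passphrase.
# $1 = path of the private key to create; the public key is written to "$1.pub".
generate_key() {
  mkdir -p "$(dirname "$1")"
  ssh-keygen -q -t rsa -b 4096 -N "" -C "elasticsearch-azure" -f "$1"
}

# Usage:
#   generate_key "$KEY_FILE"
```

Note that ssh-keygen asks before overwriting an existing file, so pick a path that does not exist yet.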

Configurations

Edit variables.tf to specify the following:

  • azure_location - the Azure location in which to launch the cluster.
  • azure_subscription_id, azure_client_id, azure_client_secret, azure_tenant_id - the same credentials used in the Packer step. See the README there for instructions on how to retrieve them.
  • es_cluster - the name of the Elasticsearch cluster to launch.
  • key_path - the filesystem path to the SSH key to use as the virtual machines' login credentials.
  • data_instance_type, master_instance_type, client_instance_type - Azure machine instance types to use for each machine type in the cluster.
  • security_enabled, monitoring_enabled - whether to enable X-Pack Security and Monitoring features, respectively.
  • client_user - the username for the HTTP basic authentication that is enabled on the client nodes. The password is generated automatically and can be retrieved after deployment by running terraform output.

The remaining settings mostly concern cluster topology and machine types and sizes.
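As an alternative to editing the defaults in variables.tf, Terraform also reads any environment variable named TF_VAR_<name> as the value of the input variable <name>. The sketch below sets a few of the variables listed above this way; all values are placeholders, and it assumes nothing else (such as a terraform.tfvars file or a -var flag) overrides them.

```shell
# Placeholder values only; substitute your own.
export TF_VAR_azure_location="eastus"
export TF_VAR_es_cluster="elasticsearch-cluster-foo"
export TF_VAR_client_user="exampleuser"
export TF_VAR_monitoring_enabled="true"
export TF_VAR_security_enabled="false"

# The Azure credentials (azure_subscription_id, azure_client_id,
# azure_client_secret, azure_tenant_id) can be exported the same way,
# which keeps secrets out of variables.tf.
```

Variables exported like this only apply to Terraform commands run from the same shell session.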

Cluster topology

Two modes of deployment are supported:

  • A recommended configuration, with dedicated master-eligible nodes, data nodes, and client nodes. This is a production-ready and best-practice configuration. See more details in the official documentation.
  • Single-node mode, mostly useful for experimentation.

At this point we treat the ingest role as synonymous with the data role, so all data nodes are also ingest nodes.

The default mode is the single-node mode. To change it to the recommended configuration, edit variables.tf and set the number of master nodes to 3, data nodes to at least 2, and client nodes to at least 1.
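The node-count rule above can be expressed as a small pre-flight check. This is only a sketch: the function name is made up for illustration, and the counts passed in are placeholders for whatever you set in variables.tf.

```shell
# Returns 0 when the node counts match the recommended topology:
# exactly 3 master nodes, at least 2 data nodes and at least 1 client node.
# $1 = master nodes, $2 = data nodes, $3 = client nodes
is_recommended_topology() {
  [ "$1" -eq 3 ] && [ "$2" -ge 2 ] && [ "$3" -ge 1 ]
}

# Example with placeholder counts.
if is_recommended_topology 3 2 1; then
  echo "recommended topology"
else
  echo "single-node or non-standard topology"
fi
```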

All nodes with the client role are attached to an Azure load balancer, so every client node can be reached via the DNS name it exposes.

Launch the cluster with Terraform

terraform init
terraform plan
terraform apply

When Terraform is done, you should see a lot of output ending with something like this:

Apply complete! Resources: 14 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate

Outputs:

public_dns = elasticsearch-cluster-foo.eastus.cloudapp.azure.com
vm_password = rBTKoLsf7x8ODZVd

Note public_dns and vm_password: these are your entry point to the cluster and the password for the default user, exampleuser.
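The outputs can be read back at any time with terraform output. Here is a minimal sketch of a helper for doing so, using the output names from the sample above; tf_output is a made-up name. Newer Terraform versions print string outputs in quotes unless you pass -raw, so adjust the call if you are on such a version.

```shell
# Print a single Terraform output value by name.
# $1 = output name, for example public_dns or vm_password
tf_output() {
  terraform output "$1"
}

# Usage (run from the directory that holds terraform.tfstate):
#   host="$(tf_output public_dns)"
#   password="$(tf_output vm_password)"
```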

Look around

The client nodes are the ones exposed to external networks. They provide endpoints for Kibana, Grafana, Cerebro and direct Elasticsearch access. By default client nodes are accessible via their public IPs and the DNS of the load balancer they are attached to (see above).

Client nodes listen on port 8080 and are password protected. Access is managed by nginx, which expects a username and password pair. The default username is exampleuser and the password is generated automatically when deploying. You can change those defaults by editing this file and running Packer again.

On client nodes you will find:

  • Kibana is served directly on port 80 of the load balancer host (http://host)
  • Cerebro (a cluster management UI) is available on http://host/cerebro/
  • For direct Elasticsearch access, go to http://host/es/
  • In the single-node deployment mode, the default port is 8080 and the host is the machine host (not the load balancer)
  • Grafana is accessible on port 3000 - http://host:3000/

The default credentials are exampleuser as the username and the password generated by Terraform during deployment (shown as vm_password when you run terraform output).
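To check that the endpoints respond, something like the following curl sketch can be used. It assumes the multi-node layout, where the load balancer serves everything on port 80; the helper names are made up, and _cluster/health is the standard Elasticsearch health endpoint reached here through the /es/ path.

```shell
# Build the URL of a service behind the load balancer.
# $1 = host (for example the public_dns output), $2 = path such as /es/ or /cerebro/
service_url() {
  printf 'http://%s%s\n' "$1" "$2"
}

# Query cluster health through the /es/ endpoint using HTTP basic authentication.
# $1 = host, $2 = username, $3 = password
cluster_health() {
  curl --silent --show-error --fail --user "$2:$3" \
    "$(service_url "$1" "/es/_cluster/health?pretty")"
}

# Usage:
#   cluster_health elasticsearch-cluster-foo.eastus.cloudapp.azure.com exampleuser "$password"
```

In the single-node mode, pass the machine host with the port appended (host:8080) instead of the load balancer host.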

Elastic's X-Pack is deployed on the cluster out of the box with monitoring enabled but security disabled. You should enable and set up X-Pack Security for any production deployment.

To ssh to one of the instances:

ssh ubuntu@{public IP / DNS of the instance or load balancer}

Backups

The Azure repository plugin is installed on the cluster and ready to be used for index snapshots and (should you ever need) a restore. Official documentation is available here: https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-azure-usage.html
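As a rough sketch, registering a repository and taking a snapshot through the snapshot API could look like this. It assumes the storage account credentials are already configured for the plugin; the repository, container and snapshot names are hypothetical, and the base URL is the /es endpoint described earlier.

```shell
# Register an Azure snapshot repository.
# $1 = Elasticsearch base URL (e.g. http://host/es), $2 = user:password,
# $3 = repository name, $4 = Azure storage container name
register_azure_repo() {
  body="$(printf '{"type": "azure", "settings": {"container": "%s"}}' "$4")"
  curl --silent --show-error --fail --user "$2" \
    -X PUT "$1/_snapshot/$3" \
    -H 'Content-Type: application/json' \
    -d "$body"
}

# Snapshot all indices and wait for the snapshot to complete.
# $1 = Elasticsearch base URL, $2 = user:password,
# $3 = repository name, $4 = snapshot name
create_snapshot() {
  curl --silent --show-error --fail --user "$2" \
    -X PUT "$1/_snapshot/$3/$4?wait_for_completion=true"
}

# Usage (hypothetical names):
#   register_azure_repo http://host/es "exampleuser:$password" my_backup es-snapshots
#   create_snapshot http://host/es "exampleuser:$password" my_backup snapshot_1
```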

Automatic and manual scale-out

The entire stack is deployed using Azure scale-sets, which are easy to scale up and down manually (from the Azure portal, from the command line, or using the same Terraform scripts), or automatically based on host metrics and application metrics using Azure scale-set features.
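For the command-line route, a manual scale-out with the Azure CLI might look like the sketch below. The resource group and scale set names are hypothetical; look up the real ones in the Azure portal or in the Terraform state.

```shell
# Scale a VM scale set to a new instance count with the Azure CLI.
# $1 = resource group, $2 = scale set name, $3 = desired number of instances
scale_vmss() {
  az vmss scale --resource-group "$1" --name "$2" --new-capacity "$3"
}

# Usage (hypothetical names):
#   scale_vmss my-es-resource-group es-data-nodes 4
```

Keep in mind that a capacity changed outside Terraform may be reverted by the next terraform apply if the node counts in variables.tf say otherwise.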

Elastic Discovery on Azure

Unfortunately, the story of cluster discovery on Azure is practically non-existent. There is an Azure "Classic" discovery plugin that has been deprecated since circa 5.0, and Elastic has yet to release a properly working discovery plugin (there is a PR for one, which has been open for over a year now, if you want to track it).

A discovery plugin on a public cloud is important because it takes a lot of complexity off your hands by handling the initial discovery of cluster nodes through the available cloud APIs.

Having none available, I defaulted to using a VNet and naming conventions. Another viable option is file-based discovery, in which a file describing your cluster is uploaded to the images and used as a seed.
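For illustration, a seed file for file-based discovery is just a list of host:port entries, one per line, conventionally named unicast_hosts.txt. The sketch below generates one from a naming convention. The es-master-N host names are hypothetical and are not the convention used by this deployment, and 9300 is the default transport port. The setting that enables the file provider, and the directory the file belongs in, differ between Elasticsearch versions, so check the documentation for yours.

```shell
# Write a seed-host list, one host:port entry per line, to the given file.
# $1 = host name prefix, $2 = number of master nodes, $3 = output file
write_unicast_hosts() {
  i=0
  : > "$3"   # start from an empty file
  while [ "$i" -lt "$2" ]; do
    printf '%s-%d:9300\n' "$1" "$i" >> "$3"
    i=$((i + 1))
  done
}

# Usage (hypothetical prefix):
#   write_unicast_hosts es-master 3 unicast_hosts.txt
```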