---
title: Cloud Migration
summary: Use a local cluster to simulate migrating from one cloud platform to another.
toc: false
---

CockroachDB's flexible replication controls make it easy to run a single CockroachDB cluster across cloud platforms, or to migrate a cluster from one cloud to another, without any service interruption. This page walks you through a local simulation of the process.

Before You Begin

Make sure you have already installed CockroachDB.

Step 1. Start a 3-node cluster on "cloud 1"

If you've already started a local cluster, these commands should be familiar to you. The new flag to note is --locality, which accepts key-value pairs that describe the locality of a node. In this case, you're using the flag to specify that the first 3 nodes are running on "cloud platform 1".
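In a real deployment, --locality usually carries multiple comma-separated key-value tiers, ordered from most inclusive to least inclusive. For illustration only (the region and zone values here are hypothetical; don't run this as part of the simulation):

# Hypothetical multi-tier locality, most inclusive tier first:
$ cockroach start --insecure \
--locality=cloud=1,region=us-east,zone=us-east-1a \
--store=examplenode \
--host=localhost

For this simulation, a single cloud tier is enough: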

# In a new terminal, start node 1 on cloud 1:
$ cockroach start --insecure \
--locality=cloud=1 \
--store=cloud1node1 \
--host=localhost \
--cache=100MB

# In a new terminal, start node 2 on cloud 1:
$ cockroach start --insecure \
--locality=cloud=1 \
--store=cloud1node2 \
--host=localhost \
--port=26258 \
--http-port=8081 \
--join=localhost:26257 \
--cache=100MB

# In a new terminal, start node 3 on cloud 1:
$ cockroach start --insecure \
--locality=cloud=1 \
--store=cloud1node3 \
--host=localhost \
--port=26259 \
--http-port=8082 \
--join=localhost:26257 \
--cache=100MB
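Once all 3 nodes are running, you can verify that they've joined the same cluster (this assumes the node subcommand available in your version of cockroach):

$ cockroach node status --insecure --host=localhost --port=26257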

Step 2. Set up HAProxy load balancing

Each CockroachDB node is an equally suitable SQL gateway to your cluster, but to balance client requests evenly across available nodes, you can use a TCP load balancer. HAProxy is one of the most popular open-source TCP load balancers, and CockroachDB includes a built-in command for generating a configuration file preset to work with your running cluster, so you'll use that tool here.

In a new terminal, install HAProxy. On macOS with Homebrew:

$ brew install haproxy

Then run the cockroach gen haproxy command, specifying the port of any node:

$ cockroach gen haproxy --insecure --host=localhost --port=26257

This command generates an haproxy.cfg file automatically configured to work with the 3 nodes of your running cluster. In the file, change bind :26257 to bind :26270. This changes the port on which HAProxy accepts requests to a port that is not already in use by a node and that won't be used by the nodes you'll add later.

global
  maxconn 4096

defaults
    mode                tcp
    timeout connect     10s
    timeout client      1m
    timeout server      1m

listen psql
    bind :26270
    mode tcp
    balance roundrobin
    server cockroach1 localhost:26257
    server cockroach2 localhost:26258
    server cockroach3 localhost:26259
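Before starting HAProxy, you can optionally have it validate the edited file; the -c flag checks the configuration and exits:

$ haproxy -c -f haproxy.cfg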

Start HAProxy, with the -f flag pointing to the haproxy.cfg file:

$ haproxy -f haproxy.cfg
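To confirm that HAProxy is routing requests to the cluster, you can run a quick query through its port (this assumes the SQL client's --execute/-e flag in your version):

$ cockroach sql --insecure --host=localhost --port=26270 -e 'SHOW DATABASES;'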

Step 3. Start a load generator

CockroachDB provides a version of the YCSB load generator, which lets you simulate several client connections performing mixed read/write workloads against the cluster.

In a new terminal, install and build the CockroachDB version of YCSB:

$ go get github.com/cockroachdb/loadgen/ycsb

Then start the load generator:

$ ycsb -duration 20m -splits 50 -tolerate-errors -concurrency 10 -rate-limit 100 'postgresql://root@localhost:26270?sslmode=disable'

This command initiates 10 concurrent client workloads for 20 minutes, but limits each worker to 100 operations per second, i.e., at most 1,000 operations per second in total (since you're running everything on a single machine).

Step 4. Watch data balance across all 3 nodes

Open the Admin UI at http://localhost:8080 and hover over the SQL Queries graph at the top. After a minute or so, you'll see that the load generator is executing approximately 95% reads and 5% writes across all nodes:

[Screenshot: CockroachDB Admin UI]

Scroll down a bit and hover over the Replicas per Node graph. Because CockroachDB replicates each piece of data 3 times by default, the replica count on each of your 3 nodes should be identical:

[Screenshot: CockroachDB Admin UI]

Step 5. Add 3 nodes on "cloud 2"

Again, the flag to note is --locality, which you're using to specify that these next 3 nodes are running on "cloud platform 2".

{{site.data.alerts.callout_info}}If you were running nodes across clouds for real, you'd also configure firewalls to allow inbound and outbound communication between all the nodes. {{site.data.alerts.end}}

# In a new terminal, start node 4 on cloud 2:
$ cockroach start --insecure \
--locality=cloud=2 \
--store=cloud2node4 \
--host=localhost \
--port=26261 \
--http-port=8083 \
--join=localhost:26257 \
--cache=100MB

# In a new terminal, start node 5 on cloud 2:
$ cockroach start --insecure \
--locality=cloud=2 \
--store=cloud2node5 \
--host=localhost \
--port=26262 \
--http-port=8084 \
--join=localhost:26257 \
--cache=100MB

# In a new terminal, start node 6 on cloud 2:
$ cockroach start --insecure \
--locality=cloud=2 \
--store=cloud2node6 \
--host=localhost \
--port=26263 \
--http-port=8085 \
--join=localhost:26257 \
--cache=100MB

Step 6. Watch data balance across all 6 nodes

Back in the Admin UI, hover over the Replicas per Node graph again. Because you used the --locality flag to specify that the nodes are running across 2 clouds, you'll see an approximately even replica count on each node, indicating that CockroachDB has automatically rebalanced replicas across the cloud platforms:

[Screenshot: CockroachDB Admin UI]

Step 7. Migrate all data to "cloud 2"

In a new terminal, edit the default replication zone, adding a hard constraint that all replicas must be on nodes with --locality=cloud=2:

$ echo 'constraints: [+cloud=2]' | cockroach zone set .default --insecure --host=localhost -f -

{{site.data.alerts.callout_info}} As you'll see in the next step, as long as the --locality flag was set properly on nodes, this single command is all it takes to initiate an automatic migration from one cloud to another.{{site.data.alerts.end}}
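To confirm that the constraint took effect, you can print the default zone configuration back out (this assumes the zone get subcommand in your version):

$ cockroach zone get .default --insecure --host=localhost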

Step 8. Verify the data migration

Back in the Admin UI, hover over the Replicas per Node graph again. Very soon, you should see the replica count double on nodes 4, 5, and 6 and drop to 0 on nodes 1, 2, and 3:

[Screenshot: CockroachDB Admin UI]

This indicates that all data has been migrated from "cloud 1" to "cloud 2". In a real cloud migration scenario, at this point you would update the load balancer to point to the nodes on "cloud 2" and then stop the nodes on "cloud 1". But for the purpose of this local simulation, there's no need to do that.
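For reference, the updated listen section of haproxy.cfg in that scenario might look like this (a sketch only; the server names are arbitrary):

listen psql
    bind :26270
    mode tcp
    balance roundrobin
    server cockroach4 localhost:26261
    server cockroach5 localhost:26262
    server cockroach6 localhost:26263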

Step 9. Stop the cluster

Stop HAProxy and YCSB by switching into their terminals and pressing CTRL + C. Do the same for each CockroachDB node.

{{site.data.alerts.callout_success}}For the last node, the shutdown process will take longer (about a minute) and will eventually force kill the node. This is because, with only 1 node still online, a majority of replicas are no longer available (2 of 3), and so the cluster is not operational. To speed up the process, press CTRL + C a second time.{{site.data.alerts.end}}
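As an alternative to CTRL + C, you can stop a node gracefully with the quit command, which drains the node before shutting it down (this assumes cockroach quit in your version); for example, for node 1:

$ cockroach quit --insecure --host=localhost --port=26257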

If you don't plan to restart the cluster, you may want to remove the nodes' data stores and the HAProxy config file:

$ rm -rf cloud1node1 cloud1node2 cloud1node3 cloud2node4 cloud2node5 cloud2node6 haproxy.cfg

What's Next?

Use a local cluster to explore these other core CockroachDB features: