roachprod

⚠️ roachprod is an internal tool for creating and testing CockroachDB clusters. Use at your own risk! ⚠️

Setup

  1. Make sure you have gcloud installed and configured (gcloud auth list to check, gcloud auth login to authenticate). You may want to update old installations (gcloud components update).
  2. Build a local binary of roachprod: make bin/roachprod
  3. Add $PWD/bin to your PATH so you can run roachprod from the root directory of cockroach.
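Steps 1 and 3 can be sketched in shell as below. This is a minimal sketch, not part of roachprod: the GCLOUD_STATUS variable is a hypothetical name used only for illustration, and the snippet assumes it is run from the root directory of cockroach.

```shell
# Step 1: check that gcloud is on PATH before relying on it.
# GCLOUD_STATUS is a hypothetical helper variable, not something roachprod reads.
if command -v gcloud >/dev/null 2>&1; then
  GCLOUD_STATUS="found"
else
  GCLOUD_STATUS="missing"
  echo "gcloud not found: install it, then run 'gcloud auth login'" >&2
fi

# Step 3: prepend $PWD/bin to PATH unless it is already listed,
# so re-running this snippet does not add duplicate entries.
case ":${PATH}:" in
  *":${PWD}/bin:"*) ;;
  *) export PATH="${PWD}/bin:${PATH}" ;;
esac
```

Adding this to a shell profile would pin PATH to whatever directory the shell starts in, so it is probably best run by hand from the cockroach checkout.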

Summary

  • By default, clusters are created in the cockroach-ephemeral GCE project. Use the --gce-project flag or the GCE_PROJECT environment variable to create clusters in a different GCE project. Note that the lifetime functionality requires roachprod gc --gce-project=<name> to be run periodically (e.g., via a cron job). This is provided out of the box only for the cockroach-ephemeral project.
  • Anyone can connect to any port on VMs in cockroach-ephemeral. DO NOT STORE SENSITIVE DATA.
  • Cluster names are prefixed with the user creating them. For example, roachprod create test creates the marc-test cluster.
  • VMs have a default lifetime of 12 hours (changeable with the --lifetime flag).
  • Default settings create 4 VMs (-n 4) with 4 CPUs, 15GB memory (--machine-type=n1-standard-4), and local SSDs (--local-ssd).
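To make the defaults above concrete, the following sketch spells them out as explicit flags and prints the resulting command instead of running it. The project name my-gce-project is hypothetical, the fallback user name marc is only there for environments where USER is unset, and the 12h value assumes --lifetime accepts the same duration format on create as it does on extend.

```shell
# Hypothetical project name, used only for illustration.
GCE_PROJECT="my-gce-project"

# Cluster names are prefixed with the creating user's name.
CLUSTER="${USER:-marc}-test"

# Spell out the documented defaults as explicit flags.
set -- create "${CLUSTER}" \
  -n 4 \
  --machine-type=n1-standard-4 \
  --local-ssd \
  --lifetime=12h \
  --gce-project="${GCE_PROJECT}"

# Dry run: print the command rather than creating VMs.
CREATE_CMD="roachprod $*"
echo "${CREATE_CMD}"
```

Printing the command first is a cheap way to check the cluster name and project before any VMs are created.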

Cluster quick-start using roachprod

# Create a cluster with 4 nodes and local SSD. The last node is used as a
# load generator for some tests. Note that the cluster name must always begin
# with your username.
export CLUSTER="${USER}-test"
roachprod create ${CLUSTER} -n 4 --local-ssd

# Add gcloud SSH key. Optional for most commands, but some require it.
ssh-add ~/.ssh/google_compute_engine

# Stage binaries.
roachprod stage ${CLUSTER} workload
roachprod stage ${CLUSTER} release v2.0.5

# ...or copy a binary over with roachprod put (e.g., your locally built binary).
build/builder.sh mkrelease
roachprod put ${CLUSTER} cockroach-linux-2.6.32-gnu-amd64 cockroach

# Start a cluster.
roachprod start ${CLUSTER}

# Check the admin UI.
roachprod admin --open ${CLUSTER}:1

# Run a workload.
roachprod run ${CLUSTER}:4 -- ./workload init kv
roachprod run ${CLUSTER}:4 -- ./workload run kv --read-percent=0 --splits=1000 --concurrency=384 --duration=5m

# Open a SQL connection to the first node.
roachprod sql ${CLUSTER}:1

# Extend lifetime by another 6 hours.
roachprod extend ${CLUSTER} --lifetime=6h

# Destroy the cluster.
roachprod destroy ${CLUSTER}

Command reference

Warning: this reference is incomplete. Be prepared to refer to the CLI help text and the source code.

Create a cluster

$ roachprod create foo
Creating cluster marc-foo with 3 nodes
OK
marc-foo: 23h59m42s remaining
  marc-foo-0000   [marc-foo-0000.us-east1-b.cockroach-ephemeral]
  marc-foo-0001   [marc-foo-0001.us-east1-b.cockroach-ephemeral]
  marc-foo-0002   [marc-foo-0002.us-east1-b.cockroach-ephemeral]
Syncing...

Choosing a Provider

Use the --clouds flag to set which cloud provider(s) to use. For example:

$ roachprod create foo --clouds gce,aws

Node Distribution Options

Two kinds of flags interact to determine whether nodes are created in a single zone or across geographically distributed zones:

  • --geo
  • the --[provider]-zones flags (--gce-zones, --aws-zones, --azure-locations)

Here's what to expect when the options are combined:

  • If neither is set: nodes are all placed within one of the provider's default zones
  • --geo only: nodes are spread across the provider's default zones
  • --[provider]-zones or --geo --[provider]-zones: nodes are spread across all the specified zones
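The three cases above can be summarized as a small decision function. This is an illustrative model of the documented behavior, not roachprod's actual implementation; the placement function and its arguments are hypothetical names.

```shell
# placement GEO ZONES
#   GEO:   "true" if --geo was passed, anything else otherwise
#   ZONES: the value given to a --[provider]-zones flag, or "" if unset
# Prints where nodes would land according to the rules listed above.
placement() {
  geo="$1"
  zones="$2"
  if [ -n "${zones}" ]; then
    # Explicit zones win, with or without --geo.
    echo "spread across specified zones: ${zones}"
  elif [ "${geo}" = "true" ]; then
    echo "spread across provider default zones"
  else
    echo "all in one provider default zone"
  fi
}

placement false ""
placement true ""
placement true "us-east1-b"
```

The point to note is that specifying zones overrides the single-zone default whether or not --geo is also given.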

Interact using crl-prod tools

roachprod populates hosts files in ~/.roachprod/hosts. These are used by crl-prod tools to map clusters to node addresses.

$ crl-ssh marc-foo all df -h /
1: marc-foo-0000.us-east1-b.cockroach-ephemeral
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        49G  1.2G   48G   3% /

2: marc-foo-0001.us-east1-b.cockroach-ephemeral
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        49G  1.2G   48G   3% /

3: marc-foo-0002.us-east1-b.cockroach-ephemeral
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        49G  1.2G   48G   3% /

Interact using roachprod directly

# Add SSH key
$ ssh-add ~/.ssh/google_compute_engine

$ roachprod status marc-foo
marc-foo: status 3/3
   1: not running
   2: not running
   3: not running

SSH into hosts

roachprod uses gcloud to sync the list of hostnames to ~/.ssh/config and set up keys.

$ ssh marc-foo-0000.us-east1-b.cockroach-ephemeral

List clusters

$ roachprod list
marc-foo: 23h58m27s remaining
  marc-foo-0000
  marc-foo-0001
  marc-foo-0002
Syncing...

Destroy cluster

$ roachprod destroy marc-foo
Destroying cluster marc-foo with 3 nodes
OK

See roachprod help <command> for further details.

Future improvements

  • Bigger loadgen VM (last instance)

  • Make it easier to create test metadata and then run a series of tests using roachprod <cluster> test <dir1> <dir2> .... Perhaps something like roachprod prepare <test> <binary>.

  • Automatically detect stalled tests and restart tests upon unexpected failures. Detection of stalled tests could be done by noticing zero output for a period of time.

  • Detect crashed cockroach nodes.
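As a rough illustration of the stalled-test idea above (noticing zero output for a period of time), one option is to compare the size of a test's log file before and after a quiet period. This is a sketch only: is_stalled, the log path, and the interval are hypothetical and do not exist in roachprod.

```shell
# is_stalled LOG_FILE QUIET_SECS
# Returns success (0) if LOG_FILE did not grow during QUIET_SECS seconds,
# which is treated here as "the test produced zero output".
is_stalled() {
  log_file="$1"
  quiet_secs="$2"
  size_before=$(wc -c < "${log_file}")
  sleep "${quiet_secs}"
  size_after=$(wc -c < "${log_file}")
  [ "${size_before}" -eq "${size_after}" ]
}

# Example usage (hypothetical log path and interval):
#   if is_stalled /tmp/test.log 300; then echo "test appears stalled"; fi
```

A size comparison misses a log that is truncated and rewritten to the same length, so a real implementation would likely want something more robust.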
