70 changes: 69 additions & 1 deletion v2.2/cockroach-workload.md
@@ -50,6 +50,7 @@ Workload | Description
`kv` | Reads and writes to keys spread (by default, uniformly at random) across the cluster.<br><br>For this workload, you run `workload init` to load the schema and then `workload run` to generate data.
`startrek` | Loads a `startrek` database, with two tables, `episodes` and `quotes`.<br><br>For this workload, you run only `workload init` to load the data. The `workload run` subcommand is not applicable.
`tpcc` | Simulates a transaction processing workload using a rich schema of multiple tables.<br><br>For this workload, you run `workload init` to load the schema and then `workload run` to generate data.
`ycsb` | Simulates a high-scale key-value workload, either read-heavy, write-heavy, or scan-based, with additional customizations.<br><br>For this workload, you run `workload init` to load the schema and then `workload run` to generate data.

## Flags

@@ -61,7 +62,7 @@ The `cockroach workload` command does not support connection or security flags l

Flag | Description
-----|------------
`--concurrency` | The number of concurrent workers.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `8`
`--concurrency` | The number of concurrent workers.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** 2 * number of CPUs
`--db` | The SQL database to use.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `bank`
`--drop` | Drop the existing database, if it exists.<br><br>**Applicable commands:** `init` or `run`. For the `run` command, this flag must be used in conjunction with `--init`.
`--duration` | The duration to run.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `0`, which means run forever.
@@ -138,6 +139,30 @@ Flag | Description
`--workers` | The number of concurrent workers.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `--warehouses` * 10
`--zones` | The number of [replication zones](configure-replication-zones.html) for partitioning. This number should match the number of `--partitions` and the zones used to start the cluster.<br><br>**Applicable command:** `init`

### `ycsb` workload

Flag | Description
-----|------------
`--concurrency` | The number of concurrent workers.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `8`
`--db` | The SQL database to use.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `ycsb`
`--drop` | Drop the existing database, if it exists.<br><br>**Applicable commands:** `init` or `run`. For the `run` command, this flag must be used in conjunction with `--init`.
`--duration` | The duration to run.<br><br>**Applicable command:** `run`<br>**Default:** `0`, which means run forever.
`--families` | Place each column in its own [column family](column-families.html).<br><br>**Applicable commands:** `init` or `run`
`--histograms` | The file to write per-op incremental and cumulative histogram data to.<br><br>**Applicable command:** `run`
`--init` | Automatically run the `init` command.<br><br>**Applicable command:** `run`
`--initial-rows` | Initial number of rows to sequentially insert before beginning random number generation.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `10000`
`--json` | Use JSONB rather than relational data.<br><br>**Applicable commands:** `init` or `run`
`--max-ops` | The maximum number of operations to run.<br><br>**Applicable command:** `run`
`--max-rate` | The maximum frequency of operations (reads/writes).<br><br>**Applicable command:** `run`<br>**Default:** `0`, which means unlimited.
`--method` | The SQL issue method (`prepare`, `noprepare`, `simple`).<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `prepare`
`--pprofport` | The port for the pprof endpoint.<br><br>**Applicable commands:** `init` or `run`. For the `run` command, this flag must be used in conjunction with `--init`.<br>**Default:** `33333`
`--ramp` | The duration over which to ramp up load.<br><br>**Applicable command:** `run`
`--request-distribution` | Distribution for the random number generator (`zipfian`, `uniform`).<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `zipfian`
`--seed` | The random number generator seed.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `1`
`--splits` | Number of [splits](split-at.html) to perform before starting normal operations.<br><br>**Applicable commands:** `init` or `run`
`--tolerate-errors` | Keep running on error.<br><br>**Applicable command:** `run`
`--workload` | The type of workload to run (`A`, `B`, `C`, `D`, or `F`). For details about these workloads, see [YCSB Workloads](https://github.com/brianfrankcooper/YCSB/wiki/Core-Workloads).<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `B`
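
For example, to run YCSB's update-heavy workload `A` with a uniform key distribution for 5 minutes, you might combine several of these flags (a sketch against a local insecure node; adjust the connection string for your cluster):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--init \
--workload=A \
--request-distribution=uniform \
--duration=5m \
'postgresql://root@localhost:26257?sslmode=disable'
~~~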

### Logging

By default, the `cockroach workload` command logs errors to `stderr`.
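
To keep a record of any errors, one option is to redirect `stderr` to a file (a minimal sketch; the log filename is just an example):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--duration=10m \
'postgresql://root@localhost:26257?sslmode=disable' \
2>> ycsb-errors.log
~~~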
@@ -400,6 +425,49 @@ $ cockroach start \
600.0s 0 823902 1373.2 5.8 5.5 10.0 15.2 209.7
~~~

### Run the `ycsb` workload

1. Load the initial schema and data:

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload init ycsb \
'postgresql://root@localhost:26257?sslmode=disable'
~~~

2. Run the workload for 10 minutes:

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--duration=10m \
'postgresql://root@localhost:26257?sslmode=disable'
~~~

You'll see per-operation statistics print to standard output every second:

~~~
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
1s 0 9258.1 9666.6 0.7 1.3 2.0 8.9 read
1s 0 470.1 490.9 1.7 2.9 4.1 5.0 update
2s 0 10244.6 9955.6 0.7 1.2 2.0 6.6 read
2s 0 559.0 525.0 1.6 3.1 6.0 7.3 update
3s 0 9870.8 9927.4 0.7 1.4 2.4 10.0 read
3s 0 500.0 516.6 1.6 4.2 7.9 15.2 update
4s 0 9847.2 9907.3 0.7 1.4 2.4 23.1 read
4s 0 506.8 514.2 1.6 3.7 7.6 17.8 update
5s 0 10084.4 9942.6 0.7 1.3 2.1 7.1 read
5s 0 537.2 518.8 1.5 3.5 10.0 15.2 update
...
~~~

After the specified duration (10 minutes in this case), the workload will stop and you'll see totals printed to standard output:

~~~
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
600.0s 0 4728286 7880.2 1.0 0.9 2.2 5.2 268.4
~~~
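
You can also bound a run by operation count rather than duration, using the `--max-ops` flag documented above (a sketch; the operation count is just an example):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--max-ops=1000000 \
'postgresql://root@localhost:26257?sslmode=disable'
~~~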

## See also

- [`cockroach demo`](cockroach-demo.html)
58 changes: 42 additions & 16 deletions v2.2/demo-automatic-cloud-migration.md
@@ -6,19 +6,16 @@ toc: true

CockroachDB's flexible [replication controls](configure-replication-zones.html) make it trivially easy to run a single CockroachDB cluster across cloud platforms and to migrate data from one cloud to another without any service interruption. This page walks you through a local simulation of the process.


## Watch the demo

<iframe width="560" height="315" src="https://www.youtube.com/embed/cCJkgZy6s2Q" frameborder="0" allowfullscreen></iframe>

## Step 1. Install prerequisites

In this tutorial, you'll use CockroachDB, the HAProxy load balancer, and CockroachDB's version of the YCSB load generator, which requires Go. Before you begin, make sure these applications are installed:
In this tutorial, you'll use CockroachDB, its built-in `ycsb` workload, and the HAProxy load balancer. Before you begin, make sure these applications are installed:

- Install the latest version of [CockroachDB](install-cockroachdb.html).
- Install [HAProxy](http://www.haproxy.org/). If you're on a Mac and using Homebrew, use `brew install haproxy`.
- Install [Go](https://golang.org/doc/install) version 1.9 or higher. If you're on a Mac and using Homebrew, use `brew install go`. You can check your local version by running `go version`.
- Install the [CockroachDB version of YCSB](https://github.com/cockroachdb/loadgen/tree/master/ycsb): `go get github.com/cockroachdb/loadgen/ycsb`

Also, to keep track of the data files and logs for your cluster, you may want to create a new directory (e.g., `mkdir cloud-migration`) and start all your nodes in that directory.

@@ -125,18 +122,47 @@ Start HAProxy, with the `-f` flag pointing to the `haproxy.cfg` file:
$ haproxy -f haproxy.cfg
~~~

## Step 5. Start a load generator

Now that you have a load balancer running in front of your cluster, let's use the YCSB load generator that you installed earlier to simulate multiple client connections, each performing mixed read/write workloads.

In a new terminal, start `ycsb`, pointing it at HAProxy's port:

{% include copy-clipboard.html %}
~~~ shell
$ $HOME/go/bin/ycsb -duration 20m -tolerate-errors -concurrency 10 -max-rate 1000 'postgresql://root@localhost:26000?sslmode=disable'
~~~

This command initiates 10 concurrent client workloads for 20 minutes, but limits the total load to 1000 operations per second (since you're running everything on a single machine).
## Step 5. Run a sample workload

Now that you have a load balancer running in front of your cluster, let's use the YCSB workload built into CockroachDB to simulate multiple client connections, each performing mixed read/write workloads.

1. In a new terminal, load the initial `ycsb` schema and data, pointing it at HAProxy's port:

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload init ycsb \
'postgresql://root@localhost:26000?sslmode=disable'
~~~

2. Run the `ycsb` workload, pointing it at HAProxy's port:

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--duration=20m \
--concurrency=10 \
--max-rate=1000 \
'postgresql://root@localhost:26000?sslmode=disable'
~~~

This command initiates 10 concurrent client workloads for 20 minutes, but limits the total load to 1000 operations per second (since you're running everything on a single machine).

You'll soon see per-operation statistics print to standard output every second:

~~~
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
1s 0 9258.1 9666.6 0.7 1.3 2.0 8.9 read
1s 0 470.1 490.9 1.7 2.9 4.1 5.0 update
2s 0 10244.6 9955.6 0.7 1.2 2.0 6.6 read
2s 0 559.0 525.0 1.6 3.1 6.0 7.3 update
3s 0 9870.8 9927.4 0.7 1.4 2.4 10.0 read
3s 0 500.0 516.6 1.6 4.2 7.9 15.2 update
4s 0 9847.2 9907.3 0.7 1.4 2.4 23.1 read
4s 0 506.8 514.2 1.6 3.7 7.6 17.8 update
5s 0 10084.4 9942.6 0.7 1.3 2.1 7.1 read
5s 0 537.2 518.8 1.5 3.5 10.0 15.2 update
...
~~~

## Step 6. Watch data balance across all 3 nodes

49 changes: 16 additions & 33 deletions v2.2/training/fault-tolerance-and-automated-repair.md
@@ -20,7 +20,7 @@ Make sure you have already completed [Cluster Startup and Scaling](cluster-start

## Step 1. Set up load balancing

In this module, you'll run a load generator to simulate multiple client connections. Each node is an equally suitable SQL gateway for the load, but it's always recommended to spread requests evenly across nodes. You'll use the open-source [HAProxy](http://www.haproxy.org/) load balancer to do that here.
In this module, you'll run a sample workload to simulate multiple client connections. Each node is an equally suitable SQL gateway for the load, but it's always recommended to spread requests evenly across nodes. You'll use the open-source [HAProxy](http://www.haproxy.org/) load balancer to do that here.

1. In a new terminal, install HAProxy. If you're on a Mac and use Homebrew, run:

@@ -91,54 +91,37 @@ In this module, you'll run a load generator to simulate multiple client connecti
$ haproxy -f haproxy.cfg
~~~

## Step 2. Start a load generator
## Step 2. Run a sample workload

Now that you have a load balancer running in front of your cluster, download and start a load generator to simulate client traffic.
Now that you have a load balancer running in front of your cluster, use the YCSB workload built into CockroachDB to simulate multiple client connections, each performing mixed read/write workloads.

1. In a new terminal, download the archive for the CockroachDB version of YCSB, and extract the binary:
1. In a new terminal, load the initial `ycsb` schema and data, pointing it at HAProxy's port:

<div class="filters clearfix">
<button style="width: 15%" class="filter-button" data-scope="mac">Mac</button>
<button style="width: 15%" class="filter-button" data-scope="linux">Linux</button>
</div>
<p></p>

<div class="filter-content" markdown="1" data-scope="mac">
{% include copy-clipboard.html %}
~~~ shell
$ curl {{site.url}}/docs/v2.2/training/resources/crdb-ycsb-mac.tar.gz \
| tar -xJ
~~~
</div>

<div class="filter-content" markdown="1" data-scope="linux">
{% include copy-clipboard.html %}
~~~ shell
$ wget -qO- {{site.url}}/docs/v2.2/training/resources/crdb-ycsb-linux.tar.gz \
| tar xvz
$ cockroach workload init ycsb \
'postgresql://root@localhost:26000?sslmode=disable'
~~~
</div>

2. Start `ycsb`, pointing it at HAProxy's port:
2. Run the `ycsb` workload, pointing it at HAProxy's port:

{% include copy-clipboard.html %}
~~~ shell
$ ./ycsb \
-duration 20m \
-tolerate-errors \
-concurrency 3 \
-splits 50 \
-max-rate 100 \
'postgresql://root@localhost:26000?sslmode=disable'
$ cockroach workload run ycsb \
--duration=20m \
--concurrency=3 \
--max-rate=1000 \
--splits=50 \
'postgresql://root@localhost:26000?sslmode=disable'
~~~

This command initiates 3 concurrent client workloads for 20 minutes, but limits the benchmark to just 100 operations per second (since you're running everything on a single machine).
This command initiates 3 concurrent client workloads for 20 minutes, but limits the total load to 1000 operations per second (since you're running everything on a single machine).

Also, the `-splits` flag tells the load generator to manually split ranges a number of times. This is not something you'd normally do, but for the purpose of this training, it makes it easier to visualize the movement of data in the cluster.
Also, the `--splits` flag tells the workload to manually split ranges a number of times. This is not something you'd normally do, but for the purpose of this training, it makes it easier to visualize the movement of data in the cluster.
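
To see the ranges these splits create, one option is the `SHOW EXPERIMENTAL_RANGES` statement (a sketch, assuming the insecure cluster and HAProxy port from Step 1):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach sql --insecure --host=localhost --port=26000 \
--execute="SHOW EXPERIMENTAL_RANGES FROM TABLE ycsb.usertable;"
~~~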

## Step 3. Check the workload

Initially, the load generator creates a new database called `ycsb`, creates a `usertable` table in that database, and inserts a bunch of rows into the table. Soon, the load generator starts executing approximately 95% reads and 5% writes.
Initially, the workload creates a new database called `ycsb`, creates a `usertable` table in that database, and inserts a bunch of rows into the table. Soon, the workload starts executing approximately 95% reads and 5% writes.
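
You can verify the schema and initial data from a SQL shell (a sketch, again assuming the insecure cluster and HAProxy port from Step 1):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach sql --insecure --host=localhost --port=26000 \
--execute="SHOW TABLES FROM ycsb;" \
--execute="SELECT count(*) FROM ycsb.usertable;"
~~~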

1. To check the SQL queries getting executed, go back to the Admin UI at <a href="http://localhost:8080" data-proofer-ignore>http://localhost:8080</a>, click **Metrics** on the left, and hover over the **SQL Queries** graph at the top:

Binary file removed v2.2/training/resources/crdb-ycsb-linux.tar.gz
Binary file removed v2.2/training/resources/crdb-ycsb-mac.tar.gz