70 changes: 69 additions & 1 deletion v2.2/cockroach-workload.md
@@ -50,6 +50,7 @@ Workload | Description
`kv` | Reads and writes to keys spread (by default, uniformly at random) across the cluster.<br><br>For this workload, you run `workload init` to load the schema and then `workload run` to generate data.
`startrek` | Loads a `startrek` database, with two tables, `episodes` and `quotes`.<br><br>For this workload, you run only `workload init` to load the data. The `workload run` subcommand is not applicable.
`tpcc` | Simulates a transaction processing workload using a rich schema of multiple tables.<br><br>For this workload, you run `workload init` to load the schema and then `workload run` to generate data.
`ycsb` | Simulates a high-scale key-value workload, either read-heavy, write-heavy, or scan-based, with additional customizations.<br><br>For this workload, you run `workload init` to load the schema and then `workload run` to generate data.

## Flags

@@ -61,7 +62,7 @@ The `cockroach workload` command does not support connection or security flags l

Flag | Description
-----|------------
`--concurrency` | The number of concurrent workers.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `8`
`--concurrency` | The number of concurrent workers.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** 2 * number of CPUs
`--db` | The SQL database to use.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `bank`
`--drop` | Drop the existing database, if it exists.<br><br>**Applicable commands:** `init` or `run`. For the `run` command, this flag must be used in conjunction with `--init`.
`--duration` | The duration to run.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `0`, which means run forever.
@@ -138,6 +139,30 @@ Flag | Description
`--workers` | The number of concurrent workers.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `--warehouses` * 10
`--zones` | The number of [replication zones](configure-replication-zones.html) for partitioning. This number should match the number of `--partitions` and the zones used to start the cluster.<br><br>**Applicable command:** `init`

### `ycsb` workload

Flag | Description
-----|------------
`--concurrency` | The number of concurrent workers.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `8`
`--db` | The SQL database to use.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `ycsb`
`--drop` | Drop the existing database, if it exists.<br><br>**Applicable commands:** `init` or `run`. For the `run` command, this flag must be used in conjunction with `--init`.
`--duration` | The duration to run.<br><br>**Applicable command:** `run`<br>**Default:** `0`, which means run forever.
`--families` | Place each column in its own [column family](column-families.html).<br><br>**Applicable commands:** `init` or `run`
`--histograms` | The file to write per-op incremental and cumulative histogram data to.<br><br>**Applicable command:** `run`
`--init` | Automatically run the `init` command.<br><br>**Applicable command:** `run`
`--initial-rows` | Initial number of rows to sequentially insert before beginning random number generation.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `10000`
`--json` | Use JSONB rather than relational data.<br><br>**Applicable commands:** `init` or `run`
`--max-ops` | The maximum number of operations to run.<br><br>**Applicable command:** `run`
`--max-rate` | The maximum frequency of operations (reads/writes).<br><br>**Applicable command:** `run`<br>**Default:** `0`, which means unlimited.
`--method` | The SQL issue method (`prepare`, `noprepare`, `simple`).<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `prepare`
`--pprofport` | The port for the pprof endpoint.<br><br>**Applicable commands:** `init` or `run`. For the `run` command, this flag must be used in conjunction with `--init`.<br>**Default:** `33333`
`--ramp` | The duration over which to ramp up load.<br><br>**Applicable command:** `run`
`--request-distribution` | Distribution for the random number generator (`zipfian`, `uniform`).<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `zipfian`
`--seed` | The random number generator seed.<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `1`
`--splits` | Number of [splits](split-at.html) to perform before starting normal operations.<br><br>**Applicable commands:** `init` or `run`
`--tolerate-errors` | Keep running on error.<br><br>**Applicable command:** `run`
`--workload` | The type of workload to run (`A`, `B`, `C`, `D`, or `F`). For details about these workloads, see [YCSB Workloads](https://github.com/brianfrankcooper/YCSB/wiki/Core-Workloads).<br><br>**Applicable commands:** `init` or `run`<br>**Default:** `B`
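
For example, to run YCSB's update-heavy workload `A` with a uniform key distribution for 5 minutes, you might combine several of these flags (a sketch against a local insecure node; adjust the connection string for your cluster):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--init \
--workload=A \
--request-distribution=uniform \
--duration=5m \
'postgresql://root@localhost:26257?sslmode=disable'
~~~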

### Logging

By default, the `cockroach workload` command logs errors to `stderr`.
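
To keep a record of any errors, one option is to redirect `stderr` to a file (a minimal sketch; the log filename is just an example):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--duration=10m \
'postgresql://root@localhost:26257?sslmode=disable' \
2>> ycsb-errors.log
~~~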
@@ -400,6 +425,49 @@ $ cockroach start \
600.0s 0 823902 1373.2 5.8 5.5 10.0 15.2 209.7
~~~

### Run the `ycsb` workload

1. Load the initial schema and data:

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload init ycsb \
'postgresql://root@localhost:26257?sslmode=disable'
~~~

2. Run the workload for 10 minutes:

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--duration=10m \
'postgresql://root@localhost:26257?sslmode=disable'
~~~

You'll see per-operation statistics print to standard output every second:

~~~
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
1s 0 9258.1 9666.6 0.7 1.3 2.0 8.9 read
1s 0 470.1 490.9 1.7 2.9 4.1 5.0 update
2s 0 10244.6 9955.6 0.7 1.2 2.0 6.6 read
2s 0 559.0 525.0 1.6 3.1 6.0 7.3 update
3s 0 9870.8 9927.4 0.7 1.4 2.4 10.0 read
3s 0 500.0 516.6 1.6 4.2 7.9 15.2 update
4s 0 9847.2 9907.3 0.7 1.4 2.4 23.1 read
4s 0 506.8 514.2 1.6 3.7 7.6 17.8 update
5s 0 10084.4 9942.6 0.7 1.3 2.1 7.1 read
5s 0 537.2 518.8 1.5 3.5 10.0 15.2 update
...
~~~

After the specified duration (10 minutes in this case), the workload will stop and you'll see totals printed to standard output:

~~~
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
600.0s 0 4728286 7880.2 1.0 0.9 2.2 5.2 268.4
~~~
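
You can also bound a run by operation count rather than duration, using the `--max-ops` flag documented above (a sketch; the operation count is just an example):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--max-ops=1000000 \
'postgresql://root@localhost:26257?sslmode=disable'
~~~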

## See also

- [`cockroach demo`](cockroach-demo.html)
58 changes: 42 additions & 16 deletions v2.2/demo-automatic-cloud-migration.md
@@ -6,19 +6,16 @@ toc: true

CockroachDB's flexible [replication controls](configure-replication-zones.html) make it trivially easy to run a single CockroachDB cluster across cloud platforms and to migrate data from one cloud to another without any service interruption. This page walks you through a local simulation of the process.


## Watch the demo

<iframe width="560" height="315" src="https://www.youtube.com/embed/cCJkgZy6s2Q" frameborder="0" allowfullscreen></iframe>

## Step 1. Install prerequisites

In this tutorial, you'll use CockroachDB, the HAProxy load balancer, and CockroachDB's version of the YCSB load generator, which requires Go. Before you begin, make sure these applications are installed:
In this tutorial, you'll use CockroachDB, its built-in `ycsb` workload, and the HAProxy load balancer. Before you begin, make sure these applications are installed:

- Install the latest version of [CockroachDB](install-cockroachdb.html).
- Install [HAProxy](http://www.haproxy.org/). If you're on a Mac and using Homebrew, use `brew install haproxy`.
- Install [Go](https://golang.org/doc/install) version 1.9 or higher. If you're on a Mac and using Homebrew, use `brew install go`. You can check your local version by running `go version`.
- Install the [CockroachDB version of YCSB](https://github.com/cockroachdb/loadgen/tree/master/ycsb): `go get github.com/cockroachdb/loadgen/ycsb`

Also, to keep track of the data files and logs for your cluster, you may want to create a new directory (e.g., `mkdir cloud-migration`) and start all your nodes in that directory.

@@ -125,18 +122,47 @@ Start HAProxy, with the `-f` flag pointing to the `haproxy.cfg` file:
$ haproxy -f haproxy.cfg
~~~

## Step 5. Start a load generator

Now that you have a load balancer running in front of your cluster, let's use the YCSB load generator that you installed earlier to simulate multiple client connections, each performing mixed read/write workloads.

In a new terminal, start `ycsb`, pointing it at HAProxy's port:

{% include copy-clipboard.html %}
~~~ shell
$ $HOME/go/bin/ycsb -duration 20m -tolerate-errors -concurrency 10 -max-rate 1000 'postgresql://root@localhost:26000?sslmode=disable'
~~~

This command initiates 10 concurrent client workloads for 20 minutes, but limits the total load to 1000 operations per second (since you're running everything on a single machine).
## Step 5. Run a sample workload

Now that you have a load balancer running in front of your cluster, let's use the YCSB workload built into CockroachDB to simulate multiple client connections, each performing mixed read/write workloads.

1. In a new terminal, load the initial `ycsb` schema and data, pointing it at HAProxy's port:

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload init ycsb \
'postgresql://root@localhost:26000?sslmode=disable'
~~~

2. Run the `ycsb` workload, pointing it at HAProxy's port:

{% include copy-clipboard.html %}
~~~ shell
$ cockroach workload run ycsb \
--duration=20m \
--concurrency=10 \
--max-rate=1000 \
'postgresql://root@localhost:26000?sslmode=disable'
~~~

This command initiates 10 concurrent client workloads for 20 minutes, but limits the total load to 1000 operations per second (since you're running everything on a single machine).

You'll soon see per-operation statistics print to standard output every second:

~~~
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
1s 0 9258.1 9666.6 0.7 1.3 2.0 8.9 read
1s 0 470.1 490.9 1.7 2.9 4.1 5.0 update
2s 0 10244.6 9955.6 0.7 1.2 2.0 6.6 read
2s 0 559.0 525.0 1.6 3.1 6.0 7.3 update
3s 0 9870.8 9927.4 0.7 1.4 2.4 10.0 read
3s 0 500.0 516.6 1.6 4.2 7.9 15.2 update
4s 0 9847.2 9907.3 0.7 1.4 2.4 23.1 read
4s 0 506.8 514.2 1.6 3.7 7.6 17.8 update
5s 0 10084.4 9942.6 0.7 1.3 2.1 7.1 read
5s 0 537.2 518.8 1.5 3.5 10.0 15.2 update
...
~~~

## Step 6. Watch data balance across all 3 nodes

49 changes: 16 additions & 33 deletions v2.2/training/fault-tolerance-and-automated-repair.md
@@ -20,7 +20,7 @@ Make sure you have already completed [Cluster Startup and Scaling](cluster-start

## Step 1. Set up load balancing

In this module, you'll run a load generator to simulate multiple client connections. Each node is an equally suitable SQL gateway for the load, but it's always recommended to spread requests evenly across nodes. You'll use the open-source [HAProxy](http://www.haproxy.org/) load balancer to do that here.
In this module, you'll run a sample workload to simulate multiple client connections. Each node is an equally suitable SQL gateway for the load, but it's always recommended to spread requests evenly across nodes. You'll use the open-source [HAProxy](http://www.haproxy.org/) load balancer to do that here.

1. In a new terminal, install HAProxy. If you're on a Mac and use Homebrew, run:

@@ -91,54 +91,37 @@ In this module, you'll run a load generator to simulate multiple client connecti
$ haproxy -f haproxy.cfg
~~~

## Step 2. Start a load generator
## Step 2. Run a sample workload

Now that you have a load balancer running in front of your cluster, download and start a load generator to simulate client traffic.
Now that you have a load balancer running in front of your cluster, use the YCSB workload built into CockroachDB to simulate multiple client connections, each performing mixed read/write workloads.

1. In a new terminal, download the archive for the CockroachDB version of YCSB, and extract the binary:
1. In a new terminal, load the initial `ycsb` schema and data, pointing it at HAProxy's port:

<div class="filters clearfix">
<button style="width: 15%" class="filter-button" data-scope="mac">Mac</button>
<button style="width: 15%" class="filter-button" data-scope="linux">Linux</button>
</div>
<p></p>

<div class="filter-content" markdown="1" data-scope="mac">
{% include copy-clipboard.html %}
~~~ shell
$ curl {{site.url}}/docs/v2.2/training/resources/crdb-ycsb-mac.tar.gz \
| tar -xJ
~~~
</div>

<div class="filter-content" markdown="1" data-scope="linux">
{% include copy-clipboard.html %}
~~~ shell
$ wget -qO- {{site.url}}/docs/v2.2/training/resources/crdb-ycsb-linux.tar.gz \
| tar xvz
$ cockroach workload init ycsb \
'postgresql://root@localhost:26000?sslmode=disable'
~~~
</div>

2. Start `ycsb`, pointing it at HAProxy's port:
2. Run the `ycsb` workload, pointing it at HAProxy's port:

{% include copy-clipboard.html %}
~~~ shell
$ ./ycsb \
-duration 20m \
-tolerate-errors \
-concurrency 3 \
-splits 50 \
-max-rate 100 \
'postgresql://root@localhost:26000?sslmode=disable'
$ cockroach workload run ycsb \
--duration=20m \
--concurrency=3 \
--max-rate=1000 \
--splits=50 \
'postgresql://root@localhost:26000?sslmode=disable'
~~~

This command initiates 3 concurrent client workloads for 20 minutes, but limits the benchmark to just 100 operations per second (since you're running everything on a single machine).
This command initiates 3 concurrent client workloads for 20 minutes, but limits the total load to 1000 operations per second (since you're running everything on a single machine).

Also, the `-splits` flag tells the load generator to manually split ranges a number of times. This is not something you'd normally do, but for the purpose of this training, it makes it easier to visualize the movement of data in the cluster.
Also, the `--splits` flag tells the workload to manually split ranges a number of times. This is not something you'd normally do, but for the purpose of this training, it makes it easier to visualize the movement of data in the cluster.
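
To see the ranges these splits create, one option is the `SHOW EXPERIMENTAL_RANGES` statement (a sketch, assuming the insecure cluster and HAProxy port from Step 1):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach sql --insecure --host=localhost --port=26000 \
--execute="SHOW EXPERIMENTAL_RANGES FROM TABLE ycsb.usertable;"
~~~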

## Step 3. Check the workload

Initially, the load generator creates a new database called `ycsb`, creates a `usertable` table in that database, and inserts a bunch of rows into the table. Soon, the load generator starts executing approximately 95% reads and 5% writes.
Initially, the workload creates a new database called `ycsb`, creates a `usertable` table in that database, and inserts a bunch of rows into the table. Soon, the workload starts executing approximately 95% reads and 5% writes.
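
You can verify the schema and initial data from a SQL shell (a sketch, again assuming the insecure cluster and HAProxy port from Step 1):

{% include copy-clipboard.html %}
~~~ shell
$ cockroach sql --insecure --host=localhost --port=26000 \
--execute="SHOW TABLES FROM ycsb;" \
--execute="SELECT count(*) FROM ycsb.usertable;"
~~~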

1. To check the SQL queries getting executed, go back to the Admin UI at <a href="http://localhost:8080" data-proofer-ignore>http://localhost:8080</a>, click **Metrics** on the left, and hover over the **SQL Queries** graph at the top:

Binary file removed v2.2/training/resources/crdb-ycsb-linux.tar.gz
Binary file removed v2.2/training/resources/crdb-ycsb-mac.tar.gz