Merge pull request #3 from cockroachlabs/semi-final

Fleshed out tool
cockroachlabs · Sep 18, 2019 · 695d115
2 parents 045d59a + c90f9e4
commit 695d115
Showing 32 changed files with 1,330 additions and 78 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,7 @@
+cloud-report-2019
+scripts.zip
+logs/azure/*
+logs/aws/*
+logs/gcp/*
+credentials.json
+token.json
diff --git a/README.md b/README.md
@@ -8,15 +8,35 @@ This repo will let us aggregate and share data among team members, including pro
 
 [Project plan doc](https://docs.google.com/document/d/195l-Opbq_Pd3hHqRM5ynUa4FOtyGJqCriUYfqji38VI/edit)
 
+## Results
+
+You can view the raw results of these tests in the `results` directory of this repo, or in [a spreadsheet where results are automatically tracked](https://docs.google.com/spreadsheets/d/175Q3g3Ti40rEmaCMwm0CBXRnHIo07WSBsxOVhmqNyEY).
+
+## Enclosed Binary
+
+The Go program in this repo can automatically run tests on the cloud providers built into `roachprod`, as well as on Microsoft Azure.
+
+For more details, see `reproduction-steps.md` in this repo.
+
+_Note_: Extending this binary to run on other platforms would be relatively easy, but it requires some work to handle cloud-specific tasks, namely getting machine metadata.
+
+## Future Work
+
+When Azure is baked into `roachprod`, one might be able to simply add `azure` to `cloudDetails/default.json` and things will work. At that point you could remove all of the Azure machinery. However, I think that leaving in the `shellRunner` might be nice because it would let us extend this kind of framework into something more like an arbitrary script runner.
+
 ## Staff
 
 **Andy Woods** for Product
 - Vision, structure, messaging, writing
+
 **Jessica Edwards** for Marketing
 - Report production and promotion, messaging
+
 **Jim Walker** for Product Marketing
 - Messaging, direction, and competitive landscape
+
 **Nathan VanBenschoten** for Engineering
 - Technical oversight and insight
+
 **Sean Loiselle** for Product Marketing
-- Collecting data, writing
+- Data collection and aggregation
diff --git a/cloudDetails/adhoc.json b/cloudDetails/adhoc.json
@@ -0,0 +1,8 @@
+[
+  {
+    "name": "gcp",
+    "machineTypes": [
+      "n2-standard-16"
+    ]
+  }
+]
diff --git a/cloudDetails/aws-ebs.json b/cloudDetails/aws-ebs.json
@@ -0,0 +1,12 @@
+[
+  {
+    "name": "aws",
+    "ebsMachineTypes": [
+      "c5.4xlarge",
+      "c5n.4xlarge",
+      "m5.4xlarge",
+      "m5a.4xlarge",
+      "r5a.4xlarge"
+    ]
+  }
+]
diff --git a/cloudDetails/aws.json b/cloudDetails/aws.json
@@ -0,0 +1,20 @@
+[
+  {
+    "name": "aws",
+    "machineTypes": [
+      "m5d.4xlarge",
+      "c5d.4xlarge",
+      "m5ad.4xlarge",
+      "i3.4xlarge",
+      "r5ad.4xlarge",
+      "r5d.4xlarge"
+    ],
+    "ebsMachineTypes": [
+      "c5.4xlarge",
+      "c5n.4xlarge",
+      "m5.4xlarge",
+      "m5a.4xlarge",
+      "r5a.4xlarge"
+    ]
+  }
+]
diff --git a/cloudDetails/default.json b/cloudDetails/default.json
@@ -0,0 +1,32 @@
+[
+  {
+    "name": "aws",
+    "machineTypes": [
+      "c5d.4xlarge",
+      "m5d.4xlarge",
+      "m5ad.4xlarge",
+      "i3.4xlarge",
+      "r5ad.4xlarge",
+      "r5d.4xlarge"
+    ],
+    "ebsMachineTypes": [
+      "c5.4xlarge",
+      "c5n.4xlarge",
+      "m5.4xlarge",
+      "m5a.4xlarge",
+      "r5a.4xlarge"
+    ]
+  },
+  {
+    "name": "gcp",
+    "machineTypes": [
+      "n2-standard-16",
+      "n2-highmem-16",
+      "n2-highcpu-16",
+      "n1-standard-16",
+      "n1-highmem-16",
+      "n1-highcpu-16"
+    ]
+  }
+]
+
diff --git a/cloudDetails/gcp.json b/cloudDetails/gcp.json
@@ -0,0 +1,14 @@
+[
+    {
+        "name": "gcp",
+        "machineTypes": [
+        "n2-standard-16",
+        "n2-highmem-16",
+        "n2-highcpu-16",
+        "n1-standard-16",
+        "n1-highmem-16",
+        "n1-highcpu-16"
+        ]
+    }
+]
+
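
The `cloudDetails/*.json` files above share one shape: a top-level array of objects with a `name`, a `machineTypes` list, and (for AWS) an optional `ebsMachineTypes` list. A minimal sketch of decoding that shape in Go follows. The `CloudDetails` type and the `parseCloudDetails` helper are hypothetical names chosen for illustration and are not taken from this commit; only the JSON keys come from the files above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CloudDetails mirrors one entry of a cloudDetails/*.json file.
// The type and field names are hypothetical; only the JSON keys
// ("name", "machineTypes", "ebsMachineTypes") come from the files.
type CloudDetails struct {
	Name            string   `json:"name"`
	MachineTypes    []string `json:"machineTypes"`
	EBSMachineTypes []string `json:"ebsMachineTypes,omitempty"`
}

// parseCloudDetails decodes the top-level JSON array into a slice.
func parseCloudDetails(data []byte) ([]CloudDetails, error) {
	var clouds []CloudDetails
	if err := json.Unmarshal(data, &clouds); err != nil {
		return nil, fmt.Errorf("parsing cloud details: %w", err)
	}
	return clouds, nil
}

func main() {
	// Same content as cloudDetails/adhoc.json.
	sample := []byte(`[{"name": "gcp", "machineTypes": ["n2-standard-16"]}]`)

	clouds, err := parseCloudDetails(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range clouds {
		fmt.Printf("%s: %d machine type(s), %d EBS machine type(s)\n",
			c.Name, len(c.MachineTypes), len(c.EBSMachineTypes))
	}
}
```

Entries that omit `ebsMachineTypes`, such as the `gcp` ones, decode with a nil slice for that field, so a caller can range over it without a separate presence check.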
diff --git a/deployment-steps.md b/deployment-steps.md
@@ -16,12 +16,6 @@ roachprod create $CLUSTER -n 4 --gce-machine-type=$MACHINETYPE
 roachprod create $CLUSTER -n 4 --clouds=aws --aws-machine-type-ssd=$MACHINETYPE
 ```
 
-## EBS gp2
-
-```
-roachprod create $CLUSTER -n 4 --clouds=aws --aws-machine-type-ssd=$MACHINETYPE --local-ssd=false
-```
-
 ## EBS io1
 
 ```
@@ -36,9 +30,7 @@ All Azure environments require a manually configured [network with open CRDB por
 
 ## 1. Create Machines
 
-### Temporary Disks
-
-#### Creating VM
+### Creating VM
 
 **Note**: All machines should be in the same region for these tests.
 
@@ -57,7 +49,7 @@ All Azure environments require a manually configured [network with open CRDB por
 
 2. Create a 4th VM in the same region as the others, which you'll use to run your workloads.
 
-#### Mounting Temporary Disk
+### Mounting Temporary Disk
 
 SSH to each machine that will become a CockroachDB node and mount the temporary disk using the following commands:
 
@@ -68,82 +60,29 @@ mkdir cockroach-data
 sudo mount /dev/sdb1 cockroach-data
 ```
 
-### Attached Disks
-
-#### Creating VM
-
-**Note**: All machines should be in the same region for these tests.
-
-1. Create 3 VMs with following options:
-
-	Tab | Option | Value
-	----|--------|-------
-	Basics | Region | Choose the region of your resource group.
-	Basics | Resource group | Choose your resource group.
-	Basics | Image | Ubuntu Server 18.04 LTS
-	Basics | Size | _Variable_
-	Disks | Data disks | **Create and attach a new disk** > **1024GiB**
-	Networking | Virtual network | Choose the virtual network you configured to open CRDB ports.
-	Networking | Select Inbound Ports | SSH
-
-2. Create a 4th VM in the same region as the others, which you'll use to run your workloads.
-
-#### Mounting the Attached Disk
-
-1. Format the remote disk:
-
-	```
-	sudo gdisk /dev/sdc
-	n
-	p
-	w
-	sudo mkfs -t ext4 /dev/sdc1
-	```
-
-2. Mount the attached disk:
-
-	```
-	sudo mkdir ~/cockroach-data
-	sudo chmod a+rwx ~/cockroach-data
-	sudo mount /dev/sdc1 ~/cockroach-data
-	```
-3. Mount the disk on reboot.
-
-	Get the device's UUID from `blkid`
-	```
-	sudo -i blkid
-	```
-
-	Add the device to `/etc/fstab`:
-	```
-	sudo vim /etc/fstab
-	```
-
-	Append a line with the following format:
-	```
-	UUID=<device UUID>   /datadrive   ext4   defaults,nofail   1   2
-	```
-
-	All of this can be combined with the following one-liner:
-
-	```
-	sudo -i blkid | grep -Po '\/dev\/sdc1: UUID="\K.{36}' | while read uuid; do sudo echo -e "UUID=${uuid}\t~/cockroach-data\text4\tdefaults,nofail\t1\t2" | sudo tee -a /etc/fstab; done
-	```
-
-Note that it is also possible to re-use the same attached disk across VMs by simply detatching it from the first machine and attaching it to subsequent machines. This might be useful with tests that are run serially instead of in parallel.
-
 ## 2. Deploying CockroachDB
 
+Get the CockroachDB binary:
 ```
 wget -qO- https://binaries.cockroachdb.com/cockroach-v19.1.3.linux-amd64.tgz | tar  xvz
 sudo cp -i cockroach-v19.1.3.linux-amd64/cockroach /usr/local/bin
 ```
+
+Remount `/mnt` with write barriers disabled (`barrier=0`, equivalent to `nobarrier`) and create the store directory:
 ```
-sudo cockroach start --insecure --advertise-addr=<node1 address> --join=<node1 address>,<node2 address>,<node3 address> --background
+DEV=$(mount | grep /mnt | awk '{print $1}');
+sudo umount /mnt;
+sudo mount -o discard,defaults,barrier=0 ${DEV} /mnt
+mount | grep /mnt
+sudo mkdir /mnt/data1
 ```
 
-Initialize the cluster:
+Start the node:
 ```
-cockroach init --insecure --host=<address of any node>
+sudo cockroach start --insecure --advertise-addr=<node1 address> --join=<node1 address>,<node2 address>,<node3 address> --store=/mnt/data1 --background
 ```
 
+After starting the node on all machines, initialize the cluster:
+```
+cockroach init --insecure --host=<address of any node>
+```