# Cloud Resources

This document attempts to catalog the cloud resources created by or required by
scripts in this repository. At the time of writing (16 August 2017), this
document is far from comprehensive; please do not take it as such.

## Test Fixtures

Our acceptance tests push quite a bit of data around. For a rough sense of
scale, the biggest allocator tests create clusters with hundreds of gigabytes of
data, and the biggest backup/restore tests push around several terabytes of
data. This is far too much data to store in a VM image and far too much data to
generate on demand, so we stash "test fixtures" in cloud blob storage.

At the moment, all our test fixtures live in [Azure Blob
Storage][azure-blob-storage], Azure's equivalent of Amazon S3. The object
hierarchy looks like this:

* **roachfixtures{region}/**
  * **backups/** — the output of `BACKUP ... TO 'azure://roachfixtures/backups/FOO'`,
    used to test `RESTORE` without manually running a backup.
    * **2tb/**
  * **store-dumps/** — gzipped tarballs of raw stores (i.e., `cockroach-data`
    directories), used to test allocator rebalancing and backups without
    manually inserting gigabytes of data.
    * **1node-17gb-841ranges/** - source: `RESTORE` of `tpch10`
    * **1node-113gb-9595ranges/** - source: `IMPORT` of `tpch100`
    * **3nodes-17gb-841ranges/** - source: `RESTORE` of `tpch10`
    * **6nodes-67gb-9588ranges/** - source: `RESTORE` of `tpch100`
    * **10nodes-2tb-50000ranges/**
  * **csvs/** — huge CSVs used to test distributed CSV import (`IMPORT ...`).
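
To browse the hierarchy without downloading anything, you can list blobs with
the Azure CLI. A minimal sketch, assuming `store-dumps` is a container in the
`roachfixtureseastus` account (the key-listing command is explained in the
syncing section below):

```shell
# Fetch an access key, then list the store dumps in the eastus account.
key=$(az storage account keys list -g fixtures -n roachfixtureseastus -o tsv --query '[0].value')
az storage blob list --account-name roachfixtureseastus --account-key "$key" \
  --container-name store-dumps -o table
```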

*PLEA(benesch):* Please keep the above list up to date if you add additional
objects to the storage account. It's very difficult to track down an object's
origin story.

Note that egress bandwidth is *expensive*. Every gigabyte of outbound traffic
costs 8¢, so one 2TB restore costs approximately $160. Data transfer within a
region, like from a storage account in the `eastus` region to a VM in the
`eastus` region, is not considered outbound traffic, however, and so is free.

Ideally, we'd limit ourselves to one region and frolic in free bandwidth
forever. In practice, of course, things are never simple. The `eastus` region
doesn't support newer VMs, and we want to test backup/restore on both old and
new VMs. So we duplicate the `roachfixtures` storage accounts in each region
where we spin up acceptance tests. At the moment, we have the following storage
accounts:

* **`roachfixtureseastus`** — in `eastus`, which is missing the newer VMs
* **`roachfixtureswestus`** — in `westus`, which has the newer VMs
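
If we bring up another region, the matching fixtures account can presumably be
created along these lines (a sketch: the `fixtures` resource group matches the
key-listing command below, but the SKU is an assumption, not something this
document specifies):

```shell
# Hypothetical: create a fixtures storage account for a new region.
az storage account create -g fixtures -n roachfixtures{region} \
  -l {region} --sku Standard_LRS
```
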
### Syncing `roachfixtures{region}` Buckets

By far the fastest way to interact with Azure Blob Storage is the
[`azcopy`][azcopy] command. It's a bit of a pain to install, but the Azure CLIs
(`az` and `azure`) don't attempt to parallelize operations and won't come close
to saturating your network bandwidth.

Here's a sample invocation to sync data from `eastus` to `westus`:

```shell
# $source_key and $dest_key must hold the access keys for the source and
# destination storage accounts; see below for how to fetch them.
for container in $(az storage container list --account-name roachfixtureseastus -o tsv --query '[*].name')
do
  azcopy --recursive --exclude-older --sync-copy \
    --source "https://roachfixtureseastus.blob.core.windows.net/$container" \
    --destination "https://roachfixtureswestus.blob.core.windows.net/$container" \
    --source-key "$source_key" --dest-key "$dest_key"
done
```

Since egress is expensive and ingress is free, be sure to run this on an
azworker located in the source region—`eastus` in this case.

You can fetch the source and destination access keys from the Azure Portal or
with the following Azure CLI 2.0 command:

```shell
az storage account keys list -g fixtures -n roachfixtures{region} -o tsv --query '[0].value'
```
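
For example, to populate the variables that the sync invocation above expects:

```shell
source_key=$(az storage account keys list -g fixtures -n roachfixtureseastus -o tsv --query '[0].value')
dest_key=$(az storage account keys list -g fixtures -n roachfixtureswestus -o tsv --query '[0].value')
```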

TODO(benesch): install `azcopy` on azworkers.

TODO(benesch): set up a TeamCity build to sync fixtures buckets automatically.

## Ephemeral Storage

Backup acceptance tests need a cloud storage account to use as the backup
destination. These backups don't need to last beyond the end of each acceptance
test, and so the files are periodically cleaned up to avoid paying for
unnecessary storage.
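
As an illustration, a test might aim a backup at one of the ephemeral accounts
described below. This is a hypothetical invocation, not lifted from any test:
the database name, container, and path are made up, and `$key` must hold the
account's access key:

```shell
# Hypothetical: back up a test database to the eastus ephemeral account.
cockroach sql -e "BACKUP DATABASE test TO 'azure://backups/run-1?AZURE_ACCOUNT_NAME=roachephemeraleastus&AZURE_ACCOUNT_KEY=$key'"
```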

TODO(benesch): automate cleaning out ephemeral data. Right now, the process is
entirely manual.
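
Until that automation exists, a manual sweep along these lines may serve (a
sketch, not a vetted procedure: it assumes the ephemeral accounts listed below
live in the `fixtures` resource group, and it deletes *everything* in them):

```shell
# Wipe all blobs in the ephemeral storage accounts.
# Adjust -g if the accounts live in a different resource group.
for account in roachephemeraleastus roachephemeralwestus; do
  key=$(az storage account keys list -g fixtures -n "$account" -o tsv --query '[0].value')
  for container in $(az storage container list --account-name "$account" --account-key "$key" -o tsv --query '[*].name'); do
    az storage blob delete-batch --source "$container" \
      --account-name "$account" --account-key "$key"
  done
done
```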

To avoid accidentally deleting fixture data—which can take several *days* to
regenerate—the ephemeral data is stored in storage accounts separate from the
fixture data. Currently, we maintain the following storage accounts:

* **`roachephemeraleastus`**
* **`roachephemeralwestus`**

Some acceptance tests follow a backup to ephemeral storage with a restore from
that same ephemeral storage, generating both ingress and egress traffic. To
avoid paying for the egress bandwidth, we colocate these storage accounts with
the VMs running the acceptance tests, as we do for test fixtures.

[azcopy]: https://docs.microsoft.com/en-us/azure/storage/storage-use-azcopy-linux
[azure-blob-storage]: https://docs.microsoft.com/en-us/azure/storage/storage-introduction#blob-storage