# Create support files to run benchmarks in GKE. (#2756)
To build and run the benchmarks in a (mostly) reproducible way, I created configuration files to build a Docker image with the benchmarks installed, and to run that image in GKE. Eventually I would like these to build and run automatically using Google Cloud Build, but for now they are very handy when we need to run 20 copies of the benchmarks.
Showing 5 changed files with 414 additions and 0 deletions.
---

**New file:** Dockerfile for building the benchmark image (+100 lines):

# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM ubuntu:bionic AS google-cloud-cpp-dependencies

RUN apt update && \
    apt install -y build-essential cmake git gcc g++ \
        libc-ares-dev libc-ares2 libcurl4-openssl-dev libssl-dev make \
        pkg-config tar wget zlib1g-dev

# #### crc32c

WORKDIR /var/tmp/build
RUN wget -q https://github.com/google/crc32c/archive/1.0.6.tar.gz
RUN tar -xf 1.0.6.tar.gz
WORKDIR /var/tmp/build/crc32c-1.0.6
RUN cmake \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=yes \
    -DCRC32C_BUILD_TESTS=OFF \
    -DCRC32C_BUILD_BENCHMARKS=OFF \
    -DCRC32C_USE_GLOG=OFF \
    -H. -Bcmake-out/crc32c
RUN cmake --build cmake-out/crc32c --target install -- -j $(nproc)
RUN ldconfig

# #### Protobuf

WORKDIR /var/tmp/build
RUN wget -q https://github.com/google/protobuf/archive/v3.7.1.tar.gz
RUN tar -xf v3.7.1.tar.gz
WORKDIR /var/tmp/build/protobuf-3.7.1/cmake
RUN cmake \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=yes \
    -Dprotobuf_BUILD_TESTS=OFF \
    -H. -Bcmake-out
RUN cmake --build cmake-out --target install -- -j $(nproc)
RUN ldconfig

# #### gRPC

WORKDIR /var/tmp/build
RUN wget -q https://github.com/grpc/grpc/archive/v1.21.0.tar.gz
RUN tar -xf v1.21.0.tar.gz
WORKDIR /var/tmp/build/grpc-1.21.0
RUN make -j $(nproc)
RUN make install
RUN ldconfig

# Get the source code

FROM google-cloud-cpp-dependencies AS google-cloud-cpp-build

WORKDIR /w
COPY . /w

# #### google-cloud-cpp

ARG CMAKE_BUILD_TYPE=Release
ARG SHORT_SHA=""

RUN cmake -H. -Bcmake-out \
    -DGOOGLE_CLOUD_CPP_DEPENDENCY_PROVIDER=package -DBUILD_TESTING=OFF \
    -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} \
    -DGOOGLE_CLOUD_CPP_BUILD_METADATA=${SHORT_SHA}
RUN cmake --build cmake-out -- -j $(nproc)
WORKDIR /w/cmake-out
RUN cmake --build . --target install

# ================================================================

# Prepare the final image. This image is much smaller because we only install:
# - The final binaries, without any intermediate object files.
# - The run-time dependencies, without the build tool dependencies.
FROM ubuntu:bionic
RUN apt update && \
    apt install -y ca-certificates libc-ares2 libcurl4 \
        libstdc++6 libssl1.1 zlib1g
RUN /usr/sbin/update-ca-certificates

COPY --from=google-cloud-cpp-build /usr/local/lib /usr/local/lib
COPY --from=google-cloud-cpp-build /usr/local/bin /usr/local/bin
COPY --from=google-cloud-cpp-build /w/cmake-out/google/cloud/storage/benchmarks/*_benchmark /r/
COPY --from=google-cloud-cpp-build /w/cmake-out/google/cloud/storage/examples/*_samples /r/
COPY --from=google-cloud-cpp-build /w/ci/benchmarks/storage/*.sh /r/
RUN ldconfig

CMD ["/bin/false"]
---

**New file:** README for the GKE benchmark deployments (+30 lines):
# Setting up GKE Deployments for the Benchmarks

`google-cloud-cpp` has a number of benchmarks that cannot be executed as part
of the CI builds because:

- They take too long to run (multiple hours).
- Their results require an analysis pipeline to determine if there is a problem.
- They need production resources (and therefore production credentials) to run.

Instead, we execute these benchmarks in a Google Kubernetes Engine (GKE)
cluster. Currently these benchmarks are started manually. Eventually we would
like to set up a continuous deployment (CD) pipeline using Google Cloud Build
and GKE.

## Creating the Docker image

The benchmarks run from a single Docker image. To build this image we use
Google Cloud Build:

```console
$ gcloud builds submit \
    --substitutions=SHORT_SHA=$(git rev-parse --short HEAD) \
    --config cloudbuild.yaml .
```

## Setting up GKE and other configuration

Please read the README file for the [storage](storage/README.md) benchmarks for
more information on how to set up a GKE cluster and execute the benchmarks in
that cluster.
---

**New file:** README for the storage benchmark deployments (+180 lines):
# Setting up GKE Deployments for the GCS C++ Client Benchmarks

This document describes how to set up GKE deployments to continuously run the
Google Cloud Storage C++ client benchmarks. Please see the general
[README](../README.md) about benchmarks for the motivations behind this design.

## Creating the Docker image

The benchmarks run from a single Docker image. To build this image we use
Google Cloud Build:

```console
$ cd $HOME/google-cloud-cpp
$ gcloud builds submit \
    --substitutions=SHORT_SHA=$(git rev-parse --short HEAD) \
    --config cloudbuild.yaml .
```

## Create a GKE cluster for the Storage Benchmarks

Select the project and zone where you will run the benchmarks:

```console
$ PROJECT_ID=... # The name of your project,
                 # e.g., PROJECT_ID=$(gcloud config get-value project)
$ STORAGE_BENCHMARKS_ZONE=... # e.g. us-central1-a
$ STORAGE_BENCHMARKS_REGION="$(gcloud compute zones list \
    "--filter=(name=${STORAGE_BENCHMARKS_ZONE})" \
    "--format=csv[no-heading](region)")"
```
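The `gcloud` lookup above is authoritative, but as a quick local sanity check: a GCE region name is just the zone name with its trailing `-<letter>` suffix removed, which shell parameter expansion can do without a network round trip.

```shell
# Derive the region from a zone name by stripping the final "-<letter>"
# suffix (e.g. us-central1-a -> us-central1).
STORAGE_BENCHMARKS_ZONE=us-central1-a
echo "${STORAGE_BENCHMARKS_ZONE%-*}"
# Prints: us-central1
```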
Create the GKE cluster. All the performance benchmarks run in one cluster,
because they work well with a single CPU each.

```console
$ gcloud container clusters create --zone=${STORAGE_BENCHMARKS_ZONE} \
    --num-nodes=30 storage-benchmarks-cluster
```

Pick a name for the service account, for example:

```console
$ SA_NAME=storage-benchmark-sa
```

Then create the service account:

```console
$ gcloud iam service-accounts create ${SA_NAME} \
    --display-name="Service account to run GCS C++ Client Benchmarks"
```

Grant the service account `roles/storage.admin` permissions. The benchmarks
create and delete their own buckets, and this is the only role that grants
those permissions; incidentally, it also grants permission to read, write, and
delete objects:

```console
$ gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --member serviceAccount:${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com \
    --role roles/storage.admin
```

Create new keys for this service account and download them to a temporary
location:

```console
$ gcloud iam service-accounts keys create /dev/shm/key.json \
    --iam-account ${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
```

Copy the key to the GKE cluster:

```console
$ gcloud container clusters get-credentials \
    --zone "${STORAGE_BENCHMARKS_ZONE}" storage-benchmarks-cluster
$ kubectl create secret generic service-account-key \
    --from-file=key.json=/dev/shm/key.json
```

And then remove it from your machine:

```bash
$ rm /dev/shm/key.json
```

### Starting the Throughput vs. CPU Benchmark

The throughput vs. CPU benchmark measures the effective upload and download
throughput for different object sizes, as well as the CPU required to achieve
this throughput. In principle, we want the client libraries to achieve high
throughput with minimal CPU utilization.

The benchmark uploads an object of random size, measures the throughput and CPU
usage, then downloads the same object and measures the throughput and CPU
utilization. The results are reported to the standard output as a CSV file.
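Each CSV row can be post-processed directly. For example, effective throughput in MB/s is simply `object_size / elapsed_us`, because bytes per microsecond equals megabytes per second. A sketch, assuming the column order matches the BigQuery schema defined later in this document (the sample row itself is made up):

```shell
# Hypothetical row: op,object_size,chunk_size,buffer_size,elapsed_us,cpu_us,status,version
line='READ,1048576,131072,131072,20000,15000,OK,abc1234'
# bytes / microseconds == MB/s, so $2 / $5 is the effective throughput.
echo "${line}" | awk -F, '{ printf "%s %.1f MB/s\n", $1, $2 / $5 }'
# Prints: READ 52.4 MB/s
```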
Create a bucket to store the benchmark results:

```console
$ LOGGING_BUCKET=... # e.g. ${PROJECT_ID}-benchmark-logs
$ gsutil mb -l us gs://${LOGGING_BUCKET}
```

Start the deployment:

```console
$ gcloud container clusters get-credentials \
    --zone "${STORAGE_BENCHMARKS_ZONE}" storage-benchmarks-cluster
$ VERSION=$(git rev-parse --short HEAD) # The image version.
$ sed -e "s/@PROJECT_ID@/${PROJECT_ID}/" \
    -e "s/@LOGGING_BUCKET@/${LOGGING_BUCKET}/" \
    -e "s/@REGION@/${STORAGE_BENCHMARKS_REGION}/" \
    -e "s/@VERSION@/${VERSION}/" \
    ci/benchmarks/storage/throughput-vs-cpu-job.yaml | kubectl apply -f -
```
To upgrade the deployment to a new image:

```console
$ VERSION=... # New version
$ kubectl set image deployment/storage-throughput-vs-cpu \
    "benchmark-image=gcr.io/${PROJECT_ID}/google-cloud-cpp-benchmarks:${VERSION}"
$ kubectl rollout status -w deployment/storage-throughput-vs-cpu
```

To restart the deployment:

```console
$ sed -e "s/@PROJECT_ID@/${PROJECT_ID}/" \
    -e "s/@LOGGING_BUCKET@/${LOGGING_BUCKET}/" \
    -e "s/@REGION@/${STORAGE_BENCHMARKS_REGION}/" \
    -e "s/@VERSION@/${VERSION}/" \
    ci/benchmarks/storage/throughput-vs-cpu-job.yaml | \
    kubectl replace --force -f -
```

## Analyze Throughput-vs-CPU results

If you haven't already, create a BigQuery dataset to hold the data:

```console
$ bq mk --dataset \
    --description "Holds data and results from GCS C++ Client Library Benchmarks" \
    "${PROJECT_ID}:storage_benchmarks"
```

Create a table in this dataset to hold the benchmark results:

```console
$ TABLE_COLUMNS=(
    "op:STRING"
    "object_size:INT64"
    "chunk_size:INT64"
    "buffer_size:INT64"
    "elapsed_us:INT64"
    "cpu_us:INT64"
    "status:STRING"
    "version:STRING"
  )
$ printf -v schema ",%s" "${TABLE_COLUMNS[@]}"
$ schema=${schema:1}
$ bq mk --table --description "Raw Data from throughput-vs-cpu benchmark" \
    "${PROJECT_ID}:storage_benchmarks.throughput_vs_cpu_data" \
    "${schema}"
```
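The `printf -v` idiom above joins the column definitions into the single comma-separated schema string that `bq mk --table` expects. With a shortened column list, the mechanics look like this:

```shell
# printf prepends a comma to each array element, writing the result into
# the "schema" variable; ${schema:1} then drops the leading comma.
TABLE_COLUMNS=("op:STRING" "object_size:INT64")
printf -v schema ",%s" "${TABLE_COLUMNS[@]}"
schema=${schema:1}
echo "${schema}"
# Prints: op:STRING,object_size:INT64
```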
Count the log objects, and compute an error budget from that count:

```console
$ objects=$(gsutil ls gs://${PROJECT_ID}-benchmark-logs/throughput-vs-cpu/ | wc -l)
$ max_errors=$(( ${objects} * 2 ))
```

Upload them to BigQuery:

```console
$ gsutil ls gs://${PROJECT_ID}-benchmark-logs/throughput-vs-cpu/ | \
    xargs -I{} bq load --noreplace --skip_leading_rows=16 --max_bad_records=2 \
    "${PROJECT_ID}:storage_benchmarks.throughput_vs_cpu_data" {}
```
---

**New file:** wrapper script that runs the throughput benchmark and uploads its log (+48 lines):
#!/usr/bin/env bash
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

set -eu

echo "# ================================================================"

if [[ $# -lt 1 ]]; then
  echo "Usage: $(basename "$0") <logging-bucket> [benchmark args...]"
  exit 1
fi

readonly LOGGING_BUCKET="$1"
shift 1

# Disable -e because we always want to upload the log; even partial results
# are useful.
set +e
/r/storage_throughput_vs_cpu_benchmark "$@" 2>&1 | tee tp-vs-cpu.txt
# Capture the benchmark's exit status, not tee's ($? would report tee's).
exit_status=${PIPESTATUS[0]}

# We need to upload the results to a unique object in GCS. The program creates
# a unique bucket for each run, so we can use that bucket name as part of a
# unique object name.
readonly B="$(sed -n 's/# Running test on bucket: \(.*\)/\1/p' tp-vs-cpu.txt)"
# While the bucket name (which is randomly generated and fairly long) is almost
# guaranteed to be unique, we also want to easily search for "all the uploads
# around this time". Using a timestamp as a prefix makes that relatively easy.
readonly D="$(date -u +%Y-%m-%dT%H:%M:%S)"
readonly OBJECT_NAME="throughput-vs-cpu/${D}-${B}.txt"

/r/storage_object_samples upload-file tp-vs-cpu.txt \
    "${LOGGING_BUCKET}" "${OBJECT_NAME}" >/dev/null

exit ${exit_status}
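The `sed -n 's/.../\1/p'` expression in the script recovers the bucket name from the benchmark's log header. Its behavior on a made-up log line:

```shell
# Extract the bucket name from the "# Running test on bucket: ..." line;
# -n plus the p flag prints only lines where the substitution matched.
echo '# Running test on bucket: abcd-1234-test-bucket' |
  sed -n 's/# Running test on bucket: \(.*\)/\1/p'
# Prints: abcd-1234-test-bucket
```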