Create support files to run benchmarks in GKE. (#2756)
To build and run benchmarks in a (mostly) reproducible way I created
configuration files to build a Docker image with the benchmarks
installed, and to run that image in GKE. Eventually I would like
these to build and run automatically using Google Cloud Build, but for
now they are very handy when we need to run 20 copies of the benchmarks.
coryan committed Jun 12, 2019
1 parent e8c138c commit 0347fb5
Showing 5 changed files with 414 additions and 0 deletions.
100 changes: 100 additions & 0 deletions ci/benchmarks/Dockerfile
@@ -0,0 +1,100 @@
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM ubuntu:bionic AS google-cloud-cpp-dependencies

RUN apt update && \
    apt install -y build-essential cmake git gcc g++ \
    libc-ares-dev libc-ares2 libcurl4-openssl-dev libssl-dev make \
    pkg-config tar wget zlib1g-dev

# #### crc32c

WORKDIR /var/tmp/build
RUN wget -q https://github.com/google/crc32c/archive/1.0.6.tar.gz
RUN tar -xf 1.0.6.tar.gz
WORKDIR /var/tmp/build/crc32c-1.0.6
RUN cmake \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=yes \
    -DCRC32C_BUILD_TESTS=OFF \
    -DCRC32C_BUILD_BENCHMARKS=OFF \
    -DCRC32C_USE_GLOG=OFF \
    -H. -Bcmake-out/crc32c
RUN cmake --build cmake-out/crc32c --target install -- -j $(nproc)
RUN ldconfig

# #### Protobuf

WORKDIR /var/tmp/build
RUN wget -q https://github.com/google/protobuf/archive/v3.7.1.tar.gz
RUN tar -xf v3.7.1.tar.gz
WORKDIR /var/tmp/build/protobuf-3.7.1/cmake
RUN cmake \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=yes \
    -Dprotobuf_BUILD_TESTS=OFF \
    -H. -Bcmake-out
RUN cmake --build cmake-out --target install -- -j $(nproc)
RUN ldconfig

# #### gRPC

WORKDIR /var/tmp/build
RUN wget -q https://github.com/grpc/grpc/archive/v1.21.0.tar.gz
RUN tar -xf v1.21.0.tar.gz
WORKDIR /var/tmp/build/grpc-1.21.0
RUN make -j $(nproc)
RUN make install
RUN ldconfig

# Get the source code

FROM google-cloud-cpp-dependencies AS google-cloud-cpp-build

WORKDIR /w
COPY . /w

# #### google-cloud-cpp

ARG CMAKE_BUILD_TYPE=Release
ARG SHORT_SHA=""

RUN cmake -H. -Bcmake-out \
    -DGOOGLE_CLOUD_CPP_DEPENDENCY_PROVIDER=package -DBUILD_TESTING=OFF \
    -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} \
    -DGOOGLE_CLOUD_CPP_BUILD_METADATA=${SHORT_SHA}
RUN cmake --build cmake-out -- -j $(nproc)
WORKDIR /w/cmake-out
RUN cmake --build . --target install

# ================================================================

# Prepare the final image. This image is much smaller because we only install:
# - The final binaries, without any intermediate object files.
# - The run-time dependencies, without the build-tool dependencies.
FROM ubuntu:bionic
RUN apt update && \
    apt install -y ca-certificates libc-ares2 libcurl4 \
    libstdc++6 libssl1.1 zlib1g
RUN /usr/sbin/update-ca-certificates

COPY --from=google-cloud-cpp-build /usr/local/lib /usr/local/lib
COPY --from=google-cloud-cpp-build /usr/local/bin /usr/local/bin
COPY --from=google-cloud-cpp-build /w/cmake-out/google/cloud/storage/benchmarks/*_benchmark /r/
COPY --from=google-cloud-cpp-build /w/cmake-out/google/cloud/storage/examples/*_samples /r/
COPY --from=google-cloud-cpp-build /w/ci/benchmarks/storage/*.sh /r/
RUN ldconfig

CMD ["/bin/false"]
30 changes: 30 additions & 0 deletions ci/benchmarks/README.md
@@ -0,0 +1,30 @@
# Setting up GKE Deployments for the Benchmarks.

`google-cloud-cpp` has a number of benchmarks that cannot be executed as part
of the CI builds because:

- They take too long to run (multiple hours).
- Their results require an analysis pipeline to determine if there is a problem.
- They need production resources (and therefore production credentials) to run.

Instead, we execute these benchmarks in a Google Kubernetes Engine (GKE)
cluster. Currently these benchmarks are started manually. Eventually we would
like to set up a continuous deployment (CD) pipeline using Google Cloud Build
and GKE.

## Creating the Docker image

The benchmarks run from a single Docker image. To build this image we use
Google Cloud Build:

```console
$ gcloud builds submit \
--substitutions=SHORT_SHA=$(git rev-parse --short HEAD) \
--config cloudbuild.yaml .
```
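
Once the build completes you can (optionally) confirm the image landed in the registry; `google-cloud-cpp-benchmarks` is the image name used by the deployment commands in the storage README:

```console
$ gcloud container images list-tags \
    "gcr.io/${PROJECT_ID}/google-cloud-cpp-benchmarks"
```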

## Setting up GKE and other configuration

Please read the README file for the [storage](storage/README.md) benchmarks for
more information on how to set up a GKE cluster and execute the benchmarks in
that cluster.
180 changes: 180 additions & 0 deletions ci/benchmarks/storage/README.md
@@ -0,0 +1,180 @@
# Setting up GKE Deployments for the GCS C++ Client Benchmarks.

This document describes how to set up GKE deployments to continuously run the
Google Cloud Storage C++ client benchmarks. Please see the general
[README](../README.md) about benchmarks for the motivations behind this design.

## Creating the Docker image

The benchmarks run from a single Docker image. To build this image we use
Google Cloud Build:

```console
$ cd $HOME/google-cloud-cpp
$ gcloud builds submit \
--substitutions=SHORT_SHA=$(git rev-parse --short HEAD) \
--config cloudbuild.yaml .
```

## Create a GKE cluster for the Storage Benchmarks

Select the project and zone where you will run the benchmarks.

```console
$ PROJECT_ID=... # The name of your project
# e.g., PROJECT_ID=$(gcloud config get-value project)
$ STORAGE_BENCHMARKS_ZONE=... # e.g. us-central1-a
$ STORAGE_BENCHMARKS_REGION="$(gcloud compute zones list \
"--filter=(name=${STORAGE_BENCHMARKS_ZONE})" \
"--format=csv[no-heading](region)")"
```

Create the GKE cluster. All the performance benchmarks run in one cluster
because they work well with a single CPU each.

```console
$ gcloud container clusters create --zone=${STORAGE_BENCHMARKS_ZONE} \
--num-nodes=30 storage-benchmarks-cluster
```
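
A quick way to verify the cluster is up before proceeding (an optional check, not required by the following steps):

```console
$ gcloud container clusters list \
    --filter="name=storage-benchmarks-cluster" \
    --format="csv[no-heading](name,status,currentNodeCount)"
```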

Pick a name for the service account, for example:

```console
$ SA_NAME=storage-benchmark-sa
```

Then create the service account:

```console
$ gcloud iam service-accounts create ${SA_NAME} \
--display-name="Service account to run GCS C++ Client Benchmarks"
```

Grant the service account the `roles/storage.admin` role. The benchmarks create
and delete their own buckets; this is the only predefined role that grants
these permissions, and it incidentally also grants permission to read, write,
and delete objects:

```console
$ gcloud projects add-iam-policy-binding ${PROJECT_ID} \
--member serviceAccount:${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com \
--role roles/storage.admin
```
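
To double-check that the binding took effect, you can inspect the project's IAM policy; this read-only check is optional:

```console
$ gcloud projects get-iam-policy ${PROJECT_ID} \
    --flatten="bindings[].members" \
    --filter="bindings.members:${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com" \
    --format="table(bindings.role)"
```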

Create a new key for this service account and download it to a temporary
location:

```console
$ gcloud iam service-accounts keys create /dev/shm/key.json \
--iam-account ${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
```

Copy the key to the GKE cluster:

```console
$ gcloud container clusters get-credentials \
--zone "${STORAGE_BENCHMARKS_ZONE}" storage-benchmarks-cluster
$ kubectl create secret generic service-account-key \
--from-file=key.json=/dev/shm/key.json
```
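
You can confirm the secret exists (without revealing the key material) with:

```console
$ kubectl get secret service-account-key
$ kubectl describe secret service-account-key
```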

And then remove it from your machine:

```bash
$ rm /dev/shm/key.json
```

### Starting the Throughput vs. CPU Benchmark

The throughput vs. CPU benchmark measures the effective upload and download
throughput for different object sizes, as well as the CPU required to achieve
this throughput. In principle, we want the client libraries to achieve high
throughput with minimal CPU utilization.

The benchmark uploads an object of random size, measures the throughput and CPU
usage, then downloads the same object and measures the throughput and CPU
utilization. The results are reported to the standard output as a CSV file.

Create a bucket to store the benchmark results:

```console
$ LOGGING_BUCKET=... # e.g. ${PROJECT_ID}-benchmark-logs
$ gsutil mb -l us gs://${LOGGING_BUCKET}
```
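
An optional check that the bucket exists and shows its location:

```console
$ gsutil ls -L -b "gs://${LOGGING_BUCKET}"
```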

Start the deployment:

```console
$ gcloud container clusters get-credentials \
--zone "${STORAGE_BENCHMARKS_ZONE}" storage-benchmarks-cluster
$ VERSION=$(git rev-parse --short HEAD) # The image version.
$ sed -e "s/@PROJECT_ID@/${PROJECT_ID}/" \
-e "s/@LOGGING_BUCKET@/${LOGGING_BUCKET}/" \
-e "s/@REGION@/${STORAGE_BENCHMARKS_REGION}/" \
-e "s/@VERSION@/${VERSION}/" \
ci/benchmarks/storage/throughput-vs-cpu-job.yaml | kubectl apply -f -
```
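
To watch the benchmark pods come up and to sample their output (standard `kubectl`; the pod names are generated, so list them first):

```console
$ kubectl get pods
$ kubectl logs $(kubectl get pods -o name | head -1)
```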

To upgrade the deployment to a new image:

```console
$ VERSION=... # The new version
$ kubectl set image deployment/storage-throughput-vs-cpu \
"benchmark-image=gcr.io/${PROJECT_ID}/google-cloud-cpp-benchmarks:${VERSION}"
$ kubectl rollout status -w deployment/storage-throughput-vs-cpu
```

To restart the deployment:

```console
$ sed -e "s/@PROJECT_ID@/${PROJECT_ID}/" \
-e "s/@LOGGING_BUCKET@/${LOGGING_BUCKET}/" \
-e "s/@REGION@/${STORAGE_BENCHMARKS_REGION}/" \
-e "s/@VERSION@/${VERSION}/" \
ci/benchmarks/storage/throughput-vs-cpu-job.yaml | \
kubectl replace --force -f -
```

## Analyze Throughput-vs-CPU results.

If you haven't already, create a BigQuery dataset to hold the data:

```console
$ bq mk --dataset \
--description "Holds data and results from GCS C++ Client Library Benchmarks" \
"${PROJECT_ID}:storage_benchmarks"
```
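
You can list the datasets in the project to confirm it was created:

```console
$ bq ls "${PROJECT_ID}:"
```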

Create a table in this dataset to hold the benchmark results:

```console
$ TABLE_COLUMNS=(
"op:STRING"
"object_size:INT64"
"chunk_size:INT64"
"buffer_size:INT64"
"elapsed_us:INT64"
"cpu_us:INT64"
"status:STRING"
"version:STRING"
)
$ printf -v schema ",%s" "${TABLE_COLUMNS[@]}"
$ schema=${schema:1}
$ bq mk --table --description "Raw Data from throughput-vs-cpu benchmark" \
"${PROJECT_ID}:storage_benchmarks.throughput_vs_cpu_data" \
"${schema}"
```
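
To confirm the table's schema matches the columns above:

```console
$ bq show --schema --format=prettyjson \
    "${PROJECT_ID}:storage_benchmarks.throughput_vs_cpu_data"
```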

Count the objects with benchmark results and derive a bound on tolerable load
errors:

```console
$ objects=$(gsutil ls gs://${PROJECT_ID}-benchmark-logs/throughput-vs-cpu/ | wc -l)
$ max_errors=$(( ${objects} * 2 ))
```

Upload them to BigQuery:

```console
$ gsutil ls gs://${PROJECT_ID}-benchmark-logs/throughput-vs-cpu/ | \
xargs -I{} bq load --noreplace --skip_leading_rows=16 --max_bad_records=2 \
"${PROJECT_ID}:storage_benchmarks.throughput_vs_cpu_data" {}
```
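
As a starting point for the analysis, here is a sketch of a query over the loaded data; the `status = 'OK'` filter assumes the benchmark reports successful operations with an `OK` status string:

```console
$ bq query --use_legacy_sql=false \
    "SELECT op, COUNT(*) AS samples, AVG(elapsed_us) AS avg_elapsed_us
       FROM \`${PROJECT_ID}.storage_benchmarks.throughput_vs_cpu_data\`
      WHERE status = 'OK'
      GROUP BY op"
```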
48 changes: 48 additions & 0 deletions ci/benchmarks/storage/throughput-vs-cpu-driver.sh
@@ -0,0 +1,48 @@
#!/usr/bin/env bash
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

set -eu

echo "# ================================================================"

if [[ $# -lt 1 ]]; then
  echo "Usage: $(basename "$0") <logging-bucket> [benchmark args...]"
  exit 1
fi

readonly LOGGING_BUCKET="$1"
shift 1

# Disable -e because we always want to upload the log; even partial results
# are useful.
set +e
/r/storage_throughput_vs_cpu_benchmark "$@" 2>&1 | tee tp-vs-cpu.txt
# Capture the benchmark's exit status; plain $? would report the exit status
# of tee, the last command in the pipeline.
exit_status=${PIPESTATUS[0]}

# We need to upload the results to a uniquely named object in GCS. The
# benchmark creates a uniquely named bucket for each run, so we can reuse that
# bucket name as part of the object name.
readonly B="$(sed -n 's/# Running test on bucket: \(.*\)/\1/p' tp-vs-cpu.txt)"
# While the bucket name (which is randomly generated and fairly long) is almost
# guaranteed to be a unique name, we also want to easily search for "all the
# uploads around this time". Using a timestamp as a prefix makes that relatively
# easy.
readonly D="$(date -u +%Y-%m-%dT%H:%M:%S)"
readonly OBJECT_NAME="throughput-vs-cpu/${D}-${B}.txt"

/r/storage_object_samples upload-file tp-vs-cpu.txt \
"${LOGGING_BUCKET}" "${OBJECT_NAME}" >/dev/null

exit ${exit_status}
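
A sketch of how one might exercise this driver locally with Docker; the image tag, the key mount, and the assumption that a service account key file is available in the current directory are illustrative, not something the script prescribes:

```console
$ docker run --rm \
    -v "$PWD/key.json:/k/key.json:ro" \
    -e GOOGLE_APPLICATION_CREDENTIALS=/k/key.json \
    "gcr.io/${PROJECT_ID}/google-cloud-cpp-benchmarks:${VERSION}" \
    /r/throughput-vs-cpu-driver.sh "${LOGGING_BUCKET}"
```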
