Benchmark options #938

Kern-- · 2023-11-13T19:08:14Z

Issue #, if available:
N/A

Description of changes:

All of our existing benchmark images are run with the same set of containerd
options, which are only configurable at the test level to control things
like which snapshotter is used.

This is a problem for benchmarking GPU workloads, for example, where we need to
pass additional options to mount the GPU in the container which don't
apply to all images in the test.

Additionally, our benchmarker assumes that the benchmarked images
require no configuration. This can make experimentation hard in
cases where a single base image can be used for multiple use cases
depending on environment variables, configuration mounts, etc.

This change adds the ability to configure image-specific options when
loading benchmarks from JSON. The options are not required; if none are
passed, the benchmarker behaves as it did before this change.

The options available in this change are those that were
necessary for benchmarking the LLM workloads that I was trying to test.
They are not comprehensive, but can be built upon as use cases arise.
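As a rough illustration of how optional per-image options could be loaded from JSON, here is a minimal sketch. The struct and function names (`ImageOptions`, `MountOption`, `parseImageDescriptors`) and the `example.com` image references are assumptions made for this example; only `ImageDescriptor` and the JSON keys come from this PR, and the real definitions in `benchmark/utils.go` may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MountOption is a hypothetical type; its JSON keys mirror the "mounts"
// entries in the benchmark.json example from this description.
type MountOption struct {
	Source      string   `json:"source"`
	Destination string   `json:"destination"`
	Type        string   `json:"type"`
	Options     []string `json:"options"`
}

// ImageOptions is a hypothetical type holding the image-specific options.
type ImageOptions struct {
	Net     string        `json:"net"`
	Gpu     bool          `json:"gpu"`
	ShmSize int64         `json:"shm_size"`
	Mounts  []MountOption `json:"mounts"`
}

// ImageDescriptor sketches one benchmark entry. Options is a pointer so
// that an entry without an "options" key decodes to nil, which a caller
// can treat as "behave as before".
type ImageDescriptor struct {
	ShortName            string        `json:"short_name"`
	ImageRef             string        `json:"image_ref"`
	SociIndexManifestRef string        `json:"soci_index_manifest_ref"`
	ReadyLine            string        `json:"ready_line"`
	TimeoutSec           int64         `json:"timeout_sec"`
	Options              *ImageOptions `json:"options,omitempty"`
}

// parseImageDescriptors decodes a JSON array of benchmark entries.
func parseImageDescriptors(data []byte) ([]ImageDescriptor, error) {
	var images []ImageDescriptor
	if err := json.Unmarshal(data, &images); err != nil {
		return nil, fmt.Errorf("parsing benchmark images: %w", err)
	}
	return images, nil
}

func main() {
	input := []byte(`[
	  {"short_name": "plain", "image_ref": "example.com/plain:latest"},
	  {"short_name": "gpu", "image_ref": "example.com/gpu:latest",
	   "options": {"net": "host", "gpu": true, "shm_size": 1000000}}
	]`)
	images, err := parseImageDescriptors(input)
	if err != nil {
		panic(err)
	}
	for _, img := range images {
		fmt.Printf("%s: has options = %t\n", img.ShortName, img.Options != nil)
	}
}
```

Using a pointer for the options field is one way to distinguish "no options given" from "options given but empty"; a value field with zero-value checks would also work.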

I kept 2 commits to separate some refactoring that I needed to make these changes from the changes themselves. I'm happy to squash or pull out the refactoring into a separate commit if that's preferred.

Testing performed:
~/benchmark.json

[{
  "short_name": "djl-serving",
  "image_ref": "<ACCOUNT>.dkr.ecr.us-west-2.amazonaws.com/djl-serving:0.25.0-deepspeed",
  "soci_index_manifest_ref": "<INDEX DIGEST>",
  "ready_line": "ModelServer BOTH API bind to: http://0.0.0.0:8080/",
  "timeout_sec": 1000,
  "options": {
    "net": "host",
    "gpu": true,
    "shm_size": 1000000,
    "mounts": [
      {
        "source": "/home/ubuntu/model",
        "destination": "/opt/ml/model",
        "type": "bind",
        "options": ["rbind", "ro"]
      },
      {
        "source": "/home/ubuntu/model_logs",
        "destination": "/opt/djl/logs",
        "type": "bind",
        "options": ["rbind", "rw"]
      }
    ]
  }
}]

BENCHMARK_FLAGS="-count 1 -f ~/benchmark-tgi.json" make benchmarks-perf-test

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.



sondavidb

Unrelated, but is there a reason benchmarkTests.go is in camelcase and everything else uses underscores?


Kern-- · 2023-11-15T23:43:19Z

> Unrelated, but is there a reason benchmarkTests.go is in camelcase and everything else uses underscores?

No, I don't think so. Historical accident.

Pass full ImageDescriptor to benchmark functions

2e7a474

This change passes the full ImageDescriptor to the benchmark tests rather than deconstructing it. This is really a setup so that we can pass more complex options in the future.

Signed-off-by: Kern Walster <walster@amazon.com>
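To illustrate the refactoring described in this commit message, here is a minimal sketch contrasting the two calling styles. The trimmed-down `ImageDescriptor` and both function names are hypothetical and chosen only for this example; the actual benchmark functions in the repository have different signatures and do real work.

```go
package main

import "fmt"

// ImageDescriptor is a hypothetical, trimmed-down stand-in for the
// descriptor mentioned in the commit message.
type ImageDescriptor struct {
	ShortName string
	ImageRef  string
	ReadyLine string
}

// describeRunDeconstructed shows the "before" style: each field the test
// needs is passed as its own parameter, so every new option would change
// the signature.
func describeRunDeconstructed(imageRef, readyLine string) string {
	return fmt.Sprintf("run %s until %q", imageRef, readyLine)
}

// describeRun shows the "after" style: the whole descriptor is passed, so
// fields added later are available without touching the signature.
func describeRun(img ImageDescriptor) string {
	return fmt.Sprintf("run %s until %q", img.ImageRef, img.ReadyLine)
}

func main() {
	img := ImageDescriptor{
		ShortName: "demo",
		ImageRef:  "example.com/demo:latest",
		ReadyLine: "server started",
	}
	fmt.Println(describeRunDeconstructed(img.ImageRef, img.ReadyLine))
	fmt.Println(describeRun(img))
}
```

Both calls produce the same string here; the difference is only in how much of the call site has to change when the descriptor grows.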

Kern-- requested a review from a team as a code owner November 13, 2023 19:08

turan18 previously approved these changes Nov 13, 2023

benchmark/utils.go (outdated, resolved)

benchmark/utils.go (resolved)

austinvazquez reviewed Nov 14, 2023

benchmark/performanceTest/main.go (resolved)

austinvazquez reviewed Nov 14, 2023

benchmark/stargzTest/main.go (resolved)

sondavidb reviewed Nov 14, 2023

benchmark/framework/containerd_utils.go (outdated, resolved)

Kern-- dismissed turan18’s stale review via 192801f November 14, 2023 19:36

Kern-- force-pushed the benchmark-options branch from 65b5fcb to 192801f November 14, 2023 19:36

austinvazquez approved these changes Nov 14, 2023

turan18 approved these changes Nov 14, 2023

Kern-- mentioned this pull request Nov 15, 2023

[FEATURE] Remove benchmarker CSV input in favor of JSON #946

Open

Kern-- merged commit f675946 into awslabs:main Nov 15, 2023
4 checks passed
