Update CI to include benchmarking changes in Test Suite. (iree-org#17655)

This commit updates the SHARK-TestSuite ref, config files, and YAML files so
that they have the most up-to-date flags and benchmarking support. I will also
work on a Python implementation for pulling in configs in
Test-Suite, so we don't have to rely on all these config files. I checked
the golden values over 15 times in the CI, so they should be good.
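
As a rough illustration of what pulling in configs from Python could look like, here is a minimal sketch. It assumes the config files keep the `<model>_<target>.json` naming used in the workflow matrix of this commit (for example `sdxl_vae_decode_gpu_rocm_gfx90a.json`). The function name `sdxl_config_paths` and the mapping `SDXL_SUBMODELS` are hypothetical and are not part of SHARK-TestSuite.

```python
from pathlib import PurePosixPath
from typing import Dict

# Assumption: file stems copied from the config names in the workflow matrix.
SDXL_SUBMODELS = {
    "unet": "sdxl_scheduled_unet",
    "vae": "sdxl_vae_decode",
    "clip": "sdxl_prompt_encoder",
}

# Directory used by the *_CONFIG_FILE_PATH variables in the workflow.
DEFAULT_CONFIG_DIR = "build_tools/pkgci/external_test_suite"


def sdxl_config_paths(target: str, config_dir: str = DEFAULT_CONFIG_DIR) -> Dict[str, PurePosixPath]:
    """Hypothetical helper: map each SDXL submodel to its config path for a target."""
    base = PurePosixPath(config_dir)
    return {key: base / f"{stem}_{target}.json" for key, stem in SDXL_SUBMODELS.items()}
```

Calling `sdxl_config_paths("cpu_llvm_task")` would yield the three CPU config paths that the matrix currently spells out by hand.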

This commit also adds support to the CI for generating a job
summary of the benchmark mean times for e2e and all the sub-models.
Developers can see it in the summary tab of the PkgCI testing.
Example: https://github.com/iree-org/iree/actions/runs/9501523985

<img width="646" alt="image"
src="https://github.com/iree-org/iree/assets/77521230/8c0e8732-64a9-4147-b596-64520b0622d6">
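
The workflow step in this commit builds the summary with shell `echo` calls that append `job_summary.txt` to the file named by `GITHUB_STEP_SUMMARY`. For readers who prefer to see the mechanism in Python, here is a rough equivalent. The function name `append_job_summary` is hypothetical, and the contents of `job_summary.txt` are whatever the benchmark script writes, which this commit page does not show.

```python
import os
from pathlib import Path


def append_job_summary(summary_file: str, heading: str = "### SDXL Benchmark Summary:") -> bool:
    """Append a heading, a blank line, and the summary file to the job summary.

    Returns False when GITHUB_STEP_SUMMARY is unset (for example, a local run).
    """
    target = os.environ.get("GITHUB_STEP_SUMMARY")
    if not target:
        return False
    # Drop trailing newlines so the result matches `echo "$(<file)"` in the shell.
    body = Path(summary_file).read_text(encoding="utf-8").rstrip("\n")
    with open(target, "a", encoding="utf-8") as out:
        out.write(f"{heading}\n\n{body}\n")
    return True
```

GitHub renders the appended text as Markdown on the run's summary page, so the heading line shows up as a section title.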


Side note: build_test_all_bazel failed the first couple of times
and then passed. It seems to be unstable.

---------

Signed-off-by: saienduri <saimanas.enduri@amd.com>
saienduri committed Jun 13, 2024
1 parent 97fbe5f commit b4321ea
Showing 14 changed files with 797 additions and 17 deletions.
63 changes: 54 additions & 9 deletions .github/workflows/pkgci_regression_test.yml
@@ -90,7 +90,7 @@ jobs:
uses: actions/checkout@8f4b7f84864484a7bf31766abe9204da3cbe65b3 # v3.5.0
with:
repository: nod-ai/SHARK-TestSuite
- ref: c9b3337e1f754c83d178568be1339aaef5f08045
+ ref: ab932cc54f1e460ccd9b4a4f1efa07d0ee069eb5
path: SHARK-TestSuite
submodules: false
lfs: false
@@ -138,15 +138,19 @@ jobs:
# CPU
- name: cpu_llvm_task
models-config-file: pytorch_models_cpu_llvm_task.json
- sdxl-config-file: sdxl_scheduled_unet_cpu_llvm_task.json
+ sdxl-unet-config-file: sdxl_scheduled_unet_cpu_llvm_task.json
+ sdxl-vae-config-file: sdxl_vae_decode_cpu_llvm_task.json
+ sdxl-clip-config-file: sdxl_prompt_encoder_cpu_llvm_task.json
runs-on: nodai-amdgpu-w7900-x86-64

# AMD GPU
- name: amdgpu_rocm_gfx90a
models-config-file: pytorch_models_gpu_rocm_gfx90a.json
models-extra-flags-config-file: pytorch_models_gpu_rocm_gfx90a_additional_flags.json
- sdxl-config-file: sdxl_scheduled_unet_gpu_rocm_gfx90a.json
- runs-on: nodai-amdgpu-mi250-x86-64
+ sdxl-unet-config-file: sdxl_scheduled_unet_gpu_rocm_gfx90a.json
+ sdxl-vae-config-file: sdxl_vae_decode_gpu_rocm_gfx90a.json
+ sdxl-clip-config-file: sdxl_prompt_encoder_gpu_rocm_gfx90a.json
+ runs-on: nodai-amdgpu-mi210-x86-64
- name: amdgpu_vulkan
models-config-file: pytorch_models_gpu_vulkan.json
runs-on: nodai-amdgpu-w7900-x86-64
@@ -166,7 +170,9 @@ jobs:
IREE_TEST_PATH_EXTENSION: ${{ github.workspace }}/build_tools/pkgci/external_test_suite
MODELS_CONFIG_FILE_PATH: build_tools/pkgci/external_test_suite/${{ matrix.models-config-file }}
MODELS_EXTRA_FLAGS_CONFIG_FILE_PATH: build_tools/pkgci/external_test_suite/${{ matrix.models-extra-flags-config-file }}
- SDXL_CONFIG_FILE_PATH: build_tools/pkgci/external_test_suite/${{ matrix.sdxl-config-file }}
+ SDXL_UNET_CONFIG_FILE_PATH: build_tools/pkgci/external_test_suite/${{ matrix.sdxl-unet-config-file }}
+ SDXL_CLIP_CONFIG_FILE_PATH: build_tools/pkgci/external_test_suite/${{ matrix.sdxl-clip-config-file }}
+ SDXL_VAE_CONFIG_FILE_PATH: build_tools/pkgci/external_test_suite/${{ matrix.sdxl-vae-config-file }}
VENV_DIR: ${{ github.workspace }}/venv
steps:
- name: Checking out IREE repository
@@ -201,7 +207,7 @@ jobs:
uses: actions/checkout@8f4b7f84864484a7bf31766abe9204da3cbe65b3 # v3.5.0
with:
repository: nod-ai/SHARK-TestSuite
- ref: c9b3337e1f754c83d178568be1339aaef5f08045
+ ref: ab932cc54f1e460ccd9b4a4f1efa07d0ee069eb5
path: SHARK-TestSuite
submodules: false
lfs: true
@@ -243,7 +249,7 @@ jobs:
--config-files=${MODELS_EXTRA_FLAGS_CONFIG_FILE_PATH}
- name: "Run external tests - SDXL scheduled unet"
- if: "matrix.sdxl-config-file != '' && !cancelled()"
+ if: "matrix.sdxl-unet-config-file != '' && !cancelled()"
run: |
source ${VENV_DIR}/bin/activate
pytest SHARK-TestSuite/iree_tests/pytorch/models/sdxl-scheduled-unet-3-tank \
@@ -254,10 +260,49 @@ jobs:
--log-cli-level=info \
--timeout=1200 \
--durations=0 \
- --config-files=${SDXL_CONFIG_FILE_PATH}
+ --config-files=${SDXL_UNET_CONFIG_FILE_PATH}
- name: "Run external tests - SDXL prompt encoder"
if: "matrix.sdxl-clip-config-file != '' && !cancelled()"
run: |
source ${VENV_DIR}/bin/activate
pytest SHARK-TestSuite/iree_tests/pytorch/models/sdxl-prompt-encoder-tank \
-rpfE \
-k real_weights \
--no-skip-tests-missing-files \
--capture=no \
--log-cli-level=info \
--timeout=1200 \
--durations=0 \
--config-files=${SDXL_CLIP_CONFIG_FILE_PATH}
- name: "Run external tests - SDXL vae decode"
if: "matrix.sdxl-vae-config-file != '' && !cancelled()"
run: |
source ${VENV_DIR}/bin/activate
pytest SHARK-TestSuite/iree_tests/pytorch/models/sdxl-vae-decode-tank \
-rpfE \
-k real_weights \
--no-skip-tests-missing-files \
--capture=no \
--log-cli-level=info \
--timeout=1200 \
--durations=0 \
--config-files=${SDXL_VAE_CONFIG_FILE_PATH}
- name: "Running SDXL rocm pipeline benchmark"
if: contains(matrix.name, 'rocm')
run: |
source ${VENV_DIR}/bin/activate
bash SHARK-TestSuite/iree_tests/benchmarks/benchmark_sdxl_rocm.sh
pytest SHARK-TestSuite/iree_tests/benchmarks/benchmark_sdxl_rocm.py \
--goldentime-rocm-e2e-ms 1636 \
--goldentime-rocm-unet-ms 442 \
--goldentime-rocm-clip-ms 16.5 \
--goldentime-rocm-vae-ms 285 \
--gpu-number 3 \
--rocm-chip gfx90a \
--log-cli-level=info \
--retries 7
echo "### SDXL Benchmark Summary:" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY # this is a blank line
echo "$(<job_summary.txt )" >> $GITHUB_STEP_SUMMARY
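
The benchmark step passes golden times (for example `--goldentime-rocm-e2e-ms 1636`) and `--retries 7` to `benchmark_sdxl_rocm.py`, but this page does not show how the script uses them. The sketch below is one plausible reading, not the script's actual logic: a measured mean passes if it is no slower than the golden time plus a tolerance, and the measurement is repeated up to `retries` times. The 10% tolerance and the names `within_golden` and `check_with_retries` are assumptions made for illustration.

```python
from typing import Callable


def within_golden(measured_ms: float, golden_ms: float, tolerance: float = 0.10) -> bool:
    """True if the measured time is at most `tolerance` slower than the golden time.

    The 10% default is an assumption, not a value taken from SHARK-TestSuite.
    """
    return measured_ms <= golden_ms * (1.0 + tolerance)


def check_with_retries(
    run_benchmark: Callable[[], float],
    golden_ms: float,
    retries: int = 7,
    tolerance: float = 0.10,
) -> bool:
    """Run the benchmark up to `retries` times; pass as soon as one run is within bounds."""
    for _ in range(retries):
        if within_golden(run_benchmark(), golden_ms, tolerance):
            return True
    return False
```

Retrying before failing is a common way to absorb run-to-run noise on shared GPU runners, at the cost of hiding small regressions that only show up intermittently.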