Integrate llvm-project @56069ab1a35e74d0d8d632121e1891d41cb56a2d #17862

Merged
yzhang93 merged 2 commits into main from integrates/llvm-20240710
Jul 12, 2024

Conversation

yzhang93
Contributor

@yzhang93 yzhang93 commented Jul 11, 2024

@yzhang93 yzhang93 marked this pull request as ready for review July 11, 2024 04:11
@yzhang93
Contributor Author

@MaheshRavishankar @kuhar Not sure why the ROCm regression tests are failing.

@yzhang93 yzhang93 requested a review from kuhar July 11, 2024 04:42
@kuhar
Member

kuhar commented Jul 11, 2024

I think it's because of a binary size check that's too strict @ScottTodd @saienduri

@ScottTodd
Collaborator

The binary sizes changed for a compiled SDXL model and we have a change detector test. Update the values in

- name: "Running SDXL rocm pipeline benchmark (mi250)"
  if: contains(matrix.name, 'rocm_mi250_gfx90a')
  run: |
    source ${VENV_DIR}/bin/activate
    pytest SHARK-TestSuite/iree_tests/benchmarks/sdxl/benchmark_sdxl_rocm.py \
      --goldentime-rocm-e2e-ms 1336.0 \
      --goldentime-rocm-unet-ms 340.0 \
      --goldentime-rocm-clip-ms 17.5 \
      --goldentime-rocm-vae-ms 300.0 \
      --goldendispatch-rocm-unet 1714 \
      --goldendispatch-rocm-clip 1569 \
      --goldendispatch-rocm-vae 248 \
      --goldensize-rocm-unet-bytes 2062938 \
      --goldensize-rocm-clip-bytes 780328 \
      --goldensize-rocm-vae-bytes 757933 \
      --gpu-number 6 \
      --rocm-chip gfx90a \
      --log-cli-level=info \
      --retries 7
    echo "$(<job_summary.md )" >> $GITHUB_STEP_SUMMARY
    rm job_summary.md
- name: "Running SDXL rocm pipeline benchmark (mi300)"
  if: contains(matrix.name, 'rocm_mi300_gfx942')
  run: |
    source ${VENV_DIR}/bin/activate
    pytest SHARK-TestSuite/iree_tests/benchmarks/sdxl/benchmark_sdxl_rocm.py \
      --goldentime-rocm-e2e-ms 320 \
      --goldentime-rocm-unet-ms 77 \
      --goldentime-rocm-clip-ms 15 \
      --goldentime-rocm-vae-ms 74 \
      --goldendispatch-rocm-unet 1714 \
      --goldendispatch-rocm-clip 1569 \
      --goldendispatch-rocm-vae 248 \
      --goldensize-rocm-unet-bytes 2054938 \
      --goldensize-rocm-clip-bytes 780328 \
      --goldensize-rocm-vae-bytes 758509 \
      --gpu-number 0 \
      --rocm-chip gfx942 \
      --log-cli-level=info \
      --retries 7
    echo "$(<job_summary.md )" >> $GITHUB_STEP_SUMMARY
based on the logs
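
For illustration, updating one of these golden values could be scripted roughly as follows. This is a minimal sketch, not part of the IREE tooling: the helper name `update_golden_value` is hypothetical, it assumes each flag sits on its own line inside the step, and the new size used in the example is a placeholder rather than a value taken from any log.

```python
import re


def update_golden_value(workflow_text: str, step_marker: str, flag: str, new_value: int) -> str:
    """Return workflow_text with the value after `flag` replaced, but only
    inside the step whose `- name:` line contains `step_marker`.

    Hypothetical helper: assumes one flag per line, as in the snippet above.
    """
    lines = workflow_text.splitlines()
    in_step = False
    pattern = re.compile(r"(" + re.escape(flag) + r"\s+)\S+")
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("- name:"):
            # Track whether we are inside the step we want to edit.
            in_step = step_marker in stripped
            continue
        if in_step and stripped.startswith(flag + " "):
            # Use a function replacement so digits in new_value are never
            # misread as part of a group reference.
            lines[i] = pattern.sub(lambda m: m.group(1) + str(new_value), line, count=1)
    return "\n".join(lines)


if __name__ == "__main__":
    snippet = (
        '- name: "Running SDXL rocm pipeline benchmark (mi250)"\n'
        "  run: |\n"
        "      --goldensize-rocm-unet-bytes 2062938 \\\n"
        '- name: "Running SDXL rocm pipeline benchmark (mi300)"\n'
        "  run: |\n"
        "      --goldensize-rocm-unet-bytes 2054938 \\\n"
    )
    # 2070000 is a placeholder, not a measured size.
    print(update_golden_value(snippet, "(mi250)", "--goldensize-rocm-unet-bytes", 2070000))
```

Scoping the replacement to one step matters here because the mi250 and mi300 steps use the same flag names with different values.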

@yzhang93
Contributor Author

@ScottTodd So I think the failure has nothing to do with llvm integration. Shall we merge this PR?

@ScottTodd
Collaborator

Why? Compiled binary size is regularly affected by upstream MLIR changes.
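
To illustrate why a size check trips on an integrate, here is a minimal sketch of a change-detector comparison. How `benchmark_sdxl_rocm.py` actually compares sizes is not shown in this thread, so the function name `size_within_golden`, the `rel_tolerance` parameter, and the 512-byte delta are all assumptions made for the example; only the golden value 2062938 comes from the workflow snippet above.

```python
def size_within_golden(actual_bytes: int, golden_bytes: int, rel_tolerance: float = 0.0) -> bool:
    """Return True if actual_bytes is within rel_tolerance of golden_bytes.

    Hypothetical check: rel_tolerance=0.0 demands an exact match, which is
    the strictest possible change detector.
    """
    allowed_delta = golden_bytes * rel_tolerance
    return abs(actual_bytes - golden_bytes) <= allowed_delta


if __name__ == "__main__":
    golden = 2062938        # --goldensize-rocm-unet-bytes for mi250, from the workflow
    actual = golden + 512   # assumed size shift after an upstream change

    print(size_within_golden(actual, golden))                      # exact match required
    print(size_within_golden(actual, golden, rel_tolerance=0.01))  # 1% slack allowed
```

With an exact-match policy, any upstream codegen change that shifts the binary by even a few bytes fails the check, so the golden values need updating alongside the integrate. A tolerance would trade that noise for less sensitivity to real regressions.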

@yzhang93
Contributor Author

> Why? Compiled binary size is regularly affected by upstream MLIR changes.

The test_models::amdgpu_rocm_mi300_gfx942 job also failed without any new LLVM commits (see #17871).

The other failure, in amdgpu_rocm_mi250_gfx90a, seems to be a new one.

@ScottTodd
Collaborator

I'm addressing the diffs affecting all PRs since 429aafd with #17873. This PR looked like it introduced new diffs.

Signed-off-by: yzhang93 <zhyuhang88@gmail.com>
…t #96596 (#98411) (Keith Smiley on 2024-07-10 16:18:26 -0700) (26 of 27)

Signed-off-by: yzhang93 <zhyuhang88@gmail.com>
@yzhang93
Contributor Author

The regression tests now pass with this PR #17879.

@yzhang93 yzhang93 merged commit 6df0372 into main Jul 12, 2024
62 of 63 checks passed
@yzhang93 yzhang93 deleted the integrates/llvm-20240710 branch July 12, 2024 16:03

3 participants