microbenchmark - CPU Stream Benchmark Revise #712

polarG · 2025-05-21T17:31:58Z

In the current implementation, the CPU stream benchmark code renames the binary before the microbenchmark base class can verify that it exists, so the default binary check fails.

This PR adds a "default" binary, built with the standard compile parameters, so that the base class can always find and validate it. Once the default binary is in place, the CPU stream code renames it as needed and re-checks its presence before running the benchmark.

This PR also enables the CPU stream benchmark in the default settings.
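The check order described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `resolve_stream_binary`, the binary names `stream_default` and `stream_variant`, and the choice to switch the binary name rather than move the file are all assumptions made for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_stream_binary(bin_dir: Path, default_name: str,
                          variant_name: Optional[str] = None) -> Path:
    """Return the binary to run, validating the default binary first.

    Hypothetical helper: the real base-class check and binary names differ.
    """
    # Step 1: the base-class style check. The default binary must exist
    # before any renaming or variant selection happens.
    default_path = bin_dir / default_name
    if not default_path.is_file():
        raise FileNotFoundError(f'default binary not found: {default_path}')

    # Step 2: switch to the variant name if one is requested, then re-check
    # that the binary is present under that name before running.
    if variant_name is None:
        return default_path
    variant_path = bin_dir / variant_name
    if not variant_path.is_file():
        raise FileNotFoundError(f'variant binary not found: {variant_path}')
    return variant_path


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        bin_dir = Path(tmp)
        # Empty files stand in for the compiled binaries.
        (bin_dir / 'stream_default').touch()
        (bin_dir / 'stream_variant').touch()
        print(resolve_stream_binary(bin_dir, 'stream_default', 'stream_variant').name)
```

The point of the ordering is that the existence check on the default name never depends on which variant is chosen later.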

codecov · 2025-05-27T03:37:18Z

Codecov Report

Attention: Patch coverage is 60.00% with 2 lines in your changes missing coverage. Please review.

Project coverage is 86.07%. Comparing base (431bf19) to head (2311870).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...chmarks/micro_benchmarks/cpu_stream_performance.py | 60.00% | 2 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (60.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #712   +/-   ##
=======================================
  Coverage   86.06%   86.07%           
=======================================
  Files          99       99           
  Lines        7211     7209    -2     
=======================================
- Hits         6206     6205    -1     
+ Misses       1005     1004    -1
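The percentages in this report can be reproduced from the hit and line counts in the diff above. The snippet below is only a sanity check of the arithmetic. The helper name `coverage_pct` is made up for the example, and the inferred count of measured patch lines assumes patch coverage is simply covered lines divided by covered plus missing lines.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals."""
    return round(100.0 * hits / lines, 2)


# Project coverage for base (main) and head (#712), from the diff table.
base_cov = coverage_pct(6206, 7211)
head_cov = coverage_pct(6205, 7209)

# Patch coverage: 2 missing lines at 60% coverage implies 5 measured lines,
# 3 of them covered (assumption: coverage = covered / (covered + missing)).
missing = 2
patch_cov = 0.60
measured = round(missing / (1 - patch_cov))
covered = measured - missing

print(base_cov, head_cov, measured, covered)
```

Under that assumption, covering one more of the two missing lines would bring the patch to 4 of 5 lines, which is exactly the 80.00% target.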

| Flag | Coverage Δ | |
|---|---|---|
| cpu-python3.10-unit-test | `71.77% <60.00%> (+<0.01%)` | ⬆️ |
| cpu-python3.12-unit-test | `71.77% <60.00%> (+<0.01%)` | ⬆️ |
| cpu-python3.7-unit-test | `71.35% <60.00%> (+<0.01%)` | ⬆️ |
| cuda-unit-test | `83.51% <60.00%> (+<0.01%)` | ⬆️ |


abuccts

pls fix the build

third_party/Makefile

Description: Add release note for v0.12.0

# Main Features

## SuperBench Improvement

1. - [x] Update Image Build Pipeline (#659)
2. - [x] Add support for arm64 build (#660)
3. - [x] Upgrade dependency versions in pipeline (#671)
4. - [x] Fix installation and lint issues (#684)
5. - [x] Update Flake8 repo (#683)
6. - [x] Init latest python support. (#687)
7. - [x] Add image build on arm64 arch (#690)
8. - [x] Enhancement of ignoring errors for import pkg_resources (#692)
9. - [x] Update label in the ROCm image build (#693)
10. - [x] Support cuda12.8 for Blackwell arch (#682)
11. - [x] Merge multi-arch image (#696)
12. - [x] Update OS of runner to the latest. (#702)
13. - [x] cuda arch flag for cublaslt (#701)

## Micro-benchmark Improvement

1. - [x] Bug Fix - Fix numa error on grace cpu in gpu-copy (#658)
2. - [x] Dependency - Bump onnxruntime-gpu version from 1.10.0 to 1.12.0 (#663)
3. - [x] Benchmarks: micro benchmarks - add general CPU bandwidth and latency benchmark (#662)
4. - [x] Benchmarks: micro benchmarks - add nvbandwidth build and benchmark (#665 and #669)
5. - [x] Fix stderr message in gpu-copy benchmark (#673)
6. - [x] Add arch support for 10.0 in gemm-flops (#680)
7. - [x] Fix tensorrt-inference parsing (#674)
8. - [x] nvbandwidth benchmark need to handle N/A value (#675)
9. - [x] Avoid Unintended nvbandwidth Function Calls in All Benchmarks (#685)
10. - [x] Add GPU Stream Micro Benchmark (#697)
11. - [x] Cuda arch flag for cublaslt (#701)
12. - [x] Support autotuning in cublaslt gemm (#706)
14. - [x] Add FP4 GEMM FLOPS support for cublaslt_gemm benchmark (#711)
15. - [x] CPU Stream Benchmark Revise (#712)
16. - [x] Add cuda12.9 docker image (#716)
17. - [x] Add Grace CPU support for CPU Stream (#719)

## Model Benchmark Improvement

1. - [x] Add LLaMA-2 Models (#668)
2. - [x] Fix typos in documentation and code files (#686)
3. - [x] Add Mixture of Experts Model (#679)
4. - [ ] Add DeepSeek Training Benchmark
5. - [x] Add DeepSeek Inference Benchmark (AMD GPU) (#713)

## Documentation

1. - [x] Update CODEOWNERS (#670)
2. - [x] Update CODEOWNERS (#718)

## Result Analysis

1. - [x] Enhance logging information for diagnosis rule op baseline errors. (#689)

Hongtao Zhang added 3 commits May 20, 2025 14:58

Revise cpu_stream benchmark.

e00858a

Enable cpu_stream in common.

b5a41c8

Decrease default stream array size.

f7032f6

polarG requested a review from a team as a code owner May 21, 2025 17:31

Add default bin in unittest.

4e88409

polarG enabled auto-merge (squash) May 28, 2025 17:21

polarG added the benchmarks (SuperBench Benchmarks) and micro-benchmarks (Micro Benchmark Test for SuperBench Benchmarks) labels May 28, 2025

polarG requested review from abuccts and guoshzhao June 4, 2025 23:44

Hongtao Zhang added 2 commits June 5, 2025 11:00

Remove exe from bins' name.

77beddd

Remove exe from bin names.

71061e5

abuccts reviewed Jun 7, 2025

third_party/Makefile

guoshzhao approved these changes Jun 9, 2025

abuccts approved these changes Jun 10, 2025

Merge branch 'main' into hongtao/cpu-stream-revise

2311870

polarG merged commit 991c005 into main Jun 14, 2025
21 of 23 checks passed

polarG deleted the hongtao/cpu-stream-revise branch June 14, 2025 08:27

guoshzhao mentioned this pull request Jul 2, 2025

V0.12.0 Release Plan #710

Closed

40 tasks

polarG mentioned this pull request Aug 6, 2025

Docs - Upgrade version and release note #727

Merged
