Enhancement - Fix hipblaslt-gemm result parsing for hipBLASLt v1500+ output format by polarG · Pull Request #791 · microsoft/superbenchmark

polarG · 2026-03-20T17:52:42Z

Description
The hipblaslt-gemm benchmark result parser fails with MICROBENCHMARK_RESULT_PARSING_FAILURE (return code 33) when running against hipBLASLt v1500+. The benchmark kernels execute successfully and produce valid TFLOPS data, but SuperBench cannot parse the results into structured metrics.

Root cause: The parser hardcodes len(fields) != 23 to validate the output CSV, but hipBLASLt v1500 outputs 34 columns — it added a_type, b_type, c_type, d_type, scaleA, scaleB, scaleC, scaleD, amaxD, bias_type, aux_type, and hipblaslt-GB/s. This causes two bugs:

The field count check rejects every result line as invalid.
Even if it passed, fields[-2] would return hipblaslt-GB/s instead of hipblaslt-Gflops.

Fix
Replace the hardcoded field-count check and positional index with header-based column lookup:

Parse the CSV header line to dynamically find the column index of hipblaslt-Gflops.
Validate that the data line has the same number of columns as the header (instead of a magic number).
Use the discovered column index to extract the Gflops value.

This approach is:

Backward compatible — works with the old 23-column format (hipBLASLt v600).
Forward compatible — will handle any future column additions as long as hipblaslt-Gflops remains in the header.

Copilot

Pull request overview

Updates hipblaslt-gemm result parsing to support hipBLASLt v1500+ CSV output by using header-based column lookup instead of hardcoded field counts/positions.

Changes:

Parse the CSV header to dynamically locate the hipblaslt-Gflops column and validate row width against the header.
Add a unit test covering the new v1500+ (34-column) output format.
Clarify parsing intent via inline comments.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
tests/benchmarks/micro_benchmarks/test_hipblaslt_function.py	Adds a new test case for v1500+ output and annotates the existing old-format test.
superbench/benchmarks/micro_benchmarks/hipblaslt_function.py	Switches parsing to header-based `hipblaslt-Gflops` extraction and header/data column-count validation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

codecov · 2026-03-20T18:02:10Z

Codecov Report

❌ Patch coverage is 89.47368% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.70%. Comparing base (932d9f6) to head (cff175a).

Files with missing lines	Patch %	Lines
.../benchmarks/micro_benchmarks/hipblaslt_function.py	89.47%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #791      +/-   ##
==========================================
+ Coverage   85.69%   85.70%   +0.01%     
==========================================
  Files         103      103              
  Lines        7890     7907      +17     
==========================================
+ Hits         6761     6777      +16     
- Misses       1129     1130       +1

Flag	Coverage Δ
cpu-python3.10-unit-test	`70.47% <89.47%> (+0.05%)`	⬆️
cpu-python3.12-unit-test	`70.47% <89.47%> (+0.05%)`	⬆️
cpu-python3.7-unit-test	`69.90% <89.47%> (+0.05%)`	⬆️
cuda-unit-test	`83.62% <89.47%> (+0.02%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…sserts - Use exact match (col_name.strip() == 'hipblaslt-Gflops') for the FLOPS column lookup, avoiding accidental matches against future names like 'hipblaslt-Gflops-peak'. - Rewrite the inline comment to drop misleading positional index claims and use the correct 'hipblaslt-GB/s' name instead of 'GB_s'. - Replace brittle assertEqual(2, len(...)) and exact float assertEqual with assertIn(...) + assertAlmostEqual(..., places=3) for the new-format test.

- Harden first-header-field rank-marker stripping with an anchored regex (re.sub r'^\s*\[\d+\]:?'), so a column whose name happens to contain ']' is not truncated. - Add an explicit IndexError guard before reading lines[index + 1] so hipblaslt-bench output that ends on the header (e.g., crash after printing it) yields a clear ValueError instead of a generic IndexError.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Resolves the remaining unresolved Copilot review comment on PR #791 (comment id 3177652631): the previous fix only resolved the hipblaslt-Gflops column by header name; the metric key still used fixed positions 3-6 for batch_count/m/n/k, defeating the forward-compatibility claim. Changes: - Build a column-name -> index map from the parsed header (first-occurrence-wins via dict.setdefault) and resolve every key column (batch_count, m, n, k, hipblaslt-Gflops) by name. - Raise ValueError listing all missing required columns so unknown output formats fail loudly instead of silently producing a wrong metric key. - Strip whitespace from each selected data value before building the metric key, so any CSV padding does not bleed into the key. - Replace the misleading 'indices 3/4-6 are stable' comment with a note explaining that all key fields are header-driven. Tests: - Add forward-compat positive test that inserts an extra column before batch_count and pads data values with whitespace, asserting the correct key is produced and that no key derived from the wrong positional field leaks through. - Add negative test asserting MICROBENCHMARK_RESULT_PARSING_FAILURE when a required column (batch_count) is missing from the header.

…version

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Resolves Copilot inline review comment id 3277063504 on PR #791. The v1500 and future-format tests already followed the assertIn + assertAlmostEqual pattern after the previous review round; this commit applies the same hardening to the v600 positive test that was missed. - Drop self.assertEqual(2, len(benchmark.result)) which coupled the test to the always-present return_code metric instead of the behavior under test. - Replace self.assertEqual(58.6245, ...) (exact float equality) with self.assertIn(...) + self.assertAlmostEqual(..., places=4) so the numeric check is not flaky on floating-point representation. places=4 matches the four decimal places in the expected fixture value.

Fix hpblaslt cannot parse the result on V1500.

622592c

polarG requested a review from a team as a code owner March 20, 2026 17:52

Copilot AI review requested due to automatic review settings March 20, 2026 17:52

Copilot AI reviewed Mar 20, 2026

View reviewed changes

Comment thread tests/benchmarks/micro_benchmarks/test_hipblaslt_function.py Outdated

Comment thread superbench/benchmarks/micro_benchmarks/hipblaslt_function.py Outdated

Comment thread superbench/benchmarks/micro_benchmarks/hipblaslt_function.py Outdated

Copilot started reviewing on behalf of polarG March 20, 2026 18:17 View session

polarG self-assigned this Mar 23, 2026

polarG added enhancement New feature or request ROCm labels Mar 23, 2026

Hongtao Zhang added 2 commits May 2, 2026 00:11

Copilot AI review requested due to automatic review settings May 3, 2026 04:57

Copilot started reviewing on behalf of polarG May 3, 2026 04:58 View session

Copilot AI reviewed May 3, 2026

View reviewed changes

Comment thread superbench/benchmarks/micro_benchmarks/hipblaslt_function.py Outdated

WenqingLan1 approved these changes May 12, 2026

View reviewed changes

guoshzhao approved these changes May 13, 2026

View reviewed changes

Hongtao Zhang and others added 2 commits May 20, 2026 20:38

Merge branch 'main' into dev/hongtaozhang/fix-hipblaslt-parse-in-new-…

73440eb

…version

Copilot AI review requested due to automatic review settings May 20, 2026 20:40

Copilot started reviewing on behalf of polarG May 20, 2026 20:40 View session

polarG enabled auto-merge (squash) May 20, 2026 20:40

Copilot AI reviewed May 20, 2026

View reviewed changes

Comment thread tests/benchmarks/micro_benchmarks/test_hipblaslt_function.py

Comment thread tests/benchmarks/micro_benchmarks/test_hipblaslt_function.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhancement - Fix hipblaslt-gemm result parsing for hipBLASLt v1500+ output format#791

Enhancement - Fix hipblaslt-gemm result parsing for hipBLASLt v1500+ output format#791
polarG wants to merge 6 commits into
mainfrom
dev/hongtaozhang/fix-hipblaslt-parse-in-new-version

polarG commented Mar 20, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Mar 20, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

polarG commented Mar 20, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov Bot commented Mar 20, 2026 •

edited

Loading