Skip to content

Use arch when writing to tuningDB, perfRunnerlooks for arch#2259

Merged
umangyadav merged 6 commits intodevelopfrom
fixPerfRunner
Mar 3, 2026
Merged

Use arch when writing to tuningDB, perfRunnerlooks for arch#2259
umangyadav merged 6 commits intodevelopfrom
fixPerfRunner

Conversation

@umangyadav
Copy link
Copy Markdown
Member

Motivation

TuningDB has column "arch". But currently it stores "chip".

Arch has full name e.g. gfx950:sramecc+:xnack-
Chip only has gfx950.

PerfRunner looks for "arch"

if (arch, num_cu, num_chiplets, config_str) in tuning_db:

When it can not find it, it sets TFLops to NaN.

Technical Details

Make tuningRunner store "arch" instead of chip

Test Plan

Reproduce steps :

python3 ./bin/tuningRunner.py --operation gemm -c ../mlir/utils/performance/configs/tier1-gemm-configs --tuning-space=quick -o gemm_quick.tsv | python3 ./bin/perfRunner.py --operation gemm -c ../mlir/utils/performance/configs/tier1-gemm-configs -t gemm_quick.tsv -o gemm.csv --batch_mlir

This produces NaNs.

After this PR running same produces correct TFlops values.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug where tuningRunner was writing the chip name (e.g., "gfx950") to the tuning database instead of the full architecture string (e.g., "gfx950:sramecc+:xnack-"). This caused perfRunner to fail to find matching entries when looking up tuning configurations, resulting in NaN TFlops values.

Changes:

  • Changed tuningRunner to write arch instead of chip to the tuning database, aligning with the database schema and perfRunner's lookup expectations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mlir/utils/performance/tuningRunner.py Outdated
Comment thread mlir/utils/performance/tuningRunner.py Outdated
umangyadav and others added 4 commits February 27, 2026 15:29
Co-authored-by: Mirza Halilčević <109971222+mirza-halilcevic@users.noreply.github.com>
Co-authored-by: Mirza Halilčević <109971222+mirza-halilcevic@users.noreply.github.com>
@umangyadav umangyadav merged commit 11d5c9d into develop Mar 3, 2026
7 of 14 checks passed
@umangyadav umangyadav deleted the fixPerfRunner branch March 3, 2026 15:45
cursor Bot pushed a commit that referenced this pull request Mar 13, 2026
Automated weekly review of merged PRs #2234 #2240 #2248 #2249 #2251
#2254 #2257 #2258 #2259 #2270 #2271.

Identifies 6 areas with weak test coverage and meaningful business risk:
1. ConcurrentQueue (no unit tests, multi-threaded, silent deadlock risk)
2. parse_tuning_db_line / read_tuning_db key schema change (no Python tests)
3. BooleanElementwiseConverter missing f16/unsigned dtype coverage
4. Attention MaxNumFOp vs MaximumFOp NaN correctness (no dedicated test)
5. firstCausalMaskIter off-by-one risk (no non-trivial shape test)
6. Sliding window attention edge cases (windowSize=0/>=seqLen/unaligned)

The GitHub discussion API returned FORBIDDEN (read-only CI token);
analysis committed here as a permanent record.

Co-authored-by: Djordje Antic <djordje.antic@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants