Skip to content

Fix compute capability encoding for mathDx#1170

Merged
cliffburdick merged 1 commit into
mainfrom
fix/sm-encoding
May 7, 2026
Merged

Fix compute capability encoding for mathDx#1170
cliffburdick merged 1 commit into
mainfrom
fix/sm-encoding

Conversation

@simonbyrne
Copy link
Copy Markdown
Collaborator

This was getting computed incorrectly (e.g. 809 instead of 890)

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@simonbyrne simonbyrne requested a review from cliffburdick May 6, 2026 21:25
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 6, 2026

Greptile Summary

This PR fixes an off-by-10× encoding bug in compute capability calculation where the minor version was added as a raw digit instead of being multiplied by 10 (e.g., SM 8.9 was encoded as 809 instead of the correct 890). The fix is applied consistently across all three affected sites.

  • include/matx/core/utils.h: GetComputeCapability() now returns major * 100 + minor * 10, matching the __CUDA_ARCH__ encoding convention used throughout CUDA. The downstream consumer in tensor_impl.h (>= 1000 for SM 10.0) remains correct with both encodings since SM 10.0 has minor = 0.
  • include/matx/operators/fft.h and include/matx/operators/matmul.h: The same fix is applied to the two local cc computations that feed into mathDx helper set_cc() calls, ensuring mathDx selects the correct code path for GPUs with non-zero minor versions (e.g., Ada Lovelace SM 8.9).

Confidence Score: 5/5

Safe to merge — all three encoding sites are corrected and no callers are broken by the fix.

The change is a minimal, targeted arithmetic correction applied consistently to every location that computes the CUDA compute capability value. The one downstream caller comparing against >= 1000 for SM 10.0 is unaffected because SM 10.0 has minor = 0, making both encodings produce identical results there.

No files require special attention.

Important Files Changed

Filename Overview
include/matx/core/utils.h Fixed GetComputeCapability() to multiply the minor version by 10, so SM 8.9 -> 890 instead of 809; also removes a trailing blank line.
include/matx/operators/matmul.h Fixed local compute-capability encoding in the mathDx GEMM helper initialization path to use minor * 10.
include/matx/operators/fft.h Fixed local compute-capability encoding in the mathDx FFT helper initialization path to use minor * 10; the previously-noted bug is now resolved.

Reviews (2): Last reviewed commit: "Fix compute capability encoding for math..." | Re-trigger Greptile

@simonbyrne
Copy link
Copy Markdown
Collaborator Author

/build

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage is 94.308%fix/sm-encoding into main. No base build found for main.

@cliffburdick cliffburdick merged commit aeac6b8 into main May 7, 2026
1 check passed
@cliffburdick cliffburdick deleted the fix/sm-encoding branch May 7, 2026 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants