[Common] Always define cuBLASMp comm GEMM API by vcherepanov-nv · Pull Request #2963 · NVIDIA/TransformerEngine

vcherepanov-nv · 2026-05-06T00:24:17Z

Description

Make cuBLASMp-related TE API unconditional.
NVTE_WITH_CUBLASMP options now controls whether the user gets the real implementation or subs, emitting runtime errors.

Fixes # (issue)

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring

Changes

Please list the changes introduced in this PR:

Make comm-gemm API unconditional, stubs emit runtime errors if called without NVTE_WITH_CUBLASMP flag set.

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

for more information, see https://pre-commit.ci

greptile-apps · 2026-05-06T00:28:31Z

Greptile Summary

This PR makes the cuBLASMp comm-GEMM API unconditional: comm_gemm/comm_gemm.cpp is now always compiled, with the real implementation guarded by #ifdef NVTE_WITH_CUBLASMP and a #else block of stubs that throw a descriptive runtime error when called without cuBLASMp support.

CMakeLists.txt: comm_gemm/comm_gemm.cpp is moved from a conditional NVTE_WITH_CUBLASMP-gated append to transformer_engine_SOURCES into the unconditional transformer_engine_cpp_sources list, which is already merged into transformer_engine_SOURCES at build time.
comm_gemm.cpp: #include <cublasmp.h> and using namespace transformer_engine are now inside the #ifdef NVTE_WITH_CUBLASMP block; a new #else block provides stub definitions for all five public API functions that immediately throw std::runtime_error with a clear build-configuration message.

Confidence Score: 4/5

Safe to merge. The change is a straightforward build-system and conditional-compilation refactor with no logic modifications to the real cuBLASMp path.

The CMakeLists.txt change is correct — transformer_engine_cpp_sources is merged into transformer_engine_SOURCES at line 237, so the file ends up in exactly the right place. The stub implementations cover all five public API functions and use the standard NVTE_ERROR pattern. The only nit is that two non-void stubs (nvte_comm_gemm_ctx_create and nvte_comm_gemm_numroc) omit explicit return statements after NVTE_ERROR, which is harmless in practice since the macro always throws, but may generate warnings under stricter toolchain configurations.

No files require special attention; both changed files are straightforward.

Important Files Changed

Filename	Overview
transformer_engine/common/comm_gemm/comm_gemm.cpp	Wraps the real cuBLASMp implementation in `#ifdef NVTE_WITH_CUBLASMP` and adds a `#else` block with stub functions that throw runtime errors. Two non-void stub functions (`nvte_comm_gemm_ctx_create` and `nvte_comm_gemm_numroc`) omit explicit `return` statements, which may produce warnings on some toolchains.
transformer_engine/common/CMakeLists.txt	Moves `comm_gemm/comm_gemm.cpp` from a conditional `transformer_engine_SOURCES` append (guarded by `NVTE_WITH_CUBLASMP`) to the unconditional `transformer_engine_cpp_sources` list, which is merged into `transformer_engine_SOURCES` at line 237. The change is correct and consistent with how all other unconditional C++ sources are registered.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[comm_gemm.cpp compiled unconditionally] --> B{NVTE_WITH_CUBLASMP defined?}
    B -- Yes --> C[#include cublasmp.h\nusing namespace transformer_engine\nReal implementation]
    B -- No --> D[Stub implementations\nNVTE_ERROR on all calls]
    C --> E[nvte_comm_gemm_ctx_create\nnvte_comm_gemm_ctx_destroy\nnvte_all_gather_gemm\nnvte_gemm_reduce_scatter\nnvte_gemm_all_reduce\nnvte_comm_gemm_numroc]
    D --> F[Same API surface\nthrows runtime_error]

_{Reviews (1): Last reviewed commit: "[pre-commit.ci] auto fixes from pre-comm..." | Re-trigger Greptile}

greptile-apps · 2026-05-06T00:28:35Z

+NVTECommGemmCtx* nvte_comm_gemm_ctx_create(ncclComm_t comm, int nranks, int rank) {
+  NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");
+}


Both nvte_comm_gemm_ctx_create (returns NVTECommGemmCtx*) and nvte_comm_gemm_numroc (returns int64_t) in the stub block have no explicit return statement after NVTE_ERROR. NVTE_ERROR always throws so execution never reaches the end, but certain compilers or static-analysis passes may still flag these as missing returns. Adding unreachable return values makes the intent explicit and is consistent with many defensive coding patterns for non-[[noreturn]] macros.

Suggested change

NVTECommGemmCtx* nvte_comm_gemm_ctx_create(ncclComm_t comm, int nranks, int rank) {

NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");

}

NVTECommGemmCtx* nvte_comm_gemm_ctx_create(ncclComm_t comm, int nranks, int rank) {

NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");

return nullptr;

}

greptile-apps · 2026-05-06T00:28:36Z

+int64_t nvte_comm_gemm_numroc(NVTECommGemmCtx* ctx, int64_t global_size) {
+  NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");
+}


nvte_comm_gemm_numroc also returns a non-void type (int64_t) without an explicit return statement in the stub. Same reasoning as above — add an unreachable sentinel to be safe.

Suggested change

int64_t nvte_comm_gemm_numroc(NVTECommGemmCtx* ctx, int64_t global_size) {

NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");

}

int64_t nvte_comm_gemm_numroc(NVTECommGemmCtx* ctx, int64_t global_size) {

NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");

return 0;

}

timmoon10

Strong approve. This will help avoid linker weirdness.

* Always define cuBLASMp comm GEMM API Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Always define cuBLASMp comm GEMM API

c1f1a8e

Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

vcherepanov-nv requested a review from timmoon10 May 6, 2026 00:24

vcherepanov-nv added the 2.16.0 label May 6, 2026

[pre-commit.ci] auto fixes from pre-commit.com hooks

7b4cd17

for more information, see https://pre-commit.ci

greptile-apps Bot reviewed May 6, 2026

View reviewed changes

timmoon10 approved these changes May 6, 2026

View reviewed changes

vcherepanov-nv merged commit 3c89426 into NVIDIA:main May 6, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Common] Always define cuBLASMp comm GEMM API#2963

[Common] Always define cuBLASMp comm GEMM API#2963
vcherepanov-nv merged 2 commits into
NVIDIA:mainfrom
vcherepanov-nv:cublasmp-dummy

vcherepanov-nv commented May 6, 2026

Uh oh!

greptile-apps Bot commented May 6, 2026

Uh oh!

greptile-apps Bot May 6, 2026

Uh oh!

greptile-apps Bot May 6, 2026

Uh oh!

timmoon10 left a comment •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

vcherepanov-nv commented May 6, 2026

Description

Type of change

Changes

Checklist:

Uh oh!

greptile-apps Bot commented May 6, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

timmoon10 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

timmoon10 left a comment •

edited

Loading