[Common] Always define cuBLASMp comm GEMM API #2963
Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>
for more information, see https://pre-commit.ci
Greptile Summary
This PR makes the cuBLASMp comm-GEMM API unconditional.
Confidence Score: 4/5
Safe to merge. The change is a straightforward build-system and conditional-compilation refactor with no logic modifications to the real cuBLASMp path. The CMakeLists.txt change is correct. No files require special attention; both changed files are straightforward.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[comm_gemm.cpp compiled unconditionally] --> B{NVTE_WITH_CUBLASMP defined?}
B -- Yes --> C[#include cublasmp.h\nusing namespace transformer_engine\nReal implementation]
B -- No --> D[Stub implementations\nNVTE_ERROR on all calls]
C --> E[nvte_comm_gemm_ctx_create\nnvte_comm_gemm_ctx_destroy\nnvte_all_gather_gemm\nnvte_gemm_reduce_scatter\nnvte_gemm_all_reduce\nnvte_comm_gemm_numroc]
D --> F[Same API surface\nthrows runtime_error]
Reviews (1): Last reviewed commit: "[pre-commit.ci] auto fixes from pre-comm..."
NVTECommGemmCtx* nvte_comm_gemm_ctx_create(ncclComm_t comm, int nranks, int rank) {
  NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");
}
Both nvte_comm_gemm_ctx_create (returns NVTECommGemmCtx*) and nvte_comm_gemm_numroc (returns int64_t) in the stub block have no explicit return statement after NVTE_ERROR. NVTE_ERROR always throws, so execution never reaches the end of the function, but some compilers or static-analysis passes may still flag these as missing returns. Adding unreachable return values makes the intent explicit and is a common defensive pattern for macros that are not marked [[noreturn]].
Suggested change:
- NVTECommGemmCtx* nvte_comm_gemm_ctx_create(ncclComm_t comm, int nranks, int rank) {
-   NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");
- }
+ NVTECommGemmCtx* nvte_comm_gemm_ctx_create(ncclComm_t comm, int nranks, int rank) {
+   NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");
+   return nullptr;
+ }
int64_t nvte_comm_gemm_numroc(NVTECommGemmCtx* ctx, int64_t global_size) {
  NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");
}
nvte_comm_gemm_numroc also returns a non-void type (int64_t) without an explicit return statement in the stub. The same reasoning applies: add an unreachable sentinel return to be safe.
Suggested change:
- int64_t nvte_comm_gemm_numroc(NVTECommGemmCtx* ctx, int64_t global_size) {
-   NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");
- }
+ int64_t nvte_comm_gemm_numroc(NVTECommGemmCtx* ctx, int64_t global_size) {
+   NVTE_ERROR("Transformer Engine has not been built with cuBLASMp support.");
+   return 0;
+ }
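For reference, the defensive pattern discussed in the two review comments can be sketched in isolation. This is only an illustration: DEMO_ERROR is a hypothetical macro that always throws (it is not the actual definition of NVTE_ERROR), and DemoCtx, demo_create_stub and demo_numroc_stub are made-up names.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical error macro that always throws. Because it is a macro and not
// a function, it cannot carry a [[noreturn]] attribute, so tools that do not
// see through the expansion may assume control can continue past it.
#define DEMO_ERROR(msg)                                               \
  do {                                                                \
    throw std::runtime_error(std::string("demo error: ") + (msg));    \
  } while (false)

struct DemoCtx;  // Opaque, hypothetical context type.

// Stub returning a pointer: the trailing return is never executed, but it
// gives every syntactic path through the function a return value.
DemoCtx* demo_create_stub() {
  DEMO_ERROR("built without backend support");
  return nullptr;  // unreachable
}

// Stub returning an integer: same idea, with 0 as an arbitrary sentinel.
std::int64_t demo_numroc_stub(std::int64_t global_size) {
  (void)global_size;  // unused in the stub
  DEMO_ERROR("built without backend support");
  return 0;  // unreachable
}
```

Callers observe the same behavior with or without the extra return statements: both stubs throw std::runtime_error on every call. The returns only serve readers and diagnostics.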
* Always define cuBLASMp comm GEMM API

Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Description
Make cuBLASMp-related TE API unconditional.
The NVTE_WITH_CUBLASMP option now controls whether the user gets the real implementation or stubs that emit runtime errors.
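
A minimal sketch of the general pattern, assuming a single translation unit that is always compiled and a build-time macro that selects the body. DEMO_WITH_BACKEND, DemoCommCtx and the demo_* functions are hypothetical stand-ins, not the actual Transformer Engine sources:

```cpp
#include <cstdint>
#include <stdexcept>

// DEMO_WITH_BACKEND plays the role of NVTE_WITH_CUBLASMP here and would
// normally be defined by the build system (for example -DDEMO_WITH_BACKEND).

struct DemoCommCtx {
  int nranks;
  int rank;
};

#ifdef DEMO_WITH_BACKEND

// "Real" path: a toy implementation standing in for the backend-backed one.
DemoCommCtx* demo_ctx_create(int nranks, int rank) {
  return new DemoCommCtx{nranks, rank};
}

void demo_ctx_destroy(DemoCommCtx* ctx) { delete ctx; }

std::int64_t demo_local_size(const DemoCommCtx* ctx, std::int64_t global_size) {
  // Even split across ranks; the lowest ranks absorb the remainder.
  return global_size / ctx->nranks + (ctx->rank < global_size % ctx->nranks ? 1 : 0);
}

#else

// Stub path: identical signatures, so callers always compile and link,
// but every call fails at run time with a descriptive error.
static const char* const kDemoNoBackend = "Built without backend support.";

DemoCommCtx* demo_ctx_create(int, int) { throw std::runtime_error(kDemoNoBackend); }

void demo_ctx_destroy(DemoCommCtx*) { throw std::runtime_error(kDemoNoBackend); }

std::int64_t demo_local_size(const DemoCommCtx*, std::int64_t) {
  throw std::runtime_error(kDemoNoBackend);
}

#endif
```

With this layout, code that calls demo_ctx_create needs no #ifdef of its own: a build without the flag still links, and the missing feature surfaces as a runtime error instead of an undefined symbol.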
Fixes # (issue)
Type of change
Changes
Please list the changes introduced in this PR:
Checklist: