Compiling error: ‘OVecT’ has not been declared #3155

Description

@cryoco

Describe the bug

/TransformerEngine/transformer_engine/common/activation/./../cast/dispatch/../fp8/group_quantize_fp8.cuh:1075: error: template argument 2 is invalid
/TransformerEngine/transformer_engine/common/activation/./../cast/dispatch/../fp8/group_quantize_fp8.cuh:1075: error: ‘IVecT’ has not been declared
/TransformerEngine/transformer_engine/common/activation/./../cast/dispatch/../fp8/group_quantize_fp8.cuh:1075: error: ‘OVecT’ has not been declared

Steps/Code to reproduce bug

NCCL_HOME=/usr/local/miniconda3/lib/python3.12/site-packages/nvidia/nccl NVTE_BUILD_WITH_NCCL_EP=1 USE_NCCL=1 NVTE_CUDA_ARCHS="100a" NVTE_BUILD_THREADS_PER_JOB=8 NVTE_FRAMEWORK=pytorch VERBOSE=1 pip install -vvv --no-build-isolation --no-cache-dir -e .

Environment details

  • B200
  • PyTorch 2.11
  • Python 3.12
  • Transformer Engine version 4cd244e
  • CUDA 13.0
  • cuDNN 9.19

Labels: bug, build
