[BUG] nvfp4 tensor creation looks incorrect #3057

@whatdhack

Which component has the problem?

CuTe DSL

Bug Report

Describe the bug
The tensor-creation functions create_tensors_abc_for_all_groups() and create_tensor_and_stride() in cutlass/examples/python/CuTeDSL/blackwell/grouped_blockscaled_gemm.py seem incorrect for cutlass.Float4E2M1FN and torch.float4_e2m1fn_x2. The 4-bit packing (two values per byte) does not appear to be handled.

Steps/Code to reproduce bug
See above functions

Expected behavior
The packed dimension should be sized k//2 rather than k, for example in create_tensor_and_stride(l, m, k, a_major == "m", ab_dtype).
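
To illustrate the expectation, here is a minimal pure-Python sketch of the shape arithmetic. It does not use CUTLASS or PyTorch. The helpers packed_storage_shape and pack_fp4_codes are hypothetical names, not functions from the example file. The sketch assumes K is the contiguous (packed) dimension, and the low-nibble-first byte order is only an assumption made for illustration.

```python
def packed_storage_shape(logical_shape, packed_dim, elems_per_byte=2):
    """Storage shape when `elems_per_byte` 4-bit values share one byte.

    Hypothetical helper: only the extent of `packed_dim` shrinks.
    """
    extent = logical_shape[packed_dim]
    if extent % elems_per_byte != 0:
        raise ValueError("packed dimension must be a multiple of elems_per_byte")
    shape = list(logical_shape)
    shape[packed_dim] = extent // elems_per_byte
    return tuple(shape)


def pack_fp4_codes(codes):
    """Pack 4-bit codes (ints in 0..15) pairwise into bytes.

    Assumption: the first code of each pair goes into the low nibble.
    """
    if len(codes) % 2 != 0:
        raise ValueError("need an even number of 4-bit codes")
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("each code must fit in 4 bits")
    return bytes(lo | (hi << 4) for lo, hi in zip(codes[0::2], codes[1::2]))


# Logical (l, m, k) tensor of 4-bit values, with K as the packed dimension.
l, m, k = 2, 128, 64
storage_shape = packed_storage_shape((l, m, k), packed_dim=2)
print(storage_shape)  # k becomes k // 2 in the storage shape

packed = pack_fp4_codes([0x1, 0x2, 0xF, 0x0])
print(packed.hex())
```

Under these assumptions, a logical (l, m, k) operand of 4-bit values occupies (l, m, k//2) bytes, which is the sizing the issue expects for the packed dtype.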

Environment details (please complete the following information):
cutlass master

Additional context
N/A
