Conv3dv2 only supports groups = 1 or groups = cin #85

@3manifold

I attempted to run Open-Sora 2.0 DC-AE inference on NPU and got this error:

[rank0]:   File "xxx/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 603, in _conv_forward
[rank0]:     return F.conv3d(
[rank0]: RuntimeError: call aclnnConvolution failed, detail:EZ1001: [PID: xxx] 2025-xx-xx-17:44:xxx Conv3dv2 only supports groups = 1 or groups = cin
[rank0]:         TraceBack (most recent call last):
[rank0]:         conv3d raise an unknown error
[rank0]:         check the condition which is "ret == ACLNN_SUCCESS" failed, except the condition is true.

Specs

Python 3.9.10, CANN 8.0.RC3, torch-npu==2.4.0

Question

The error occurs only with torch.npu.config.allow_internal_format = False. With True there is no error, but inference is extremely slow, which is why I wanted to try False as well.
Is there a way to work around this and run inference successfully with torch.npu.config.allow_internal_format = False? Could this be related to #73?
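
To help narrow this down, here is a minimal sketch that lists the 3D convolution layers whose groups value falls outside the two cases named in the error message (groups = 1 or groups = cin). The helper name find_unsupported_grouped_conv3d is hypothetical, and it is only an assumption that layers flagged this way are the ones triggering the failure. It relies on nothing but the standard named_modules() iterator and the in_channels, groups and kernel_size attributes that torch.nn convolution modules expose.

```python
def find_unsupported_grouped_conv3d(model):
    """List (name, in_channels, groups) for 3D convs with 1 < groups < in_channels.

    Hypothetical diagnostic helper: the rule it checks is taken only from the
    error text "Conv3dv2 only supports groups = 1 or groups = cin".
    """
    offenders = []
    for name, module in model.named_modules():
        kernel_size = getattr(module, "kernel_size", None)
        groups = getattr(module, "groups", None)
        in_channels = getattr(module, "in_channels", None)
        # Skip anything that does not look like a convolution module.
        if groups is None or in_channels is None:
            continue
        # Keep only modules with a 3-element kernel, i.e. 3D convolutions.
        if not isinstance(kernel_size, (tuple, list)) or len(kernel_size) != 3:
            continue
        # Flag layers that match neither of the two supported cases.
        if groups != 1 and groups != in_channels:
            offenders.append((name, in_channels, groups))
    return offenders
```

Calling find_unsupported_grouped_conv3d(model) on the loaded DC-AE module should return an empty list if every 3D convolution already uses groups = 1 or groups = in_channels; any entries it returns are candidates for closer inspection.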

More info: hpcaitech/Open-Sora#884
