Description
I attempted to run Open-Sora 2.0 DC-AE inference on NPU and got this error:
[rank0]: File "xxx/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 603, in _conv_forward
[rank0]: return F.conv3d(
[rank0]: RuntimeError: call aclnnConvolution failed, detail:EZ1001: [PID: xxx] 2025-xx-xx-17:44:xxx Conv3dv2 only supports groups = 1 or groups = cin
[rank0]: TraceBack (most recent call last):
[rank0]: conv3d raise an unknown error
[rank0]: check the condition which is "ret == ACLNN_SUCCESS" failed, except the condition is true.
Specs
Python 3.9.10, CANN 8.0.RC3, torch-npu==2.4.0
Question
The error occurs only with torch.npu.config.allow_internal_format = False. With True there is no error, but inference is extremely slow, which is why I wanted to try False as well.
Is there a way to work around this and run inference successfully with torch.npu.config.allow_internal_format = False? Could this be related to #73?
More info: hpcaitech/Open-Sora#884
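One possible workaround, sketched here and not tested on NPU: the error message says Conv3dv2 accepts only groups = 1 or groups = cin, so a grouped convolution with any other group count could in principle be split into one groups = 1 convolution per group, with the outputs concatenated along the channel dimension. This assumes the failing layer really is an nn.Conv3d with such a group count and that it uses padding_mode='zeros'. The helper names is_supported_groups, group_slices and conv3d_split_groups are made up for illustration and are not part of torch or torch-npu.

```python
def is_supported_groups(in_channels, groups):
    """Mirror the condition quoted in the error: groups == 1 or groups == cin."""
    return groups == 1 or groups == in_channels


def group_slices(in_channels, out_channels, groups):
    """Return one (input_channel_slice, output_channel_slice) pair per group."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    cin_g = in_channels // groups
    cout_g = out_channels // groups
    return [
        (slice(g * cin_g, (g + 1) * cin_g), slice(g * cout_g, (g + 1) * cout_g))
        for g in range(groups)
    ]


def conv3d_split_groups(x, conv):
    """Apply an nn.Conv3d, splitting unsupported group counts into groups=1 calls.

    x is a (N, C, D, H, W) tensor and conv is a torch.nn.Conv3d.
    Assumption: conv.padding_mode == 'zeros' (other modes need explicit F.pad).
    """
    # Imported lazily so the pure-Python helpers above work without torch.
    import torch
    import torch.nn.functional as F

    if is_supported_groups(conv.in_channels, conv.groups):
        return conv(x)

    outputs = []
    for in_sl, out_sl in group_slices(conv.in_channels, conv.out_channels, conv.groups):
        # conv.weight has shape (out_channels, in_channels // groups, kT, kH, kW),
        # so slicing the output channels yields this group's dense kernel.
        bias = conv.bias[out_sl] if conv.bias is not None else None
        outputs.append(
            F.conv3d(x[:, in_sl], conv.weight[out_sl], bias,
                     conv.stride, conv.padding, conv.dilation, groups=1)
        )
    return torch.cat(outputs, dim=1)
```

A quick check before patching anything would be to iterate over model.named_modules(), call is_supported_groups(m.in_channels, m.groups) on every nn.Conv3d, and see whether any layer actually fails the condition. If none does, the error presumably comes from somewhere else and this decomposition would not help.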