Problem encountered when running DeepseekV2-Lite #37

@Instant-Nebula

Hi, I encountered the following error when running DeepseekV2-Lite:

[rank0]:   File "/nnScaler/nnscaler/ir/operator.py", line 65, in verify_shape
[rank0]:     infered_shapes = self.infer_shape()
[rank0]:   File "/nnScaler/nnscaler/graph/function/dimops.py", line 795, in infer_shape
[rank0]:     raise ValueError(f"Missing dimension length for identifier: {identifier}, {shape_anno[odim].identifiers}")
[rank0]: ValueError: Missing dimension length for identifier: k, ('k',)

The error appears to occur while processing the moe_route operator: the dimension identifier k in the output part of its shape annotation is never assigned a length.
Based on the model configuration, this dimension should correspond to the number of experts selected per token, but the code does not seem to handle this case, so k is left without a value during compilation or execution.
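
To illustrate what I think is happening, here is a minimal, self-contained sketch of annotation-based shape inference. It is a toy model, not nnScaler's actual code: the function `infer_output_shapes`, the `known_dims` parameter, the annotation string, and the tensor sizes are all hypothetical and chosen only for illustration. The point is that an identifier that appears only on the output side cannot be inferred from the input shapes, so its length has to be supplied some other way.

```python
def infer_output_shapes(annotation, input_shapes, known_dims=None):
    """Toy shape inference for annotations such as 'n h, e h -> n k'.

    Hypothetical helper for illustration only; not part of nnScaler.
    """
    lhs, rhs = annotation.split("->")
    in_annos = [part.split() for part in lhs.split(",")]
    out_annos = [part.split() for part in rhs.split(",")]

    # Start from any lengths supplied explicitly (e.g. taken from a model config).
    dims = dict(known_dims or {})

    # Bind every input identifier to the concrete length of its dimension.
    for anno, shape in zip(in_annos, input_shapes):
        if len(anno) != len(shape):
            raise ValueError(f"Rank mismatch: {anno} vs {shape}")
        for identifier, length in zip(anno, shape):
            dims.setdefault(identifier, length)

    # Resolve output identifiers; one that was never bound cannot be inferred.
    out_shapes = []
    for anno in out_annos:
        shape = []
        for identifier in anno:
            if identifier not in dims:
                raise ValueError(
                    f"Missing dimension length for identifier: {identifier}"
                )
            shape.append(dims[identifier])
        out_shapes.append(tuple(shape))
    return out_shapes


# Hypothetical routing signature: tokens (n, h), gate weights (e, h) -> top-k indices (n, k).
annotation = "n h, e h -> n k"
inputs = [(8, 16), (4, 16)]

try:
    infer_output_shapes(annotation, inputs)
except ValueError as err:
    print(err)

# Supplying k explicitly lets the output shape resolve.
print(infer_output_shapes(annotation, inputs, known_dims={"k": 2}))
```

In this sketch the first call fails because nothing binds k, and the second succeeds once k is passed in. If moe_route behaves similarly, the fix may be to pass the per-token expert count to the operator so that k can be resolved, but that is my assumption and I have not verified it against the nnScaler source.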

Could you please investigate this issue? Thank you!
