RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CPUFloatType instead (while checking arguments for embedding) #5194

Closed
is-sixfive opened this issue Dec 4, 2023 · 2 comments


@is-sixfive

error log

RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CPUFloatType instead (while checking arguments for embedding)

model

  1. Frozen CLIP

how to reproduce

1. Export the PyTorch model to a TorchScript model

import torch
import torch.nn as nn
from transformers import CLIPTextModel

class TransformerCLIP(nn.Module):
    def __init__(self, version="openai/clip-vit-large-patch14", device="cpu", max_length=77):
        super().__init__()
        self.transformer = CLIPTextModel.from_pretrained(version)
        self.freeze()

    def forward(self, tokens):
        outputs = self.transformer(input_ids=tokens)
        z = outputs.last_hidden_state
        return z

    def freeze(self):
        self.transformer = self.transformer.eval()
        for param in self.parameters():
            param.requires_grad = False

model = TransformerCLIP()
tokens = torch.full((1, 77), 10, dtype=torch.long)

# save as TorchScript
model = torch.jit.trace(model, tokens)
torch.jit.save(model, "FrozenCLIPEmbedder.pt")

The above code saves the model file FrozenCLIPEmbedder.pt.
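Note that torch.jit.trace records the example input's dtype. The same trace/save/load flow can be exercised with a toy embedding module instead of downloading CLIP (a minimal stand-in for illustration, not the actual model):

```python
import io
import torch
import torch.nn as nn

class ToyEmbed(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(49408, 16)   # CLIP-sized vocab, tiny width

    def forward(self, tokens):
        return self.embed(tokens)

model = ToyEmbed().eval()
tokens = torch.full((1, 77), 10, dtype=torch.long)

# trace with an integer example input, as the CLIP export above does
traced = torch.jit.trace(model, tokens)

# round-trip through an in-memory buffer instead of a file on disk
buf = io.BytesIO()
torch.jit.save(traced, buf)
buf.seek(0)
reloaded = torch.jit.load(buf)

print(reloaded(tokens).shape)  # torch.Size([1, 77, 16])
```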

2. Export to ncnn with pnnx
Convert from PyTorch to ncnn via PNNX:

./pnnx FrozenCLIPEmbedder/FrozenCLIPEmbedder.pt inputshape=[1,77] device=cpu

or

./pnnx FrozenCLIPEmbedder/FrozenCLIPEmbedder.pt inputshape=[1,77]

Both commands fail with the error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/transformers/models/clip/modeling_clip/___torch_mangle_295.py", line 18, in forward
    _1 = torch.slice(position_ids, 0, 0, 9223372036854775807)
    input = torch.slice(_1, 1, 0, _0)
    _2 = (token_embedding).forward(input_ids, )
          ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _3 = (position_embedding).forward(input, )
    return torch.add(_2, _3)
  File "code/__torch__/torch/nn/modules/sparse/___torch_mangle_293.py", line 10, in forward
    input_ids: Tensor) -> Tensor:
    weight = self.weight
    return torch.embedding(weight, input_ids)
           ~~~~~~~~~~~~~~~ <--- HERE
           
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got CPUFloatType instead (while checking arguments for embedding)

The embedding expects indices of type Long or Int, but the traced input defaults to CPUFloatType.
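The failure is easy to reproduce outside pnnx: torch.embedding rejects float indices, and casting to an integer dtype is the usual fix (a minimal sketch with a toy embedding, not the CLIP weights):

```python
import torch

emb = torch.nn.Embedding(num_embeddings=49408, embedding_dim=8)

float_tokens = torch.full((1, 4), 10.0)        # float32: rejected as indices
raised = False
try:
    emb(float_tokens)
except RuntimeError:
    raised = True                              # "Expected ... Long, Int; but got ... Float"

long_tokens = float_tokens.to(torch.long)      # casting to long fixes it
out = emb(long_tokens)
print(raised, out.shape)
```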

@nihui
Member

nihui commented Dec 4, 2023

./pnnx FrozenCLIPEmbedder/FrozenCLIPEmbedder.pt inputshape=[1,77]i32
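pnnx traces inputs as f32 by default; the i32 suffix on the inputshape entry makes it feed an int32 tensor instead, so the embedding receives valid indices during tracing. As a quick check (a toy embedding standing in for the CLIP text model), torch's embedding accepts both Int and Long indices:

```python
import torch

emb = torch.nn.Embedding(49408, 8)
shapes = []
for dtype in (torch.int32, torch.int64):   # Int and Long are both valid index types
    tokens = torch.full((1, 77), 10, dtype=dtype)
    shapes.append(tuple(emb(tokens).shape))
print(shapes)
```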

@is-sixfive
Author

is-sixfive commented Dec 4, 2023

./pnnx FrozenCLIPEmbedder/FrozenCLIPEmbedder.pt inputshape=[1,77]i32

I got a new problem after converting:

My input:

tensor([[49406,  3306,   267,  1002,   256, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407]])

The TorchScript model (FrozenCLIPEmbedder.pt) outputs:

tensor([[[-0.3884,  0.0229, -0.0522,  ..., -0.4899, -0.3066,  0.0675],
         [ 1.4503,  0.1754, -1.5768,  ..., -0.6718, -0.6458, -0.3667],
         [-0.2894, -0.1653,  0.6982,  ..., -0.4293, -0.2959, -0.4589],
         ...,
         [ 1.5088, -0.4841, -0.7476,  ...,  1.1054, -0.7395, -0.0396],
         [ 1.5248, -0.4859, -0.7393,  ...,  1.1230, -0.7272, -0.0448],
         [ 1.5190, -0.3933, -0.6825,  ...,  1.0893, -0.7731, -0.0805]]])

Converting the .pt model to ncnn with ./pnnx FrozenCLIPEmbedder/FrozenCLIPEmbedder.pt inputshape=[1,77]i32 fp16=0, the ncnn output is:

tensor([[[-0.3884,  0.0229, -0.0522,  ..., -0.4899, -0.3066,  0.0675],
         [-0.6827, -0.8874,  1.5381,  ..., -2.3455,  0.6647, -0.1998],
         [ 1.3404, -0.1230, -1.4677,  ..., -0.3110, -1.4932,  0.3119],
         ...,
         [ 0.9224, -1.0389, -0.1586,  ..., -0.3814, -1.3504,  0.2904],
         [ 0.7440, -0.5869, -0.9922,  ..., -0.3876, -0.8899, -0.4330],
         [ 0.9144, -0.6212, -0.0662,  ..., -0.3169, -1.3071,  0.1817]]])

Only the first row of the output matches: [-0.3884, 0.0229, -0.0522, ..., -0.4899, -0.3066, 0.0675]
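To quantify such a mismatch beyond eyeballing rows, per-token cosine similarity between the two outputs shows exactly where they diverge (a generic diagnostic sketch; ref and test here are random stand-ins for the two tensors above, not the actual results):

```python
import torch

torch.manual_seed(0)
# stand-ins for the two outputs above: identical first row, divergent rest
ref = torch.randn(1, 77, 768)
test = ref.clone()
test[:, 1:, :] = torch.randn(1, 76, 768)

# cosine similarity per token position pinpoints where the outputs drift apart
cos = torch.nn.functional.cosine_similarity(ref, test, dim=-1)
first_row_matches = bool(cos[0, 0] > 0.999)
rest_diverges = bool(cos[0, 1:].abs().max() < 0.5)
print(first_row_matches, rest_diverges)
```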
