PNNX to ncnn conversion fails; what is the correct usage of inputshape2? :( #3480

Closed
Always-Naive opened this issue Jan 5, 2022 · 6 comments

Comments

@Always-Naive

error log


1. I tried pt.tar -> onnx -> simplifier -> ncnn; the output was:
Gather not supported yet!
# axis=2
Gather not supported yet!
# axis=1
Expand not supported yet!
ScatterND not supported yet!
Gather not supported yet!
# axis=2
Gather not supported yet!
# axis=1
Expand not supported yet!
ScatterND not supported yet!
Less not supported yet!
NonZero not supported yet!
GatherND not supported yet!
Less not supported yet!
Not not supported yet!
NonZero not supported yet!
GatherND not supported yet!
Expand not supported yet!
Tile not supported yet!
Expand not supported yet!
Tile not supported yet!
Expand not supported yet!
Tile not supported yet!
Expand not supported yet!
Tile not supported yet!
Less not supported yet!
NonZero not supported yet!
GatherND not supported yet!
Less not supported yet!
Not not supported yet!
NonZero not supported yet!
GatherND not supported yet!
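There are many unsupported ops. I am not experienced enough to fix them myself, so I switched to pnnx to try my luck.

pnnx does not quite work either.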
Scenario1:
./pnnx pnnx.pt inputshape=[1,3,120,120] inputshape2=[1,62]
pnnxparam = pnnx.pnnx.param
pnnxbin = pnnx.pnnx.bin
pnnxpy = pnnx_pnnx.py
ncnnparam = pnnx.ncnn.param
ncnnbin = pnnx.ncnn.bin
ncnnpy = pnnx_ncnn.py
optlevel = 2
device = cpu
inputshape = [1,3,120,120]f32
inputshape2 = [1,62]f32
customop =
moduleop =
############# pass_level0
inline module = backbone_nets.mobilenetv2_backbone.ConvBNReLU
inline module = backbone_nets.mobilenetv2_backbone.InvertedResidual
inline module = backbone_nets.mobilenetv2_backbone.MobileNetV2
inline module = backbone_nets.pointnet_backbone.MLP_for
inline module = backbone_nets.pointnet_backbone.MLP_rev
inline module = loss_definition.ParamLoss
inline module = loss_definition.WingLoss
inline module = model_building.I2P
terminate called after throwing an instance of 'c10::Error'
what(): forward() is missing value for argument 'target'. Declaration: forward(torch.model_building.SynergyNet self, Tensor input, Tensor target) -> (Dict(str, Tensor))
Exception raised from checkAndNormalizeInputs at ../aten/src/ATen/core/function_schema_inl.h:261 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f72cd9b5d62 in /home/dagu/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5b (0x7f72cd9b268b in /home/dagu/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0x1279e04 (0x7f72b8423e04 in /home/dagu/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #3: torch::jit::GraphFunction::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::string, c10::IValue, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, c10::IValue> > > const&) + 0x2d (0x7f72baa7534d in /home/dagu/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #4: torch::jit::Method::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::string, c10::IValue, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, c10::IValue> > > const&) const + 0x14d (0x7f72baa8465d in /home/dagu/.local/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0x80b0a (0x562d434dab0a in ./pnnx)
frame #6: <unknown function> + 0x7ec64 (0x562d434d8c64 in ./pnnx)
frame #7: <unknown function> + 0x63363 (0x562d434bd363 in ./pnnx)
frame #8: <unknown function> + 0x32fc4 (0x562d4348cfc4 in ./pnnx)
frame #9: __libc_start_main + 0xe7 (0x7f72705f4bf7 in /lib/x86_64-linux-gnu/libc.so.6)
frame #10: <unknown function> + 0x313ea (0x562d4348b3ea in ./pnnx)

Aborted (core dumped)

Scenario2:
./pnnx pnnx.pt inputshape=[1,3,120,120],[1,62]
pnnxparam = pnnx.pnnx.param
pnnxbin = pnnx.pnnx.bin
pnnxpy = pnnx_pnnx.py
ncnnparam = pnnx.ncnn.param
ncnnbin = pnnx.ncnn.bin
ncnnpy = pnnx_ncnn.py
optlevel = 2
device = cpu
inputshape = [1,3,120,120]f32,[1,62]f32
inputshape2 =
customop =
moduleop =
############# pass_level0
inline module = backbone_nets.mobilenetv2_backbone.ConvBNReLU
inline module = backbone_nets.mobilenetv2_backbone.InvertedResidual
inline module = backbone_nets.mobilenetv2_backbone.MobileNetV2
inline module = backbone_nets.pointnet_backbone.MLP_for
inline module = backbone_nets.pointnet_backbone.MLP_rev
inline module = loss_definition.ParamLoss
inline module = loss_definition.WingLoss
inline module = model_building.I2P
terminate called after throwing an instance of 'std::runtime_error'
what(): The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/backbone_nets/mobilenetv2_backbone.py", line 35, in forward
_8 = getattr(self, "2")
_9 = getattr(self, "1")
_10 = (getattr(self, "0")).forward(input, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
return (_8).forward((_9).forward(_10, ), )
class InvertedResidual(Module):
File "code/torch/torch/nn/modules/conv.py", line 9, in forward
def forward(self: torch.torch.nn.modules.conv.Conv2d,
input: Tensor) -> Tensor:
input0 = torch._convolution(input, self.weight, None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1, True, False, True, True)
~~~~~~~~~~~~~~~~~~ <--- HERE
return input0
class Conv1d(Module):

Traceback of TorchScript, original code (most recent call last):
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/conv.py(439): _conv_forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/conv.py(443): forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/container.py(139): forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/container.py(139): forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/home/dagu/Desktop/SynergyNet-main/backbone_nets/mobilenetv2_backbone.py(177): _forward_impl
/home/dagu/Desktop/SynergyNet-main/backbone_nets/mobilenetv2_backbone.py(192): forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/home/dagu/Desktop/SynergyNet-main/model_building.py(48): forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/home/dagu/Desktop/SynergyNet-main/model_building.py(135): forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/jit/_trace.py(952): trace_module
/home/dagu/anaconda3/envs/snet/lib/python3.8/site-packages/torch/jit/_trace.py(735): trace
test.py(63): pth_to_onnx
test.py(79): <module>
RuntimeError: Input type (CPUFloatType) and weight type (CUDAFloatType) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

Aborted (core dumped)

model

  1. original model
    It is the model from this repo
    This one contains both the already converted onnx and pnnx.pt files

how to reproduce

  1. As described above. I would appreciate it if someone could point me in the right direction.

Please help!

@nihui
Member

nihui commented Jan 5, 2022

If the model has two inputs:

./pnnx pnnx.pt inputshape=[1,3,120,120],[1,62]

inputshape2 is meant for dynamic shapes.
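To make the argument format concrete, here is a minimal sketch of a helper that assembles the command line. The function names `format_shapes` and `pnnx_command` are hypothetical and not part of pnnx; the only assumptions about pnnx are the two forms that appear in this thread: several inputs go into one comma-separated `inputshape=` list, and `inputshape2=` carries a second, alternative set of shapes when the model should accept dynamic sizes.

```python
def format_shapes(shapes):
    """Join shapes into the bracketed, comma-separated form, e.g. [1,3,120,120],[1,62]."""
    return ",".join("[" + ",".join(str(d) for d in shape) + "]" for shape in shapes)


def pnnx_command(model_path, shapes, dynamic_shapes=None):
    """Build the argument list for one pnnx run (hypothetical helper).

    shapes: one shape per model input, in forward() order.
    dynamic_shapes: optional alternative shapes for the same inputs,
    passed as inputshape2 (assumed to be needed only for dynamic shape).
    """
    args = ["./pnnx", model_path, "inputshape=" + format_shapes(shapes)]
    if dynamic_shapes:
        args.append("inputshape2=" + format_shapes(dynamic_shapes))
    return args


# Two inputs with fixed shapes: everything goes into inputshape.
cmd = pnnx_command("pnnx.pt", [(1, 3, 120, 120), (1, 62)])
print(" ".join(cmd))
```

The list returned by `pnnx_command` could be handed to `subprocess.run` as-is; passing a list rather than a single string keeps the shell from interpreting the square brackets.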

@Always-Naive
Author

I tried that; see Scenario2 above. It still fails :(

@nihui
Member

nihui commented Jan 5, 2022

Also, trace the TorchScript with the model on CPU; otherwise it cannot run on a machine without an NVIDIA GPU.
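A minimal sketch of CPU tracing follows. `TinyNet` is a hypothetical stand-in, not the SynergyNet model from this issue; it only mimics the two input shapes used here. The relevant steps are moving the weights to the CPU with `.cpu()` and creating the example inputs on the CPU before calling `torch.jit.trace`, so that no CUDA tensors are baked into the saved file.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in model taking an image and a 62-dim vector."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)
        self.fc = nn.Linear(8 + 62, 62)

    def forward(self, image, target):
        feat = torch.relu(self.conv(image)).mean(dim=(2, 3))  # global average pool
        return self.fc(torch.cat([feat, target], dim=1))


# Move the weights to the CPU and switch to eval mode before tracing.
model = TinyNet().cpu().eval()

# Example inputs are created on the CPU and match the shapes given to pnnx.
image = torch.rand(1, 3, 120, 120)
target = torch.rand(1, 62)

traced = torch.jit.trace(model, (image, target))

# The output path is an arbitrary choice for this sketch.
out_path = os.path.join(tempfile.gettempdir(), "pnnx_cpu.pt")
traced.save(out_path)

# Every parameter of the traced module should now live on the CPU.
devices = {p.device.type for p in traced.parameters()}
print(devices)
```

For a checkpoint that was trained on a GPU, loading it with `torch.load(path, map_location="cpu")` before building the model is another way to keep CUDA tensors out of the trace.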

@Always-Naive
Author

@nihui After switching to CPU I seem to be one step closer to success, but now a segmentation fault has crushed me. Is there anything else I could try?
Model

pnnxparam = pnnx.pnnx.param
pnnxbin = pnnx.pnnx.bin
pnnxpy = pnnx_pnnx.py
ncnnparam = pnnx.ncnn.param
ncnnbin = pnnx.ncnn.bin
ncnnpy = pnnx_ncnn.py
optlevel = 2
device = cpu
inputshape = [1,3,120,120]f32,[1,62]f32
inputshape2 =
customop =
moduleop =
############# pass_level0
inline module = backbone_nets.mobilenetv2_backbone.ConvBNReLU
inline module = backbone_nets.mobilenetv2_backbone.InvertedResidual
inline module = backbone_nets.mobilenetv2_backbone.MobileNetV2
inline module = backbone_nets.pointnet_backbone.MLP_for
inline module = backbone_nets.pointnet_backbone.MLP_rev
inline module = loss_definition.ParamLoss
inline module = loss_definition.WingLoss
inline module = model_building.I2P
############# pass_level1
no attribute value
unknown Parameter value kind prim::Constant
no attribute value
no attribute value
unknown Parameter value kind prim::Constant
no attribute value
unknown Parameter value kind prim::Constant
no attribute value
no attribute value
no attribute value
no attribute value
no attribute value
no attribute value
no attribute value
unknown Parameter value kind prim::Constant
no attribute value
unknown Parameter value kind prim::Constant
no attribute value
no attribute value
no attribute value
no attribute value
no attribute value
no attribute value
############# pass_level2
############# pass_level3
############# pass_level4
############# pass_level5
make_slice_expression input 10
make_slice_expression input 11
make_slice_expression input 175
make_slice_expression input 175
make_slice_expression input 175
make_slice_expression input 177
make_slice_expression input 10
make_slice_expression input 11
make_slice_expression input 194
make_slice_expression input 194
make_slice_expression input 194
make_slice_expression input 196
make_slice_expression input 171
make_slice_expression input 172
make_slice_expression input 171
make_slice_expression input 172
make_slice_expression input 171
make_slice_expression input 171
make_slice_expression input 334
make_slice_expression input 172
make_slice_expression input 334
make_slice_expression input 171
############# pass_ncnn
Segmentation fault (core dumped)

@nihui
Member

nihui commented Jan 5, 2022

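I fixed the final crash.
When exporting the TorchScript, the training loss computation should not be exported along with it.

For example, pass_level0 lists every part that can be treated as a moduleop. Two of them are losses; pass those in the moduleop parameter.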

############# pass_level0
inline module = backbone_nets.mobilenetv2_backbone.ConvBNReLU
inline module = backbone_nets.mobilenetv2_backbone.InvertedResidual
inline module = backbone_nets.mobilenetv2_backbone.MobileNetV2
inline module = backbone_nets.pointnet_backbone.MLP_for
inline module = backbone_nets.pointnet_backbone.MLP_rev
inline module = loss_definition.ParamLoss
inline module = loss_definition.WingLoss
inline module = model_building.I2P
./pnnx pnnx_cpu.pt inputshape=[1,3,120,120],[1,62] moduleop=loss_definition.ParamLoss,loss_definition.WingLoss
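As a rough illustration of how the `moduleop` value relates to the pass_level0 output, the sketch below collects the `inline module = ...` names from a captured log and keeps those under a chosen prefix. The helper names `inline_modules` and `moduleop_arg` are hypothetical, and selecting by the `loss_definition.` prefix is an assumption that happens to fit this model; for another model the list would need to be picked by hand.

```python
PASS_LEVEL0_LOG = """\
############# pass_level0
inline module = backbone_nets.mobilenetv2_backbone.ConvBNReLU
inline module = backbone_nets.mobilenetv2_backbone.InvertedResidual
inline module = backbone_nets.mobilenetv2_backbone.MobileNetV2
inline module = backbone_nets.pointnet_backbone.MLP_for
inline module = backbone_nets.pointnet_backbone.MLP_rev
inline module = loss_definition.ParamLoss
inline module = loss_definition.WingLoss
inline module = model_building.I2P
"""


def inline_modules(log_text):
    """Return the module names from 'inline module = <name>' lines, in order."""
    names = []
    for line in log_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "inline module":
            names.append(value.strip())
    return names


def moduleop_arg(log_text, prefix="loss_definition."):
    """Build a moduleop=... argument from modules whose name starts with prefix."""
    picked = [name for name in inline_modules(log_text) if name.startswith(prefix)]
    return "moduleop=" + ",".join(picked)


arg = moduleop_arg(PASS_LEVEL0_LOG)
print(arg)
```

For the log in this thread, the printed value matches the `moduleop` argument in the command above.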

@Always-Naive
Author

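You are amazing. I do not fully understand it, but it is impressive. A miracle worker who brought this back to life, a modern-day Lei Feng. I am moved to tears and endlessly grateful.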
