Converting .pdmodel to .nb fails with error "Check failed: it != outputs_.end():" #10511

Closed
RRRRUSH opened this issue May 7, 2024 · 18 comments


RRRRUSH commented May 7, 2024

  • System Environment: Windows 11
  • Version: Paddle-Lite 2.9.0
  • Command Code: the error occurs when running the following convert script
# Import the Paddle-Lite prediction library
from paddlelite.lite import Opt

# 1. Create an Opt instance
opt = Opt()
# 2. Specify the input model files
opt.set_model_file("model.pdmodel")
opt.set_param_file("model.pdiparams")
# 3. Specify the target backend: arm, x86, opencl, npu
opt.set_valid_places("arm")
# 4. Specify the output model format: naive_buffer, protobuf
opt.set_model_type("naive_buffer")
# 5. Specify the output model path
opt.set_optimize_out("model_opt.nb")
# 6. Run the model optimization
opt.run()
  • Complete Error Message:

    Loading topology data from model.pdmodel
    Loading params data from model.pdiparams

    1. Model is successfully loaded!
      [F 5/ 7 16: 9:41. 17 ...ite\lite\model_parser\general\op_desc.cc:63 paddle::lite::general::OpDesc::Output] Check failed: it != outputs_.end():

    Process finished with exit code -1073740791 (0xC0000409)

Collaborator

csy0225 commented May 8, 2024

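Judging from this error, the model file is probably incompatible with your Paddle-Lite version. The Paddle-Lite version you are using is too old. I suggest trying the latest code branch or release/v2.13, which should solve your problem.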

Author

RRRRUSH commented May 8, 2024

After updating to version 2.13, the error is as follows:

Loading topology data from model.pdmodel
Loading params data from model.pdiparams

  1. Model is successfully loaded!

Process finished with exit code -1073741819 (0xC0000005)
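
For reference, the two exit codes reported so far are signed 32-bit views of Windows NTSTATUS values. The following is a minimal sketch of decoding them; the lookup table covers only the two values seen in this thread, and the helper names are made up for illustration. 0xC0000005 is an access violation. 0xC0000409 is typically seen when the runtime terminates the process through a fast-fail, which would be consistent with a fatal check, although that reading is an assumption.

```python
# NTSTATUS values for the two exit codes that appear in this thread.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit process exit code as an unsigned value."""
    return exit_code & 0xFFFFFFFF


def describe_exit_code(exit_code: int) -> str:
    """Format an exit code as hex plus its NTSTATUS name, if known."""
    status = to_ntstatus(exit_code)
    name = NTSTATUS_NAMES.get(status, "unknown status")
    return f"0x{status:08X} ({name})"


for code in (-1073740791, -1073741819):
    print(code, "->", describe_exit_code(code))
```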

Author

RRRRUSH commented May 8, 2024

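Because the process exits immediately, I cannot debug it, and there is no error message. Older Paddle-Lite versions show the error in the title, while newer versions show nothing at all. The network was developed with PaddlePaddle 2.6.0, and I don't know which part Paddle-Lite does not support.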
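
When a native crash kills the interpreter like this, one option is to run the conversion in a child process so that the parent survives and can record the exit code and any output. The sketch below uses only the Python standard library. `run_isolated` is a hypothetical helper name, and `convert.py` in the usage comment is an assumed file name for the conversion script. The `-X faulthandler` flag may print a Python-level traceback for some fatal errors, though it is not guaranteed to catch every kind of native abort.

```python
import subprocess
import sys
from typing import Sequence, Tuple


def run_isolated(python_args: Sequence[str], timeout: float = 600.0) -> Tuple[int, str, str]:
    """Run a child Python interpreter with faulthandler enabled.

    Returns (returncode, stdout, stderr), so a native crash in the child
    does not take down the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", *python_args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Usage (assumes the conversion script is saved as convert.py):
#   code, out, err = run_isolated(["convert.py"])
#   print("exit code:", code)
#   print(err)
```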

Collaborator

csy0225 commented May 8, 2024

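Yes, that's right. The Paddle version is too new, and Paddle-Lite has probably not been adapted to it somewhere. There are three options: 1. Try re-exporting the model with an older version of Paddle. 2. Add some print statements inside Paddle-Lite and debug it, although this demands more of the developer. 3. If the model is fairly small, you can also prune the model to see which operator differs.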
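
The third suggestion (pruning the model to find the offending operator) can be organized as a binary search over how many leading layers are exported. The sketch below is only an illustration under stated assumptions: `converts_ok` is a hypothetical callback that exports the first `n` layers and reports whether conversion succeeds, and failures are assumed to be monotonic, meaning every prefix that contains the bad operator fails.

```python
from typing import Callable


def first_failing_prefix(num_layers: int, converts_ok: Callable[[int], bool]) -> int:
    """Return the smallest n in [1, num_layers] whose n-layer prefix fails.

    Returns 0 if the full model converts. Assumes that once a prefix
    fails, every longer prefix fails too.
    """
    if converts_ok(num_layers):
        return 0
    lo, hi = 1, num_layers  # invariant: the prefix of length hi fails
    while lo < hi:
        mid = (lo + hi) // 2
        if converts_ok(mid):
            lo = mid + 1  # the offending layer lies after mid
        else:
            hi = mid      # the offending layer is at or before mid
    return lo


# Stand-in callback for demonstration: pretend layer 4 breaks conversion.
print(first_failing_prefix(9, lambda n: n < 4))
```

Because a failed conversion here kills the process rather than raising an exception, a real `converts_ok` would likely need to run each attempt in a separate child process and inspect its exit code.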

Author

RRRRUSH commented May 8, 2024

import paddle
import paddle.nn as nn

class Block(nn.Layer):
    def __init__(self, in_channels=3, out_channels=3, kernel_size=3, *args):
        super(Block, self).__init__()
        self.nn = nn.Sequential(
            nn.Conv1D(
                in_channels=in_channels,    # input channels
                out_channels=out_channels,  # output channels
                kernel_size=kernel_size,    # convolution kernel size
                padding=0,                  # padding size
                bias_attr=False             # whether to include a bias parameter
                ),
            nn.BatchNorm1D(num_features=out_channels),
            nn.LeakyReLU(.2),
            nn.Dropout(p=.2)
        )

    def forward(self, x):
        return self.nn(x)

class FallNet(nn.Layer):
    def __init__(self, in_channels=3, out_classes=5, hid=64, num=64):

        super(FallNet, self).__init__()

        self.cnn0 = Block(in_channels,hid,7,0)
        self.cnn1 = Block(hid,hid,5,0)
        self.cnn2 = Block(hid,hid,3,0)
        self.cnn3 = Block(hid,hid,1,0)

        self.avg = nn.AdaptiveAvgPool1D(output_size=num)

        self.rnn0 = nn.GRU(input_size=145, hidden_size=num, num_layers=1, dropout=0.2)
        self.rnn1 = Block(hid,hid,1,0)
        self.rnn2 = Block(hid,4,3,0)

        self.cls = nn.Sequential(
            nn.Linear(in_features=1016, out_features=128),
            nn.Dropout(p=.2),
            nn.Linear(in_features=128, out_features=out_classes),
            nn.Softmax(axis=1)
        )

    def forward(self, x):
        x = self.cnn0(x)
        y = self.cnn1(x)
        y1 = self.avg(y)

        y = self.cnn2(y)
        y2 = self.avg(y)

        y = self.cnn3(y)
        y3 = self.avg(y)

        r,t = self.rnn0(x)

        x = paddle.concat([y1,y2,y3,r], axis=-1)

        x = self.rnn1(x)
        x = self.rnn2(x)
        x = paddle.flatten(x, start_axis=1)

        x = self.cls(x)
        return x
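This is the full network code, taken from a project in the AI Studio community. I am still learning and my skills are limited, so I don't know whether a flaw in the network design is causing the bug above. If possible, I would appreciate your guidance.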

Author

RRRRUSH commented May 8, 2024

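The goal is to classify five human activities from a phone accelerometer's x-, y-, and z-axis data. Each training sample contains 151 instantaneous acceleration readings for each of the three axes. The network above was developed from a community project and open-source code on GitHub. Training and eval both work without problems; the failure occurs during model conversion.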
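
As a sanity check on the network design, the layer sizes in the posted code can be traced by hand for an input of 151 samples per axis. The sketch below is plain arithmetic and does not require Paddle. It assumes stride 1, no padding, and that the GRU treats the output of cnn0 as (batch, time, features); the function names are made up for illustration. Under those assumptions the hard-coded values 145 and 1016 in the model are consistent with the input length.

```python
def conv1d_out_len(length: int, kernel_size: int, padding: int = 0, stride: int = 1) -> int:
    """Output length of a 1-D convolution without dilation."""
    return (length + 2 * padding - kernel_size) // stride + 1


def fallnet_flatten_size(seq_len: int = 151, num: int = 64) -> int:
    """Trace FallNet's sequence lengths up to the flatten before the classifier."""
    l0 = conv1d_out_len(seq_len, 7)  # cnn0: 151 -> 145, matches GRU input_size=145
    assert l0 == 145
    # Each of the three pooled branches has length num (AdaptiveAvgPool1D),
    # and the GRU output contributes hidden_size = num features.
    concat_len = 3 * num + num            # 256 along the last axis
    l1 = conv1d_out_len(concat_len, 1)    # rnn1, kernel 1: 256
    l2 = conv1d_out_len(l1, 3)            # rnn2, kernel 3: 254
    return 4 * l2                         # rnn2 has 4 output channels


print(fallnet_flatten_size())  # matches Linear(in_features=1016)
```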

Collaborator

csy0225 commented May 8, 2024

You don't need to modify the network code. First switch Paddle to version 2.5 or 2.4, then re-export the model and try again.

Author

RRRRUSH commented May 8, 2024

Versions 2.4 and 2.5 still produce an error:

Loading topology data from test.pdmodel
Loading params data from test.pdiparams
1. Model is successfully loaded!

Process finished with exit code -1073740791 (0xC0000409)

Collaborator

csy0225 commented May 8, 2024

If it is convenient, could you upload your Paddle model? I will try it on my side.

Author

RRRRUSH commented May 8, 2024

model.zip
This is the exported model.

Collaborator

csy0225 commented May 9, 2024

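Hello, I traced this to the fourth convolution in your model, which triggers a bug on our side: sparse_conv_pass hits a segmentation fault while processing it. I have disabled that pass for you for now and rebuilt an opt tool from the develop branch, which is in the attachment. You can use this opt tool to regenerate the nb model; I have tested it myself and it works. Note: the attachment contains the already converted nb model and the newly built opt tool, both built from the develop branch. Please try them first. If you need a release build, I can compile one for you, but they should be compatible.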

Collaborator

csy0225 commented May 9, 2024

issue.zip

Author

RRRRUSH commented May 9, 2024

After deploying the converted .nb to Android, the following error appears and the .nb file cannot be opened. Could the libpaddle_lite_jni.so file be the problem?

Abort message: '[F  5/ 9 20: 3: 6.399 ...d/Paddle-Lite/lite/core/model/base/io.cc:46 BinaryFileReader] Check failed: file_: Unable to open file: /data/user/0/com.lhz.prolbs/cache/paddle/model.nb
'
2024-05-09 20:03:06.993 23690-23690 DEBUG         crash_dump32            A        #01 pc 0007a0ad  /data/app/~~9NEiVuHbUtKwMUZDSz0D6g==/com.lhz.prolbs-xx7Xg5gjQHcRF9fXYQ6F3g==/lib/arm/libpaddle_lite_jni.so (BuildId: 99342f670b057b9b0372e8d820eb3f64265d6b05)

Collaborator

csy0225 commented May 9, 2024

I rebuilt it with release/v2.13. Please try again.
issue_v2.tar.gz

Author

RRRRUSH commented May 9, 2024

The following error still occurs:

 A  [F  5/ 9 22:17:53.353 ...odel_parser/naive_buffer/naive_buffer.cc:62 LoadFromFile] Check failed: fp: Unable to open file: /data/user/0/com.lhz.prolbs/cache/model.nb
2024-05-09 22:17:53.353 16263-16263 libc                    com.lhz.prolbs                       A  FORTIFY: fseeko: null FILE*
2024-05-09 22:17:53.354 16263-16263 libc                    com.lhz.prolbs                       A  Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 16263 (com.lhz.prolbs), pid 16263 (com.lhz.prolbs)
2024-05-09 22:17:53.609 16612-16612 DEBUG                   crash_dump32                         A  Cmdline: com.lhz.prolbs
2024-05-09 22:17:53.609 16612-16612 DEBUG                   crash_dump32                         A  pid: 16263, tid: 16263, name: com.lhz.prolbs  >>> com.lhz.prolbs <<<
2024-05-09 22:17:53.609 16612-16612 DEBUG                   crash_dump32                         A  Abort message: '[F  5/ 9 22:17:53.353 ...odel_parser/naive_buffer/naive_buffer.cc:62 LoadFromFile] Check failed: fp: Unable to open file: /data/user/0/com.lhz.prolbs/cache/model.nb
                                                                                                    '
2024-05-09 22:17:53.609 16612-16612 DEBUG                   crash_dump32                         A        #03 pc 001819ab  /data/app/~~3hYhbkk4dT1HTuMR3tn-gw==/com.lhz.prolbs-CVj6vnBdtOKNeSoXBYnIAA==/lib/arm/libpaddle_lite_jni.so
2024-05-09 22:17:53.696  2778-2914  DollieAdapterService    com.huawei.systemserver              E  notifyActivityState pkg:com.lhz.prolbs/com.lhz.prolbs.MainActivity state:19 fg:false mUid:10530

Author

RRRRUSH commented May 9, 2024

I tried several different versions of libpaddle_lite_jni.so, and the error is the same: Unable to open file.

Author

RRRRUSH commented May 9, 2024

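There were some problems in the Android code itself. After fixing them, the error is as follows. This looks like a Paddle-Lite version problem on the Android side, right?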

                                                                                                    warning: the version of opt that transformed this model is not consistent with current Paddle-Lite version.
                                                                                                          version of opt:cd09a8e01
                                                                                                          version of current Paddle-Lite:v2.10
2024-05-09 22:53:59.387  2701-2701  libc                    com.lhz.prolbs                       A  Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x4 in tid 2701 (com.lhz.prolbs), pid 2701 (com.lhz.prolbs)
2024-05-09 22:54:00.194  3234-3234  DEBUG                   pid-3234                             A  Cmdline: com.lhz.prolbs
2024-05-09 22:54:00.194  3234-3234  DEBUG                   pid-3234                             A  pid: 2701, tid: 2701, name: com.lhz.prolbs  >>> com.lhz.prolbs <<<
2024-05-09 22:54:00.194  3234-3234  DEBUG                   pid-3234                             A        #00 pc 00216834  /data/app/~~qWZP-5S3Gvt30qtt7dqgmg==/com.lhz.prolbs-S9o8myZgnUXOSqj6De4vdg==/lib/arm/libpaddle_lite_jni.so (BuildId: 99342f670b057b9b0372e8d820eb3f64265d6b05)
2024-05-09 22:54:00.265  2778-2914  DollieAdapterService    com.huawei.systemserver              E  notifyActivityState pkg:com.lhz.prolbs/com.lhz.prolbs.MainActivity state:19 fg:false mUid:10530
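
The warning at the top of this log names two versions: the opt build that produced the model and the Paddle-Lite runtime that loads it. A small sketch of extracting both from captured log text is shown below. The function name is hypothetical, and the regular expressions assume the exact wording of the warning pasted above.

```python
import re
from typing import Optional, Tuple


def parse_opt_version_warning(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (opt_version, runtime_version) from the mismatch warning, or None."""
    opt = re.search(r"version of opt:\s*(\S+)", log_text)
    runtime = re.search(r"version of current Paddle-Lite:\s*(\S+)", log_text)
    if not (opt and runtime):
        return None
    return opt.group(1), runtime.group(1)


sample = """warning: the version of opt that transformed this model is not consistent with current Paddle-Lite version.
      version of opt:cd09a8e01
      version of current Paddle-Lite:v2.10
"""
versions = parse_opt_version_warning(sample)
if versions is not None:
    print("opt:", versions[0], "runtime:", versions[1])
```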

Author

RRRRUSH commented May 9, 2024

After switching to the release/v2.13 libpaddle_lite_jni.so, the program runs successfully. Thank you very much for your help!

csy0225 closed this as completed May 10, 2024