Error running PP-YOLOE-R inference on a Huawei NPU: CANN version mismatch reported #10514
Comments
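Check whether your environment is installed correctly. Judging from the error log, the device was not found; the other messages are warnings and do not affect execution. [F 5/ 9 20:23:58.112 ...driver/huawei_ascend_npu/model_client.cc:54 InitAclClientEnv] Check failed: (reinterpret_cast(aclrtSetDevice(device_id_)) == ACL_ERROR_NONE): 507033!==0 507033 Unknown ACL error code(507033)
OK, thank you. Is this device error caused by the Huawei environment, or by the Paddle Lite or FastDeploy environment? I would like to know which one to troubleshoot.
This is most likely a problem with your Ascend installation itself. Check the driver installation.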
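One way to check the driver side before looking at Paddle Lite or FastDeploy is to ask the driver tool whether it can see the card at all. The sketch below assumes the `npu-smi` utility that ships with the Ascend driver is on `PATH`; the helper name `npu_smi_report` is hypothetical and only wraps the `npu-smi info` command.

```python
import shutil
import subprocess


def npu_smi_report(tool="npu-smi"):
    """Return the output of `<tool> info`, or None if the tool is not on PATH."""
    exe = shutil.which(tool)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "info"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


if __name__ == "__main__":
    report = npu_smi_report()
    if report is None:
        print("npu-smi not found on PATH; the Ascend driver may not be installed")
    else:
        print(report)
```

If the tool is missing or lists no device, the `aclrtSetDevice` failure is probably unrelated to Paddle Lite or FastDeploy.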
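Hello, we have not adapted to CANN 8.x yet; the highest version we have verified is 6.x, so you may need to do the adaptation yourself.
While debugging, you can also enable the Ascend logs and check the log output.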
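As a rough sketch of what enabling the Ascend logs could look like: the environment variable names `ASCEND_GLOBAL_LOG_LEVEL` and `ASCEND_SLOG_PRINT_TO_STDOUT` are taken from the CANN documentation as I recall it and are not confirmed anywhere in this thread, so check them against the documentation for your installed CANN version. The helper name `ascend_debug_env` is hypothetical.

```python
import os


def ascend_debug_env(base_env=None, level="1"):
    """Copy an environment and add the (assumed) Ascend logging switches."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumed meaning of the level: 0=debug, 1=info, 2=warning, 3=error.
    env["ASCEND_GLOBAL_LOG_LEVEL"] = level
    # Assumed switch that mirrors Ascend logs to the terminal.
    env["ASCEND_SLOG_PRINT_TO_STDOUT"] = "1"
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching the inference script, which leaves the parent shell's environment untouched.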
Can you confirm that your FastDeploy is calling the Paddle Lite you compiled? Do you need to set export LD_LIBRARY_PATH? It looks like a version mismatch that prevents the .so from being found.
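To see which copy of a shared library the loader would find first, you can walk `LD_LIBRARY_PATH` by hand. This is a minimal sketch; `find_shared_library` is a hypothetical helper, and the library file names in the usage note are assumptions about what a Paddle Lite NNAdapter build produces, so substitute the names from your own build output.

```python
import os


def find_shared_library(name, search_path=None):
    """List, in order, every directory on an LD_LIBRARY_PATH-style string that contains `name`."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing, or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, name)):
            hits.append(directory)
    return hits
```

For example, `find_shared_library("libnnadapter.so")` (file name assumed) returning more than one directory would suggest that an older copy may be shadowing the one you just compiled; an empty list means the library is not reachable through `LD_LIBRARY_PATH` at all.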
You can try running our official demo to rule out interference from FastDeploy. If the official demo works, then integrate with FastDeploy.
Hello, I downloaded PaddleLite-generic-demo.tar.gz and used My Paddle Lite was downloaded from GitHub at the time and then built with the command
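For the demo you are not running through Python, so just specify the LD_LIBRARY_PATH path manually by editing the run.sh script. There is no need to uninstall anything.
Which path do you mean by LD_LIBRARY_PATH, and which part of run.sh needs to change?
Part of the demo's error output is below; the first few lines are the same as the FastDeploy error: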
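If the Ascend environment variables are set correctly and the error still occurs, then separate adaptation work is needed. Enable the Ascend logs to find the cause.
Hello, this problem looks hard to solve; the Huawei 8.0rc1 software may simply be difficult to support. I plan to start over: uninstall Paddle Lite completely, downgrade the Ascend software to 6.0.RC1, and test the demo again. I have a few questions for you.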
Host environment: Ubuntu 18.04
Huawei NPU: A300Ipro, driver and firmware installed; CANN version 5.1.RC1.alpha001
Steps
1. Build Paddle Lite with the following command:
./lite/tools/build_linux.sh --arch=x86 --with_extra=ON --with_log=ON --with_exception=ON --with_nnadapter=ON --nnadapter_with_huawei_ascend_npu=ON --nnadapter_huawei_ascend_npu_sdk_root=/usr/local/Ascend/ascend-toolkit/latest full_publish
2. Then set up the FastDeploy environment and build the whl
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/python
export WITH_ASCEND=ON
export ENABLE_VISION=ON
python setup.py build
python setup.py bdist_wheel
3. Go to the following path and copy in the prepared Paddle model and test image; the model is PP-YOLOE-R
Path: FastDeploy/examples/vision/detection/paddledetection/python/
4. Run inference
python infer_ppyoloe_r.py --model_dir ppyoloe_r_crn_l_3x_dota --image P0861__1.0__1154___824.png --device ascend
Part of the error output is as follows:
[INFO] fastdeploy/runtime/runtime.cc(354)::CreateLiteBackend Runtime initialized with Backend::PDLITE in Device::ASCEND.
2222
[W 5/ 9 20:23:15.461 .../src/driver/huawei_ascend_npu/utility.cc:57 InitializeAscendCL] CANN version mismatch. The build version is 5.1.0, but the current environment version is 5.1.2.
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:41 Context] properties:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:66 Context] selected device ids:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:68 Context] 0
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:78 Context] profiling path:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:88 Context] dump model path:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:98 Context] precision mode:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:120 Context] op select impl mode:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:130 Context] op type list for impl mode:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:140 Context] enable compressw weight:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:150 Context] auto tune mode:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:160 Context] enable dynamic shape range:
[I 5/ 9 20:23:15.495 ...r/src/driver/huawei_ascend_npu/engine.cc:176 Context] initial buffer length of dynamic shape range: -1
[W 5/ 9 20:23:15.495 ...ter/nnadapter/src/runtime/compilation.cc:334 Finish] Warning: Failed to create a program, No model and cache is provided.
[W 5/ 9 20:23:15.495 ...le-Lite/lite/kernels/nnadapter/engine.cc:149 LoadFromCache] Warning: Build model failed(3) !
[W 5/ 9 20:23:15.512 ...nnadapter/nnadapter/src/runtime/model.cc:86 GetSupportedOperations] Warning: Failed to get the supported operations for device 'huawei_ascend_npu', because the HAL interface 'validate_program' is not implemented!
[W 5/ 9 20:23:15.512 ...kernels/nnadapter/converter/converter.cc:171 Apply] Warning: Failed to get the supported operations for the selected devices, one or more of the selected devices are not supported!
[I 5/ 9 20:23:15.512 ...r/src/driver/huawei_ascend_npu/driver.cc:70 CreateProgram] Create program for huawei_ascend_npu.
[F 5/ 9 20:23:58.112 ...driver/huawei_ascend_npu/model_client.cc:54 InitAclClientEnv] Check failed: (reinterpret_cast(aclrtSetDevice(device_id_)) == ACL_ERROR_NONE): 507033!==0 507033 Unknown ACL error code(507033)
[F 5/ 9 20:23:58.112 ...driver/huawei_ascend_npu/model_client.cc:54 InitAclClientEnv] Check failed: (reinterpret_cast(aclrtSetDevice(device_id_)) == ACL_ERROR_NONE): 507033!==0 507033 Unknown ACL error code(507033)
[F 5/ 9 20:23:58.125 ...ter/nnadapter/src/runtime/compilation.cc:98 ~Program] Check failed: device_context: No device found.
[F 5/ 9 20:23:58.125 ...ter/nnadapter/src/runtime/compilation.cc:98 ~Program] Check failed: device_context: No device found.
terminate called after throwing an instance of 'nnadapter::logging::Exception'
what(): NNAdapter C++ Exception:
[F 5/ 9 20:23:58.125 ...ter/nnadapter/src/runtime/compilation.cc:98 ~Program] Check failed: device_context: No device found.
Aborted (core dumped)
root@hitsz-NF5280M5:/home/hitsz/PP-Yoloe-R/FastDeploy/examples/vision/detection/paddledetection/python# /usr/local/python3.7.5/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 96 leaked semaphores to clean up at shutdown
len(cache))
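Because most lines in a log like this are informational or warnings, it can help to separate the fatal lines from the rest mechanically. The sketch below is a hypothetical triage helper, not part of Paddle Lite or FastDeploy; its regular expressions are written against the line shapes shown in the log above and may need adjusting for other versions.

```python
import re

# Log lines above start with "[I ", "[W ", or "[F " followed by a timestamp.
LEVEL_RE = re.compile(r"^\[([IWEF]) ")
CANN_RE = re.compile(
    r"build version is (\d+(?:\.\d+)*), "
    r"but the current environment version is (\d+(?:\.\d+)*)"
)
ACL_RE = re.compile(r"Unknown ACL error code\((\d+)\)")


def triage(log_text):
    """Collect unique fatal lines, the CANN version pair, and ACL error codes."""
    summary = {"fatal": [], "cann_build": None, "cann_runtime": None, "acl_codes": set()}
    for raw in log_text.splitlines():
        line = raw.strip()
        level = LEVEL_RE.match(line)
        # Fatal lines are printed twice in the log, so keep only the first copy.
        if level and level.group(1) == "F" and line not in summary["fatal"]:
            summary["fatal"].append(line)
        versions = CANN_RE.search(line)
        if versions:
            summary["cann_build"], summary["cann_runtime"] = versions.groups()
        summary["acl_codes"].update(ACL_RE.findall(line))
    return summary
```

Run on the output above, this would leave two distinct fatal lines (the `aclrtSetDevice` check and `No device found`), which matches the maintainers' reading that the version-mismatch line is only a warning.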