
TX2 device deployment method [Resolved]: I'm trying to deploy the YOLOv8 project on a TX2, but TensorRT on the TX2 reports an unsupported-INT64 error when converting the model. Is there a way to convert to INT32? #15

Closed
lyb36524 opened this issue Feb 27, 2023 · 19 comments

Comments

@lyb36524

lyb36524 commented Feb 27, 2023

The problem has been solved. Many thanks to FeiYull for the generous help. The solution is as follows:
TX2 system version: JetPack 4.6
Key steps:
1. Export a static ONNX model on a PC or on the TX2.
2. Compile the ONNX model with TensorRT 8.2 on the TX2 to obtain the trt file.
Note:
The TensorRT version used when compiling the tensorrt-alpha code must match the one used for the trt conversion.
Key commands:
1. Export a static ONNX model on a PC or on the TX2. Note that this differs from the conversion command used on x86 Ubuntu:
yolo mode=export model=yolov8n.pt format=onnx batch=1
2. Copy the ONNX file to the TX2 and run the following command on the TX2 to build the trt file:
../../TensorRT-8.2.1.8/bin/trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly
The TX2's trtexec executable lives in:
/usr/src/tensorrt/bin
Adjust the paths in the commands to match your setup.
3. Run the test:
./app_yolov8 --model=../../data/yolov8/yolov8n.trt --size=640 --batch_size=1 --img=../../data/6406407.jpg --show
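Since the fix hinges on the exported ONNX graph being fully static, it is worth verifying the export before building the engine. A minimal check with the onnx Python package (my own sketch, not part of the original thread; TRT 8.2's Range importer only rejects INT64 for dynamic inputs, so a static graph sidesteps the assertion shown below):

```python
# Sketch: confirm the exported graph has no symbolic (dynamic) dimensions.
import onnx

model = onnx.load("yolov8n.onnx")
for inp in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # expect e.g. images [1, 3, 640, 640], no name strings
```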


I'm trying to deploy the YOLOv8 project on a TX2, but TensorRT on the TX2 reports an unsupported-INT64 error when converting the model. Is there a way to convert it to INT32, or otherwise to convert the model correctly?
[screenshot]
nvidia@ubuntu:~/TensorRT-Alpha-main/data/yolov8$ ./trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:8x3x640x640
&&&& RUNNING TensorRT.trtexec [TensorRT v8201] # ./trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:8x3x640x640
[02/28/2023-01:55:01] [I] === Model Options ===
[02/28/2023-01:55:01] [I] Format: ONNX
[02/28/2023-01:55:01] [I] Model: yolov8n.onnx
[02/28/2023-01:55:01] [I] Output:
[02/28/2023-01:55:01] [I] === Build Options ===
[02/28/2023-01:55:01] [I] Max batch: explicit batch
[02/28/2023-01:55:01] [I] Workspace: 16 MiB
[02/28/2023-01:55:01] [I] minTiming: 1
[02/28/2023-01:55:01] [I] avgTiming: 8
[02/28/2023-01:55:01] [I] Precision: FP32
[02/28/2023-01:55:01] [I] Calibration:
[02/28/2023-01:55:01] [I] Refit: Disabled
[02/28/2023-01:55:01] [I] Sparsity: Disabled
[02/28/2023-01:55:01] [I] Safe mode: Disabled
[02/28/2023-01:55:01] [I] DirectIO mode: Disabled
[02/28/2023-01:55:01] [I] Restricted mode: Disabled
[02/28/2023-01:55:01] [I] Save engine: yolov8n.trt
[02/28/2023-01:55:01] [I] Load engine:
[02/28/2023-01:55:01] [I] Profiling verbosity: 0
[02/28/2023-01:55:01] [I] Tactic sources: Using default tactic sources
[02/28/2023-01:55:01] [I] timingCacheMode: local
[02/28/2023-01:55:01] [I] timingCacheFile:
[02/28/2023-01:55:01] [I] Input(s)s format: fp32:CHW
[02/28/2023-01:55:01] [I] Output(s)s format: fp32:CHW
[02/28/2023-01:55:01] [I] Input build shape: images=1x3x640x640+4x3x640x640+8x3x640x640
[02/28/2023-01:55:01] [I] Input calibration shapes: model
[02/28/2023-01:55:01] [I] === System Options ===
[02/28/2023-01:55:01] [I] Device: 0
[02/28/2023-01:55:01] [I] DLACore:
[02/28/2023-01:55:01] [I] Plugins:
[02/28/2023-01:55:01] [I] === Inference Options ===
[02/28/2023-01:55:01] [I] Batch: Explicit
[02/28/2023-01:55:01] [I] Input inference shape: images=4x3x640x640
[02/28/2023-01:55:01] [I] Iterations: 10
[02/28/2023-01:55:01] [I] Duration: 3s (+ 200ms warm up)
[02/28/2023-01:55:01] [I] Sleep time: 0ms
[02/28/2023-01:55:01] [I] Idle time: 0ms
[02/28/2023-01:55:01] [I] Streams: 1
[02/28/2023-01:55:01] [I] ExposeDMA: Disabled
[02/28/2023-01:55:01] [I] Data transfers: Enabled
[02/28/2023-01:55:01] [I] Spin-wait: Disabled
[02/28/2023-01:55:01] [I] Multithreading: Disabled
[02/28/2023-01:55:01] [I] CUDA Graph: Disabled
[02/28/2023-01:55:01] [I] Separate profiling: Disabled
[02/28/2023-01:55:01] [I] Time Deserialize: Disabled
[02/28/2023-01:55:01] [I] Time Refit: Disabled
[02/28/2023-01:55:01] [I] Skip inference: Enabled
[02/28/2023-01:55:01] [I] Inputs:
[02/28/2023-01:55:01] [I] === Reporting Options ===
[02/28/2023-01:55:01] [I] Verbose: Disabled
[02/28/2023-01:55:01] [I] Averages: 10 inferences
[02/28/2023-01:55:01] [I] Percentile: 99
[02/28/2023-01:55:01] [I] Dump refittable layers:Disabled
[02/28/2023-01:55:01] [I] Dump output: Disabled
[02/28/2023-01:55:01] [I] Profile: Disabled
[02/28/2023-01:55:01] [I] Export timing to JSON file:
[02/28/2023-01:55:01] [I] Export output to JSON file:
[02/28/2023-01:55:01] [I] Export profile to JSON file:
[02/28/2023-01:55:01] [I]
[02/28/2023-01:55:01] [I] === Device Information ===
[02/28/2023-01:55:01] [I] Selected Device: NVIDIA Tegra X2
[02/28/2023-01:55:01] [I] Compute Capability: 6.2
[02/28/2023-01:55:01] [I] SMs: 2
[02/28/2023-01:55:01] [I] Compute Clock Rate: 1.3 GHz
[02/28/2023-01:55:01] [I] Device Global Memory: 7850 MiB
[02/28/2023-01:55:01] [I] Shared Memory per SM: 64 KiB
[02/28/2023-01:55:01] [I] Memory Bus Width: 128 bits (ECC disabled)
[02/28/2023-01:55:01] [I] Memory Clock Rate: 1.3 GHz
[02/28/2023-01:55:01] [I]
[02/28/2023-01:55:01] [I] TensorRT version: 8.2.1
[02/28/2023-01:55:03] [I] [TRT] [MemUsageChange] Init CUDA: CPU +266, GPU +0, now: CPU 285, GPU 6703 (MiB)
[02/28/2023-01:55:03] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 285 MiB, GPU 6704 MiB
[02/28/2023-01:55:03] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 314 MiB, GPU 6732 MiB
[02/28/2023-01:55:03] [I] Start parsing network model
[02/28/2023-01:55:03] [I] [TRT] ----------------------------------------------------------------
[02/28/2023-01:55:03] [I] [TRT] Input filename: yolov8n.onnx
[02/28/2023-01:55:03] [I] [TRT] ONNX IR version: 0.0.8
[02/28/2023-01:55:03] [I] [TRT] Opset version: 17
[02/28/2023-01:55:03] [I] [TRT] Producer name: pytorch
[02/28/2023-01:55:03] [I] [TRT] Producer version: 1.13.1
[02/28/2023-01:55:03] [I] [TRT] Domain:
[02/28/2023-01:55:03] [I] [TRT] Model version: 0
[02/28/2023-01:55:03] [I] [TRT] Doc string:
[02/28/2023-01:55:03] [I] [TRT] ----------------------------------------------------------------
[02/28/2023-01:55:03] [W] [TRT] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[02/28/2023-01:55:04] [E] [TRT] ModelImporter.cpp:773: While parsing node number 239 [Range -> "/model.22/Range_output_0"]:
[02/28/2023-01:55:04] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[02/28/2023-01:55:04] [E] [TRT] ModelImporter.cpp:775: input: "/model.22/Constant_8_output_0"
input: "/model.22/Cast_output_0"
input: "/model.22/Constant_9_output_0"
output: "/model.22/Range_output_0"
name: "/model.22/Range"
op_type: "Range"

[02/28/2023-01:55:04] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[02/28/2023-01:55:04] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:3352 In function importRange:
[8] Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"
[02/28/2023-01:55:04] [E] Failed to parse onnx file
[02/28/2023-01:55:04] [I] Finish parsing network model
[02/28/2023-01:55:04] [E] Parsing model failed
[02/28/2023-01:55:04] [E] Failed to create engine from model.
[02/28/2023-01:55:04] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8201] # ./trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:8x3x640x640
nvidia@ubuntu:~/TensorRT-Alpha-main/data/yolov8$

I also tried the method below, but it had no effect; the model still contains INT64 nodes:
https://blog.csdn.net/dou3516/article/details/124577344
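For reference, the approach in that post amounts to downcasting stored INT64 initializers to INT32 in place. A minimal sketch with the onnx package (hypothetical file names; as noted above, this alone does not help here, because tensors computed as INT64 at runtime, such as the input to the failing Range node, are untouched):

```python
# Sketch: rewrite INT64 initializers as INT32. This cannot change values
# produced at runtime, which is why the Range error survives this treatment.
import numpy as np
import onnx
from onnx import numpy_helper

model = onnx.load("yolov8n.onnx")
for init in model.graph.initializer:
    if init.data_type == onnx.TensorProto.INT64:
        arr = numpy_helper.to_array(init).astype(np.int32)
        init.CopyFrom(numpy_helper.from_array(arr, init.name))
onnx.save(model, "yolov8n_int32.onnx")
```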

@FeiYull
Owner

FeiYull commented Feb 28, 2023

@lyb36524 With TensorRT 8.4.2.4, this error does not occur.

@lyb36524
Author

@lyb36524 With TensorRT 8.4.2.4, this error does not occur.

Right. On x86, that TensorRT version only emits a warning, does not stop, and produces the file. But the highest version available on the TX2 is TensorRT 8.2.1, and the conversion cannot complete there. I tried converting the model on x86, but inference on the TX2 then fails with an engine-version mismatch. I tried many TensorRT versions, including the matching 8.2.1, and still hit this problem, so it seems I must perform the conversion on the TX2 itself?

@FeiYull
Owner

FeiYull commented Feb 28, 2023

@lyb36524 for TensorRT 8.2, you can follow these steps:

  • export onnx: yolo mode=export model=yolov8n.pt format=onnx batch=1

  • compile onnx: ../../TensorRT-8.2.1.8/bin/trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly

  • This is the compilation success message:
    [02/28/2023-11:35:37] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +13, now: CPU 0, GPU 13 (MiB)
    [02/28/2023-11:35:37] [I] Engine built in 128.87 sec.
    &&&& PASSED TensorRT.trtexec [TensorRT v8201] # ../../TensorRT-8.2.1.8/bin/trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly

  • inference: ./app_yolov8 --model=../../data/yolov8/yolov8n.trt --size=640 --batch_size=1 --img=../../data/6406407.jpg --show

demo
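For anyone scripting the first step, the same export can also be driven from Python instead of the yolo CLI; a minimal sketch assuming the ultralytics package (dynamic=False keeps all shapes static, matching the batch=1 CLI export above):

```python
# Python equivalent of: yolo mode=export model=yolov8n.pt format=onnx batch=1
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.export(format="onnx", imgsz=640, dynamic=False)  # writes yolov8n.onnx
```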

@lyb36524
Author

lyb36524 commented Feb 28, 2023

@lyb36524 for TensorRT 8.2, you can follow these steps:

  • Export ONNX: yolo mode=export model=yolov8n.pt format=onnx batch=1
  • Compile ONNX: ../../TensorRT-8.2.1.8/bin/trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly
  • This is the compilation success message: [02/28/2023-11:35:37] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +13, now: CPU 0, GPU 13 (MiB) [02/28/2023-11:35:37] [I] Engine built in 128.87 sec. &&&& PASSED TensorRT.trtexec [TensorRT v8201] # ../../TensorRT-8.2.1.8/bin/trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly
  • Inference: ./app_yolov8 --model=../../data/yolov8/yolov8n_bs4.trt --size=640 --batch_size=1 --img=../../data/6406407.jpg --show

demo

  • Note: for multi-batch inference, only the batch parameters above need to change, although yolov8.cpp may need small edits at inference time, since the code supports dynamic batching.

Thanks for your reply, but on the x86 platform the conversion completes with every TensorRT version I tried.
On the TX2, however, it fails with the INT64 error, possibly because of the ARM architecture.
I then tried to match the TX2's TensorRT version on x86 and convert there; I went through all the 8.0-series versions, and the conversion completed.
But inference on the TX2 always fails with an error like this:

nvidia@ubuntu:~/TensorRT-Alpha-main/yolov8/build$ ./app_yolov8 --model=../../data/yolov8/yolov8n.trt --size=640 --batch_size=1 --img=../../data/6406407.jpg --show --savePath
[02/28/2023-02:33:47] [I] model_path = ../../data/yolov8/yolov8n.trt
[02/28/2023-02:33:47] [I] size = 640
[02/28/2023-02:33:47] [I] batch_size = 1
[02/28/2023-02:33:47] [I] image_path = ../../data/6406407.jpg
[02/28/2023-02:33:47] [I] is_show = 1
[02/28/2023-02:33:47] [I] save_path = true
[02/28/2023-02:33:48] [I] [TRT] [MemUsageChange] Init CUDA: CPU +261, GPU +0, now: CPU 302, GPU 6756 (MiB)
[02/28/2023-02:33:48] [I] [TRT] Loaded engine size: 16 MiB
[02/28/2023-02:33:48] [E] [TRT] 1: [stdArchiveReader.cpp::StdArchiveReader::40] Error Code 1: Serialization (Serialization assertion stdVersionRead == serializationVersion failed.Version tag does not match. Note: Current Version: 205, Serialized Engine Version: 232)
[02/28/2023-02:33:48] [E] [TRT] 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
[02/28/2023-02:33:48] [E] initEngine() ocur errors!
runtime error /home/nvidia/TensorRT-Alpha-main/yolov8/yolov8.cpp:10 cudaFree(m_output_src_transpose_device) failed.
code = cudaErrorInvalidValue, message = invalid argument
nvidia@ubuntu:

So I suspect the ONNX-to-trt conversion must be done on the TX2 itself. Is there any way to convert the ONNX to INT32 format, or some other way to export the trt file?

@FeiYull
Owner

FeiYull commented Feb 28, 2023

@lyb36524 Based on your error message, two suggestions:

  1. Compile the onnx and run app_yolov8 on the same device
  2. The TensorRT version used to compile the onnx must match the version that app_yolov8 links against
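Point 2 can be checked quickly by test-deserializing the engine on the target device with the TensorRT Python bindings; a minimal sketch (my own, assuming the bindings are installed on the TX2):

```python
# Sketch: a version-mismatched engine fails right at deserialization,
# producing the "Version tag does not match" error shown in the log above.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("yolov8n.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
print("OK" if engine else "deserialization failed: check TensorRT versions")
```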

@lyb36524
Author

@lyb36524 Based on your error message, two suggestions:

  1. Compile the onnx and run app_yolov8 on the same device
  2. The TensorRT version used to compile the onnx must match the version that app_yolov8 links against

Yes, but I can't seem to get around TensorRT 8.2.1 on the TX2's ARM platform: if I compile app_yolov8 on another platform, it still won't run on the TX2.
So it seems I must convert the ONNX INT64 values to INT32 and then convert to trt, but I don't know of a way to do that conversion, or of any other way to convert the model on the TX2?

@FeiYull
Owner

FeiYull commented Feb 28, 2023

@lyb36524 Why are you copying trt files between different devices? TensorRT forbids this! #15 (comment)

@lyb36524
Author

@lyb36524 Why are you copying trt files between different devices? TensorRT forbids this! #15 (comment)

Because I cannot complete the ONNX-to-trt conversion on the TX2 platform; it reports:

[02/28/2023-12:49:19] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:3352 In function importRange:
[8] Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"

I don't know of any other way to produce the trt file, or how to modify the ONNX file so that the trt conversion succeeds. Thanks.

@FeiYull
Owner

FeiYull commented Feb 28, 2023

@lyb36524

@lyb36524
Author

@lyb36524

Does copying the ONNX to the TX2 complete the conversion on your side? Here, copying the ONNX to the TX2 does not; I get:
[02/28/2023-01:55:04] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:3352 In function importRange:
[8] Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"
exactly as in the first screenshot.
My TX2 runs JetPack 4.6.3, which is already the highest version the TX2 supports.

@FeiYull
Owner

FeiYull commented Feb 28, 2023

@lyb36524 Have you tried the command above, with batch=1? Also, yolov8 may have a bug: exporting with batch=4 still yields an ONNX with batch 1.

@lyb36524
Author

@lyb36524 Have you tried the command above, with batch=1? Also, yolov8 may have a bug: exporting with batch=4 still yields an ONNX with batch 1.

I did. I followed exactly these steps on the TX2 from the beginning:
https://blog.csdn.net/m0_72734364/article/details/128758544?app_version=5.12.1&csdn_share_tail=%7B%22type%22%3A%22blog%22%2C%22rType%22%3A%22article%22%2C%22rId%22%3A%22128758544%22%2C%22source%22%3A%22m0_72734364%22%7D&utm_source=app

until I ran into this problem:
[02/28/2023-01:55:04] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:3352 In function importRange:
[8] Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"

@FeiYull
Owner

FeiYull commented Feb 28, 2023

@lyb36524 That shouldn't happen; two other users have reported getting it running on the TX2. Add me on QQ: 732369616

@lyb36524
Author

@lyb36524 That shouldn't happen; two other users have reported getting it running on the TX2. Add me on QQ: 732369616

Thank you so much. I have to step out for a bit, so I'll add you on QQ first.

@lyb36524 lyb36524 changed the title I'm trying to deploy the YOLOv8 project on a TX2, but TensorRT on the TX2 reports an unsupported-INT64 error when converting the model. Is there a way to convert to INT32? TX2 device deployment method [Resolved]: I'm trying to deploy the YOLOv8 project on a TX2, but TensorRT on the TX2 reports an unsupported-INT64 error when converting the model. Is there a way to convert to INT32? Feb 28, 2023
@lyb36524
Author

@lyb36524 That shouldn't happen; two other users have reported getting it running on the TX2. Add me on QQ: 732369616

The problem has been solved. Many thanks to FeiYull for the generous help. The solution is as follows:
TX2 system version: JetPack 4.6
Key steps:
1. Export a static ONNX model on a PC or on the TX2.
2. Compile the ONNX model with TensorRT 8.2 on the TX2 to obtain the trt file.
Note:
The TensorRT version used when compiling the tensorrt-alpha code must match the one used for the trt conversion.
Key commands:
1. Export a static ONNX model on a PC or on the TX2. Note that this differs from the conversion command used on x86 Ubuntu:
yolo mode=export model=yolov8n.pt format=onnx batch=1
2. Copy the ONNX file to the TX2 and run the following command on the TX2 to build the trt file:
../../TensorRT-8.2.1.8/bin/trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly
The TX2's trtexec executable lives in:
/usr/src/tensorrt/bin
Adjust the paths in the commands to match your setup.
3. Run the test:
./app_yolov8 --model=../../data/yolov8/yolov8n.trt --size=640 --batch_size=1 --img=../../data/6406407.jpg --show

@FeiYull
Owner

FeiYull commented Feb 28, 2023

@lyb36524 That shouldn't happen; two other users have reported getting it running on the TX2. Add me on QQ: 732369616

The problem has been solved. Many thanks to FeiYull for the generous help. The solution is as follows: TX2 system version: JetPack 4.6. Key steps: 1. Export a static ONNX model on a PC or on the TX2. 2. Compile the ONNX model with TensorRT 8.2 on the TX2 to obtain the trt file. Note: the TensorRT version used when compiling the tensorrt-alpha code must match the one used for the trt conversion. Key commands: 1. Export a static ONNX model on a PC or on the TX2 (this differs from the command used on x86 Ubuntu): yolo mode=export model=yolov8n.pt format=onnx batch=1 2. Copy the ONNX file to the TX2 and build the trt file there: ../../TensorRT-8.2.1.8/bin/trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n.trt --buildOnly (the TX2's trtexec lives in /usr/src/tensorrt/bin; adjust paths accordingly) 3. Run the test: ./app_yolov8 --model=../../data/yolov8/yolov8n.trt --size=640 --batch_size=1 --img=../../data/6406407.jpg --show

In addition, data copying on the TX2 device needs a small modification; see: #16 (comment)

@FeiYull FeiYull closed this as completed Feb 28, 2023
@Phoenix8215

Phoenix8215 commented Oct 27, 2023

You finally pushed the author into replying in Chinese 😄. Thanks, I ran into a similar problem too.

@FeiYull
Owner

FeiYull commented Oct 28, 2023

@Phoenix8215 Feel free to open a new issue.

@FeiYull
Owner

FeiYull commented Oct 28, 2023

@Phoenix8215 As shown in the screenshot, in yolo.cpp change the 0 to 1 and comment out the code at the bottom.
[screenshot]
