Latency problem with real-time LogoPlanner deployment #91

@xxqs13

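The server is deployed on a GeForce RTX 4090 and the client runs on a 3090. The camera is an Orbbec camera, and the input image and depth map are also 640×480.
A single inference takes about 1.6 s: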

[PERF] Request parsing: 8.2 ms
[PERF] Image processing: 14.0 ms
[PERF] Depth dilation: 302.7 ms
[PERF] Model inference: 656.9 ms
[PERF] MPC setup: 24.3 ms
[PERF] MPC solve: 137.7 ms (20 steps)
[PERF] Trajectory plot: 439.1 ms
[PERF] Post-processing: 0.0 ms

[PERF] TOTAL: 1582.9 ms (0.63 Hz)

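Even without the Trajectory plot step (439.1 ms), the rate is at most about 1 Hz, which falls well short of what the authors state: "On a local machine equipped with a GeForce RTX 4090 GPU, our real-time obstacle-avoidance inference reaches about 10 Hz."
Could the warning `Warning, cannot find cuda-compiled version of RoPE2D, using a slow pytorch version instead` be the reason my model inference is this slow?
All other parameter settings match the official example.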
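
For reference, the arithmetic behind the estimate above can be checked with a short script. This is only a sketch: the stage timings are copied from the [PERF] log in this post, while the helper names `parse_perf` and `rate_hz` are hypothetical and not part of LogoPlanner.

```python
import re

# Stage timings copied from the [PERF] log above (TOTAL line omitted on purpose).
PERF_LOG = """\
[PERF] Request parsing: 8.2 ms
[PERF] Image processing: 14.0 ms
[PERF] Depth dilation: 302.7 ms
[PERF] Model inference: 656.9 ms
[PERF] MPC setup: 24.3 ms
[PERF] MPC solve: 137.7 ms (20 steps)
[PERF] Trajectory plot: 439.1 ms
[PERF] Post-processing: 0.0 ms
"""


def parse_perf(log: str) -> dict[str, float]:
    """Hypothetical helper: map each stage name to its duration in ms."""
    pattern = re.compile(r"\[PERF\] (.+?): ([\d.]+) ms")
    return {m.group(1): float(m.group(2)) for m in pattern.finditer(log)}


def rate_hz(stages: dict[str, float], skip: tuple[str, ...] = ()) -> float:
    """Hypothetical helper: loop rate if the stages in `skip` cost nothing."""
    total_ms = sum(ms for name, ms in stages.items() if name not in skip)
    return 1000.0 / total_ms


stages = parse_perf(PERF_LOG)
total_ms = sum(stages.values())
print(f"all stages:        {total_ms:.1f} ms, {rate_hz(stages):.2f} Hz")
print(f"no plot:           {rate_hz(stages, skip=('Trajectory plot',)):.2f} Hz")
print(f"no plot, dilation: "
      f"{rate_hz(stages, skip=('Trajectory plot', 'Depth dilation')):.2f} Hz")
print(f"model only:        {1000.0 / stages['Model inference']:.2f} Hz")
```

The stages sum to the logged 1582.9 ms. Dropping the plot leaves 1143.8 ms (roughly 0.87 Hz), and model inference alone at 656.9 ms caps the loop at about 1.5 Hz, so reaching 10 Hz would need the whole pipeline to fit in 100 ms.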
