After deploying expression driving as a service, the GPU memory used by the corresponding inference is not released when the request is interrupted #735

Open
boreas-l opened this issue Jan 18, 2023 · 1 comment
@boreas-l

I deployed the expression-driving model as a service based on the Tornado framework. Because expression driving takes a long time, a POST request may be aborted by the client before the result comes back. The server allocates a certain amount of GPU memory for model inference as soon as it receives the request, but after the POST request is aborted, the GPU memory held by the corresponding inference thread is not released. After interrupting requests several times in a row, the GPU memory fills up, and subsequent normal inference requests can no longer run and report errors. Does Paddle provide a way to release the GPU memory held by its internal threads? Thanks.
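
For illustration, here is a minimal sketch of the kind of setup described above, not the actual service code. `drive_expression` and the `/drive` route are hypothetical stand-ins for the real expression-driving inference call and endpoint; the sketch assumes inference runs in a worker thread while Tornado's `on_connection_close` hook fires when the client aborts the POST.

```python
# Minimal sketch of the serving setup described above (assumptions noted in comments).
from concurrent.futures import ThreadPoolExecutor

import tornado.ioloop
import tornado.web

executor = ThreadPoolExecutor(max_workers=1)


def drive_expression(payload: bytes) -> bytes:
    # Hypothetical placeholder: the Paddle expression-driving model would run here,
    # which is where GPU memory gets allocated once the request starts processing.
    return b"result"


class ExpressionHandler(tornado.web.RequestHandler):
    async def post(self):
        # Run inference in a worker thread so the IOLoop is not blocked.
        result = await tornado.ioloop.IOLoop.current().run_in_executor(
            executor, drive_expression, self.request.body
        )
        self.write(result)

    def on_connection_close(self):
        # Tornado calls this when the client aborts the POST early. The worker
        # thread keeps running, and the GPU memory it allocated stays cached by
        # Paddle's allocator instead of being returned to the device.
        pass


if __name__ == "__main__":
    app = tornado.web.Application([(r"/drive", ExpressionHandler)])
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
```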

@LokeZhou
Collaborator

LokeZhou commented Mar 4, 2024

You can try paddle.device.cuda.empty_cache()
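
A hedged usage sketch of the suggestion above: wrapping the model call in try/finally so the cached GPU blocks are released even when the request is aborted mid-inference. `run_inference` is a hypothetical helper, not part of the project's API.

```python
import paddle


def run_inference(model, inputs):
    """Run one inference call and release cached GPU blocks afterwards."""
    try:
        return model(inputs)
    finally:
        # empty_cache() returns idle blocks held by Paddle's caching allocator
        # to the device; memory still referenced by live tensors is not freed.
        paddle.device.cuda.empty_cache()
```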
