Request: improve service stability #1089
Could you paste the complete log?
Here are my steps, to make this easier to reproduce.
${MODEL_PATH} is a directory whose subdirectories are the model directories.
The test code is roughly along these lines; it has been edited back and forth many times and may contain bugs. The core call into the large model is really just the last line, result = evaluate(
So how was this resolved?
I deployed xinference, version 0.9.0.
As soon as a request demands too much GPU memory, the service hangs.
Symptoms: refreshing the UI shows no models at all, every running model exits abnormally, and nothing recovers automatically.
The logs report: Out of memory.
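The OOM log above suggests the server admitted a model load it could not satisfy. One defensive pattern is to check free GPU memory before admitting the load. This is a minimal sketch under my own assumptions — `can_load`, the headroom value, and the suggested `torch.cuda.mem_get_info()` integration are hypothetical, not xinference's actual code:

```python
def can_load(model_bytes: int, free_bytes: int, headroom: float = 0.10) -> bool:
    """Admit a model-load request only if it fits in free GPU memory
    with a safety headroom.

    Hypothetical helper, not part of xinference. In practice `free_bytes`
    could come from torch.cuda.mem_get_info().
    """
    needed = model_bytes * (1.0 + headroom)  # reserve 10% extra by default
    return needed <= free_bytes
```

A server calling such a check could return an error (e.g. HTTP 503) for the one oversized request, instead of crashing every model that is already running.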
Current workaround: kill the container and redeploy.
The service I have in mind would, when its own load is too high, finish the tasks already in hand in an orderly way and temporarily reject additional requests, so that the service stays available.
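The behaviour described here — keep serving what is already in flight, reject the overflow — can be sketched with a non-blocking semaphore. This is my own illustration of the requested pattern, not xinference's implementation; the class name and error text are invented:

```python
import asyncio


class AdmissionController:
    """Fail fast when all worker slots are busy, instead of queueing
    requests until the GPU runs out of memory (hypothetical sketch)."""

    def __init__(self, max_concurrent: int):
        self._sem = asyncio.Semaphore(max_concurrent)

    async def run(self, coro_fn, *args):
        if self._sem.locked():  # no free slot: reject instead of waiting
            raise RuntimeError("server busy, retry later")  # maps to HTTP 503
        async with self._sem:  # acquire returns immediately when a slot is free
            return await coro_fn(*args)
```

In-flight requests complete normally; only the excess request gets an immediate, recoverable error, which is what keeps the service up.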