
Is 32 GB of RAM on a T4 instance not enough to run the 7B model? It gets killed every time #22

Closed
ruterfu opened this issue Apr 11, 2023 · 5 comments

Comments

@ruterfu

ruterfu commented Apr 11, 2023

I followed the Quick Start guide. After launching Python, GenerateLm consumed 95% of memory, and then load_model(model, args.load_model_path) got killed partway through (I later added 20 GB of swap and it was still killed). Or am I doing something wrong?

@ruterfu ruterfu closed this as not planned Apr 11, 2023
@ruterfu ruterfu reopened this Apr 11, 2023
@fengyh3
Collaborator

fengyh3 commented Apr 12, 2023

You can try: https://github.com/fengyh3/llama_inference
The script above runs inference with TencentPretrain's LLaMA.

As for the problem you mentioned, I ran into it too; it is most likely insufficient RAM. My machine has 14 GB of RAM plus 24 GB of swap, and it works fine.

@ruterfu
Author

ruterfu commented Apr 13, 2023

Ah... that GPU machine of mine has 32 GB of RAM. GenerateLm eats 95% of it as soon as it runs; then during load_model, with 20 GB of swap allocated, it used everything up and got killed. I haven't tried 30 GB of swap yet. It's a Tencent Cloud T4 GPU.
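A back-of-the-envelope calculation suggests why 32 GB is marginal here. This is a sketch, not TencentPretrain's actual loading code; the 2x-peak assumption depends on how load_model reads the checkpoint:

```python
# Rough RAM estimate for loading a 7B-parameter model.
# Hypothetical sketch; exact numbers depend on the checkpoint format.

def model_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate in-memory size of the weights, in GiB."""
    return n_params * bytes_per_param / 1024 ** 3

fp32 = model_gib(7e9, 4)   # float32 weights: about 26 GiB
fp16 = model_gib(7e9, 2)   # float16 weights: about 13 GiB

# If the loader materializes the full checkpoint while the freshly
# built model already occupies RAM, peak usage can approach twice the
# weight size -- which would make 32 GB RAM + 20 GB swap a tight fit
# for a float32 7B checkpoint.
```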

@wujianming1996

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
When deploying the microservice, can I use the CPU if there is no GPU?

@fengyh3
Collaborator

fengyh3 commented Apr 20, 2023

@wujianming1996 Try commenting out line 13 of llama_server.py
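The line in question presumably forces the model onto CUDA. A more general pattern (a hypothetical sketch, not the actual llama_server.py code) is to choose the device at runtime instead of deleting the call:

```python
# Hypothetical CPU-fallback sketch; llama_server.py's real code may
# differ. Instead of unconditionally moving the model to CUDA, pick
# the device based on what is available.

def pick_device(cuda_available: bool) -> str:
    """Return the device string to place the model and tensors on."""
    return "cuda" if cuda_available else "cpu"

# With PyTorch this would look like:
#   device = pick_device(torch.cuda.is_available())
#   model.to(device)
```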

@wujianming1996

@fengyh3 Thanks, I'll give it a try.

@ruterfu ruterfu closed this as completed Apr 20, 2023