Is a T4 machine with 32 GB not enough to run the 7B model? It gets killed every time #22
Comments
You can try this: https://github.com/fengyh3/llama_inference As for the problem you mentioned, I ran into it before too; it is most likely insufficient RAM. My machine has 14 GB of RAM, and after I added 24 GB of swap it worked fine.
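The swap workaround described above can be sketched as follows. This is a minimal sketch for Linux, not part of the project itself; the 24 GiB size and the `/swapfile` path are assumptions matching the comment, and the commands require root:

```shell
# Create a 24 GiB swap file (size chosen to match the comment above)
sudo fallocate -l 24G /swapfile
# Restrict permissions, as required by mkswap/swapon
sudo chmod 600 /swapfile
# Format the file as swap and enable it
sudo mkswap /swapfile
sudo swapon /swapfile
# Verify the new swap space is active
free -h
```

Note that swap this large will be slow (the model spills to disk during loading), but it can be enough to get past a one-time peak in memory use.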
Ah... my GPU machine has 32 GB of RAM... After GenerateLm finishes running it immediately eats 95%, then during load_model, the 20 GB of virtual memory I allocated gets completely used up and the process is killed... I haven't tried allocating 30 GB of virtual memory yet... Tencent Cloud T4 GPU
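For scale, a 7B-parameter model stored in fp32 needs about 26 GiB for the weights alone (4 bytes per parameter), and a naive load path that holds both the checkpoint tensors and the model in memory at the same time can briefly need roughly twice that, which would explain the kill even with 32 GB of RAM plus 20 GB of swap. A rough back-of-the-envelope check (the helper names here are hypothetical, and the `sysconf` call is Linux/macOS-only):

```python
import os

def total_ram_gb():
    """Return total physical RAM in GiB (Linux/macOS, via POSIX sysconf)."""
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per memory page
    num_pages = os.sysconf("SC_PHYS_PAGES")  # total pages of physical RAM
    return page_size * num_pages / (1024 ** 3)

def fp32_checkpoint_gb(n_params_billion):
    """Rough RAM needed to hold an fp32 checkpoint: 4 bytes per parameter."""
    return n_params_billion * 1e9 * 4 / (1024 ** 3)

if __name__ == "__main__":
    need = fp32_checkpoint_gb(7)  # roughly 26 GiB for a 7B fp32 model
    have = total_ram_gb()
    print(f"weights alone need ~{need:.1f} GiB; machine has {have:.1f} GiB RAM")
```

If the loader duplicates the weights even briefly, the peak is near 52 GiB, so either more swap or a loading path that avoids the second copy (e.g. loading the checkpoint to CPU in chunks, or using fp16 weights) is needed.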
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
@wujianming1996 Try commenting out line 13 of llama_server.py
@fengyh3 Thanks, I'll give it a try.
I followed the [Quick Start] guide. After running the Python script, GenerateLm consumed 95% of the memory, and then load_model(model, args.load_model_path) was killed partway through (I later added 20 GB of virtual memory and it was still killed). Or is there something wrong with how I'm running it?