I tried to run the example quantization program on Colab, but it fails to start due to memory problems. This is the example program I used:
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model = AutoGPTQForCausalLM.from_quantized('FlagAlpha/Llama2-Chinese-13b-Chat-4bit', device="cuda:0")
tokenizer = AutoTokenizer.from_pretrained('FlagAlpha/Llama2-Chinese-13b-Chat-4bit', use_fast=False)
input_ids = tokenizer(['<s>Human: 怎么登上火星\n</s><s>Assistant: '],
                      return_tensors="pt", add_special_tokens=False).input_ids.to('cuda')
generate_input = {
    "input_ids": input_ids,
    "max_new_tokens": 512,
    "do_sample": True,
    "top_k": 50,
    "top_p": 0.95,
    "temperature": 0.3,
    "repetition_penalty": 1.3,
    "eos_token_id": tokenizer.eos_token_id,
    "bos_token_id": tokenizer.bos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
}
generate_ids = model.generate(**generate_input)
text = tokenizer.decode(generate_ids[0])
print(text)
```
Colab allocated 12.7 GB of RAM and a T4 GPU.
12 GB of VRAM looks like it should be enough. What error are you getting? 12 GB is right at the edge of running out; try requesting a card with 24 GB of VRAM instead.
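A rough back-of-the-envelope estimate shows why 12 GB is borderline for this model. The sketch below is an assumption-laden approximation, not a measurement: it only counts the quantized weights (13B parameters at 4 bits each) and ignores the KV cache, activations, dequantization buffers, and CUDA context overhead, all of which add several more GB at generation time.

```python
# Rough VRAM estimate for a 13B-parameter model quantized to 4-bit (weights only).
# This is an approximation: runtime overhead (KV cache, activations, CUDA context)
# is NOT included and typically adds several GB on top.
params = 13e9            # parameter count of the 13B model
bytes_per_param = 0.5    # 4 bits = 0.5 bytes per parameter
weights_gb = params * bytes_per_param / 1024**3
print(f"~{weights_gb:.1f} GB for weights alone")
```

So the weights alone take roughly 6 GB, and with per-token overhead for a 512-token generation the total can approach the 12 GB limit, which matches the "right at the edge" observation above.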