Hello, I'm trying to run the model on an M1 Mac. Because of memory constraints, I added an offload_folder and a torch_dtype argument. The code is as follows:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig
import torch

tokenizer = AutoTokenizer.from_pretrained(
    "/Users/sniper/model/Qwen-7b-chat", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "/Users/sniper/model/Qwen-7b-chat",
    device_map="auto",
    offload_folder="offload",
    torch_dtype=torch.float16,
    trust_remote_code=True,
    fp16=True,
).eval()
model.generation_config = GenerationConfig.from_pretrained(
    "/Users/sniper/model/Qwen-7b-chat", trust_remote_code=True
)

# first dialogue turn
response, history = model.chat(tokenizer, "你好", history=None)
print(response)
```
But the chat call (second-to-last line) raises this error:
```
position_ids = attention_mask.long().cumsum(-1) - 1
RuntimeError: MPS does not support cumsum op with int64 input
```
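To take the model code out of the picture, the failing op can be reproduced on its own (a minimal sketch; the tensor shape here is arbitrary):

```python
import torch

# int64 tensors can be created on MPS, but cumsum over them is unsupported
mask = torch.ones(1, 8, dtype=torch.int64, device="mps")
position_ids = mask.cumsum(-1) - 1  # raises the same RuntimeError on affected builds
```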
What is causing this?
This looks like a PyTorch bug: pytorch/pytorch#96610

You can try installing a nightly (development) build of PyTorch to work around it:

```
pip3 install --upgrade --no-deps --force-reinstall --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
```
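After reinstalling, it's worth checking directly that the nightly build actually handles the op (a quick sanity check, assuming an MPS-enabled machine):

```python
import torch

print(torch.__version__)  # should show a dev/nightly version string
assert torch.backends.mps.is_available()
x = torch.ones(1, 8, dtype=torch.int64, device="mps")
print(x.cumsum(-1))  # succeeds on builds that include the fix
```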
I'll just go back to a proper GPU, haha. I wanted to save myself some trouble by testing on the Mac, but after 10 minutes it still hadn't produced a reply to "你好". I suspect the offloading wastes most of the time.
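If the nightly build resolves the cumsum issue, the disk offload is likely what dominates those 10 minutes. A hedged alternative, assuming the fp16 weights (roughly 15 GB for 7B parameters) fit in the machine's unified memory, is to pin everything to the Metal backend so no layer is paged to disk:

```python
from transformers import AutoModelForCausalLM
import torch

# a sketch, not a tested configuration: keep all weights on MPS,
# avoiding the per-layer disk round-trips that offload_folder incurs
model = AutoModelForCausalLM.from_pretrained(
    "/Users/sniper/model/Qwen-7b-chat",
    device_map={"": "mps"},
    torch_dtype=torch.float16,
    trust_remote_code=True,
).eval()
```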