GGUF interaction with Transformers using AutoModel Class #30889
Comments
Hi @Abdullah-kwl
Yes, I am using an updated transformers library.
Thanks @Abdullah-kwl, will try to repro and report back here
Hi @Abdullah-kwl
@younesbelkada Yes, I ran it again and it is working now; there was a dependency conflict with some other libraries. But now I am facing a new problem: my session crashes after using all available RAM. I think it is loading the model fully into RAM, whereas with llama-cpp-python it does not load the model into RAM that way, and I can easily run inference with larger models, even bigger than 7B. Try it with this model: model_id = "TheBloke/WestLake-7B-v2-GGUF"
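The RAM crash is consistent with transformers dequantizing the GGUF weights to full precision on load, while llama.cpp keeps them quantized (and memory-maps the file). A back-of-the-envelope sketch of the difference, under that assumption:

```python
# Rough RAM estimate, assuming transformers dequantizes GGUF weights
# to fp32 on load (4 bytes per parameter), unlike llama.cpp which
# keeps them in their quantized form.
def fp32_ram_gb(n_params: float) -> float:
    """RAM needed to hold n_params weights in fp32, in GB."""
    return n_params * 4 / 1e9

print(fp32_ram_gb(7e9))  # a 7B model needs ~28 GB in fp32
```

That is far more than the ~6 GB GGUF file on disk, which would explain why a 7B model fits under llama-cpp-python but exhausts RAM here.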
Feature request
https://huggingface.co/docs/transformers/main/en/gguf
The documentation above shows that GGUF models can be loaded, and provides this simple example:
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
filename = "tinyllama-1.1b-chat-v1.0.Q6_K.gguf"
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
But when I run the code it shows the error:
OSError: TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
and sometimes it says my transformers library may not be up to date.
Even after updating transformers it shows the same error and does not load the GGUF model. Please add support for loading GGUF models.
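Since the error message points at an outdated library, it may help to verify the installed version before debugging further. A minimal sketch, assuming GGUF support landed around transformers v4.41.0 (check the release notes for the exact minimum):

```python
# Sketch: check whether the installed transformers version is new enough
# for GGUF loading. The 4.41.0 minimum is an assumption, not confirmed
# against the release notes.
import importlib.metadata

MIN_GGUF_VERSION = (4, 41, 0)  # assumed minimum release with gguf_file support

def version_tuple(v: str) -> tuple:
    """Parse a 'major.minor.patch' string into a comparable tuple."""
    return tuple(int(part) for part in v.split(".")[:3])

def gguf_ready() -> bool:
    """True if the installed transformers version meets the assumed minimum."""
    installed = importlib.metadata.version("transformers")
    return version_tuple(installed) >= MIN_GGUF_VERSION
```

If `gguf_ready()` returns False, `pip install -U transformers` (or installing from source) would be the first thing to try before treating this as a bug.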
Motivation
It would make it possible to load GGUF models without the help of other libraries such as llama.cpp or Ollama.
Your contribution
I do not have a complete implementation in mind, but I suggest starting from the method described in https://huggingface.co/docs/transformers/main/en/gguf