
raise NameError(f"Could not load Llama model from path: {model_path}") NameError: Could not load Llama model from path: models/ggml-model-q4_0.bin #140

Closed
b007zk opened this issue May 14, 2023 · 16 comments
Labels
primordial Related to the primordial version of PrivateGPT, which is now frozen in favour of the new PrivateGPT

Comments

b007zk commented May 14, 2023

PS D:\privateGPT> python .\privateGPT.py
llama.cpp: loading model from models/ggml-model-q4_0.bin
llama.cpp: can't use mmap because tensors are not aligned; convert to new format to avoid this
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1000
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
error loading model: this format is no longer supported (see ggerganov/llama.cpp#1305)
llama_init_from_file: failed to load model
Traceback (most recent call last):
File "C:\Users<user>\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\embeddings\llamacpp.py", line 78, in validate_environment
values["client"] = Llama(
^^^^^^
File "C:\Users<user>\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_cpp\llama.py", line 161, in init
assert self.ctx is not None
^^^^^^^^^^^^^^^^^^^^
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\privateGPT\privateGPT.py", line 57, in
main()
File "D:\privateGPT\privateGPT.py", line 21, in main
llama = LlamaCppEmbeddings(model_path=llama_embeddings_model, n_ctx=model_n_ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pydantic\main.py", line 339, in pydantic.main.BaseModel.init
File "pydantic\main.py", line 1102, in pydantic.main.validate_model
File "C:\Users<user>\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\embeddings\llamacpp.py", line 98, in validate_environment
raise NameError(f"Could not load Llama model from path: {model_path}")
NameError: Could not load Llama model from path: models/ggml-model-q4_0.bin
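The key line is llama.cpp's own "error loading model: this format is no longer supported (see ggerganov/llama.cpp#1305)", so the langchain/pydantic layers are just re-raising a load failure. A minimal repro that bypasses langchain entirely (a sketch, assuming llama-cpp-python is installed and the path from my .env):

# Load the model directly with llama-cpp-python; on an old-format GGML
# file this fails the same way the wrapped call does.
from llama_cpp import Llama

try:
    llm = Llama(model_path="models/ggml-model-q4_0.bin", n_ctx=1000)
except Exception as e:  # older versions surface this as a bare AssertionError
    print(f"llama.cpp could not load the model: {e}")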

@BassAzayda

Did you download and place the model in the models folder?

b007zk commented May 14, 2023

@BassAzayda Yup, the models are in a folder called models.

And this is what's in my .env:

PERSIST_DIRECTORY=db
LLAMA_EMBEDDINGS_MODEL=models/ggml-model-q4_0.bin
MODEL_TYPE=GPT4All
MODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin
MODEL_N_CTX=1000
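To rule out env parsing, here's a quick sanity check (a sketch assuming python-dotenv, which privateGPT already uses, run from the project root):

# Confirm the .env values load and the model files exist on disk.
import os
from dotenv import load_dotenv

load_dotenv()
for key in ("LLAMA_EMBEDDINGS_MODEL", "MODEL_PATH"):
    value = os.environ.get(key)
    print(key, "->", value, "| exists:", os.path.exists(value) if value else "n/a")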

@BassAzayda

If you're on Windows, could it be that the slashes need to go the other way? Sorry, Mac user here.

b007zk commented May 14, 2023

@BassAzayda I thought about that as a possibility, which is why I'm also trying it on WSL (Ubuntu).

b007zk commented May 14, 2023

@BassAzayda Unfortunately changing the slashes doesn't work :( Nor does providing the full path. It seems the path isn't the problem here, but rather an actual failure to load the model, possibly tied to the Python version and the library being used.

b007zk commented May 14, 2023

@imartinez Any solution to this?

@d2rgaming-9000

If it's a Python-version-related problem, you need a compatible Python version. I think someone with a similar problem solved it by updating theirs to the latest.

b007zk commented May 14, 2023

@d2rgaming-9000 I do have the latest Python, 3.11.3.

smileBeda commented May 15, 2023

@b007zk have you tried passing an absolute path instead of a relative path in the .env?
That is what the readme asks for, and it's what worked for me.
LLAMA_EMBEDDINGS_MODEL=/abspath/models/ggml-model-q4_0.bin
MODEL_PATH=/abspath/models/ggml-gpt4all-j-v1.3-groovy.bin

Where /abspath/ is the full, absolute path to that file.

And make sure you use =, not : (I ran into the same issue when I had a typo there).
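If you're not sure what the absolute path is, a throwaway snippet like this prints it (run from the privateGPT directory; filenames assumed from the default setup):

# Print the absolute paths to paste into .env.
from pathlib import Path
print(Path("models/ggml-model-q4_0.bin").resolve())
print(Path("models/ggml-gpt4all-j-v1.3-groovy.bin").resolve())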

b007zk commented May 16, 2023

@smileBeda Yes, I have tried that; no luck, unfortunately. I don't think the path is the problem here.

@aHardReset

@b007zk I had the exact same issue in WSL. Please take a look at #198. I believe it addresses and solves this problem: it introduces a file validator in the ingest.py module using pathlib.Path. You can review the changes and verify whether they solve your issue.
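For anyone skimming, the gist is roughly this (a sketch of the idea, not the exact code from #198):

# Fail fast with a clear error if the model file does not exist,
# instead of letting llama.cpp fail later with an opaque assertion.
from pathlib import Path

def validate_model_path(raw_path: str) -> Path:
    path = Path(raw_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Model file not found: {path}")
    return path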

b007zk commented May 18, 2023

@aHardReset Hey, thanks for your comment. I tried your changes, but unfortunately I'm still running into the following issue when I run privateGPT.py:

Traceback (most recent call last):
File "D:\privateGPT\privateGPT.py", line 57, in
main()
File "D:\privateGPT\privateGPT.py", line 21, in main
embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users<user>\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\embeddings\huggingface.py", line 44, in init
super().init(**kwargs)
File "pydantic\main.py", line 341, in pydantic.main.BaseModel.init
pydantic.error_wrappers.ValidationError: 1 validation error for HuggingFaceEmbeddings
model_name
none is not an allowed value (type=type_error.none.not_allowed)
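Reading the error, model_name arrived as None, i.e. whatever variable the updated code reads from .env for the embeddings model isn't set. A quick check (variable name assumed from the later privateGPT example.env; verify against the branch you're running):

# Hypothetical: if the code reads EMBEDDINGS_MODEL_NAME and it is missing,
# HuggingFaceEmbeddings gets model_name=None and pydantic rejects it.
import os
from dotenv import load_dotenv

load_dotenv()
print(os.environ.get("EMBEDDINGS_MODEL_NAME") or "EMBEDDINGS_MODEL_NAME is not set in .env")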

kuppu commented May 20, 2023

pip install llama-cpp-python==0.1.48 resolved my issue, along with
pip install 'pygpt4all==v1.0.1' --force-reinstall

when using
https://huggingface.co/mrgaang/aira/blob/main/gpt4all-converted.bin
https://huggingface.co/Pi3141/alpaca-native-7B-ggml/blob/main/ggml-model-q4_0.bin
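To confirm the pins actually took effect after reinstalling:

# Check the installed versions pip resolved to.
from importlib.metadata import version
print(version("llama-cpp-python"))  # expect 0.1.48
print(version("pygpt4all"))         # expect 1.0.1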

b007zk commented May 23, 2023

@kuppu Thanks, but that didn't work for me.

@BassAzayda

Try running Python 3.10.x; use pyenv (if available on Windows) so you can switch environments, and also try another embeddings model from Hugging Face.

Anonym0us33 commented May 26, 2023

cpp is pretty f*'d up, so is it possible to just use the koboldcpp.exe file as a server for this?
As far as I can tell, there is no reason to have to navigate any dependencies besides langchain and maybe web browsing for this project. IMO (and I am very stupid and unemployed), best practice would be to have the model hosted by an extension, with options such as:

GPT API
llama.cpp
localhost
remotehost
and koboldcpp.exe 

and then have

langchain
urllib3
tabulate
tqdm

or whatever as core dependencies. PyTorch is also often an important dependency for llama models to run above 10 t/s, but different GPUs have different CUDA requirements, e.g. Tesla K80/P40/H100 or GTX 660/RTX 4090, not to mention AMD.

@imartinez imartinez added the primordial Related to the primordial version of PrivateGPT, which is now frozen in favour of the new PrivateGPT label Oct 19, 2023