
pydantic.error_wrappers.ValidationError: 1 validation error for GPT4All #7778

Closed
1 of 14 tasks
CreepyGutties opened this issue Jul 16, 2023 · 6 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature Ɑ: embeddings Related to text embedding models module

Comments

@CreepyGutties

System Info

Python 3.8.10
gpt4all==1.0.5
langchain==0.0.234
pydantic==1.10.11
pydantic-core==2.3.0

Who can help?

No response

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

Code snippet:
from langchain import PromptTemplate, LLMChain
from langchain.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import GPT4All

embedding_model_name = "hkunlp/instructor-large"
persist_directory = 'db'
callbacks = [StreamingStdOutCallbackHandler()]
model_path = "/home/imrohankar/gpttest/DocGPT/models/ggml-gpt4all-j-v1.3-groovy.bin"

llm = GPT4All(
    model=model_path,
    callbacks=callbacks,
    verbose=False,
)

Error:
Traceback (most recent call last):
  File "/home/imrohankar/gpttest/DocGPT/conversation.py", line 15, in <module>
    llm = GPT4All(
  File "/home/imrohankar/gpttest/DocGPT/venv/lib/python3.8/site-packages/langchain/load/serializable.py", line 74, in __init__
    super().__init__(**kwargs)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for GPT4All
__root__
  'type' object is not subscriptable (type=type_error)

Note: I tried downgrading the pydantic and langchain versions, but the error persists. I am unable to understand why it raises a type error while initializing GPT4All.

Expected behavior

I would expect GPT4All to initialize and load the model file from the given path instead of raising a ValidationError.
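Independent of the pydantic error, a quick path check can rule out a missing or mistyped model file before constructing the LLM. `check_model_path` is a hypothetical helper for illustration, not part of langchain:

```python
from pathlib import Path

def check_model_path(model_path: str) -> Path:
    """Return the resolved path if the model file exists, else raise."""
    p = Path(model_path).expanduser().resolve()
    if not p.is_file():
        raise FileNotFoundError(f"Model file not found: {p}")
    return p

# Usage (path taken from the report above):
# check_model_path("/home/imrohankar/gpttest/DocGPT/models/ggml-gpt4all-j-v1.3-groovy.bin")
```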

@dosubot dosubot bot added Ɑ: embeddings Related to text embedding models module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Jul 16, 2023
@cserpell

I am having a similar issue. This is a minimal code example to reproduce the error:

from langchain.llms.gpt4all import GPT4All

llm = GPT4All(model="./models/gpt4all-lora-quantized-ggml.bin")

I get the following error:

Traceback (most recent call last):
  File "gpt4all_me.py", line 3, in <module>
    llm = GPT4All(model="./models/gpt4all-lora-quantized-ggml.bin")
  File "/home/cserpell/git/activelooplangchain/a/lib/python3.8/site-packages/langchain/load/serializable.py", line 74, in __init__
    super().__init__(**kwargs)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for GPT4All
__root__ -> __root__
  __init__() takes 1 positional argument but 2 were given (type=type_error)

I tried giving the directory without the ./, without the ./model/, putting the file in the current directory, and some other options, with no success.

Versions:
Python 3.8.15

Package Version
aiohttp 3.8.4
aiosignal 1.3.1
async-timeout 4.0.2
attrs 23.1.0
certifi 2023.5.7
charset-normalizer 3.2.0
dataclasses-json 0.5.9
frozenlist 1.4.0
gpt4all 1.0.5
greenlet 2.0.2
idna 3.4
langchain 0.0.234
langsmith 0.0.5
marshmallow 3.19.0
marshmallow-enum 1.5.1
multidict 6.0.4
mypy-extensions 1.0.0
numexpr 2.8.4
numpy 1.24.4
openapi-schema-pydantic 1.2.4
packaging 23.1
pip 23.2
pydantic 1.10.11
PyYAML 6.0
requests 2.31.0
setuptools 56.0.0
SQLAlchemy 2.0.19
tenacity 8.2.2
tqdm 4.65.0
typing_extensions 4.7.1
typing-inspect 0.9.0
urllib3 2.0.3
yarl 1.9.2

@dosubot

dosubot bot commented Jul 16, 2023

Answer generated by a 🤖

Answer

From your description, it appears that you're encountering a 'type' object is not subscriptable (type=type_error) error when trying to initialize the GPT4All class in the LangChain framework. This error arises when you use subscript notation on a type object that doesn't support it, which before Python 3.9 includes built-in types such as list and dict. The issue doesn't seem to be directly related to the GPT4All class, but rather how it's being initialized.

In your code, you're initializing the GPT4All class as follows:

llm = GPT4All(
    model = model_path,
    callbacks = callbacks,
    verbose = False
)

The error might be stemming from the model_path variable. Please ensure that model_path is a string that accurately points to the location of your model file. If the model file doesn't exist or the path is incorrect, this could be the source of the error.

Additionally, it's important to verify that your model file is compatible with the GPT4All class. The model file should be in the ggml format, as indicated in the context:

To run locally, download a compatible ggml-formatted model.

If the model file isn't in the correct format, this could also be causing the error.

If the problem persists, it could be due to a compatibility issue between the versions of the libraries you're using. You might need to use specific versions of the libraries that are compatible with each other.

In the context shared, it's shown that the GPT4All class can also be initialized with a max_tokens parameter. If your model requires a specific number of tokens, you might need to include this parameter when initializing the GPT4All class:

llm = GPT4All(model=model_path, max_tokens=2048, callbacks=callbacks, verbose=False)

Please ensure that the number of tokens specified in the max_tokens parameter matches the requirements of your model.

This is one potential solution to your problem. If you believe this answer is correct and it's a bug that impacts other users, you're encouraged to make a pull request. If this answer contradicts your experience, you might want to consider upgrading to the latest version of the relevant libraries.


This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

@CreepyGutties
Author

> Answer generated by a 🤖 (quoted in full above)

I have tried these approaches but was unable to figure it out. The only thing I have not tried yet is upgrading my Python version. I know a few other folks for whom the same piece of code works on Python > 3.9.

@CreepyGutties
Author

I was able to fix this issue by downgrading to the below package versions.
pydantic==1.10.10
gpt4all==0.3.4
langchain==0.0.197
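A small runtime guard can make the pinned-version assumption explicit; this is a generic sketch using the standard library's importlib.metadata, not a langchain feature (typing.Dict is used instead of dict[...] so the snippet itself stays valid on Python 3.8):

```python
from importlib import metadata
from typing import Dict

# Versions that reportedly avoided the ValidationError in this thread.
PINNED = {"pydantic": "1.10.10", "gpt4all": "0.3.4", "langchain": "0.0.197"}

def report_version_drift(pinned: Dict[str, str]) -> Dict[str, str]:
    """Return {package: installed version} for every pin that does not match."""
    drift = {}
    for pkg, wanted in pinned.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            installed = "not installed"
        if installed != wanted:
            drift[pkg] = installed
    return drift

# Usage: report_version_drift(PINNED) returns {} when the environment matches the pins.
```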

@cserpell

How would downgrading to a previous langchain version solve the issue?

@CreepyGutties
Author

> How would downgrading to a previous langchain version solve the issue?

I am not certain how it works, but it appeared to me that the higher versions of pydantic did not combine well with the higher versions of langchain and gpt4all.

After running tests for a few days, I realized that the latest versions of langchain and gpt4all work perfectly fine on Python > 3.10 without hitting pydantic ValidationErrors.
So it is better to upgrade the Python version if anyone is on a lower one.
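The Python-version dependence is consistent with PEP 585: subscripting built-in types such as list[str] only works at runtime from Python 3.9 onward, so an annotation of that form anywhere in the dependency chain would raise exactly this TypeError under Python 3.8. A stdlib-only sketch of the same failure mode, reproducible on any version (PlainType is a hypothetical stand-in for a non-subscriptable class):

```python
class PlainType:
    """A class without __class_getitem__; PlainType[int] is invalid on every Python version."""

def subscript_error_message() -> str:
    """Trigger the same kind of TypeError that pydantic surfaced in this issue."""
    try:
        PlainType[int]  # on Python 3.8, list[str] fails in this same way
    except TypeError as exc:
        return str(exc)
    return "no error"

print(subscript_error_message())  # a "... not subscriptable" TypeError message
```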
