Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Getting validation error for SQL Database, type=value_error.missing, when its trying to list tables in the db #20405

Open
5 tasks done
RadientBrain opened this issue Apr 13, 2024 · 6 comments
Labels
Ɑ: agent Related to agents module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: postgres Related to postgres integrations

Comments

@RadientBrain
Copy link

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

I have the following code:

from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
db = SQLDatabase.from_uri(database_uri="%POSTGRES_URI%")
prompt = "what is the total corpus of money in banks"
conversation = create_sql_agent(llm=llm, db=db, agent_type="openai-tools", verbose=True, top_k=15)
return conversation.invoke(input={"input": prompt})

Error Message and Stack Trace (if applicable)

Entering new SQL Agent Executor chain...
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

Invoking: sql_db_list_tables with {}

ERROR:
1 validation error for _ListSQLDataBaseToolInput
tool_input
field required (type=value_error.missing)

Description

I recently updated langchain and pydantic to their latest version and my code broke and when using the sql agent I get input missing issue. I think the invoking of sql_db_list_tables is not putting any value into when trying to parse the description of table schema. And this is a bug from langchain library itself.

Kindly help into this issue, as this is becoming a blocker. Thanks.

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.1.0: Mon Oct 9 21:27:24 PDT 2023; root:xnu-10002.41.9~6/RELEASE_ARM64_T6000
Python Version: 3.11.7 (main, Dec 4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information

langchain_core: 0.1.42
langchain: 0.1.16
langchain_community: 0.0.32
langsmith: 0.1.45
langchain_experimental: 0.0.50
langchain_openai: 0.1.3
langchain_pinecone: 0.0.3
langchain_text_splitters: 0.0.1
langchainhub: 0.1.14
pydantic-settings: 2.1.0
pydantic: 2.7.0
pydantic-core: 2.18.1

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

@dosubot dosubot bot added Ɑ: agent Related to agents module 🔌: postgres Related to postgres integrations 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Apr 13, 2024
@os1ma
Copy link
Contributor

os1ma commented Apr 13, 2024

The patch I created mentioned, but it seems unrelated to this issue.
Will it work if langchain_openai is version 0.1.2 and could it cause the error if langchain_openai is version 0.1.3?

@RadientBrain
Copy link
Author

RadientBrain commented Apr 13, 2024

Hey @os1ma , It can be possible, if there is some dependency issue, so when I upgrade the langchain_openai to latest version to use gpt-4-turbo, I am not able to run the sql_agent:

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 400 Bad Request"

Error in sql_agent: Error code: 400 - {'error': {'message': "Unknown parameter: 'messages[3].tool_calls[0].index'.", 'type': 'invalid_request_error', 'param': 'messages[3].tool_calls[0].index', 'code': 'unknown_parameter'}}

Error in post bot answer: Error code: 400 - {'error': {'message': "Unknown parameter: 'messages[3].tool_calls[0].index'.", 'type': 'invalid_request_error', 'param': 'messages[3].tool_calls[0].index', 'code': 'unknown_parameter'}}

and when I downgrade it to langchain_openai==0.0.8, then I am not able to use gpt-4-turbo.

Its a strange issue but mostly something related to the dependency issue or something else maybe.

@os1ma
Copy link
Contributor

os1ma commented Apr 13, 2024

I was curious about this error, so I looked into it and found the cause.

If anything, it is more of a gpt-4-turbo response issue than a LangChain issue.
At least, it is not a problem caused by the patch I created.

Based on the SQL Agent Document, I tried the following code and encountered the same error as you.

from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?")
print(result["output"])
python -m langchain_core.sys_info

System Information
------------------
> OS:  Linux
> OS Version:  #111~20.04.1-Ubuntu SMP Mon Mar 11 15:44:43 UTC 2024
> Python Version:  3.11.9 (main, Apr 10 2024, 18:31:06) [GCC 9.4.0]

Package Information
-------------------
> langchain_core: 0.1.42
> langchain: 0.1.16
> langchain_community: 0.0.32
> langsmith: 0.1.47
> langchain_openai: 0.1.3
> langchain_text_splitters: 0.0.1

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langgraph
> langserve

The follosing error occurs using gpt-4-turbo, but this error does not occur using gpt-3.5-turbo.

pydantic.v1.error_wrappers.ValidationError: 1 validation error for _ListSQLDataBaseToolInput
tool_input
  field required (type=value_error.missing)

The reason for this is the difference in the OpenAI API response.

gpt-3.5-turbo returns the following response

{"tool_input": ""}

gpt-4-turbo returns the following response

{}

LangSmith or the following code will allow you to see the difference of those responses.

This code emulates LangChain SQL Agent behavior.

from openai import OpenAI

MODEL = "gpt-3.5-turbo-0125"

message_dicts = [
    {
        "content": 'You are an agent designed to interact with a SQL database.\nGiven an input question, create a syntactically correct sqlite query to run, then look at the results of the query and return the answer.\nUnless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 10 results.\nYou can order the results by a relevant column to return the most interesting examples in the database.\nNever query for all the columns from a specific table, only ask for the relevant columns given the question.\nYou have access to tools for interacting with the database.\nOnly use the below tools. Only use the information returned by the below tools to construct your final answer.\nYou MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.\n\nDO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.\n\nIf the question does not seem related to the database, just return "I don\'t know" as the answer.\n',
        "role": "system",
    },
    {
        "content": "List the total sales per country. Which country's customers spent the most?",
        "role": "user",
    },
    {
        "content": "I should look at the tables in the database to see what I can query.  Then I should query the schema of the most relevant tables.",
        "role": "assistant",
    },
]

params = {
    "model": MODEL,
    "stream": True,
    "n": 1,
    "temperature": 0.0,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "sql_db_query",
                "description": "Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "description": "A detailed and correct SQL query.",
                            "type": "string",
                        }
                    },
                    "required": ["query"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "sql_db_schema",
                "description": "Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "table_names": {
                            "description": "A comma-separated list of the table names for which to return the schema. Example input: 'table1, table2, table3'",
                            "type": "string",
                        }
                    },
                    "required": ["table_names"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "sql_db_list_tables",
                "description": "Input is an empty string, output is a comma-separated list of tables in the database.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "tool_input": {
                            "description": "An empty string",
                            "type": "string",
                        }
                    },
                    "required": ["tool_input"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "sql_db_query_checker",
                "description": "Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "description": "A detailed and SQL query to be checked.",
                            "type": "string",
                        }
                    },
                    "required": ["query"],
                },
            },
        },
    ],
}

client = OpenAI()

with client.chat.completions.create(messages=message_dicts, **params) as response:
    for chunk in response:
        print(chunk)

As following code, _ListSQLDatabaseTool requires tool_input field, so the error does not cause with gpt-3.5-turbo response, but cause with gpt-4-turbo response.

class _ListSQLDataBaseToolInput(BaseModel):
tool_input: str = Field(..., description="An empty string")

I have tried it and it works fine on gpt-4-turbo with the following patch in the tool.

class _ListSQLDataBaseToolInput(BaseModel):
-    tool_input: str = Field(..., description="An empty string")
+    tool_input: str = Field("", description="An empty string")

I will try to create a pull request of this patch, but I am not sure if the LangChain team will accept it, since the root cause is a response issue of gpt-4-turbo.

@RadientBrain
Copy link
Author

I think langchain team should accept the PR raised by you because, though, the function calling is being done, but for some reason gpt-4-turbo ignores it for responding in particular way, and the ellipsis is not required in this case because its just an empty string, so as you provided a default value, its a quick fix, because that solves the issue.

Though maybe we need to take this issue to openai as well? Because gpt-4-turbo is not respecting function calling, I guess.

baskaryan pushed a commit that referenced this issue Apr 13, 2024
…l_input (#20409)

**Description:**

`_ListSQLDatabaseToolInput` raise error if model returns `{}`.
For example, gpt-4-turbo returns `{}` with SQL Agent initialized by
`create_sql_agent`.

So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input.

This is actually a gpt-4-turbo issue, not a LangChain issue, but I
thought it would be helpful to set a default value `""`.

This problem is discussed in detail in the following Issue.

**Issue:** #20405

**Dependencies:** none

Sorry, I did not add or change the test code, as tests for this
components was not exist .

However, I have tested the following code based on the [SQL Agent
Document](https://python.langchain.com/docs/use_cases/sql/agents/), to
make sure it works.

```
from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?")
print(result["output"])
```
@RadientBrain
Copy link
Author

This is merged! 😄 But I think it will take a little while to publish the package.

junkeon pushed a commit to UpstageAI/langchain that referenced this issue Apr 16, 2024
…l_input (langchain-ai#20409)

**Description:**

`_ListSQLDatabaseToolInput` raise error if model returns `{}`.
For example, gpt-4-turbo returns `{}` with SQL Agent initialized by
`create_sql_agent`.

So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input.

This is actually a gpt-4-turbo issue, not a LangChain issue, but I
thought it would be helpful to set a default value `""`.

This problem is discussed in detail in the following Issue.

**Issue:** langchain-ai#20405

**Dependencies:** none

Sorry, I did not add or change the test code, as tests for this
components was not exist .

However, I have tested the following code based on the [SQL Agent
Document](https://python.langchain.com/docs/use_cases/sql/agents/), to
make sure it works.

```
from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?")
print(result["output"])
```
naveentatikonda pushed a commit to naveentatikonda/langchain that referenced this issue Apr 19, 2024
…l_input (langchain-ai#20409)

**Description:**

`_ListSQLDatabaseToolInput` raise error if model returns `{}`.
For example, gpt-4-turbo returns `{}` with SQL Agent initialized by
`create_sql_agent`.

So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input.

This is actually a gpt-4-turbo issue, not a LangChain issue, but I
thought it would be helpful to set a default value `""`.

This problem is discussed in detail in the following Issue.

**Issue:** langchain-ai#20405

**Dependencies:** none

Sorry, I did not add or change the test code, as tests for this
components was not exist .

However, I have tested the following code based on the [SQL Agent
Document](https://python.langchain.com/docs/use_cases/sql/agents/), to
make sure it works.

```
from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?")
print(result["output"])
```
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…l_input (#20409)

**Description:**

`_ListSQLDatabaseToolInput` raise error if model returns `{}`.
For example, gpt-4-turbo returns `{}` with SQL Agent initialized by
`create_sql_agent`.

So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input.

This is actually a gpt-4-turbo issue, not a LangChain issue, but I
thought it would be helpful to set a default value `""`.

This problem is discussed in detail in the following Issue.

**Issue:** #20405

**Dependencies:** none

Sorry, I did not add or change the test code, as tests for this
components was not exist .

However, I have tested the following code based on the [SQL Agent
Document](https://python.langchain.com/docs/use_cases/sql/agents/), to
make sure it works.

```
from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?")
print(result["output"])
```
@mohammedayub44
Copy link

Ran into this error while testing turbo today. Looking out for the new release. Thanks !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ɑ: agent Related to agents module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: postgres Related to postgres integrations
Projects
None yet
Development

No branches or pull requests

3 participants