[BUG] Getting validation error for SQL Database, type=value_error.missing, when its trying to list tables in the db #20405

RadientBrain · 2024-04-13T04:14:38Z

Checked other resources

I added a very descriptive title to this issue.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

I have the following code:

from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
db = SQLDatabase.from_uri(database_uri="%POSTGRES_URI%")
prompt = "what is the total corpus of money in banks"
conversation = create_sql_agent(llm=llm, db=db, agent_type="openai-tools", verbose=True, top_k=15)
return conversation.invoke(input={"input": prompt})

Error Message and Stack Trace (if applicable)

Entering new SQL Agent Executor chain...
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

Invoking: sql_db_list_tables with {}

ERROR:
1 validation error for _ListSQLDataBaseToolInput
tool_input
field required (type=value_error.missing)

Description

I recently updated langchain and pydantic to their latest version and my code broke and when using the sql agent I get input missing issue. I think the invoking of sql_db_list_tables is not putting any value into when trying to parse the description of table schema. And this is a bug from langchain library itself.

Kindly help into this issue, as this is becoming a blocker. Thanks.

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.1.0: Mon Oct 9 21:27:24 PDT 2023; root:xnu-10002.41.9~6/RELEASE_ARM64_T6000
Python Version: 3.11.7 (main, Dec 4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information

langchain_core: 0.1.42
langchain: 0.1.16
langchain_community: 0.0.32
langsmith: 0.1.45
langchain_experimental: 0.0.50
langchain_openai: 0.1.3
langchain_pinecone: 0.0.3
langchain_text_splitters: 0.0.1
langchainhub: 0.1.14
pydantic-settings: 2.1.0
pydantic: 2.7.0
pydantic-core: 2.18.1

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

The text was updated successfully, but these errors were encountered:

os1ma · 2024-04-13T06:50:04Z

The patch I created mentioned, but it seems unrelated to this issue.
Will it work if langchain_openai is version 0.1.2 and could it cause the error if langchain_openai is version 0.1.3?

RadientBrain · 2024-04-13T07:07:54Z

Hey @os1ma , It can be possible, if there is some dependency issue, so when I upgrade the langchain_openai to latest version to use gpt-4-turbo, I am not able to run the sql_agent:

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 400 Bad Request"

Error in sql_agent: Error code: 400 - {'error': {'message': "Unknown parameter: 'messages[3].tool_calls[0].index'.", 'type': 'invalid_request_error', 'param': 'messages[3].tool_calls[0].index', 'code': 'unknown_parameter'}}

Error in post bot answer: Error code: 400 - {'error': {'message': "Unknown parameter: 'messages[3].tool_calls[0].index'.", 'type': 'invalid_request_error', 'param': 'messages[3].tool_calls[0].index', 'code': 'unknown_parameter'}}

and when I downgrade it to langchain_openai==0.0.8, then I am not able to use gpt-4-turbo.

Its a strange issue but mostly something related to the dependency issue or something else maybe.

os1ma · 2024-04-13T08:31:09Z

I was curious about this error, so I looked into it and found the cause.

If anything, it is more of a gpt-4-turbo response issue than a LangChain issue.
At least, it is not a problem caused by the patch I created.

Based on the SQL Agent Document, I tried the following code and encountered the same error as you.

from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?")
print(result["output"])

python -m langchain_core.sys_info

System Information
------------------
> OS:  Linux
> OS Version:  #111~20.04.1-Ubuntu SMP Mon Mar 11 15:44:43 UTC 2024
> Python Version:  3.11.9 (main, Apr 10 2024, 18:31:06) [GCC 9.4.0]

Package Information
-------------------
> langchain_core: 0.1.42
> langchain: 0.1.16
> langchain_community: 0.0.32
> langsmith: 0.1.47
> langchain_openai: 0.1.3
> langchain_text_splitters: 0.0.1

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langgraph
> langserve

The follosing error occurs using gpt-4-turbo, but this error does not occur using gpt-3.5-turbo.

pydantic.v1.error_wrappers.ValidationError: 1 validation error for _ListSQLDataBaseToolInput
tool_input
  field required (type=value_error.missing)

The reason for this is the difference in the OpenAI API response.

gpt-3.5-turbo returns the following response

{"tool_input": ""}

gpt-4-turbo returns the following response

{}

LangSmith or the following code will allow you to see the difference of those responses.

This code emulates LangChain SQL Agent behavior.

from openai import OpenAI

MODEL = "gpt-3.5-turbo-0125"

message_dicts = [
    {
        "content": 'You are an agent designed to interact with a SQL database.\nGiven an input question, create a syntactically correct sqlite query to run, then look at the results of the query and return the answer.\nUnless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 10 results.\nYou can order the results by a relevant column to return the most interesting examples in the database.\nNever query for all the columns from a specific table, only ask for the relevant columns given the question.\nYou have access to tools for interacting with the database.\nOnly use the below tools. Only use the information returned by the below tools to construct your final answer.\nYou MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.\n\nDO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.\n\nIf the question does not seem related to the database, just return "I don\'t know" as the answer.\n',
        "role": "system",
    },
    {
        "content": "List the total sales per country. Which country's customers spent the most?",
        "role": "user",
    },
    {
        "content": "I should look at the tables in the database to see what I can query.  Then I should query the schema of the most relevant tables.",
        "role": "assistant",
    },
]

params = {
    "model": MODEL,
    "stream": True,
    "n": 1,
    "temperature": 0.0,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "sql_db_query",
                "description": "Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "description": "A detailed and correct SQL query.",
                            "type": "string",
                        }
                    },
                    "required": ["query"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "sql_db_schema",
                "description": "Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "table_names": {
                            "description": "A comma-separated list of the table names for which to return the schema. Example input: 'table1, table2, table3'",
                            "type": "string",
                        }
                    },
                    "required": ["table_names"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "sql_db_list_tables",
                "description": "Input is an empty string, output is a comma-separated list of tables in the database.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "tool_input": {
                            "description": "An empty string",
                            "type": "string",
                        }
                    },
                    "required": ["tool_input"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "sql_db_query_checker",
                "description": "Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "description": "A detailed and SQL query to be checked.",
                            "type": "string",
                        }
                    },
                    "required": ["query"],
                },
            },
        },
    ],
}

client = OpenAI()

with client.chat.completions.create(messages=message_dicts, **params) as response:
    for chunk in response:
        print(chunk)

As following code, _ListSQLDatabaseTool requires tool_input field, so the error does not cause with gpt-3.5-turbo response, but cause with gpt-4-turbo response.

langchain/libs/community/langchain_community/tools/sql_database/tool.py

Lines 81 to 82 in 6dc4f59

    
           class _ListSQLDataBaseToolInput(BaseModel): 
        
               tool_input: str = Field(..., description="An empty string")

I have tried it and it works fine on gpt-4-turbo with the following patch in the tool.

class _ListSQLDataBaseToolInput(BaseModel):
-    tool_input: str = Field(..., description="An empty string")
+    tool_input: str = Field("", description="An empty string")

I will try to create a pull request of this patch, but I am not sure if the LangChain team will accept it, since the root cause is a response issue of gpt-4-turbo.

RadientBrain · 2024-04-13T10:07:33Z

I think langchain team should accept the PR raised by you because, though, the function calling is being done, but for some reason gpt-4-turbo ignores it for responding in particular way, and the ellipsis is not required in this case because its just an empty string, so as you provided a default value, its a quick fix, because that solves the issue.

Though maybe we need to take this issue to openai as well? Because gpt-4-turbo is not respecting function calling, I guess.

…l_input (#20409) **Description:** `_ListSQLDatabaseToolInput` raise error if model returns `{}`. For example, gpt-4-turbo returns `{}` with SQL Agent initialized by `create_sql_agent`. So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input. This is actually a gpt-4-turbo issue, not a LangChain issue, but I thought it would be helpful to set a default value `""`. This problem is discussed in detail in the following Issue. **Issue:** #20405 **Dependencies:** none Sorry, I did not add or change the test code, as tests for this components was not exist . However, I have tested the following code based on the [SQL Agent Document](https://python.langchain.com/docs/use_cases/sql/agents/), to make sure it works. ``` from langchain_community.agent_toolkits.sql.base import create_sql_agent from langchain_community.utilities.sql_database import SQLDatabase from langchain_openai import ChatOpenAI db = SQLDatabase.from_uri("sqlite:///Chinook.db") llm = ChatOpenAI(model="gpt-4-turbo", temperature=0) agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True) result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?") print(result["output"]) ```

RadientBrain · 2024-04-14T03:52:17Z

This is merged! 😄 But I think it will take a little while to publish the package.

…l_input (langchain-ai#20409) **Description:** `_ListSQLDatabaseToolInput` raise error if model returns `{}`. For example, gpt-4-turbo returns `{}` with SQL Agent initialized by `create_sql_agent`. So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input. This is actually a gpt-4-turbo issue, not a LangChain issue, but I thought it would be helpful to set a default value `""`. This problem is discussed in detail in the following Issue. **Issue:** langchain-ai#20405 **Dependencies:** none Sorry, I did not add or change the test code, as tests for this components was not exist . However, I have tested the following code based on the [SQL Agent Document](https://python.langchain.com/docs/use_cases/sql/agents/), to make sure it works. ``` from langchain_community.agent_toolkits.sql.base import create_sql_agent from langchain_community.utilities.sql_database import SQLDatabase from langchain_openai import ChatOpenAI db = SQLDatabase.from_uri("sqlite:///Chinook.db") llm = ChatOpenAI(model="gpt-4-turbo", temperature=0) agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True) result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?") print(result["output"]) ```

…l_input (#20409) **Description:** `_ListSQLDatabaseToolInput` raise error if model returns `{}`. For example, gpt-4-turbo returns `{}` with SQL Agent initialized by `create_sql_agent`. So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input. This is actually a gpt-4-turbo issue, not a LangChain issue, but I thought it would be helpful to set a default value `""`. This problem is discussed in detail in the following Issue. **Issue:** #20405 **Dependencies:** none Sorry, I did not add or change the test code, as tests for this components was not exist . However, I have tested the following code based on the [SQL Agent Document](https://python.langchain.com/docs/use_cases/sql/agents/), to make sure it works. ``` from langchain_community.agent_toolkits.sql.base import create_sql_agent from langchain_community.utilities.sql_database import SQLDatabase from langchain_openai import ChatOpenAI db = SQLDatabase.from_uri("sqlite:///Chinook.db") llm = ChatOpenAI(model="gpt-4-turbo", temperature=0) agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True) result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?") print(result["output"]) ```

mohammedayub44 · 2024-05-08T20:14:42Z

Ran into this error while testing turbo today. Looking out for the new release. Thanks !

dosubot bot added Ɑ: agent Related to agents module 🔌: postgres Related to postgres integrations 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Apr 13, 2024

This was referenced Apr 13, 2024

OpenAI tools agent error if use with gpt-4-turbo-2024-04-09 #20264

Closed

openai[patch]: Fix langchain-openai unknown parameter error with gpt-4-turbo #20271

Merged

os1ma mentioned this issue Apr 13, 2024

community: Set default value for _ListSQLDatabaseToolInput tool_input #20409

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Getting validation error for SQL Database, type=value_error.missing, when its trying to list tables in the db #20405

[BUG] Getting validation error for SQL Database, type=value_error.missing, when its trying to list tables in the db #20405

RadientBrain commented Apr 13, 2024

os1ma commented Apr 13, 2024

RadientBrain commented Apr 13, 2024 •

edited

os1ma commented Apr 13, 2024

RadientBrain commented Apr 13, 2024

RadientBrain commented Apr 14, 2024

mohammedayub44 commented May 8, 2024

[BUG] Getting validation error for SQL Database, type=value_error.missing, when its trying to list tables in the db #20405

[BUG] Getting validation error for SQL Database, type=value_error.missing, when its trying to list tables in the db #20405

Comments

RadientBrain commented Apr 13, 2024

Checked other resources

Example Code

Error Message and Stack Trace (if applicable)

Description

System Info

System Information

Package Information

Packages not installed (Not Necessarily a Problem)

os1ma commented Apr 13, 2024

RadientBrain commented Apr 13, 2024 • edited

os1ma commented Apr 13, 2024

RadientBrain commented Apr 13, 2024

RadientBrain commented Apr 14, 2024

mohammedayub44 commented May 8, 2024

RadientBrain commented Apr 13, 2024 •

edited