From 8a3581155655f296dd117c21212c996e29ece083 Mon Sep 17 00:00:00 2001 From: Jon Luo <20971593+jzluo@users.noreply.github.com> Date: Thu, 23 Feb 2023 01:21:26 -0500 Subject: [PATCH] Don't instruct LLM to use the LIMIT clause, which is incompatible with SQL Server (#1242) The current prompt specifically instructs the LLM to use the `LIMIT` clause. This will cause issues with MS SQL Server, which uses `SELECT TOP` instead of `LIMIT`. The generated SQL will use `LIMIT`; the instruction to "always limit... using the LIMIT clause" seems to override the "create a syntactically correct mssql query to run" portion. Reported here: https://github.com/hwchase17/langchain/issues/1103#issuecomment-1441144224 I don't have access to a SQL Server instance to test, but removing that part of the prompt in OpenAI Playground results in the correct `SELECT TOP` syntax, whereas keeping it in results in the `LIMIT` clause, even when instructing it to generate syntactically correct mssql. It's also still correctly using `LIMIT` in my MariaDB database. I think in this case we can assume that the model will select the appropriate method based on the dialect specified. In general, it would be nice to be able to test a suite of SQL dialects for things like dialect-specific syntax and other issues we've run into in the past, but I'm not quite sure how to best approach that yet. --- langchain/chains/sql_database/prompt.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/langchain/chains/sql_database/prompt.py b/langchain/chains/sql_database/prompt.py index 127579e2c0b28f..8b0fd1529e5f36 100644 --- a/langchain/chains/sql_database/prompt.py +++ b/langchain/chains/sql_database/prompt.py @@ -2,7 +2,7 @@ from langchain.prompts.base import CommaSeparatedListOutputParser from langchain.prompts.prompt import PromptTemplate -_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer. Unless the user specifies in his question a specific number of examples he wishes to obtain, always limit your query to at most {top_k} results using the LIMIT clause. You can order the results by a relevant column to return the most interesting examples in the database. +_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer. Unless the user specifies in his question a specific number of examples he wishes to obtain, always limit your query to at most {top_k} results. You can order the results by a relevant column to return the most interesting examples in the database. Never query for all the columns from a specific table, only ask for a the few relevant columns given the question.