Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand SQL Agent Prompt #455

Closed
bcg opened this issue Jan 30, 2024 · 2 comments
Closed

Expand SQL Agent Prompt #455

bcg opened this issue Jan 30, 2024 · 2 comments

Comments

@bcg
Copy link

bcg commented Jan 30, 2024

I have an Sqlite database that the current SQLQueryAgent struggles with. I've been able to help it get around some issues by injecting instructions when the query is passed, but I'd like a more sustainable way of doing this.

I was thinking we could add a section to the current prompt that could be filled out at initialization:

_type: prompt
template: |
  Given an input question, create a syntactically correct {dialect} query to run, then return the query in valid SQL.
  Never query for all the columns from a specific table, only ask for a the few relevant columns given the question.
  Pay attention to use only the column names that you can see in the schema description.
  Be careful to not query for columns that do not exist.
  Pay attention to which column is in which table.
  Also, qualify column names with the table name when needed.

  Only use the tables listed below.
  {schema}

  You should take into consideration the following:
  {hints}

 ...

And this would be populated using something like:

hints = "This is an Sqlite3 database. You do not need to cast the total_value column to decimal or float"
agent = Langchain::Agent::SQLQueryAgent.new(llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"]), db: database, hints: hints)

@andreibondarev I am happy to work on a PR that accomplishes this or something like it but wanted to get your thoughts on it first.

I think it also might be useful to add a data_dictionary prompt section for times when the column names don't make a lot of sense to the LLM.

Something similar could also be really useful for non-sql tools that have expansive data inputs and outputs.

@andreibondarev
Copy link
Collaborator

@bcg Thank you for your insights here! Have you tried Langchain::Assistant yet? My plan is to replace the Agents (including SQLQueryAgent) with it.

You could put the prompt into system instructions, something like the following (untested):

assistant = Langchain::Assistant.new(
  llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"]),
  thread: Langchain::Thread.new,
  instructions: prompt, # long prompt you wrote out above
  tools: [
    Langchain::Tool::Database.new(connection_string: "...")
  ]
)

assistant.add_message content: "This is an Sqlite3 database. You do not need to cast the total_value column to decimal or float..."
assistant.run

I will say that after this PR Tools can have many different methods and it would be nice to start expanding Tools from single-method (Tool#execute()) to multi-method tools.

One caveat is that Langchain::Assistant currently only works with OpenAI, and only LLMs that support Function Calling in the future.

@mattlindsey
Copy link
Contributor

@bcg I also like your insights. I think another option to Andrei's suggestion is to subclass Langchain::Assistant like I did here as a simple test: https://gist.github.com/mattlindsey/07d99c97ca7070f89c0bd7fad6c89bd7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants