# Building a database querying system with db-ally

In this guide, we will build an agentic database querying system based on the [db-ally package](https://db-ally.deepsense.ai/). Along the way, we will learn about a [Supervisor Agent](https://github.com/langchain-ai/langgraph/blob/main/examples/multi_agent/agent_supervisor.ipynb), dynamic State modification, and the [human-in-the-loop component](https://github.com/langchain-ai/langgraph/blob/main/examples/human-in-the-loop.ipynb).

Our goal is to build a system that looks like this:
<img src="https://drive.google.com/uc?export=view&id=1McIouHQR9ITYQOmExCtzSeL8Iz2Nyhw1" alt="Agents diagram" width="700" height="auto">

Agents have the following duties:

**Configuration Agent**: based on the user query, it changes the configuration used by the other agents.

**db-ally Agent**: given a question in natural language, it queries the database and responds accordingly.

**human-in-the-loop**: sometimes we lack the information needed to formulate a query; in such cases, we simply ask the user to provide more.

## Setup

First, we need to install all dependencies.

In [None]:
!pip install -U dbally[litellm,langsmith] langgraph langchain langchain_openai langchain_experimental

We will use the OpenAI API, so we have to set the API key.

In [1]:
import os

os.environ["OPENAI_API_KEY"] = ""

Finally, we need a database to query, so let's download a [dummy HR Recruiting database](https://drive.google.com/file/d/1A5yPt-pIyXGV94c6cMIkMf8AhiP6Nnq6/view?usp=drive_link).

In [2]:
!wget -O recruitment.db 'https://drive.google.com/uc?export=download&id=1zo3j8x7qH8opTKyQ9qFgRpS3yqU6uTRs'

--2024-04-18 16:02:02--  https://drive.google.com/uc?export=download&id=1zo3j8x7qH8opTKyQ9qFgRpS3yqU6uTRs
Resolving drive.google.com (drive.google.com)... 142.250.203.206, 2a00:1450:401b:810::200e
Connecting to drive.google.com (drive.google.com)|142.250.203.206|:443... connected.
HTTP request sent, awaiting response... 303 See Other
Location: https://drive.usercontent.google.com/download?id=1zo3j8x7qH8opTKyQ9qFgRpS3yqU6uTRs&export=download [following]
--2024-04-18 16:02:02--  https://drive.usercontent.google.com/download?id=1zo3j8x7qH8opTKyQ9qFgRpS3yqU6uTRs&export=download
Resolving drive.usercontent.google.com (drive.usercontent.google.com)... 142.250.75.1, 2a00:1450:401b:800::2001
Connecting to drive.usercontent.google.com (drive.usercontent.google.com)|142.250.75.1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 24576 (24K) [application/octet-stream]
Saving to: ‘recruitment.db’


2024-04-18 16:02:03 (1,69 MB/s) - ‘recruitment.db’ saved [24576/24576]



Next, we reflect its schema into SQLAlchemy models.

In [3]:
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy import select

ENGINE = create_engine("sqlite:///recruitment.db")

RECRUITMENT_MODEL = automap_base()
# Passing `autoload_with` already triggers reflection; the deprecated `reflect=True` flag is not needed.
RECRUITMENT_MODEL.prepare(autoload_with=ENGINE)
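
Before reflecting, it can help to check which tables and columns the file contains, since the views later in this guide refer to the `offer` and `candidate` tables by name. The sketch below uses only the standard-library `sqlite3` module. The in-memory database and its column lists are hypothetical stand-ins (only `position` and `years_of_experience` are taken from this guide), and `describe_tables` is an illustrative helper, not part of db-ally or SQLAlchemy.

```python
import sqlite3


def describe_tables(connection: sqlite3.Connection) -> dict:
    """Return {table_name: [column_name, ...]} for every table in the database."""
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return {
        # PRAGMA table_info yields (cid, name, type, notnull, default, pk) per column.
        name: [row[1] for row in connection.execute(f'PRAGMA table_info("{name}")')]
        for (name,) in tables
    }


# Hypothetical stand-in schema; use sqlite3.connect("recruitment.db") for the real file.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE offer (id INTEGER PRIMARY KEY, position TEXT)")
connection.execute(
    "CREATE TABLE candidate (id INTEGER PRIMARY KEY, years_of_experience INTEGER)"
)

schema = describe_tables(connection)
print(schema)
```

Running the same helper against `recruitment.db` lists the real tables, which should include the two that the views rely on.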


The next step is to give db-ally access to the database via [views](https://db-ally.deepsense.ai/concepts/views/) and [filters](https://db-ally.deepsense.ai/concepts/iql/). Here, we allow db-ally to answer questions about offered job positions and the experience of candidates.

In [4]:
from dbally import SqlAlchemyBaseView, decorators
import sqlalchemy


class JobOfferView(SqlAlchemyBaseView):
    """
    View meant for answering questions about job offers.
    """

    def get_select(self) -> sqlalchemy.Select:
        return select(RECRUITMENT_MODEL.classes.offer)

    @decorators.view_filter()
    def filter_job_offers_by_position(self, position: str) -> sqlalchemy.ColumnElement:
        return RECRUITMENT_MODEL.classes.offer.position == position


class CandidateView(SqlAlchemyBaseView):
    """
    View meant for answering questions about candidates.
    """

    def get_select(self) -> sqlalchemy.Select:
        return select(RECRUITMENT_MODEL.classes.candidate)

    @decorators.view_filter()
    def filter_candidates_by_experience(self, years: int) -> sqlalchemy.ColumnElement:
        return RECRUITMENT_MODEL.classes.candidate.years_of_experience >= years

Next, we need to create a [collection](https://db-ally.deepsense.ai/concepts/collections/) and register the previously created views.

In [5]:
import dbally
from dbally.llms.litellm import LiteLLM

recruitment_db = dbally.create_collection("recruitment", llm=LiteLLM())
recruitment_db.add(JobOfferView, lambda: JobOfferView(ENGINE))
recruitment_db.add(CandidateView, lambda: CandidateView(ENGINE))

## Agents State

After creating the collection, we move on to defining the State of the entire system. In this case, it contains three components:
1. List of messages exchanged between agents
2. Configuration of db-ally where we can set [the format of the answer](https://db-ally.deepsense.ai/concepts/nl_responder/), the used collection, and whether to [log runs to langsmith](https://db-ally.deepsense.ai/how-to/log_runs_to_langsmith/).
3. Information about which agent should be called now.

In [6]:
import enum
import operator
from typing import TypedDict, Annotated, Sequence

from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage


COLLECTIONS = {"Recruitment": recruitment_db, "Benefits": recruitment_db}


AvailableCollections = enum.Enum("AvailableCollections", list(COLLECTIONS.keys()))


class DballyConfig(BaseModel):
    """Modifies the configuration used by the db-ally engine to generate a response"""

    use_nl_responses: bool = Field(default=None, description="Whether or not to use natural language response")
    used_collection: AvailableCollections = Field(
        default=AvailableCollections.Recruitment, description="Which collection should be used"
    )
    log_to_langsmith: bool = Field(default=None, description="Whether to log conversations to the langsmith system")


class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]
    dbally_config: DballyConfig
    next: str
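
The `Annotated[..., operator.add]` annotation on `messages` is what lets each agent return only its new messages: values for that key are combined with the existing ones, while keys without such an annotation are overwritten. The sketch below imitates this merging rule with the standard library only. `SketchState` and `apply_update` are hypothetical names, and this is a simplified illustration of the idea, not LangGraph's actual implementation.

```python
import operator
from typing import Annotated, Sequence, TypedDict, get_type_hints


class SketchState(TypedDict):
    messages: Annotated[Sequence[str], operator.add]  # combined with the reducer
    next: str                                          # simply overwritten


def apply_update(state: dict, update: dict, schema=SketchState) -> dict:
    """Merge a partial update into the state, honouring Annotated reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducers = getattr(hints[key], "__metadata__", ())
        if reducers and key in merged:
            merged[key] = reducers[0](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["Do we have an offer for Software Engineer?"], "next": ""}
state = apply_update(state, {"messages": ["Yes, we have 4 offers."]})
state = apply_update(state, {"next": "human"})
print(state)
```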

To start the interaction, we will need to provide an initial configuration and state.

Below you can see how the state and configuration may look at some point in time.

In [7]:
example_dbally_config = DballyConfig(
    use_nl_responses=False, used_collection=AvailableCollections.Recruitment, log_to_langsmith=False
)

example_state = AgentState(
    messages=[
        HumanMessage(content="Do we have an offer for Software Engineer?", name="User"),
        AIMessage(
            content="Yes, we have 4 offers for Software Engineers, do you want to learn more about them?",
            name="db-ally",
        ),
    ],
    dbally_config=example_dbally_config,
    next="human",
)

## db-ally agent

Now, we create the first agent. This one performs an asynchronous call to db-ally based on the last message and the current configuration.


In [8]:
from dbally.iql_generator.iql_prompt_template import UnsupportedQueryError


async def call_dbally(state: AgentState):
    message = state["messages"][-1].content
    dbally_config = state["dbally_config"]

    try:
        result = await recruitment_db.ask(message, return_natural_response=dbally_config.use_nl_responses)

        if result.textual_response is not None:
            return {"messages": [AIMessage(content=result.textual_response, name="db-ally")]}
        else:
            return {"messages": [AIMessage(content=str(result.results), name="db-ally")]}
    except UnsupportedQueryError:
        return {"messages": [AIMessage(content="database master can't answer this question", name="db-ally")]}

Here you can see how the agent will respond, given a particular state. Because we don't want natural language responses at this particular moment, you should see a list of database records, each formatted as a dictionary.

In [9]:
example_state = AgentState(
    messages=[HumanMessage(content="Do we have an offer for Software Engineer?", name="User")],
    dbally_config=example_dbally_config,
    next="",
)
await call_dbally(example_state)

{'messages': [AIMessage(content="[{'id': 5, 'company': 'HuggingFace', 'position': 'Software Engineer', 'excpected_years_of_experience': 3, 'salary': '$80000'}, {'id': 10, 'company': 'OpenAI', 'position': 'Software Engineer', 'excpected_years_of_experience': 3, 'salary': '$85000'}, {'id': 15, 'company': 'Google', 'position': 'Software Engineer', 'excpected_years_of_experience': 3, 'salary': '$90000'}, {'id': 20, 'company': 'Apple', 'position': 'Software Engineer', 'excpected_years_of_experience': 3, 'salary': '$100000'}, {'id': 25, 'company': 'Microsoft', 'position': 'Software Engineer', 'excpected_years_of_experience': 3, 'salary': '$110000'}, {'id': 30, 'company': 'Google', 'position': 'Software Engineer', 'excpected_years_of_experience': 3, 'salary': '$115000'}, {'id': 35, 'company': 'OpenAI', 'position': 'Software Engineer', 'excpected_years_of_experience': 3, 'salary': '$120000'}, {'id': 40, 'company': 'HuggingFace', 'position': 'Software Engineer', 'excpected_years_of_experience':
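
Note that `call_dbally` above always queries `recruitment_db`, regardless of `used_collection` in the configuration. One possible way to honour that setting is to look the collection up by the enum member's name. The sketch below shows the lookup with stand-ins: `FakeCollection`, `FakeResult`, and `ask_selected` are hypothetical and only mimic the `ask(...)` call and the `results` / `textual_response` attributes used in this guide.

```python
import asyncio
import enum
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in exposing the two attributes read by the agent."""
    results: list
    textual_response: Optional[str] = None


class FakeCollection:
    """Hypothetical stand-in for a db-ally collection; only mimics `ask`."""

    def __init__(self, name: str):
        self.name = name

    async def ask(self, question: str, return_natural_response: bool = False):
        text = f"[{self.name}] answer to: {question}" if return_natural_response else None
        return FakeResult(results=[{"collection": self.name}], textual_response=text)


COLLECTIONS = {"Recruitment": FakeCollection("Recruitment"), "Benefits": FakeCollection("Benefits")}
AvailableCollections = enum.Enum("AvailableCollections", list(COLLECTIONS))


async def ask_selected(question: str, used_collection, use_nl_responses: bool) -> str:
    # Pick the collection named by the enum member instead of a hard-coded one.
    collection = COLLECTIONS[used_collection.name]
    result = await collection.ask(question, return_natural_response=use_nl_responses)
    if result.textual_response is not None:
        return result.textual_response
    return str(result.results)


# In a notebook, use `await ask_selected(...)` instead of asyncio.run.
answer = asyncio.run(ask_selected("Any offers?", AvailableCollections.Benefits, True))
print(answer)
```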

## Configuration Agent

Now we move on to the configuration agent, which sets up the appropriate configuration based on the user query.

For this purpose, we use [LangChain tool calling](https://python.langchain.com/docs/modules/model_io/chat/function_calling/), which supports passing a schema defined with Pydantic. This means we can use our `DballyConfig` directly as the `tool`.

In [10]:
from copy import copy

from langchain_core.output_parsers.openai_tools import PydanticToolsParser
from langchain_openai import ChatOpenAI


config_llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
config_llm = config_llm.bind_tools([DballyConfig]) | PydanticToolsParser(tools=[DballyConfig])


def change_dbally_config(state: AgentState):
    message = state["messages"][-1].content
    config_modification = config_llm.invoke(message)[0]

    new_config = copy(state["dbally_config"])

    if config_modification.use_nl_responses is not None:
        new_config.use_nl_responses = config_modification.use_nl_responses

    if config_modification.used_collection is not None:
        new_config.used_collection = config_modification.used_collection

    if config_modification.log_to_langsmith is not None:
        new_config.log_to_langsmith = config_modification.log_to_langsmith

    return {
        "messages": [
            HumanMessage(content="Configuration adjusted. Ask human what to do now.", name="change_dbally_config")
        ],
        "dbally_config": new_config,
    }
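
The three `is not None` checks above follow one pattern: copy every field that the modification explicitly sets and leave the rest untouched. A generic version of that pattern can be sketched with standard-library dataclasses. `ConfigSketch` and `merge_config` are hypothetical names, and unlike `DballyConfig`, where `used_collection` defaults to `Recruitment` (so its check passes even when the tool call omits the field), every field here defaults to `None`.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class ConfigSketch:
    """Hypothetical stand-in for DballyConfig with None meaning 'not specified'."""
    use_nl_responses: Optional[bool] = None
    used_collection: Optional[str] = None
    log_to_langsmith: Optional[bool] = None


def merge_config(current: ConfigSketch, modification: ConfigSketch) -> ConfigSketch:
    """Return a copy of `current` with every explicitly set field overridden."""
    changes = {
        field.name: getattr(modification, field.name)
        for field in fields(modification)
        if getattr(modification, field.name) is not None
    }
    return replace(current, **changes)


current = ConfigSketch(use_nl_responses=False, used_collection="Recruitment", log_to_langsmith=False)
merged = merge_config(current, ConfigSketch(use_nl_responses=True))
print(merged)
```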

Under the hood, `DballyConfig` is converted into the [OpenAI function calling format](https://platform.openai.com/docs/guides/function-calling).

In [11]:
from langchain_core.utils.function_calling import convert_to_openai_function

convert_to_openai_function(DballyConfig)

{'name': 'DballyConfig',
 'description': 'Modifies the configuration used by the db-ally engine to generate a response',
 'parameters': {'type': 'object',
  'properties': {'use_nl_responses': {'description': 'Whether or not to use natural language response',
    'type': 'boolean'},
   'used_collection': {'description': 'Which collection should be used',
    'default': 1,
    'allOf': [{'title': 'AvailableCollections',
      'description': 'An enumeration.',
      'enum': [1, 2]}]},
   'log_to_langsmith': {'description': 'Whether to log conversations to the langsmith system',
    'type': 'boolean'}}}}
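
In the schema above, `used_collection` is exposed as `'enum': [1, 2]` because `enum.Enum("AvailableCollections", names)` assigns consecutive integers as member values, so the model sees numbers rather than collection names. One possible alternative, shown below as an assumption rather than a db-ally or LangChain recommendation, is to pass a name-to-value mapping so that the values are the names themselves. `IntValued` and `StrValued` are hypothetical names used only for this comparison.

```python
import enum

names = ["Recruitment", "Benefits"]

# Functional API with a list of names: values are auto-assigned integers.
IntValued = enum.Enum("IntValued", names)

# Functional API with a mapping: each member's value is its own name.
StrValued = enum.Enum("StrValued", {name: name for name in names})

int_values = [member.value for member in IntValued]
str_values = [member.value for member in StrValued]
print(int_values)
print(str_values)
```

With string values, a schema generated from the enum should list the collection names instead of integers, which may be easier for the model to choose from.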

Here you can see how this agent behaves when executed with the state below. Take a look at `dbally_config`: `use_nl_responses` should be set to `True`.

In [12]:
example_state = AgentState(
    messages=[HumanMessage(content="From now I want to use natural language responses.", name="User")],
    dbally_config=example_dbally_config,
    next="",
)
change_dbally_config(example_state)

{'messages': [HumanMessage(content='Configuration adjusted. Ask human what to do now.', name='change_dbally_config')],
 'dbally_config': DballyConfig(use_nl_responses=True, used_collection=<AvailableCollections.Recruitment: 1>, log_to_langsmith=False)}

## Human-in-the-loop Agent

Now, we create an agent that asks the user whether they need anything else or can provide additional information.


In [13]:
def human_in_the_loop(state: AgentState):
    response = input("What's next? ")
    return {"messages": [HumanMessage(content=response, name="User")]}
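
Because this node calls `input` directly, it blocks until someone types an answer, which makes automated runs awkward. A minimal sketch of one workaround is to inject the prompt function so that scripted answers can replace the keyboard. `make_human_node` is a hypothetical helper, and plain dictionaries stand in for `HumanMessage` to keep the example dependency-free.

```python
from typing import Callable


def make_human_node(ask: Callable[[str], str] = input):
    """Build a human-in-the-loop node that obtains its answer from `ask`."""

    def human_node(state: dict) -> dict:
        response = ask("What's next? ")
        # A plain dict stands in for HumanMessage(content=response, name="User").
        return {"messages": [{"name": "User", "content": response}]}

    return human_node


# Scripted answers replace interactive input in this sketch.
scripted = iter(["Do we have any data scientists?", "That's all I wanted"])
node = make_human_node(ask=lambda prompt: next(scripted))

first = node({})
second = node({})
print(first, second)
```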

## Supervisor Agent

The last building block of our system is a supervisor who decides which agent is the most appropriate given the current situation.

To create it, we:
1. Craft a system prompt that explains the supervisor's task
2. Utilize function calling. Look carefully at the docstring of `SelectedRole`: it should clearly explain which agent to use and when.
3. Build a prompt template. Observe that we provide the entire history of the conversation.
4. Create a chain
5. Write a function that executes the chain and updates the State with the next agent to operate.

In [14]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

members = ["Configuration Setter", "Database master", "human", "finish"]
system_prompt = (
    "You are a supervisor tasked with managing a conversation between the"
    " following workers:  {members}. Given the following user request,"
    " respond with the worker to act next. Each worker will perform a"
    " task and respond with their results and status."
)

AvailableAgents = enum.Enum("AvailableAgents", members)


class SelectedRole(BaseModel):
    """Select next agent to be used in the system. Use:
    1. Configuration Setter if human asks to change something in the answer, or you deduced it is necessary
    2. Database master: to gather information necessary to answer the question
    3. human: When you need an additional input to proceed
    4. finish: If human is satisfied with the answer"""

    next: AvailableAgents


supervisor_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder(variable_name="messages"),
    ]
).partial(members=", ".join(members))

supervisor_llm = ChatOpenAI(model="gpt-4-1106-preview")

supervisor_chain = (
    supervisor_prompt | supervisor_llm.bind_tools([SelectedRole]) | PydanticToolsParser(tools=[SelectedRole])
)


def call_supervisor(state: AgentState):
    response = supervisor_chain.invoke(state)[0]
    return {"next": response.next.name}

Let's verify that the agent behaves correctly.

In [15]:
example_state = AgentState(
    messages=[HumanMessage(content="From now I want to use natural language responses.", name="User")],
    dbally_config=example_dbally_config,
    next="",
)
call_supervisor(example_state)

{'next': 'Configuration Setter'}
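
The supervisor returns `response.next.name` rather than the member itself because two of the agent names contain spaces. Such members are created normally by the functional `Enum` API, but they can only be reached by item lookup, not by attribute access. The short standard-library sketch below demonstrates this with the same `members` list.

```python
import enum

members = ["Configuration Setter", "Database master", "human", "finish"]
AvailableAgents = enum.Enum("AvailableAgents", members)

# Names with spaces must be looked up with square brackets.
selected = AvailableAgents["Configuration Setter"]

# Names that are valid identifiers also work as attributes.
human = AvailableAgents.human

member_names = [member.name for member in AvailableAgents]
print(selected.name, selected.value, human.name)
```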

At this moment, we have all the building blocks. The last thing to do is to connect them into a graph. So, we:

1. Create a graph and pass it the schema of the State (`AgentState`).
2. Define all nodes, that is, the agents that will cooperate.
3. Connect them with edges. Look carefully at the conditional edge: it routes from the supervisor to whichever agent is named in `next`.
4. Set the supervisor as the entry point.

In [16]:
from langgraph.graph import END, StateGraph

graph = StateGraph(AgentState)

graph.add_node("supervisor", call_supervisor)
graph.add_node("dbally", call_dbally)
graph.add_node("change_config", change_dbally_config)
graph.add_node("human", human_in_the_loop)

conditional_map = {
    "Configuration Setter": "change_config",
    "Database master": "dbally",
    "human": "human",
    "finish": END,
}

graph.add_edge("change_config", "supervisor")
graph.add_edge("dbally", "supervisor")
graph.add_conditional_edges("supervisor", lambda x: x["next"], conditional_map)
graph.add_edge("human", "supervisor")

graph.set_entry_point("supervisor")

app = graph.compile()
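
To make the control flow explicit, the sketch below imitates the compiled graph with a plain loop: the supervisor runs, the conditional edge maps its `next` value to a node, the worker runs, and a plain edge leads back to the supervisor until `finish` is chosen. It is a simplified illustration using only built-ins, not how LangGraph executes graphs. `run_graph`, the stub nodes, and the scripted decisions are hypothetical, and state updates are merged by simple overwriting.

```python
END = "__end__"

conditional_map = {
    "Configuration Setter": "change_config",
    "Database master": "dbally",
    "human": "human",
    "finish": END,
}


def run_graph(nodes: dict, state: dict, max_steps: int = 20):
    """Alternate between the supervisor and the worker it selects until END."""
    visited = []
    current = "supervisor"
    for _ in range(max_steps):
        visited.append(current)
        state = {**state, **nodes[current](state)}
        if current != "supervisor":
            current = "supervisor"  # plain edge: every worker reports back
            continue
        current = conditional_map[state["next"]]  # conditional edge
        if current == END:
            break
    return visited, state


# Scripted decisions replace the LLM-backed supervisor in this sketch.
decisions = iter(["Configuration Setter", "human", "Database master", "human", "finish"])
nodes = {
    "supervisor": lambda state: {"next": next(decisions)},
    "change_config": lambda state: {"use_nl_responses": True},
    "dbally": lambda state: {"answer": "stub answer"},
    "human": lambda state: {"question": "stub question"},
}

visited, final_state = run_graph(nodes, {"next": ""})
print(visited)
```

The `max_steps` guard is an addition of this sketch; it simply prevents an endless loop if the supervisor never selects `finish`.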

Nice! The last thing we should do is test it.

We start the conversation by asking for a configuration change, so the supervisor should first select the `Configuration Setter`. After that, the user is asked for the next instruction. You can try a question such as **Do we have offers for Software Engineer?** (the run recorded below asks **Do we have any data scientists?** instead). This triggers the `Database master`, which answers in natural language. Once the user confirms that nothing more is needed, the supervisor selects `finish` and the graph reaches its final state.

In [17]:
example_state = AgentState(
    messages=[HumanMessage(content="From now I want to use natural language responses.", name="User")],
    dbally_config=example_dbally_config,
    next="",
)

async for event in app.astream(example_state):
    for v in event.values():
        print(v)

{'next': 'Configuration Setter'}
{'messages': [HumanMessage(content='Configuration adjusted. Ask human what to do now.', name='change_dbally_config')], 'dbally_config': DballyConfig(use_nl_responses=True, used_collection=<AvailableCollections.Recruitment: 1>, log_to_langsmith=False)}
{'next': 'human'}
{'messages': [HumanMessage(content='Do we have any data scientists?', name='User')]}
{'next': 'Database master'}
{'messages': [AIMessage(content='Yes, we have two data scientists in the database. One is Emily Chen from Canada with 3 years of experience, and the other is Anushka Sharma from India with 5 years of experience.', name='db-ally')]}
{'next': 'human'}
{'messages': [HumanMessage(content="That's all I wanted", name='User')]}
{'next': 'finish'}


Congratulations! Together, we built an agentic system capable of querying the database, changing the configuration dynamically, and incorporating human-in-the-loop. Good job!