# Multi-tool Agent with Text2Cypher

You will modify the agent to add a _Text to Cypher_ retriever tool.

The Text to Cypher tool lets the agent generate Cypher queries from natural language questions, so it can retrieve specific information such as facts and figures.

***

Load the environment variables, import the required Python modules, and set up the base tools.

In [None]:
import sys
sys.path.insert(0, '../new-workshops/solutions')

from typing import Annotated

from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import VectorCypherRetriever, Text2CypherRetriever
from neo4j_graphrag.schema import get_schema
from pydantic import Field
from azure.identity import DefaultAzureCredential

from agent_framework.azure import AzureAIClient
from azure.identity.aio import AzureCliCredential

from config import get_neo4j_driver, get_agent_config, get_embedder

In [13]:
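# Get configuration and connect to Neo4j
config = get_agent_config()
# Enter the context manager manually so the driver stays open across notebook cells;
# it is closed in the final cleanup cell.
driver = get_neo4j_driver().__enter__()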
embedder = get_embedder()

Define the retrieval query for vector search with graph context.

In [14]:
# Retrieval query for vector search with graph context
retrieval_query = """
MATCH (node)-[:FROM_DOCUMENT]-(doc:Document)-[:FILED]-(company:Company)
OPTIONAL MATCH (company)-[:FACES_RISK]->(risk:RiskFactor)
WITH node, score, company, collect(risk.name) as risks
RETURN 
    node.text as text,
    score,
    {
        company: company.name,
        risks: risks
    } AS metadata
ORDER BY score DESC
"""

# Create vector retriever
vector_retriever = VectorCypherRetriever(
    driver=driver,
    index_name="chunkEmbeddings",
    embedder=embedder,
    retrieval_query=retrieval_query,
)

The Text to Cypher tool uses a separate LLM to generate the Cypher. This lets you choose a model and settings suited to Cypher generation, which may differ from those that work best for the agent's conversation.

***

Create a `cypher_llm` using Microsoft Foundry.

In [None]:
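# Create a separate LLM for Cypher generation using Microsoft Foundry.
# The access token is short-lived; re-run this cell if requests start failing with authentication errors.
sync_credential = DefaultAzureCredential()
token = sync_credential.get_token("https://cognitiveservices.azure.com/.default")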

cypher_llm = OpenAILLM(
    model_name=config.model_name,
    base_url=config.inference_endpoint,
    api_key=token.token,
)

The Text to Cypher tool requires a prompt that instructs the LLM on how to generate the Cypher.

***

Create a `cypher_prompt` which accepts the graph `schema` and the user's `question`.

In [16]:
# Create a cypher generation prompt
cypher_prompt = """Task: Generate a Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.

Use `WHERE toLower(node.name) CONTAINS toLower('name')` to filter nodes by name.

Schema:
{schema}

Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.

The question is:
{query_text}"""

The prompt can include specific instructions on how to generate Cypher, for example, this instruction:

> Use `WHERE toLower(node.name) CONTAINS toLower('name')` to filter nodes by name.

... tells the LLM to use case-insensitive substring matching when filtering nodes by name, for example when searching by company name.
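The matching behaviour this instruction asks for can be illustrated in plain Python. This is only an analogue of the Cypher expression, not code the retriever runs, and `name_matches` is a hypothetical helper introduced for illustration:

```python
def name_matches(node_name: str, search: str) -> bool:
    """Python analogue of: toLower(node.name) CONTAINS toLower(search)."""
    # Lower-case both sides, then test whether the search text is a substring.
    return search.lower() in node_name.lower()


# Company names taken from the examples in this notebook.
names = ["APPLE INC", "MICROSOFT CORP", "PG&E Corp"]

print([n for n in names if name_matches(n, "apple")])
# ['APPLE INC']
```

A question that mentions "Apple" can therefore still match a node stored as `APPLE INC`, which an exact equality test would miss.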

***

Create the Text2Cypher retriever using the schema and custom prompt.

In [17]:
# Create the Text2Cypher retriever
text2cypher_retriever = Text2CypherRetriever(
    driver=driver,
    llm=cypher_llm,
    neo4j_schema=get_schema(driver),
    custom_prompt=cypher_prompt,
)
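To get a feel for what the Cypher LLM receives, you can fill the prompt placeholders by hand. The sketch below assumes the placeholders are substituted in a way equivalent to Python's `str.format`; the shortened template, the schema string, and the question are illustrative stand-ins rather than values from the workshop database:

```python
# A shortened stand-in for cypher_prompt, using the same two placeholders.
prompt_template = """Task: Generate a Cypher statement to query a graph database.

Schema:
{schema}

The question is:
{query_text}"""

# Hypothetical schema text; in the notebook the real value comes from get_schema(driver).
example_schema = "(:Company {name: STRING})-[:FACES_RISK]->(:RiskFactor {name: STRING})"
example_question = "How many risk factors does APPLE INC face?"

# Substitute the placeholders to produce the text sent to the LLM.
rendered = prompt_template.format(schema=example_schema, query_text=example_question)
print(rendered)
```

Printing the rendered prompt is a quick way to check that the schema is present and that your instructions read sensibly before handing generation over to the model.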

**Important Note on Text2Cypher:**

You are trusting the generation of Cypher to the LLM. It may generate queries that are invalid, that modify or corrupt data in the graph, or that expose sensitive information.

In a production environment, you should ensure that access to data is limited (for example, by connecting with a read-only database user), and that sufficient security is in place to prevent malicious queries.
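As one illustration of an additional safeguard, generated Cypher could be screened for write clauses before it is executed. The following is a minimal sketch, not part of `neo4j_graphrag`: `looks_read_only` is a hypothetical helper, the keyword list is not exhaustive (procedures invoked with `CALL` can also write, for instance), and the check is deliberately conservative, so a query that merely contains a word such as `set` inside a string literal is rejected too.

```python
import re

# Clauses that modify data or schema. Illustrative only, not an exhaustive list.
_WRITE_PATTERN = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|FOREACH|LOAD\s+CSV)\b",
    re.IGNORECASE,
)


def looks_read_only(cypher: str) -> bool:
    """Return True if no write-style keyword appears anywhere in the query text."""
    return _WRITE_PATTERN.search(cypher) is None


print(looks_read_only("MATCH (c:Company) RETURN c.name LIMIT 5"))
print(looks_read_only("MATCH (n) DETACH DELETE n"))
```

A text filter like this complements database-level permissions rather than replacing them; restricting the privileges of the database user remains the more dependable control.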

***

Create the tools for the agent. The multi-tool agent will have three tools:

1. `get_graph_schema` - Get the database schema
2. `retrieve_financial_documents` - Search documents semantically
3. `query_database` - Query specific facts from the database

In [18]:
# Define the schema tool
def get_graph_schema() -> str:
    """Get the schema of the graph database including node labels, relationships, and properties."""
    return get_schema(driver)

# Define a tool to retrieve financial documents
def retrieve_financial_documents(
    query: Annotated[str, Field(description="The search query to find relevant documents")]
) -> str:
    """Find details about companies in their financial documents using semantic search."""
    results = vector_retriever.search(query_text=query, top_k=3)
    if not results.items:
        return "No documents found matching the query."
    return "\n\n".join(item.content for item in results.items)

# Define a tool to query the database with natural language
def query_database(
    query: Annotated[str, Field(description="A natural language question about companies, risks, or financial metrics")]
) -> str:
    """Get answers to specific questions about companies, risks, and financial metrics by querying the database directly."""
    results = text2cypher_retriever.search(query_text=query)
    if not results.items:
        return "No results found for the query."
    return "\n\n".join(item.content for item in results.items)

Create the agent `tools` list and set up the `AzureAIClient`.

In [19]:
# Add the tools to a list
tools = [get_graph_schema, retrieve_financial_documents, query_database]

# Create credential and client
credential = AzureCliCredential()

client = AzureAIClient(
    project_endpoint=config.project_endpoint,
    model_deployment_name=config.model_name,
    async_credential=credential,
)

Create a query, run the agent, and stream the results.

In [26]:
# Run the agent
query = "Which company faces the most risk factors? What are 10 of those risk factors?"

async def run_agent():
    async with client.create_agent(
        name="workshop-multi-tool-agent",
        instructions=(
            "You are a helpful assistant that can answer questions about "
            "a graph database containing financial documents. You have three tools:\n"
            "1. get_graph_schema - Get the database schema\n"
            "2. retrieve_financial_documents - Search documents semantically\n"
            "3. query_database - Query specific facts from the database\n\n"
            "Choose the appropriate tool based on the question type. "
            "When a tool returns data, use that data to answer the question directly."
        ),
        tools=tools,
    ) as agent:
        print(f"User: {query}\n")
        print("Assistant: ", end="", flush=True)
        
        async for update in agent.run_stream(query):
            if update.text:
                print(update.text, end="", flush=True)
        
        print("\n")

await run_agent()

User: Which company faces the most risk factors? What are 10 of those risk factors?

Assistant: PG&E Corp is the company that faces the most risk factors, with a total of 202 identified risks. Ten of these risk factors include:

1. Climate Change
2. Interest Rates
3. Wildfires
4. Forward-Looking Statements
5. Triple Bottom Line
6. Climate-Driven Changes in Precipitation and Sea-Level Rise
7. Wildfire Mitigation
8. Transportation Electrification
9. Energy Storage
10. Environmental Regulations



Depending on the question you ask, the agent will use different tools to respond.

***

Modify the question and observe how the agent changes tools, or even runs multiple tools, to gather the context it requires to answer the question.

Try these examples that work well with the database:

**Text2Cypher queries (specific facts):**
* Which company faces the most risk factors?
* What companies are in the database?
* How many risk factors does APPLE INC face?
* What products does NVIDIA mention?
* What stock has MICROSOFT CORP issued?

**Semantic search queries (document content):**
* What are the main risk factors mentioned in the documents?
* What products does Microsoft mention in its financial documents?
* Summarize Apple's business strategy

**Schema queries:**
* How does the graph model relate to financial documents and risk factors?

***

[View the complete code](../new-workshops/solutions/03_03_text2cypher_agent.py)

In [21]:
# Try a different query
query = "What products does NVIDIA mention in its documents?"

async def run_experiment():
    # Create a fresh client for this experiment
    credential = AzureCliCredential()
    client = AzureAIClient(
        project_endpoint=config.project_endpoint,
        model_deployment_name=config.model_name,
        async_credential=credential,
    )
    
    async with client.create_agent(
        name="workshop-multi-tool-agent",
        instructions=(
            "You are a helpful assistant that can answer questions about "
            "a graph database containing financial documents. You have three tools:\n"
            "1. get_graph_schema - Get the database schema\n"
            "2. retrieve_financial_documents - Search documents semantically\n"
            "3. query_database - Query specific facts from the database\n\n"
            "Choose the appropriate tool based on the question type. "
            "When a tool returns data, use that data to answer the question directly."
        ),
        tools=tools,
    ) as agent:
        print(f"User: {query}\n")
        print("Assistant: ", end="", flush=True)
        
        async for update in agent.run_stream(query):
            if update.text:
                print(update.text, end="", flush=True)
        
        print("\n")
    
    await credential.close()

await run_experiment()

User: What products does NVIDIA mention in its documents?

Assistant: NVIDIA mentions several products across various markets in its documents:

1. **Gaming GPUs**: 
   - GeForce RTX series, including the RTX 40 series based on the Ada Lovelace architecture.
   - GeForce gaming and PC GPUs.
   - GeForce NOW game streaming service.

2. **Professional Visualization**:
   - Quadro/NVIDIA RTX GPUs for enterprise workstation graphics.
   - NVIDIA Omniverse for building and operating metaverse and 3D internet applications.

3. **Data Center Solutions**:
   - Hopper-based GPUs, including the H100.
   - DGX AI supercomputer.
   - HGX for hyperscale and supercomputing data centers.
   - EGX for enterprise and edge computing.
   - IGX for high-precision edge AI.
   - AGX for autonomous machines.

4. **Automotive Platforms**:
   - DRIVE Hyperion for autonomous driving solutions.
   - DRIVE Sim for data center-based simulation solutions.
   - NVIDIA DRIVE AGX computing hardware.

5. **Other Platfo

In [22]:
# Cleanup: close the Neo4j driver and both Azure credentials
driver.close()
await credential.close()
sync_credential.close()