# Vector + Graph Retriever Agent

You will modify the agent to include an additional tool that:

1. Searches the documents using the vector index
2. Traverses the graph around the document to find other facts
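
The two steps above can be sketched without a database. This is only a toy illustration: the chunks, the three-dimensional "embeddings", the company names, and the risk lists are all made up, and the `retrieve` helper is hypothetical. It is not how Neo4j implements the search.

```python
import math

# Toy chunks with hand-written 3-dimensional "embeddings" (made-up data).
CHUNKS = [
    {"text": "Supply disruption may reduce revenue.", "embedding": [0.9, 0.1, 0.0], "company": "Acme"},
    {"text": "Cloud revenue grew during the year.", "embedding": [0.1, 0.9, 0.0], "company": "Globex"},
]

# Toy graph: company -> names of connected risk factors (made-up data).
RISKS = {"Acme": ["Supply chain", "Competition"], "Globex": ["Cybersecurity"]}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def retrieve(query_embedding, k=1):
    """Step 1: rank chunks by similarity. Step 2: look up connected facts."""
    scored = [(cosine(query_embedding, c["embedding"]), c) for c in CHUNKS]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {
            "text": chunk["text"],
            "score": score,
            "metadata": {"company": chunk["company"], "risks": RISKS[chunk["company"]]},
        }
        for score, chunk in scored[:k]
    ]


result = retrieve([1.0, 0.0, 0.0], k=1)
print(result[0]["metadata"])
```

The rest of this notebook does the same thing with a real embedding model, a Neo4j vector index, and a Cypher query for the traversal.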

***

Load the environment variables, create the `model`, connect to the Neo4j `graph` database, and create the `get_schema` tool.

In [None]:
import os
from dotenv import load_dotenv
load_dotenv()

from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool
from langchain_neo4j import Neo4jGraph, Neo4jVector
from langchain_openai import OpenAIEmbeddings

# Initialize the LLM
model = init_chat_model("gpt-4o", model_provider="openai")

# Connect to Neo4j
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"), 
    password=os.getenv("NEO4J_PASSWORD"),
)

# Define functions for each tool in the agent

@tool("Get-graph-database-schema")
def get_schema():
    """Get the schema of the graph database."""
    context = graph.schema
    return context

To use the vector index, you will need to create an embedding model to convert the user's queries into embeddings.

In [None]:
# Create the embedding model
embedding_model = OpenAIEmbeddings(model="text-embedding-ada-002")

To retrieve data from the graph after documents have been found, you can define a `retrieval_query`.

In [None]:
retrieval_query = """
MATCH (node)-[:FROM_DOCUMENT]-(doc:Document)-[:FILED]-(company:Company)
RETURN 
    node.text as text,
    score,
    {
        company: company.name,
        risks: [ (company:Company)-[:FACES_RISK]->(risk:RiskFactor) | risk.name ]
    } AS metadata
ORDER BY score DESC
"""

> This query retrieves the `Company` the `Document` relates to and any associated `RiskFactor` nodes.
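
The `node` and `score` variables in the query are not defined by the query itself. To the best of my understanding, `Neo4jVector` runs the vector index search first and appends the `retrieval_query` to it, and it expects the result to contain `text`, `score`, and `metadata` columns. The sketch below approximates that composition. The prefix string is an assumption (the exact Cypher differs between library versions), and `compose_query` and `missing_columns` are hypothetical helpers, not part of `langchain_neo4j`.

```python
import re

# Approximation of the vector search that runs before the retrieval query.
# Assumption: the real prefix generated by the library may differ by version.
VECTOR_SEARCH_PREFIX = (
    "CALL db.index.vector.queryNodes($index, $k, $embedding) "
    "YIELD node, score "
)

# Columns the retrieval query is expected to return.
REQUIRED_COLUMNS = ("text", "score", "metadata")


def compose_query(retrieval_query: str) -> str:
    """Join the vector search prefix and the retrieval query."""
    return VECTOR_SEARCH_PREFIX + retrieval_query.strip()


def missing_columns(retrieval_query: str) -> list:
    """Crude check: list required columns not mentioned after RETURN."""
    parts = re.split(r"\bRETURN\b", retrieval_query, maxsplit=1, flags=re.IGNORECASE)
    returned = parts[1] if len(parts) == 2 else ""
    return [col for col in REQUIRED_COLUMNS if not re.search(rf"\b{col}\b", returned)]


example_query = """
MATCH (node)-[:FROM_DOCUMENT]-(doc:Document)
RETURN node.text AS text, score, {source: doc.path} AS metadata
"""

print(compose_query(example_query))
print(missing_columns(example_query))
```

The check is text-based only, so it can be fooled (for example, `node.text` without an alias still matches `text`). Treat it as a quick sanity test before running the query against the database.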

***

Create a vector store from the existing `chunkEmbeddings` index and include the `retrieval_query`.

In [None]:
# Create a vector store from the existing vector index
chunk_vector = Neo4jVector.from_existing_index(
    embedding_model,
    graph=graph,
    index_name="chunkEmbeddings",
    embedding_node_property="embedding",
    text_node_property="text",
    retrieval_query=retrieval_query,
)

Create a new tool, `Retrieve-financial-documents`, that searches the `chunk_vector`.

In [None]:
# Define a tool to retrieve financial documents
@tool("Retrieve-financial-documents")
def retrieve_docs(query: str):
    """Find details about companies in their financial documents."""
    # Use the vector store to find relevant documents
    context = chunk_vector.similarity_search(
        query, 
        k=3,
    )
    return context

> The agent will use the tool's name and docstring to determine if it is needed.
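
The tool above returns the `Document` objects from `similarity_search` as they are. If you would rather hand the model a compact string, one option is a small formatting helper like the sketch below. `format_docs` is a hypothetical helper, and the sample uses `SimpleNamespace` stand-ins with made-up values so it runs without a database. It assumes each result has `page_content` and a `metadata` dictionary with the `company` and `risks` keys from the `retrieval_query`.

```python
from types import SimpleNamespace


def format_docs(docs):
    """Render retrieved documents as plain text, one block per document."""
    blocks = []
    for doc in docs:
        meta = doc.metadata or {}
        company = meta.get("company", "unknown")
        risks = ", ".join(meta.get("risks", [])) or "none listed"
        blocks.append(
            f"Company: {company}\nRisks: {risks}\nText: {doc.page_content}"
        )
    return "\n\n".join(blocks)


# Stand-in for a retrieved document (made-up values).
sample = [
    SimpleNamespace(
        page_content="Example chunk text.",
        metadata={"company": "Example Corp", "risks": ["Example risk"]},
    )
]

print(format_docs(sample))
```

Inside the tool, you could then return `format_docs(context)` instead of `context`.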

***

Define the list of `tools` and create the `agent`.

In [None]:
# Add the tools to the agent
tools = [get_schema, retrieve_docs]

agent = create_react_agent(
    model, 
    tools
)

> The agent has access to the `get_schema` and `retrieve_docs` tools. The agent will pick between them when processing the user's query.

***

Create a query, run the agent, and stream the results.

In [None]:
query = "Summarise the risk factors mentioned in Apple's financial documents."

for step in agent.stream(
    {
        "messages": [{"role": "user", "content": query}]
    },
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Experiment with the agent by asking different questions about the documents and the graph schema, for example:

* Summarize the schema of the graph database.
* What are the main risk factors mentioned in the documents?
* Tell me about cybersecurity threats in financial services.
* What products does Microsoft mention in its financial documents?
* How are companies connected through their mentioned products?
* What type of questions can I ask about Apple using the graph database?

> The agent will pick different tools depending on the task.

***

Try modifying the `retrieval_query` to pull back additional data about the `Company` such as:

* Asset managers - `(company:Company)<-[:OWNS]-(manager:AssetManager)`
* Financial metrics - `(company:Company)-[:HAS_METRIC]->(metric:FinancialMetric)`
* Products - `(company:Company)-[:MENTIONS]->(product:Product)`

Including additional context will help the agent to create more specific responses.
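
As a starting point, the query could be extended along these lines. This is a sketch, not a verified solution: the relationship patterns come from the list above, but the property names (`manager.managerName`, `metric.name`, `product.name`) are assumptions. Check them against the schema, for example by asking the agent to summarize it, before relying on the results.

```python
# Extended retrieval query (sketch).
# Assumption: the property names used for managers, metrics, and products
# may differ in your graph. Verify them with the schema first.
extended_retrieval_query = """
MATCH (node)-[:FROM_DOCUMENT]-(doc:Document)-[:FILED]-(company:Company)
RETURN
    node.text AS text,
    score,
    {
        company: company.name,
        risks: [ (company:Company)-[:FACES_RISK]->(risk:RiskFactor) | risk.name ],
        managers: [ (company:Company)<-[:OWNS]-(manager:AssetManager) | manager.managerName ],
        metrics: [ (company:Company)-[:HAS_METRIC]->(metric:FinancialMetric) | metric.name ],
        products: [ (company:Company)-[:MENTIONS]->(product:Product) | product.name ]
    } AS metadata
ORDER BY score DESC
"""
```

To try it, pass `extended_retrieval_query` as the `retrieval_query` argument when creating `chunk_vector`, then rerun the agent. Returning many lists per chunk increases the amount of text sent to the model, so you may want to add only the fields your questions need.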

***

[View the complete code](solutions/02_02_vector_graph_agent.py)