# Multiple Agents over RAG + Knowledge Graph (Neo4j)


This notebook demonstrates a transparent multi-agent workflow:
- Agents are built with `langchain.agents.create_agent`
- Tools are defined with `@tool`
- Retrievers are called with `.invoke()`

### Architecture
Planner → RAG Researcher → KG Analyst → Validator/Editor

Each role is implemented as a separate **agent** with a role-specific `system_prompt`.


## 1) Install dependencies

In [16]:
!pip -q install -U langchain langchain-openai langchain-neo4j neo4j langchain-community faiss-cpu tiktoken

## 2) Configure credentials

In [17]:
import os, getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key: ")

NEO4J_URI = "neo4j+s://ead18442.databases.neo4j.io"
NEO4J_USER = "neo4j"
NEO4J_PASSWORD = "fz1Hzz1Z-Gzh8zYDwlfujRKHhTM9zq4eC5QAYp1YGWY"


## 3) Build shared tools
We create two tools that all agents can use:
1) `rag_retrieve` — retrieves evidence from a local vector store (FAISS)
2) `kg_query` — queries Neo4j via NL→Cypher

Students should notice:
- RAG is best for *fuzzy descriptive* questions
- KG is best for *relationships and paths*


In [18]:
from langchain.tools import tool
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS

# Model used by all agents
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# --- RAG corpus (small and inspectable) ---
rag_docs = [
    Document(page_content="Alice is a PhD student researching neuro-symbolic AI, knowledge graphs, and reasoning systems.",
             metadata={"entity":"Alice","type":"Person"}),
    Document(page_content="Bob is a Professor working on ontology engineering and graph-based AI. He supervises students.",
             metadata={"entity":"Bob","type":"Person"}),
    Document(page_content="EnergyAIGroup focuses on AI for sustainability, energy forecasting, optimisation, and climate modelling.",
             metadata={"entity":"EnergyAIGroup","type":"Group"}),
]

# Embed + chunk + index
emb = OpenAIEmbeddings(model="text-embedding-3-small")
chunks = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=30).split_documents(rag_docs)
vs = FAISS.from_documents(chunks, emb)

# Retriever: v1-style call is retriever.invoke(query)
retriever = vs.as_retriever(search_kwargs={"k": 3})

@tool
def rag_retrieve(query: str) -> str:
    """Retrieve relevant University text snippets (RAG evidence)."""
    hits = retriever.invoke(query)
    return "\n\n".join([f"[{d.metadata}] {d.page_content}" for d in hits])


In [26]:
from langchain_neo4j import Neo4jGraph, GraphCypherQAChain

graph = Neo4jGraph(url=NEO4J_URI, username=NEO4J_USER, password=NEO4J_PASSWORD)
graph.refresh_schema()

kg_chain = GraphCypherQAChain.from_llm(
    model,
    graph=graph,
    verbose=False,
    validate_cypher=True,
    allow_dangerous_requests=True,
    use_function_response=True
)

@tool
def kg_query(question: str) -> str:
    """Answer relationship/type/path questions using Neo4j (NL→Cypher)."""
    return kg_chain.invoke({"query": question})["result"]

TOOLS = [rag_retrieve, kg_query]
TOOLS


[StructuredTool(name='rag_retrieve', description='Retrieve relevant University text snippets (RAG evidence).', args_schema=<class 'langchain_core.utils.pydantic.rag_retrieve'>, func=<function rag_retrieve at 0x7c68ce6d4040>),
 StructuredTool(name='kg_query', description='Answer relationship/type/path questions using Neo4j (NL→Cypher).', args_schema=<class 'langchain_core.utils.pydantic.kg_query'>, func=<function kg_query at 0x7c683418ef20>)]

## 4) Helper: extract tool calls from the agent run (tool-selection tracing)
LangChain v1 agents return a **state** containing a sequence of messages.
We inspect messages to find tool calls and tool outputs so students can see *why* the agent chose RAG vs KG.


In [27]:
def _msg_to_dict(m):
    # Handle either dict messages or LangChain message objects
    if isinstance(m, dict):
        return m
    d = {"type": type(m).__name__}
    for attr in ["content", "tool_calls", "name"]:
        if hasattr(m, attr):
            d[attr] = getattr(m, attr)
    return d

def trace_tool_calls(state):
    messages = state.get("messages", state)  # sometimes state is a dict with messages
    if isinstance(messages, dict) and "messages" in messages:
        messages = messages["messages"]
    if not isinstance(messages, list):
        print("No messages found to trace.")
        return

    print("\nTOOL TRACE (in order):")
    for i, m in enumerate(messages, 1):
        md = _msg_to_dict(m)
        tool_calls = md.get("tool_calls") or md.get("tool_calls", [])
        if tool_calls:
            print(f"Message {i}: tool call(s) →")
            for tc in tool_calls:
                # tc can be dict-like
                name = tc.get("name") if isinstance(tc, dict) else getattr(tc, "name", None)
                args = tc.get("args") if isinstance(tc, dict) else getattr(tc, "args", None)
                print("  -", name, "args:", args)

        # Tool results often show up as ToolMessage objects with content
        if md.get("type") == "ToolMessage" or md.get("name") in ("rag_retrieve", "kg_query"):
            c = md.get("content", "")
            if c:
                print(f"Message {i}: tool output snippet → {str(c)[:220].replace('\n',' ')}...")

def final_text(state):
    messages = state.get("messages", [])
    if not messages:
        return ""
    last = messages[-1]
    if isinstance(last, dict):
        return last.get("content", "")
    return getattr(last, "content", str(last))


## 5) Create role-specific agents using `create_agent`
Each agent gets the same tools but a different system prompt.
LangChain v1 agents are invoked with:
`agent.invoke({"messages": [{"role": "user", "content": "..."}]})`


In [28]:
from langchain.agents import create_agent

planner_agent = create_agent(
    model,
    tools=TOOLS,
    system_prompt=(
        """ROLE: Planner. Break the user goal into numbered sub-questions.
           For each sub-question, state which tool to use:
            - Use kg_query for relationships/paths/types (supervises, member of, affiliated with, types, paths).
            - Use rag_retrieve for fuzzy descriptions (research interests, focus areas).
           Do NOT answer the questions. Output only the plan."""
    )
)

rag_agent = create_agent(
    model,
    tools=TOOLS,
    system_prompt=(
        """ROLE: RAG Researcher. Use rag_retrieve to gather evidence snippets for descriptive questions.
        Return bullet points of evidence with minimal paraphrase. Do not invent facts."""
    )
)

kg_agent = create_agent(
    model,
    tools=TOOLS,
    system_prompt=(
        """ROLE: KG Analyst. Use kg_query for relationship/type/path questions.
        Return concise answers. If unclear, retry the tool with a clearer question."""
    )
)

editor_agent = create_agent(
    model,
    tools=TOOLS,
    system_prompt=(
        """ROLE: Validator/Editor. Combine evidence from the other agents into a final short report.
        Every claim must be supported by evidence from tools. Note missing/conflicting evidence explicitly."""
    )
)


## 6) Orchestrate the multi-agent workflow
We run each agent in sequence and pass the outputs forward.


In [29]:
user_goal = (
    """Write a short university profile report answering:
    1) Who supervises Alice?
    2) Which research group is Alice affiliated with?
    3) What does EnergyAIGroup focus on?
    4) Who works on neuro-symbolic reasoning?
    Use evidence from tools."""
)

# 1) Planner
plan_state = planner_agent.invoke({"messages": [{"role": "user", "content": user_goal}]})
print("\n--- PLAN ---\n", final_text(plan_state))
trace_tool_calls(plan_state)



--- PLAN ---
 1) Who supervises Alice?  
   - Tool: kg_query

2) Which research group is Alice affiliated with?  
   - Tool: kg_query

3) What does EnergyAIGroup focus on?  
   - Tool: rag_retrieve

4) Who works on neuro-symbolic reasoning?  
   - Tool: kg_query

TOOL TRACE (in order):


In [30]:
# 2) RAG Researcher (descriptive questions)
rag_request = (
    """Gather evidence for:
      - What does EnergyAIGroup focus on?
      - Who works on neuro-symbolic reasoning?
    Use rag_retrieve and return evidence bullet points."""
)
rag_state = rag_agent.invoke({"messages": [{"role": "user", "content": rag_request}]})
print("\n--- RAG EVIDENCE ---\n", final_text(rag_state))
trace_tool_calls(rag_state)



--- RAG EVIDENCE ---
 ### Evidence for EnergyAIGroup Focus:
- EnergyAIGroup focuses on AI for sustainability, energy forecasting, optimisation, and climate modelling.

### Evidence for Neuro-Symbolic Reasoning:
- Alice is a PhD student researching neuro-symbolic AI, knowledge graphs, and reasoning systems.
- Bob is a Professor working on ontology engineering and graph-based AI, and he supervises students.

TOOL TRACE (in order):
Message 2: tool call(s) →
  - rag_retrieve args: {'query': 'What does EnergyAIGroup focus on?'}
  - rag_retrieve args: {'query': 'Who works on neuro-symbolic reasoning?'}
Message 3: tool output snippet → [{'entity': 'EnergyAIGroup', 'type': 'Group'}] EnergyAIGroup focuses on AI for sustainability, energy forecasting, optimisation, and climate modelling.  [{'entity': 'Alice', 'type': 'Person'}] Alice is a PhD student rese...
Message 4: tool output snippet → [{'entity': 'Alice', 'type': 'Person'}] Alice is a PhD student researching neuro-symbolic AI, knowledge g

In [31]:
# 3) KG Analyst (relationship questions)
kg_request = (
    """Answer using KG evidence:
      - Who supervises Alice?
      - Which research group is Alice affiliated with?
      - Which classes is Bob an instance of?
      Use kg_query and return concise answers."""
)
kg_state = kg_agent.invoke({"messages": [{"role": "user", "content": kg_request}]})
print("\n--- KG EVIDENCE ---\n", final_text(kg_state))
trace_tool_calls(kg_state)



--- KG EVIDENCE ---
 - Alice is supervised by no one (no information available).
- Alice is affiliated with the EnergyAIGroup, focusing on AI applications for sustainability, energy systems, and climate modeling.
- Bob is an instance of the classes: Academic, Professor, and Person.

TOOL TRACE (in order):
Message 2: tool call(s) →
  - kg_query args: {'question': 'Who supervises Alice?'}
  - kg_query args: {'question': 'Which research group is Alice affiliated with?'}
  - kg_query args: {'question': 'Which classes is Bob an instance of?'}
Message 3: tool output snippet → There is no information available regarding who supervises Alice....
Message 4: tool output snippet → Alice is affiliated with the EnergyAIGroup, which focuses on AI applications for sustainability, energy systems, and climate modeling....
Message 5: tool output snippet → Bob is an instance of the following classes: Academic, Professor, and Person....


In [32]:
# 4) Validator/Editor combines everything
final_request = (
    """Combine into final report.

"
    f"PLAN:\n{final_text(plan_state)}\n\n"
    f"RAG EVIDENCE:\n{final_text(rag_state)}\n\n"
    f"KG EVIDENCE:\n{final_text(kg_state)}\n\n"
    "Output format:
    - Q1 answer + evidence
    - Q2 answer + evidence
    - Q3 answer + evidence
    - Q4 answer + evidence
    - Notes/uncertainty (if any)
"""
)
final_state = editor_agent.invoke({"messages": [{"role": "user", "content": final_request}]})
print("\n--- FINAL REPORT ---\n", final_text(final_state))
trace_tool_calls(final_state)



--- FINAL REPORT ---
 ### Final Report

**Q1 Answer:** The EnergyAIGroup is focused on AI applications for sustainability, energy forecasting, optimization, and climate modeling.  
**Evidence:** RAG evidence indicates that "EnergyAIGroup focuses on AI for sustainability, energy forecasting, optimisation, and climate modelling."

---

**Q2 Answer:** Bob is a Professor specializing in ontology engineering and graph-based AI, and he supervises students.  
**Evidence:** Both RAG and KG evidence confirm that "Bob is a Professor working on ontology engineering and graph-based AI. He supervises students."

---

**Q3 Answer:** Alice is a PhD student researching neuro-symbolic AI, knowledge graphs, and reasoning systems.  
**Evidence:** RAG and KG evidence state that "Alice is a PhD student researching neuro-symbolic AI, knowledge graphs, and reasoning systems."

---

**Q4 Answer:** The knowledge graph reflects the current state of individuals involved, specifically Alice and Bob, with their r

## 7) Reflection
1) Which tool was used where (RAG vs KG), and why?
2) What could still fail (missing KG data, weak retrieval, tool misuse)?
3) What extra validation would you add in production (e.g., rerun kg_query, add read-only DB user, add a judge agent)?
