# 📘 Agentic Architectures 12: Graph / World-Model Memory

Welcome to this detailed exploration of one of the most powerful memory structures for AI agents: the **Graph-based World Model**. This architecture moves beyond simple document retrieval or chat history to build a structured, interconnected understanding of the world, much like a human's semantic memory.

Instead of storing information as isolated chunks of text, a graph-based agent parses incoming data into **entities (nodes)** and **relationships (edges)**. This creates a rich, queryable knowledge graph. The agent can then answer complex questions by traversing this graph, uncovering insights that would be hidden in unstructured text.

To showcase this in detail, we will build a **Corporate Intelligence Agent**. This agent will:
1.  **Ingest Unstructured Reports:** Read text documents about companies, people, and products.
2.  **Construct a Knowledge Graph:** Use an LLM to extract entities (e.g., `Company`, `Person`) and relationships (e.g., `ACQUIRED`, `WORKS_FOR`, `COMPETES_WITH`) and populate a Neo4j graph database.
3.  **Answer Complex, Multi-Hop Questions:** Use the graph to answer questions that require connecting multiple pieces of information, such as "*Who works for the company that acquired BetaSolutions?*". Queries like this are hard for standard vector search, which retrieves passages by similarity rather than by following relationships.

### Definition
A **Graph / World-Model Memory** is an agentic architecture where knowledge is stored in a structured graph database. Information is represented as nodes (entities like people, places, concepts) and edges (the relationships between them). This creates a dynamic "world model" that the agent can reason over.
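To make the node-and-edge vocabulary concrete, here is a minimal in-memory sketch. The `Entity`, `Edge`, and `TinyGraph` names are hypothetical and exist only for illustration; the notebook itself stores the graph in Neo4j, not in Python objects.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    id: str     # unique name, e.g. "AlphaCorp"
    label: str  # entity type, e.g. "Company"


@dataclass(frozen=True)
class Edge:
    source: Entity
    type: str   # relationship type, e.g. "ACQUIRED"
    target: Entity


class TinyGraph:
    """In-memory stand-in for a graph database (illustration only)."""

    def __init__(self) -> None:
        self._out = defaultdict(set)  # entity id -> set of outgoing edges

    def add(self, edge: Edge) -> None:
        self._out[edge.source.id].add(edge)

    def neighbors(self, entity_id: str, rel_type: str) -> List[str]:
        """Ids reachable from entity_id over one edge of the given type."""
        return sorted(e.target.id for e in self._out[entity_id] if e.type == rel_type)


alpha = Entity("AlphaCorp", "Company")
beta = Entity("BetaSolutions", "Company")
g = TinyGraph()
g.add(Edge(alpha, "ACQUIRED", beta))
print(g.neighbors("AlphaCorp", "ACQUIRED"))
```

A real graph database adds persistence, indexing, and a query language on top of this basic shape.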

### High-level Workflow

1.  **Information Ingestion:** The agent receives unstructured or semi-structured data (text, documents, API responses).
2.  **Knowledge Extraction:** An LLM-powered process parses the information, identifying key entities and the relationships that connect them.
3.  **Graph Update:** The extracted nodes and edges are added to or updated in a persistent graph database (like Neo4j).
4.  **Question Answering / Reasoning:** When asked a question, the agent:
    a. Converts the natural language question into a formal graph query language (e.g., Cypher for Neo4j).
    b. Executes the query against the graph to retrieve relevant subgraphs or facts.
    c. Synthesizes the query results into a natural language answer.

### When to Use / Applications
*   **Enterprise Knowledge Assistants:** Building a queryable model of a company's projects, employees, and customers from internal documents.
*   **Advanced Research Assistants:** Creating a dynamic knowledge base of a scientific field by ingesting research papers.
*   **Complex System Diagnostics:** Modeling a system's components and their dependencies to diagnose failures.

### Strengths & Weaknesses
*   **Strengths:**
    *   **Structured & Explainable:** The knowledge is highly organized. An answer can be explained by showing the exact path in the graph that led to it.
    *   **Enables Complex Reasoning:** Excels at answering "multi-hop" questions that require connecting disparate pieces of information through relationships.
*   **Weaknesses:**
    *   **Upfront Complexity:** Requires a well-defined schema and a robust extraction process.
    *   **Keeping the Graph Updated:** Can be challenging to manage updates, resolve conflicting information, and prune outdated facts over time (knowledge lifecycle management).
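The knowledge lifecycle weakness can be illustrated with a small sketch. It assumes each fact carries an `observed_on` date and that some relationship types are single-valued. The `Fact` class, the `HAS_CTO` type, the `SINGLE_VALUED` rule, and the placeholder entity names are all hypothetical. "Newest wins" is only one possible conflict policy.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Fact:
    source: str
    type: str
    target: str
    observed_on: date


# Hypothetical rule: these relationship types allow one target per source.
SINGLE_VALUED = {"HAS_CTO"}


def resolve(facts: List[Fact]) -> List[Fact]:
    """Keep every multi-valued fact; for single-valued types keep only the newest."""
    newest: Dict[Tuple[str, str], Fact] = {}
    kept: List[Fact] = []
    for fact in facts:
        if fact.type not in SINGLE_VALUED:
            kept.append(fact)
            continue
        key = (fact.source, fact.type)
        if key not in newest or fact.observed_on > newest[key].observed_on:
            newest[key] = fact
    return kept + list(newest.values())


def prune(facts: List[Fact], cutoff: date) -> List[Fact]:
    """Drop facts that were last observed before the cutoff date."""
    return [f for f in facts if f.observed_on >= cutoff]


facts = [
    Fact("Company X", "HAS_CTO", "Person A", date(2021, 1, 1)),
    Fact("Company X", "HAS_CTO", "Person B", date(2023, 6, 1)),
    Fact("Company X", "PRODUCES", "Product P", date(2020, 3, 1)),
]
current = resolve(facts)
recent = prune(current, date(2021, 1, 1))
print([f.target for f in current])
print([f.target for f in recent])
```

In a production graph the same idea would usually live in edge properties (for example a timestamp on each relationship) rather than in application code.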

## Phase 0: Foundation & Setup

We'll install libraries, including the Neo4j driver, and configure our environment. **Crucially, you must have a running Neo4j instance and a `.env` file with its credentials.**

In [1]:
# !pip install -q -U langchain-nebius langchain langgraph rich python-dotenv langchain_community neo4j

In [2]:
import os
from typing import List, Dict, Any, Optional
from dotenv import load_dotenv

# Pydantic for data modeling. BaseModel and Field must come from the same
# Pydantic major version, so both are imported from LangChain's v1 namespace.
from langchain_core.pydantic_v1 import BaseModel as V1BaseModel, Field

# LangChain components
from langchain_nebius import ChatNebius
from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain
from langchain_core.prompts import ChatPromptTemplate

# For pretty printing
from rich.console import Console
from rich.markdown import Markdown

# --- API Key and Tracing Setup ---
load_dotenv()

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "Agentic Architecture - Graph Memory (Nebius)"

required_vars = ["NEBIUS_API_KEY", "LANGCHAIN_API_KEY", "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"]
for var in required_vars:
    if var not in os.environ:
        print(f"Warning: Environment variable {var} not set.")

print("Environment variables loaded and tracing is set up.")

Environment variables loaded and tracing is set up.


## Phase 1: Building the Graph Construction Agent

This is the heart of the ingestion pipeline. We need an agent that can read unstructured text and extract entities and relationships in a structured format. We will use an LLM with structured output capabilities (Pydantic) to act as our knowledge extractor.
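LLM extractors do not always follow formatting instructions, so it can help to normalize labels and relationship types before they reach the database. The helpers below are a hypothetical sketch and not part of the notebook's pipeline. They assume relationship types should be `UPPER_SNAKE_CASE` and node labels `PascalCase`.

```python
import re


def normalize_rel_type(raw: str) -> str:
    """Turn 'works for' or 'Works-For' into 'WORKS_FOR'."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", raw.strip())
    return cleaned.strip("_").upper()


def normalize_label(raw: str) -> str:
    """Turn 'company' into 'Company' and 'chief science officer' into 'ChiefScienceOfficer'."""
    words = re.split(r"[^A-Za-z0-9]+", raw.strip())
    return "".join(w[:1].upper() + w[1:] for w in words if w)


print(normalize_rel_type("works for"))
print(normalize_label("company"))
```

Applying functions like these to every extracted `type` field reduces the chance that `WORKS_FOR` and `works for` end up as two distinct relationship types in the graph.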

In [3]:
console = Console()
llm = ChatNebius(model="mistralai/Mixtral-8x22B-Instruct-v0.1", temperature=0)

# Connect to our Neo4j database
try:
    graph = Neo4jGraph()
    # Clear the graph for a clean run
    graph.query("MATCH (n) DETACH DELETE n")
except Exception as e:
    console.print(f"[bold red]Failed to connect to Neo4j: {e}. Please check your credentials and connection.[/bold red]")
    graph = None

# Pydantic models for structured extraction (using LangChain's v1 BaseModel for compatibility with older structured output methods)
class Node(V1BaseModel):
    id: str = Field(description="Unique name or identifier for the entity.")
    type: str = Field(description="The type or label of the entity (e.g., Person, Company, Product).")

class Relationship(V1BaseModel):
    source: Node
    target: Node
    type: str = Field(description="The type of relationship (e.g., WORKS_FOR, ACQUIRED).")

class KnowledgeGraph(V1BaseModel):
    """A graph of nodes and relationships."""
    relationships: List[Relationship]

# The Graph Maker Agent
def get_graph_maker_chain():
    extractor_llm = llm.with_structured_output(KnowledgeGraph)
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are an expert at extracting information from text and building a knowledge graph. Extract all entities (nodes) and relationships from the provided text. The relationship type should be a verb in all caps, like 'WORKS_FOR' or 'ACQUIRED'."),
        ("human", "Extract a knowledge graph from the following text:\n\n{text}")
    ])
    return prompt | extractor_llm

graph_maker_agent = get_graph_maker_chain()
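# Only report a successful connection if the Neo4j setup above actually worked
if graph is not None:
    print("Successfully connected to Neo4j and defined the Graph Maker Agent.")
else:
    print("Graph Maker Agent defined, but the Neo4j connection failed.")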

Successfully connected to Neo4j and defined the Graph Maker Agent.


## Phase 2: Ingesting Knowledge and Building the World Model

Now, we'll feed our agent a series of related but separate documents. The agent will process each one and progressively build up our corporate knowledge graph. This simulates how a real system would learn over time as new information becomes available.
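Progressive ingestion works best when writing the same fact twice does not create duplicates. One common way to get that in Neo4j is Cypher's `MERGE` clause. The sketch below only builds the query string; `build_merge_query` is a hypothetical helper, not something the notebook or LangChain provides. Labels and relationship types cannot be passed as Cypher parameters, so they are validated before being interpolated, while the ids stay parameterized.

```python
import re

# Labels and relationship types are interpolated into the query text,
# so restrict them to plain identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_merge_query(source_label: str, rel_type: str, target_label: str) -> str:
    """Build an idempotent Cypher statement for one (source)-[rel]->(target) fact."""
    for name in (source_label, rel_type, target_label):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"Unsafe label or relationship type: {name!r}")
    return (
        f"MERGE (s:{source_label} {{id: $source_id}}) "
        f"MERGE (t:{target_label} {{id: $target_id}}) "
        f"MERGE (s)-[:{rel_type}]->(t)"
    )


query = build_merge_query("Company", "ACQUIRED", "Company")
params = {"source_id": "AlphaCorp", "target_id": "BetaSolutions"}
print(query)
# With a live connection this could be run as: graph.query(query, params)
```

Because every clause is a `MERGE`, running the statement a second time with the same parameters matches the existing nodes and relationship instead of creating new ones.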

In [4]:
unstructured_documents = [
    "On May 15, 2023, global tech giant AlphaCorp announced its acquisition of startup BetaSolutions, a leader in cloud-native database technology.",
    "Dr. Evelyn Reed, a renowned AI researcher, has been the Chief Science Officer at AlphaCorp since 2021. She leads the team responsible for the QuantumLeap AI platform.",
    "Innovate Inc.'s flagship product, NeuraGen, is seen as a direct competitor to AlphaCorp's QuantumLeap AI. Meanwhile, Innovate Inc. recently hired Johnathan Miles as their new CTO."
]
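# Neo4jGraph.add_graph_documents expects LangChain GraphDocument objects, not our
# extraction models, so convert each extracted KnowledgeGraph before writing it.
from langchain_core.documents import Document
from langchain_community.graphs.graph_document import (
    GraphDocument, Node as GraphNode, Relationship as GraphRelationship
)

def to_graph_document(kg: KnowledgeGraph, text: str) -> GraphDocument:
    nodes = {}  # de-duplicate nodes by id within one document
    rels = []
    for rel in kg.relationships:
        src = nodes.setdefault(rel.source.id, GraphNode(id=rel.source.id, type=rel.source.type))
        tgt = nodes.setdefault(rel.target.id, GraphNode(id=rel.target.id, type=rel.target.type))
        rels.append(GraphRelationship(source=src, target=tgt, type=rel.type))
    return GraphDocument(nodes=list(nodes.values()), relationships=rels, source=Document(page_content=text))

for i, doc in enumerate(unstructured_documents):
    console.print(f"--- Ingesting Document {i+1} ---")
    try:
        kg_data = graph_maker_agent.invoke({"text": doc})
        if kg_data.relationships:
            graph.add_graph_documents([to_graph_document(kg_data, doc)], include_source=False)
            console.print(f"[green]Successfully added {len(kg_data.relationships)} relationships to the graph.[/green]")
        else:
            console.print("[yellow]No relationships extracted.[/yellow]")
    except Exception as e:
        console.print(f"[red]Failed to process document: {e}[/red]")

console.print("--- ✅ Knowledge Graph Ingestion Complete ---")
graph.refresh_schema()  # the cached schema predates ingestion, so reload it
console.print("\n--- Graph Schema ---")
console.print(graph.schema)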

--- Ingesting Document 1 ---
Successfully added 1 relationships to the graph.
--- Ingesting Document 2 ---
Successfully added 2 relationships to the graph.
--- Ingesting Document 3 ---
Successfully added 2 relationships to the graph.
--- ✅ Knowledge Graph Ingestion Complete ---



--- Graph Schema ---


Node properties: [{'properties': [('id', 'STRING')], 'labels': ['Product']}, {'properties': [('id', 'STRING')], 'labels': ['Person']}, {'properties': [('id', 'STRING')], 'labels': ['Company']}]
Relationship properties: []
Relationships: [(:Company)-[:PRODUCES]->(:Product), (:Person)-[:WORKS_FOR]->(:Company), (:Product)-[:COMPETES_WITH]->(:Product), (:Company)-[:ACQUIRED]->(:Company)]


## Phase 3: Building the Graph-Querying Agent

With our knowledge graph populated, we need an agent that can use it. This involves a **Text-to-Cypher** process. The agent will receive a user's natural language question, convert it into a Cypher query using the graph schema as context, execute the query, and then synthesize the results into a human-readable answer.
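Two practical concerns in a Text-to-Cypher step are that the LLM may wrap its query in a markdown code fence, and that a generated query could modify the graph. The helpers below are a hedged sketch of one way to handle both. `clean_cypher` and `is_read_only` are hypothetical names, and the keyword check is a conservative heuristic (it can reject a harmless query that merely mentions one of the words), not a Cypher parser.

```python
import re

_TICKS = "`" * 3  # built this way to avoid a literal fence inside this example
_FENCE = re.compile(rf"^{_TICKS}[a-zA-Z]*[ \t]*\n(.*?)\n?{_TICKS}$", re.DOTALL)
_WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP)\b", re.IGNORECASE
)


def clean_cypher(raw: str) -> str:
    """Strip surrounding whitespace and an optional markdown code fence."""
    text = raw.strip()
    match = _FENCE.match(text)
    return match.group(1).strip() if match else text


def is_read_only(cypher: str) -> bool:
    """Heuristic: True if the query contains no write or schema-changing keyword."""
    return _WRITE_CLAUSES.search(cypher) is None


raw_llm_output = _TICKS + "cypher\nMATCH (n) RETURN n\n" + _TICKS
cypher = clean_cypher(raw_llm_output)
print(cypher)
print(is_read_only(cypher))
```

A querying agent could call `clean_cypher` on the model output and refuse to execute anything for which `is_read_only` returns `False`. Connecting with a database user that only has read permissions is a stronger safeguard than any string check.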

In [5]:
# LangChain has a built-in chain for this, but we'll inspect the components
# to understand how it works.
cypher_generation_prompt = ChatPromptTemplate.from_template(
    """You are an expert Neo4j Cypher query developer. Your task is to convert a user's natural language question into a valid Cypher query.
You must use the provided graph schema to construct the query. Do not use any node labels or relationship types that are not in the schema.
Return ONLY the Cypher query, with no additional text or explanations.

Graph Schema:
{schema}

User Question:
{question}
"""
)

cypher_response_prompt = ChatPromptTemplate.from_template(
    """You are an assistant that provides clear, natural language answers based on query results from a knowledge graph.
Use the context from the graph query result to answer the user's original question.

User Question: {question}
Query Result: {context}
"""
)

def query_graph(question: str) -> Dict[str, Any]:
    """The full Text-to-Cypher and synthesis pipeline."""
    console.print(f"\n[bold]Question:[/bold] {question}")
    
    # 1. Generate Cypher Query
    console.print("--- ➡️ Generating Cypher Query ---")
    cypher_chain = cypher_generation_prompt | llm
    generated_cypher = cypher_chain.invoke({"schema": graph.schema, "question": question}).content
    console.print(f"[cyan]Generated Cypher:\n{generated_cypher}[/cyan]")
    
    # 2. Execute Cypher Query
    console.print("--- ⚡ Executing Query ---")
    try:
        context = graph.query(generated_cypher)
        console.print(f"[yellow]Query Result:\n{context}[/yellow]")
    except Exception as e:
        console.print(f"[red]Cypher Query Failed: {e}[/red]")
        return {"answer": "I was unable to execute a query to find the answer to your question."}
    
    # 3. Synthesize Final Answer
    console.print("--- 🗣️ Synthesizing Final Answer ---")
    synthesis_chain = cypher_response_prompt | llm
    answer = synthesis_chain.invoke({"question": question, "context": context}).content
    
    return {"answer": answer}

print("Graph-Querying Agent defined successfully.")

Graph-Querying Agent defined successfully.


## Phase 4: Demonstration & Analysis

Now for the ultimate test. We will ask our agent questions that range from simple fact retrieval to complex, multi-hop reasoning that requires connecting information from all three of our source documents.

In [6]:
# Test 1: Simple fact retrieval (requires info from doc 2)
result1 = query_graph("Who works for AlphaCorp?")
console.print("\n--- Final Answer ---")
console.print(Markdown(result1['answer']))

# Test 2: Another simple fact retrieval (requires info from doc 1)
result2 = query_graph("What company did AlphaCorp acquire?")
console.print("\n--- Final Answer ---")
console.print(Markdown(result2['answer']))

# Test 3: The multi-hop reasoning question (requires info from all 3 docs)
result3 = query_graph("What companies compete with the products made by the company that acquired BetaSolutions?")
console.print("\n--- Final Answer ---")
console.print(Markdown(result3['answer']))


Question: Who works for AlphaCorp?
--- ➡️ Generating Cypher Query ---
Generated Cypher:
MATCH (p:Person)-[:WORKS_FOR]->(c:Company {id: 'AlphaCorp'}) RETURN p.id
--- ⚡ Executing Query ---
Query Result:
[{'p.id': 'Dr. Evelyn Reed'}]
--- 🗣️ Synthesizing Final Answer ---



--- Final Answer ---


Dr. Evelyn Reed works for AlphaCorp.


Question: What company did AlphaCorp acquire?
--- ➡️ Generating Cypher Query ---
Generated Cypher:
MATCH (:Company {id: 'AlphaCorp'})-[:ACQUIRED]->(acquired_company:Company)
RETURN acquired_company.id
--- ⚡ Executing Query ---
Query Result:
[{'acquired_company.id': 'BetaSolutions'}]
--- 🗣️ Synthesizing Final Answer ---



--- Final Answer ---


AlphaCorp acquired BetaSolutions.


Question: What companies compete with the products made by the company that acquired BetaSolutions?
--- ➡️ Generating Cypher Query ---
Generated Cypher:
MATCH (acquirer:Company)-[:ACQUIRED]->(:Company {id: 'BetaSolutions'})
MATCH (acquirer)-[:PRODUCES]->(product:Product)
MATCH (product)-[:COMPETES_WITH]->(competitor_product:Product)
MATCH (competitor_company:Company)-[:PRODUCES]->(competitor_product)
RETURN DISTINCT competitor_company.id
--- ⚡ Executing Query ---
Query Result:
[{'competitor_company.id': 'Innovate Inc.'}]
--- 🗣️ Synthesizing Final Answer ---



--- Final Answer ---


Innovate Inc. competes with the products made by the company that acquired BetaSolutions.

### Analysis of the Results

The demonstration highlights the profound advantage of a graph-based world model:

- The first two questions were simple lookups. The agent successfully converted the questions into Cypher, queried the graph, and found the direct relationships.

- The third question is the crucial one. A standard vector-search RAG agent would likely struggle here. It might retrieve the document about the acquisition and the document about the competitor, but nothing in the retrieved chunks tells it how to chain them together. It has no explicit relational structure recording that the "AlphaCorp" in document 1 is the same entity as the "AlphaCorp" in documents 2 and 3, so the connection has to be rediscovered by the LLM at answer time.

- Our graph-based agent, however, solved it with ease. We can trace its logic directly from the generated Cypher query:
    1.  `MATCH (acquirer:Company)-[:ACQUIRED]->(:Company {id: 'BetaSolutions'})`: First, find the company that acquired BetaSolutions (Result: AlphaCorp).
    2.  `MATCH (acquirer)-[:PRODUCES]->(product:Product)`: Next, find the products produced by that company (Result: QuantumLeap AI).
    3.  `MATCH (product)-[:COMPETES_WITH]->(competitor_product:Product)`: Then, find the products that compete with that product (Result: NeuraGen).
    4.  `MATCH (competitor_company:Company)-[:PRODUCES]->(competitor_product)`: Finally, find the company that produces the competing product (Result: Innovate Inc.).
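The four hops traced above can be reproduced in plain Python to show what the database is doing conceptually. The triples below are hand-written to mirror the results in the trace and are an assumption about the graph contents, not an export from Neo4j. `forward` and `backward` are hypothetical helper names.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (source, relationship type, target)

# Assumed graph contents, copied from the results in the trace above
TRIPLES: List[Triple] = [
    ("AlphaCorp", "ACQUIRED", "BetaSolutions"),
    ("AlphaCorp", "PRODUCES", "QuantumLeap AI"),
    ("QuantumLeap AI", "COMPETES_WITH", "NeuraGen"),
    ("Innovate Inc.", "PRODUCES", "NeuraGen"),
]


def forward(starts: Set[str], rel: str) -> Set[str]:
    """Follow edges of type rel from any start node to their targets."""
    return {t for s, r, t in TRIPLES if r == rel and s in starts}


def backward(ends: Set[str], rel: str) -> Set[str]:
    """Follow edges of type rel in reverse, from targets back to sources."""
    return {s for s, r, t in TRIPLES if r == rel and t in ends}


acquirers = backward({"BetaSolutions"}, "ACQUIRED")  # hop 1
products = forward(acquirers, "PRODUCES")            # hop 2
rivals = forward(products, "COMPETES_WITH")          # hop 3
makers = backward(rivals, "PRODUCES")                # hop 4
print(sorted(makers))
```

Each intermediate set corresponds to one `MATCH` clause, which is what makes the final answer easy to audit.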

This ability to traverse relationships and synthesize information from different sources is the superpower of the Graph / World-Model architecture. The answer is not just retrieved; it is reasoned.

## Conclusion

In this notebook, we have constructed a complete agentic system built around a **Graph / World-Model Memory**. We demonstrated the full lifecycle: ingesting unstructured data, using an LLM to build a structured knowledge graph, and then using that graph to answer complex, multi-hop questions that require genuine reasoning.

This architecture represents a significant leap in capability over simpler memory systems. By creating an explicit, queryable model of the world, we give our agents the ability to connect disparate facts and uncover hidden insights. While the challenges of maintaining this graph over time are real, the potential for building deeply knowledgeable and explainable AI assistants makes this one of the most exciting and powerful patterns in modern agentic design.