# Graph RAG using LangChain and Kuzu

This notebook demonstrates **Graph-based Retrieval Augmented Generation** using:
- KuzuDB as the graph database
- LangChain for graph construction and querying
- LLMs for knowledge graph extraction

Key features:
- Converts text into structured knowledge graphs
- Supports complex relationship queries
- Visualizes the resulting graph

Based on: [langchain-kuzu package](https://pypi.org/project/langchain-kuzu/)

In [1]:
# Default setup cell.
# It loads environment variables, defines `devtools.debug` as a builtin, sets PYTHONPATH, and enables code auto-reload.
# Copy it into other notebooks.


from dotenv import load_dotenv
from rich import print

load_dotenv(verbose=True)
%load_ext autoreload
%autoreload 2
%reset -f

# cSpell: disable

## Import Required Libraries

Key components we'll use:
- `kuzu`: Graph database engine
- `LLMGraphTransformer`: Converts text to graph structure
- `KuzuGraph`: LangChain wrapper for KuzuDB
- `KuzuQAChain`: Handles graph-based question answering
- `CytoscapeWidget`: For graph visualization

In [2]:
import kuzu
from genai_tk.core.llm_factory import get_llm
from ipycytoscape import CytoscapeWidget
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_kuzu.chains.graph_qa.kuzu import KuzuQAChain
from langchain_kuzu.graphs.kuzu_graph import KuzuGraph
from langchain_openai import ChatOpenAI

## Build the Knowledge Graph

We'll create a graph from text by:
1. Defining allowed node types (Person, Company, Location)
2. Specifying allowed relationships between them
3. Using an LLM to extract entities and relationships
4. Storing the structured graph in KuzuDB

The example text contains facts about Apple's CEO and headquarters location.
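To make the role of the schema concrete, here is a minimal, library-free sketch of what "allowed nodes" and "allowed relationships" constrain. It is not how `LLMGraphTransformer` works internally; the `Entity`, `Triple`, and `conforms_to_schema` names are hypothetical and exist only for illustration. The schema values are the ones used in this notebook.

```python
from dataclasses import dataclass

# Schema values mirror the notebook; the classes and helper below are illustrative only.
ALLOWED_NODES = {"Person", "Company", "Location"}
ALLOWED_RELATIONSHIPS = {
    ("Person", "IS_CEO_OF", "Company"),
    ("Company", "HAS_HEADQUARTERS_IN", "Location"),
}


@dataclass(frozen=True)
class Entity:
    id: str
    type: str


@dataclass(frozen=True)
class Triple:
    source: Entity
    relation: str
    target: Entity


def conforms_to_schema(triple: Triple) -> bool:
    """Return True if both node types and the typed relationship are allowed."""
    if triple.source.type not in ALLOWED_NODES or triple.target.type not in ALLOWED_NODES:
        return False
    key = (triple.source.type, triple.relation, triple.target.type)
    return key in ALLOWED_RELATIONSHIPS


tim = Entity("Tim Cook", "Person")
apple = Entity("Apple", "Company")
california = Entity("California", "Location")

candidates = [
    Triple(tim, "IS_CEO_OF", apple),
    Triple(apple, "HAS_HEADQUARTERS_IN", california),
    Triple(tim, "HAS_HEADQUARTERS_IN", california),  # rejected: a Person cannot be the source
]

# Keep only the triples that fit the schema.
valid_triples = [t for t in candidates if conforms_to_schema(t)]
for t in valid_triples:
    print(f"({t.source.id})-[:{t.relation}]->({t.target.id})")
```

Typing the relationships as `(source, relation, target)` tuples, rather than listing relation names alone, is what lets the third candidate be rejected even though `HAS_HEADQUARTERS_IN` is a known relation.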

In [3]:
DB = "/tmp/test_db4"
MODEL_ID = None
llm = get_llm(llm_id=MODEL_ID)

# llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

text = "Tim Cook is the CEO of Apple. Apple has its headquarters in California."

if True:  # new schema
    allowed_nodes = ["Person", "Company", "Location"]
    allowed_relationships = [
        ("Person", "IS_CEO_OF", "Company"),
        ("Company", "HAS_HEADQUARTERS_IN", "Location"),
    ]
    # Define the LLMGraphTransformer
    llm_transformer = LLMGraphTransformer(
        llm=llm,
        allowed_nodes=allowed_nodes,
        allowed_relationships=allowed_relationships,
    )

    # Convert the given text into graph documents
    documents = [Document(page_content=text)]
    graph_documents = llm_transformer.convert_to_graph_documents(documents)
    db = kuzu.Database(DB)
    graph = KuzuGraph(db, allow_dangerous_requests=True)

    # Add the graph document to the graph
    graph.add_graph_documents(
        graph_documents,
        include_source=True,
    )
else:
    print(f"load {DB}")

2025-09-25 18:43:46.096 | DEBUG    | genai_tk.core.llm_factory:get_llm:726 - get LLM:'kimi_k2_openrouter'


## Query the Knowledge Graph

Now we'll use the `KuzuQAChain` to:
1. Accept natural language questions
2. Generate and execute Cypher queries
3. Return formatted answers

With `verbose=True`, the chain prints each generated Cypher query and the context it retrieved.

In [4]:
llm = get_llm(llm_id=None)
db = kuzu.Database(DB)
graph = KuzuGraph(db, allow_dangerous_requests=True)

# Create the KuzuQAChain with verbosity enabled to see the generated Cypher queries
chain = KuzuQAChain.from_llm(
    llm=llm,
    graph=graph,
    verbose=True,
    allow_dangerous_requests=True,
)

# Query the graph
queries = [
    "Who is the CEO of Apple?",
    "Where is Apple headquartered?",
]

for query in queries:
    result = chain.invoke(query)
    print(f"Query: {query}\nResult: {result}\n")

2025-09-25 18:43:50.262 | DEBUG    | genai_tk.core.llm_factory:get_llm:726 - get LLM:'kimi_k2_openrouter'




> Entering new KuzuQAChain chain...
Generated Cypher:

MATCH (p:Person)-[:IS_CEO_OF]->(c:Company)
WHERE c.id = 'Apple'
RETURN p.id AS CEO

Full Context:
[{'CEO': 'Tim Cook'}]

> Finished chain.
Query: Who is the CEO of Apple?
Result: {'query': 'Who is the CEO of Apple?', 'result': 'The CEO of Apple is Tim Cook.'}



> Entering new KuzuQAChain chain...
Generated Cypher:

MATCH (c:Company {id: "Apple"})-[:HAS_HEADQUARTERS_IN]->(l:Location)
RETURN l.id AS headquarters

Full Context:
[{'headquarters': 'California'}]

> Finished chain.
Query: Where is Apple headquartered?
Result: {'query': 'Where is Apple headquartered?', 'result': 'Apple is headquartered in California.'}
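As the output above shows, `chain.invoke` returns a dictionary with `query` and `result` keys. If only the answer text is wanted, a small helper can pull it out. The sketch below is a hypothetical convenience function (`format_answer` is not part of `langchain-kuzu`), run here on a sample dictionary copied from the output above so that it works without an LLM or database.

```python
def format_answer(result: dict) -> str:
    """Render a chain result as a short Q/A string, tolerating missing keys."""
    question = result.get("query", "<unknown question>")
    answer = result.get("result", "<no answer>")
    return f"Q: {question}\nA: {answer}"


# Sample dictionary copied from the notebook output above.
sample = {
    "query": "Who is the CEO of Apple?",
    "result": "The CEO of Apple is Tim Cook.",
}
formatted = format_answer(sample)
print(formatted)
```

In the query loop, the same helper could be applied as `print(format_answer(chain.invoke(query)))`.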



## Visualize the Graph

We'll use Cytoscape to render an interactive visualization of:
- Nodes (entities)
- Edges (relationships)
- Their properties
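The `get_cytoscape_json` helper used below comes from a separate package and its implementation is not shown here. As a rough idea of the kind of data involved, the sketch below builds a Cytoscape.js-style elements dictionary (`nodes` and `edges`, each entry carrying a `data` mapping) from plain triples. The `triples_to_cytoscape` function and the `label` fields are assumptions made for illustration, not the helper's actual output.

```python
import json


def triples_to_cytoscape(triples):
    """Convert (source, relation, target) triples into a Cytoscape-style elements dict."""
    nodes = {}  # keyed by node id so each entity appears once
    edges = []
    for source, relation, target in triples:
        for node_id in (source, target):
            nodes.setdefault(node_id, {"data": {"id": node_id, "label": node_id}})
        edges.append({"data": {"source": source, "target": target, "label": relation}})
    return {"nodes": list(nodes.values()), "edges": edges}


# Triples matching the facts stored in this notebook's graph.
triples = [
    ("Tim Cook", "IS_CEO_OF", "Apple"),
    ("Apple", "HAS_HEADQUARTERS_IN", "California"),
]
cytoscape_json = triples_to_cytoscape(triples)
print(json.dumps(cytoscape_json, indent=2))
```

A dictionary of this shape is what a call such as `cyto.graph.add_graph_from_json(...)` expects to receive.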

In [5]:
# Helpers that convert the Kuzu graph to Cytoscape JSON and provide a stylesheet
from genai_blueprint.webapp.ui_components.cypher_graph_display import get_cytoscape_json, get_cytoscape_style

# Create the Cytoscape widget and load the graph into it
cyto = CytoscapeWidget()
cyto.graph.add_graph_from_json(get_cytoscape_json(graph))
# Set style and layout
cyto.set_style(get_cytoscape_style())
cyto.set_layout(animate=True)

In [6]:
# Display the graph
cyto

CytoscapeWidget(cytoscape_layout={'name': 'cola', 'animate': True}, cytoscape_style=[{'selector': 'node', 'css…