# GraphRAG ChatBot For Stakeholder Model Whitepaper

This notebook walks through loading the GODOT Stakeholder Value Model whitepaper PDF, converting it into nodes and relationships in a Neo4j graph database, and using an OpenAI chat model (GPT-4o) to answer natural-language questions against that graph.

In [22]:
import os
from dotenv import load_dotenv

from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter

from langchain_community.document_loaders import PyPDFLoader
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_neo4j import GraphCypherQAChain, Neo4jGraph
from langchain_openai import ChatOpenAI

In [4]:
load_dotenv(dotenv_path=".env")

URI = os.getenv("NEO4J_URI")
USER = os.getenv("NEO4J_USER")
PWD = os.getenv("NEO4J_PASSWORD")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
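
Because `os.getenv` returns `None` for an unset variable, a missing entry in `.env` would only surface later as a connection or authentication error. A minimal sketch of an early check follows. The helper name `missing_env_vars` and the `REQUIRED_VARS` tuple are hypothetical and not part of the original notebook.

```python
import os

# Variable names read by the cell above.
REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD", "OPENAI_API_KEY")


def missing_env_vars(names, env=None):
    """Return the names in `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.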

In [6]:
pdf_path = "The GODOT Stakeholder Value Model_ Whitepaper + Game Theory.pdf"
loader = PyPDFLoader(pdf_path)
pages = loader.load()

In [10]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(pages)

In [11]:
len(chunks)

168
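
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-width sliding-window splitter. It is an illustration only: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, so its chunks will generally not match this version. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split `text` into fixed-width windows that share `chunk_overlap` characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far each window advances
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


sample = "".join(str(i % 10) for i in range(2600))
pieces = sliding_chunks(sample)
print(len(pieces))                          # 3
print(pieces[0][-200:] == pieces[1][:200])  # True
```

With these settings each window advances 800 characters, so consecutive chunks repeat 200 characters of context.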

In [14]:
graph = Neo4jGraph(url=URI, username=USER, password=PWD)
llm = ChatOpenAI(temperature=0, model_name="gpt-4o", openai_api_key=OPENAI_API_KEY)
llm_transformer = LLMGraphTransformer(llm=llm)

In [15]:
graph_documents = llm_transformer.convert_to_graph_documents(chunks)

In [16]:
graph.add_graph_documents(graph_documents)

In [59]:
enhanced_graph = Neo4jGraph(url=URI, username=USER, password=PWD, enhanced_schema=True)

In [60]:
CYPHER_GENERATION_TEMPLATE = """
You are an expert Neo4j Cypher developer.
Translate the user's natural language question INTO Cypher queries
that retrieve information ONLY from the GODOT Stakeholder Value Model graph.

Schema:
{schema}

Rules:
- Only use labels and relationship types that exist in the schema.
- Scope all queries to the GODOT Stakeholder Value Model context. Do NOT return generic definitions.
- Use explicit property filters with WHERE, e.g. MATCH (c:Company) WHERE c.`id` = "Godot".
- For broad questions ("What is X?"), treat X as a graph concept and expand context with OPTIONAL MATCH.
- Return informative properties (`id`, `name`, `title`, `summary`, `description`) and relationship targets.
- Use clear aliases, e.g. RETURN c.`id` AS company, collect(DISTINCT p.`id`) AS policies.
- Do NOT invent labels or properties that aren’t in the schema.
- If nothing is found, still produce a minimal MATCH/WHERE that returns zero rows.

Few-shot guidance:

Q: "What is Godot as a company?"
Cypher:
MATCH (c:Company) WHERE c.`id` = "Godot"
OPTIONAL MATCH (c)-[:HAS_POLICY]->(p:Policy)
OPTIONAL MATCH (c)-[:HAS_STRUCTURE]->(s:Structure)
OPTIONAL MATCH (c)-[:HAS_EQUITY_TRUST]->(t:EquityTrust)
OPTIONAL MATCH (c)-[:RELATES_TO]->(k:Concept)
RETURN c.`id` AS company,
       collect(DISTINCT p.`id`) AS policies,
       collect(DISTINCT s.`id`) AS structures,
       collect(DISTINCT t.`id`) AS equity_trust_elements,
       collect(DISTINCT k.`id`) AS related_concepts

Q: "What is game theory?"
Cypher:
MATCH (g:Concept)
WHERE toLower(g.`id`) = "game theory" OR toLower(g.`name`) CONTAINS "game theory"
OPTIONAL MATCH (g)<-[:APPLIES_FRAMEWORK]-(:Analysis)-[:USES]->(f:GameTheoryFramework)
OPTIONAL MATCH (g)<-[:RELATES_TO]-(co:Company)-[:RELATES_TO]->(c:Concept)
WHERE co.`id` = "Godot"
RETURN g.`id` AS concept,
       coalesce(g.`summary`, g.`description`, g.`id`) AS concept_summary,
       collect(DISTINCT f.`id`) AS frameworks,
       collect(DISTINCT c.`id`) AS nearby_company_concepts

Q: "How are bonuses calculated?"
Cypher:
MATCH (co:Company) WHERE co.`id` = "Godot"
OPTIONAL MATCH (co)-[:HAS_EQUITY_TRUST]->(t:EquityTrust)
OPTIONAL MATCH (t)-[:ALLOCATES]->(b:BonusPolicy)
OPTIONAL MATCH (b)-[:DEFINED_BY]->(kpi:KPI)
RETURN b.`id` AS bonus_policy,
       coalesce(b.`formula`, b.`summary`, b.`description`) AS bonus_formula,
       kpi.`id` AS kpi_id, kpi.`definition` AS kpi_definition

Q: "Summarize the compensation model."
Cypher:
MATCH (co:Company) WHERE co.`id` = "Godot"
OPTIONAL MATCH (co)-[:HAS_POLICY]->(p:Policy)
OPTIONAL MATCH (co)-[:HAS_EQUITY_TRUST]->(t:EquityTrust)-[:ALLOCATES]->(alloc)
WITH co, p, t, collect(DISTINCT labels(alloc)[0] + ":" + coalesce(alloc.`id`, alloc.`name`)) AS trust_allocations
RETURN co.`id` AS company,
       collect(DISTINCT p.`id`) AS policies,
       trust_allocations

User Question:
{question}
"""

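# GraphCypherQAChain fills the Cypher prompt with "schema" and "question",
# matching the {schema} and {question} placeholders in the template above.
cypher_generation_prompt = PromptTemplate(
    template=CYPHER_GENERATION_TEMPLATE,
    input_variables=["schema", "question"]
)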
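
The template uses the placeholders `{schema}` and `{question}`, and the declared input variables should name exactly those fields. A small sketch for listing the placeholders in any `str.format`-style template, using only the standard library, is shown below. The helper name `template_fields` and the short `example` string are hypothetical.

```python
from string import Formatter


def template_fields(template):
    """Return the set of named placeholders in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


example = "Schema:\n{schema}\n\nUser Question:\n{question}\n"
print(sorted(template_fields(example)))  # ['question', 'schema']
```

Running `template_fields(CYPHER_GENERATION_TEMPLATE)` in the notebook would show which names the prompt actually expects.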


In [61]:
cypher_chain = GraphCypherQAChain.from_llm(
    llm,
    graph=enhanced_graph,
    cypher_prompt=cypher_generation_prompt,
    verbose=True,
    allow_dangerous_requests=True
)


In [62]:
cypher_chain.invoke({"query":"What is game theory?"})



> Entering new GraphCypherQAChain chain...
Generated Cypher:
cypher
MATCH (g:Concept)
WHERE toLower(g.`id`) = "game theory" OR toLower(g.`name`) CONTAINS "game theory"
OPTIONAL MATCH (g)<-[:APPLIES_FRAMEWORK]-(:Analysis)-[:USES]->(f:GameTheoryFramework)
OPTIONAL MATCH (g)<-[:RELATES_TO]-(co:Company)-[:RELATES_TO]->(c:Concept)
WHERE co.`id` = "Godot"
RETURN g.`id` AS concept,
       coalesce(g.`summary`, g.`description`, g.`id`) AS concept_summary,
       collect(DISTINCT f.`id`) AS frameworks,
       collect(DISTINCT c.`id`) AS nearby_company_concepts

Full Context:
[{'concept': 'Game Theory', 'concept_summary': 'Analytical lens in the model to evaluate incentive compatibility and cooperation.', 'frameworks': [], 'nearby_company_concepts': []}]

> Finished chain.


{'query': 'What is game theory?',
 'result': 'Game theory is an analytical lens in the model to evaluate incentive compatibility and cooperation.'}

In [63]:
cypher_chain.invoke({"query":"What are the incentives for employees?"})



> Entering new GraphCypherQAChain chain...
Generated Cypher:
cypher
MATCH (g:Group) WHERE g.`id` = "Employees"
OPTIONAL MATCH (g)-[:BENEFIT_FROM]->(c:Concept)
OPTIONAL MATCH (g)-[:RECEIVE]->(p:Policy)
RETURN g.`id` AS group,
       collect(DISTINCT c.`id`) AS benefit_concepts,
       collect(DISTINCT p.`id`) AS policies

Full Context:
[{'group': 'Employees', 'benefit_concepts': [], 'policies': []}]

> Finished chain.


{'query': 'What are the incentives for employees?',
 'result': "I don't know the answer."}
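
The last query shows a typical failure mode: the generated Cypher matched the `Employees` node, but both `OPTIONAL MATCH` clauses returned empty lists, so the answering step had nothing to work with. A minimal sketch for detecting this case from the rows printed under "Full Context" follows. The helper `has_supporting_context` and its `anchor_keys` parameter are hypothetical; the sample rows are copied from the two runs above.

```python
def has_supporting_context(rows, anchor_keys):
    """Return True if any row holds a non-empty value outside `anchor_keys`."""
    for row in rows:
        for key, value in row.items():
            if key in anchor_keys:
                continue  # the matched node's own id is not evidence
            if value not in (None, "", [], {}):
                return True
    return False


game_theory_rows = [{
    "concept": "Game Theory",
    "concept_summary": "Analytical lens in the model to evaluate "
                       "incentive compatibility and cooperation.",
    "frameworks": [],
    "nearby_company_concepts": [],
}]
employee_rows = [{"group": "Employees", "benefit_concepts": [], "policies": []}]

print(has_supporting_context(game_theory_rows, {"concept"}))  # True
print(has_supporting_context(employee_rows, {"group"}))       # False
```

When the check returns `False`, one possible follow-up is to compare the relationship types in the generated Cypher against the schema string exposed by `enhanced_graph.schema`, since the model may have chosen relationship types that are not actually attached to the matched node.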