# Managing graphs with Neo4j

TODO:
- Explain the purpose of this tutorial. More information about what Neo4j is and how it differs from the plotting libraries we saw in the previous tutorial can be found here: [supplementary/Neo4j.md](./supplementary/Neo4j.md).
- Explain what Neo4j is and why it is necessary.
- Explain the approaches followed in this tutorial.

In [1]:
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_core.documents import Document
from langchain_ollama import ChatOllama
from langchain_core.prompts import ChatPromptTemplate

In [2]:
text = '''
The solar system consists of the Sun and the objects that orbit it, including planets, moons, asteroids, comets, and meteoroids.
The Sun is a star at the center of the Solar System.
Mercury is a planet in the Solar System. Mercury orbits the Sun. Mercury has no atmosphere and no magnetic field.
Venus is a planet in the Solar System. Venus orbits the Sun. Venus has a thick atmosphere. The atmosphere of Venus is composed mainly of carbon dioxide. Venus has no magnetic field.
Earth is a planet in the Solar System. Earth orbits the Sun. Earth has one moon called the Moon. Earth has a thick atmosphere composed mainly of nitrogen and oxygen. Earth has a strong magnetic field.
Mars is a planet in the Solar System. Mars orbits the Sun. Mars has two moons called Phobos and Deimos. Mars has a thin atmosphere composed mainly of carbon dioxide. Mars has a weak magnetic field.
Jupiter is a planet in the Solar System. Jupiter orbits the Sun. Jupiter has moons called Io, Europa, Ganymede, and Callisto. Jupiter has a thick atmosphere composed mainly of hydrogen and helium. Jupiter has a strong magnetic field.
'''
print(text)


The solar system consists of the Sun and the objects that orbit it, including planets, moons, asteroids, comets, and meteoroids.
The Sun is a star at the center of the Solar System.
Mercury is a planet in the Solar System. Mercury orbits the Sun. Mercury has no atmosphere and no magnetic field.
Venus is a planet in the Solar System. Venus orbits the Sun. Venus has a thick atmosphere. The atmosphere of Venus is composed mainly of carbon dioxide. Venus has no magnetic field.
Earth is a planet in the Solar System. Earth orbits the Sun. Earth has one moon called the Moon. Earth has a thick atmosphere composed mainly of nitrogen and oxygen. Earth has a strong magnetic field.
Mars is a planet in the Solar System. Mars orbits the Sun. Mars has two moons called Phobos and Deimos. Mars has a thin atmosphere composed mainly of carbon dioxide. Mars has a weak magnetic field.
Jupiter is a planet in the Solar System. Jupiter orbits the Sun. Jupiter has moons called Io, Europa, Ganymede, and Callis

In [3]:
documents = [Document(page_content=text)]

In [4]:
# Choose the Ollama model to use (alternatives are left commented out)
# model_name = 'qwen3-vl:4b'
# model_name = 'llama3.2:3b'  # Or another text-focused model
model_name = 'tomasonjo/llama3-text2cypher-demo:8b_4bit'
# Initialize the ChatOllama instance; temperature=0 keeps the output as deterministic as possible
chat_model = ChatOllama(
    model=model_name,
    validate_model_on_init=True,
    temperature=0
)

In [5]:
query_text = "How many moons does Mars have?"

### Connecting to a Neo4j graph server

Good practices for keeping usernames and passwords out of repositories are described here: [supplementary/Secrets.md](./supplementary/Secrets.md).

In [6]:
import os
from dotenv import load_dotenv
from langchain_neo4j import Neo4jGraph

# Load environment variables from .env file
load_dotenv()

# Get credentials from environment variables
neo4j_url = os.getenv("NEO4J_URL", "bolt://localhost:7687")
neo4j_user = os.getenv("NEO4J_USER", "neo4j")
neo4j_password = os.getenv("NEO4J_PASSWORD")

if not neo4j_password:
    raise ValueError("NEO4J_PASSWORD environment variable is not set. Please create a .env file with your credentials.")

graph = Neo4jGraph(
    url=neo4j_url,
    username=neo4j_user,
    password=neo4j_password
)
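
The credential handling above can be factored into a small helper, which makes the fallback and failure behavior easy to check without a running server. This is a minimal sketch: `read_neo4j_settings` is a hypothetical name, not part of `langchain_neo4j`, and it accepts any mapping so that it can be exercised with a plain dictionary instead of `os.environ`.

```python
import os
from typing import Mapping


def read_neo4j_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect connection settings, mirroring the defaults used in the cell above."""
    password = env.get("NEO4J_PASSWORD")
    if not password:
        # Fail early rather than attempting a connection without credentials.
        raise ValueError("NEO4J_PASSWORD environment variable is not set.")
    return {
        "url": env.get("NEO4J_URL", "bolt://localhost:7687"),
        "username": env.get("NEO4J_USER", "neo4j"),
        "password": password,
    }


# Example with an explicit mapping (placeholder password, not a real secret).
settings = read_neo4j_settings({"NEO4J_PASSWORD": "example-only"})
print(settings["url"])
```

The keys of the returned dictionary match the keyword arguments passed to `Neo4jGraph` above, so under this assumption it could be used as `Neo4jGraph(**settings)`.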

In [7]:
# Create a ChatPromptTemplate for graph extraction
graph_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an expert Neo4j Cypher query generator.

TASK:
- Translate the user's natural language question into a Cypher query.

CONSTRAINTS:
- Use ONLY the schema provided below.
- Do NOT invent labels, relationship types, or properties.
- Do NOT explain the query.
- Output ONLY valid Cypher.
- If the question cannot be answered unambiguously using the schema, output:
  // CANNOT_ANSWER

GRAPH SCHEMA:
Node labels:
- Star {{id}}
- Planet {{id}}
- Moon {{id}}
- Atmosphere {{id}}
- Substance {{id}}
- MagneticFieldStrength {{id}}

Relationships:
- (Planet)-[:ORBITS]->(Star)
- (Moon)-[:ORBITS]->(Planet)
- (Planet)-[:HAS_ATMOSPHERE]->(Atmosphere)
- (Atmosphere)-[:COMPOSED_OF]->(Substance)
- (Planet)-[:HAS_MAGNETIC_FIELD]->(MagneticFieldStrength)
     
ALLOWED VALUES:
- MagneticFieldStrength.id \\in {{"none", "weak", "strong"}}

QUERY RULES:
1. Always specify node labels.
2. Always specify relationship directions.
3. MagneticFieldStrength nodes MUST be matched or merged by id.
4. Use meaningful variable names.
5. Return only properties, not full nodes.
6. Use DISTINCT unless duplicates are required.
7. Use OPTIONAL MATCH if information may be missing.
8. Do not use APOC or procedures.

FAILURE CONDITIONS:
- If required entities, labels, or relationships are missing from the schema,
  output:
  // CANNOT_ANSWER

EXAMPLES:
Question:
Which planet orbits the Sun?

Cypher:
MATCH (planet:Planet)-[:ORBITS]->(star:Star {{id: "Sun"}})
RETURN DISTINCT planet.id

Question:
Which moon orbits planet Mars?

Cypher:
MATCH (moon:Moon)-[:ORBITS]->(planet:Planet {{id: "Mars"}})
RETURN DISTINCT moon.id

Question:
What substances compose the atmosphere of Mars?

Cypher:
MATCH (planet:Planet {{id: "Mars"}})
      -[:HAS_ATMOSPHERE]->(atm:Atmosphere)
      -[:COMPOSED_OF]->(substance:Substance)
RETURN DISTINCT substance.id

Question:
Does Jupiter have a magnetic field?

Cypher:
MATCH (planet:Planet {{id: "Jupiter"}})
      -[:HAS_MAGNETIC_FIELD]->(prop:MagneticFieldStrength)
RETURN DISTINCT prop.id
"""),
    ("human", "{input}")
])
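
The doubled braces in the prompt (`{{id}}`, `{{id: "Mars"}}`) are escapes: single braces mark input variables such as `{input}`, so literal braces have to be doubled. Assuming the default f-string template format, the escaping follows the same rule as Python's `str.format`, which the standard-library sketch below shows on a shortened, illustrative template (not the full prompt).

```python
# A shortened stand-in for the system and human messages above.
template = 'MATCH (planet:Planet {{id: "Mars"}}) ... Question: {input}'

# Doubled braces collapse to single literal braces; {input} is substituted.
rendered = template.format(input="Which moon orbits planet Mars?")
print(rendered)
```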


In [8]:
prompt_schema = LLMGraphTransformer(
    llm=chat_model,
    prompt=graph_prompt,
)

In [9]:
graph_prompt_schema = prompt_schema.convert_to_graph_documents(documents)
print(graph_prompt_schema)

[GraphDocument(nodes=[Node(id='Sun', type='Star', properties={}), Node(id='Mercury', type='Planet', properties={}), Node(id='Venus', type='Planet', properties={}), Node(id='Earth', type='Planet', properties={}), Node(id='Moon', type='Moon', properties={}), Node(id='Mars', type='Planet', properties={}), Node(id='Phobos', type='Moon', properties={}), Node(id='Deimos', type='Moon', properties={}), Node(id='Jupiter', type='Planet', properties={}), Node(id='Io', type='Moon', properties={}), Node(id='Europa', type='Moon', properties={}), Node(id='Ganymede', type='Moon', properties={}), Node(id='Callisto', type='Moon', properties={})], relationships=[Relationship(source=Node(id='Mercury', type='Planet', properties={}), target=Node(id='Sun', type='Star', properties={}), type='ORBITS', properties={}), Relationship(source=Node(id='Venus', type='Planet', properties={}), target=Node(id='Sun', type='Star', properties={}), type='ORBITS', properties={}), Relationship(source=Node(id='Earth', type='Pla

In [10]:
graph.add_graph_documents(graph_prompt_schema)

<img src="figs/graph_prompt.png" width=400px height=400px />

In [11]:
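# Optional maintenance queries (uncomment to use):
# Delete all nodes and relationships (destructive):
# graph.query("MATCH (n) DETACH DELETE n;")
# List all nodes:
# graph.query("MATCH (n) RETURN n;")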

In [12]:
cypher_prompt = ChatPromptTemplate.from_messages([
    ("system", """
You are an expert Neo4j Cypher query generator.

TASK:
- Translate the user's natural language question into a Cypher query.

CONSTRAINTS:
- Use ONLY the schema provided below.
- Do NOT invent labels, relationship types, or properties.
- Do NOT explain the query.
- Output ONLY valid Cypher.
- If the question cannot be answered unambiguously using the schema, output:
  // CANNOT_ANSWER

GRAPH SCHEMA:
Node labels:
- Star {{id}}
- Planet {{id}}
- Moon {{id}}
- Atmosphere {{id}}
- Substance {{id}}
- MagneticFieldStrength {{id}}

Relationships:
- (Planet)-[:ORBITS]->(Star)
- (Moon)-[:ORBITS]->(Planet)
- (Planet)-[:HAS_ATMOSPHERE]->(Atmosphere)
- (Atmosphere)-[:COMPOSED_OF]->(Substance)
- (Planet)-[:HAS_MAGNETIC_FIELD]->(MagneticFieldStrength)
     
ALLOWED VALUES:
- MagneticFieldStrength.id \\in {{"none", "weak", "strong"}}

QUERY RULES:
1. Always specify node labels.
2. Always specify relationship directions.
3. MagneticFieldStrength nodes MUST be matched or merged by id.
4. Use meaningful variable names.
5. Return only properties, not full nodes.
6. Use DISTINCT unless duplicates are required.
7. Use OPTIONAL MATCH if information may be missing.
8. Do not use APOC or procedures.

FAILURE CONDITIONS:
- If required entities, labels, or relationships are missing from the schema,
  output:
  // CANNOT_ANSWER

"""),
    ("human", "{question}")
])
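
Because the prompt instructs the model to output `// CANNOT_ANSWER` when a question falls outside the schema, it can be useful to detect that sentinel before running anything against the database. The helper below is a hypothetical sketch (the name `is_cannot_answer` is not part of LangChain), and it assumes the model follows the prompt and emits the sentinel on a line of its own.

```python
SENTINEL = "// CANNOT_ANSWER"


def is_cannot_answer(generated_cypher: str) -> bool:
    """Return True if any line of the model output is the refusal sentinel."""
    return any(line.strip() == SENTINEL for line in generated_cypher.splitlines())


print(is_cannot_answer("  // CANNOT_ANSWER\n"))
print(is_cannot_answer('MATCH (m:Moon)-[:ORBITS]->(p:Planet {id: "Mars"}) RETURN DISTINCT m.id'))
```

The first call reports a refusal and the second does not, since only an exact sentinel line counts as a match.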

In [13]:
qa_prompt = ChatPromptTemplate.from_messages([
    ("system", """
You are an expert assistant answering questions using results retrieved from a Neo4j graph.

RULES:
- The provided context comes from a trusted database query.
- If the context contains a numeric value that answers the question, use it directly.
- Do NOT say "I don't know" if the answer is present in the context.
- Answer concisely and directly in natural language.
- If the context is empty or missing required information, say:
  "I don't know the answer."
"""),
    ("human", """
Question:
{question}

Context:
{context}
""")
])

## What does `GraphCypherQAChain.from_llm` do (short version)

More information on how `GraphCypherQAChain.from_llm` works is given in this file: [supplementary/GraphCypherQAChain.md](./supplementary/GraphCypherQAChain.md).

In [14]:
from langchain_neo4j import GraphCypherQAChain

# The process occurs in two steps:
# 1) The LLM generates a Cypher query based on the user's question and the graph schema.
# 2) The Cypher query is run against the graph, and the returned context is turned into a text answer.
# cypher_prompt=cypher_prompt configures the first step.
# qa_prompt=qa_prompt configures the second step.

graphchain = GraphCypherQAChain.from_llm(
    chat_model,
    graph=graph,
    cypher_prompt=cypher_prompt,
    qa_prompt=qa_prompt,
    verbose=True,
    return_intermediate_steps=True,
    allow_dangerous_requests=True
)
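
The two steps described in the comments above can be pictured with plain Python stand-ins. This is a toy illustration of the data flow only, not the implementation of `GraphCypherQAChain`: `fake_generate_cypher`, `fake_run_query`, `fake_answer` and `run_pipeline` are hypothetical functions, and the LLM and the database are replaced by hard-coded values.

```python
def fake_generate_cypher(question: str) -> str:
    # Step 1 stand-in: a real chain would ask the LLM, guided by cypher_prompt.
    return 'MATCH (p:Planet {id: "Mars"})<-[:ORBITS]-(m:Moon) RETURN count(m) AS moon_count'


def fake_run_query(cypher: str) -> list:
    # Stand-in for executing the query against the graph (hard-coded rows).
    return [{"moon_count": 2}]


def fake_answer(question: str, context: list) -> str:
    # Step 2 stand-in: a real chain would ask the LLM, guided by qa_prompt.
    if not context:
        return "I don't know the answer."
    return f"Mars has {context[0]['moon_count']} moons."


def run_pipeline(question: str) -> dict:
    cypher = fake_generate_cypher(question)
    context = fake_run_query(cypher)
    return {"query": question, "cypher": cypher, "result": fake_answer(question, context)}


print(run_pipeline("How many moons does Mars have?")["result"])
```

Keeping query generation, execution, and answer phrasing as separate functions mirrors why the chain accepts two distinct prompts: each step can be tuned or inspected on its own.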

In [15]:
results = graphchain.invoke({"query": query_text})
print(results)



> Entering new GraphCypherQAChain chain...


Generated Cypher:
MATCH (p:Planet {id: "Mars"})<-[:ORBITS]-(m:Moon)
RETURN count(m) AS moon_count
Full Context:
[{'moon_count': 2}]

> Finished chain.
{'query': 'How many moons does Mars have?', 'result': 'Mars has 2 moons.', 'intermediate_steps': [{'query': 'MATCH (p:Planet {id: "Mars"})<-[:ORBITS]-(m:Moon)\nRETURN count(m) AS moon_count'}, {'context': [{'moon_count': 2}]}]}


In [16]:
print('QUERY \n', results['query'])
print('INTERMEDIATE STEPS: \n', results['intermediate_steps'])
print('RESULT: \n', results['result'])

QUERY 
 How many moons does Mars have?
INTERMEDIATE STEPS: 
 [{'query': 'MATCH (p:Planet {id: "Mars"})<-[:ORBITS]-(m:Moon)\nRETURN count(m) AS moon_count'}, {'context': [{'moon_count': 2}]}]
RESULT: 
 Mars has 2 moons.
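
Since `intermediate_steps` is a list of dictionaries, the generated Cypher and the raw database context can be pulled out by key. The sketch below uses a literal copy of the output printed above in place of a live `results` object, so it runs without a model or a database; the names `sample_results` and `steps` are introduced here for illustration.

```python
sample_results = {
    "query": "How many moons does Mars have?",
    "result": "Mars has 2 moons.",
    "intermediate_steps": [
        {"query": 'MATCH (p:Planet {id: "Mars"})<-[:ORBITS]-(m:Moon)\nRETURN count(m) AS moon_count'},
        {"context": [{"moon_count": 2}]},
    ],
}

# Merge the step dictionaries so each piece can be looked up by name.
steps = {}
for step in sample_results["intermediate_steps"]:
    steps.update(step)

generated_cypher = steps["query"]
context = steps["context"]

print(generated_cypher)
print(context[0]["moon_count"])
```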
