## Generate Cypher

* Local implementation following this tutorial: https://graphacademy.neo4j.com/courses/llm-fundamentals/4-cypher-generation/1-cypher-qa-chain/

### Requirements

In [1]:
!pip install langchain openai langchain-openai neo4j python-dotenv langchainhub langchain-community graphdatascience watermark --quiet

In [2]:
%load_ext watermark
%watermark -p langchain,langchainhub,langchain_community

langchain          : 0.1.5
langchainhub       : 0.1.14
langchain_community: 0.0.17



### Imports

In [3]:
import os
from graphdatascience import GraphDataScience
from dotenv import load_dotenv, find_dotenv
from pathlib import Path
import neo4j

### Settings

In [4]:
project_path = Path(os.getcwd()).parent
data_path = project_path / "data"
model_path = project_path / "models"
output_path = project_path / "output"

# load env settings
_ = load_dotenv(find_dotenv())

llm_model = "gpt-4"
database = "recommendations-50"

openai_api_key = os.getenv('OPENAI_API_KEY')
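`os.getenv` returns `None` for an unset variable, so a missing entry in the `.env` file only shows up later as a connection or authentication error. A minimal sketch of an early check follows; the helper name `missing_env_vars` is hypothetical, and the variable names are the ones this notebook reads.

```python
import os

# Environment variables read by this notebook.
REQUIRED_VARS = ("NEO4J_URL", "NEO4J_USER", "NEO4J_PASS", "OPENAI_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this right after `load_dotenv` would report which settings still need to be added before the Neo4j connection is attempted.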

### Connect to Neo4j

In [5]:
from langchain_community.graphs import Neo4jGraph

graph = Neo4jGraph(
    url=os.getenv('NEO4J_URL'),
    username=os.getenv('NEO4J_USER'),
    password=os.getenv('NEO4J_PASS'),
    database=database
)
result = graph.query("""
MATCH (m:Movie{title: 'Toy Story'}) 
RETURN m.title, m.plot, m.poster
""")
print(result)

[{'m.title': 'Toy Story', 'm.plot': "A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy's room.", 'm.poster': 'https://image.tmdb.org/t/p/w440_and_h660_face/uXDfjJbdP4ijW5hWSBrPrlKpxab.jpg'}]


### Generate Cypher

In [52]:
from langchain_openai import ChatOpenAI
from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain
from langchain.prompts import PromptTemplate

# Note: llm_model from the settings cell is not passed here, so ChatOpenAI uses its default model
llm = ChatOpenAI(openai_api_key=openai_api_key)

graph = Neo4jGraph(
    url=os.getenv('NEO4J_URL'),
    username=os.getenv('NEO4J_USER'),
    password=os.getenv('NEO4J_PASS'),
    database=database
)

CYPHER_GENERATION_TEMPLATE = """
You are an expert Neo4j Developer translating user questions into Cypher to answer questions about movies and provide recommendations.
Convert the user's question based on the schema. 

For shortestPath queries, return the name attribute for Person or Genre labels.

Schema: {schema}
Question: {question}
"""

cypher_generation_prompt = PromptTemplate(
    template=CYPHER_GENERATION_TEMPLATE,
    input_variables=["schema", "question"],
)

cypher_chain = GraphCypherQAChain.from_llm(
    llm,
    graph=graph,
    cypher_prompt=cypher_generation_prompt,
    verbose=True
)

cypher_chain.invoke({"query": "What role did Tom Hanks play in Toy Story?"})

> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (p:Person {name: "Tom Hanks"})-[:ACTED_IN]->(m:Movie {title: "Toy Story"})
RETURN p.name, m.title, m.year, m.poster, m.plot, m.runtime, m.revenue, m.imdbRating, m.imdbVotes, m.budget, m.countries, m.languages, m.released, m.imdbId, m.tmdbId
Full Context:
[{'p.name': 'Tom Hanks', 'm.title': 'Toy Story', 'm.year': 1995, 'm.poster': 'https://image.tmdb.org/t/p/w440_and_h660_face/uXDfjJbdP4ijW5hWSBrPrlKpxab.jpg', 'm.plot': "A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy's room.", 'm.runtime': 81, 'm.revenue': 373554033, 'm.imdbRating': 8.3, 'm.imdbVotes': 591836, 'm.budget': 30000000, 'm.countries': ['USA'], 'm.languages': ['English'], 'm.released': '1995-11-22', 'm.imdbId': '0114709', 'm.tmdbId': '862'}]

> Finished chain.


{'query': 'What role did Tom Hanks play in Toy Story?',
 'result': 'Tom Hanks played the role of Woody, a cowboy doll, in Toy Story.'}

In [15]:
cypher_chain.invoke({"query": "What movies did Meg Ryan act in?"})

> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (m:Movie)-[:ACTED_IN]->(a:Actor {name: "Meg Ryan"})
RETURN m.title
Full Context:
[]

> Finished chain.


{'query': 'What movies did Meg Ryan act in?',
 'result': "I'm sorry, I don't have access to that information."}
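The empty context here is consistent with the generated pattern pointing `ACTED_IN` from the movie to the actor, the reverse of the direction in the Tom Hanks query that did return a row. A minimal sketch of a sanity check for that kind of mistake follows. The helper `suspicious_directions` and the `EXPECTED_TARGET` table are hypothetical, the expected directions are an assumption taken from the queries in this notebook that returned results, and the regular expression only handles simple single-hop patterns.

```python
import re

# Assumption, based on the queries in this notebook that returned results:
# ACTED_IN and DIRECTED point from a person node to a Movie node.
EXPECTED_TARGET = {"ACTED_IN": "Movie", "DIRECTED": "Movie"}

# Matches simple patterns of the form (a:Label ...)-[:TYPE]->(b:Label ...)
PATTERN = re.compile(
    r"\(\s*\w*\s*:\s*(\w+)[^)]*\)\s*"      # start node and its label
    r"-\[\s*\w*\s*:\s*(\w+)[^\]]*\]->\s*"  # directed relationship and its type
    r"\(\s*\w*\s*:\s*(\w+)"                # end node and its label
)


def suspicious_directions(cypher):
    """Return (start, rel_type, end) triples whose end label is unexpected."""
    return [
        (start, rel, end)
        for start, rel, end in PATTERN.findall(cypher)
        if rel in EXPECTED_TARGET and end != EXPECTED_TARGET[rel]
    ]


generated = 'MATCH (m:Movie)-[:ACTED_IN]->(a:Actor {name: "Meg Ryan"})\nRETURN m.title'
print(suspicious_directions(generated))
```

A check like this could run on the generated statement before it is sent to the database. It is a rough filter, not a Cypher parser.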

In [35]:
cypher_chain.invoke({"query": "How many movies has Tom Hanks directed?"})

> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (p:Person {name: "Tom Hanks"})-[:DIRECTED]->(m:Movie)
RETURN COUNT(m) AS numberOfMoviesDirectedByTomHanks
Full Context:
[{'numberOfMoviesDirectedByTomHanks': 2}]

> Finished chain.


{'query': 'How many movies has Tom Hanks directed?',
 'result': 'Tom Hanks has directed 2 movies.'}

In [53]:
cypher_chain.invoke({"query": "What is the shortest path between Tom Hanks and Mel Gibson"})

> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (p1:Person {name: "Tom Hanks"}), (p2:Person {name: "Mel Gibson"})
MATCH path = shortestPath((p1)-[*]-(p2))
RETURN [n IN nodes(path) | n.name] AS shortestPath
Full Context:
[{'shortestPath': ['Tom Hanks', None, 'Adventure', None, 'Mel Gibson']}]

> Finished chain.


{'query': 'What is the shortest path between Tom Hanks and Mel Gibson',
 'result': 'The shortest path between Tom Hanks and Mel Gibson is through the genre of Adventure.'}

__observation__
* The LLM struggles to return the name for Person and Genre nodes and the title for Movie nodes in the same query. The generated query reads only n.name, so the Movie nodes on the path come back as None. Adding examples to the prompt did not fix this. ChatGPT in the browser does handle it, with a query like this:

<pre>
MATCH path = shortestPath((tom:Person {name: "Tom Hanks"})-[*]-(mel:Person {name: "Mel Gibson"}))
RETURN [node in nodes(path) | 
CASE 
    WHEN "Person" IN labels(node) OR "Genre" IN labels(node) THEN node.name 
    WHEN "Movie" IN labels(node) THEN node.title 
END
] AS shortestPath;
</pre>
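The `CASE` expression above picks a display property per node label. The same selection can be sketched in plain Python, for instance to post-process path nodes once their labels and properties have been fetched. The helper `display_value` and the node representation (a list of labels plus a property dict) are assumptions for illustration, and the example nodes are made up rather than taken from the graph.

```python
def display_value(labels, properties):
    """Pick name for Person/Genre nodes and title for Movie nodes."""
    if "Person" in labels or "Genre" in labels:
        return properties.get("name")
    if "Movie" in labels:
        return properties.get("title")
    return None  # mirrors a CASE with no matching branch


# Made-up nodes for illustration only; not data from the graph.
example_path = [
    (["Person"], {"name": "Tom Hanks"}),
    (["Movie"], {"title": "Example Movie"}),
    (["Genre"], {"name": "Adventure"}),
]
print([display_value(labels, props) for labels, props in example_path])
```

As in the Cypher version, the Person/Genre branch is checked first, so a node carrying both a Person and a Movie label would be shown by its name.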