GraphCypherQAChain cannot generate correct Cypher commands #22385

Closed
mj-1023 opened this issue Jun 1, 2024 · 2 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: neo4j Primarily related to Neo4j integrations

mj-1023 commented Jun 1, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain
from langchain_openai import ChatOpenAI

uri = "bolt://localhost:7687"
username = "xxxx"
password = "xxxxx"

graph = Neo4jGraph(url=uri, username=username, password=password)

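llm = ChatOpenAI(model="gpt-4-0125-preview", temperature=0)

chain = GraphCypherQAChain.from_llm(graph=graph, llm=llm, verbose=True, validate_cypher=True)

# Call that triggers the error (taken from the traceback below)
response = chain.invoke({"query": "What is the MD5 of test.py?"})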

Error Message and Stack Trace (if applicable)

Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (t:Tools {name: "test.py"})-[:has MD5 hash]->(h:Hash) RETURN h.name
Traceback (most recent call last):
File "\lib\site-packages\langchain_community\graphs\neo4j_graph.py", line 391, in query
data = session.run(Query(text=query, timeout=self.timeout), params)
File "\lib\site-packages\neo4j_sync\work\session.py", line 313, in run
self._auto_result._run(
File "\lib\site-packages\neo4j_sync\work\result.py", line 181, in _run
self._attach()
File "\lib\site-packages\neo4j_sync\work\result.py", line 301, in _attach
self._connection.fetch_message()
File "\lib\site-packages\neo4j_sync\io_common.py", line 178, in inner
func(*args, **kwargs)
File "\lib\site-packages\neo4j_sync\io_bolt.py", line 850, in fetch_message
res = self._process_message(tag, fields)
File "\lib\site-packages\neo4j_sync\io_bolt5.py", line 369, in _process_message
response.on_failure(summary_metadata or {})
File "\lib\site-packages\neo4j_sync\io_common.py", line 245, in on_failure
raise Neo4jError.hydrate(**metadata)
neo4j.exceptions.CypherSyntaxError: {code: Neo.ClientError.Statement.SyntaxError} {message: Invalid input 'MD5': expected
"*"
"WHERE"
"]"
"{"
a parameter (line 1, column 50 (offset: 49))
"MATCH (t:Tools {name: "test.py"})-[:has MD5 hash]->(h:Hash) RETURN h.name"
^}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "\graph_RAG.py", line 29, in <module>
response = chain.invoke({"query": "What is the MD5 of test.py?"})
File "lib\site-packages\langchain\chains\base.py", line 166, in invoke
raise e
File "\lib\site-packages\langchain\chains\base.py", line 156, in invoke
self._call(inputs, run_manager=run_manager)
File "\lib\site-packages\langchain_community\chains\graph_qa\cypher.py", line 274, in _call
context = self.graph.query(generated_cypher)[: self.top_k]
File "\lib\site-packages\langchain_community\graphs\neo4j_graph.py", line 397, in query
raise ValueError(f"Generated Cypher Statement is not valid\n{e}")
ValueError: Generated Cypher Statement is not valid
{code: Neo.ClientError.Statement.SyntaxError} {message: Invalid input 'MD5': expected
"*"
"WHERE"
"]"
"{"
a parameter (line 1, column 50 (offset: 49))
"MATCH (t:Tools {name: "test.py"})-[:has MD5 hash]->(h:Hash) RETURN h.name"
^}

Description

I followed the tutorial at https://python.langchain.com/v0.2/docs/tutorials/graph/. The relationship types in my Neo4j database contain spaces (for example, "has MD5 hash"), and GraphCypherQAChain generates Cypher that leaves them unquoted, which produces the syntax error above.
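
The failing query writes the relationship type as `[:has MD5 hash]`. Cypher only accepts names containing spaces when they are wrapped in backticks, which is why the parser stops at `MD5`. A minimal sketch of that quoting rule follows; `quote_identifier` is a hypothetical helper for illustration, not part of LangChain or the Neo4j driver.

```python
def quote_identifier(name: str) -> str:
    """Wrap a label or relationship type in backticks for use in Cypher.

    A backtick inside the name is escaped by doubling it.
    (Hypothetical helper, written for this example only.)
    """
    return "`" + name.replace("`", "``") + "`"


# Rebuild the query from the traceback with every name quoted.
# The file name is passed as a parameter ($name) rather than inlined.
query = (
    f"MATCH (t:{quote_identifier('Tools')} {{name: $name}})"
    f"-[:{quote_identifier('has MD5 hash')}]->"
    f"(h:{quote_identifier('Hash')}) RETURN h.name"
)
print(query)
```

The printed string can be checked by hand in Neo4j Browser. It shows the form the LLM would need to emit for this schema.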

System Info

System Information

OS: Windows
OS Version: 10.0.19045
Python Version: 3.10.10 | packaged by Anaconda, Inc. | (main, Mar 21 2023, 18:39:17) [MSC v.1916 64 bit (AMD64)]

Package Information

langchain_core: 0.2.3
langchain: 0.2.1
langchain_community: 0.2.1
langsmith: 0.1.67
langchain_openai: 0.1.8
langchain_text_splitters: 0.2.0

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

@dosubot dosubot bot added 🔌: neo4j Primarily related to Neo4j integrations 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Jun 1, 2024
@RafaelXokito
Contributor

The case you're describing seems to be an edge case that occurs when your schema contains spaces in node labels or relationship types. This can be solved using the cypher_llm_kwargs argument in the from_llm class method.

The default prompt is deliberately generic and concise so that it works in most scenarios, which means it does not cover cases like this one. Try the following prompt to handle this specific edge case:

cypher_prompt_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.

Examples:

Example 1:
Schema: {Person: {name, age}, City: {name}, LIVES IN: {since}}
Question: Find all persons living in New York.
Output: MATCH (p:`Person`)-[:`LIVES IN`]->(c:`City` {name: 'New York'}) RETURN p

Example 2:
Schema: {Employee: {name, role}, Department: {name}, WORKS_IN: {since}}
Question: Retrieve the names of all employees working in the Sales department.
Output: MATCH (e:`Employee`)-[:`WORKS_IN`]->(d:`Department` {name: 'Sales'}) RETURN e.name

Example 3:
Schema: {Movie: {title, releaseYear}, Director: {name}, DIRECTED: {since}}
Question: List all movies directed by Christopher Nolan.
Output: MATCH (d:`Director` {name: 'Christopher Nolan'})-[:`DIRECTED`]->(m:`Movie`) RETURN m.title

The question is:
{question}"""

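from langchain_core.prompts import PromptTemplate

# The template has two placeholders: {schema} and {question}
cypher_prompt = PromptTemplate(template=cypher_prompt_template, input_variables=["schema", "question"])
chain = GraphCypherQAChain.from_llm(graph=graph, llm=llm, verbose=True,
                    validate_cypher=True, cypher_llm_kwargs={"prompt": cypher_prompt})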

Try this approach and see if it resolves your issue.

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Oct 2, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 9, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Oct 9, 2024
@LIUKAI0815

@RafaelXokito ValueError: Missing some input keys: {'name', 'Employee', 'Movie', 'Person'}
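
A likely cause of this error is that the few-shot examples in the prompt above contain literal braces such as `{Person: {name, age}}` and `{name: 'New York'}`. PromptTemplate's default f-string format follows Python's `str.format` rules, so those braces are read as placeholders. Doubling them (`{{` and `}}`) marks them as plain text. The sketch below shows the escaping with the standard library only, on a shortened version of the prompt. `TEMPLATE`, `placeholders`, and `render` are names made up for this example.

```python
from string import Formatter

# Shortened prompt: only {schema} and {question} are real placeholders.
# Every literal brace in the few-shot example is doubled.
TEMPLATE = (
    "Task: Generate Cypher statement to query a graph database.\n"
    "Schema:\n{schema}\n\n"
    "Example 1:\n"
    "Schema: {{Person: {{name, age}}, City: {{name}}, LIVES IN: {{since}}}}\n"
    "Question: Find all persons living in New York.\n"
    "Output: MATCH (p:`Person`)-[:`LIVES IN`]->(c:`City` {{name: 'New York'}}) RETURN p\n\n"
    "The question is:\n{question}"
)


def placeholders(template: str) -> set:
    """Return the field names that str.format would try to fill."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def render(schema: str, question: str) -> str:
    """Fill the two real placeholders; doubled braces come out as single braces."""
    return TEMPLATE.format(schema=schema, question=question)


print(sorted(placeholders(TEMPLATE)))
print(render("(schema text from the graph)", "What is the MD5 of test.py?"))
```

Under the same assumption, applying this escaping to the full prompt and passing it as `PromptTemplate(template=..., input_variables=["schema", "question"])` should leave only those two input keys.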
