A Python parser for the Cypher graph query language. Parses Cypher query strings into Abstract Syntax Trees (ASTs).
Built on ANTLR 4 with a patched openCypher grammar that supports modern Cypher features including subqueries, FOREACH, shortestPath, and more.
pip install pycypher

from pycypher import parse
result = parse('MATCH (n:Person)-[:KNOWS]->(m) WHERE m.age > 30 RETURN n.name, m.name;')
# Check for parse errors
print(result['errors']) # []
# Access the AST
for node in result['result']:
    print(node['node']['parent'], '->', node['node']['text'][:50])

parse(query_string) parses a Cypher query string and returns an AST dictionary.
Parameters:
- query_string (str): A Cypher query (e.g., "MATCH (n) RETURN n;")
Returns: A dict with two keys:
- result: List of AST nodes. Each node has:
  - node: Dict with parent (rule name), text (source text), sourceInterval (token range)
  - children: Nested dict with result (child nodes) and errors (child parse errors)
- errors: List of top-level parse error nodes
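Because children repeats the same result/errors shape, the tree can be traversed recursively. The following is a minimal sketch of that idea, not part of the package: iter_nodes and collect_errors are hypothetical helper names, and sample is a hand-built stand-in for parse() output. Its rule names, the tuple form of sourceInterval, and the error entry are assumptions made for illustration.

```python
from typing import Any, Dict, Iterator, List, Tuple

ParseResult = Dict[str, List[Any]]


def iter_nodes(tree: ParseResult, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, rule name, source text) for every node, depth-first."""
    for entry in tree.get("result", []):
        node = entry["node"]
        yield depth, node["parent"], node["text"]
        # "children" is documented as a nested dict with the same two keys.
        yield from iter_nodes(entry.get("children") or {}, depth + 1)


def collect_errors(tree: ParseResult) -> List[Any]:
    """Gather top-level errors plus errors nested under any node's children."""
    errors = list(tree.get("errors", []))
    for entry in tree.get("result", []):
        errors.extend(collect_errors(entry.get("children") or {}))
    return errors


# Hand-built stand-in for parse() output. Rule names, intervals, and the
# error entry are invented for this sketch, not real parser output.
sample = {
    "result": [
        {
            "node": {"parent": "query", "text": "MATCH (n) RETURN n", "sourceInterval": (0, 9)},
            "children": {
                "result": [
                    {
                        "node": {"parent": "match", "text": "MATCH (n)", "sourceInterval": (0, 5)},
                        "children": {"result": [], "errors": []},
                    },
                    {
                        "node": {"parent": "return", "text": "RETURN n", "sourceInterval": (7, 9)},
                        "children": {"result": [], "errors": ["bad token"]},
                    },
                ],
                "errors": [],
            },
        }
    ],
    "errors": [],
}

# Print an indented outline of the tree.
for depth, rule, text in iter_nodes(sample):
    print("  " * depth + f"{rule}: {text}")
```

With the real package, the dict returned by parse() would be passed in place of sample. Checking only the top-level errors list may miss errors recorded deeper in the tree, which is why collect_errors descends into children.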
The package also exposes its version string (e.g., "1.0.0").
Supported read-side clauses and modifiers: MATCH, OPTIONAL MATCH, WHERE, RETURN, WITH, UNWIND, UNION, ORDER BY, SKIP, LIMIT, DISTINCT
Supported write clauses: CREATE, MERGE (ON CREATE / ON MATCH), SET, DELETE, DETACH DELETE, REMOVE, FOREACH
- Arithmetic: +, -, *, /, %, ^
- Comparison: =, <>, <, >, <=, >=
- Boolean: AND, OR, XOR, NOT
- String: STARTS WITH, ENDS WITH, CONTAINS
- Null: IS NULL, IS NOT NULL
- Lists: IN, list literals, list comprehensions
- Maps: map literals, map projections
- CASE / WHEN / THEN / ELSE / END
- Parameters: $param
- Node patterns: (n), (n:Label), (n:Label {prop: value})
- Relationship patterns: -[r:TYPE]->, <-[:TYPE]-, -[:TYPE]-
- Variable-length paths: [*], [*2], [*1..3], [*..5]
- shortestPath() and allShortestPaths()
- Multiple labels: (n:Person:Employee)
- Multiple relationship types: -[:KNOWS|LIKES]->
- Aggregation: count(), collect(), sum(), avg(), min(), max()
- count(*), count(DISTINCT x)
- Namespaced functions: apoc.text.join()
- Predicate functions: ALL(), ANY(), NONE(), SINGLE()
- CALL { ... } inline subqueries
- EXISTS { ... } existence checks (pattern and full-query forms)
- CALL procedure() YIELD ... procedure calls
- Backtick-escaped identifiers: `My Label`
- Block comments: /* ... */
- Line comments: // ...
- Case-insensitive keywords
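One way to check which of the constructs listed above a given build accepts is to run a batch of queries through the parser and count the reported errors. The sketch below is an illustration only: count_parse_errors is a hypothetical helper, and fake_parse is a stand-in that mimics the documented return shape so the example runs without the package installed.

```python
from typing import Callable, Dict, Iterable


def count_parse_errors(
    parse_fn: Callable[[str], dict], queries: Iterable[str]
) -> Dict[str, int]:
    """Map each query to the number of top-level parse errors reported."""
    return {query: len(parse_fn(query)["errors"]) for query in queries}


def fake_parse(query: str) -> dict:
    # Stand-in for pycypher's parse (an assumption for this sketch):
    # it "accepts" only queries that start with MATCH.
    ok = query.lstrip().upper().startswith("MATCH")
    return {"result": [], "errors": [] if ok else ["syntax error"]}


report = count_parse_errors(
    fake_parse,
    ["MATCH (n) RETURN n;", "THIS IS NOT VALID CYPHER"],
)
for query, n_errors in report.items():
    print(f"{n_errors} error(s): {query}")
```

To run it against the real parser, pass the parse function imported from pycypher instead of fake_parse. The helper counts only the top-level errors list; errors nested under children are not included.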
from pycypher import parse
result = parse('MATCH (n:Person) RETURN n.name;')
assert result['errors'] == []

result = parse('''
MATCH (a:Person)-[:KNOWS]->(b:Person)-[:LIVES_IN]->(c:City)
WHERE b.age > 25
RETURN a.name, b.name, c.name
ORDER BY b.age DESC
LIMIT 10
''')

result = parse('''
MERGE (n:User {id: $userId})
ON CREATE SET n.created = timestamp()
ON MATCH SET n.lastSeen = timestamp()
RETURN n
''')

result = parse('''
MATCH (n:Person)
WHERE EXISTS {
MATCH (n)-[:KNOWS]->(m)
WHERE m.age > 30
}
CALL { MATCH (x) RETURN x LIMIT 1 }
RETURN n.name
''')

result = parse('THIS IS NOT VALID CYPHER')
if result['errors']:
    print(f"Parse errors found: {len(result['errors'])}")

The parser is built from ANTLR 4.13 grammar files located in grammar/:
- CypherLexer.g4: Token definitions (keywords, operators, literals)
- CypherParser.g4: Parser rules (statements, clauses, expressions, patterns)
Based on the antlr/grammars-v4 Cypher grammar (BSD license) by Boris Zhguchev, with patches for:
- FOREACH clause
- CALL {} subqueries (Cypher 5)
- EXISTS { full-query } subqueries
- shortestPath / allShortestPaths
- Numeric token precedence fix
If you modify the grammar, regenerate the Python parser files:
# Download ANTLR
curl -O https://www.antlr.org/download/antlr-4.13.2-complete.jar
# Generate Python files
java -jar antlr-4.13.2-complete.jar \
-Dlanguage=Python3 -visitor \
grammar/CypherLexer.g4 grammar/CypherParser.g4
# Copy generated files into the package
cp CypherLexer.py pycypher/lexer.py
cp CypherParser.py pycypher/parser.py
cp CypherParserVisitor.py pycypher/visitor.py
cp CypherParserListener.py pycypher/listener.py

git clone https://github.com/Mizzlr/pycypher
cd pycypher
python -m venv .venv && source .venv/bin/activate
pip install antlr4-python3-runtime pytest
python -m pytest tests/ -v

MIT License. See LICENSE for details.