A standalone openCypher front-end in pure Rust: lexer, parser, AST,
semantic analyzer, and logical plan generator. No storage. No executor.
No opinion about how nodes and edges are laid out on disk. Drop it into
any Rust graph-DB-shaped project that needs Cypher input without taking
on libcypher-parser as a C dependency.
Every embedded graph DB ends up reimplementing the same Cypher front-end. The parser and the planner are separable from storage and execution, and there's no good reason each graph store should ship its own. The existing implementations sit behind the parser: Neo4j is closed-source Java, libcypher-parser is C and pulls in bindgen / pkg-config plumbing on every build, Memgraph and RedisGraph keep their parsers inside the database, and the rest are hand-rolled per project.
What's missing is a pure-Rust, library-grade, MIT-licensed Cypher front-end with a clean separation between parsing, semantic analysis, logical planning, and execution. cypher-rs is that piece. It stops where storage starts.
| Stage | What | Status |
|---|---|---|
| Lexer | tokens for openCypher 9 grammar | partial (v0.2) |
| Parser | concrete syntax tree | partial (v0.2: MATCH/OPTIONAL MATCH/WHERE/RETURN/ORDER BY/LIMIT/SKIP, list literals, IN) |
| AST lowering | symbol table, variable binding | partial (v0.3) |
| Semantic analysis | scope / label / rel-type checks | partial (v0.3 - type checks deferred) |
| Logical plan | algebra: scan · expand · filter · project · agg | planned |
| Plan rewriter | predicate pushdown, projection pruning | planned |
| Cost model | pluggable trait; default = cardinality-only | planned |
| Diagnostics | miette-powered span errors |
planned |
Not in scope: physical plan, storage adapter, executor, network protocol, server.
use cypher_rs::{parse, lower, plan, Schema};
let q = "
MATCH (u:User)-[:FOLLOWS]->(f:User)
WHERE u.id = $uid
RETURN f.name AS name, f.created_at AS joined
ORDER BY joined DESC
LIMIT 10
";
let cst = parse(q)?; // concrete syntax tree
let ast = lower(&cst)?; // typed AST with bindings
let lp = plan(&ast, &Schema::infer())?; // logical plan
println!("{lp:#}");Output (simplified):
Limit { count: 10 }
└── Sort { keys: [joined DESC] }
└── Project { exprs: [f.name AS name, f.created_at AS joined] }
└── Filter { pred: u.id = $uid }
└── Expand { src: u, rel: :FOLLOWS, dir: out, dst: f, label: User }
└── Scan { var: u, label: User }
Plug your executor under that and you have a Cypher engine.
cypher-rs ships a Model Context Protocol server so any MCP-compatible agent (Claude Code, Cursor, Windsurf, Continue, …) can call the front-end as tools. Useful for "is this query valid? what's its plan? what does the optimizer change? what does it cost?" without hand-rolling a CLI.
The server is gated behind the mcp Cargo feature so the library's dep tree stays lean by default. Build the binary once:
cargo build --release --features mcp --bin cypher-mcpRegister it with Claude Code:
claude mcp add --scope user cypher-rs "$(pwd)/target/release/cypher-mcp"
claude mcp list # cypher-rs: ... - ✓ ConnectedFor Cursor / Windsurf / any other MCP client, drop this into their mcpServers config:
{
"mcpServers": {
"cypher-rs": {
"command": "/abs/path/to/cypher-rs/target/release/cypher-mcp"
}
}
}The server is stateless: every call takes a query string. No session state, no warm-up, sub-millisecond startup.
| Tool | What it does |
|---|---|
cypher_parse |
Parse a query, return a debug-print of the AST + clause count. |
cypher_validate |
Quick yes/no: does the query parse and pass semantic analysis? |
cypher_analyze |
Full semantic-analysis report: bindings introduced by MATCH / OPTIONAL MATCH plus every issue (severity / code / message). |
cypher_plan |
Build the logical plan and return its tree-pretty-printed form. Pass optimize: true to apply the rewriter first. |
cypher_optimize |
Plan before / after the optimizer runs to fixpoint, plus a changed flag. Useful for inspecting which rewrites fire. |
cypher_explain |
Headline tool. Run the full pipeline (parse → analyze → plan → optimize → cost → columns) in one call. |
cypher_cost |
Cost estimate using CardinalityCostModel. Unitless — compare plans, do not compare across models. |
cypher_columns |
Output columns + required input columns for the optimized plan. |
In Claude Code, after registering, talk to Claude in English:
You: Use cypher-rs to validate this query and explain the plan it produces:
MATCH (u:User)-[:FOLLOWS]->(f:User) WHERE u.id = $uid RETURN f.name AS name, f.created_at AS joined ORDER BY joined DESC LIMIT 10(Claude calls
cypher_explainand reads back the plan tree, the optimizer's rewrite, the cardinality estimate, and the column set.)You: Why does the optimizer not push the predicate below the Sort?
(Claude calls
cypher_optimizeto show the before/after, then reasons over it.)
You can also force a tool ("use cypher_validate…") but normally describing the question in English is faster.
- No
libcypher-parserdependency. Pure Rust; builds anywhere Rust builds. Nobindgen, nopkg-config, no system C library. - No executor coupling. Plug into Sled, RocksDB, FFS, in-memory, remote; the crate doesn't care.
- Reusable across deployments. Embedded graph DB, server-side graph DB, OLAP graph engine; same front-end.
- Inspectable plans. The logical plan is data, not code. Print it, serialize it, optimize it, send it across a wire.
- Parser:
pestfor v0 (PEG, easy to read, easy to evolve). May switch tolalrpopif benchmarks demand. - AST: enum-heavy. No
Box<dyn Trait>everywhere. If allocation patterns get hot, an arena layer is straightforward. - Errors:
miettefor diagnostics with source spans. Cypher errors feel like Rust errors, with carets and context. - No
async: front-end is pure CPU work. Async belongs at the storage layer, not here. - Generic over schemas: a
Schematrait lets you provide whatever metadata you have (or nothing). Validation is opt-in.
Logical plans are built from a small algebra:
| Op | Inputs | Output |
|---|---|---|
Scan { var, label? } |
- | rows binding var |
Expand { src, rel, dir, dst, label? } |
rows | rows extended with dst |
Filter { pred } |
rows | rows where pred holds |
Project { exprs } |
rows | rows with new columns |
Aggregate { group_by, aggs } |
rows | grouped rows |
Sort { keys } |
rows | sorted rows |
Limit { offset, count } |
rows | bounded rows |
Union { lhs, rhs } |
rows × rows | concatenated rows |
Join { lhs, rhs, kind } |
rows × rows | joined rows |
OptionalExpand |
rows | rows with optional dst |
Create / Merge / SetProperty / Delete |
rows | mutations |
Every plan is a tree of these. Optimization rules are tree rewrites.
The bar is the openCypher Technology Compatibility Kit. v0 targets the parser-only subset. v1 targets ≥95% on the full TCK.
cypher-rs |
libcypher-parser |
hand-rolled | Neo4j embedded | |
|---|---|---|---|---|
| Language | Rust | C | varies | Java |
| Pure | ✓ | ✗ (system dep) | ✓ | ✗ (JVM) |
| Logical plan | ✓ | ✗ (parser only) | ad-hoc | ✓ |
| Cost model | ✓ (pluggable) | ✗ | ad-hoc | ✓ (fixed) |
| License | MIT | Apache 2.0 | varies | GPLv3/commercial |
| TCK conformance | targeted | partial | varies | full |
cypher-rs-sled- adapter for the Sled embedded KV storecypher-rs-rocksdb- adapter for RocksDBcypher-rs-ffs- adapter for the FFS embedded graph DB (closed)cypher-rs-arrow- vectorized executor over Apache Arrow
Errors carry source spans. A typo:
error: unknown property `naem` on label `User`
╭─[query.cypher:3:14]
3 │ RETURN u.naem
· ─┬─
· ╰── did you mean `name`?
╰────
- Physical execution. That's the storage adapter's job.
- Storage layer. Not here.
- Neo4j-specific extensions. APOC, GDS, Bolt, internal catalogs - out of scope. The goal is portable openCypher, not Cypher-as-Neo4j.
- Distributed planning. A future crate, not this one.
Q: Is cypher-rs faster than libcypher-parser?
A: TBD. The goal is parity for v1, then optimization. Pure Rust gives
us LTO and inlining the C version doesn't have.
Q: What about GQL (the new ISO standard)? A: openCypher is a strict subset of GQL. The plan is openCypher → GQL delta tracking, with a feature flag.
Q: Will this work in WebAssembly?
A: Yes - pure Rust, no unsafe, no system deps. A cypher-rs-wasm
companion crate is on the roadmap.
Q: Why pest and not lalrpop / chumsky / nom? A: Pest's PEG grammar tracks the openCypher EBNF closely; refactoring the grammar is a textual operation. We may revisit if profiling shows pest is the bottleneck.
Q: Does this support CALL procedures?
A: Built-in CALL is in scope; user-defined procedures (which are
backend-specific) are not.
- v0.0 - scaffold, design, scope
- v0.1 - lexer + parser; MATCH / WHERE / RETURN / LIMIT / SKIP, expressions
- v0.2 - OPTIONAL MATCH · ORDER BY (multi-key, ASC/DESC) · list literals ·
IN. Atomickw_*rules with word-boundary checks for every keyword. - v0.3 - semantic analyzer (variable binding · scope check · optional schema-aware label / rel-type validation via
Schematrait) - v0.4 - logical plan + algebra (Empty · Scan · Expand · Filter · Project · Sort · Skip · Limit) with indented tree pretty-print
- v0.5 - multi-pattern Cartesian · multiple MATCH clauses · OPTIONAL MATCH (
Optionalouter-apply) - v0.6 - predicate pushdown optimizer (
optimize(plan)to fixpoint; pushes through Project / Sort / Cartesian; respects Limit / Skip / Optional) - v0.7 - cost-model trait (
CostModel) +CardinalityCostModeldefault +estimate_cost(plan, model) -> f64 - v0.8 - map literals (
{key: value}) as expressions and as node/rel pattern properties; planner desugars pattern-property maps into Filter operators - v0.9 - anonymous rel-binding synthesis (patterns like
(:User {id: 1})and[:KNOWS {since: 2020}]now lower to a Filter against an internal__node_N/__rel_Nbinding instead of dropping the predicate). Projection pruning (column-set tracking) deferred to v0.10. - v0.10 - projection pruning analysis (
output_columns(plan)+required_input_columns(plan, outer_demand)for column-set tracking; pure analysis, no plan-tree changes; executors use it to materialize only referenced bindings).cypher-rs-sledintegration crate deferred to v0.11. - v1.0 - openCypher TCK ≥ 95%; used in FFS
cypher · opencypher · graph-database · query-language ·
parser · rust · compiler · query-planner · embedded-database ·
pest · graph-algorithms
MIT - see LICENSE.