
# Velr Cypher MATCH Cookbook (Movies demo)

This notebook is a **cookbook of MATCH patterns that Velr supports today**, based directly on the
end-to-end tests in the `velr-e2e` crate.

The idea is:

- Give you a **copy-pasteable set of Cypher patterns** to experiment with
- Show what **currently works in Velr** in terms of:
  - Node patterns & labels
  - Relationship patterns (typed, untyped, undirected, multi-type)
  - Property filters, string predicates, `IN`, `NULL`
  - Cartesian products & multi-`MATCH`
  - Paths & variable-length patterns
  - Aggregations, pagination, and functions like `id()`, `type()`, `length()`


In [None]:
%pip install velr --force-reinstall 
%pip install pandas polars --quiet

In [None]:
from velr.driver import Velr
import pandas as pd
import polars as pl

db = Velr.open(None)
print("Velr DB opened:", db)


### 0.1 Load Movies CSVs from disk

The Movies demo data is stored as four CSV files in `../data/`:

- `../data/movies_people.csv`
- `../data/movies_movies.csv`
- `../data/movies_directed.csv`
- `../data/movies_acted_in.csv`

We'll load them with **Polars** and bind them into Velr.

In [None]:
# Load Movies demo CSVs from disk.
# Adjust the paths below if your layout differs.
import polars as pl

people_csv    = pl.read_csv("../data/movies_people.csv")
movies_csv    = pl.read_csv("../data/movies_movies.csv")
directed_csv  = pl.read_csv("../data/movies_directed.csv")
acted_in_csv  = pl.read_csv("../data/movies_acted_in.csv")

people_csv.head(), movies_csv.head()
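
Before binding, it can help to confirm that each CSV carries the columns the seeding cells further down read (`r.key`, `r.is_actor`, `r.movie_key`, and so on). The sketch below is a minimal check in plain Python. The column sets are collected from the queries in this notebook; `EXPECTED_COLUMNS` and `missing_columns` are hypothetical names introduced here, not part of Velr or Polars, and the sample header is a toy stand-in for a real file.

```python
import csv
import io

# Columns that the seeding cells in this notebook read from each table.
EXPECTED_COLUMNS = {
    "movies_people": {"key", "name", "born", "birthplace",
                      "is_actor", "is_director", "is_writer"},
    "movies_movies": {"key", "title", "released", "imdb_id", "runtime",
                      "genre1", "genre2", "is_scifi", "is_action",
                      "is_thriller", "is_heist", "is_superhero"},
    "movies_directed": {"director_key", "movie_key", "since"},
    "movies_acted_in": {"person_key", "movie_key", "role", "minutes"},
}


def missing_columns(table, header):
    """Return the expected columns absent from a CSV header, sorted by name."""
    return sorted(EXPECTED_COLUMNS[table] - set(header))


# Toy header standing in for a real file (hypothetical sample, not the demo data).
sample = io.StringIO("person_key,movie_key,role\n")
header = next(csv.reader(sample))
print(missing_columns("movies_acted_in", header))  # ['minutes']
```

With the Polars frames loaded above, the same check could be run as `missing_columns("movies_people", people_csv.columns)`.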


### 0.2 Bind CSV tables into Velr

Now we bind the Polars DataFrames as in-memory tables so we can `UNWIND BIND(...)` them into nodes and relationships.

In [None]:
# Bind the loaded CSVs into Velr as in-memory tables.
db.bind_polars("_movies_people",   people_csv)
db.bind_polars("_movies_movies",   movies_csv)
db.bind_polars("_movies_directed", directed_csv)
db.bind_polars("_movies_acted_in", acted_in_csv)

print("Bound CSV tables into Velr")


### 0.3 Create nodes and labels via `UNWIND`

First we create bare `:Person` and `:Movie` nodes, then in a second step
we `SET` extra labels (Actor / Director / Writer, and genre labels).

In [None]:
# Create Person nodes
db.run("""
UNWIND BIND('_movies_people') AS r
CREATE (p:Person {
  key:        r.key,
  name:       r.name,
  born:       r.born,
  birthplace: r.birthplace
});
""")

# Add Actor / Director / Writer labels
db.run("""
UNWIND BIND('_movies_people') AS r
MATCH (p:Person {key:r.key})
WHERE r.is_actor
SET p:Actor;
""")

db.run("""
UNWIND BIND('_movies_people') AS r
MATCH (p:Person {key:r.key})
WHERE r.is_director
SET p:Director;
""")

db.run("""
UNWIND BIND('_movies_people') AS r
MATCH (p:Person {key:r.key})
WHERE r.is_writer
SET p:Writer;
""")

# Create Movie nodes
db.run("""
UNWIND BIND('_movies_movies') AS r
CREATE (m:Movie {
  key:      r.key,
  title:    r.title,
  released: r.released,
  imdb:     r.imdb_id,
  runtime:  r.runtime,
  genres:  [r.genre1, r.genre2]
});
""")

# Add genre labels from the boolean flag columns, as in the original seed script
db.run("""
UNWIND BIND('_movies_movies') AS r
MATCH (m:Movie {key:r.key})
WHERE r.is_scifi
SET m:ScienceFiction;
""")

db.run("""
UNWIND BIND('_movies_movies') AS r
MATCH (m:Movie {key:r.key})
WHERE r.is_action
SET m:Action;
""")

db.run("""
UNWIND BIND('_movies_movies') AS r
MATCH (m:Movie {key:r.key})
WHERE r.is_thriller
SET m:Thriller;
""")

db.run("""
UNWIND BIND('_movies_movies') AS r
MATCH (m:Movie {key:r.key})
WHERE r.is_heist
SET m:Heist;
""")

db.run("""
UNWIND BIND('_movies_movies') AS r
MATCH (m:Movie {key:r.key})
WHERE r.is_superhero
SET m:Superhero;
""")

print("Nodes & labels created")
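
The eight label statements above differ only in the bound table, the base label, the flag column, and the label being set. As an alternative to writing them out, the query text could be generated in a loop. This is a sketch only: `label_query` is a hypothetical helper, and it builds the text with string formatting on the assumption that labels have to appear literally in the query rather than as parameters.

```python
def label_query(table, base_label, flag, new_label):
    """Build the Cypher that adds `new_label` to nodes whose source row has `flag` set."""
    return (
        f"UNWIND BIND('{table}') AS r\n"
        f"MATCH (n:{base_label} {{key:r.key}})\n"
        f"WHERE r.{flag}\n"
        f"SET n:{new_label};"
    )


# (flag column, label) pairs taken from the cell above.
PERSON_LABELS = [("is_actor", "Actor"), ("is_director", "Director"), ("is_writer", "Writer")]
MOVIE_LABELS = [
    ("is_scifi", "ScienceFiction"),
    ("is_action", "Action"),
    ("is_thriller", "Thriller"),
    ("is_heist", "Heist"),
    ("is_superhero", "Superhero"),
]

queries = [label_query("_movies_people", "Person", f, lbl) for f, lbl in PERSON_LABELS]
queries += [label_query("_movies_movies", "Movie", f, lbl) for f, lbl in MOVIE_LABELS]

print(queries[0])
# In the notebook, each statement would then be executed with db.run(query).
```

Only trusted, hard-coded names should be interpolated this way, since the values end up directly in the query text.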


### 0.4 Create DIRECTED and ACTED_IN relationships via `UNWIND`

In [None]:
# DIRECTED
db.run("""
UNWIND BIND('_movies_directed') AS r
MATCH (d:Person {key:r.director_key}), (m:Movie {key:r.movie_key})
CREATE (d)-[:DIRECTED {since:r.since}]->(m);
""")

# ACTED_IN
db.run("""
UNWIND BIND('_movies_acted_in') AS r
MATCH (p:Person {key:r.person_key}), (m:Movie {key:r.movie_key})
CREATE (p)-[:ACTED_IN {
  role:    r.role,
  roles:  [r.role],   // single-element list, like the original seed
  minutes: r.minutes
}]->(m);
""")

print("Relationships created")


## 1. Basic node patterns & labels

In [None]:

# 1.1 All movie titles (single label)
q = """
MATCH (m:Movie)
RETURN m.title AS title
ORDER BY title ASC;
"""
db.to_pandas(q)


In [None]:

# 1.2 Filter on a node property (numeric comparison)
q = """
MATCH (p:Person)
WHERE p.born > 1965
RETURN p.name AS name, p.born AS born
ORDER BY born ASC, name ASC;
"""
db.to_pandas(q)


In [None]:

# 1.3 Nodes with multiple labels
q = """
MATCH (m:Movie:ScienceFiction)
RETURN m.title AS title, m.released AS released
ORDER BY released ASC, title ASC;
"""
db.to_pandas(q)


## 2. Relationship patterns (typed, untyped, undirected, multi-type)

In [None]:

# 2.1 Typed, directed relationship with edge properties in WHERE
q = """
MATCH (a:Actor)-[r:ACTED_IN]->(m:Movie {title:'The Matrix'})
WHERE r.minutes >= 0
RETURN a.name AS actor, r.role AS role, r.minutes AS minutes
ORDER BY minutes DESC, actor ASC;
"""
db.to_pandas(q)


In [None]:

# 2.2 Typed, directed relationship with edge props in WHERE (another example)
q = """
MATCH (d:Director)-[r:DIRECTED]->(m:Movie)
WHERE r.since >= 2010
RETURN d.name AS director, m.title AS title, r.since AS since
ORDER BY since ASC, title ASC;
"""
db.to_pandas(q)


In [None]:

# 2.3 Untyped, directed relationship (any type)
q = """
MATCH (p:Person)-[]->(m:Movie)
RETURN p.name AS person, m.title AS movie
ORDER BY movie ASC, person ASC;
"""
db.to_pandas(q)


In [None]:

# 2.4 Untyped, undirected relationship (any type, any direction)
q = """
MATCH (p:Person)--(m:Movie)
RETURN p.name AS person, m.title AS movie
ORDER BY movie ASC, person ASC;
"""
db.to_pandas(q)


In [None]:

# 2.5 Multi-type relationship: ACTED_IN or DIRECTED
q = """
MATCH (p:Person)-[r:ACTED_IN|DIRECTED]->(m:Movie)
RETURN p.name AS person, type(r) AS rel_type, m.title AS movie
ORDER BY person ASC, rel_type ASC, movie ASC;
"""
db.to_pandas(q)


In [None]:

# 2.6 Relationship pattern map (edge properties in the pattern)
q = """
MATCH (p:Person)-[:ACTED_IN {role:'Neo'}]->(m:Movie)
RETURN p.name AS name, m.title AS title
ORDER BY title ASC, name ASC;
"""
db.to_pandas(q)


In [None]:

# 2.7 Node pattern map with multiple fields
q = """
MATCH (m:Movie {title:'The Matrix', released:1999})
RETURN m.title AS title, m.released AS released
ORDER BY title ASC;
"""
db.to_pandas(q)


## 3. WHERE filters: numeric, boolean logic, ranges

In [None]:

# 3.1 AND + OR on node properties
q = """
MATCH (p:Person)
WHERE p.born > 1965 AND (p.birthplace = 'London, UK' OR p.birthplace = 'Los Angeles, USA')
RETURN p.name AS name, p.born AS born, p.birthplace AS birthplace
ORDER BY birthplace ASC, born ASC, name ASC;
"""
db.to_pandas(q)


In [None]:

# 3.2 Numeric range with >= and <=
q = """
MATCH (m:Movie)
WHERE m.runtime >= 120 AND m.runtime <= 160
RETURN m.title AS title, m.runtime AS runtime
ORDER BY runtime ASC, title ASC;
"""
db.to_pandas(q)


In [None]:

# 3.3 Cartesian + filter: cross-node comparisons
q = """
MATCH (m1:Movie), (m2:Movie)
WHERE m1.released > m2.released
RETURN m1.title AS newer, m1.released AS newer_year,
       m2.title AS older, m2.released AS older_year
ORDER BY newer_year ASC, newer ASC, older ASC;
"""
db.to_pandas(q)


In [None]:

# 3.4 Cross-alias comparison on properties (Director vs Actor)
q = """
MATCH (d:Director), (a:Actor)
WHERE d.born < a.born
RETURN d.name AS director, d.born AS director_born,
       a.name AS actor, a.born AS actor_born
ORDER BY director_born ASC, director ASC, actor ASC;
"""
db.to_pandas(q)



## 4. String predicates, `IN` lists, and `NULL`

These all come from the e2e tests and show what Velr currently supports.


In [None]:

# 4.1 STARTS WITH
q = """
MATCH (m:Movie)
WHERE m.title STARTS WITH 'The'
RETURN m.title AS title
ORDER BY title ASC;
"""
db.to_pandas(q)


In [None]:

# 4.2 CONTAINS on node property
q = """
MATCH (m:Movie)
WHERE m.title CONTAINS 'Knight'
RETURN m.title AS title
ORDER BY title ASC;
"""
db.to_pandas(q)


In [None]:

# 4.3 ENDS WITH
q = """
MATCH (p:Person)
WHERE p.birthplace ENDS WITH ', UK'
RETURN p.name AS name
ORDER BY name ASC;
"""
db.to_pandas(q)


In [None]:

# 4.4 CONTAINS on birthplace
q = """
MATCH (p:Person)
WHERE p.birthplace CONTAINS 'Los'
RETURN p.name AS name, p.birthplace AS birthplace
ORDER BY name ASC;
"""
db.to_pandas(q)


In [None]:

# 4.5 IN list on node property
q = """
MATCH (m:Movie)
WHERE m.released IN [1999, 2010]
RETURN m.title AS title, m.released AS released
ORDER BY released ASC, title ASC;
"""
db.to_pandas(q)


In [None]:

# 4.6 IN list on relationship property
q = """
MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
WHERE a.role IN ['Neo', 'Bane']
RETURN p.name AS name, a.role AS role, m.title AS title
ORDER BY role ASC, name ASC;
"""
db.to_pandas(q)


In [None]:

# 4.7 CONTAINS on relationship property
q = """
MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
WHERE a.role CONTAINS 'ri'
RETURN p.name AS name, a.role AS role, m.title AS title
ORDER BY role ASC, name ASC;
"""
db.to_pandas(q)


In [None]:

# 4.8 Global OR: predicates over different variables in the same WHERE
q = """
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.birthplace CONTAINS 'Los' OR m.title STARTS WITH 'The'
RETURN p.name AS name, m.title AS title
ORDER BY title ASC, name ASC;
"""
db.to_pandas(q)


In [None]:

# 4.9 IS NULL on a property
q = """
MATCH (p:Person)
WHERE p.birthplace IS NULL
RETURN p.name AS name
ORDER BY name ASC;
"""
db.to_pandas(q)
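
The predicates in this section interact with `NULL` in a way that is easy to miss: in standard Cypher a string predicate on a missing property evaluates to `NULL`, and `WHERE` drops any row whose predicate is not true. The pure-Python sketch below mimics that behaviour, assuming Velr follows the usual Cypher semantics (including case-sensitive matching). The helpers `contains` and `where` and the three people are hypothetical; only the birthplace strings echo values used in the queries above.

```python
def contains(value, needle):
    """Cypher-style CONTAINS: a NULL (None) operand yields NULL, not False."""
    return None if value is None else needle in value


def where(rows, predicate):
    """WHERE keeps a row only when the predicate is exactly True."""
    return [r for r in rows if predicate(r) is True]


people = [
    {"name": "P1", "birthplace": "Los Angeles, USA"},
    {"name": "P2", "birthplace": "London, UK"},
    {"name": "P3", "birthplace": None},  # property missing
]

# WHERE p.birthplace CONTAINS 'Los'
los = [p["name"] for p in where(people, lambda p: contains(p["birthplace"], "Los"))]

# WHERE p.birthplace IS NULL
is_null = [p["name"] for p in where(people, lambda p: p["birthplace"] is None)]

print(los)      # ['P1']
print(is_null)  # ['P3']
print(contains("Los Angeles, USA", "los"))  # False: matching is case-sensitive
```

Under these assumptions, a person without a birthplace is matched by 4.9 and by none of the string predicates.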


## 5. Paths and variable-length patterns

In [None]:

# 5.1 Named path + path functions: nodes(), relationships(), length()
q = """
MATCH pth = (p:Person)-[a:ACTED_IN]->(m:Movie)
RETURN
  p.name             AS name,
  m.title            AS title,
  nodes(pth)         AS nodes,
  relationships(pth) AS rels,
  length(pth)        AS hops
ORDER BY title ASC, name ASC;
"""
db.to_pandas(q)


In [None]:

# 5.2 Path of length 2 with a named variable
q = """
MATCH pth = (d:Person)-[:DIRECTED]->(m:Movie)<-[:ACTED_IN]-(p:Person)
RETURN
  m.title     AS title,
  d.name      AS director,
  p.name      AS actor,
  length(pth) AS hops
ORDER BY title ASC, director ASC, actor ASC;
"""
db.to_pandas(q)


In [None]:

# 5.3 WHERE on length(path)
q = """
MATCH pth = (d:Person)-[:DIRECTED]->(m:Movie)<-[:ACTED_IN]-(p:Person)
WHERE length(pth) = 2
RETURN m.title AS title, d.name AS director, p.name AS actor
ORDER BY title ASC, director ASC, actor ASC;
"""
db.to_pandas(q)


In [None]:

# 5.4 Variable-length: ACTED_IN with length 1..2 (inclusive)
q = """
MATCH (p:Person)-[:ACTED_IN*1..2]->(m:Movie)
RETURN p.name AS person, m.title AS title
ORDER BY title ASC, person ASC;
"""
db.to_pandas(q)


In [None]:

# 5.5 Variable-length: exactly 2 hops, any type/direction between Person and Director
q = """
MATCH (p:Person)-[*2]-(d:Director)
RETURN p.name AS person, d.name AS director
ORDER BY director ASC, person ASC;
"""
db.to_pandas(q)


In [None]:

# 5.6 Variable-length: zero-or-more hops, ACTED_IN from Movie to Movie
# In the e2e seed, this collapses to identity pairs because movies have no outgoing ACTED_IN.
q = """
MATCH (m:Movie)-[:ACTED_IN*0..]->(m2:Movie)
RETURN m.title AS left_title, m2.title AS right_title
ORDER BY left_title ASC, right_title ASC;
"""
db.to_pandas(q)


In [None]:

# 5.7 Variable-length co-actors: exactly 2 ACTED_IN hops between Persons
q = """
MATCH (p:Person)-[:ACTED_IN*2]-(q:Person)
WHERE p.name < q.name
RETURN p.name AS a, q.name AS b
ORDER BY a ASC, b ASC;
"""
db.to_pandas(q)
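
The `*2` pattern in 5.7 can be read as "two ACTED_IN relationships joined at a shared movie". Below is a minimal pure-Python sketch of that reading, assuming Velr applies the usual Cypher rule that one relationship is not traversed twice within a single pattern. The edge list is a toy sample rather than the demo CSV, and `two_hop_pairs` is a hypothetical helper.

```python
# Toy ACTED_IN edges as (person, movie) pairs; a small sample, not the demo data.
ACTED_IN = [
    ("Keanu Reeves", "The Matrix"),
    ("Carrie-Anne Moss", "The Matrix"),
    ("Leonardo DiCaprio", "Inception"),
    ("Tom Hardy", "Inception"),
]


def two_hop_pairs(edges):
    """Emulate (p:Person)-[:ACTED_IN*2]-(q:Person) WHERE p.name < q.name."""
    pairs = []
    for i, (p, movie_p) in enumerate(edges):
        for j, (q, movie_q) in enumerate(edges):
            # i != j: the two hops must use two different relationships.
            # Same movie: the hops meet at the shared middle node.
            # p < q: keep each unordered pair once, as the WHERE clause does.
            if i != j and movie_p == movie_q and p < q:
                pairs.append((p, q))  # one row per matching path
    return sorted(pairs)


print(two_hop_pairs(ACTED_IN))
# [('Carrie-Anne Moss', 'Keanu Reeves'), ('Leonardo DiCaprio', 'Tom Hardy')]
```

The sketch emits one row per path, so two people who share two movies would appear twice; adding `DISTINCT` to the `RETURN` would be the usual Cypher way to collapse such duplicates.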


In [None]:

# 5.8 Variable-length with path alias and length
q = """
MATCH pt = (p:Person)-[*2]-(d:Director)
RETURN p.name AS person, d.name AS director, length(pt) AS L
ORDER BY director ASC, person ASC, L ASC;
"""
db.to_pandas(q)


## 6. Aggregates: COUNT, SUM, AVG, MIN, MAX

In [None]:

# 6.1 COUNT(*) over all Movie nodes
q = """
MATCH (m:Movie)
RETURN COUNT(*) AS movies_total;
"""
db.to_pandas(q)


In [None]:

# 6.2 COUNT(a) per movie (number of ACTED_IN edges)
q = """
MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
RETURN m.title AS title, COUNT(a) AS actors
ORDER BY title ASC;
"""
db.to_pandas(q)


In [None]:

# 6.3 SUM and AVG over relationship property
q = """
MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
RETURN
    m.title AS title,
    SUM(a.minutes) AS minutes_sum,
    AVG(a.minutes) AS minutes_avg
ORDER BY title ASC;
"""
db.to_pandas(q)
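
None of the aggregate queries in this section has a `GROUP BY`: in Cypher, the non-aggregated expressions in `RETURN` (here `m.title`) act as the grouping key. The following pure-Python sketch spells out that implicit grouping for query 6.3. The rows are toy values, not the demo data.

```python
from collections import defaultdict

# (title, minutes) pairs standing in for matched ACTED_IN rows; toy values.
rows = [("The Matrix", 100), ("The Matrix", 60), ("Inception", 90)]

# m.title is the only non-aggregated column, so it becomes the grouping key.
groups = defaultdict(list)
for title, minutes in rows:
    groups[title].append(minutes)

# One output row per group, ordered by title as in ORDER BY title ASC.
result = [
    {"title": title, "minutes_sum": sum(vals), "minutes_avg": sum(vals) / len(vals)}
    for title, vals in sorted(groups.items())
]

for row in result:
    print(row)
# {'title': 'Inception', 'minutes_sum': 90, 'minutes_avg': 90.0}
# {'title': 'The Matrix', 'minutes_sum': 160, 'minutes_avg': 80.0}
```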


In [None]:

# 6.4 MIN / MAX over a node property
q = """
MATCH (m:Movie)
RETURN MIN(m.runtime) AS runtime_min, MAX(m.runtime) AS runtime_max;
"""
db.to_pandas(q)


In [None]:

# 6.5 COUNT over NULLs (property missing -> not counted)
q = """
MATCH (p:Person)
RETURN COUNT(p.nick) AS nick_count;
"""
db.to_pandas(q)
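
The difference between `COUNT(*)` and `COUNT(expr)` is that the first counts rows while the second counts only rows where the expression is not `NULL`. Below is a small Python analogy of that rule; the people and the `nick` values are hypothetical.

```python
# Hypothetical rows: one explicit None, one missing key, one real value.
people = [
    {"name": "P1", "nick": None},
    {"name": "P2"},
    {"name": "P3", "nick": "Three"},
]

count_star = len(people)  # COUNT(*): every matched row
count_nick = sum(1 for p in people if p.get("nick") is not None)  # COUNT(p.nick)

print(count_star, count_nick)  # 3 1
```

In query 6.5 the `nick` property is treated as missing on every `Person`, which is why the comment describes it as "not counted".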


In [None]:

# 6.6 COUNT over non-NULL runtimes (all movies have runtime in the seed)
q = """
MATCH (m:Movie)
RETURN COUNT(m.runtime) AS runtime_count;
"""
db.to_pandas(q)


In [None]:

# 6.7 WHERE before aggregation
q = """
MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
WHERE p.born > 1965
RETURN m.title AS title, COUNT(a) AS young_actors
ORDER BY title ASC;
"""
db.to_pandas(q)


In [None]:

# 6.8 Aggregate by director
q = """
MATCH (d:Director)-[:DIRECTED]->(m:Movie)
RETURN d.name AS director, COUNT(m) AS films
ORDER BY director ASC;
"""
db.to_pandas(q)


## 7. Pagination: `LIMIT` and `SKIP`

In [None]:

# 7.1 LIMIT
q = """
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
RETURN a.name AS actor, m.title AS movie
ORDER BY actor ASC, movie ASC
LIMIT 3;
"""
db.to_pandas(q)


In [None]:

# 7.2 SKIP + LIMIT
q = """
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
RETURN a.name AS actor, m.title AS movie
ORDER BY actor ASC, movie ASC
SKIP 3 LIMIT 3;
"""
db.to_pandas(q)
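
`SKIP n LIMIT k` behaves like the Python slice `rows[n:n + k]` over the ordered result, so page numbers map to offsets by simple multiplication. The sketch below builds the query text for a zero-based page. `page_query` is a hypothetical helper, and it assumes `SKIP 0` is accepted for the first page, which this notebook does not test.

```python
def page_query(page, page_size):
    """Build the section 7 query for a zero-based page number."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return (
        "MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)\n"
        "RETURN a.name AS actor, m.title AS movie\n"
        "ORDER BY actor ASC, movie ASC\n"
        f"SKIP {page * page_size} LIMIT {page_size};"
    )


# The second page of size 3 reproduces the SKIP/LIMIT line of query 7.2.
print(page_query(1, 3).splitlines()[-1])  # SKIP 3 LIMIT 3;

# The same window expressed as a slice over an already ordered list.
rows = list("abcdefgh")
skip, limit = 3, 3
print(rows[skip:skip + limit])  # ['d', 'e', 'f']
```

A stable `ORDER BY` matters here: without it, consecutive pages are not guaranteed to partition the result.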


## 8. Multi-`MATCH` and cartesian products

In [None]:

# 8.1 Two MATCH clauses that form a cartesian product
q = """
MATCH (m1:Movie)
MATCH (m2:Movie)
WHERE m1.released > m2.released
RETURN m1.title AS newer, m2.title AS older
ORDER BY newer ASC, older ASC;
"""
db.to_pandas(q)
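
Two `MATCH` clauses that share no variable pair every `m1` with every `m2`, and the `WHERE` then filters those pairs. The sketch below shows the same shape in plain Python on a three-movie toy sample (not the demo CSV), which makes the quadratic growth of the intermediate result easy to see.

```python
from itertools import product

# Toy (title, released) sample; a stand-in for the Movie nodes.
movies = [("The Matrix", 1999), ("The Dark Knight", 2008), ("Inception", 2010)]

# MATCH (m1:Movie) MATCH (m2:Movie): every ordered pair, including (x, x).
all_pairs = list(product(movies, repeat=2))

# WHERE m1.released > m2.released, then ORDER BY newer ASC, older ASC.
pairs = sorted(
    (newer, older)
    for (newer, newer_year), (older, older_year) in all_pairs
    if newer_year > older_year
)

print(len(all_pairs))  # 9
print(pairs)
# [('Inception', 'The Dark Knight'), ('Inception', 'The Matrix'), ('The Dark Knight', 'The Matrix')]
```

For `n` movies the product has `n * n` rows before filtering, so this pattern is best kept to small label sets or combined with a selective predicate.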


In [None]:

# 8.2 Extending a bound variable across MATCH clauses
q = """
MATCH (m:Movie {title:'The Matrix'})
MATCH (p:Person)-[:ACTED_IN]->(m)
RETURN p.name AS actor
ORDER BY actor ASC;
"""
db.to_pandas(q)


## 9. Functions: `id()`, `type()` and generic patterns

In [None]:

# 9.1 id() and type() on nodes and relationships
q = """
MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
RETURN
  id(p)   AS pid,
  type(a) AS reltype,
  id(a)   AS aid,
  id(m)   AS mid
ORDER BY mid ASC, pid ASC
LIMIT 10;
"""
db.to_pandas(q)


In [None]:

# 9.2 Match all nodes, return only id(n)
q = """
MATCH (n)
RETURN id(n) AS id;
"""
db.to_pandas(q)


In [None]:

# 9.3 Anonymous nodes with a typed relationship, aggregated
q = """
MATCH ()-[a:ACTED_IN]->()
RETURN COUNT(a) AS acted_in_rels;
"""
db.to_pandas(q)



## 10. Where to go from here

This notebook is **directly grounded in the Velr e2e `MATCH` tests**, so everything here
should reflect what the engine can execute today.

Things you might try next:

- Fork this notebook and **add your own patterns**
- Use it as a **living compatibility matrix** vs. Neo4j-style Cypher
- Swap out the Movies graph for your own domain data and see how the
  patterns behave there

Happy querying! 🧠📈
