Fix WITH clause variable passing in aggregation queries #60

@DecisionNerd

Problem

Queries that aggregate in a WITH clause fail with a KeyError when a subsequent clause references a variable that the WITH clause projected. This breaks common aggregation patterns.

Example Query

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WITH p, count(m) as movie_count
RETURN p.name as actor, movie_count
ORDER BY movie_count DESC

Error

KeyError: 'p'

The variable p from the WITH clause is not accessible in the RETURN clause.

Impact

  • Cannot perform common aggregation patterns
  • Prevents queries like "find top N by count"
  • Blocks basic graph analytics workflows
  • Discovered during dataset integration testing

Expected Behavior

Variables listed in the WITH clause should be passed through to subsequent clauses. In the example above:

  • p should be available as a node reference
  • movie_count should be available as the aggregation result
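The expected behavior can be modeled on plain Python dicts. This is only a sketch of the semantics, not GraphForge's executor: the Person class, the with_count helper, and the row-as-dict representation are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical stand-in for a node; not GraphForge's node type."""
    name: str


def with_count(rows, group_var, counted_var, alias):
    """Model `WITH group_var, count(counted_var) AS alias` over binding rows."""
    counts = {}  # dicts keep the first-seen order of grouping keys
    for row in rows:
        key = row[group_var]
        # count() skips nulls, so only non-None values are counted
        counts[key] = counts.get(key, 0) + (1 if row[counted_var] is not None else 0)
    # Each output row binds BOTH the grouping variable and the aggregate alias
    return [{group_var: key, alias: n} for key, n in counts.items()]


alice, bob = Person("Alice"), Person("Bob")

# Rows that MATCH (p:Person)-[:ACTED_IN]->(m:Movie) would produce for the test data
matched = [
    {"p": alice, "m": "Movie1"},
    {"p": alice, "m": "Movie2"},
    {"p": bob, "m": "Movie1"},
]

projected = with_count(matched, "p", "m", "movie_count")

# ORDER BY movie_count DESC, then RETURN p.name AS actor, movie_count
ranked = sorted(projected, key=lambda r: r["movie_count"], reverse=True)
result = [{"actor": r["p"].name, "movie_count": r["movie_count"]} for r in ranked]
```

Here result holds Alice with a movie_count of 2 followed by Bob with a movie_count of 1. The point is that p survives the WITH step as a node reference, so p.name can still be evaluated afterwards.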

Test Case

from graphforge import GraphForge

gf = GraphForge()
gf.execute("CREATE (p1:Person {name: 'Alice'})")
gf.execute("CREATE (p2:Person {name: 'Bob'})")
gf.execute("CREATE (m1:Movie {title: 'Movie1'})")
gf.execute("CREATE (m2:Movie {title: 'Movie2'})")
gf.execute("MATCH (p:Person {name: 'Alice'}), (m:Movie) CREATE (p)-[:ACTED_IN]->(m)")
gf.execute("MATCH (p:Person {name: 'Bob'}), (m:Movie {title: 'Movie1'}) CREATE (p)-[:ACTED_IN]->(m)")

# This should work but fails
results = gf.execute("""
    MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
    WITH p, count(m) as movie_count
    RETURN p.name as actor, movie_count
    ORDER BY movie_count DESC
""")
# Currently raises KeyError: 'p'
# Expected: Alice with movie_count 2, then Bob with movie_count 1

Root Cause

The executor's handling of the WITH clause doesn't preserve variable bindings when transitioning between the aggregation and projection phases.
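One way this symptom could arise is sketched below. This is an assumption for illustration only: the issue does not identify the faulty code, and the function names here are hypothetical rather than taken from GraphForge. If the aggregation step emits rows containing only the aggregate alias, the grouping variable is gone by the time RETURN is evaluated.

```python
def aggregate_dropping_keys(rows):
    """Hypothetical buggy step: emits only the aggregate, losing the grouping key."""
    counts = {}
    for row in rows:
        counts[row["p"]] = counts.get(row["p"], 0) + 1
    return [{"movie_count": n} for n in counts.values()]


def aggregate_keeping_keys(rows):
    """Same step, but each output row also carries the grouping variable."""
    counts = {}
    for row in rows:
        counts[row["p"]] = counts.get(row["p"], 0) + 1
    return [{"p": p, "movie_count": n} for p, n in counts.items()]


def return_clause(rows):
    """Model RETURN with p and movie_count; p is simplified to a name string."""
    return [{"actor": row["p"], "movie_count": row["movie_count"]} for row in rows]


matched = [
    {"p": "Alice", "m": "Movie1"},
    {"p": "Alice", "m": "Movie2"},
    {"p": "Bob", "m": "Movie1"},
]

try:
    return_clause(aggregate_dropping_keys(matched))
    error = None
except KeyError as exc:
    error = exc.args[0]  # the missing variable name

fixed = return_clause(aggregate_keeping_keys(matched))
```

In this model the first pipeline fails on the lookup of p, matching the reported KeyError, while the second returns one row per actor. Whether the real executor fails for this reason would need to be confirmed against its source.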

Priority

HIGH - This is a core query feature that should work. Blocks common use cases.

Related

  • Discovered during dataset integration testing (v0.2.1)
  • See tests/integration/datasets/test_working_datasets.py
  • Affects real-world queries on dataset graphs

Labels

  • bug
  • executor
  • high-priority
