feat: implement CALL procedure invocation#351
Conversation
Implements the CALL clause for procedure invocation across all 4 layers: **Grammar (cypher.lark)** - Added `call_procedure` rule: `CALL proc.name(args) [YIELD items]` - Added `call_procedure_implicit` rule: `CALL proc.name [YIELD items]` (standalone only) - Added `proc_name` rule for dot-qualified names (e.g. `test.my.proc`) - Added `yield_clause` rule supporting `YIELD *` and `YIELD item1, item2 AS alias` - Added backtick-quoted identifier support (required by TCK scenarios) **AST / Parser** - Extended `CallClause` with `procedure_name`, `arguments`, `yield_items`, `yield_star` - Added transformer methods: `call_procedure`, `call_procedure_implicit`, `proc_name`, `yield_star`, `yield_items` - `_get_token_value` now strips backtick quoting **Procedures module (new)** - `ProcedureColumn`: name, cypher_type, nullable - `ProcedureDefinition`: name, parameters, output_columns, implementation callable - `ProcedureRegistry`: register, register_from_tck (static-row filtering), get, exists **Planner** - Added `ProcedureCall` operator (distinct from existing `Call` subquery operator) - Compile-time validations: `YIELD *` in-query, implicit args in-query, aggregation in args, `VariableAlreadyBound` in YIELD, `UndefinedVariable` for RETURN referencing unYIELDed procedure outputs - Registry passed to planner for semantic validation **Executor** - `_execute_procedure_call`: evaluates args, validates types, calls implementation - Standalone vs in-query seeding, void procedure pass-through - YIELD * / explicit YIELD / no-YIELD column exposure - `$param` bindings excluded from standalone result rows - Parameter type coercion (INTEGER accepted for FLOAT/NUMBER) **TCK / Tests** - `step_there_exists_a_procedure` BDD step now registers real procedures - 37/37 procedure TCK scenarios passing (100%) - Unit tests: parser, planner, executor, registry - Integration tests: 16 standalone and in-query scenarios TCK compliance: 3,235 → 3,291 (+56) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WalkthroughAdds CALL procedure support: grammar, AST, planner operator, procedure registry, executor invocation with parameter binding/type checks, YIELD semantics, GraphForge API hooks, and tests covering parser, planner, executor, registry, TCK integration, and integration scenarios. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant GraphForge
participant Parser
participant Planner
participant ProcedureRegistry
participant Executor
participant ProcedureImpl
Client->>GraphForge: execute(query, parameters?)
GraphForge->>Parser: parse(query)
Parser-->>GraphForge: AST (CallClause / ProcedureCall)
GraphForge->>Planner: plan(AST) with procedure_registry
Planner-->>GraphForge: operators (ProcedureCall ...)
GraphForge->>Executor: execute(operators, parameters)
Executor->>ProcedureRegistry: get(procedure_name)
ProcedureRegistry-->>Executor: ProcedureDefinition
Executor->>Executor: evaluate argument exprs per row (using parameters)
Executor->>ProcedureImpl: call(*py_args)
ProcedureImpl-->>Executor: rows (or no rows)
Executor->>Client: result rows (YIELD-expanded)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #351 +/- ##
==========================================
- Coverage 89.54% 89.23% -0.32%
==========================================
Files 38 39 +1
Lines 11237 11571 +334
Branches 2391 2478 +87
==========================================
+ Hits 10062 10325 +263
- Misses 822 868 +46
- Partials 353 378 +25
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
|
There was a problem hiding this comment.
Actionable comments posted: 13
🧹 Nitpick comments (7)
tests/unit/parser/test_call_procedure_parser.py (1)
7-11: Use a fixture instead of a module-level parser instance.This shares one
CypherParseracross the whole module, which goes against the repo’s isolation rule for unit tests. A fixture keeps each case independent and makes parser-state regressions easier to localize.As per coding guidelines, "Always use fresh test fixtures for test isolation; never use shared mutable state across tests".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/parser/test_call_procedure_parser.py` around lines 7 - 11, The module-level mutable parser variable should be replaced with a pytest fixture that returns a fresh CypherParser for each test; remove the global parser and change the helper or tests to accept a fixture (e.g., a pytest fixture named "parser" that returns CypherParser()) so each test uses its own instance when calling parse, ensuring no shared state from the previous module-level parser or the existing parse() wrapper.src/graphforge/ast/clause.py (1)
305-311: Validate the twoCallClauseshapes explicitly.
CallClausenow models two mutually exclusive forms, but nothing preventsquery=None, procedure_name=None, both being set, oryield_starbeing attached to a subquery form. Please add a@model_validator(mode="after")that requires exactly one mode and rejects procedure-only fields onCALL { ... }.As per coding guidelines, "Use `@field_validator` for single-field validation and `@model_validator`(mode='after') for cross-field validation in Pydantic models".Possible model-level validation
-from pydantic import BaseModel, Field, field_validator +from pydantic import BaseModel, Field, field_validator, model_validator @@ - query: Any = Field(None, description="Nested query (subquery mode)") + query: Any | None = Field(None, description="Nested query (subquery mode)") procedure_name: str | None = Field(None, description="Qualified procedure name") arguments: list[Any] | None = Field(None, description="Argument expressions") yield_items: list[Any] | None = Field(None, description="YIELD return items") yield_star: bool = Field(False, description="YIELD * was specified") + `@model_validator`(mode="after") + def validate_call_shape(self) -> "CallClause": + has_query = self.query is not None + has_proc = self.procedure_name is not None + if has_query == has_proc: + raise ValueError("CALL must contain exactly one of query or procedure_name") + if has_query and ( + self.arguments is not None or self.yield_items is not None or self.yield_star + ): + raise ValueError("Subquery CALL cannot include procedure arguments or YIELD") + if self.yield_star and self.yield_items is not None: + raise ValueError("YIELD * cannot be combined with explicit YIELD items") + return self + model_config = {"frozen": True, "arbitrary_types_allowed": True}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/ast/clause.py` around lines 305 - 311, The CallClause Pydantic model currently allows invalid combinations of fields (query, procedure_name, arguments, yield_items, yield_star); add a `@model_validator`(mode="after") on the CallClause class that enforces exactly one mode: either subquery form (query is not None, procedure_name/arguments must be None, yield_star must be False and yield_items optional) or procedure form (procedure_name is not None, arguments may be present, query must be None, and yield_star/yield_items must be None); raise a ValueError with a clear message when neither or both modes are present or when procedure-only fields are set on a subquery form.src/graphforge/planner/operators.py (1)
598-604: Validate impossibleProcedureCallcombinations in the model.This operator can currently be constructed with invalid combinations such as both
yield_starandyield_items, or implicit arguments on a non-standalone call. A small@model_validator(mode="after")here would stop impossible plans before they reach the executor.As per coding guidelines, "Use `@field_validator` for single-field validation and `@model_validator`(mode='after') for cross-field validation in Pydantic models".Possible shape validation
class ProcedureCall(BaseModel): @@ yield_items: list[Any] | None = Field(None, description="Explicit YIELD items") yield_star: bool = Field(False, description="YIELD * was written") standalone: bool = Field(False, description="Standalone call (no prior rows)") + `@model_validator`(mode="after") + def validate_call_shape(self) -> "ProcedureCall": + if self.yield_star and self.yield_items is not None: + raise ValueError("YIELD * cannot be combined with explicit YIELD items") + if self.arguments is None and not self.standalone: + raise ValueError("Implicit argument passing is only valid for standalone CALL") + if self.yield_star and not self.standalone: + raise ValueError("YIELD * is only valid for standalone CALL") + return self + model_config = {"frozen": True, "arbitrary_types_allowed": True}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/operators.py` around lines 598 - 604, The ProcedureCall model allows impossible combos (e.g., both yield_star and yield_items, or implicit arguments (arguments is None) on a non-standalone call); add a Pydantic cross-field validator on ProcedureCall using `@model_validator`(mode="after") that checks: if yield_star is True and yield_items is not None (or non-empty) raise a ValueError; if arguments is None and standalone is False raise a ValueError; return the model instance at the end. Reference the ProcedureCall class and its fields procedure_name, arguments, yield_items, yield_star, and standalone when adding the validator.tests/unit/executor/test_procedure_call_executor.py (1)
46-143: These executor "unit" tests currently exercise the full stack.Every case goes through
GraphForge.execute(), so parser/planner failures are indistinguishable from executor regressions and this file overlaps the integration suite. BuildProcedureCalloperators and execute them against a prepared executor directly here.As per coding guidelines, "tests/unit/**/*.py: Unit tests must test a single component in isolation using fixtures like empty_graph and assertions on AST/operator types".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/executor/test_procedure_call_executor.py` around lines 46 - 143, The tests in TestStandaloneCallExecution/TestInQueryCallExecution/TestCallWithMatchContext use GraphForge.execute() which exercises the full stack; replace those usages by building ProcedureCall operator nodes directly (use the ProcedureCall AST/operator) and run them against the prepared executor harness (invoke the ProcedureCall execution path or ProcedureCallExecutor) so the tests isolate the procedure executor; keep using the helper factories labels_proc(), lookup_proc(), void_proc() to register procedures and construct the operator inputs (empty_graph or prepared match rows) and assert on returned rows/columns, and remove assertions that depend on parsing/planning (i.e., stop calling GraphForge.execute()).tests/integration/test_call_procedures.py (1)
68-70: Assert actual result contents in these end-to-end cases.Several of these tests only check
len(...)or key presence, so they would still pass if CALL returned duplicated or wrong values. Add concrete row/value assertions so the integration suite pins the new semantics, not just the row count.As per coding guidelines, "Integration tests must test end-to-end Cypher query execution with GraphForge.execute() and verify result values".
Also applies to: 91-105, 113-123
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_call_procedures.py` around lines 68 - 70, The test test_standalone_with_yield_star currently only asserts len(result) == 3 which allows wrong or duplicated rows to pass; update the test to assert specific result rows and field values returned by gf.execute("CALL test.labels() YIELD *") (e.g., exact set/order of labels and any other yielded columns) so the end-to-end behavior is pinned; apply the same change pattern to the other call-procedure tests referenced (lines covering the other cases at 91-105 and 113-123) to replace length/key-only checks with concrete assertions of row contents and expected values.src/graphforge/executor/executor.py (2)
3252-3255: Addstrict=Truetozip()for defensive validation.While argument count is validated earlier for explicit calls (lines 3213-3218), adding
strict=Trueprovides defense-in-depth, especially for implicit argument calls where the count relationship is less obvious.♻️ Proposed fix
# Validate argument types - for i, (param, py_val) in enumerate(zip(defn.parameters, evaluated_args)): + for i, (param, py_val) in enumerate(zip(defn.parameters, evaluated_args, strict=True)): if py_val is None: continue # null is always allowed for nullable params _check_proc_arg_type(param, py_val, op.procedure_name, i)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/executor/executor.py` around lines 3252 - 3255, The zip over defn.parameters and evaluated_args should use strict=True to ensure the two iterables have identical lengths defensively; update the loop in the executor where you iterate with for i, (param, py_val) in enumerate(zip(defn.parameters, evaluated_args)): to call zip(defn.parameters, evaluated_args, strict=True) so mismatched argument counts raise immediately (this affects the validation path that calls _check_proc_arg_type with op.procedure_name and index i).
388-393: Remove redundant self-import.
_python_to_cypheris defined in this same file (line 79), so the import on line 390 is unnecessary and should be removed.🧹 Proposed fix
seed_ctx = ExecutionContext() if parameters: - from graphforge.executor.executor import _python_to_cypher - for pname, pval in parameters.items(): seed_ctx.bind(f"${pname}", _python_to_cypher(pval))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/executor/executor.py` around lines 388 - 393, Remove the redundant local import of _python_to_cypher inside the parameter handling block; since _python_to_cypher is defined in the same file (near line 79), delete the line "from graphforge.executor.executor import _python_to_cypher" and leave the loop that calls _python_to_cypher when binding parameters to seed_ctx (ExecutionContext and seed_ctx.bind remain unchanged).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/graphforge/api.py`:
- Around line 189-221: The clone() method currently creates a fresh GraphForge
without copying user-registered procedures, causing CALLs to fail on clones;
update GraphForge.clone to preserve the procedure registry by either passing the
existing self.procedure_registry into the new GraphForge constructor or
assigning/cloning it onto the cloned instance (ensure the planner/executor on
the clone use that same procedure_registry), referencing GraphForge.clone,
self.procedure_registry, ProcedureRegistry and the constructor usages that
initialize planner/executor so registered procedures registered via
register_procedure remain available on the clone.
In `@src/graphforge/executor/executor.py`:
- Around line 3244-3249: The code that builds procedure call arguments uses
_cypher_to_python which tries to access .value and will raise for
NodeRef/EdgeRef; replace calls to _cypher_to_python with _unwrap_cypher_val when
appending procedure arguments (both in the branch that iterates op.arguments and
in the branch that handled pre-evaluated cypher_val) so node/relationship refs
are passed through unchanged; update the usages around evaluated_args,
evaluate_expression, and op.arguments to call _unwrap_cypher_val instead of
_cypher_to_python.
In `@src/graphforge/parser/cypher.lark`:
- Line 314: The IDENTIFIER token currently allows empty backtick-quoted names;
update the token rule in cypher.lark (IDENTIFIER) so backtick-quoted identifiers
require at least one character (e.g. use `+` instead of `*` inside the backtick
alternative) so the lexer fails on `` rather than producing an empty name, and
re-run tests; also search parser.py transformer methods that consume IDENTIFIER
tokens (transformer methods handling identifier rules) to confirm they no longer
need to guard against empty v[0] values.
- Around line 97-107: The CALL grammar currently reuses function_args and
return_item which permits unsupported forms like CALL p(*) or CALL p(DISTINCT n)
and CALL p() YIELD a + 1 AS x; replace those usages with dedicated
procedure-specific rules (e.g. proc_call_args and proc_yield_item or similarly
named rules) and update the CALL alternatives (call_procedure and
call_procedure_implicit) to reference the new rules instead of function_args and
return_item, ensuring proc_name, yield_clause, yield_star and yield_items remain
valid; then add matching transformer methods in parser.py for the new rule names
(e.g. transform_proc_call_args, transform_proc_yield_item) and remove/stop using
the old function_args/return_item transformer handlers so the parser rejects the
unsupported constructs at grammar level.
In `@src/graphforge/parser/parser.py`:
- Around line 67-75: The _get_token_value helper currently only strips outer
backtick delimiters but doesn't unescape doubled backticks inside quoted
identifiers; update _get_token_value (the method in parser.py that inspects
Token.value or the raw item) so that after removing the leading and trailing
backticks it also replaces any doubled backtick occurrences with a single
backtick (e.g., use the inner string v[1:-1].replace("``", "`")) before
returning, ensuring quoted names like `a``b` normalize to a`b`.
- Around line 379-386: The call_procedure transformer loop is currently dropping
the distinct flag from function_args (handling of distinct_arg) so CALL
proc(DISTINCT x) is silently accepted; modify the loop in parser.py (the items
processing inside call_procedure) to explicitly detect a tuple (args_list,
distinct_flag) where distinct_flag is True and raise a parse/SyntaxError
indicating DISTINCT is not allowed in CALL (include "DISTINCT not allowed in
procedure call" or similar), otherwise continue handling non-distinct tuples as
before; reference the call_procedure transformer, the function_args/distinct_arg
tuple shape, and the CallClause context when adding the error.
In `@src/graphforge/planner/planner.py`:
- Around line 179-193: The standalone detection is wrong because
_plan_with_query splits query into chunks so _plan_simple_query only sees the
pre-WITH chunk; update the logic that sets is_standalone (used around
clause.procedure_name, clause.yield_star, clause.arguments) to consider the
original full query pipeline rather than just the local chunk: pass a boolean
(e.g., has_following_pipeline or is_terminal_call) from _plan_with_query into
_plan_simple_query (or accept the full clauses list/context) and compute
is_standalone as "no ReturnClause and no following pipeline continuation (e.g.,
no subsequent WithClause/RETURN in the original clauses)" so YIELD * and
implicit args are only allowed when truly terminal. Ensure references to
clause.procedure_name, clause.yield_star, clause.arguments, _plan_with_query and
_plan_simple_query are updated accordingly.
- Around line 203-240: The YIELD handling never records aliases into
seen_aliases nor binds introduced names into the planner scope, so duplicate
aliases pass and yielded names are not added to _type_context; update the loop
over clause.yield_items (where seen_aliases is declared) to add each resolved
alias to seen_aliases after validation, and after appending the ProcedureCall
(for non-standalone calls with clause.yield_items or yield_star handling) add
code to bind each yielded alias into self._type_context._types (or call the
existing type-binding utility) so the planner treats those names as in-scope for
subsequent WITH/RETURN/CALL; keep using ProcedureCall, clause.procedure_name,
clause.yield_items, seen_aliases, and _type_context to locate where to change.
- Around line 292-304: The current check only inspects the top-level expression
in return_clause.items and misses nested Variable uses; update the logic that
uses _proc_unscoped_outputs, Variable and _type_context.has_variable to
recursively walk each return ClauseItem.expression (e.g., via an AST visitor or
a helper like _walk_expression) and apply the same undefined-output check for
every Variable node found (not just the top-level node), raising the same
SyntaxError when a Variable.name is in _proc_unscoped_outputs and not present in
_type_context.has_variable.
In `@src/graphforge/procedures/registry.py`:
- Around line 85-114: The _impl helper currently treats missing args as
"match-all" and uses zip(param_names, args) which allows wrong-arity calls to
succeed; change _impl so that when param_names is non-empty it rejects calls
whose args length doesn't equal len(param_names) (e.g., return an empty list or
otherwise fail the call) instead of returning all static_rows or comparing only
a prefix; update logic around param_names, args, and the zip-based loop in _impl
to enforce exact arity before doing row comparisons against static_rows.
In `@tests/tck/conftest.py`:
- Around line 234-242: The datatable parsing silently truncates rows via
zip(headers, data_row) causing malformed procedure tables to be accepted; update
the parsing in the datatable handling (use the headers variable, datatable loop
and the per-row logic that calls _parse_param_value) to validate that each
data_row has exactly len(headers) cells and reject or raise an explicit error
when lengths differ (rather than zipping), so malformed rows are refused and
only well-formed rows are appended to rows.
- Around line 210-229: The _parse_columns helper is too permissive and lossy:
the top-level proc name regex and the inner column regex only accept bare-word
identifiers, silently drop unsupported column specs, and strip the '?'
nullability marker; update the regexes used in _re.match (the outer match for
proc_def and the inner cm match inside _parse_columns) to accept either quoted
identifiers (e.g., "..." or '...') or unquoted identifiers and to capture an
optional trailing '?' on types instead of stripping it; change _parse_columns to
preserve nullability (either by returning the type string with the '?' preserved
or by returning an explicit nullable flag alongside name and type) and make it
raise/pytest.fail when an input column spec does not match the supported pattern
rather than silently skipping it so misdeclared signatures are surfaced (refer
to function _parse_columns and the surrounding proc_def matching code).
In `@tests/unit/planner/test_call_procedure_planner.py`:
- Around line 17-27: The helper make_registry_with_proc incorrectly treats empty
lists as None by using "params or []" and "outputs or [...]", so zero-parameter
or void procedures get the wrong defaults; update make_registry_with_proc to
check for None explicitly (e.g., use params if params is not None else [] and
outputs if outputs is not None else [ProcedureColumn(...)] ) when constructing
the ProcedureDefinition (refer to make_registry_with_proc, ProcedureDefinition,
parameters, output_columns) so empty lists are preserved by tests.
---
Nitpick comments:
In `@src/graphforge/ast/clause.py`:
- Around line 305-311: The CallClause Pydantic model currently allows invalid
combinations of fields (query, procedure_name, arguments, yield_items,
yield_star); add a `@model_validator`(mode="after") on the CallClause class that
enforces exactly one mode: either subquery form (query is not None,
procedure_name/arguments must be None, yield_star must be False and yield_items
optional) or procedure form (procedure_name is not None, arguments may be
present, query must be None, and yield_star/yield_items must be None); raise a
ValueError with a clear message when neither or both modes are present or when
procedure-only fields are set on a subquery form.
In `@src/graphforge/executor/executor.py`:
- Around line 3252-3255: The zip over defn.parameters and evaluated_args should
use strict=True to ensure the two iterables have identical lengths defensively;
update the loop in the executor where you iterate with for i, (param, py_val) in
enumerate(zip(defn.parameters, evaluated_args)): to call zip(defn.parameters,
evaluated_args, strict=True) so mismatched argument counts raise immediately
(this affects the validation path that calls _check_proc_arg_type with
op.procedure_name and index i).
- Around line 388-393: Remove the redundant local import of _python_to_cypher
inside the parameter handling block; since _python_to_cypher is defined in the
same file (near line 79), delete the line "from graphforge.executor.executor
import _python_to_cypher" and leave the loop that calls _python_to_cypher when
binding parameters to seed_ctx (ExecutionContext and seed_ctx.bind remain
unchanged).
In `@src/graphforge/planner/operators.py`:
- Around line 598-604: The ProcedureCall model allows impossible combos (e.g.,
both yield_star and yield_items, or implicit arguments (arguments is None) on a
non-standalone call); add a Pydantic cross-field validator on ProcedureCall
using `@model_validator`(mode="after") that checks: if yield_star is True and
yield_items is not None (or non-empty) raise a ValueError; if arguments is None
and standalone is False raise a ValueError; return the model instance at the
end. Reference the ProcedureCall class and its fields procedure_name, arguments,
yield_items, yield_star, and standalone when adding the validator.
In `@tests/integration/test_call_procedures.py`:
- Around line 68-70: The test test_standalone_with_yield_star currently only
asserts len(result) == 3 which allows wrong or duplicated rows to pass; update
the test to assert specific result rows and field values returned by
gf.execute("CALL test.labels() YIELD *") (e.g., exact set/order of labels and
any other yielded columns) so the end-to-end behavior is pinned; apply the same
change pattern to the other call-procedure tests referenced (lines covering the
other cases at 91-105 and 113-123) to replace length/key-only checks with
concrete assertions of row contents and expected values.
In `@tests/unit/executor/test_procedure_call_executor.py`:
- Around line 46-143: The tests in
TestStandaloneCallExecution/TestInQueryCallExecution/TestCallWithMatchContext
use GraphForge.execute() which exercises the full stack; replace those usages by
building ProcedureCall operator nodes directly (use the ProcedureCall
AST/operator) and run them against the prepared executor harness (invoke the
ProcedureCall execution path or ProcedureCallExecutor) so the tests isolate the
procedure executor; keep using the helper factories labels_proc(),
lookup_proc(), void_proc() to register procedures and construct the operator
inputs (empty_graph or prepared match rows) and assert on returned rows/columns,
and remove assertions that depend on parsing/planning (i.e., stop calling
GraphForge.execute()).
In `@tests/unit/parser/test_call_procedure_parser.py`:
- Around line 7-11: The module-level mutable parser variable should be replaced
with a pytest fixture that returns a fresh CypherParser for each test; remove
the global parser and change the helper or tests to accept a fixture (e.g., a
pytest fixture named "parser" that returns CypherParser()) so each test uses its
own instance when calling parse, ensuring no shared state from the previous
module-level parser or the existing parse() wrapper.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1933610f-3287-4291-9492-bc84dd99fe70
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock,!**/*.lock
📒 Files selected for processing (16)
src/graphforge/api.pysrc/graphforge/ast/clause.pysrc/graphforge/executor/executor.pysrc/graphforge/parser/cypher.larksrc/graphforge/parser/parser.pysrc/graphforge/planner/operators.pysrc/graphforge/planner/planner.pysrc/graphforge/procedures/__init__.pysrc/graphforge/procedures/registry.pytests/integration/test_call_procedures.pytests/tck/conftest.pytests/unit/executor/test_procedure_call_executor.pytests/unit/parser/test_call_procedure_parser.pytests/unit/planner/test_call_procedure_planner.pytests/unit/procedures/__init__.pytests/unit/procedures/test_registry.py
| self.procedure_registry = ProcedureRegistry() | ||
| self.planner = QueryPlanner(procedure_registry=self.procedure_registry) | ||
| self.optimizer = QueryOptimizer() if enable_optimizer else None | ||
| self.executor = QueryExecutor(self.graph, graphforge=self, planner=self.planner) | ||
| self.executor = QueryExecutor( | ||
| self.graph, | ||
| graphforge=self, | ||
| planner=self.planner, | ||
| procedure_registry=self.procedure_registry, | ||
| ) | ||
|
|
||
| def register_procedure(self, defn: Any) -> None: | ||
| """Register a user-defined procedure for use in CALL statements. | ||
|
|
||
| Args: | ||
| defn: A ProcedureDefinition instance describing the procedure's | ||
| name, parameters, output columns, and implementation callable. | ||
|
|
||
| Example:: | ||
|
|
||
| from graphforge.procedures import ProcedureDefinition, ProcedureColumn | ||
|
|
||
| def my_impl(x): | ||
| return [{"out": x * 2}] | ||
|
|
||
| gf.register_procedure(ProcedureDefinition( | ||
| name="my.double", | ||
| parameters=[ProcedureColumn("x", "INTEGER")], | ||
| output_columns=[ProcedureColumn("out", "INTEGER")], | ||
| implementation=my_impl, | ||
| )) | ||
| gf.execute("CALL my.double(21) YIELD out RETURN out") | ||
| """ | ||
| self.procedure_registry.register(defn) |
There was a problem hiding this comment.
clone() will currently drop user-registered procedures.
This PR makes procedures instance state via self.procedure_registry, but clone() still creates a fresh GraphForge() and only copies graph data. After register_procedure(), the clone can start failing CALL queries that succeed on the source unless the registry is copied over or the non-cloning behavior is documented explicitly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/api.py` around lines 189 - 221, The clone() method currently
creates a fresh GraphForge without copying user-registered procedures, causing
CALLs to fail on clones; update GraphForge.clone to preserve the procedure
registry by either passing the existing self.procedure_registry into the new
GraphForge constructor or assigning/cloning it onto the cloned instance (ensure
the planner/executor on the clone use that same procedure_registry), referencing
GraphForge.clone, self.procedure_registry, ProcedureRegistry and the constructor
usages that initialize planner/executor so registered procedures registered via
register_procedure remain available on the clone.
| | "CALL"i proc_name "(" function_args? ")" yield_clause? -> call_procedure | ||
| | "CALL"i proc_name yield_clause? -> call_procedure_implicit | ||
|
|
||
| // Qualified procedure name: test.doNothing / test.my.proc / ns.sub.name | ||
| proc_name: IDENTIFIER ("." IDENTIFIER)* | ||
|
|
||
| // YIELD clause (procedure results projection) | ||
| // YIELD * — all outputs (standalone only) | ||
| // YIELD col1, col2 AS alias — explicit projection | ||
| yield_clause: "YIELD"i "*" -> yield_star | ||
| | "YIELD"i return_item ("," return_item)* -> yield_items |
There was a problem hiding this comment.
Don't reuse function_args and return_item for procedure CALL syntax.
function_args makes CALL p(*) and CALL p(DISTINCT n) parse, while return_item makes CALL p() YIELD a + 1 AS x parse. Those are outside the supported CALL surface and should be rejected in the grammar instead of later layers. This will also need matching transformer methods in parser.py for the new rule names.
Suggested grammar split
+procedure_args: expression ("," expression)*
+yield_item: IDENTIFIER ("AS"i IDENTIFIER)?
+
call_clause: "CALL"i "{" query "}" -> call_subquery
- | "CALL"i proc_name "(" function_args? ")" yield_clause? -> call_procedure
+ | "CALL"i proc_name "(" procedure_args? ")" yield_clause? -> call_procedure
| "CALL"i proc_name yield_clause? -> call_procedure_implicit
@@
yield_clause: "YIELD"i "*" -> yield_star
- | "YIELD"i return_item ("," return_item)* -> yield_items
+ | "YIELD"i yield_item ("," yield_item)* -> yield_items🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/parser/cypher.lark` around lines 97 - 107, The CALL grammar
currently reuses function_args and return_item which permits unsupported forms
like CALL p(*) or CALL p(DISTINCT n) and CALL p() YIELD a + 1 AS x; replace
those usages with dedicated procedure-specific rules (e.g. proc_call_args and
proc_yield_item or similarly named rules) and update the CALL alternatives
(call_procedure and call_procedure_implicit) to reference the new rules instead
of function_args and return_item, ensuring proc_name, yield_clause, yield_star
and yield_items remain valid; then add matching transformer methods in parser.py
for the new rule names (e.g. transform_proc_call_args,
transform_proc_yield_item) and remove/stop using the old
function_args/return_item transformer handlers so the parser rejects the
unsupported constructs at grammar level.
| m = _re.match( | ||
| r"(?P<name>[\w.]+)\s*\((?P<params>[^)]*)\)\s*::\s*\((?P<outputs>[^)]*)\)", | ||
| proc_def.strip(), | ||
| ) | ||
| if not m: | ||
| pytest.fail(f"Cannot parse procedure definition: {proc_def!r}") | ||
|
|
||
| proc_name = m.group("name") | ||
|
|
||
| def _parse_columns(col_str: str) -> list[tuple[str, str]]: | ||
| """Parse 'name :: TYPE?, ...' into [(name, type), ...].""" | ||
| cols = [] | ||
| for raw in col_str.split(","): | ||
| stripped = raw.strip() | ||
| if not stripped: | ||
| continue | ||
| cm = _re.match(r"(?P<n>\w+)\s*::\s*(?P<t>\w+\??)", stripped) | ||
| if cm: | ||
| cols.append((cm.group("n"), cm.group("t").rstrip("?"))) | ||
| return cols |
There was a problem hiding this comment.
Quoted or non-null TCK signatures are misregistered here.
These regexes only accept bare word identifiers, silently drop unsupported column specs, and strip ? without preserving nullability. A scenario that declares quoted names or non-null columns will register a different procedure contract than the one the parser/executor is supposed to validate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/tck/conftest.py` around lines 210 - 229, The _parse_columns helper is
too permissive and lossy: the top-level proc name regex and the inner column
regex only accept bare-word identifiers, silently drop unsupported column specs,
and strip the '?' nullability marker; update the regexes used in _re.match (the
outer match for proc_def and the inner cm match inside _parse_columns) to accept
either quoted identifiers (e.g., "..." or '...') or unquoted identifiers and to
capture an optional trailing '?' on types instead of stripping it; change
_parse_columns to preserve nullability (either by returning the type string with
the '?' preserved or by returning an explicit nullable flag alongside name and
type) and make it raise/pytest.fail when an input column spec does not match the
supported pattern rather than silently skipping it so misdeclared signatures are
surfaced (refer to function _parse_columns and the surrounding proc_def matching
code).
…190) - Reject empty backtick identifiers in grammar - Unescape doubled backticks in _get_token_value - Use _unwrap_cypher_val (not _cypher_to_python) for procedure args - Fix standalone detection in WITH-segmented queries - Bind YIELD aliases into type_context scope - Walk full expression tree for UndefinedVariable check in RETURN - Reject wrong-arity calls in register_from_tck _impl - Assert row/header length match in TCK procedure table parsing - Use explicit None-check instead of falsy-check for params/outputs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
tests/tck/conftest.py (1)
210-229:⚠️ Potential issue | 🟠 MajorProcedure signature parsing still mutates/loosens declared contracts.
_parse_columnssilently drops unsupported specs and strips?, and the top-level proc-name regex still excludes quoted/backtick identifiers. This can register a different procedure contract than the scenario declares.Proposed fix
- m = _re.match( - r"(?P<name>[\w.]+)\s*\((?P<params>[^)]*)\)\s*::\s*\((?P<outputs>[^)]*)\)", + ident = r"(?:`(?:[^`]|``)+`|[A-Za-z_]\w*)" + m = _re.fullmatch( + rf"(?P<name>{ident}(?:\.{ident})*)\s*\((?P<params>[^)]*)\)\s*::\s*\((?P<outputs>[^)]*)\)", proc_def.strip(), ) @@ - for raw in col_str.split(","): + for raw in col_str.split(","): stripped = raw.strip() if not stripped: continue - cm = _re.match(r"(?P<n>\w+)\s*::\s*(?P<t>\w+\??)", stripped) - if cm: - cols.append((cm.group("n"), cm.group("t").rstrip("?"))) + cm = _re.fullmatch(rf"(?P<n>{ident})\s*::\s*(?P<t>[A-Za-z_]\w*\??)", stripped) + if not cm: + pytest.fail(f"Unsupported procedure column spec: {stripped!r}") + raw_name = cm.group("n") + name = ( + raw_name[1:-1].replace("``", "`") + if raw_name.startswith("`") and raw_name.endswith("`") + else raw_name + ) + cols.append((name, cm.group("t"))) return cols🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/tck/conftest.py` around lines 210 - 229, The current proc signature parser weakens/contracts declarations: the proc_name regex (used when matching into m and extracting proc_name) must be extended to accept quoted/backtick identifiers (not only [\w.]) and the _parse_columns function should not silently drop unsupported specs or strip the optional marker; instead, when _parse_columns sees an unrecognized column pattern it should call pytest.fail (or raise) to surface the parsing error, and it should preserve the optional marker (e.g., return a tuple like (name, type, optional) or keep '?' in the type) rather than rstrip("?"), so update the regex/matching logic and the return shape of _parse_columns accordingly and update callers to handle the preserved optional info.src/graphforge/parser/cypher.lark (1)
97-107:⚠️ Potential issue | 🟠 MajorSplit CALL args and YIELD items into procedure-specific rules.
Reusing
function_argsandreturn_itemstill lets unsupported forms parse here:CALL p(*),CALL p(DISTINCT n), andCALL p() YIELD a + 1 AS x.src/graphforge/parser/parser.py:358-417also still special-casescount_star/distinct_arg, so these invalid shapes are carried past the grammar instead of being rejected at parse time.♻️ Proposed grammar split
call_clause: "CALL"i "{" query "}" -> call_subquery - | "CALL"i proc_name "(" function_args? ")" yield_clause? -> call_procedure + | "CALL"i proc_name "(" procedure_args? ")" yield_clause? -> call_procedure | "CALL"i proc_name yield_clause? -> call_procedure_implicit @@ +procedure_args: expression ("," expression)* + +yield_item: IDENTIFIER ("AS"i IDENTIFIER)? + yield_clause: "YIELD"i "*" -> yield_star - | "YIELD"i return_item ("," return_item)* -> yield_items + | "YIELD"i yield_item ("," yield_item)* -> yield_itemsYou'll also need matching transformer methods in
src/graphforge/parser/parser.pyfor the new rule names.
Based on learnings: Debug parser issues by checkingcypher.larkgrammar syntax and transformer methods inparser.pythat match rule names.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/parser/cypher.lark` around lines 97 - 107, The grammar currently reuses function_args and return_item for CALL which allows invalid shapes like CALL p(*) and CALL p(DISTINCT n) and lets expressions in YIELD (e.g., a + 1 AS x); introduce procedure-specific rules (e.g., procedure_args, procedure_arg (no DISTINCT/count_star), procedure_yield_item) and replace usages in the CALL alternatives (call_procedure, call_procedure_implicit, yield_clause) so only allowed forms parse, and then add/rename corresponding transformer methods in parser.py to handle the new rule names instead of relying on function_args/return_item and the existing special-casing (count_star/distinct_arg).
🧹 Nitpick comments (2)
src/graphforge/planner/planner.py (2)
490-495: Preferenumerate()over.index()for iteration tracking.Using
segments.index(segment)inside the loop is O(n) per iteration and could return the wrong index if duplicate objects exist. While segments are built fresh and unlikely to have duplicates, usingenumerate()is cleaner and more robust.♻️ Suggested refactor
# Plan each segment - for segment in segments: + for seg_idx, segment in enumerate(segments): if isinstance(segment, WithClause): # ... existing WITH handling ... elif isinstance(segment, list): # Determine if more segments follow this one (makes CALL in-query) - seg_idx = segments.index(segment) has_following = seg_idx < len(segments) - 1 operators.extend( self._plan_simple_query(segment, has_following_clauses=has_following) )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 490 - 495, Replace the per-iteration O(n) lookup using segments.index(segment) by iterating with enumerate so the loop has the current index directly; in the loop that calls self._plan_simple_query(segment, has_following_clauses=...), compute has_following as idx < len(segments) - 1 using the enumerate index (e.g., for idx, segment in enumerate(segments)) and pass that to _plan_simple_query to avoid incorrect indices for duplicate segments and improve performance.
1181-1199: Expression walker is functional but relies on generic fallback.The
_collect_variablesmethod handlesBinaryOpandFunctionCallexplicitly but relies on the generichasattr(expr, "__dict__")fallback for other expression types likeUnaryOp,CaseExpression,ListComprehension, andSubscript. This works because Pydantic models expose fields viavars(), but explicitly handling common expression types would make the intent clearer.The current implementation is acceptable given the generic fallback provides coverage.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 1181 - 1199, _collect_variables currently relies on a generic hasattr(expr, "__dict__") fallback for several common expression types; explicitly handle UnaryOp, CaseExpression, ListComprehension, and Subscript to make intent clearer and avoid depending on generic introspection. Update the function (method _collect_variables) to add explicit branches similar to the BinaryOp and FunctionCall handling: for UnaryOp recurse into its operand, for CaseExpression recurse into its test/when/then/else parts, for ListComprehension recurse into the element and generators/ifs, and for Subscript recurse into the value and the slice; keep the generic __dict__ fallback as a last resort. Ensure you reference the types UnaryOp, CaseExpression, ListComprehension, and Subscript in the if/elif chain so those nodes are processed explicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/graphforge/executor/executor.py`:
- Around line 389-393: The code redundantly imports _python_to_cypher from
graphforge.executor.executor inside the parameters handling block; remove the
local import and call the module-level function directly. Update the block that
iterates parameters (parameters, seed_ctx.bind, _python_to_cypher) to drop the
from graphforge.executor.executor import _python_to_cypher line so it uses the
existing _python_to_cypher defined in this file.
In `@src/graphforge/parser/cypher.lark`:
- Line 314: The IDENTIFIER token currently rejects empty quoted names and fails
to preserve doubled-backtick escapes; update the IDENTIFIER token in cypher.lark
so its quoted-identifier branch accepts zero or more occurrences of either a
doubled backtick or any non-backtick character between the outer backticks
(instead of the current `[^`]+`), while keeping the unquoted identifier branch
([a-zA-Z_][a-zA-Z0-9_]*); this lets empty quoted names and preserves ````
escapes (so downstream unquoting logic can collapse doubled backticks).
---
Duplicate comments:
In `@src/graphforge/parser/cypher.lark`:
- Around line 97-107: The grammar currently reuses function_args and return_item
for CALL which allows invalid shapes like CALL p(*) and CALL p(DISTINCT n) and
lets expressions in YIELD (e.g., a + 1 AS x); introduce procedure-specific rules
(e.g., procedure_args, procedure_arg (no DISTINCT/count_star),
procedure_yield_item) and replace usages in the CALL alternatives
(call_procedure, call_procedure_implicit, yield_clause) so only allowed forms
parse, and then add/rename corresponding transformer methods in parser.py to
handle the new rule names instead of relying on function_args/return_item and
the existing special-casing (count_star/distinct_arg).
In `@tests/tck/conftest.py`:
- Around line 210-229: The current proc signature parser weakens/contracts
declarations: the proc_name regex (used when matching into m and extracting
proc_name) must be extended to accept quoted/backtick identifiers (not only
[\w.]) and the _parse_columns function should not silently drop unsupported
specs or strip the optional marker; instead, when _parse_columns sees an
unrecognized column pattern it should call pytest.fail (or raise) to surface the
parsing error, and it should preserve the optional marker (e.g., return a tuple
like (name, type, optional) or keep '?' in the type) rather than rstrip("?"), so
update the regex/matching logic and the return shape of _parse_columns
accordingly and update callers to handle the preserved optional info.
---
Nitpick comments:
In `@src/graphforge/planner/planner.py`:
- Around line 490-495: Replace the per-iteration O(n) lookup using
segments.index(segment) by iterating with enumerate so the loop has the current
index directly; in the loop that calls self._plan_simple_query(segment,
has_following_clauses=...), compute has_following as idx < len(segments) - 1
using the enumerate index (e.g., for idx, segment in enumerate(segments)) and
pass that to _plan_simple_query to avoid incorrect indices for duplicate
segments and improve performance.
- Around line 1181-1199: _collect_variables currently relies on a generic
hasattr(expr, "__dict__") fallback for several common expression types;
explicitly handle UnaryOp, CaseExpression, ListComprehension, and Subscript to
make intent clearer and avoid depending on generic introspection. Update the
function (method _collect_variables) to add explicit branches similar to the
BinaryOp and FunctionCall handling: for UnaryOp recurse into its operand, for
CaseExpression recurse into its test/when/then/else parts, for ListComprehension
recurse into the element and generators/ifs, and for Subscript recurse into the
value and the slice; keep the generic __dict__ fallback as a last resort. Ensure
you reference the types UnaryOp, CaseExpression, ListComprehension, and
Subscript in the if/elif chain so those nodes are processed explicitly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3620c2db-a0fa-4c7e-83b7-eef23ee1e5a5
📒 Files selected for processing (7)
src/graphforge/executor/executor.pysrc/graphforge/parser/cypher.larksrc/graphforge/parser/parser.pysrc/graphforge/planner/planner.pysrc/graphforge/procedures/registry.pytests/tck/conftest.pytests/unit/planner/test_call_procedure_planner.py
✅ Files skipped from review due to trivial changes (1)
- src/graphforge/parser/parser.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/unit/planner/test_call_procedure_planner.py
- src/graphforge/procedures/registry.py
| if parameters: | ||
| from graphforge.executor.executor import _python_to_cypher | ||
|
|
||
| for pname, pval in parameters.items(): | ||
| seed_ctx.bind(f"${pname}", _python_to_cypher(pval)) |
There was a problem hiding this comment.
Redundant self-import.
_python_to_cypher is already defined at module level in this same file, so importing it from graphforge.executor.executor is unnecessary.
🐛 Proposed fix
seed_ctx = ExecutionContext()
if parameters:
- from graphforge.executor.executor import _python_to_cypher
-
for pname, pval in parameters.items():
seed_ctx.bind(f"${pname}", _python_to_cypher(pval))
rows: list[Any] = [seed_ctx]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/executor.py` around lines 389 - 393, The code
redundantly imports _python_to_cypher from graphforge.executor.executor inside
the parameters handling block; remove the local import and call the module-level
function directly. Update the block that iterates parameters (parameters,
seed_ctx.bind, _python_to_cypher) to drop the from graphforge.executor.executor
import _python_to_cypher line so it uses the existing _python_to_cypher defined
in this file.
| FALSE: /false/i | ||
| NULL: /null/i | ||
| IDENTIFIER: /[a-zA-Z_][a-zA-Z0-9_]*/ | ||
| IDENTIFIER: /`[^`]+`|[a-zA-Z_][a-zA-Z0-9_]*/ |
There was a problem hiding this comment.
Preserve doubled-backtick escapes in quoted identifiers.
This token now rejects empty quoted names, but it also rejects valid Cypher identifiers like `ab` ``. The lexer will fail before the parser's unquoting logic gets a chance to collapse doubled backticks.
🐛 Minimal token fix
-IDENTIFIER: /`[^`]+`|[a-zA-Z_][a-zA-Z0-9_]*/
+IDENTIFIER: /`(?:[^`]|``)+`|[a-zA-Z_][a-zA-Z0-9_]*/Based on learnings: Grammar rules must be defined in src/graphforge/parser/cypher.lark using EBNF syntax.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| IDENTIFIER: /`[^`]+`|[a-zA-Z_][a-zA-Z0-9_]*/ | |
| IDENTIFIER: /`(?:[^`]|``)+`|[a-zA-Z_][a-zA-Z0-9_]*/ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/parser/cypher.lark` at line 314, The IDENTIFIER token
currently rejects empty quoted names and fails to preserve doubled-backtick
escapes; update the IDENTIFIER token in cypher.lark so its quoted-identifier
branch accepts zero or more occurrences of either a doubled backtick or any
non-backtick character between the outer backticks (instead of the current
`[^`]+`), while keeping the unquoted identifier branch ([a-zA-Z_][a-zA-Z0-9_]*);
this lets empty quoted names and preserves ```` escapes (so downstream unquoting
logic can collapse doubled backticks).
* ci: re-apply Python 3.14 promotion to required (reverted by linter) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix implementation status inaccuracies for functions (#249) - functions.md: mark extract(), filter(), reduce() as COMPLETE (all three are implemented as dedicated grammar rules, not missing) - functions.md: mark all(), any(), none(), single(), exists(), isEmpty() as COMPLETE (were incorrectly listed as NOT_IMPLEMENTED) - functions.md: add missing numeric functions: e(), pi(), exp(), log(), log10(), trig (sin/cos/tan/cot/asin/acos/atan/atan2), degrees(), radians() — total numeric count corrected from 10 to 19 - functions.md: add startNode(), endNode() to Scalar section - functions.md: remove stale line-number references (changed in Phases 5-9); update summary stats and version history to v0.3.8 - functions.md: update Limitations and Priority sections to remove now-fixed items; add Known Gaps table with open issues - opencypher-compatibility-matrix.md: correct all the same function statuses; flip XOR, ^, list slicing, negative indexing to ✅; update CALL procedures to ✅ (v0.3.7 #351); update TCK metrics to 3,801/3,885 (97.8%); replace stale v0.3.1-v0.3.7 roadmap with current v0.3.9/v0.4.0 roadmap; update version history Closes #249 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* ci: re-apply Python 3.14 promotion to required (reverted by linter) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix implementation status inaccuracies for functions (#249) - functions.md: mark extract(), filter(), reduce() as COMPLETE (all three are implemented as dedicated grammar rules, not missing) - functions.md: mark all(), any(), none(), single(), exists(), isEmpty() as COMPLETE (were incorrectly listed as NOT_IMPLEMENTED) - functions.md: add missing numeric functions: e(), pi(), exp(), log(), log10(), trig (sin/cos/tan/cot/asin/acos/atan/atan2), degrees(), radians() — total numeric count corrected from 10 to 19 - functions.md: add startNode(), endNode() to Scalar section - functions.md: remove stale line-number references (changed in Phases 5-9); update summary stats and version history to v0.3.8 - functions.md: update Limitations and Priority sections to remove now-fixed items; add Known Gaps table with open issues - opencypher-compatibility-matrix.md: correct all the same function statuses; flip XOR, ^, list slicing, negative indexing to ✅; update CALL procedures to ✅ (v0.3.7 #351); update TCK metrics to 3,801/3,885 (97.8%); replace stale v0.3.1-v0.3.7 roadmap with current v0.3.9/v0.4.0 roadmap; update version history Closes #249 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: EXISTS subquery support — simple, full, nested, and validation (#385) - Grammar: add `exists_simple` alternative to `exists_expr` for bare pattern form `EXISTS { pattern [WHERE expr] }` alongside existing full form `EXISTS { query }` - AST: extend `SubqueryExpression` with optional `pattern_parts` and `where_predicate` fields for the simple form (query=None when unused) - Parser: add `exists_full` and `exists_simple` transformer methods; update `count_expr` and legacy `exists_expr` to pass all optional fields explicitly - Planner: add `outer_scope_vars` parameter to `plan()` so correlated subquery variables from the outer MATCH are pre-declared, preventing false UndefinedVariable errors; add `_validate_exists_subqueries` / `_validate_no_write_in_exists` to raise SyntaxError(InvalidClauseComposition) when write clauses (SET, REMOVE, CREATE, DELETE, MERGE) appear in EXISTS bodies - Evaluator: route simple form to `_evaluate_simple_exists` which walks pattern parts against the graph, binds new variables per match, and evaluates the WHERE predicate; full form passes outer scope bindings when planning the subquery Fixes all 10 TCK existential-subquery scenarios (ExistentialSubquery1-3), including nested EXISTS and the must-fail update-in-EXISTS validation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review issues on EXISTS subquery (#385) evaluator.py — _evaluate_simple_exists: - Carry _cur sentinel key in each binding dict so anonymous intermediate node patterns (no variable name) advance traversal correctly; previously an anonymous src caused a silent skip via the `elif src_v: continue` branch - Add src label validation for the bound-variable seed path (mirrors the label check already present for unbound scans) - Implement var-length hop expansion inside _extend_hop: when rel_pattern has min_hops/max_hops set, delegate to _var_length_reachable and apply dst label/equality checks to each reachable node planner.py — _validate_exists_subqueries / _validate_no_write_in_exists: - Pass clause.where.predicate (not clause.where) to avoid a WhereClause object reaching _validate_no_write_in_exists with no matching isinstance - Iterate WithClause.items and recurse into each item.expression so EXISTS subqueries embedded in WITH projections are validated - Recurse into SubqueryExpression.where_predicate in _validate_no_write_in_exists so write clauses nested inside a simple-form WHERE are caught Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: update language to Full OpenCypher* and fix flaky network test in CI - Update comparison tables in README and docs from "openCypher subset" to "Full OpenCypher*" with footnote: 3,865/3,885 TCK passing, 20 permanent xfails due to CPython integer/float precision limitations - Add @pytest.mark.snap to TestSmallDatasetLoading and TestComplexQueries so their live network downloads are skipped in CI - Exclude snap-marked tests in CI integration and coverage shard runs via -m "not snap" (snap marker was already documented as CI-skipped in pyproject.toml but not enforced in the workflow) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes #190
Summary
Implements the CALL clause for procedure invocation across all 4 layers of the GraphForge architecture.
CALL proc.name(args) [YIELD items],CALL proc.name(implicit, standalone only),YIELD *andYIELD item AS alias; backtick-quoted identifiersCallClausewith procedure fields; transformer methods for all new grammar rulesProcedureColumn,ProcedureDefinition,ProcedureRegistrywithregister_from_tckfor static-row TCK proceduresProcedureCalloperator; compile-time validation ofYIELD *in-query, implicit args in-query, aggregation in args,VariableAlreadyBound,UndefinedVariablefor unYIELDed outputsTest plan
test_call_procedures.py)🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests