fix(cypher): grammar-resolve where_predicates vs expr ambiguity (#1194)#1203
Closed
Review rerun (wave 3/4 equivalent) found no blocking issues, but one suggestion-level hardening item:
The current implementation is functionally sound for the scope of #1194; this is mainly a robustness/observability improvement for future maintenance.
8 tasks
Closing in favor of #1209, which addresses #1194 via the cleaner approach now that #1200 slice 1 (#1202) has landed. What changed since this PR was opened:
Why this PR is superseded:
Replacement (#1209):
The branch ( |
lmeyerov added a commit that referenced this pull request on Apr 25, 2026
… (#1209)

Replace the text-level `split_top_level_and` + regex loop in `generic_where_clause` with a structural walker over Lark's parsed `BooleanExpr` (and `_ExpressionSlice` for the single-atom path).

Closes #1194 in the spirit of the issue: the grammar already declares the structured rule we want, and slice 1 (#1202) already exposes the parsed tree on `WhereClause.expr_tree`; `generic_where_clause` should trust that source of truth instead of re-splitting the WHERE body on top-level AND.

Adds two helpers:
- `_match_bare_label_atom(text)`: fullmatch atom text against `_BARE_LABEL_PREDICATE_RE` (preserves the #1125 false-positive guard).
- `_lift_label_only_and_spine(node)`: DFS over a `BooleanExpr` AND-spine; returns lifted `(alias, labels)` tuples iff every leaf is a bare-label atom, else `None` (all-or-nothing: mixed/OR/XOR/NOT fall through).

Behaviorally identical to master across the WHERE-shape matrix: single label, multi-AND chains, mixed label+property, OR/XOR/NOT, parenthesized boolean trees, and string-literal false-positive guards all produce the same `WhereClause` shape. Verified by parser, binder, slice-1 producer, slice-2 conformance + binder_expr_tree, and lowering test suites (1528 passed, no regressions). Targeted mypy clean.

Adds 4 focused unit tests for the new helpers in `test_parser.py` and updates the stale comment in `test_parse_where_triple_and_label_conjunction_through_generic_where_clause` to reference the walker (was: `split_top_level_and`).

`expr_split.py` and the regex are unchanged; both still have other callers (where-pattern canonicalization, lowering), and their retirement is deferred to later slices of #1200.

Supersedes the Earley-sub-parser approach in PR #1203, which would have introduced a second Lark parser path; the structural walker is cleaner now that slice 1 has landed on master.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
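The all-or-nothing lift described in the commit message can be illustrated with a small self-contained sketch. This is not the code from #1209: `Atom`, `BoolNode`, `_BARE_LABEL_RE`, and both function names are hypothetical stand-ins for `_ExpressionSlice`, `BooleanExpr`, `_BARE_LABEL_PREDICATE_RE`, and the two helpers, whose real definitions are not shown on this page. It only demonstrates the walk: descend through AND nodes, require every leaf to be a bare-label atom, and return `None` as soon as anything else appears.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

# Hypothetical bare-label pattern: `alias:Label` or `alias:Label:Label...`.
# The real `_BARE_LABEL_PREDICATE_RE` may differ.
_BARE_LABEL_RE = re.compile(r"\s*([A-Za-z_]\w*)((?::[A-Za-z_]\w*)+)\s*")


@dataclass(frozen=True)
class Atom:
    """Hypothetical leaf: the source text of a single predicate."""
    text: str


@dataclass(frozen=True)
class BoolNode:
    """Hypothetical boolean node; op is one of "and", "or", "xor", "not"."""
    op: str
    children: Tuple["Node", ...]


Node = Union[Atom, BoolNode]
Lifted = Tuple[str, Tuple[str, ...]]  # (alias, labels)


def match_bare_label_atom(text: str) -> Optional[Lifted]:
    # fullmatch, not search: a colon inside a string literal or a longer
    # expression must not be mistaken for a label predicate.
    m = _BARE_LABEL_RE.fullmatch(text)
    if m is None:
        return None
    return m.group(1), tuple(m.group(2).split(":")[1:])


def lift_label_only_and_spine(node: Node) -> Optional[List[Lifted]]:
    # Leaf: either a bare-label atom or a reason to give up.
    if isinstance(node, Atom):
        hit = match_bare_label_atom(node.text)
        return None if hit is None else [hit]
    # Any non-AND operator (OR/XOR/NOT) falls through to the generic path.
    if node.op != "and":
        return None
    lifted: List[Lifted] = []
    for child in node.children:
        sub = lift_label_only_and_spine(child)
        if sub is None:  # all-or-nothing: one bad leaf rejects the spine
            return None
        lifted.extend(sub)
    return lifted
```

Under these assumptions, a tree for `a:Person AND b:Org:Company` lifts to two `(alias, labels)` tuples, while `a:Person AND a.age > 3` returns `None` and would be handled as an ordinary expression.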
Summary
- `generic_where_clause` with grammar-driven reparsing of ambiguous `WHERE` clauses
- `lalr` (to avoid query-shape regressions), and adds a targeted `earley` `where_clause` subparser for ambiguous label-predicate cases

Why this shape
- `earley` fixed ambiguity but regressed clause attachment (for example `WITH ... WHERE ...`)
- `where_clause` reparsing preserves existing global parse behavior while removing the regex fallback and still using grammar-level disambiguation

Validation
- `PYTHONPATH=. uv run --no-project --with lark --with pytest python -m pytest graphistry/tests/compute/gfql/cypher/test_parser.py -q`
- `PYTHONPATH=. uv run --no-project --with lark --with pytest python -m pytest graphistry/tests/compute/gfql/cypher/test_binder.py -q`
- `PYTHONPATH=. uv run --no-project --with lark --with pytest python -m pytest graphistry/tests/compute/gfql/cypher/test_lowering.py -q`
- `PYTHONPATH=. uv run --no-project --with ruff ruff check graphistry/compute/gfql/cypher/parser.py graphistry/compute/gfql/frontends/cypher/binder.py graphistry/tests/compute/gfql/cypher/test_parser.py graphistry/tests/compute/gfql/cypher/test_binder.py`