Skip to content

Conversation

@beinan
Copy link
Collaborator

@beinan beinan commented Jan 28, 2026

Fixes #107.

  • Fail fast on unknown/unimplemented Cypher scalar functions (no more silent 0 fallback)
  • Enforce semantic errors during planning/execution (DataFusion + Simple)
  • Add alias tracking + post-WITH MATCH variable discovery in semantic analysis
  • Implement tolower/lower + toupper/upper in simple executor

Tests: cargo test

@beinan beinan requested a review from ChunxuTang January 28, 2026 22:06
@codecov-commenter
Copy link

codecov-commenter commented Jan 28, 2026

Codecov Report

❌ Patch coverage is 88.39779% with 21 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/lance-graph/src/semantic.rs 82.05% 14 Missing ⚠️
crates/lance-graph/src/simple_executor/expr.rs 87.50% 7 Missing ⚠️

📢 Thoughts on this report? Let us know!

}

#[tokio::test]
async fn test_tolower_works_in_simple_executor() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beinan This file is mainly for tests of datafusion-planner. For tests of the simple executor, maybe move them to another file?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — moved the Simple executor test out of tests/test_datafusion_pipeline.rs into tests/test_simple_executor_pipeline.rs (and added a couple of small unit tests in simple_executor/expr.rs to cover the new branches). Thanks for the suggestion.

VE::Variable(v) => col(v),
VE::Literal(v) => to_df_literal(v),
VE::Function { .. } | VE::Arithmetic { .. } => lit(0),
VE::Function { name, args } => match name.to_lowercase().as_str() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I attempted this with Codex and I got a similar solution, but I was wondering if this was a sufficiently good solution, because tolower() and lower() are technically a subset of string scalar functions in Cypher, and there are potentially many more useful string functions that could be implemented in the future by other contributors.

Is there any particular reason why these would get special treatment in the simple executor? I also didn't fully understand why the planner would pick the simple executor in the case where this scalar Cypher function is called, bypassing the DataFusion planner entirely (where to_lowercase() is implemented. Curious if you had an explanation for that. Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also didn't fully understand why the planner would pick the simple executor in the case where this scalar Cypher function is called, bypassing the DataFusion planner entirely (where to_lowercase() is implemented

@prrao87 Are you referring to the issue #106?
I couldn't reproduce your error. The lower() example you provided works on my side. I guess the issue had already been fixed in the current codebase (but may not be in v0.4.0 if you directly used that)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, you're right. I am using the version 0.4.0 on PyPI, so that explains why it didn't work. Is there a plan to release 0.5.0 sometime soon with that and the WITH clause support? Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a good time to release another version, maybe later this week or this weekend. Cc @beinan (in case you want a later date)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: special-casing in the Simple executor — it’s a legacy/limited path that still gets used when callers explicitly choose (and historically it had the fallback). The goal here was to eliminate silent wrong results; adding was the smallest change to make common string-casing queries behave sensibly under Simple as well. Longer-term it probably makes sense to either (a) expand Simple’s function mapping in a more systematic way or (b) make Simple strictly reject anything beyond the minimal subset (so users are pushed to the DataFusion planner).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, let's release 0.4.1 once this PR merged

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beinan Thanks for the clarification. I would be inclined for the option (b) make Simple strictly reject anything beyond the minimal subset (so users are pushed to the DataFusion planner). In my opinion, the simple executor has performance benefits for some basic simple queries, such as property filtering. So a minimal subset of feature support would be good enough (complex queries should go to the datafusion planner)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree — I think option (b) is the right long-term shape for ExecutionStrategy::Simple: keep it fast for the minimal subset, and proactively error on anything outside that subset to steer users to the DataFusion planner.

For this PR I kept the change scoped to eliminating silent wrong results (and added a tiny bit of string-casing support), but happy to follow up with tightening the Simple feature surface if you want.

@beinan beinan force-pushed the fix/cypher-unsupported-functions-error branch from 705d693 to dd2fc88 Compare January 29, 2026 07:06
Copy link
Collaborator

@ChunxuTang ChunxuTang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beinan LGTM. Just one minor comment for the UNWIND tests.

// ============================================================================

#[tokio::test]
async fn test_unwind_simple_list() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you double-check whether you unintentionally removed the unwind tests? I guess it may be caused by merge conflicts.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — you’re right, those tests got dropped during the rebase/conflict resolution. I restored the entire section from upstream into and pushed (commit ).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — you’re right, those UNWIND tests got dropped during the rebase/conflict resolution.

I restored the entire // UNWIND Tests section from upstream into crates/lance-graph/tests/test_datafusion_pipeline.rs and pushed (commit 136dde8).

@beinan beinan merged commit 7586012 into lance-format:main Jan 29, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cypher scalar functions: unimplemented functions silently return 0 instead of erroring

5 participants