[CALCITE-6242] Enhance lambda closure #4926

Open
cjj2010 wants to merge 1 commit into apache:main from cjj2010:CALCITE-6242

Conversation

Contributor

@cjj2010 cjj2010 commented May 8, 2026

.type("RecordType(INTEGER NOT NULL EXPR$0) NOT NULL");
s.withSql("select HIGHER_ORDER_FUNCTION2(1, () -> 0.1)")
.type("RecordType(INTEGER NOT NULL EXPR$0) NOT NULL");
s.withSql("select HIGHER_ORDER_FUNCTION(1, (x, y) -> x + 1 + ^emp.deptno^) from emp")
Member

Please add the Jira reference to the test.

Contributor Author

done


!ok

select *
Member

also here.


@Override public R visitLambda(RexLambda lambda, P arg) {
return null;
if (!deep) {
Member

I think you could add some comments here to explain the reason.

Contributor Author

done, thanks

@cjj2010 changed the title from: [CALCITE-6242] The "exists" library function throws a "param not found" error when a column is used in lambda evaluation logic. To: [CALCITE-6242] Enhance lambda closure parsing. May 9, 2026
Member

@xuzifu666 left a comment

LGTM

@xuzifu666
Member

There are some new discussions in Jira that you should keep up with.

@xuzifu666 added the discussion-in-jira label (There's open discussion in JIRA to be resolved before proceeding with the PR) on May 11, 2026
@cjj2010 changed the title from: [CALCITE-6242] Enhance lambda closure parsing. To: [CALCITE-6242] Enhance lambda closure. May 12, 2026
Contributor

@mihaibudiu left a comment

What happens to nested lambdas? Are they legal? Will they validate?
x -> (y -> x + y)
For the inner closure "x" should not be treated like a table column name.

Contributor Author

cjj2010 commented May 18, 2026

Thank you for the suggestion. Nested lambdas like x -> (y -> x + y) are currently parsed correctly. I have added more test cases to verify this. @mihaibudiu
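
To illustrate the behavior being asked about, here is a minimal sketch of scope-chain name resolution for nested lambdas. It is a toy model, not Calcite's validator: the `LambdaScopeSketch` and `Scope` names and the scope labels are hypothetical, and it only shows the idea that an inner lambda looks up a name in its own parameters first, then in the enclosing lambda, and only then in the table.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Toy model of lexical scoping for nested lambdas; not Calcite's validator classes. */
final class LambdaScopeSketch {

  /** One scope: the names it declares plus a link to the enclosing scope. */
  static final class Scope {
    final String label;
    final Scope parent; // null for the outermost scope
    final Set<String> declared;

    Scope(String label, Scope parent, String... names) {
      this.label = label;
      this.parent = parent;
      this.declared = new HashSet<>(Arrays.asList(names));
    }

    /** Walks outward and returns the label of the nearest scope declaring the name. */
    Optional<String> resolve(String name) {
      for (Scope s = this; s != null; s = s.parent) {
        if (s.declared.contains(name)) {
          return Optional.of(s.label);
        }
      }
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // Models x -> (y -> x + y) evaluated over a table with a deptno column.
    Scope table = new Scope("table emp", null, "deptno");
    Scope outer = new Scope("outer lambda", table, "x");
    Scope inner = new Scope("inner lambda", outer, "y");

    System.out.println(inner.resolve("y").orElse("unresolved"));      // inner lambda
    System.out.println(inner.resolve("x").orElse("unresolved"));      // outer lambda
    System.out.println(inner.resolve("deptno").orElse("unresolved")); // table emp
    System.out.println(inner.resolve("z").orElse("unresolved"));      // unresolved
  }
}
```

In this model, "x" inside the inner closure resolves to the outer lambda's parameter before any table column is considered, which is the property the review comment asks for.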

Contributor

@mihaibudiu left a comment

This is very nice. I left a couple more requests to check some tricky corner cases.

checkArgument(parameterTypes.containsKey(columnName),
"column %s not found", columnName);
return parameterTypes.get(columnName);
if (parameterTypes.containsKey(columnName)) {
Contributor

Do you know what happens to case sensitivity?
I suspect parameter names are subject to the same rules as other identifiers, including quoting.
Can you please write some tests using uppercase/lowercase parameter names and also using quoted names?

Contributor Author

Thank you for the reminder. The case-sensitivity handling here was indeed not rigorous enough. I have fixed it and added test cases. I am very grateful for your thorough review of this PR.
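
As a rough illustration of the case-sensitivity question, the sketch below normalizes parameter names before looking them up. It assumes one common SQL convention (unquoted identifiers fold to upper case, double-quoted identifiers keep their case); the actual behavior in Calcite depends on the configured lexical policy, and `ParamNameSketch`, `normalize`, and `typeOf` are hypothetical names, not part of the PR.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy lookup of lambda parameter types; the folding rule is an assumption. */
final class ParamNameSketch {

  /** Strips double quotes and keeps case, or folds an unquoted name to upper case. */
  static String normalize(String identifier) {
    boolean quoted = identifier.length() >= 2
        && identifier.startsWith("\"")
        && identifier.endsWith("\"");
    if (quoted) {
      return identifier.substring(1, identifier.length() - 1);
    }
    return identifier.toUpperCase(Locale.ROOT);
  }

  /** Returns the declared type of a referenced parameter, or null if it is not one. */
  static String typeOf(Map<String, String> parameterTypes, String reference) {
    return parameterTypes.get(normalize(reference));
  }

  public static void main(String[] args) {
    Map<String, String> parameterTypes = new HashMap<>();
    parameterTypes.put(normalize("x"), "INTEGER");     // declared unquoted, stored as X
    parameterTypes.put(normalize("\"y\""), "VARCHAR"); // declared quoted, stored as y

    System.out.println(typeOf(parameterTypes, "X"));     // INTEGER
    System.out.println(typeOf(parameterTypes, "\"x\"")); // null
    System.out.println(typeOf(parameterTypes, "Y"));     // null
    System.out.println(typeOf(parameterTypes, "\"y\"")); // VARCHAR
  }
}
```

Under this assumed rule, a reference that is not found among the parameters (a null result here) would fall through to ordinary column resolution, which is why tests with mixed-case and quoted parameter names are worth having.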

// the inner lambda can resolve references to outer lambda parameters.
// e.g., in x -> EXISTS(arr, y -> x + y = 4), the inner lambda's blackboard
// needs access to "X" from the outer lambda's nameToNodeMap.
if (bb.nameToNodeMap != null) {
Contributor

Speaking of this, I suspect we should reject lambdas that use the same parameter name twice, e.g. x -> x -> x + 1. Please write a test for that too.

Contributor Author

I completely agree. I have added new test cases for this.

@cjj2010 force-pushed the CALCITE-6242 branch 2 times, most recently from 4ef11b6 to 2056190, on May 19, 2026 02:54
// nested lambdas can resolve outer lambda parameters via the
// scope chain, rather than treating them as table column names.
registerOperandSubQueries(
i == 1 ? lambdaScope : parentScope, call, i);
Contributor

@dssysolyatin May 19, 2026

Instead of the magic i == 1 inside the for loop, can we just write the following? (We expect the shape of SqlLambda anyway.)

registerOperandSubQueries(lambdaScope, call.getExpression(), i)

Also, I don't quite get why we need registerOperandSubQueries for the parameters (operand 0). Do we need this loop at all?

Contributor

Oh, sorry. We cannot write it like that, because registerOperandSubQueries accepts an ordinal as its parameter. But the question about the loop is still valid.

Contributor Author

I agree, but for a lambda, flattening the loop into two explicit calls would be more intuitive: operand 0 is the parameter list, and operand 1 is the lambda body.

Contributor

@dssysolyatin May 19, 2026

Why do we need two lines? Do we actually need registerOperandSubQueries for the parameters? Is it even possible for a parameter to be a sub-query?

I think this could be simplified from:

for (int i = 0; i < operands.size(); i++) {
  registerOperandSubQueries(
      i == 1 ? lambdaScope : parentScope, call, i);
}

to just:

// Named variable to make the code easier to understand
int expressionOperand = 1;
registerOperandSubQueries(lambdaScope, call, expressionOperand);

?

Contributor Author

Like this:
// Register the parameter list under the parent scope
// (parameters are visible to the lambda body, not the other way around).
registerOperandSubQueries(parentScope, call, 0);
// Register the expression body under the lambda scope so that
// nested lambdas can resolve outer lambda parameters via the
// scope chain, rather than treating them as table column names.
registerOperandSubQueries(lambdaScope, call, 1);

Contributor

@dssysolyatin May 20, 2026

Why do you need to call registerOperandSubQueries for the parameters?
registerOperandSubQueries(parentScope, call, 0);

registerOperandSubQueries does:

Registers any sub-queries inside a given call operand, and converts the
operand to a scalar sub-query if the operator requires it.

SqlLambda.parameters is a list of SqlIdentifier. Can it ever be, or contain, a sub-query? I hope not.
So registerOperandSubQueries(parentScope, call, 0) should do nothing, and it should be safe to drop it.

Contributor Author

Good catch, you're right.
SqlLambda.parameters is a SqlNodeList whose elements are always SqlIdentifier (the parameter names). Tracing through registerOperandSubQueries → registerSubQueries:
parameters.getKind() is not QUERY, so the SCALAR_QUERY wrapping branch is skipped.
In the SqlNodeList branch, each SqlIdentifier is neither a QUERY nor a SqlCall/SqlNodeList, so it falls into the atomic-node branch and is ignored.
So registerOperandSubQueries(parentScope, call, 0) is effectively a no-op on a lambda's parameter list, and dropping it is safe. I have updated the code.
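
The trace above can be mimicked with a toy tree walk. This is a sketch only: `SubQueryWalkSketch` and its node classes are hypothetical stand-ins for SqlNode, SqlCall, SqlNodeList and SqlIdentifier, and the walk merely mirrors the branch structure described in the comment (queries are registered, composite nodes are recursed into, atomic nodes are ignored).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy sub-query walk over a tiny AST; not Calcite's registerSubQueries. */
final class SubQueryWalkSketch {

  interface Node {}

  /** Atomic node, e.g. a lambda parameter name. */
  static final class Identifier implements Node {
    final String name;
    Identifier(String name) { this.name = name; }
  }

  /** A sub-query node. */
  static final class Query implements Node {
    final String sql;
    Query(String sql) { this.sql = sql; }
  }

  /** Stands in for both a call and a node list: a node with children. */
  static final class Composite implements Node {
    final List<Node> children;
    Composite(Node... children) { this.children = Arrays.asList(children); }
  }

  /** Returns every Query reachable from the given node. */
  static List<Query> collectSubQueries(Node node) {
    List<Query> found = new ArrayList<>();
    collect(node, found);
    return found;
  }

  private static void collect(Node node, List<Query> found) {
    if (node instanceof Query) {
      found.add((Query) node);
    } else if (node instanceof Composite) {
      for (Node child : ((Composite) node).children) {
        collect(child, found);
      }
    }
    // Atomic nodes such as Identifier fall through: nothing to register.
  }

  public static void main(String[] args) {
    Node parameters = new Composite(new Identifier("x"), new Identifier("y"));
    Node body = new Composite(new Identifier("x"), new Query("select 1"));

    System.out.println(collectSubQueries(parameters).size()); // 0
    System.out.println(collectSubQueries(body).size());       // 1
  }
}
```

A parameter list made only of identifiers contributes nothing to the walk, which matches the conclusion that the call on operand 0 can be dropped.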

throw newValidationError(params.get(j),
RESOURCE.duplicateLambdaParameter(otherName));
}
}
Contributor

@dssysolyatin May 19, 2026

Can we write something like:

final Set<String> seen = catalogReader.nameMatcher().createSet();
for (SqlNode param : lambdaExpr.getParameters()) {
  final String name = ((SqlIdentifier) param).getSimple();
  if (!seen.add(name)) {
    throw newValidationError(param, RESOURCE.duplicateLambdaParameter(name));
  }
}

Contributor Author

Done, thanks.
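
For readers without a Calcite checkout, the single-pass "seen set" idea from the suggestion can be tried with the standard library alone. In this sketch a `TreeSet` with `String.CASE_INSENSITIVE_ORDER` stands in for the set returned by the name matcher in the reviewer's snippet; that substitution, and the `DuplicateParamSketch` and `firstDuplicate` names, are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Toy duplicate-parameter check; the TreeSet is a stand-in for a name-matcher set. */
final class DuplicateParamSketch {

  /** Returns the first parameter name that repeats an earlier one, if any. */
  static Optional<String> firstDuplicate(List<String> params, boolean caseSensitive) {
    Set<String> seen = caseSensitive
        ? new TreeSet<String>()
        : new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
    for (String name : params) {
      // Set.add returns false when an equal element is already present.
      if (!seen.add(name)) {
        return Optional.of(name);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    System.out.println(firstDuplicate(Arrays.asList("x", "y"), true));  // Optional.empty
    System.out.println(firstDuplicate(Arrays.asList("x", "x"), true));  // Optional[x]
    System.out.println(firstDuplicate(Arrays.asList("x", "X"), true));  // Optional.empty
    System.out.println(firstDuplicate(Arrays.asList("x", "X"), false)); // Optional[X]
  }
}
```

Compared with a nested loop over parameter pairs, the set keeps the check to one pass and puts the case-sensitivity decision in a single place.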

Contributor

@mihaibudiu left a comment

If @dssysolyatin does not have other questions, let's merge this.

@mihaibudiu
Contributor

I don't think Calcite has a "call" expression, which would allow you to call a closure: (x -> x + 1)(3). That would make such closures even more useful. Maybe we should file an issue about it?

@mihaibudiu added the LGTM-will-merge-soon label (Overall PR looks OK. Only minor things left.) on May 19, 2026
@dssysolyatin
Contributor

Just one last question: #4926 (comment). Feel free to merge after that.

@cjj2010 force-pushed the CALCITE-6242 branch 2 times, most recently from 4b33f31 to 5a993bf, on May 20, 2026 04:56

Labels

discussion-in-jira: There's open discussion in JIRA to be resolved before proceeding with the PR
LGTM-will-merge-soon: Overall PR looks OK. Only minor things left.
