feat: implement correct logic for nested lambdas and more complex lambda expressions #7056

Merged (3 commits) on Feb 26, 2021

Conversation

@stevenpyzhang stevenpyzhang commented Feb 20, 2021

Description

When processing a function call, we only want to use the context with the populated input type list for LambdaFunctionCall arguments, since those are the only arguments that need the input type list to map their parameters. Every other FunctionCall argument expression should receive the original context that we arrived at the node with.

We also need to pass copies of the contexts, so that mapping input types to lambda arguments in one subtree does not corrupt the context seen by other subtrees.
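
To illustrate why the copies matter, here is a minimal sketch. `SketchTypeContext`, `mapArg`, and `typeOf` are hypothetical stand-ins, not the real ksqlDB classes; only `getCopy()` is named in this PR. The sketch assumes that binding an already-bound lambda parameter name to a different type is an error, so two sibling lambdas that both use `x` conflict when they share one context but not when each works on its own copy.

```java
import java.util.HashMap;
import java.util.Map;

final class ContextCopyDemo {
  // Binds "x" in two sibling lambdas; returns true if neither binding conflicts.
  static boolean mapSiblings(final SketchTypeContext parent, final boolean copyPerChild) {
    try {
      final SketchTypeContext first = copyPerChild ? parent.getCopy() : parent;
      first.mapArg("x", "STRING");
      final SketchTypeContext second = copyPerChild ? parent.getCopy() : parent;
      second.mapArg("x", "INTEGER");
      return true;
    } catch (final IllegalStateException e) {
      return false;
    }
  }

  public static void main(final String[] args) {
    System.out.println(mapSiblings(new SketchTypeContext(), false)); // prints false
    System.out.println(mapSiblings(new SketchTypeContext(), true));  // prints true
  }
}

// Hypothetical stand-in for TypeContext (assumption: conflicting rebinding throws).
final class SketchTypeContext {
  private final Map<String, String> lambdaArgTypes = new HashMap<>();

  // Returns an independent context holding the same bindings.
  SketchTypeContext getCopy() {
    final SketchTypeContext copy = new SketchTypeContext();
    copy.lambdaArgTypes.putAll(lambdaArgTypes);
    return copy;
  }

  // Binds a lambda parameter name to a type name, rejecting a conflicting rebinding.
  void mapArg(final String name, final String type) {
    final String previous = lambdaArgTypes.putIfAbsent(name, type);
    if (previous != null && !previous.equals(type)) {
      throw new IllegalStateException(name + " already mapped to " + previous);
    }
  }

  String typeOf(final String name) {
    return lambdaArgTypes.get(name);
  }
}
```

With a copy per child, the parent context is also left untouched, which is what lets the non-lambda arguments keep seeing the context the node was reached with.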

Testing done

Unit tests so far, QTT test coming soon.

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@stevenpyzhang stevenpyzhang force-pushed the nested-lambdas-patch branch 2 times, most recently from 7b2170e to 56eaa8f Compare February 24, 2021 03:53
@stevenpyzhang stevenpyzhang changed the title from "Nested lambdas patch" to "feat: implement correct logic for nested lambdas and more complex lambda expressions" Feb 24, 2021
@stevenpyzhang stevenpyzhang marked this pull request as ready for review February 24, 2021 05:29
@stevenpyzhang stevenpyzhang requested a review from a team as a code owner February 24, 2021 05:29
final boolean hasLambda = node.hasLambdaFunctionCallArguments();
for (final Expression argExpr : node.getArguments()) {
final TypeContext childContext = context.getCopy();
final TypeContext childContext;

It would be better to add a comment explaining to other readers why we are doing this, e.g. that the context needs to be shared across the function arguments that involve lambdas.

final FunctionName functionName = node.getName();

final TypeContext contextCopy = context.getCopy();

nit: maybe rename to lambdaSharedContext?

took me a while as well to understand what was going on. I'd name them like this:

// this is the one that's passed in, it represents the context at the time the parent called this method
TypeContext parentTypeContext; 

// this is a copy of the parent type context, additionally populated with the result of visiting non-lambda
// expressions - it starts as a copy of the parent typeContext
TypeContext currentTypeContext; 

// a copy of either parent or current type context to be passed to the child - in the case of lambdas
// we pass in the parent type context without the result of resolving the current context because 
// there may be valid overlapping lambda parameter names
TypeContext childContext;

final SqlType resolvedArgType =
expressionTypeManager.getExpressionSqlType(argExpr, childContext);
process(argExpr, context.getCopy());
expressionTypeManager.getExpressionSqlType(argExpr, childContext.getCopy());

We are copying the context redundantly here. Maybe we can wrap the logic of

        final TypeContext childContext;
        if (argExpr instanceof LambdaFunctionCall) {
          childContext = contextCopy.getCopy();
        } else {
          childContext = context.getCopy();
        }

in a helper such as argTypeContext(argExpr, parentContext, lambdaSharedContext), which would be called for both expressionTypeManager#getExpressionSqlType and process? Ditto for the other classes.
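
A rough sketch of what such a helper could look like. The `Expression`, `LambdaFunctionCall`, `ColumnRef`, and `TypeContext` types below are simplified stand-ins written for this example, not the real ksqlDB classes; only the names `argTypeContext`, `parentContext`, `lambdaSharedContext`, and `getCopy()` come from this thread.

```java
final class ArgContextDemo {
  // Simplified stand-ins for the PR's types (assumptions for illustration only).
  interface Expression {}

  static final class LambdaFunctionCall implements Expression {}

  static final class ColumnRef implements Expression {}

  static final class TypeContext {
    final String label;

    TypeContext(final String label) {
      this.label = label;
    }

    TypeContext getCopy() {
      return new TypeContext(label);
    }
  }

  // Lambda arguments get a copy of the shared lambda context;
  // every other argument gets a copy of the parent's context.
  static TypeContext argTypeContext(
      final Expression argExpr,
      final TypeContext parentContext,
      final TypeContext lambdaSharedContext) {
    return argExpr instanceof LambdaFunctionCall
        ? lambdaSharedContext.getCopy()
        : parentContext.getCopy();
  }

  public static void main(final String[] args) {
    final TypeContext parent = new TypeContext("parent");
    final TypeContext shared = new TypeContext("lambdaShared");
    // prints lambdaShared
    System.out.println(argTypeContext(new LambdaFunctionCall(), parent, shared).label);
    // prints parent
    System.out.println(argTypeContext(new ColumnRef(), parent, shared).label);
  }
}
```

Because the helper always returns a fresh copy, the callers would no longer need their own getCopy() calls, which is the duplication being pointed out.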

@stevenpyzhang stevenpyzhang force-pushed the nested-lambdas-patch branch 2 times, most recently from 8225a82 to d93cd2e Compare February 24, 2021 22:12
@stevenpyzhang

PR is blocked on #7093 right now

@stevenpyzhang stevenpyzhang force-pushed the nested-lambdas-patch branch 3 times, most recently from 3aed980 to e881860 Compare February 25, 2021 08:14
@stevenpyzhang stevenpyzhang requested a review from a team February 25, 2021 20:24
@agavra agavra left a comment

The approach makes sense to me (if I'm understanding the problem correctly, see comment below with suggested naming)

"name": "apply transform lambda function to array",
"statements": [
"CREATE STREAM TEST (ID BIGINT KEY, VALUE MAP<STRING, ARRAY<INT>>) WITH (kafka_topic='test_topic', value_format='AVRO');",
"CREATE STREAM OUTPUT as SELECT ID, TRANSFORM(TRANSFORM(VALUE, (x,y) => x, (x,y) => FIlTER(y, z => z IS NOT NULL)), (x,y) => UCASE(x) , (k,v) => ARRAY_MAX(v)) as FILTERED_TRANSFORMED from TEST emit changes;"

it might be more interesting to filter something like z => z < 5 - otherwise ARRAY_MAX([2,null,5]) might just be returning 5 anyway, right?
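
A small sketch of that point in plain Java, not ksqlDB. `maxIgnoringNulls` and `keep` are hypothetical helpers, and the sketch assumes (as the comment does) a max that skips null entries. Under that assumption, dropping nulls leaves the result unchanged, while `z < 5` changes it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class FilterMaxDemo {
  // Hypothetical null-skipping max; returns null if no non-null element exists.
  static Integer maxIgnoringNulls(final List<Integer> values) {
    return values.stream().filter(Objects::nonNull).max(Integer::compare).orElse(null);
  }

  // Keeps only the elements that satisfy the predicate.
  static List<Integer> keep(final List<Integer> values, final Predicate<Integer> predicate) {
    return values.stream().filter(predicate).collect(Collectors.toList());
  }

  public static void main(final String[] args) {
    final List<Integer> values = Arrays.asList(2, null, 5);
    // No filter: prints 5
    System.out.println(maxIgnoringNulls(values));
    // z IS NOT NULL: still prints 5, so the filter is not observable in the result
    System.out.println(maxIgnoringNulls(keep(values, z -> z != null)));
    // z < 5 (null treated as not matching): prints 2, so the filter is observable
    System.out.println(maxIgnoringNulls(keep(values, z -> z != null && z < 5)));
  }
}
```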


// Then
assertThat(
javaExpression, equalTo("((String) function_0.evaluate(COL4, COL1, new BiFunction() {\n"

Personally, I don't think these are valuable to check in (though obviously great for development). The important part is that the results are the same, and the QTT covers that. We might want to change the code that we generate, and that would have no impact other than causing our tests to fail 😢

@stevenpyzhang stevenpyzhang merged commit 1a042cd into confluentinc:master Feb 26, 2021