Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using ExpressionTypeManager to detect the types for function look up. #1368

Conversation

hjafarpour
Copy link
Contributor

Description

This PR will use ExpressionTypeManager to detect the argument types for functions. Argument types are used to look up for the correct function among variants of a function with different arguments.
This change will help in fixing the rebase issues in struct PRs.

Testing done

Existing tests cover this.

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@hjafarpour hjafarpour self-assigned this Jun 1, 2018
@hjafarpour hjafarpour requested review from dguy and a team June 1, 2018 19:01
@@ -132,26 +128,24 @@ protected Object visitFunctionCall(FunctionCall node, Object context) {
final int functionNumber = functionCounter++;
functionArguments.beginFunction();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make functionArguments local to this method call now.

@@ -198,10 +196,11 @@ protected Expression visitFunctionCall(final FunctionCall node,
functionArguments.beginFunction();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're not using functionArguments to track the call stack then can we just clean it up? Otherwise all its doing is being a list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it from this part.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding, the stack is being replaced by multiple instances of ExpressionTypeManager now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you going to remove the FunctionArguments? Seems it is not needed with this change, i.e., as @rodesai suggested

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @hjafarpour

This PR is a much nicer size! Thanks.

Few nits / comments below.

@@ -132,26 +128,24 @@ protected Object visitFunctionCall(FunctionCall node, Object context) {
final int functionNumber = functionCounter++;
functionArguments.beginFunction();
final String functionName = node.getName().getSuffix();
ExpressionTypeManager expressionTypeManager =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: final.

return new Pair<>("Integer.parseInt(\"" + node.getValue() + "\")", Schema.INT32_SCHEMA);
}

@Override
protected Pair<String, Schema> visitFunctionCall(FunctionCall node, Boolean unmangleNames) {
String functionName = node.getName().getSuffix();
UdfFactory udfFactory = functionRegistry.getUdfFactory(functionName);
FunctionArguments functionArguments = new FunctionArguments();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: final.

functionArguments.beginFunction();

String instanceName = functionName + "_" + functionCounter++;

final String arguments = node.getArguments().stream()
.map(arg -> process(arg, unmangleNames).getLeft())
.collect(Collectors.joining(", "));
node.getArguments().stream().forEach(arg -> functionArguments.addArgumentType(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit|: stream().forEach() can be replaced with forEach

@@ -188,7 +182,9 @@ protected Object visitDereferenceExpression(DereferenceExpression node, Object c
throw new RuntimeException(
"Cannot find the select field in the available fields: " + node.toString());
}
updateFunctionArgTypesAndParams(schemaField.get());
parameters.add(new ParameterType(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple copies of this same code - maybe it should be a method?

e.g.

private void addParameter(final Field field) {
      parameters.add(new ParameterType(
          SchemaUtil.getJavaType(field.schema()),
          field.name().replace(".", "_")));
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Moved them to a method.

return new Pair<>("Integer.parseInt(\"" + node.getValue() + "\")", Schema.INT32_SCHEMA);
}

@Override
protected Pair<String, Schema> visitFunctionCall(FunctionCall node, Boolean unmangleNames) {
String functionName = node.getName().getSuffix();
UdfFactory udfFactory = functionRegistry.getUdfFactory(functionName);
FunctionArguments functionArguments = new FunctionArguments();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a use of functionArguments at the bottom of this method that has no corresponding being/endFunction - so I think it can be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus I think it could just be replace with a local List<Schema.Type>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it into a list.

for (Expression argExpr : node.getArguments()) {
process(argExpr, null);
functionArguments.addArgumentType(expressionTypeManager.getExpressionType(argExpr).type());
Copy link
Contributor

@big-andy-coates big-andy-coates Jun 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider making functionArguments just be local List<Schema.Type>...

hjafarpour added a commit to hjafarpour/ksql that referenced this pull request Jun 5, 2018
….com/hjafarpour/ksql into Struct-Type-P2-Use-Connect-Data-Type

Also temporarily removed the ExtractJsonField test from the query translation tests. Will bring it back when sycnhed with the master and merged confluentinc#1368.
Applied Andy's feedback.
@hjafarpour
Copy link
Contributor Author

@big-andy-coates, @rodesai applied your feedback.
@dguy this PR makes changes in UDF PR. Wanna take a look?

Copy link
Contributor

@apurvam apurvam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the manageably sized patch @hjafarpour . Apart from making it easier to rebase future struct PRs, does this also fix a bug with the current code when it comes to dealing with function arguments? I think there was an allusion to a bug fix in a separate context.

private void addParameter(Optional<Field> schemaField) {
parameters.add(new ParameterType(
SchemaUtil.getJavaType(schemaField.get().schema()),
schemaField.get().name().replace(".", "_")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider encapsulating this . -> _ translation into a semantically named --and thus self explanaing-- method inside SchemaField. This would help code readability. As it stands, why do you need to do this replace operation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

schemaField is an instance of Field which is a Connect class so I won't be able to change it from KSQL side.

final List<Schema.Type> types = functionArguments.endFunction();
final Schema returnType = udfFactory.getFunction(types).getReturnType();
node.getArguments().forEach(arg -> argumentTypes.add(
expressionTypeManager.getExpressionType(arg).type()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like every use of getExpressionType is followed by a type() call. May be worth thinking about adding another appropriately named method to ExpresssionTypeManager which makes this type call directly and returns the value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a method for this.

@@ -198,10 +196,11 @@ protected Expression visitFunctionCall(final FunctionCall node,
functionArguments.beginFunction();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding, the stack is being replaced by multiple instances of ExpressionTypeManager now?

Copy link
Contributor

@dguy dguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @hjafarpour, just a couple of nits.

@@ -123,35 +121,39 @@ public ExpressionMetadata buildCodeGenFromParseTree(
this.functionRegistry = functionRegistry;
}

private void addParameter(Optional<Field> schemaField) {
parameters.add(new ParameterType(
SchemaUtil.getJavaType(schemaField.get().schema()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably move the if (schemaField.isPresent(...) into this method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add ifPresent().

final List<Schema.Type> types = functionArguments.endFunction();
final Schema returnType = udfFactory.getFunction(types).getReturnType();
node.getArguments().forEach(arg -> argumentTypes.add(
expressionTypeManager.getExpressionType(arg).type()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@@ -198,10 +196,11 @@ protected Expression visitFunctionCall(final FunctionCall node,
functionArguments.beginFunction();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you going to remove the FunctionArguments? Seems it is not needed with this change, i.e., as @rodesai suggested

@dguy
Copy link
Contributor

dguy commented Jun 5, 2018

@hjafarpour can you add this test: https://github.com/confluentinc/ksql/pull/1365/files#diff-d2219c48b2a9226d6594344414507ae2R447

Then i can close that PR. Thanks

@hjafarpour
Copy link
Contributor Author

@dguy removed FunctionArgument class and also added the tests you mentioned.
@apurvam yes the stack was replaced with calls to the ExpressionTypeManager. I consolidated the use of ExpressionTypeManager and now there is only one instance that is being used through out the class.

@apurvam
Copy link
Contributor

apurvam commented Jun 5, 2018

It's curious that there are 6 failed security integration tests twice in a row. I kicked off the PR builder again right now, but we probably will have to understand what's going on.

Copy link
Contributor

@apurvam apurvam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hjafarpour . The latest version looks good to me. But we need to get to the bottom of the build failures before we can merge. They definitely seem to be triggered by this patch.

@apurvam
Copy link
Contributor

apurvam commented Jun 6, 2018

Hm. It looks like the master build is also broken. Those 6 tests are failing there as well. So it isn't anything to do with your PR.

Copy link
Contributor

@dguy dguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hjafarpour, LGTM

Copy link
Contributor

@apurvam apurvam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Before merging, we need to:

  1. Update the branch to be in sync with master.
  2. Get a green Jenkins build.

@hjafarpour hjafarpour changed the base branch from master to 5.0.x June 6, 2018 18:08
@hjafarpour hjafarpour merged commit b71c5fa into confluentinc:5.0.x Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants