Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-4171] Support named parameters for table window functions #2103

Merged
merged 2 commits into from Aug 18, 2020

Conversation

danny0405
Copy link
Contributor

@danny0405 danny0405 commented Aug 10, 2020

  • Changes SqlArgumentAssignmentOperator to allow non-scala query as operand
  • In SqlCallBinding, matches the permuted operand by name with name matcher
  • Refactor SqlWindowTableFunction and its sub-class to reuse same logic
  • Add SqlOperator#paramFallbackToDefault to decide if to replace the
    non-specified param with a DEFAULT call, for SqlWindowTableFunction,
    it always return false

The root cause is that the COALESCE operand type was wrongly replaced by
`SqlToRelConverter#adjustInputRef`, actually, for an agg as bb root, there
is no need to do such adjust. Because the nullability does not change and
the agg type is not same with the bb's scope.

Tweak the `#adjustInputRef` to only fix type nullability, if there
are cases that the type name also changes, just return the original node
and let the subsequent conversion work flow throw.
Copy link
Contributor

@amaliujia amaliujia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -47,4 +47,8 @@
writer.keyword(getName());
call.operand(0).unparse(writer, getRightPrec(), rightPrec);
}

@Override public boolean argumentMustBeScalar(int ordinal) {
return false;
Copy link
Contributor

@amaliujia amaliujia Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice. I actually tried to add named parameter support but failed on making non-scalar case (i.e. table function) work.

I wasn't aware that this can be as simple as override this parameter in SqlArgumentAssignmentOperator.java

return false;
}
for (int i = 1; i < descriptors + 1; i++) {
final SqlNode operand = callBinding.operand(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use i < descriptors + 1 rather than i <= descriptors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the descriptors represent the number of the descriptors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i <= descriptors is better since it has fewer operations.

Copy link
Contributor

@amaliujia amaliujia Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @danny0405

any suggestions on https://issues.apache.org/jira/browse/CALCITE-3780?focusedCommentId=17169620&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17169620?

Your change will assume that the structure of window table functions be row, (ROW, DESCRIPTOR, DESCRIPTOR ..., other params), in the comment we are thinking to make key descriptor of SESSION as optional, which might mean it will be moved to the last of the arguments so it becomes SESSION(data TABLE, ts DESCRIPTOR, gap interval, key DESCRIPTOR optional)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is still straight-forward when it is the form (ROW, DESCRIPTOR ..., other params), in this form, we always assume that the DESCRIPTOR has the column as the time column.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will take a look what is the best way when authoring that PR.

+ "^\"data\"^ => table orders,\n"
+ "TIMECOL => descriptor(rowtime),\n"
+ "SIZE => interval '2' hour))")
.fails("Param 'data' not found in function 'TUMBLE'; did you mean 'DATA'\\?");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the order of parameters matter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but only the first data param, because for a table function, only the first param can be a query.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add some tests about it? (Ignore it if it already has)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already added in the SqlToRelConverterTest.

@chunweilei
Copy link
Contributor

FYI, master branch is still locked~~

.add("window_start", timestampType)
.add("window_end", timestampType)
.add("wstart", timestampType)
.add("wend", timestampType)
Copy link
Contributor

@amaliujia amaliujia Aug 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I asked one of the authors of the "one sql to rule them all" paper. The only reason that they use wstart and wend was because of the layout for a paper. I got suggestion to use window_start and window_end.

I also believe in a JIRA or an email (sorry I cannot recall which one it is) we agreed to switch to window_start and window_end.

So I will suggest we keep window_start and window_end, which can be understood easier than wstart/wend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I didn’t notice that, can you share the discussion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will spend some time to search for that JIRA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I might or might not discussed this in Calcite (because I cannot find a JIRA that about it), but

  1. This is the acknowledge from Kenneth Knowles, one of the paper authors: https://docs.google.com/document/d/138uA7VTpbF84CFrd--cz3YVe0-AQ9ALnsavaSE2JeE4/edit?disco=AAAAEAjWvxg

  2. I started to use window_start and window_end in https://issues.apache.org/jira/browse/CALCITE-3272. Julian, as another paper author, didn't against it I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the share, i have reverted the rename back.

Copy link
Contributor

@amaliujia amaliujia Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@danny0405 danny0405 force-pushed the CALCITE-4171 branch 5 times, most recently from 42f6d0d to 60af4aa Compare August 17, 2020 05:46
[CALCITE-4171] Support named parameters for table window functions

* Changes SqlArgumentAssignmentOperator to allow non-scala query as
  operand
* In SqlCallBinding, matches the permuted operand by name with name
  matcher
* Refactor SqlWindowTableFunction and its sub-class to reuse same
  logic
* Do not patch up the SqlWindowTableFunction with DEFAULTs when sql
  validation
@danny0405 danny0405 merged commit ebbb7bb into apache:master Aug 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left.
Projects
None yet
3 participants