[CALCITE-3789] Support validation of UNNEST multiple array columns like Presto#1811
[CALCITE-3789] Support validation of UNNEST multiple array columns like Presto#1811my7ym wants to merge 1 commit intoapache:masterfrom
Conversation
| } | ||
|
|
||
| private boolean allowFlattenStruct(SqlOperatorBinding operatorBinding) { | ||
| if (!(operatorBinding instanceof SqlCallBinding)) { |
There was a problem hiding this comment.
allowFlattenStruct -> allowAliasUnnestArrayColumns ?
There was a problem hiding this comment.
Another class type for operator binding is RexCallBinding, but why we return true for that ?
There was a problem hiding this comment.
Great name. Much more concise. Will do.
There was a problem hiding this comment.
Another class type for operator binding is
RexCallBinding, but why we return true for that ?
This is the original behavior. Actually there are tons of SqlOperatorBinding implementations. As the original behavior, UNNEST will ALWAYS flatten the struct. The logic here is that we do NOT allow flatten struct ONLY for SqlCallBinding && conformance.allowAliasUnnestArrayColumns()
There was a problem hiding this comment.
@danny0405 Took a second thought on the function name, the result of the function is actually whether disallow flatten struct in UNNEST. So maybe disallowFlattenStructInUnnest? Let me know. Thanks.
There was a problem hiding this comment.
I know you only to support SqlCallBinding for unnest alias columns, but the logic seems weird, can you change
allowFlattenStruct => allowAliasUnnestColumns ?
There was a problem hiding this comment.
I think it looks even more confusing. If you read the code carefully enough, it is NOT allowAliasUnnestColumns. If you prefer the name change, I think disallowAliasUnnestColumn is more appropriate.
There was a problem hiding this comment.
Couldn’t you use !allowAliasUnnestColumns ?
| } | ||
|
|
||
| private boolean allowFlattenStruct(SqlOperatorBinding operatorBinding) { | ||
| if (!(operatorBinding instanceof SqlCallBinding)) { |
There was a problem hiding this comment.
Another class type for operator binding is RexCallBinding, but why we return true for that ?
| prefix.names, | ||
| nameMatcher, | ||
| validator.getConformance().allowAliasUnnestArrayColumns(), | ||
| resolved); |
There was a problem hiding this comment.
The code may works for your case, but you have changed a common sql identifier resolving logic with a special conformance, which seems not right way to go ~
The conformance switch logic should always happen in SqlValidator, we can pass along some switch flags into the scope if necessary.
There was a problem hiding this comment.
Sorry a little confused here. Here I am directly using conformance as the flag. I feel like it's more concise than an explicit flag like
SqlValidatorImpl.java::
boolean allowAliasUnnestArrays = validator.getConformance().allowAliasUnnestArrayColumns;
DelegatingScope.setAllowAliasUnnestArrays(allowAliasUnnestArrays);
DeletatingScope.java::
Just replace validator.getConformance().allowAliasUnnestArrayColumns with the explicit flag.
And it seems using conformance in scope happens before. e.g. OrderByScope.
There was a problem hiding this comment.
The current fix seems not right, the nested field can be seen if we set up the record type to StructKind.PEEK_FIELDS, for the test cases you gave:
sql("select e.ENAME\n"
+ "from dept_nested as d CROSS JOIN\n"
+ " UNNEST(d.employees) as t(e)")
.withConformance(SqlConformanceEnum.PRESTO).columnType("VARCHAR(10) NOT NULL");There t(e) table has record type with e as field, the e can be seen and if you change the nested employee type with code snippet:
final RelDataType empRecordType = typeFactory.builder()
.add("EMPNO", intType)
.add("ENAME", varchar10Type)
.add("DETAIL", typeFactory.builder()
.add("SKILLS", array(skillRecordType)).build())
.kind(StructKind.PEEK_FIELDS) // here is the line you can tweak
.build();So, in total, this line change is not necessary, the field access should not be controlled by the SqlConformance.
| SQL_SERVER_2008, | ||
|
|
||
| PRESTO; | ||
|
|
There was a problem hiding this comment.
PRESTO is not really a db, it behaves more like a computation engine, i'm not sure if we should add a new conformance for it.
There was a problem hiding this comment.
That's right. I am also debating on this. I could delete it or rename it. But I think the conformance is only for SQL string validation, so whether or not DB is kinda fine. But it's your call.
There was a problem hiding this comment.
Presto has its own SQL parser & validator & type system, so it counts as a 'dialect' (conformance) for these purposes.
There was a problem hiding this comment.
I'll add further comments to https://issues.apache.org/jira/browse/CALCITE-3789.
my7ym
left a comment
There was a problem hiding this comment.
Thanks @danny0405 for the timely and detailed review!
| } | ||
|
|
||
| private boolean allowFlattenStruct(SqlOperatorBinding operatorBinding) { | ||
| if (!(operatorBinding instanceof SqlCallBinding)) { |
There was a problem hiding this comment.
Great name. Much more concise. Will do.
| } | ||
|
|
||
| private boolean allowFlattenStruct(SqlOperatorBinding operatorBinding) { | ||
| if (!(operatorBinding instanceof SqlCallBinding)) { |
There was a problem hiding this comment.
Another class type for operator binding is
RexCallBinding, but why we return true for that ?
This is the original behavior. Actually there are tons of SqlOperatorBinding implementations. As the original behavior, UNNEST will ALWAYS flatten the struct. The logic here is that we do NOT allow flatten struct ONLY for SqlCallBinding && conformance.allowAliasUnnestArrayColumns()
| prefix.names, | ||
| nameMatcher, | ||
| validator.getConformance().allowAliasUnnestArrayColumns(), | ||
| resolved); |
There was a problem hiding this comment.
Sorry a little confused here. Here I am directly using conformance as the flag. I feel like it's more concise than an explicit flag like
SqlValidatorImpl.java::
boolean allowAliasUnnestArrays = validator.getConformance().allowAliasUnnestArrayColumns;
DelegatingScope.setAllowAliasUnnestArrays(allowAliasUnnestArrays);
DeletatingScope.java::
Just replace validator.getConformance().allowAliasUnnestArrayColumns with the explicit flag.
And it seems using conformance in scope happens before. e.g. OrderByScope.
| SQL_SERVER_2008, | ||
|
|
||
| PRESTO; | ||
|
|
There was a problem hiding this comment.
That's right. I am also debating on this. I could delete it or rename it. But I think the conformance is only for SQL string validation, so whether or not DB is kinda fine. But it's your call.
|
@danny0405 Just a kindly reminder, could you please take another look when available? Thanks a lot! |
|
@danny0405 Could you help take another look at this PR? Thanks! |
|
I'm planning to review again this weekend, very busy with the release work of Calcite 1.22 and also my own company work, sorry for that, Just like Julian said, add a new SqlConformance is really a big change, for "big" i mean, we should keep all the changes correct. |
Make sense. Just send a reminder and get a rough timeline. Thanks @danny0405 |
f25f91d to
5aba5c4
Compare
|
@danny0405 @julianhyde Just a soft reminder on this PR. No rush. Thanks! |
|
Reviewing now, may take some time ~ |
core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java
Outdated
Show resolved
Hide resolved
|
Thanks @my7ym , could you rebase you code to remove the unnecessary change ? |
999fe1b to
d48b2f1
Compare
core/src/main/java/org/apache/calcite/sql/validate/SqlAbstractConformance.java
Show resolved
Hide resolved
core/src/test/java/org/apache/calcite/test/SqlValidatorTest.java
Outdated
Show resolved
Hide resolved
|
We should also add a test case for at least one SqlToRelConverterTest. |
Will do it in the future if needed. I am still unclear that how this should be represented as RelNode. (discussed in https://issues.apache.org/jira/browse/CALCITE-3787) Is it OK for you? EDIT: Some other reasons:
|
9e8d652 to
12c5af4
Compare
core/src/test/java/org/apache/calcite/test/catalog/MockCatalogReaderSimple.java
Outdated
Show resolved
Hide resolved
Usually we have 2 kinds of use cases for Calcite:
Either way we should generates a working plan, that is why I suggest to add SqlToRel test. For more general, we should add operator test and SQL ITCase if we change an operator semantics. But for you case, I think a SqlToRelConverter test is okey. |
OK. I could do that for something like unnest(employees) as t(e), but I am not sure what's a valid plan for unnest(employees, admins) as t(e, a). Can I just add one test case for something like unnest(employees) as t(e)? Thanks. |
12d30c5 to
667c43d
Compare
I have added tests for it. But it needs changes on SqlToRelConverter. The change is because usually for STRUCT, Uncollect will flatten the type but for this feature, it should not. This is also aligned with the registered type during validation. I am not sure whether this should be the best way to implement it. Let me know. Thanks. EDIT: If you think this change is large unnecessarily because we don't need SqlToRel conversion right now), I am happy to revert the SqlToRel conversion changes. |
Done |
core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java
Outdated
Show resolved
Hide resolved
| if (field.getType() instanceof MapSqlType) { | ||
| builder.add(SqlUnnestOperator.MAP_KEY_COLUMN_NAME, field.getType().getKeyType()); | ||
| builder.add(SqlUnnestOperator.MAP_VALUE_COLUMN_NAME, field.getType().getValueType()); | ||
| } else { |
There was a problem hiding this comment.
I am also not sure what will be the best way to handle MAP here.
667c43d to
d1c6964
Compare
core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java
Outdated
Show resolved
Hide resolved
| boolean withOrdinality, List<String> fieldAliases) { | ||
| super(cluster, traitSet, input); | ||
| this.withOrdinality = withOrdinality; | ||
| this.fieldAliases = fieldAliases; |
There was a problem hiding this comment.
you should copy fieldAliases into an immutable list.
Do you think the logic would be more uniform if fieldAliases were empty, not null, in the usual case? I'm a big fan of empty lists.
There was a problem hiding this comment.
Done. Usually I think null is a separate state to flag something other than empty. But here they seem to be the same and we have extra bonus to avoid NPE. Good call.
| * | ||
| * <p>E.g. in UNNEST(a_array, b_array) AS T(a, b), | ||
| * a and b will be aliases of a_array and b_array | ||
| * respectively |
There was a problem hiding this comment.
What will be the behavior of that expression if allowAliasUnnestColumns is false?
There was a problem hiding this comment.
What is the most logical place for this function inside SqlConformance? I bet it's not at the very end.
|
|
||
| /** Conformance value that instructs Calcite to use SQL semantics | ||
| * consistent with Presto. */ | ||
| PRESTO; |
| return true; | ||
| } | ||
| } | ||
| public boolean allowAliasUnnestColumns() { |
There was a problem hiding this comment.
need space before this function, no space after.
core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java
Outdated
Show resolved
Hide resolved
|
|
||
| private void convertUnnest(Blackboard bb, SqlCall from, List<String> fieldAliases) { | ||
| SqlCall call; | ||
| call = from; |
There was a problem hiding this comment.
assign on one line and make final? or eliminate a variable?
| replaceSubQueries(bb, node, RelOptUtil.Logic.TRUE_FALSE_UNKNOWN); | ||
| } | ||
| final List<RexNode> exprs = new ArrayList<>(); | ||
| final List<String> fieldNames = new ArrayList<>(); |
There was a problem hiding this comment.
you don't need two separate arrays. RelBuilder.project will apply aliases if you have used RelBuilder.alias to wrap expressions in 'as'. Use project rather than projectNamed.
There was a problem hiding this comment.
Done. btw this logic is there already before I extract it as a separate method. I usually avoid to touch unnecessary part as much as possible.
| (null != bb.root) ? bb.root : LogicalValues.createOneRow(cluster); | ||
| relBuilder.push(child).projectNamed(exprs, fieldNames, false); | ||
|
|
||
| Uncollect uncollect = |
There was a problem hiding this comment.
it's about time we added RelBuilder.uncollect
d1c6964 to
1e212e1
Compare
my7ym
left a comment
There was a problem hiding this comment.
@danny0405 @julianhyde Danny's patch seems to resolve all the pending issues in this PR. Let me know if you folks have other feedbacks. If not, please accept it. Thanks!
core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java
Outdated
Show resolved
Hide resolved
|
If you think my fix is okey, can you check it in this patch, thanks ~ |
Left another comment. I thought you preferred to keep them separate so I did not merge them. Will do after that comment is resolved! Thanks! |
|
I have fired a new fix at https://github.com/danny0405/calcite/tree/fix-3789, please check if you have time. |
👍 Will do and let you know. EIDT: Just two nit comments. Solid fix. Let me know if you want to update it or I could directly resolve them when I merge your fix. Thanks! |
Directly resolve them when you merge the fix, thanks for the contribution ~ |
1e212e1 to
5ab8dee
Compare
Done. Not sure whether I should make the commit message & JIRA title the same. Feel free to modify either of them. Thanks! |
This patch also add a new PRESTO conformance. Fix-up (by Danny): - Fix SqlTypeUtil#flattenRecordType to not append field index if there are no duplicates - Rename SqlConformance#allowAliasUnnestColumns to SqlConformance#allowAliasUnnestItems - Fix RelStructuredTypeFlattener to not generate flattenned field based on struct field - Promote SqlToRelConverter#convertFrom to allow specify field aliases - Add comment to RelBuilder#uncollect close apache#1811 Change-Id: Ibeaeda6f0007b5409afcb43e5aab3b878e0de89b
apache#343) This patch also add a new PRESTO conformance. Fix-up (by Danny): - Fix SqlTypeUtil#flattenRecordType to not append field index if there are no duplicates - Rename SqlConformance#allowAliasUnnestColumns to SqlConformance#allowAliasUnnestItems - Fix RelStructuredTypeFlattener to not generate flattenned field based on struct field - Promote SqlToRelConverter#convertFrom to allow specify field aliases - Add comment to RelBuilder#uncollect close apache#1811 Co-authored-by: Will Yu <wmy7ymw@gmail.com>
This patch also add a new PRESTO conformance. Fix-up (by Danny): - Fix SqlTypeUtil#flattenRecordType to not append field index if there are no duplicates - Rename SqlConformance#allowAliasUnnestColumns to SqlConformance#allowAliasUnnestItems - Fix RelStructuredTypeFlattener to not generate flattenned field based on struct field - Promote SqlToRelConverter#convertFrom to allow specify field aliases - Add comment to RelBuilder#uncollect close apache#1811 Change-Id: Ibeaeda6f0007b5409afcb43e5aab3b878e0de89b
apache#343) This patch also add a new PRESTO conformance. Fix-up (by Danny): - Fix SqlTypeUtil#flattenRecordType to not append field index if there are no duplicates - Rename SqlConformance#allowAliasUnnestColumns to SqlConformance#allowAliasUnnestItems - Fix RelStructuredTypeFlattener to not generate flattenned field based on struct field - Promote SqlToRelConverter#convertFrom to allow specify field aliases - Add comment to RelBuilder#uncollect close apache#1811 Co-authored-by: Will Yu <wmy7ymw@gmail.com>
No description provided.