New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-24394][test] Refactor BuiltInFunctions IT Tests #17341
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Automated ChecksLast check on commit 1b6da15 (Thu Sep 23 18:54:41 UTC 2021) Warnings:
Mention the bot in a comment to re-run the automated checks. Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
f2c827b
to
1b6da15
Compare
@flinkbot run azure |
1b6da15
to
8e3d7ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @matriv. This is quite a big change for a hotfix and should have a corresponding JIRA issue maybe as a subtask of your current issue. I added some feedback.
} | ||
|
||
TestSpec testTableApiResult( | ||
Expression[] expression, Object[] result, AbstractDataType<?>[] dataType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could we use List
instead of arrays? I find Arrays.asList
and soon List.of
nicer than the verbose array syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't have any objection, just chose arrays here to avoid the overhead of List objects especially for the common cases of only one element, but considering it's test, Lists are nicer indeed.
@@ -298,14 +298,52 @@ TestSpec testResult( | |||
return testResult(expression, sqlExpression, result, dataType, dataType); | |||
} | |||
|
|||
TestSpec testResult(TableTestSpecColumn... tableTestSpecColumns) { | |||
int cols = tableTestSpecColumns.length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we don't enforce using final in Flink but if you check the core, you will see that most committers use final extensively to indicate immutable/mutability in code. I would vote for having a conistent coding style within our team. At least we should adapt to the coding style of the class. When I read this and the following lines, it looks to me as if we would modify the reference in the for loop again. Because all variables above are marked final
.
testItems.add(new TableApiResultTestItem(expression, result, tableApiDataType)); | ||
testItems.add(new SqlResultTestItem(sqlExpression, result, sqlDataType)); | ||
StringJoiner sj = new StringJoiner(","); | ||
for (String sql : sqlExpression) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use the Java streams API or String.join
once we have lists
String sqlExpression, | ||
Object result, | ||
AbstractDataType<?> tableApiDataType, | ||
AbstractDataType<?> sqlQueryDataType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if Table API and SQL API share the same result it should in most cases also have the same data type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept this variant exposed as there are a couple of tests in JsonFunctionsITCase
that make use of different return data types: https://github.com/apache/flink/pull/17341/files#diff-14f6e87fed00e28afef4a5e995126c430b4065661e3dc2f66d14f3741378f874R180
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH this is a very special case and hopefully does not happen elsewhere. we should not start having more of these special cases.
} | ||
|
||
/** Helper POJO to store test parameters. */ | ||
public static class TableTestSpecColumn { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Column
sounds very internal to this class. Tests don't know columns but only test specs. How about ResultSpec
to keep it short?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thx, couldn't come up with a nice name.
// In this case, the return type is not null because we have a | ||
// constant in the function invocation | ||
BIGINT().notNull())); | ||
of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of using a static import in every test, we could simply provide static methods in the upper class, called resultItem(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'd prefer though the name resultSpec()
since resultItem
is another more lower level pojo within the class and can cause confusion.
157029a
to
8d4756c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. I added one last comment and will merge this once the build is green then.
@@ -150,7 +153,7 @@ private static void testResult( | |||
|
|||
assertEquals( | |||
"Result of column [" + i + "] doesn't match.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of spec
btw I think it would be nice to also print the test item summary to quickly know which one failed if there are many.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like:
"Logical type for spec [" + i + "] of test [" + testItem + "] doesn't match"
&
"Result for spec [" + i + "] of test [" + testItem + "] doesn't match"
looks good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed this change, along with a fix regarding the relevant toString()
implementation for TableAPI cases.
Add the possibility to tests multiple TableApi expressions and/or SQL expressions as multiple table columns using just one execution of the pipeline to speed up the total test execution time. Refactor existing tests of BuiltInFunctions to use the new approach whereever possible.
8d4756c
to
8f7543e
Compare
@flinkbot run azure |
@twalthr I guess this should go to 1.14 as well and possibly 1.13 to ease future backporting of relevant fixes? |
@matriv I haven't merged this to 1.14 yet. We can do this on demand. |
…nFunctionTestBase Add the possibility to tests multiple TableApi expressions and/or SQL expressions as multiple table columns using just one execution of the pipeline to speed up the total test execution time. Refactor existing tests of BuiltInFunctions to use the new approach whereever possible. This closes apache#17341.
Add the possibility to tests multiple TableApi expressions
and/or SQL expressions as multiple table columns using just one
execution of the pipeline to speed up the total test execution time.
Refactor existing tests of BuiltInFunctions to use the new approach
wherever possible.
What is the purpose of the change
Refactor
BuiltInFunctionTestBase
to allow testing of multiple expressions at once.Brief change log
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation