-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-20398][e2e] Migrate test_batch_sql.sh to Java e2e tests framework #24471
base: master
Are you sure you want to change the base?
Conversation
Hi guys, here is the PR for https://issues.apache.org/jira/browse/FLINK-20398. I decided to go for
Important concern: Should it still run nightly? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @affo .
This test used to be part of run-nightly-tests.sh,
The Java e2e tests are also triggered in the nightly run (see ./flink-end-to-end-tests/run-nightly-tests.sh:259).
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
...d-to-end-tests-common/src/main/java/org/apache/flink/tests/util/flink/FlinkDistribution.java
Outdated
Show resolved
Hide resolved
flink-end-to-end-tests/flink-batch-sql-test/src/test/resources/log4j2-test.properties
Show resolved
Hide resolved
@XComp thank you for your review, gonna address your feedback today (as I had a week off) |
Required quite of an effort honestly, but here we are with the JUnit5 version of what I had before 👍 This also allowed not to start a separate jar, but to directly include the code in the text and directly run it agains the MiniCluster obtained 👍 Thank you for your detailed review UPDATE I am still investigating why the test fails in CI as I cannot reproduce that locally... UPDATE rebased and forced pushed, now CI is ok 👍 |
@XComp everything should be ok now 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job. I left a few nitty comments. But it looks good overall already. 👍
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
flink-end-to-end-tests/flink-batch-sql-test/src/test/resources/log4j2-test.properties
Outdated
Show resolved
Hide resolved
@EnumSource( | ||
value = BatchShuffleMode.class, | ||
names = { | ||
"ALL_EXCHANGES_BLOCKING", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"ALL_EXCHANGES_BLOCKING", | |
"ALL_EXCHANGES_PIPELINED", | |
"ALL_EXCHANGES_BLOCKING", |
Does it make sense to add the pipelined mode as well? (just thinking out loud, I don't have much knowledge of this part of the code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an expert either, but I tried and I get an IllegalState
:
At the moment, adaptive batch scheduler requires batch workloads to be executed with types of all edges being BLOCKING or HYBRID_FULL/HYBRID_SELECTIVE. To do that, you need to configure 'execution.batch-shuffle-mode' to 'ALL_EXCHANGES_BLOCKING' or 'ALL_EXCHANGES_HYBRID_FULL/ALL_EXCHANGES_HYBRID_SELECTIVE'. Note that for DataSet jobs which do not recognize the aforementioned shuffle mode, the ExecutionMode needs to be BATCH_FORCED to force BLOCKING shuffle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to add comment that only ALL_EXCHANGES_BLOCKING
, ALL_EXCHANGES_HYBRID_FULL
, and ALL_EXCHANGES_HYBRID_SELECTIVE
are supported by the adaptive batch scheduler.
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Outdated
Show resolved
Hide resolved
fyi: I will be off for the rest of April and, therefore, wouldn't be able to finalize this PR. You might want to reach out to other committers or expect a delay in my responses. |
@XComp Glad for your vacation! It turned out to be quite tough, as probably it is not that common, or these new APIs are not super-well documented for now. I wanted to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @affo !
I have added couple questions
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/BatchSQLTest.java
Show resolved
Hide resolved
...nd-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me 👍 The test runs locally and in CI as well.
$ ./mvnw -Prun-end-to-end-tests -pl flink-end-to-end-tests/flink-batch-sql-test verify -Dfast
I guess it's ready to be merged. I have a few minor things/questions, though.
...nd-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java
Outdated
Show resolved
Hide resolved
...to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/GeneratedRow.java
Outdated
Show resolved
Hide resolved
...nd-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java
Outdated
Show resolved
Hide resolved
@XComp Hello! Final touches done and your comments are addressed 👍 The problem for which I had to implement a I made it accept a In my use case, I simply made the quirk that I preserved the previous implementation using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍 Thanks for you contribution. One minor thing on the InternalGenerator
. Feel free to reject my proposal
...nd-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for keeping up with me here. The PR looks good overall modulo CI. 👍 Let's wait for CI to pass and we should be able to merge the change.
...nd-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java
Outdated
Show resolved
Hide resolved
CI test failure is unrelated: FLINK-34513 |
@flinkbot run azure |
I'm not gonna wait for another CI round. Looks like the CI bot didn't pick up the rerun command. Anyway, I verified that the test ran (see logs). |
One final thing: I wasn't able to do it myself somehow. Can you change the commit message prefix from |
@XComp done! Don't worry in any case, I loved the review process. This is my first contribution and this is part of learning for next ones 🤝 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @affo for taking care of it. I just left some comments. PTAL.
@EnumSource( | ||
value = BatchShuffleMode.class, | ||
names = { | ||
"ALL_EXCHANGES_BLOCKING", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to add comment that only ALL_EXCHANGES_BLOCKING
, ALL_EXCHANGES_HYBRID_FULL
, and ALL_EXCHANGES_HYBRID_SELECTIVE
are supported by the adaptive batch scheduler.
int keyIndex = 0; | ||
long ms = 0; | ||
while (ms < durationMs) { | ||
elements.add(createRow(keyIndex++, ms, offsetMs)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new implementation will consume more memory than the old one which will generate row
iteratively on the fly. This could be a potential issue for large data volume batch tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to add comment that only ALL_EXCHANGES_BLOCKING, ALL_EXCHANGES_HYBRID_FULL, and ALL_EXCHANGES_HYBRID_SELECTIVE are supported by the adaptive batch scheduler.
Definitely 👍
The new implementation will consume more memory than the old one which will generate row iteratively on the fly. This could be a potential issue for large data volume batch tests.
Yep, now all records are generated and then used during execution, while before records were generated on the fly.
It is still possible to have such an implementation, however I am going to add context:
The PR started with a port from bash to Java.
The first pass was easy, but included the use of many deprecated method.
With @XComp we opted for improving that in a second commit.
While solving the deprecation warning, I decided to re-use the FromElementsSource
already implemented in the test utils. However, that source is meant to be fault-tolerant, so, it requires to be able to get any record produced by offset, hence the List
of elements created is necessary.
Truth be told, for this test, no fault-tolerance mechanism is required. I could have another implementation of that without that strict requirement that can use records on the fly and forget about those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reply. Commonly, batch processing does not rely on offset. Would you please help me understand why the source should be fault-tolerant and requires getting record by offset for batch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JingGe yeah, nothing related strictly to this case.
The FromElementsSource
is actually generic and can be used and is used in the streaming case.
Here I am using it in batch table mode.
I am just reusing that as it is possible 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, I could have another implementation of the bounded source without any fault-tolerance guarantee 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, the number of records is not huge, that's why I did not mention that 👍
However, I understand your concerns as well 👍
At this point I would write another generator as part of this PR.
However, I would provide that as part of test-utils rather than only confined to batch as other tests could benefit from that.
What do you guys think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds great! Please feel free to create a follow up ticket and contribute the new generator with a new PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, the number of records is not huge, that's why I did not mention that 👍
True - that's a valid point. I didn't check the number of elements as part of my last comment. I leave the decision up to you whether it's done in a new PR or part of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JingGe any objections? The refactoring should be ok considering that the amount of data involved is quite low. The actual migration from bash to Java is also done in a separate commit which enables us to revert if we feel it's necessary. WDYT?
What is the purpose of the change
Migrate
test_batch_sql.sh
to end-to-end test frameworks.Brief change log
BatchSQLTest
portingtest_batch_sql.sh
FlinkDistribution
test_batch_sql.sh
scripttest_batch_sql.sh
invocations fromrun-nightly-tests.sh
Verifying this change
This change added tests and can be verified as follows:
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation