
[FLINK-20398][e2e] Migrate test_batch_sql.sh to Java e2e tests framework #24471

Open · wants to merge 2 commits into master
Conversation

affo
Contributor

@affo affo commented Mar 8, 2024

What is the purpose of the change

Migrate test_batch_sql.sh to the Java end-to-end test framework.

Brief change log

  • implement BatchSQLTest porting test_batch_sql.sh
  • fix issue in getting job ID in FlinkDistribution
  • remove test_batch_sql.sh script
  • remove test_batch_sql.sh invocations from run-nightly-tests.sh

Verifying this change

This change added tests and can be verified as follows:

  • Added integration tests for end-to-end batch mode SQL query execution

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? NA

@flinkbot
Collaborator

flinkbot commented Mar 8, 2024

CI report:

Bot commands: The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@affo
Contributor Author

affo commented Mar 11, 2024

@wuchong @XComp

Hi guys, here is the PR for https://issues.apache.org/jira/browse/FLINK-20398.

I decided to go for LocalStandaloneFlinkResourceFactory as it is already used and part of flink-end-to-end-tests-common. Other options were:

  • MiniCluster
  • FlinkContainers (testcontainers-based)

Just tell me if you would rather see one of those implementations for some reason 👍

Important concern:
This test used to be part of run-nightly-tests.sh,
now I think it would run differently 🤔

Should it still run nightly?

Contributor

@XComp XComp left a comment

Thanks @affo .

This test used to be part of run-nightly-tests.sh,

The Java e2e tests are also triggered in the nightly run (see ./flink-end-to-end-tests/run-nightly-tests.sh:259).

@affo
Contributor Author

affo commented Mar 18, 2024

@XComp thank you for your review, gonna address your feedback today (as I had a week off)

@affo
Contributor Author

affo commented Mar 19, 2024

@XComp

It required quite an effort honestly, but here we are with the JUnit5 version of what I had before 👍

This also allowed me not to start a separate jar, but to directly include the code in the test and run it against the MiniCluster obtained 👍

Thank you for your detailed review


UPDATE

I am still investigating why the test fails in CI, as I cannot reproduce that locally...
I tried to use Java 8 for compiling and running, but I actually hit another error 😓


UPDATE

rebased and force-pushed, now CI is ok 👍

@affo
Contributor Author

affo commented Mar 25, 2024

@XComp everything should be ok now 👍

Contributor

@XComp XComp left a comment

Good job. I left a few nitty comments. But it looks good overall already. 👍

@EnumSource(
value = BatchShuffleMode.class,
names = {
"ALL_EXCHANGES_BLOCKING",
Contributor

Suggested change
"ALL_EXCHANGES_BLOCKING",
"ALL_EXCHANGES_PIPELINED",
"ALL_EXCHANGES_BLOCKING",

Does it make sense to add the pipelined mode as well? (just thinking out loud, I don't have much knowledge of this part of the code).

Contributor Author

Not an expert either, but I tried and I get an IllegalState:

At the moment, adaptive batch scheduler requires batch workloads to be executed with types of all edges being BLOCKING or HYBRID_FULL/HYBRID_SELECTIVE. To do that, you need to configure 'execution.batch-shuffle-mode' to 'ALL_EXCHANGES_BLOCKING' or 'ALL_EXCHANGES_HYBRID_FULL/ALL_EXCHANGES_HYBRID_SELECTIVE'. Note that for DataSet jobs which do not recognize the aforementioned shuffle mode, the ExecutionMode needs to be BATCH_FORCED to force BLOCKING shuffle
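For reference, the setting the error message asks for is a one-line entry in the Flink configuration file (flink-conf.yaml); ALL_EXCHANGES_HYBRID_FULL and ALL_EXCHANGES_HYBRID_SELECTIVE are the alternative values it mentions:

```
execution.batch-shuffle-mode: ALL_EXCHANGES_BLOCKING
```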

Contributor

I think it makes sense to add a comment that only ALL_EXCHANGES_BLOCKING, ALL_EXCHANGES_HYBRID_FULL, and ALL_EXCHANGES_HYBRID_SELECTIVE are supported by the adaptive batch scheduler.

@XComp
Contributor

XComp commented Apr 3, 2024

fyi: I will be off for the rest of April and, therefore, wouldn't be able to finalize this PR. You might want to reach out to other committers or expect a delay in my responses.

@affo
Contributor Author

affo commented Apr 11, 2024

@XComp Glad for your vacation!
Finally I also addressed the deprecation warnings and went through the implementation of a custom connector through DynamicTableSource.

It turned out to be quite tough, probably because it is not that common, or because these new APIs are not that well documented yet.

I wanted to use TableEnvironment.fromValues; however, I could not use it, as the test was hanging...
I want to understand why and, if needed, file an issue for that.

@morazow morazow left a comment

Thanks @affo !

I have added a couple of questions.

Contributor

@XComp XComp left a comment

Looks good to me 👍 The test runs locally and in CI as well.

$ ./mvnw -Prun-end-to-end-tests -pl flink-end-to-end-tests/flink-batch-sql-test verify -Dfast

I guess it's ready to be merged. I have a few minor things/questions, though.

@affo
Contributor Author

affo commented May 23, 2024

@XComp Hello!

Final touches done and your comments are addressed 👍
I added the capability for FromElementsSource to accept an ElementsSupplier at construction time.

The problem that forced me to implement a Serializable extension of RowData was that FromElementsSource had a field List<OUT> elements, where OUT can also be non-serializable (which is the case for RowData), so the operator could not be serialized when the job was starting.

I made it accept an ElementsSupplier extends Serializable so that it is clear that the supplier must be serializable.

In my use case, I simply preserved the previous implementation using Row (which is serializable) and convert it to RowData on get. No class now has RowData fields that would prevent its serialization.
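The serialization issue described above can be sketched in plain Java. Everything below is a simplified stand-in, not Flink's actual API: `RowData`, `Row`, `ElementsSupplier`, and the two supplier classes only mirror the shape of the problem (a non-serializable element type held in a field) and the fix (a Serializable supplier that converts on `get()`).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SupplierDemo {

    // Stand-in for Flink's RowData: deliberately NOT Serializable.
    static class RowData {
        final Object[] fields;
        RowData(Object... fields) { this.fields = fields; }
    }

    // Stand-in for Flink's Row: Serializable.
    static class Row implements Serializable {
        final Object[] fields;
        Row(Object... fields) { this.fields = fields; }
    }

    // The pattern from the PR: the supplier interface itself extends
    // Serializable, making the contract explicit.
    interface ElementsSupplier<T> extends Serializable {
        T get(int offset);
        int numElements();
    }

    // Anti-pattern: holding RowData in a field makes the holder unserializable.
    static class BadSupplier implements ElementsSupplier<RowData> {
        private final List<RowData> elements;
        BadSupplier(List<RowData> elements) { this.elements = elements; }
        public RowData get(int offset) { return elements.get(offset); }
        public int numElements() { return elements.size(); }
    }

    // The fix: store serializable Rows, convert to RowData only on get().
    static class RowToRowDataSupplier implements ElementsSupplier<RowData> {
        private final List<Row> rows;
        RowToRowDataSupplier(List<Row> rows) { this.rows = rows; }
        public RowData get(int offset) { return new RowData(rows.get(offset).fields); }
        public int numElements() { return rows.size(); }
    }

    // True iff Java serialization of o succeeds (roughly what happens when
    // an operator is shipped to the cluster).
    static boolean serializes(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<RowData> rowData = new ArrayList<>();
        rowData.add(new RowData(1, "a"));
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "a"));

        System.out.println(serializes(new BadSupplier(rowData)));       // false
        System.out.println(serializes(new RowToRowDataSupplier(rows))); // true
    }
}
```

The conversion cost on `get()` is paid per access, which is fine for a test source with few elements.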

Contributor

@XComp XComp left a comment

LGTM 👍 Thanks for your contribution. One minor thing on the InternalGenerator; feel free to reject my proposal.

Contributor

@XComp XComp left a comment

Thanks for keeping up with me here. The PR looks good overall modulo CI. 👍 Let's wait for CI to pass and we should be able to merge the change.

@XComp
Contributor

XComp commented May 29, 2024

CI test failure is unrelated: FLINK-34513

@XComp
Contributor

XComp commented May 29, 2024

@flinkbot run azure

@XComp
Contributor

XComp commented May 30, 2024

I'm not gonna wait for another CI round. Looks like the CI bot didn't pick up the rerun command. Anyway, I verified that the test ran (see logs).

@XComp
Contributor

XComp commented May 30, 2024

One final thing: I wasn't able to do it myself somehow. Can you change the commit message prefix from [refactor] to [FLINK-20398]? "refactor" isn't a prefix the Flink community usually uses.

@affo
Contributor Author

affo commented May 30, 2024

@XComp done!

Don't worry in any case, I loved the review process. This is my first contribution and this is part of learning for next ones 🤝

Contributor

@JingGe JingGe left a comment

Thanks @affo for taking care of it. I just left some comments. PTAL.

@EnumSource(
value = BatchShuffleMode.class,
names = {
"ALL_EXCHANGES_BLOCKING",
Contributor

I think it makes sense to add a comment that only ALL_EXCHANGES_BLOCKING, ALL_EXCHANGES_HYBRID_FULL, and ALL_EXCHANGES_HYBRID_SELECTIVE are supported by the adaptive batch scheduler.

int keyIndex = 0;
long ms = 0;
while (ms < durationMs) {
elements.add(createRow(keyIndex++, ms, offsetMs));
Contributor

The new implementation will consume more memory than the old one, which generated rows iteratively on the fly. This could be a potential issue for large-data-volume batch tests.

Contributor Author

I think it makes sense to add comment that only ALL_EXCHANGES_BLOCKING, ALL_EXCHANGES_HYBRID_FULL, and ALL_EXCHANGES_HYBRID_SELECTIVE are supported by the adaptive batch scheduler.

Definitely 👍

The new implementation will consume more memory than the old one which will generate row iteratively on the fly. This could be a potential issue for large data volume batch tests.

Yep, now all records are generated up front and then used during execution, while before records were generated on the fly.
It is still possible to have such an implementation; however, let me add some context:

The PR started as a port from bash to Java.
The first pass was easy, but included the use of many deprecated methods.
With @XComp we opted for improving that in a second commit.
While solving the deprecation warnings, I decided to re-use the FromElementsSource already implemented in the test utils. However, that source is meant to be fault-tolerant, so it requires being able to get any produced record by offset, hence the List of created elements is necessary.

Truth be told, no fault-tolerance mechanism is required for this test. I could have another implementation without that strict requirement that uses records on the fly and forgets them.

@JingGe @XComp should I proceed? Thank you!
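The on-the-fly alternative discussed above could look roughly like this in plain Java. This is a sketch only: `Iterator` stands in for a Flink source, `long[]` stands in for the result of the PR's `createRow(...)`, and names like `OnTheFlyGenerator` and `countRows` are illustrative, not anything from the PR.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class OnTheFlyGenerator implements Iterator<long[]> {

    // Lazily generates one row per step instead of materializing all rows
    // in a List up front, so memory use stays constant regardless of
    // durationMs.
    private final long durationMs;
    private final long stepMs;
    private long ms = 0;
    private int keyIndex = 0;

    OnTheFlyGenerator(long durationMs, long stepMs) {
        this.durationMs = durationMs;
        this.stepMs = stepMs;
    }

    @Override
    public boolean hasNext() {
        return ms < durationMs;
    }

    @Override
    public long[] next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // Stand-in for createRow(keyIndex, ms, offsetMs) in the PR.
        long[] row = {keyIndex++, ms};
        ms += stepMs;
        return row;
    }

    // Drains the generator and counts rows; only one row exists at a time.
    public static int countRows(long durationMs, long stepMs) {
        OnTheFlyGenerator g = new OnTheFlyGenerator(durationMs, stepMs);
        int n = 0;
        while (g.hasNext()) {
            g.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countRows(1000, 100)); // 10: one row per 100 ms over 1 s
    }
}
```

The trade-off against the materialized List is exactly the fault-tolerance point above: a forgetful generator cannot replay an arbitrary record by offset.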

Contributor

@JingGe JingGe Jun 3, 2024

Thanks for the reply. Commonly, batch processing does not rely on offsets. Could you help me understand why the source should be fault-tolerant and require getting records by offset for batch?

Contributor Author

@JingGe yeah, nothing strictly related to this case.

The FromElementsSource is actually generic; it can be and is used in the streaming case.

Here I am using it in batch table mode.

I am just reusing it because it is possible 👍

Contributor Author

In other words, I could have another implementation of the bounded source without any fault-tolerance guarantee 👍

Contributor Author

@JingGe @XComp

Got it, the number of records is not huge, that's why I did not mention that 👍
However, I understand your concerns as well 👍

At this point I would write another generator as part of this PR.

However, I would provide it as part of test-utils rather than confine it to batch, as other tests could benefit from it.

What do you guys think?

Contributor

Sounds great! Please feel free to create a follow up ticket and contribute the new generator with a new PR.

Contributor

Got it, the number of records is not huge, that's why I did not mention that 👍

True, that's a valid point. I didn't check the number of elements as part of my last comment. I leave the decision up to you whether it's done in a new PR or as part of this PR.

Contributor Author

@JingGe @XComp Thank you for the feedback!

@XComp I would merge this as it is.

In the background I was already working on something similar; I will create another issue for adding a test source for batch tests and for the Table API.

Contributor

@JingGe any objections? The refactoring should be ok considering that the amount of data involved is quite low. The actual migration from bash to Java is also done in a separate commit which enables us to revert if we feel it's necessary. WDYT?
