Flink: Add support for Flink 1.18 #8930

Closed
YesOrNo828 opened this issue Oct 27, 2023 · 10 comments

@YesOrNo828

Feature Request / Improvement

Feature Request:

Recently, Flink 1.18 was released:
https://nightlies.apache.org/flink/flink-docs-release-1.18/release-notes/flink-1.18/

What to do?

The Flink community has been working hard to improve the stability of the connector interfaces. This makes it easier to abstract out and maintain a common Flink connector across different Flink versions. Similar efforts have been made in other open-source communities, such as Paimon (Apache Incubating) and Amoro.

The expected Flink modules would be:
/flink
|---/v1.16
|------/flink
|------/flink-runtime
|---/v1.17
|------/flink
|------/flink-runtime
|---/v1.18
|------/flink
|------/flink-runtime
|---/flink-common # depends on the newest Flink version, 1.18

This issue only concerns the flink-common abstraction for the Flink 1.18 version; refactoring the 1.17 and other Flink connectors would be done in separate issues.
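
To illustrate the idea, here is a rough, purely hypothetical sketch of the kind of split flink-common would enable (the interface and class names below are placeholders, not existing Iceberg classes):

```java
import org.apache.flink.table.data.RowData;
import org.apache.iceberg.ScanTask;
import org.apache.iceberg.io.CloseableIterator;

// In /flink/flink-common: code written only against Flink APIs that are
// identical across all supported Flink versions.
interface RowDataReaderFactory {
  CloseableIterator<RowData> createReader(ScanTask task);
}

// In /flink/v1.18 (and similarly /flink/v1.17, /flink/v1.16): the small,
// version-specific part that actually differs between Flink releases.
class Flink118RowDataReaderFactory implements RowDataReaderFactory {
  @Override
  public CloseableIterator<RowData> createReader(ScanTask task) {
    // version-specific wiring would live here
    throw new UnsupportedOperationException("sketch only");
  }
}
```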

What's the benefit?

  • Reduce the size of the codebase and make the code easier to maintain;
  • Reduce the CI cost.

WDYT?

Query engine

Flink

@pvary
Contributor

pvary commented Oct 27, 2023

I am concerned about the dependency tree of the flink/v1.17 module. If flink-common depends on Flink 1.18, and flink/v1.17 depends on flink-common, then we will have a transitive dependency on Flink 1.18. We can exclude the dependency, but how can we make sure that everything works as expected without running the tests?
Also, this would mean that we need to predict the changes in Flink when deciding what to put into the flink-common package and what to keep in the version-specific packages. As an alternative, we could move these classes around every time it is needed, but that approach has its own maintenance costs.

@YesOrNo828
Author

> then we will have a transitive dependency on Flink 1.18. We can exclude the dependency, but how can we make sure that everything works as expected without running the tests?

You're right, there'll be a transitive dependency. If Flink 1.17's APIs are incompatible with Flink 1.18's, I think we can copy some tests into the Flink 1.17 module to ensure it works fine, and add end-to-end tests to guarantee compatibility with the different Flink versions.

> As an alternative, we could move these classes around every time it is needed, but that approach has its own maintenance costs.

As of Flink 1.18, released a few days ago, the Flink community has externalized the JDBC, Elasticsearch, Kafka, Pulsar, HBase, and other connectors. That means the Flink API has become more stable, so I think the maintenance cost across different Flink versions is acceptable.

@YesOrNo828
Author

The list of supported Flink versions for each connector:
https://flink.apache.org/downloads/#apache-flink-connectors

@pvary
Contributor

pvary commented Oct 28, 2023

> then we will have a transitive dependency on Flink 1.18. We can exclude the dependency, but how can we make sure that everything works as expected without running the tests?

> You're right, there'll be a transitive dependency. If Flink 1.17's APIs are incompatible with Flink 1.18's, I think we can copy some tests into the Flink 1.17 module to ensure it works fine, and add end-to-end tests to guarantee compatibility with the different Flink versions.

I think the issue here is that we cannot be sure in advance which dependencies need to be copied over.

I tried to upgrade to Flink 1.18 yesterday, and found that there is a Flink change which causes the TestFlinkMetaDataTable#testAllFilesPartitioned test to fail. See: https://apache-flink.slack.com/archives/C03G7LJTS2G/p1698402761715879?thread_ts=1698402761.715879&cid=C03G7LJTS2G

One can argue that these issues are found at upgrade time, but if we add features between migrations, we will only find the problems if we run all of the tests.

So I think running all of the tests on every supported Flink version is a must.

> As an alternative, we could move these classes around every time it is needed, but that approach has its own maintenance costs.

> As of Flink 1.18, released a few days ago, the Flink community has externalized the JDBC, Elasticsearch, Kafka, Pulsar, HBase, and other connectors. That means the Flink API has become more stable, so I think the maintenance cost across different Flink versions is acceptable.

I am currently working on extending the Flink Sink API to provide features needed for Iceberg. See:

I would like to use them ASAP, and this will cause differences between the 1.18 and 1.19 versions of the Iceberg connector. If we have flink-common, then up to 1.18 this code should be in common, with 1.19 it should move to the version-specific code, and after we drop support for 1.18 it can move back to the common code again. Also, we would need to come up with the appropriate abstractions for separating the changing code from the fixed code. These issues might be even more pronounced when Flink 2.0 comes out, for which the discussion has already started.

IMHO these are more complex tasks than simply backporting the changes to the other branches. Maybe that is why the Iceberg-Spark code is handled in the same way.

@kevnzhao

kevnzhao commented Nov 8, 2023

Quick check: is there any progress or a plan/timeline for this request?

@pvary
Contributor

pvary commented Nov 8, 2023

I have 2 PRs in progress which I would like to merge first (#8803 and #8553). If nobody starts working on this before they are merged, then I will move forward with it. I plan to use the original method of upgrading (an independent path for every Flink version). I would like to see 1.18 support available before the end of the year, but hopefully sooner.

@PrabhuJoseph
Contributor

The failing test TestFlinkMetaDataTable#testAllFilesPartitioned is due to a Flink-side issue which I reported here: https://issues.apache.org/jira/browse/FLINK-33523.

@PrabhuJoseph
Contributor

Flink 1.17 used to allow converting the collection (List) data types ARRAY<INT NOT NULL> and ARRAY<INT> into Object[] or Integer[]. Flink 1.18 fixed this in https://issues.apache.org/jira/browse/FLINK-31835, which allows casting ARRAY<INT> into Object[] or Integer[] and ARRAY<INT NOT NULL> into int[].

Reference: https://github.com/apache/flink/pull/22485/files#diff-3462f451bf210e099fbea27237e809d83b134c4e9859c2b54d5974e1d786c3a6R199

The failing Iceberg unit test needs to be fixed to handle this case. I have a patch which fixes it. Let me know when we start working on this; I can share the patch for the unit test failure.
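
For illustration only (this is not the actual patch, and the helper name is made up), one way a test could tolerate both conversion results is to normalize the returned array before asserting:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ArrayAssertions {
  // Normalize an ARRAY<INT NOT NULL> value returned by Flink SQL so the same
  // assertion works on Flink 1.17 (Integer[]) and Flink 1.18 (int[]).
  static List<Integer> toIntegerList(Object array) {
    if (array instanceof int[]) {
      // Flink 1.18: non-null INT elements convert to a primitive int[]
      return Arrays.stream((int[]) array).boxed().collect(Collectors.toList());
    } else if (array instanceof Integer[]) {
      // Flink 1.17: the same type used to convert to a boxed Integer[]
      return Arrays.asList((Integer[]) array);
    }
    throw new IllegalArgumentException("Unexpected array class: " + array.getClass());
  }
}
```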

joprabhu@88665a14a66d Aws157Iceberg % ./gradlew -DsparkVersions=$SPARK_VERSION -DhiveVersions=$HIVE_MAJOR_VERSION -DflinkVersions=$FLINK_MAJOR_VERSION :iceberg-flink:iceberg-flink-1.18:test                                                                

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

See https://docs.gradle.org/8.1.1/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 23m 28s

@pvary
Contributor

pvary commented Nov 13, 2023

Thanks @PrabhuJoseph for the investigation!
We will come back to this soon!

@pvary
Contributor

pvary commented Dec 12, 2023

As #9211 is merged, we can close this.
WDYT, @YesOrNo828?
