[FLINK-28203] Support Maven 3.3+ #21349

Merged
merged 5 commits into from
May 12, 2023

@zentol zentol commented Nov 18, 2022

Based on #21346 (for test stability).

This PR sets up our build system to support Maven 3.3+ and our CI system to run with Maven 3.8.6.

In Maven 3.3 the dependency tree was made immutable at runtime, and thus can no longer be changed by the shade plugin. The plugin would usually remove a dependency from the tree when it is being bundled (known as dependency reduction).
Dependency reduction still works for the published poms (== what users consume), since the plugin can still change the content of the final pom, but it no longer works while developing Flink. This breaks plenty of things, because a whole bunch of dependencies that were previously hidden are suddenly visible to downstream modules.
To work around this we now mark all dependencies that we bundle as optional; this makes them non-transitive.
Behavior-wise a non-transitive dependency is identical to a removed dependency.
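
To illustrate why an optional dependency behaves like a removed one for downstream modules, here is a minimal sketch of transitive resolution. It is a toy model and not Maven's resolver; the class, the `Dep` type, and the artifact names (`flink-module`, `bundled-lib`, `logging-api`) are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of transitive dependency resolution; this is not Maven's resolver. */
class OptionalTransitivitySketch {

    /** One declared dependency of a module. All artifact names are hypothetical. */
    static final class Dep {
        final String artifact;
        final boolean optional;

        Dep(String artifact, boolean optional) {
            this.artifact = artifact;
            this.optional = optional;
        }
    }

    /**
     * Returns what a consumer of {@code root} sees: root itself plus everything
     * reachable from it through non-optional declarations.
     */
    static Set<String> visibleDownstream(String root, Map<String, List<Dep>> declared) {
        Set<String> visible = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (!visible.add(current)) {
                continue; // already handled
            }
            for (Dep dep : declared.getOrDefault(current, List.of())) {
                if (!dep.optional) {
                    queue.add(dep.artifact); // optional dependencies are not passed on
                }
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Development pom: the bundled dependency is still declared, but optional.
        Map<String, List<Dep>> optionalPom = Map.of(
                "flink-module",
                List.of(new Dep("bundled-lib", true), new Dep("logging-api", false)));
        // Dependency-reduced pom: the bundled dependency was removed entirely.
        Map<String, List<Dep>> reducedPom = Map.of(
                "flink-module", List.of(new Dep("logging-api", false)));

        System.out.println(visibleDownstream("flink-module", optionalPom));
        System.out.println(visibleDownstream("flink-module", reducedPom));
    }
}
```

Both calls yield the same set for the consumer, which is the sense in which the two poms are equivalent in this model.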

To also make this work in the IDE (which never interacts with jars within a project, in contrast to Maven) the optional flag is modeled as a property that defaults to true, but is set to false by a special profile when the module is imported into IntelliJ.

Naturally, requiring all bundled dependencies to be marked as optional is prone to errors. To that end the ShadeOptionalChecker is being introduced, which analyzes the bundled dependencies (based on the shade-plugin output) and the set of dependencies (based on the dependency plugin) to detect cases where a dependency is not marked as optional when it should be.
The enforced rule is rather simple: Any dependency that is bundled, or any of its parents, must show up as optional in the dependency tree.
The parent clause is required to cover cases where a module has 2 paths to a bundled dependency.
If a module depends on A1/A2, each depending on B, with A1 and B being bundled, then even if A1 is marked as optional B is still shown as a non-optional dependency (because the non-optional A2 still needs it!).
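
The rule, including the parent clause, can be sketched roughly as follows. This is not the ShadeOptionalChecker from this PR, only a minimal illustration of the rule as stated above; the `Node` type and `findViolations` method are hypothetical, and the tree is built by hand using the A1/A2/B example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Sketch of the "bundled dependency or one of its parents must be optional" rule. */
class ShadeOptionalRuleSketch {

    /** A node as it would appear in the dependency tree (hypothetical type). */
    static final class Node {
        final String artifact;
        final boolean optional;
        final List<Node> children;

        Node(String artifact, boolean optional, Node... children) {
            this.artifact = artifact;
            this.optional = optional;
            this.children = List.of(children);
        }
    }

    /** Returns all bundled artifacts that are neither optional nor below an optional parent. */
    static List<String> findViolations(Node module, Set<String> bundled) {
        List<String> violations = new ArrayList<>();
        for (Node child : module.children) {
            collect(child, false, bundled, violations);
        }
        return violations;
    }

    private static void collect(
            Node node, boolean parentOptional, Set<String> bundled, List<String> violations) {
        boolean covered = parentOptional || node.optional;
        if (bundled.contains(node.artifact) && !covered) {
            violations.add(node.artifact);
        }
        for (Node child : node.children) {
            collect(child, covered, bundled, violations);
        }
    }

    public static void main(String[] args) {
        Set<String> bundled = Set.of("A1", "B");

        // B is listed below the optional A1 but is itself shown as non-optional,
        // because the non-optional A2 also needs it. The parent clause accepts this.
        Node accepted = new Node("module", false,
                new Node("A1", true, new Node("B", false)),
                new Node("A2", false));
        System.out.println(findViolations(accepted, bundled)); // no violations

        // If B is only listed below the non-optional A2, the rule is violated.
        Node rejected = new Node("module", false,
                new Node("A1", true),
                new Node("A2", false, new Node("B", false)));
        System.out.println(findViolations(rejected, bundled)); // reports B
    }
}
```

The second tree shows the kind of situation where a dependency may have to be added or excluded purely so that it shows up correctly in the tree.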

This has the caveat that going forward we may, at times, have to add/exclude dependencies purely for things to show up correctly in the dependency tree.

TODO: The ShadeOptionalChecker needs tests.

flinkbot commented Nov 18, 2022

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

zentol commented Nov 21, 2022

Some discourse on why the PR is the way it is:

Could this be limited to modules that actually need this?

Yes. We could determine which modules are relied upon by other modules and only enforce the optional flags for said modules. This would however result in inconsistent module poms and additional work for the poor soul that adds a dependency on another module.

Could this be achieved with exclusions?

Yes-ish. We could add exclusions for all bundled dependencies in the consuming modules, and optionally add another module in between to centralize these exclusions in a single place. (Similar to what the -loader modules do, at the cost of even more modules).
This however would require exclusions/bundling rules in 2 distinct files to be kept in sync. This is always problematic and can easily result in excess exclusions that may cause problems down the line.
It would additionally be more difficult to test this because there's no clear answer for a consuming module as to what should actually be excluded.
We'd implicitly limit the setup to the one described in "Could this be limited to modules that actually need this?".
Making this work with IntelliJ would be quite noisy, as we'd have to set up the exclusions in a profile.

@zentol zentol force-pushed the optional_shade branch 2 times, most recently from ddc6dd5 to 054b181 November 21, 2022 11:41
@alpinegizmo

> If a module depends on A1/A2, each depending on B, with A1 and B being bundled, then even if A1 is marked as optional B is still shown as a non-optional dependency (because the non-optional A2 still needs it!).

Are there cases where this currently occurs?

@alpinegizmo

I buy your reasoning as to why you've chosen to set it up this way, as opposed to the alternatives you described. This seems to create the most straightforward and most maintainable of the possible outcomes.

alpinegizmo commented Nov 23, 2022

Some version of the explanation in this PR needs to be put somewhere where Flink developers can find it. Probably here: https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/flinkdev/building/#dependency-shading

zentol commented Nov 23, 2022

> If a module depends on A1/A2, each depending on B, with A1 and B being bundled, then even if A1 is marked as optional B is still shown as a non-optional dependency (because the non-optional A2 still needs it!).

There are a few cases, yes. :(
Some examples:

flink-sql-parquet (because of a test dependency!):

[INFO] |  \- org.apache.avro:avro:jar:1.11.1:compile (optional) 
[INFO] |     +- com.fasterxml.jackson.core:jackson-core:jar:2.13.4:compile (optional) 
[INFO] |     +- com.fasterxml.jackson.core:jackson-databind:jar:2.13.4.2:compile (optional) 
[INFO] |     |  \- com.fasterxml.jackson.core:jackson-annotations:jar:2.13.4:compile

flink-connector-kinesis:

[INFO] +- com.amazonaws:aws-java-sdk-kinesis:jar:1.12.276:compile (optional) 
[INFO] |  +- com.amazonaws:aws-java-sdk-core:jar:1.12.276:compile
[INFO] |  |  \- software.amazon.ion:ion-java:jar:1.0.2:compile
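
As a rough illustration of how such output could be checked mechanically, the sketch below parses tree lines and records whether each entry, or one of its parents, is flagged `(optional)`. The regular expression is an assumption derived only from the excerpts above and is not a complete grammar of the dependency plugin's output; the class and method names are hypothetical and this is not how the ShadeOptionalChecker is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses dependency-tree lines like the excerpts above (format assumed from them). */
class DependencyTreeLineSketch {

    // "[INFO] ", then 3-character indent segments ("|  ", "+- ", "\- ", "   "),
    // then the coordinates, then an optional " (optional)" marker.
    private static final Pattern LINE =
            Pattern.compile("^\\[INFO\\] ((?:[|+\\\\ ][ -] )*)(\\S+)( \\(optional\\))?\\s*$");

    /** Maps each dependency to whether it or one of its parents is marked optional. */
    static Map<String, Boolean> coveredByOptional(List<String> lines) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        // covered.get(d) belongs to the latest node at depth d; index 0 is the module itself.
        List<Boolean> covered = new ArrayList<>(List.of(false));
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                continue; // not a tree line
            }
            int depth = m.group(1).length() / 3;
            if (depth == 0 || depth > covered.size()) {
                continue; // module line, or an excerpt that starts mid-tree
            }
            boolean isCovered = covered.get(depth - 1) || m.group(3) != null;
            while (covered.size() > depth) {
                covered.remove(covered.size() - 1); // forget finished sibling branches
            }
            covered.add(isCovered);
            result.put(m.group(2), isCovered);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> kinesisExcerpt = List.of(
                "[INFO] +- com.amazonaws:aws-java-sdk-kinesis:jar:1.12.276:compile (optional) ",
                "[INFO] |  +- com.amazonaws:aws-java-sdk-core:jar:1.12.276:compile",
                "[INFO] |  |  \\- software.amazon.ion:ion-java:jar:1.0.2:compile");
        coveredByOptional(kinesisExcerpt)
                .forEach((dep, ok) -> System.out.println(dep + " covered=" + ok));
    }
}
```

For the flink-connector-kinesis excerpt, the two non-optional entries count as covered only because their parent `aws-java-sdk-kinesis` is optional.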

zentol commented Nov 23, 2022

> Some version of the explanation in this PR needs to be put somewhere where Flink developers can find it. Probably here: https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/flinkdev/building/#dependency-shading

That page is all about building Flink from source and the peculiarities of building Flink on Maven 3.3+ (which this PR should actually update!). It's not really developer documentation on how to set up shading (or really changing Flink in any way) etc.

I'm inclined to add a page to the wiki / extend the Dependencies page.

@alpinegizmo

> > Some version of the explanation in this PR needs to be put somewhere where Flink developers can find it. Probably here: https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/flinkdev/building/#dependency-shading
>
> That page is all about building Flink from source and the peculiarities of building Flink on Maven 3.3+ (which this PR should actually update!). It's not really developer documentation on how to set up shading (or really changing Flink in any way) etc.
>
> I'm inclined to add a page to the wiki / extend the Dependencies page.

That page in the docs on building Flink is in a section entitled "Flink Development", and building Flink from source is the first step toward contributing to Flink. Perhaps developer documentation doesn't belong here, but there should at least be a noticeable pointer to the wiki. I say this as someone who was, until quite recently, unaware that there is valuable content for contributors (other than the FLIPs) in the wiki.

zentol commented Nov 23, 2022

> there should at least be a noticeable pointer to the wiki

Definitely. We haven't done a good job documenting the wiki as an actual source for developer docs.

@alpinegizmo alpinegizmo left a comment

Some typos.

@zentol zentol marked this pull request as ready for review November 29, 2022 09:19
@zentol zentol changed the title from "Support Maven 3.3+" to "[FLINK-28203] Support Maven 3.3+" Dec 1, 2022
@zentol zentol requested a review from XComp April 25, 2023 09:21
@XComp XComp left a comment

Thanks, @zentol, for this PR. I went over it. The biggest issue is the documentation, I guess. Some of the description that is added to this PR could be included in the code as well.

tools/ci/verify_bundled_optional.sh
README.md
flink-connectors/flink-sql-connector-hive-3.1.3/pom.xml
flink-dist/pom.xml
flink-filesystems/flink-s3-fs-presto/pom.xml
flink-formats/flink-sql-parquet/pom.xml
flink-python/pom.xml
@XComp XComp left a comment

I went over the changes. One test needs to be revisited. PTAL

@zentol zentol requested a review from XComp May 11, 2023 14:11
@XComp XComp left a comment

LGTM 👍

Clarify that this dependency is meant to be provided by the user.
XComp commented May 12, 2023

FYI: FLINK-32066 ...if you're waiting for another CI run to be picked up.

zentol commented May 12, 2023

@zentol zentol merged commit f0d0190 into apache:master May 12, 2023
@zentol zentol deleted the optional_shade branch May 22, 2023 09:52
4 participants