Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-24019][build][dist] Package Scala APIs separately #17493

Merged
merged 6 commits into from
Oct 22, 2021

Conversation

zentol
Copy link
Contributor

@zentol zentol commented Oct 15, 2021

With this PR the flink-(streaming-)scala APIs and Scala are bundled in a separate flink-scala jar in lib/. Combined with #17402 this will allow users to remove Scala from the classpath while still being able to run Java DataSet/-Stream jobs.

@flinkbot
Copy link
Collaborator

flinkbot commented Oct 15, 2021

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@flinkbot
Copy link
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 873dd4f (Fri Oct 15 07:15:34 UTC 2021)

Warnings:

  • 3 pom.xml files were touched: Check for build and licensing issues.
  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@dmvk
Copy link
Member

dmvk commented Oct 18, 2021

@zentol The license check seems to be failing

@zentol
Copy link
Contributor Author

zentol commented Oct 18, 2021

@dmvk yep; there were some issues with the packaging. Should be fixed now.

@zentol
Copy link
Contributor Author

zentol commented Oct 18, 2021

I'm also looking into why this failed in the first place; the original version should have worked.

@zentol zentol force-pushed the 24019 branch 2 times, most recently from ee9ba11 to 92c87b6 Compare October 18, 2021 11:04
@dmvk
Copy link
Member

dmvk commented Oct 20, 2021

Looking at the build output, we should be able to remove the scala version suffix from the flink-dist jar after https://github.com/apache/flink/pull/17518/files is merged right?

@zentol
Copy link
Contributor Author

zentol commented Oct 20, 2021

No, because it still has various non-test dependencies that rely on Scala, like the Table API, various examples, optional libraries like gelly/cep.
For the Scala suffixes we do not care what is being bundled; the important bit is the dependency tree.

@dmvk
Copy link
Member

dmvk commented Oct 20, 2021

For the Scala suffixes we do not care what is being bundled; the important bit is the dependency tree.

I'm not 100% sure about this point. From the user perspective:

  • I don't care about how the Flink distribution is being build.
  • If I look into the flink-dist_2.12-1.15-SNAPSHOT.jar, my immediate feeling would be... "yeah, this jar bundles a scala classes"

I'm not proposing to change name of the build module, but simply renaming the "bundled output artifact"


Just to double check, idea is that user can opt-out by removing the flink-dist-scala jar from the lib folder.

  • For now this would break the Table API (this is packaged in separate jar); this limitation will be addressed in FLINK-24427

Copy link
Member

@dmvk dmvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, great job 👍 I've added some comments and questions, but nothing major.

I've also built the flink distribution locally and checked the flink-dist a flink-scala jar contents.

<file>
<source>../flink-dist-scala/target/flink-dist-scala_${scala.binary.version}-${project.version}.jar</source>
<outputDirectory>lib/</outputDirectory>
<destName>flink-scala_${scala.binary.version}-${project.version}.jar</destName>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change the destination name to flink-dist-scala here?

Suggested change
<destName>flink-scala_${scala.binary.version}-${project.version}.jar</destName>
<destName>flink-dist-scala_${scala.binary.version}-${project.version}.jar</destName>

Copy link
Contributor Author

@zentol zentol Oct 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because that would break all scripts that use globbing patters like flink-dist*.
(tried that initially...)

<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-deploy-plugin</artifactId>
<version>2.4</version><!--$NO-MVN-MAN-VER$-->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we downgrading the deploy-plugin version here (from 2.8.2 used in the Apache parent)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its the same version we use everywhere in the project. I don't think there's a reason for it; we never looked into it because there's isn't really a reason to so.

@zentol
Copy link
Contributor Author

zentol commented Oct 20, 2021

Just to double check, idea is that user can opt-out by removing the flink-dist-scala jar from the lib folder. For now this would break the Table API (this is packaged in separate jar); this limitation will be addressed in FLINK-24427

That is correct. Users can only use the DataSet/-Stream APIs without Scala.

@zentol
Copy link
Contributor Author

zentol commented Oct 20, 2021

[flink-dist does not bundle any scala-reliant classes so let's remove the suffix from the artifact but keep it on the artifactId]

That's something we could consider, yes.

@zentol zentol merged commit 86c9694 into apache:master Oct 22, 2021
@zentol zentol deleted the 24019 branch November 8, 2021 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants