Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-8193][quickstart] Cleanup quickstart poms #5118

Closed
wants to merge 9 commits into from

Conversation

zentol
Copy link
Contributor

@zentol zentol commented Dec 4, 2017

What is the purpose of the change

This PR addresses 2 problems in the quick start poms:

  1. The poms declared unused dependencies (flink-clients), and were missing used dependencies (flink-core, scala-library).
  2. The manual dependency exclusion using the maven-shade-plugin was prone to being out-dated.

This PR fixes the missing/unused dependencies, and removes the manual exclusion so there is now only one correct way of building the jar, which is using the build-jar profile.

Verifying this change

This change was verified locally. I installed the archetypes locally and created new projects from them.
For verification I executed the programs in the IDE and packaged the jar with the build-jar profile and checked whether that they don't contain any dependencies.

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (docs)

@aljoscha
Copy link
Contributor

aljoscha commented Dec 4, 2017

The comment about "adding to the exclusions" under a) is outdated now. Should we maybe add the shade plugin in the profile, so that users don't accidentally build a super-fat jar?

Overall, I like the simplification.

@StephanEwen
Copy link
Contributor

I think that the change of 1. is actually not necessary. The dependencies were good in my opinion:

  • flink-core is not directly needed, because it is a transitive dependency of flink-java and flink-streaming-java. Omitting this unnecessary dependency makes it easier to keep the -Pbuild-jar profile in sync with the normal profile.
  • We added flink-clients explicitly because that is actually really the dependency needed to execute programs (is needed in the DataSet API, flink-java is not enough). It is only redundant at the moment because flink-streaming-java depends on flink-clients, which is not really nice and probably subject to change in the future.

@StephanEwen
Copy link
Contributor

Is there a way to fail the build if users do not use the -Pbuild-jar profile? I would assume that right now, at least half of the users build a mother-of-all-jars that includes all of Flink.

@zentol
Copy link
Contributor Author

zentol commented Dec 4, 2017

flink-core is a direct dependency due to the usage of the Collector class in the example programs, which is part of flink-core.

@zentol
Copy link
Contributor Author

zentol commented Dec 4, 2017

I'll add flink-clients again with a comment.

@zentol
Copy link
Contributor Author

zentol commented Dec 5, 2017

As for failing the build, I don't like that option, even if I find a way to do it. We do not know how such a kill-switch would interact with various IDEs and tools.

What we could do however is print a warning if mvn package/install is invoked without specifying the build-jar profile.

@aljoscha
Copy link
Contributor

aljoscha commented Dec 5, 2017

I like it this way, but what do you think, @StephanEwen? Is that a strong preference for "fail the build"?

@zentol
Copy link
Contributor Author

zentol commented Dec 5, 2017

Note that the current approach also prints a warning if mvn verify is executed, since that comes after package before which we print the warning.

@aljoscha
Copy link
Contributor

aljoscha commented Dec 5, 2017

Is it possible to print that only in the package phase? (or whatever it's called)

@zentol
Copy link
Contributor Author

zentol commented Dec 5, 2017

that's what we're already doing (well technically in the pre-package phase).

Thing is that the package phase is always executed before verify, and we can't get around that aside from building a custom life-cycle. And we have to print a warning in this case, as the jar is still created and thus could be used.

@StephanEwen
Copy link
Contributor

StephanEwen commented Dec 5, 2017

I am unsure here. The warning is nice, but without some exclusions from the shading phase, we will have many users wich build gigantic jars (happens to me as well sometimes). Duplicating dependencies into the user jar is more dangerous now than it used to be, given that we have the inverted classloading activated as a default. We saw such issues with Avro, and probably have similar issues with Kryo and custom serializers as well, if they end up both in app classpath and in the user jar.

I know its harder to maintain, but in this case I think that immediate user experience trumps maintenance effort, so I would vote to add the exclusions back. The exclusions set should be much smaller now that we don't have Hadoop as a dependency here anymore. Having at least the flink projects (and flink-shaded projects) as exclusions, as well as scala / akka / kryo should really help.

We can use wildcard exclusions to make it more future safe, like replacing scala version with * (<exclude>org.apache.flink:flink-runtime*</exclude>) and excluding all artifacts for a group, like <exclude>org.scala-lang:*</exclude> and <exclude>akka:*</exclude>.

@aljoscha
Copy link
Contributor

aljoscha commented Dec 5, 2017

Can't we do what I suggested earlier: don't build a fat jar if you don't specify the build-jar profile. That way you get a very slim jar that has no dependencies.

For programs that don't have deps besides the basic Flink deps this still works and if people add fancy dependencies they must know that they must somehow include them in their jar and then they discover build-jar.

@zentol
Copy link
Contributor Author

zentol commented Dec 5, 2017

That should work.

@StephanEwen
Copy link
Contributor

+1 for the latest suggestion

@zentol zentol force-pushed the 8193 branch 3 times, most recently from 94fd098 to f915506 Compare December 6, 2017 08:19
@zentol
Copy link
Contributor Author

zentol commented Dec 6, 2017

Updated the PR.

@StephanEwen
Copy link
Contributor

Looks good.

Could you double check what gets packaged into the fat jar? In the previous version, the slf4j-api and jsr305 dependencies ended up in the fat jar, even though they should be provided.

@zentol
Copy link
Contributor Author

zentol commented Dec 6, 2017

After adding kafka09/cassandra, slf4j-api and force-shading where included in the jar.

I've added exclusions for these dependencies to the shade-plugin configuration, as well as for jsr305 just to be safe.

@StephanEwen
Copy link
Contributor

+1

@zentol
Copy link
Contributor Author

zentol commented Dec 6, 2017

merging.

@asfgit asfgit closed this in 19b1b83 Dec 6, 2017
asfgit pushed a commit that referenced this pull request Dec 6, 2017
@zentol zentol deleted the 8193 branch December 6, 2017 10:42
glaksh100 pushed a commit to lyft/flink that referenced this pull request Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants