[BEAM-4615] Flink job server wrapper and shadow jar #5726
R: @angoenka
runners/flink/build.gradle
Outdated
}

task runJobServer(type: JavaExec) {
  classpath = sourceSets.main.runtimeClasspath
There was a problem hiding this comment.
Why not use a configuration for this like 'jobServer' and use
classpath = configurations.jobServer
This will allow you to run using the post shaded artifacts.
There was a problem hiding this comment.
I was going to do that originally, but we would effectively have to duplicate every dependency above in the configuration. In any case, I don't understand how doing so would have different shading semantics from this. I thought that the reason to do it was to maintain separate dependencies between the portable and non-portable runners.
There was a problem hiding this comment.
You don't need to duplicate every dependency.
configurations {
  jobServer
}
dependencies {
  ...
  jobServer project(path: project.path, configuration: "shadow")
}
task runJobServer(type: JavaExec) {
  classpath = configurations.jobServer
  ...
}
should work.
There was a problem hiding this comment.
Do you still recommend a distinct configuration now that we actually export the artifact in a new project?
runners/flink/build.gradle
Outdated
  jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
}

task jobServerShadowJar(type: com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar) {
There was a problem hiding this comment.
Why is this needed beyond the JavaExec task above?
If it's needed, does it have to be a separate artifact from what we currently ship as the Flink jar with Apache Beam?
There was a problem hiding this comment.
It's needed in order to make the job server submittable to a Flink cluster via flink run. If we already had an executable uberjar that we publish that would be fine.
There was a problem hiding this comment.
It seems like we should publish both the old Flink artifact which people use right now and this job server artifact with Apache Beam. The easiest way to do this is to create a separate job-server subproject in a directory below the Flink directory and produce this as an artifact.
There was a problem hiding this comment.
If we have the uber flink jar then we can use it as it is.
What is the command to create the uber jar in gradle and where does it put the jar?
There was a problem hiding this comment.
@lukecwik I think that makes sense. I'm not sure how to reach into a different project's configurations in order to construct the dependency set though.
There was a problem hiding this comment.
I would think that this would add the project and all its dependencies to the current project's compile scope dependency list:
dependencies {
  compile project(path: ":beam-runners-flink", configuration: "shadow")
}
There was a problem hiding this comment.
I ended up doing it in a different way. Let me know what you think.
There was a problem hiding this comment.
The main downside of my approach is that it requires an evaluation dependency.
There was a problem hiding this comment.
@angoenka With the new layout, to create the uber jar, you run ./gradlew -p runners/flink/job-server shadowJar. It will output the uber jar at runners/flink/job-server/build/libs/flink-job-server.jar. Note that if we plan to actually publish this, we may want to go back to a versioned artifact name.
In order to debug the job server through gradle: ./gradlew -p runners/flink/job-server runShadow -PjobHost=<server address> -PartifactsDir=<artifact staging directory>.
I'm not sure what's causing the build error in job-server as it builds on my machine just fine.
 * the following projects are evaluated before we evaluate this project. This is because
 * we reference "sourceSets.shadow.output" directly.
 */
evaluationDependsOn(":beam-runners-flink_2.11")
There was a problem hiding this comment.
We should remove the _2.11 version suffix at the end.
There was a problem hiding this comment.
I think that should be decided and done as a different change, since it already exists that way in master and was explicitly decided.
  archiveName = "flink-job-server.jar"
  mergeServiceFiles()
  append "reference.conf"
  def flinkProject = project(":beam-runners-flink_2.11")
There was a problem hiding this comment.
Version should be removed.
There was a problem hiding this comment.
Again, this is how the projects are named in master. We can change those separately if desired, but this is generally the convention used for java artifacts that depend on specific Scala binaries.
// task will not work because the flink runner classes only exist in the shadow
// jar.
runShadow {
  def jobHost = project.hasProperty("jobHost") ? project.property("jobHost") : "localhost:8099"
There was a problem hiding this comment.
Do we need to set up the classpath and main parameters?
There was a problem hiding this comment.
No. I've applied the application plugin above in order to get access to the runShadow task. This allows us to configure the main entrypoint globally for this build file via mainClassName.
// NOTE: runShadow must be used in order to run the job server. The standard run
// task will not work because the flink runner classes only exist in the shadow
// jar.
runShadow {
There was a problem hiding this comment.
You don't have to specify the type since runShadow is already of the correct type.
  jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
}

shadowJar {
There was a problem hiding this comment.
(type: com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar)
There was a problem hiding this comment.
You don't need to specify the type.
  append "reference.conf"
  def flinkProject = project(":beam-runners-flink_2.11")
  from flinkProject.configurations.runtime
  from flinkProject.sourceSets.main.output
There was a problem hiding this comment.
Should we add a manifest?
manifest { attributes "Main-Class": "org.apache.beam.runners.flink.FlinkJobServerDriver" }
There was a problem hiding this comment.
+1, removes a thing that people have to specify.
There was a problem hiding this comment.
As I noted above, this is already done globally via mainClassName. This is a feature of the application plugin, which itself is required for runShadow functionality.
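As a rough illustration of the point about mainClassName, a standalone build file might look like the sketch below. It is not the build file from this PR: the plugins block and the Shadow version are assumptions made so the fragment is self-contained (in Beam the Shadow plugin is applied through the shared build rules), and the behavior described in the comments is that of the Gradle and Shadow releases of this era. The main class name and debug flags are taken from the thread.

```groovy
plugins {
  id "java"
  id "application"
  // Version is an assumption; use whichever Shadow release the build already pins.
  id "com.github.johnrengelman.shadow" version "2.0.4"
}

// Set once for the whole build file via the application plugin. Shadow writes
// this value into the shadowJar manifest as Main-Class, and runShadow launches
// that jar, so no explicit manifest block or main/classpath setup is needed.
mainClassName = "org.apache.beam.runners.flink.FlinkJobServerDriver"

runShadow {
  // Same remote-debugging flags as in the diff under review.
  jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
}
```

With this in place, an explicit manifest { attributes "Main-Class": ... } entry would only repeat what mainClassName already provides.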
}

shadowJar {
  archiveName = "flink-job-server.jar"
There was a problem hiding this comment.
You should make this the shadowClosure for applyJavaNature like:
applyJavaNature(failOnWarning: true, shadowClosure: { ... contents of shadowJar task ... })
There was a problem hiding this comment.
Ah, I hadn't seen that before! Done.
  mergeServiceFiles()
  append "reference.conf"
  def flinkProject = project(":beam-runners-flink_2.11")
  from flinkProject.configurations.runtime
There was a problem hiding this comment.
I would prefer if you were able to drop the from clauses and instead used:
dependencies {
compile project(path: ":beam-runners-flink", configuration: "shadow")
}
and then you wouldn't need the evaluationDependsOn and the shading plugin should be already configured to bundle all jars.
There was a problem hiding this comment.
Ah, that's much better! It's also necessary in order to get shadowJar to work via applyJavaNature.
 * limitations under the License.
 */

apply from: project(":").file("build_rules.gradle")
There was a problem hiding this comment.
Note that build_rules.gradle was deleted and migrated to a binary plugin, so you'll want:
apply plugin: org.apache.beam.gradle.BeamModulePlugin
There was a problem hiding this comment.
Ah, that explains the precommit error. Thanks for pointing that out. Rebased and fixed.
0ec96ef to 6fd8659
Thanks for the feedback. I think it's much cleaner now.
  failOnWarning: true,
  shadowClosure: {
    archiveName = "flink-job-server.jar"
    mergeServiceFiles()
There was a problem hiding this comment.
The defaults are:
classifier = "shaded"
mergeServiceFiles()
into("META-INF/") {
  from "${project.rootProject.projectDir}/LICENSE"
  from "${project.rootProject.projectDir}/NOTICE"
}
so I think we can drop mergeServiceFiles().
Finally, is it important to set the archiveName? I'm not sure how this will impact Maven publishing and which artifact gets produced.
There was a problem hiding this comment.
I've gone ahead and dropped mergeServiceFiles(). Setting the archive name is not necessary. This is a remnant from when the jar was output in the same project as the core flink jar, so it needed a distinct name. I've removed this as well.
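Pulling the review suggestions together, the job-server build file might end up looking roughly like the sketch below. This is assembled only from snippets quoted in this thread and is not the merged file. The project path uses the _2.11 name from the diff, keeping append "reference.conf" is an assumption, and whether a custom shadowClosure is merged with the defaults listed above or replaces them is not confirmed here. The runShadow configuration is left out.

```groovy
// Sketch only: pieces taken from the review comments, not the merged build file.
apply plugin: org.apache.beam.gradle.BeamModulePlugin
applyJavaNature(
  failOnWarning: true,
  shadowClosure: {
    // archiveName and mergeServiceFiles() were dropped during review.
    // Keeping this line from the original diff is an assumption.
    append "reference.conf"
  })

// Needed for the runShadow task; also sets the entry point for the shadow jar.
apply plugin: "application"
mainClassName = "org.apache.beam.runners.flink.FlinkJobServerDriver"

dependencies {
  // Depend on the Flink runner's shaded output rather than copying its
  // runtime configuration and class output with `from` clauses, which also
  // removes the need for evaluationDependsOn.
  compile project(path: ":beam-runners-flink_2.11", configuration: "shadow")
}
```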
bsidhom left a comment
There was a problem hiding this comment.
This should now use as many shadowJar defaults as possible. Please let me know if there are no more review comments so I can clean up the history before it's merged.
This adds a new subproject that exports a new shadow jar that includes the job server and all transitive dependencies. It also includes a runShadow task that can be used for local debugging via gradle.
b4d758d to 45d509c
Went ahead and squashed the commits.
Adds a JavaExec target that can be remotely debugged and a runnable uber jar that packages the Flink job server.