
[BEAM-4615] Flink job server wrapper and shadow jar #5726

Merged: jkff merged 1 commit into apache:master from bsidhom:flink-job-server-gradle on Jun 26, 2018

Conversation

bsidhom (Contributor) commented on Jun 21, 2018:

Adds a JavaExec target that can be remotely debugged and a runnable uber jar that packages the Flink job server.
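As a rough illustration of the JavaExec target described here (a sketch only, combining the classpath, debug flags, and main class that appear later in this review thread):

```groovy
// Sketch of a debuggable JavaExec target for the Flink job server.
// The main class and debug flags mirror those discussed in the review below.
task runJobServer(type: JavaExec) {
  classpath = sourceSets.main.runtimeClasspath
  main = "org.apache.beam.runners.flink.FlinkJobServerDriver"
  // Open a JDWP socket on port 5005 so a remote debugger can attach,
  // without suspending the JVM at startup.
  jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
}
```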


Follow this checklist to help us incorporate your contribution quickly and easily:

  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

It will help us expedite review of your Pull Request if you tag someone (e.g. @username) to look at it.

bsidhom (Contributor Author) commented on Jun 21, 2018:

R: @angoenka

}

task runJobServer(type: JavaExec) {
classpath = sourceSets.main.runtimeClasspath
Member:

Why not use a configuration for this like 'jobServer' and use

classpath = configurations.jobServer

This will allow you to run using the post-shaded artifacts.

bsidhom (Contributor Author):

I was going to do that originally, but we would effectively have to duplicate every dependency above in the configuration. In any case, I don't understand how doing so would have different shading semantics from this. I thought that the reason to do it was to maintain separate dependencies between the portable and non-portable runners.

lukecwik (Member), Jun 21, 2018:

You don't need to duplicate every dependency.

configurations {
  jobServer
}

dependencies {
  ...
  jobServer project(path: project.path, configuration: "shadow")
}

task runJobServer(type: JavaExec) {
  classpath = configurations.jobServer
  ...
}

should work.

bsidhom (Contributor Author):

Do you still recommend a distinct configuration now that we actually export the artifact in a new project?

jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
}

task jobServerShadowJar(type: com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar) {
lukecwik (Member), Jun 21, 2018:

Why is this needed beyond the javaexec task above?

If it's needed, does this need to be a separate artifact from what we currently ship as the Flink jar with Apache Beam?

bsidhom (Contributor Author):

It's needed in order to make the job server submittable to a Flink cluster via flink run. If we already had an executable uberjar that we publish that would be fine.

Member:

It seems like we should publish both the old Flink artifact which people use right now and this job server artifact with Apache Beam. The easiest way to do this is to create a separate job-server subproject in a directory below the Flink directory and produce this as an artifact.

angoenka (Contributor), Jun 21, 2018:

If we have the uber Flink jar, then we can use it as is.
What is the command to create the uber jar in Gradle, and where does it put the jar?

bsidhom (Contributor Author):

@lukecwik I think that makes sense. I'm not sure how to reach into a different project's configurations in order to construct the dependency set though.

lukecwik (Member), Jun 21, 2018:

I would think that this would add the project and all its dependencies to the current project's compile scope dependency list:

dependencies {
  compile project(path: "beam-runners-flink", configuration: "shadow")
}

bsidhom (Contributor Author):

I ended up doing it in a different way. Let me know what you think.

bsidhom (Contributor Author):

The main downside of my approach is that it requires an evaluation dependency.

bsidhom (Contributor Author):

@angoenka With the new layout, to create the uber jar, you run ./gradlew -p runners/flink/job-server shadowJar. It will output the uber jar at runners/flink/job-server/build/libs/flink-job-server.jar. Note that if we plan to actually publish this, we may want to go back to a versioned artifact name.

In order to debug the job server through gradle: ./gradlew -p runners/flink/job-server runShadow -PjobHost=<server address> -PartifactsDir=<artifact staging directory>.

bsidhom (Contributor Author) commented on Jun 22, 2018:

I'm not sure what's causing the build error in job-server as it builds on my machine just fine.

* the following projects are evaluated before we evaluate this project. This is because
* we reference "sourceSets.shadow.output" directly.
*/
evaluationDependsOn(":beam-runners-flink_2.11")
Contributor:

We should remove the _2.11 version suffix at the end.

bsidhom (Contributor Author):

I think that should be decided and done as a different change, since it already exists that way in master and was explicitly decided.

archiveName = "flink-job-server.jar"
mergeServiceFiles()
append "reference.conf"
def flinkProject = project(":beam-runners-flink_2.11")
Contributor:

Version should be removed.

bsidhom (Contributor Author):

Again, this is how the projects are named in master. We can change those separately if desired, but this is generally the convention used for java artifacts that depend on specific Scala binaries.

// task will not work because the flink runner classes only exist in the shadow
// jar.
runShadow {
def jobHost = project.hasProperty("jobHost") ? project.property("jobHost") : "localhost:8099"
Contributor:

Do we need to set up the classpath and main parameters?

bsidhom (Contributor Author):

No. I've applied the application plugin above in order to get access to the runShadow task. This allows us to configure the main entrypoint globally for this build file via mainClassName.
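For context, a minimal sketch of how the application plugin centralizes the entry point (the plugin IDs are the standard Gradle ones; the main class is the one referenced elsewhere in this thread):

```groovy
// The application plugin contributes the run task and, together with the
// shadow plugin, the runShadow task. mainClassName configures the entry
// point once for every run task (and the jar manifest's Main-Class).
apply plugin: "application"
apply plugin: "com.github.johnrengelman.shadow"

mainClassName = "org.apache.beam.runners.flink.FlinkJobServerDriver"
```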

// NOTE: runShadow must be used in order to run the job server. The standard run
// task will not work because the flink runner classes only exist in the shadow
// jar.
runShadow {
Contributor:

(type: JavaExec)

Member:

You don't have to specify the type since runShadow is already of the correct type.

jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
}

shadowJar {
Contributor:

(type: com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar)

Member:

You don't need to specify the type.

append "reference.conf"
def flinkProject = project(":beam-runners-flink_2.11")
from flinkProject.configurations.runtime
from flinkProject.sourceSets.main.output
Contributor:

Should we add a manifest?
manifest { attributes "Main-Class": "org.apache.beam.runners.flink.FlinkJobServerDriver" }

Member:

+1, removes a thing that people have to specify.

bsidhom (Contributor Author):

As I noted above, this is already done globally via mainClassName. This is a feature of the application plugin, which itself is required for runShadow functionality.

jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
}

shadowJar {
Member:

You don't need to specify the type.

}

shadowJar {
archiveName = "flink-job-server.jar"
Member:

You should make this the shadowClosure for applyJavaNature like:

applyJavaNature(failOnWarning: true, shadowClosure: { ... contents of shadowJar task ... })

bsidhom (Contributor Author):

Ah, I hadn't seen that before! Done.

mergeServiceFiles()
append "reference.conf"
def flinkProject = project(":beam-runners-flink_2.11")
from flinkProject.configurations.runtime
Member:

I would prefer if you were able to drop the from clauses and instead used:

dependencies {
  compile project(path: ":beam-runners-flink", configuration: "shadow")
}

and then you wouldn't need the evaluationDependsOn and the shading plugin should be already configured to bundle all jars.

bsidhom (Contributor Author):

Ah, that's much better! It's also necessary in order to get shadowJar to work via applyJavaNature.

append "reference.conf"
def flinkProject = project(":beam-runners-flink_2.11")
from flinkProject.configurations.runtime
from flinkProject.sourceSets.main.output
Member:

+1, removes a thing that people have to specify.

// NOTE: runShadow must be used in order to run the job server. The standard run
// task will not work because the flink runner classes only exist in the shadow
// jar.
runShadow {
Member:

You don't have to specify the type since runShadow is already of the correct type.

* limitations under the License.
*/

apply from: project(":").file("build_rules.gradle")
Member:

note that build_rules.gradle was deleted and migrated to a binary plugin, you'll want:

apply plugin: org.apache.beam.gradle.BeamModulePlugin

bsidhom (Contributor Author):

Ah, that explains the precommit error. Thanks for pointing that out. Rebased and fixed.

bsidhom force-pushed the flink-job-server-gradle branch from 0ec96ef to 6fd8659 on June 22, 2018, 21:40
bsidhom (Contributor Author) commented on Jun 22, 2018:

Thanks for the feedback. I think it's much cleaner now.

failOnWarning: true,
shadowClosure: {
archiveName = "flink-job-server.jar"
mergeServiceFiles()
lukecwik (Member), Jun 22, 2018:

The defaults are:

        classifier = "shaded"
        mergeServiceFiles()
        into("META-INF/") {
          from "${project.rootProject.projectDir}/LICENSE"
          from "${project.rootProject.projectDir}/NOTICE"
        }

so I think we can drop mergeServiceFiles()

Finally, is it important to set the archiveName? (I'm not sure how this will impact Maven publishing and what artifact gets produced.)

bsidhom (Contributor Author):

I've gone ahead and dropped mergeServiceFiles(). Setting the archive name is not necessary. This is a remnant from when the jar was output in the same project as the core flink jar, so it needed a distinct name. I've removed this as well.

bsidhom (Contributor Author) left a review comment:

This should now use as many shadowJar defaults as possible. Please let me know if there are no more review comments so I can clean up the history before it's merged.


This adds a new subproject that exports a new shadow jar that includes
the job server and all transitive dependencies. It also includes a
runShadow task that can be used for local debugging via gradle.
bsidhom force-pushed the flink-job-server-gradle branch from b4d758d to 45d509c on June 25, 2018, 22:47
bsidhom (Contributor Author) commented on Jun 25, 2018:

Went ahead and squashed the commits.

jkff merged commit f55e80f into apache:master on Jun 26, 2018
bsidhom deleted the flink-job-server-gradle branch on June 28, 2018, 17:36
charlesccychen pushed a commit to charlesccychen/beam that referenced this pull request on Jul 26, 2018: [BEAM-4615] Flink job server wrapper and shadow jar