-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[BEAM-6936] Added Jenkins jobs running Java examples on Dataflow with Java 11 #8183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-6936] Added Jenkins jobs running Java examples on Dataflow with Java 11 #8183
Conversation
|
R: @lgajowy |
|
Run seed job |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added suggestions. Thanks!
Other than that: I don't see the job results in Jenkins. Could you run them and check if they pass?
Could you also set the jira id to: BEAM-6936 in the commit message (it think there is a typo)
| } | ||
| } | ||
|
|
||
| task postCommitFnApiWorkerJava11(type: Test) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that this is a similar task to preCommitFnApiWorker. Could we reuse preCommitFnApiWorker and parametrize it? My main concern now is that we duplicate lots of code that will be hard to maintain.
When refactoring this, please remember to set default argument/parameter values to java 8's in order not to need to change anything else in the code that already uses this. Also please change its name to fnApiWorkerTests or something similar because it's clearly not preCommit only now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking into this task implementation, I am unable to see any difference.
So I am also wondering, why we need that additional task at all. Wouldn't it be sufficient to just call the other?
So maybe rename preCommitFnApiWorkerto something like runFnApiWorker and call that task in respective Jenkins job definitions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that didn't add anything. I renamed preCommitFnApiWorker to verifyFnApiWorker and removed my redundant tasks. Now both precommit (already existent) and postcommit (new) jobs use the same tasks. distinction between Java 11 and Java 8 is done by setting a flag in the Jenkins job
| } | ||
| } | ||
|
|
||
| task postCommitLegacyWorkerJava11(type: Test) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly to above, could we perform the same refactoring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately it wasn't possible. I couldn't find a way of extracting common configuration to another element. Setting system properties requires Test task type and those properties are restricted in scope to a single task - I experimented with setting a property in a task with dependsOn directive, but it wasn't effective. Do you know of another way of extracting this?
@adude3141 it's also a reply to your general remark, the solution you linked didn't work. I believe that currently the best way is to duplicate those tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like this will do?
diff --git a/runners/google-cloud-dataflow-java/examples/build.gradle b/runners/google-cloud-dataflow-java/examples/build.gradle
index 72059bb064..f591e5009b 100644
--- a/runners/google-cloud-dataflow-java/examples/build.gradle
+++ b/runners/google-cloud-dataflow-java/examples/build.gradle
@@ -42,25 +42,33 @@ def dockerImageName = project(':beam-runners-google-cloud-dataflow-java').ext.do
// If -PuseExecutableStage is set, the use_executable_stage_bundle_execution wil be enabled.
def fnapiExperiments = project.hasProperty('useExecutableStage') ? 'beam_fn_api,use_executable_stage_bundle_execution' : "beam_fn_api"
+// we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions
+def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] ->
+ // return the preevaluated configuration closure
+ return {
+ testClassesDirs = files(project(":beam-examples-java").sourceSets.test.output.classesDirs)
+ include "**/WordCountIT.class"
+ include "**/WindowedWordCountIT.class"
+ forkEvery 1
+ maxParallelForks 4
+
+ def preCommitBeamTestPipelineOptions = [
+ "--project=${gcpProject}",
+ "--tempRoot=${gcsTempRoot}",
+ "--runner=TestDataflowRunner",
+ "--dataflowWorkerJar=${dataflowWorkerJar}",
+ "--workerHarnessContainerImage=${workerHarnessContainerImage}"
+ ] + additionalOptions
+ systemProperty "beamTestPipelineOptions", JsonOutput.toJson(preCommitBeamTestPipelineOptions)
+ }
+}
+
task preCommitLegacyWorker(type: Test) {
dependsOn ":beam-runners-google-cloud-dataflow-java-legacy-worker:shadowJar"
def dataflowWorkerJar = project.findProperty('dataflowWorkerJar') ?: project(":beam-runners-google-cloud-dataflow-java-legacy-worker").shadowJar.archivePath
-
//Set workerHarnessContainerImage to empty to make Dataflow pick up the non-versioned container
//image, which handles a staged worker jar.
- def preCommitBeamTestPipelineOptions = [
- "--project=${gcpProject}",
- "--tempRoot=${gcsTempRoot}",
- "--runner=TestDataflowRunner",
- "--dataflowWorkerJar=${dataflowWorkerJar}",
- "--workerHarnessContainerImage=",
- ]
- testClassesDirs = files(project(":beam-examples-java").sourceSets.test.output.classesDirs)
- include "**/WordCountIT.class"
- include "**/WindowedWordCountIT.class"
- forkEvery 1
- maxParallelForks 4
- systemProperty "beamTestPipelineOptions", JsonOutput.toJson(preCommitBeamTestPipelineOptions)
+ with commonConfig(dataflowWorkerJar)
}
task verifyFnApiWorker(type: Test) {
@@ -68,20 +76,7 @@ task verifyFnApiWorker(type: Test) {
dependsOn ":beam-runners-google-cloud-dataflow-java:buildAndPushDockerContainer"
def dataflowWorkerJar = project.findProperty('dataflowWorkerJar') ?: project(":beam-runners-google-cloud-dataflow-java-fn-api-worker").shadowJar.archivePath
- def preCommitBeamTestPipelineOptions = [
- "--project=${gcpProject}",
- "--tempRoot=${gcsTempRoot}",
- "--runner=TestDataflowRunner",
- "--dataflowWorkerJar=${dataflowWorkerJar}",
- "--workerHarnessContainerImage=${dockerImageName}",
- "--experiments=${fnapiExperiments}",
- ]
- testClassesDirs = files(project(":beam-examples-java").sourceSets.test.output.classesDirs)
- include "**/WordCountIT.class"
- include "**/WindowedWordCountIT.class"
- forkEvery 1
- maxParallelForks 4
- systemProperty "beamTestPipelineOptions", JsonOutput.toJson(preCommitBeamTestPipelineOptions)
+ with commonConfig(dataflowWorkerJar, dockerImageName, "--experiments=${fnapiExperiments}")
useJUnit {
excludeCategories 'org.apache.beam.sdk.testing.StreamingIT'
}
@@ -90,24 +85,10 @@ task verifyFnApiWorker(type: Test) {
task postCommitLegacyWorkerJava11(type: Test) {
dependsOn ":beam-runners-google-cloud-dataflow-java-legacy-worker:shadowJar"
def dataflowWorkerJar = project.findProperty('dataflowWorkerJar') ?: project(":beam-runners-google-cloud-dataflow-java-legacy-worker").shadowJar.archivePath
-
//Set workerHarnessContainerImage to empty to make Dataflow pick up the non-versioned container
//image, which handles a staged worker jar.
- def preCommitBeamTestPipelineOptions = [
- "--project=${gcpProject}",
- "--tempRoot=${gcsTempRoot}",
- "--runner=TestDataflowRunner",
- "--dataflowWorkerJar=${dataflowWorkerJar}",
- "--workerHarnessContainerImage=",
- ]
-
- testClassesDirs = files(project(":beam-examples-java").sourceSets.test.output.classesDirs)
- include "**/WordCountIT.class"
- include "**/WindowedWordCountIT.class"
- forkEvery 1
- maxParallelForks 4
+ with commonConfig(dataflowWorkerJar)
systemProperty "java.specification.version", "11"
- systemProperty "beamTestPipelineOptions", JsonOutput.toJson(preCommitBeamTestPipelineOptions)
}
task java11PostCommit() {But please bear with me. It is a fast hack and I did not run. Also, if you decide to use, doublecheck that I did not accidentally deleted some required configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, it worked with minimal modifications ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Glad I could help here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
Adding additional reviewer: R: @adude3141 |
|
Run seed job |
|
Also, as a general remark, I wonder whether it would not be easier to share common configurations. The different task configuration seems to be almost identical. Not sure, what would best be here, but maybe something like [1] will do? Of course, sometimes it makes more sense to duplicate. [1] https://stackoverflow.com/questions/13072034/avoid-duplication-between-similar-gradle-tasks |
|
@lgajowy , @adude3141 I fixed what I could, WDYT? |
|
Run seed job |
| "--runner=TestDataflowRunner", | ||
| "--dataflowWorkerJar=${dataflowWorkerJar}", | ||
| "--workerHarnessContainerImage=${dockerImageName}", | ||
| "--experiments=${fnapiExperiments}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this lost now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I missed it. Added it back
| // we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions | ||
| def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] -> | ||
| // return the preevaluated configuration closure | ||
| return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this ident pass spot less?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. It did but IMO it shouldn't have. It looks like no build.gradle files are checked only when applyGroovyNature() is invoked in them. Looks like a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://issues.apache.org/jira/browse/BEAM-7040 issue for this
@mwalenia could you fix it according to Intelij's suggestions?
|
From my point of view, it LGTM. Nice work. Thanks! wdyt, @lgajowy ? |
lgajowy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits. Other than that please reorganize the branch (squash, etc) and we're good to go. Thanks!
| // If -PuseExecutableStage is set, the use_executable_stage_bundle_execution wil be enabled. | ||
| def fnapiExperiments = project.hasProperty('useExecutableStage') ? 'beam_fn_api,use_executable_stage_bundle_execution' : "beam_fn_api" | ||
|
|
||
| // we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this comment redundant? The line below shows exactly what the comment says but (and that's the risk) it can change in the future.
Please delete the comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done
| def fnapiExperiments = project.hasProperty('useExecutableStage') ? 'beam_fn_api,use_executable_stage_bundle_execution' : "beam_fn_api" | ||
|
|
||
| // we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions | ||
| def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep - this improves this code a lot. Great!
| // we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions | ||
| def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] -> | ||
| // return the preevaluated configuration closure | ||
| return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. It did but IMO it shouldn't have. It looks like no build.gradle files are checked only when applyGroovyNature() is invoked in them. Looks like a bug.
| // we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions | ||
| def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] -> | ||
| // return the preevaluated configuration closure | ||
| return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://issues.apache.org/jira/browse/BEAM-7040 issue for this
@mwalenia could you fix it according to Intelij's suggestions?
| } | ||
| } | ||
|
|
||
| task postCommitLegacyWorkerJava11(type: Test) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
fd81d7e to
210d172
Compare
|
@lgajowy I fixed the nits and commit history. Ready to merge |
|
Run seed job |
|
Run Java examples on Dataflow with Java 11 |
|
Run Java Portability examples on Dataflow with Java 11 |
lgajowy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests are failing. Could you investigate/fix?
|
|
|
@alanmyrvold @pabloem There's a problem with tests exceeding CPU quota. I believe that we need to raise the quota, considering that we're adding more and more dataflow based tests. What do you think? |
|
Run Java Portability examples on Dataflow with Java 11 |
|
quota issues should go away immediately. If we see them happening more often, we can investigate, but they should not be frequent. |
|
Run seed job |
1 similar comment
|
Run seed job |
|
Run Java Portability examples on Dataflow with Java 11 |
|
Run Java examples on Dataflow with Java 11 |
|
: ) finally it's passing |
|
should we merge these? : P |
|
Oops. Missed this. Sorry. Merging. |
| "--tempRoot=${gcsTempRoot}", | ||
| "--runner=TestDataflowRunner", | ||
| "--dataflowWorkerJar=${dataflowWorkerJar}", | ||
| workerHarnessContainerImage.isEmpty() ?'':"--workerHarnessContainerImage=${workerHarnessContainerImage}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When dataflowWorkerJar is provided, the workerHarnessContainerImage should be set to empty explicitly, otherwise, the dataflow service will still try to pull harness image rather than using custom-built worker jar,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix here: #10635
Similarly as in my earlier PR
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username).[BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replaceBEAM-XXXwith the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.