
Conversation

@mwalenia (Member) commented Apr 1, 2019

Similarly to my earlier PR.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

Post-Commit Tests Status (on master branch)

[Build status badge table: Go, Java, and Python SDKs across the SDK, Apex, Dataflow, Flink, Gearpump, Samza, and Spark columns]

Pre-Commit Tests Status (on master branch)

[Build status badge table: Java, Python, Go, and Website pre-commits, non-portable and portable]

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

@mwalenia (Member, Author) commented Apr 1, 2019

R: @lgajowy

@lgajowy (Contributor) commented Apr 2, 2019

Run seed job

@lgajowy (Contributor) left a comment

Added suggestions. Thanks!

Other than that: I don't see the job results in Jenkins. Could you run them and check if they pass?

Could you also set the JIRA id to BEAM-6936 in the commit message? (I think there is a typo.)

}
}

task postCommitFnApiWorkerJava11(type: Test) {
Contributor

I see that this is a similar task to preCommitFnApiWorker. Could we reuse preCommitFnApiWorker and parametrize it? My main concern now is that we duplicate lots of code that will be hard to maintain.

When refactoring this, please remember to set the default argument/parameter values to Java 8's, so that nothing else in the code that already uses it needs to change. Also, please rename it to fnApiWorkerTests or something similar, because it's clearly not pre-commit only anymore.
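
A rough sketch of that parameterization idea, with Java 8 as the default; the factory and task names here are hypothetical, not taken from this PR:

// Hypothetical sketch: one factory registers the Test task, parameterized by
// Java version, defaulting to 8 so existing usages keep working unchanged.
def createFnApiWorkerTests = { String javaVersion = '1.8' ->
  def suffix = (javaVersion == '1.8') ? '' : "Java${javaVersion}"
  tasks.create("fnApiWorkerTests${suffix}", Test) {
    // shared configuration (includes, forkEvery, pipeline options) would go here
    if (javaVersion != '1.8') {
      systemProperty 'java.specification.version', javaVersion
    }
  }
}

createFnApiWorkerTests()      // Java 8 task, same behaviour as before
createFnApiWorkerTests('11')  // new Java 11 post-commit variant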

Contributor

After looking into this task implementation, I am unable to see any difference.

So I am also wondering why we need that additional task at all. Wouldn't it be sufficient to just call the other one?

So maybe rename preCommitFnApiWorker to something like runFnApiWorker and call that task in the respective Jenkins job definitions?

Member Author

You're right, that didn't add anything. I renamed preCommitFnApiWorker to verifyFnApiWorker and removed my redundant tasks. Now both the pre-commit (already existing) and post-commit (new) jobs use the same tasks. The distinction between Java 11 and Java 8 is made by setting a flag in the Jenkins job.
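
For illustration only, a minimal Job DSL sketch of such a flag-based setup; the job name, JDK tool name, project path, and switch are assumptions rather than the exact code added in this PR:

// Hypothetical sketch: the Jenkins job selects the JDK and passes a Gradle
// switch, while the Gradle tasks stay identical for Java 8 and Java 11.
job('beam_PostCommit_Java_Examples_Dataflow_Java11') {
  jdk('JDK 11 (latest)')  // assumed Jenkins JDK tool name
  steps {
    gradle {
      rootBuildScriptDir('.')
      tasks(':beam-runners-google-cloud-dataflow-java-examples:java11PostCommit')  // assumed project path
      switches('-PcompileAndRunTestsWithJava11')  // assumed flag name
    }
  }
}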

}
}

task postCommitLegacyWorkerJava11(type: Test) {
Contributor

Similarly to above, could we perform the same refactoring?

Member Author

Unfortunately, it wasn't possible. I couldn't find a way of extracting the common configuration into another element. Setting system properties requires the Test task type, and those properties are restricted in scope to a single task - I experimented with setting a property in a task wired up via dependsOn, but it wasn't effective. Do you know of another way of extracting this?
@adude3141 this is also a reply to your general remark; the solution you linked didn't work. I believe that currently the best way is to duplicate those tasks.
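
A rough illustration of the scoping issue described above (task names are made up for the example):

// systemProperty is per-Test-task configuration: it only applies to the JVM
// forked for that particular task, so dependsOn does not propagate it.
task java11Properties(type: Test) {
  systemProperty 'java.specification.version', '11'  // visible only to this task's test JVM
}

task examplesIntegrationTest(type: Test) {
  dependsOn java11Properties
  // The JVM forked for this task does NOT inherit the property set above;
  // it has to be set again here, which is the duplication discussed in this thread.
}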

Contributor

Maybe something like this will do?

diff --git a/runners/google-cloud-dataflow-java/examples/build.gradle b/runners/google-cloud-dataflow-java/examples/build.gradle
index 72059bb064..f591e5009b 100644
--- a/runners/google-cloud-dataflow-java/examples/build.gradle
+++ b/runners/google-cloud-dataflow-java/examples/build.gradle
@@ -42,25 +42,33 @@ def dockerImageName = project(':beam-runners-google-cloud-dataflow-java').ext.do
 // If -PuseExecutableStage is set, the use_executable_stage_bundle_execution wil be enabled.
 def fnapiExperiments = project.hasProperty('useExecutableStage') ? 'beam_fn_api,use_executable_stage_bundle_execution' : "beam_fn_api"
 
+// we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions
+def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] ->
+  // return the preevaluated configuration closure
+  return {
+    testClassesDirs = files(project(":beam-examples-java").sourceSets.test.output.classesDirs)
+    include "**/WordCountIT.class"
+    include "**/WindowedWordCountIT.class"
+    forkEvery 1
+    maxParallelForks 4
+
+    def preCommitBeamTestPipelineOptions = [
+            "--project=${gcpProject}",
+            "--tempRoot=${gcsTempRoot}",
+            "--runner=TestDataflowRunner",
+            "--dataflowWorkerJar=${dataflowWorkerJar}",
+            "--workerHarnessContainerImage=${workerHarnessContainerImage}"
+    ] + additionalOptions
+    systemProperty "beamTestPipelineOptions", JsonOutput.toJson(preCommitBeamTestPipelineOptions)
+  }
+}
+
 task preCommitLegacyWorker(type: Test) {
   dependsOn ":beam-runners-google-cloud-dataflow-java-legacy-worker:shadowJar"
   def dataflowWorkerJar = project.findProperty('dataflowWorkerJar') ?: project(":beam-runners-google-cloud-dataflow-java-legacy-worker").shadowJar.archivePath
-
   //Set workerHarnessContainerImage to empty to make Dataflow pick up the non-versioned container
   //image, which handles a staged worker jar.
-  def preCommitBeamTestPipelineOptions = [
-     "--project=${gcpProject}",
-     "--tempRoot=${gcsTempRoot}",
-     "--runner=TestDataflowRunner",
-     "--dataflowWorkerJar=${dataflowWorkerJar}",
-     "--workerHarnessContainerImage=",
-  ]
-  testClassesDirs = files(project(":beam-examples-java").sourceSets.test.output.classesDirs)
-  include "**/WordCountIT.class"
-  include "**/WindowedWordCountIT.class"
-  forkEvery 1
-  maxParallelForks 4
-  systemProperty "beamTestPipelineOptions", JsonOutput.toJson(preCommitBeamTestPipelineOptions)
+  with commonConfig(dataflowWorkerJar)
 }
 
 task verifyFnApiWorker(type: Test) {
@@ -68,20 +76,7 @@ task verifyFnApiWorker(type: Test) {
   dependsOn ":beam-runners-google-cloud-dataflow-java:buildAndPushDockerContainer"
 
   def dataflowWorkerJar = project.findProperty('dataflowWorkerJar') ?: project(":beam-runners-google-cloud-dataflow-java-fn-api-worker").shadowJar.archivePath
-  def preCommitBeamTestPipelineOptions = [
-          "--project=${gcpProject}",
-          "--tempRoot=${gcsTempRoot}",
-          "--runner=TestDataflowRunner",
-          "--dataflowWorkerJar=${dataflowWorkerJar}",
-          "--workerHarnessContainerImage=${dockerImageName}",
-          "--experiments=${fnapiExperiments}",
-  ]
-  testClassesDirs = files(project(":beam-examples-java").sourceSets.test.output.classesDirs)
-  include "**/WordCountIT.class"
-  include "**/WindowedWordCountIT.class"
-  forkEvery 1
-  maxParallelForks 4
-  systemProperty "beamTestPipelineOptions", JsonOutput.toJson(preCommitBeamTestPipelineOptions)
+  with commonConfig(dataflowWorkerJar, dockerImageName, "--experiments=${fnapiExperiments}")
   useJUnit {
     excludeCategories 'org.apache.beam.sdk.testing.StreamingIT'
   }
@@ -90,24 +85,10 @@ task verifyFnApiWorker(type: Test) {
 task postCommitLegacyWorkerJava11(type: Test) {
   dependsOn ":beam-runners-google-cloud-dataflow-java-legacy-worker:shadowJar"
   def dataflowWorkerJar = project.findProperty('dataflowWorkerJar') ?: project(":beam-runners-google-cloud-dataflow-java-legacy-worker").shadowJar.archivePath
-
   //Set workerHarnessContainerImage to empty to make Dataflow pick up the non-versioned container
   //image, which handles a staged worker jar.
-  def preCommitBeamTestPipelineOptions = [
-          "--project=${gcpProject}",
-          "--tempRoot=${gcsTempRoot}",
-          "--runner=TestDataflowRunner",
-          "--dataflowWorkerJar=${dataflowWorkerJar}",
-          "--workerHarnessContainerImage=",
-  ]
-  
-  testClassesDirs = files(project(":beam-examples-java").sourceSets.test.output.classesDirs)
-  include "**/WordCountIT.class"
-  include "**/WindowedWordCountIT.class"
-  forkEvery 1
-  maxParallelForks 4
+  with commonConfig(dataflowWorkerJar)
   systemProperty "java.specification.version", "11"
-  systemProperty "beamTestPipelineOptions", JsonOutput.toJson(preCommitBeamTestPipelineOptions)
 }
 
 task java11PostCommit() {

But please bear with me. It is a quick hack and I did not run it. Also, if you decide to use it, double-check that I did not accidentally delete some required configuration.

Member Author

Thanks for the suggestion, it worked with minimal modifications ;)

Contributor

Nice. Glad I could help here!

Contributor

👍

@lgajowy changed the title from "[BEAM-6963] Added Jenkins jobs running Java examples on Dataflow with Java 11" to "[BEAM-6936] Added Jenkins jobs running Java examples on Dataflow with Java 11" on Apr 2, 2019
@lgajowy (Contributor) commented Apr 2, 2019

Adding additional reviewer:

R: @adude3141

@mwalenia (Member, Author) commented Apr 2, 2019

Run seed job

@adude3141 (Contributor)

Also, as a general remark, I wonder whether it would not be easier to share common configuration. The different task configurations seem to be almost identical. Not sure what would fit best here, but maybe something like [1] will do?

Of course, sometimes it makes more sense to duplicate.

[1] https://stackoverflow.com/questions/13072034/avoid-duplication-between-similar-gradle-tasks
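
For reference, one pattern from threads like [1] is to apply a shared configuration closure to several tasks at once; a minimal sketch with illustrative task names, not the final solution used in this PR:

// Two nearly identical Test tasks sharing one configuration closure.
task legacyWorkerIT(type: Test)
task fnApiWorkerIT(type: Test)

def commonTestConfig = {
  forkEvery 1
  maxParallelForks 4
  include '**/WordCountIT.class'
}

[legacyWorkerIT, fnApiWorkerIT].each { it.configure(commonTestConfig) }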

@mwalenia (Member, Author) commented Apr 3, 2019

@lgajowy, @adude3141 I fixed what I could, WDYT?

@mwalenia (Member, Author) commented Apr 3, 2019

Run seed job

"--runner=TestDataflowRunner",
"--dataflowWorkerJar=${dataflowWorkerJar}",
"--workerHarnessContainerImage=${dockerImageName}",
"--experiments=${fnapiExperiments}",
Contributor

Is this lost now?

Member Author

Right, I missed it. Added it back.

// we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions
def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] ->
// return the preevaluated configuration closure
return {
@adude3141 (Contributor) commented Apr 4, 2019

Will this indent pass spotless?

Contributor

Good catch. It did, but IMO it shouldn't have. It looks like build.gradle files are not checked unless applyGroovyNature() is invoked in them. Looks like a bug.

Contributor

Issue for this: https://issues.apache.org/jira/browse/BEAM-7040

@mwalenia could you fix it according to IntelliJ's suggestions?

@adude3141 (Contributor)

From my point of view, it LGTM. Nice work. Thanks!

WDYT, @lgajowy?

@lgajowy (Contributor) left a comment

Some nits. Other than that, please reorganize the branch (squash, etc.) and we're good to go. Thanks!

// If -PuseExecutableStage is set, the use_executable_stage_bundle_execution wil be enabled.
def fnapiExperiments = project.hasProperty('useExecutableStage') ? 'beam_fn_api,use_executable_stage_bundle_execution' : "beam_fn_api"

// we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions
Contributor

Isn't this comment redundant? The line below shows exactly what the comment says but (and that's the risk) it can change in the future.

Please delete the comment.

Member Author

Ok, done

def fnapiExperiments = project.hasProperty('useExecutableStage') ? 'beam_fn_api,use_executable_stage_bundle_execution' : "beam_fn_api"

// we require dataflowWorker, optional workerHarnessContainerImage and optional additionalOptions
def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] ->
Contributor

Yep - this improves this code a lot. Great!


}
}

task postCommitLegacyWorkerJava11(type: Test) {
Contributor

👍

@mwalenia force-pushed the BEAM-6936-dataflow-java11-examples branch from fd81d7e to 210d172 on April 10, 2019 07:24
@mwalenia (Member, Author)

@lgajowy I fixed the nits and commit history. Ready to merge

@lgajowy (Contributor) commented Apr 10, 2019

Run seed job

@lgajowy (Contributor) commented Apr 10, 2019

Run Java examples on Dataflow with Java 11

@lgajowy (Contributor) commented Apr 10, 2019

Run Java Portability examples on Dataflow with Java 11

@lgajowy (Contributor) left a comment

Tests are failing. Could you investigate/fix?

@mwalenia (Member, Author)

java.lang.RuntimeException: Workflow failed. Causes: Project apache-beam-testing has insufficient quota(s) to execute this workflow with 1 instances in region us-central1. Quota summary (required/available): 1/7244 instances, 4/0 CPUs, 430/116819 disk GB, 0/4046 SSD disk GB, 1/263 instance groups, 1/264 managed instance groups, 1/207 instance templates, 1/499 in-use IP addresses.
Something must have saturated the CPU quota.

@mwalenia (Member, Author)

@alanmyrvold @pabloem There's a problem with tests exceeding the CPU quota. I believe we need to raise the quota, considering that we're adding more and more Dataflow-based tests. What do you think?

@pabloem (Member) commented Apr 11, 2019

Run Java Portability examples on Dataflow with Java 11

@pabloem (Member) commented Apr 11, 2019

Quota issues should go away immediately. If we see them happening more often, we can investigate, but they should not be frequent.

@pabloem (Member) commented Apr 11, 2019

Run seed job

1 similar comment
@pabloem (Member) commented Apr 12, 2019

Run seed job

@pabloem (Member) commented Apr 12, 2019

Run Java Portability examples on Dataflow with Java 11

@pabloem (Member) commented Apr 12, 2019

Run Java examples on Dataflow with Java 11

@pabloem (Member) commented Apr 12, 2019

: ) finally it's passing

@pabloem (Member) commented Apr 15, 2019

should we merge these? : P

@lgajowy (Contributor) commented Apr 16, 2019

Oops. Missed this. Sorry. Merging.

@lgajowy merged commit 4e099df into apache:master on Apr 16, 2019
@lgajowy (Contributor) commented Apr 16, 2019

@mwalenia thanks for the PR.
@pabloem thanks for pinging me. :)

"--tempRoot=${gcsTempRoot}",
"--runner=TestDataflowRunner",
"--dataflowWorkerJar=${dataflowWorkerJar}",
workerHarnessContainerImage.isEmpty() ?'':"--workerHarnessContainerImage=${workerHarnessContainerImage}"
Contributor

When dataflowWorkerJar is provided, workerHarnessContainerImage should be set to empty explicitly; otherwise the Dataflow service will still try to pull the harness image rather than use the custom-built worker jar.

Contributor

Fix here: #10635
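
For context, a hedged sketch of what the comment above describes, based on the commonConfig closure from this PR; the actual change is in #10635 and may differ:

// Always emit the option, even with an empty value, so the Dataflow service
// uses the staged worker jar instead of pulling a harness container image.
def commonConfig = { dataflowWorkerJar, workerHarnessContainerImage = '', additionalOptions = [] ->
  return {
    def pipelineOptions = [
        "--runner=TestDataflowRunner",
        "--dataflowWorkerJar=${dataflowWorkerJar}",
        // keep the flag present when the image is empty instead of dropping it:
        "--workerHarnessContainerImage=${workerHarnessContainerImage}",
    ] + additionalOptions
    systemProperty "beamTestPipelineOptions", groovy.json.JsonOutput.toJson(pipelineOptions)
  }
}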

@mwalenia deleted the BEAM-6936-dataflow-java11-examples branch on January 24, 2020 10:40