[BEAM-8106] Add version to java container image name #12505

Merged
merged 9 commits into from Sep 29, 2020

emilymye
Contributor

@emilymye emilymye commented Aug 7, 2020

Adding version number to Java 8 image names (in anticipation of releasing java11 container)

This PR:

  • Changes the output tag/name for the docker container task to apache/beam_java8_sdk instead of apache/beam_java_sdk
  • Adjusts hardcoded/java8-only references to java_sdk to java8_sdk (the changes needed to allow for versioning are left to a different PR)
  • Removes the separate Dockerfile for Java11 and adds a java_version argument to the existing Dockerfile. This changes the build command:

Old:

./gradlew :sdks:java:container:docker -Pdockerfile=Dockerfile-java11 ...

New:

./gradlew :sdks:java:container:docker -PimageJavaVersion=11 ...

The actual difference between Dockerfile-java11 and Dockerfile is minimal - literally these COPY lines:

COPY target/LICENSE /opt/apache/beam/
COPY target/NOTICE /opt/apache/beam/

are the only difference, and I'm pretty sure they should be copied appropriately for the Java11 image as well.

Things that should be done at submission/after merge:

  • Add new repository to hub.docker.com/apache before next run of release script.
  • Send a notice out to dev@ and user@
  • Notify releaser that name of released image will have changed after version 3.XX.00


@emilymye
Contributor Author

Run Python2_PVR_Flink PreCommit

@emilymye
Contributor Author

retest this please

@codecov

codecov bot commented Sep 1, 2020

Codecov Report

Merging #12505 into master will increase coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #12505      +/-   ##
==========================================
+ Coverage   82.31%   82.33%   +0.02%     
==========================================
  Files         455      452       -3     
  Lines       54596    53985     -611     
==========================================
- Hits        44941    44449     -492     
+ Misses       9655     9536     -119     
Impacted Files Coverage Δ
...eam/runners/interactive/options/capture_control.py 92.00% <0.00%> (-8.00%) ⬇️
sdks/python/apache_beam/__init__.py 80.00% <0.00%> (-5.72%) ⬇️
conftest.py 77.77% <0.00%> (-5.56%) ⬇️
.../python/apache_beam/io/gcp/bigquery_io_metadata.py 86.95% <0.00%> (-3.67%) ⬇️
...pache_beam/runners/interactive/interactive_beam.py 76.02% <0.00%> (-2.83%) ⬇️
sdks/python/apache_beam/dataframe/io.py 89.62% <0.00%> (-2.38%) ⬇️
...e_beam/io/gcp/big_query_query_to_table_pipeline.py 29.03% <0.00%> (-2.01%) ⬇️
sdks/python/apache_beam/io/fileio.py 94.07% <0.00%> (-1.78%) ⬇️
...beam/runners/interactive/background_caching_job.py 94.78% <0.00%> (-1.74%) ⬇️
...eam/runners/interactive/interactive_environment.py 88.28% <0.00%> (-1.17%) ⬇️
... and 39 more


@kennknowles
Member

Just checking - is this blocked on anything that we can help with?

@emilymye
Contributor Author

@kennknowles sorry for the delay - I got distracted doing some other things after there was a Python precommit error that I think just ended up being a flake. Will ping when ready!

@emilymye
Contributor Author

run java postcommit

@emilymye
Contributor Author

R: @chamikaramj @TheNeuralBit
cc: @kennknowles
I think Kenneth mentioned y'all might be good reviewers for this PR. I previously ran the Java postcommit and it passed, but I'm uncertain if I correctly ran all the tests I need to actually verify this change.

@emilymye emilymye changed the title [WIP][BEAM-8106] Add version to java container image name [BEAM-8106] Add version to java container image name Sep 14, 2020
@TheNeuralBit
Member

It looks like the spotless (Java code format) check is failing. FYI you can auto-format locally with ./gradlew spotlessApply. It's not critical until we're ready to merge though.

@TheNeuralBit
Member

Note that the release manager most likely to be impacted is @robinyqiu for 2.25.0, since the 2.24.0 branch has already been cut.

@TheNeuralBit
Member

Run Dataflow PortabilityApi ValidatesRunner with Java 11

Member

@TheNeuralBit TheNeuralBit left a comment

I have a couple of minor comments and suggestions. Based on https://lists.apache.org/thread.html/rdf813078ee2728b6f26cb89c2857a844139f2c8ca78a68c69512f1ff%40%3Cdev.beam.apache.org%3E I agree that your set of post-merge tasks is sufficient.


private static String getDefaultJavaSdkHarnessContainerUrl() {
String javaVersionId =
Float.parseFloat(System.getProperty("java.specification.version")) >= 9 ? "java11" : "java8";
Member

nit: could you make this an exact check for 8 and 11, and throw an exception for other (unsupported) versions?

Contributor Author

I went ahead and added an Enum to Environments, and removed similar checks in DataflowRunner
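
For readers following along, the exact-match check requested above could look roughly like the sketch below. This is not the code merged in this PR: the enum name `SketchJavaVersion` and its members are hypothetical, and the only facts taken from the thread are the `java8`/`java11` identifiers and the use of the `java.specification.version` system property (which is `1.8` on Java 8 and `11` on Java 11).

```java
/** Hypothetical sketch: maps java.specification.version to a container name fragment. */
enum SketchJavaVersion {
  JAVA_8("java8", "1.8"),
  JAVA_11("java11", "11");

  private final String containerId;
  private final String specification;

  SketchJavaVersion(String containerId, String specification) {
    this.containerId = containerId;
    this.specification = specification;
  }

  /** Fragment used in the image name, e.g. "java8". */
  String containerId() {
    return containerId;
  }

  /** Exact match only: anything other than "1.8" or "11" is rejected. */
  static SketchJavaVersion forSpecification(String specification) {
    for (SketchJavaVersion version : values()) {
      if (version.specification.equals(specification)) {
        return version;
      }
    }
    throw new UnsupportedOperationException(
        "Unsupported Java version: " + specification);
  }

  public static void main(String[] args) {
    String spec = System.getProperty("java.specification.version");
    try {
      System.out.println(forSpecification(spec).containerId());
    } catch (UnsupportedOperationException e) {
      // Running on a JVM other than 8 or 11 lands here by design.
      System.out.println(e.getMessage());
    }
  }
}
```

Compared with the `>= 9` comparison in the excerpt above, an exact match fails loudly on, say, Java 9 instead of silently selecting the java11 image.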

REPOSITORY TAG IMAGE ID CREATED SIZE
apache/beam_java_sdk latest 16ca619d489e 2 weeks ago 550MB
REPOSITORY TAG IMAGE ID CREATED SIZE
apache/beam_java_8sdk latest 16ca619d489e 2 weeks ago 550MB
Member

typo here:

Suggested change
apache/beam_java_8sdk latest 16ca619d489e 2 weeks ago 550MB
apache/beam_java8_sdk latest 16ca619d489e 2 weeks ago 550MB

I'm assuming this change and the others like it are the result of a search to find and update all the references to beam_java_sdk, so we don't need to worry about there being other references?

Contributor

Let's make sure that the correct container name is set in the environment sent to the remote SDKs that use Java transforms as cross-language transforms. (We don't have integration tests set up for this, but we can just try the existing Kafka/SQL x-lang examples to confirm.)

https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/kafkataxi
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/sql_taxi.py

Member

We should make sure the references to beam_java_sdk are updated in those examples too

Contributor Author

Weird! I don't know why they didn't show up in my initial ctrl-f

Member

I bet it was a race condition, those examples were added in just the last couple of months
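
As an aside, a repeatable alternative to a manual ctrl-f is a small scan over the checkout for the old name. The sketch below is only an illustration under stated assumptions: the class `StaleReferenceFinderSketch` and its methods are hypothetical and not part of Beam, and the root directory defaults to the current directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical helper: lists lines that still mention an old image name. */
final class StaleReferenceFinderSketch {
  private StaleReferenceFinderSketch() {}

  /** Returns the 1-based numbers of the lines that contain the needle. */
  static List<Integer> matchingLineNumbers(List<String> lines, String needle) {
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < lines.size(); i++) {
      if (lines.get(i).contains(needle)) {
        hits.add(i + 1);
      }
    }
    return hits;
  }

  /** Walks root and prints "path:line" for every match in readable text files. */
  static void report(Path root, String needle) {
    try (Stream<Path> paths = Files.walk(root)) {
      paths.filter(Files::isRegularFile).forEach(path -> {
        List<String> lines;
        try {
          lines = Files.readAllLines(path, StandardCharsets.UTF_8);
        } catch (IOException e) {
          return; // skip binary or unreadable files
        }
        for (int lineNumber : matchingLineNumbers(lines, needle)) {
          System.out.println(path + ":" + lineNumber);
        }
      });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path root = Paths.get(args.length > 0 ? args[0] : ".");
    report(root, "beam_java_sdk");
  }
}
```

Because the match is a plain substring test, already-updated references such as `beam_java8_sdk` are not reported.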

Contributor Author

I went ahead and ran the Kafka example "from local Beam repo" without specifying --sdk_harness_container_image_overrides to see whether the correct URL is passed, which it seems to be (based on the worker container manifest):

{
      "args": [ ... ],
      "image": "apache/beam_java8_sdk:2.25.0.dev",
      "imagePullPolicy": "IfNotPresent",
      "name": "sdk-1-0",
      "volumeMounts": [ {
        "mountPath": "/var/opt/google",
        "name": "persist"
},     

Dataflow can't actually grab that container because it doesn't exist (on hub.docker.com), so the job itself doesn't work, but it seems to be grabbing the right URL.
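
To make the naming scheme in the manifest concrete, here is a minimal sketch of how such an image URL could be assembled. It assumes the pattern `<root>/beam_java<version>_sdk:<tag>`; only the Java 8 form appears in this thread, so the Java 11 name produced by the same pattern is an assumption. The class and method names are hypothetical and are not Beam's API.

```java
/** Hypothetical sketch of composing a versioned Java SDK container image name. */
final class ContainerImageNameSketch {
  private ContainerImageNameSketch() {}

  /**
   * Builds "<root>/beam_java<version>_sdk:<tag>".
   * Only 8 and 11 are accepted, mirroring the exact-version check discussed above.
   */
  static String imageName(String repositoryRoot, int javaVersion, String tag) {
    if (javaVersion != 8 && javaVersion != 11) {
      throw new IllegalArgumentException("Unsupported Java version: " + javaVersion);
    }
    return String.format("%s/beam_java%d_sdk:%s", repositoryRoot, javaVersion, tag);
  }

  public static void main(String[] args) {
    // Prints apache/beam_java8_sdk:2.25.0.dev, the value seen in the manifest.
    System.out.println(imageName("apache", 8, "2.25.0.dev"));
  }
}
```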

Contributor

That looks correct. Thanks.

@emilymye
Contributor Author

Will update once I've run the x-lang example

@emilymye
Contributor Author

retest this please

@emilymye
Contributor Author

Run Java PreCommit

@emilymye
Contributor Author

Run Python2_PVR_Flink PreCommit

Member

@TheNeuralBit TheNeuralBit left a comment

LGTM, thank you!

I think this is safe to merge but maybe we should hold off until INFRA-20866 is closed. If that takes a while I'd hate to have it hold up the release.

@emilymye
Contributor Author

@TheNeuralBit it looks like one of the Python unit tests failed because of a flake, but I think this PR is ready to go

@TheNeuralBit
Member

Filed a jira for the flake: BEAM-10987

@TheNeuralBit TheNeuralBit merged commit 97a270f into apache:master Sep 29, 2020
@TheNeuralBit
Member

FYI: @lostluck 2.26.0 will be the first release with a Java container published at beam_java8_sdk rather than beam_java_sdk

@emilymye you're planning on sending a message to user@ and dev@ about this, right?

@emilymye
Contributor Author

@TheNeuralBit composing an email now

@TheNeuralBit
Member

Thank you!

ibzib pushed a commit to ibzib/beam that referenced this pull request Sep 30, 2020
* add java_version arg for building Java harness container

* change default java container name to java8

* docs changes

* delete Java11 Dockerfile

* fix typo

* add JavaVersion to Environments

* fix more references in documentation to beam_java_sdk

* fix names of functions

* remove underscore
@lostluck
Contributor

lostluck commented Sep 30, 2020 via email
