Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-8398] Upgrade Dataflow Java Client API #9792

Merged
merged 2 commits into from
Oct 30, 2019
Merged

Conversation

ibzib
Copy link
Contributor

@ibzib ibzib commented Oct 14, 2019

We need to pick up a newer version of the Dataflow API to add support for the new worker_region and worker_zone flags (BEAM-8251).


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- Build Status --- --- Build Status
Java Build Status Build Status Build Status Build Status
Build Status
Build Status
Build Status Build Status Build Status
Build Status
Python Build Status
Build Status
Build Status
Build Status
--- Build Status
Build Status
Build Status
Build Status
--- --- Build Status
XLang --- --- --- Build Status --- --- ---

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website
Non-portable Build Status Build Status
Build Status
Build Status Build Status
Portable --- Build Status --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

@ibzib ibzib force-pushed the java-region branch 2 times, most recently from 52e9607 to 9f28cd7 Compare October 14, 2019 21:37
Copy link
Member

@lukecwik lukecwik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java Precommit failed due to:
14:42:23 * What went wrong:
14:42:23 Could not determine the dependencies of task ':sdks:java:testing:nexmark:analyzeTestClassesDependencies'.
14:42:23 > Could not resolve all dependencies for configuration ':sdks:java:testing:nexmark:testCompile'.
14:42:23 > Could not find com.google.cloud:google-cloud-bigquery:1.29.2.

@ibzib
Copy link
Contributor Author

ibzib commented Oct 14, 2019

Judging by here, it looks like these bigquery version are way off from $google_clients_version, so I'm guessing they're semantically different?

@lukecwik
Copy link
Member

Pointing google-cloud-bigquery at 1.97.0 picks up the google-cloud-clients 0.115.0-alpha bom. The bom defines a bunch of dependency versions of compatible libraries. Also, the versions of libraries that you selected may not be compatible with the related google support libraries like gax, auth, ....

Please provide the transitive dependency diff including version overrides where we squash a resolved version for an older/newer version.

@ibzib
Copy link
Contributor Author

ibzib commented Oct 15, 2019

I made some more conservative version choices. The diff, besides the obvious/expected 1.27->1.28 changes, is pretty minimal, as follows.

google-http-client picked up some new dependencies. Before:

|    |    +--- com.google.http-client:google-http-client:1.27.0
|    |    |    +--- com.google.code.findbugs:jsr305:3.0.2
|    |    |    +--- com.google.guava:guava:20.0
|    |    |    +--- org.apache.httpcomponents:httpclient:4.5.5
|    |    |    |    +--- org.apache.httpcomponents:httpcore:4.4.9
|    |    |    |    +--- commons-logging:commons-logging:1.2
|    |    |    |    \--- commons-codec:commons-codec:1.10
|    |    |    \--- com.google.j2objc:j2objc-annotations:1.1

After:

|    |    +--- com.google.http-client:google-http-client:1.28.0
|    |    |    +--- com.google.code.findbugs:jsr305:3.0.2
|    |    |    +--- com.google.guava:guava:26.0-android -> 20.0
|    |    |    +--- com.google.j2objc:j2objc-annotations:1.1
|    |    |    +--- io.opencensus:opencensus-api:0.18.0
|    |    |    |    \--- io.grpc:grpc-context:1.14.0
|    |    |    \--- io.opencensus:opencensus-contrib-http-util:0.18.0
|    |    |         +--- io.opencensus:opencensus-api:0.18.0 (*)
|    |    |         \--- com.google.guava:guava:20.0

The existing org.apache.httpcomponents:httpclient was added under google-api-client:

|    |    +--- com.google.http-client:google-http-client-apache:2.0.0
|    |    |    +--- com.google.http-client:google-http-client:1.28.0 (*)
|    |    |    \--- org.apache.httpcomponents:httpclient:4.5.5
|    |    |         +--- org.apache.httpcomponents:httpcore:4.4.9
|    |    |         +--- commons-logging:commons-logging:1.2
|    |    |         \--- commons-codec:commons-codec:1.10

@ibzib
Copy link
Contributor Author

ibzib commented Oct 28, 2019

Run Python PreCommit

@ibzib
Copy link
Contributor Author

ibzib commented Oct 29, 2019

@lukecwik The version choices indicated above required one minor change to pass tests. PTAL

(testExampleEchoPipeline is a known flake.)

@lukecwik
Copy link
Member

Run Java PreCommit

@@ -169,7 +169,7 @@ public int publish(TopicPath topic, List<OutgoingMessage> outgoingMessages) thro
@Nullable Map<String, String> attributes = pubsubMessage.getAttributes();

// Payload.
byte[] elementBytes = pubsubMessage.decodeData();
byte[] elementBytes = pubsubMessage.getData() == null ? null : pubsubMessage.decodeData();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be modifying the code to make a test pass.

Is the way the test is setting this up is bad or was there an actual API change which now allows getData to be null?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming PubsubJsonClientTest.pullOneMessageWithNoData started to fail.

Copy link
Member

@lukecwik lukecwik Oct 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I downloaded the source for both versions (1.27, 1.28) and they both invoke the Base64 library:

  • decodeData() uses com.google.api.client.util.Base64.decodeBase64
  • encodeData() uses com.google.api.client.util.Base64.encodeBase64URLSafeString

It looks like there was a change in the Base64 library implementation since it now doesn't take null and return null.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the Base64 library version went through several changes from 1.27 -> 1.28, mainly swapping from commons codec to Guava's version of base64 encoding.

Maybe Guava made a backwards incompatible change or the override from Guava 26.0-android back to 20.0 that we are now forcing because of the upgrade is causing the issue.

Copy link
Member

@lukecwik lukecwik Oct 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like Guavas 20 and Guava 26 Base64 didn't allow nulls so checking getData should be fine replacement for prior logic.

It may also be that the test is wrong and should have really called encodeData with an empty byte array to get the no data case for testing but it is unclear from the pubsub documentation what the contents of the field maybe so this is the safer route.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a problem with the HTTP client. (PubsubMessage::decodeData and Base64 Javadocs clearly state that they will return null if the input is null, but this behavior relied on Guava's BaseEncoding implementation, which AFAICT never made such a promise.) Apparently this was fixed in version 1.30.0 of the HTTP client: googleapis/google-http-java-client#644

@lukecwik lukecwik merged commit d664592 into apache:master Oct 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants