-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-8398] Upgrade Dataflow Java Client API #9792
Conversation
52e9607
to
9f28cd7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Java Precommit failed due to:
14:42:23 * What went wrong:
14:42:23 Could not determine the dependencies of task ':sdks:java:testing:nexmark:analyzeTestClassesDependencies'.
14:42:23 > Could not resolve all dependencies for configuration ':sdks:java:testing:nexmark:testCompile'.
14:42:23 > Could not find com.google.cloud:google-cloud-bigquery:1.29.2.
Judging by here, it looks like these bigquery version are way off from $google_clients_version, so I'm guessing they're semantically different? |
Pointing google-cloud-bigquery at 1.97.0 picks up the google-cloud-clients 0.115.0-alpha bom. The bom defines a bunch of dependency versions of compatible libraries. Also, the versions of libraries that you selected may not be compatible with the related google support libraries like gax, auth, .... Please provide the transitive dependency diff including version overrides where we squash a resolved version for an older/newer version. |
I made some more conservative version choices. The diff, besides the obvious/expected 1.27->1.28 changes, is pretty minimal, as follows.
After:
The existing
|
Run Python PreCommit |
@lukecwik The version choices indicated above required one minor change to pass tests. PTAL (testExampleEchoPipeline is a known flake.) |
Run Java PreCommit |
@@ -169,7 +169,7 @@ public int publish(TopicPath topic, List<OutgoingMessage> outgoingMessages) thro | |||
@Nullable Map<String, String> attributes = pubsubMessage.getAttributes(); | |||
|
|||
// Payload. | |||
byte[] elementBytes = pubsubMessage.decodeData(); | |||
byte[] elementBytes = pubsubMessage.getData() == null ? null : pubsubMessage.decodeData(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should be modifying the code to make a test pass.
Is the way the test is setting this up is bad or was there an actual API change which now allows getData to be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming PubsubJsonClientTest.pullOneMessageWithNoData started to fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I downloaded the source for both versions (1.27, 1.28) and they both invoke the Base64 library:
- decodeData() uses com.google.api.client.util.Base64.decodeBase64
- encodeData() uses com.google.api.client.util.Base64.encodeBase64URLSafeString
It looks like there was a change in the Base64 library implementation since it now doesn't take null and return null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the Base64 library version went through several changes from 1.27 -> 1.28, mainly swapping from commons codec to Guava's version of base64 encoding.
Maybe Guava made a backwards incompatible change or the override from Guava 26.0-android back to 20.0 that we are now forcing because of the upgrade is causing the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like Guavas 20 and Guava 26 Base64 didn't allow nulls so checking getData should be fine replacement for prior logic.
It may also be that the test is wrong and should have really called encodeData with an empty byte array to get the no data case for testing but it is unclear from the pubsub documentation what the contents of the field maybe so this is the safer route.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a problem with the HTTP client. (PubsubMessage::decodeData
and Base64
Javadocs clearly state that they will return null if the input is null, but this behavior relied on Guava's BaseEncoding
implementation, which AFAICT never made such a promise.) Apparently this was fixed in version 1.30.0 of the HTTP client: googleapis/google-http-java-client#644
We need to pick up a newer version of the Dataflow API to add support for the new
worker_region
andworker_zone
flags (BEAM-8251).Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.