[BEAM-8889] Upgrades gcsio to 2.1.3 #11651

Merged: 2 commits merged on May 21, 2020

Contributor

@veblush veblush commented May 9, 2020

This is part of BEAM-8889: it upgrades the GCS connector (gcsio) to 2.1.3, following #11315. This is the first version to support the gRPC protocol, so it adds a new dependency on gRPC, which transitively affects the dependency tree of the Beam SDK. The gRPC feature is only available with the use_grpc_for_gcs experiment flag and for whitelisted GCS buckets.


@veblush
Contributor Author

veblush commented May 13, 2020

R: @chamikaramj @lukecwik

@chamikaramj
Contributor

Retest this please

@chamikaramj
Contributor

Run Java PostCommit

@chamikaramj
Contributor

Run Dataflow ValidatesRunner

@veblush
Contributor Author

veblush commented May 19, 2020

No performance regression was observed with this change (verified with internal tests).

@chamikaramj
Contributor

Run Java PreCommit

@chamikaramj
Contributor

Run Java PostCommit

@veblush
Contributor Author

veblush commented May 19, 2020

It appears that this change actually has a dependency problem.

Java PostCommit / testReadEmptySketchFromBigQuery

Caused by: java.lang.NoSuchFieldError: NETTY_SHADED
	at io.grpc.netty.shaded.io.grpc.netty.NettyClientStream.<clinit>(NettyClientStream.java:59)
	at io.grpc.netty.shaded.io.grpc.netty.NettyClientTransport.newStream(NettyClientTransport.java:177)
	at io.grpc.internal.MetadataApplierImpl.apply(MetadataApplierImpl.java:70)
	at io.grpc.CallCredentials2$1.apply(CallCredentials2.java:61)
	at io.grpc.auth.GoogleAuthLibraryCallCredentials$1.onSuccess(GoogleAuthLibraryCallCredentials.java:133)
	at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:134)
	at io.grpc.auth.GoogleAuthLibraryCallCredentials.applyRequestMetadata(GoogleAuthLibraryCallCredentials.java:110)
	at io.grpc.CallCredentials2.applyRequestMetadata(CallCredentials2.java:58)
	at io.grpc.internal.CallCredentialsApplyingTransportFactory$CallCredentialsApplyingTransport.newStream(CallCredentialsApplyingTransportFactory.java:108)
	at io.grpc.internal.ForwardingConnectionClientTransport.newStream(ForwardingConnectionClientTransport.java:49)
	at io.grpc.internal.InternalSubchannel$CallTracingTransport.newStream(InternalSubchannel.java:635)
	at io.grpc.internal.DelayedClientTransport$PendingStream.createRealStream(DelayedClientTransport.java:353)
	at io.grpc.internal.DelayedClientTransport$PendingStream.access$300(DelayedClientTransport.java:341)
	at io.grpc.internal.DelayedClientTransport$5.run(DelayedClientTransport.java:300)

Java PostCommit / SpannerWriteIT/testWrite_2

Caused by: java.lang.NoClassDefFoundError: Could not initialize class io.grpc.netty.shaded.io.grpc.netty.NettyClientStream
	at io.grpc.netty.shaded.io.grpc.netty.NettyClientTransport.newStream(NettyClientTransport.java:177)
	at io.grpc.internal.MetadataApplierImpl.apply(MetadataApplierImpl.java:70)
	at io.grpc.CallCredentials2$1.apply(CallCredentials2.java:61)
	at io.grpc.auth.GoogleAuthLibraryCallCredentials$1.onSuccess(GoogleAuthLibraryCallCredentials.java:133)
	at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:134)
	at io.grpc.auth.GoogleAuthLibraryCallCredentials.applyRequestMetadata(GoogleAuthLibraryCallCredentials.java:110)
	at io.grpc.CallCredentials2.applyRequestMetadata(CallCredentials2.java:58)
	at io.grpc.internal.CallCredentialsApplyingTransportFactory$CallCredentialsApplyingTransport.newStream(CallCredentialsApplyingTransportFactory.java:108)
	at io.grpc.internal.ForwardingConnectionClientTransport.newStream(ForwardingConnectionClientTransport.java:49)
	at io.grpc.internal.InternalSubchannel$CallTracingTransport.newStream(InternalSubchannel.java:635)
	at io.grpc.internal.DelayedClientTransport$PendingStream.createRealStream(DelayedClientTransport.java:353)
	at io.grpc.internal.DelayedClientTransport$PendingStream.access$300(DelayedClientTransport.java:341)
	at io.grpc.internal.DelayedClientTransport$5.run(DelayedClientTransport.java:300)
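
A possible reading of the two traces above is that they are one failure seen twice. When a static initializer throws an Error, the JVM rethrows that Error on first use and marks the class as unusable; every later use in the same JVM then fails with NoClassDefFoundError: Could not initialize class. The sketch below reproduces that sequence with a hypothetical stand-in class (Broken is not gRPC code, and the NoSuchFieldError is thrown by hand rather than produced by a real linkage failure). Whether both tests actually ran in one JVM is an assumption.

```java
public class StaticInitFailureDemo {

    /** Hypothetical stand-in for a class whose static initializer hits a linkage problem. */
    static class Broken {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = init();

        private static int init() {
            // Simulates the JVM failing to resolve a field during <clinit>.
            throw new NoSuchFieldError("NETTY_SHADED");
        }
    }

    /** Touches Broken and returns the class name of whatever Throwable comes out. */
    public static String touch() {
        try {
            return "ok: " + Broken.VALUE;
        } catch (Throwable t) {
            return t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        // First use runs the static initializer, which fails.
        System.out.println("first access:  " + touch());
        // Later uses find the class in an erroneous state.
        System.out.println("second access: " + touch());
    }
}
```

Run as written, main should report java.lang.NoSuchFieldError for the first access and java.lang.NoClassDefFoundError for the second. If the same thing happened here, the SpannerWriteIT trace is a symptom and the NETTY_SHADED trace is the one worth debugging.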

@veblush
Contributor Author

veblush commented May 19, 2020

This is most likely caused by gcsio 2.1.3's dependency on grpc-netty-shaded (ref). Meanwhile, Beam uses grpc_netty, so it ends up with the same Netty library in two different forms.

Contributor

@chamikaramj chamikaramj left a comment

Thanks.

@veblush
Contributor Author

veblush commented May 19, 2020

It turns out that grpc-netty and grpc-netty-shaded can coexist, so I'm listing the new dependencies in the build file to make all gRPC components use the same version.
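
One way to check whether the gRPC artifacts line up is to ask the class loader where each class comes from, since jar file names usually carry the version. This is a minimal sketch and not part of this PR: ClasspathCopies and its method name are made up for illustration, and the two class names are simply taken from the stack traces above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ClasspathCopies {

    /** Returns every location visible to the system class loader that provides the class. */
    public static List<String> find(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        List<String> locations = new ArrayList<>();
        try {
            for (URL url : Collections.list(loader.getResources(resource))) {
                locations.add(url.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return locations;
    }

    public static void main(String[] args) {
        // Class names copied from the stack traces in this thread.
        String[] classNames = {
            "io.grpc.netty.shaded.io.grpc.netty.NettyClientStream",
            "io.grpc.internal.DelayedClientTransport",
        };
        for (String name : classNames) {
            List<String> found = find(name);
            System.out.println(name + " -> " + (found.isEmpty() ? "not on classpath" : found));
        }
    }
}
```

If a class shows up more than once, or the jar names reveal different gRPC versions for the shaded transport and the core classes, that points at the kind of mismatch discussed here.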

@chamikaramj
Contributor

Retest this please

2 similar comments
@chamikaramj
Contributor

Retest this please

@chamikaramj
Contributor

Retest this please

@chamikaramj
Contributor

Run Python2_PVR_Flink PreCommit

1 similar comment
@chamikaramj
Contributor

Run Python2_PVR_Flink PreCommit

@chamikaramj
Contributor

Have you run the linkage checker?
https://cwiki.apache.org/confluence/display/BEAM/Dependency+Upgrades

@veblush
Contributor Author

veblush commented May 20, 2020

This is the result of the linkage checker. (Running it was challenging because it seems to strictly require Java 8 to finish smoothly.)

# /bin/bash sdks/java/build-tools/beam-linkage-check.sh
(omitted)
Wed May 20 17:57:11 UTC 2020: Done: 0
No new linkage errors

@chamikaramj
Contributor

Run Dataflow ValidatesRunner

1 similar comment
@chamikaramj
Contributor

Run Dataflow ValidatesRunner

@chamikaramj
Contributor

I'm having trouble triggering tests at the moment. I think this is good to go once we get the following to pass with the latest code:
Run Dataflow ValidatesRunner
Run Java PostCommit

@chamikaramj
Contributor

LGTM

@chamikaramj
Contributor

Run Java PostCommit

@chamikaramj
Contributor

Run Java PostCommit

1 similar comment
@chamikaramj
Contributor

Run Java PostCommit

@chamikaramj
Contributor

Run Dataflow ValidatesRunner

@TheNeuralBit
Member

Run Java PostCommit

@TheNeuralBit
Member

whoops

@TheNeuralBit
Member

It looks like Dataflow VR tests are going to fail:

15:10:31 org.apache.beam.sdk.transforms.join.CoGroupByKeyTest > testCoGroupByKeyWithWindowing FAILED
15:10:31     java.lang.RuntimeException at CoGroupByKeyTest.java:494
15:10:31         Caused by: java.lang.OutOfMemoryError
15:10:31 
15:10:31 org.apache.beam.sdk.transforms.FlattenTest > testFlattenIterablesLists FAILED
15:10:31     java.lang.RuntimeException at FlattenTest.java:270
15:10:31         Caused by: java.lang.OutOfMemoryError
15:10:31 
15:10:31 org.apache.beam.sdk.transforms.CreateTest > testCreateWithVoidType FAILED
15:10:31     java.lang.RuntimeException at CreateTest.java:326
15:10:31         Caused by: java.lang.OutOfMemoryError
15:10:31 
15:10:31 org.apache.beam.sdk.transforms.ReshuffleTest > testReshuffleAfterSlidingWindows FAILED
15:10:31     java.lang.RuntimeException at ReshuffleTest.java:251
15:10:31         Caused by: java.lang.RuntimeException
15:10:31             Caused by: java.lang.IllegalArgumentException
15:10:31                 Caused by: java.lang.OutOfMemoryError

@veblush
Contributor Author

veblush commented May 21, 2020

Thanks for running the test. The result doesn't make sense to me, because this change doesn't introduce any runtime behavior change other than the dependency upgrade. Could it be caused by a flaky environment? Can we rerun the test?

@TheNeuralBit
Member

Run Dataflow ValidatesRunner

@chamikaramj chamikaramj merged commit 5e132a1 into apache:master May 21, 2020
@veblush
Contributor Author

veblush commented May 21, 2020

Thanks, Brian and Chamikara!


3 participants