Skip to content

Conversation

dapengzhang0
Copy link
Contributor

@dapengzhang0 dapengzhang0 commented Apr 14, 2021

Fix the following bug:

ManagedChannelImpl.ConfigSelectingClientCall may return early in start() leaving delegate null, and fails request() method after start().

Currently the bug can only be triggered when using xDS.

Example stacktrace:

FINEST: [Channel<4>: (xds:///grpc-test1618369345:8268)] Entering CONNECTING state with picker: EmptyPicker{}
Apr 13, 2021 9:34:20 PM io.grpc.testing.integration.XdsTestClient run
SEVERE: Error running client
java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:566)
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:547)
	at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:104)
	at io.grpc.testing.integration.XdsTestClient.runQps(XdsTestClient.java:435)
	at io.grpc.testing.integration.XdsTestClient.run(XdsTestClient.java:264)
	at io.grpc.testing.integration.XdsTestClient.main(XdsTestClient.java:116)
Caused by: java.lang.NullPointerException
	at io.grpc.PartialForwardingClientCall.request(PartialForwardingClientCall.java:34)
	at io.grpc.ForwardingClientCall.request(ForwardingClientCall.java:22)
	at io.grpc.PartialForwardingClientCall.request(PartialForwardingClientCall.java:34)
	at io.grpc.ForwardingClientCall.request(ForwardingClientCall.java:22)
	at io.grpc.ForwardingClientCall$SimpleForwardingClientCall.request(ForwardingClientCall.java:44)
	at io.grpc.PartialForwardingClientCall.request(PartialForwardingClientCall.java:34)
	at io.grpc.ForwardingClientCall.request(ForwardingClientCall.java:22)
	at io.grpc.ForwardingClientCall$SimpleForwardingClientCall.request(ForwardingClientCall.java:44)
	at io.grpc.PartialForwardingClientCall.request(PartialForwardingClientCall.java:34)
	at io.grpc.ForwardingClientCall.request(ForwardingClientCall.java:22)
	at io.grpc.ForwardingClientCall$SimpleForwardingClientCall.request(ForwardingClientCall.java:44)
	at io.grpc.stub.ClientCalls$CallToStreamObserverAdapter.request(ClientCalls.java:415)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onStart(ClientCalls.java:492)
	at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:333)
	at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:306)
	at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:294)
	at io.grpc.stub.ClientCalls.asyncUnaryCall(ClientCalls.java:68)
	at io.grpc.testing.integration.TestServiceGrpc$TestServiceStub.unaryCall(TestServiceGrpc.java:509)
	at io.grpc.testing.integration.XdsTestClient$1PeriodicRpc.makeRpc(XdsTestClient.java:361)
	at io.grpc.testing.integration.XdsTestClient$1PeriodicRpc.run(XdsTestClient.java:295)
	at com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator$NeverSuccessfulListenableFutureTask.run(MoreExecutors.java:644)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Apr 13, 2021 9:34:20 PM io.grpc.ChannelLogger log
FINEST: [Channel<4>: (xds:///grpc-test1618369345:8268)] Entering TRANSIENT_FAILURE state with picker: Picker{result=PickResult{subchannel=null, streamTracerFactory=null, status=Status{code=UNAVAILABLE, description=NameResolver returned no usable address. addrs=[], attrs={}
io.grpc.xds.XdsNameResolver@7551e211 was used, cause=null}, drop=false}}

@dapengzhang0 dapengzhang0 added this to the 1.38 milestone Apr 14, 2021
@dapengzhang0 dapengzhang0 added the TODO:backport PR needs to be backported. Removed after backport complete label Apr 14, 2021
}
}

private static final ClientCall<Object, Object> NOOP_CALL = new ClientCall<Object, Object>() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is copied from ClientInterceptors.NOOP_CALL and DelayedClientCall.NOOP_CALL. Not refactoring and reusing the code, because this makes minimum change to backport.

Status status = result.getStatus();
if (!status.isOk()) {
executeCloseObserverInContext(observer, status);
delegate = (ClientCall<ReqT, RespT>) NOOP_CALL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would have been nice to avoid the cast and I think it is possible to achieve that but I understand this pattern was copied from ClientInterceptors.NOOP_CALL and DelayedClientCall.NOOP_CALL. If possible the refactoring can combine these into one and also avoid the cast?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible the refactoring can combine these into one and also avoid the cast?

I think so.

Copy link
Contributor

@sanjaypujare sanjaypujare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG with 1 comment

@dapengzhang0 dapengzhang0 merged commit d25ebaf into grpc:master Apr 15, 2021
@dapengzhang0 dapengzhang0 deleted the fix-npe branch April 15, 2021 06:06
@dapengzhang0 dapengzhang0 removed the TODO:backport PR needs to be backported. Removed after backport complete label Apr 15, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants