@dapengzhang0
Contributor

Fix bug found by internal user

W/ChannelExecutor: Runnable threw exception in ChannelExecutor
java.lang.IllegalStateException: ConnectivityStateManager is already disabled
    at com.google.common.base.Preconditions.checkState(Preconditions.java:459)
    at io.grpc.internal.ConnectivityStateManager.gotoState(ConnectivityStateManager.java:66)
    at io.grpc.internal.ManagedChannelImpl$5.run(ManagedChannelImpl.java:476)
    at io.grpc.internal.ChannelExecutor.drain(ChannelExecutor.java:72)
    at io.grpc.internal.DelayedClientTransport.shutdown(DelayedClientTransport.java:216)
    at io.grpc.internal.ManagedChannelImpl.shutdown(ManagedChannelImpl.java:480)
    at io.grpc.internal.ManagedChannelImpl.shutdown(ManagedChannelImpl.java:69)

@dapengzhang0 dapengzhang0 changed the title core: "fix ConnectivityStateManager is already disabled" bug core: fix "ConnectivityStateManager is already disabled" bug Jul 27, 2017
@dapengzhang0 dapengzhang0 force-pushed the fixchannelstateapibug branch from 4d2d151 to d37a39a Compare July 27, 2017 21:34
@dapengzhang0
Contributor Author

Jenkins, retest this please

// When ConnectivityStateManager is already disabled, then channel shutdown is called.
// Keep state being null.
return;
}
Contributor

ConnectivityStateManager should not discriminate states based on the assumption of who calls gotoState(), because the manager may be used in other cases where the assumption may not hold.
I think the issue is that ManagedChannelImpl.shutdown() calls gotoState() when the manager is already disabled. To fix this immediate issue, we can add an isDisabled() to the manager and avoid calling gotoState() if it returns true.
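The suggestion above can be sketched roughly as follows. This is a simplified stand-in, not the actual io.grpc.internal.ConnectivityStateManager or ManagedChannelImpl: the class names SketchStateManager and SketchChannel, the disable() method, and the boolean flag are all assumptions made for illustration. Only gotoState(), the proposed isDisabled(), and the exception message come from the discussion.

```java
import java.util.Objects;

// Simplified stand-in for illustration only; not the real gRPC class.
final class SketchStateManager {
  enum State { IDLE, CONNECTING, READY, TRANSIENT_FAILURE, SHUTDOWN }

  private State state = State.IDLE;
  private boolean disabled; // hypothetical flag; the real class may track this differently

  // Rejects every transition once disabled, regardless of who the caller is.
  void gotoState(State newState) {
    Objects.requireNonNull(newState, "newState");
    if (disabled) {
      throw new IllegalStateException("ConnectivityStateManager is already disabled");
    }
    state = newState;
  }

  // Hypothetical method that turns the manager off.
  void disable() {
    disabled = true;
    state = null;
  }

  // The accessor proposed in the review comment.
  boolean isDisabled() {
    return disabled;
  }

  State getState() {
    return state;
  }
}

// Simplified stand-in for the channel: the caller decides whether to transition.
final class SketchChannel {
  final SketchStateManager stateManager = new SketchStateManager();

  void shutdown() {
    // Skip the transition when the manager has already been disabled.
    if (!stateManager.isDisabled()) {
      stateManager.gotoState(SketchStateManager.State.SHUTDOWN);
    }
  }
}
```

With this shape, the manager stays strict about its own invariant and the channel carries the knowledge of when a transition is still meaningful.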

@dapengzhang0 dapengzhang0 force-pushed the fixchannelstateapibug branch from ec85979 to 748cb81 Compare July 27, 2017 22:41
@dapengzhang0 dapengzhang0 changed the title core: fix "ConnectivityStateManager is already disabled" bug core: fix ConnectivityStateManager is already disabled bug Jul 27, 2017
createChannel(new FakeNameResolverFactory(false), NO_INTERCEPTOR);
assertEquals(ConnectivityState.IDLE, channel.getState(false));
helper.updatePicker(mockPicker);

Contributor

Because gotoState() is called in ChannelExecutor, which catches all exceptions, this shutdown() will not throw even without the fix.
Perhaps we can add an assert false in ChannelExecutor's exception handling block so that the test actually fails (assuming all tests run with assertions enabled).
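To illustrate why the test cannot observe the failure, here is a minimal sketch of an executor that catches everything its runnables throw. It is a single-threaded stand-in, not the actual io.grpc.internal.ChannelExecutor: the class name CatchingExecutor, the executeLater() method, and the failure counter are assumptions for illustration, and only drain() and the catch-and-log behavior are taken from the discussion.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Simplified, single-threaded stand-in; not the real gRPC ChannelExecutor.
final class CatchingExecutor {
  private final Queue<Runnable> queue = new ArrayDeque<>();
  private int failures; // hypothetical counter, added so the effect is observable

  // Hypothetical enqueue method for this sketch.
  void executeLater(Runnable runnable) {
    queue.add(runnable);
  }

  // Runs every queued task; exceptions are logged and never reach the caller.
  void drain() {
    Runnable runnable;
    while ((runnable = queue.poll()) != null) {
      try {
        runnable.run();
      } catch (Throwable t) {
        failures++;
        System.err.println("Runnable threw exception in executor: " + t);
      }
    }
  }

  int failureCount() {
    return failures;
  }
}
```

A caller of drain() returns normally even when a queued task throws, so a test that only calls shutdown() and expects no exception passes with or without the fix.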

Contributor Author

Applications may also turn on -ea, in which case the entire application may fail if any runtime exception is thrown in ChannelExecutor's runnables.

Contributor Author

What if we run the unit tests with a -Dtestgrpc=true argument, and in ChannelExecutor's exception handling block check whether testgrpc is true and, if so, assert false?

Contributor

The decision for ChannelExecutor to catch instead of throw is questionable. I did it because ChannelExecutor may run in any thread, e.g., a network thread, and I didn't feel comfortable letting arbitrary code, e.g., a LoadBalancer, throw in a network thread. But maybe it's better to just throw? @ejona86 WDYT?

Contributor Author

Filed #3293 tracking the ChannelExecutor exception handling.

Member

I'd much rather defer the question; I don't think the user-visible exception handling behavior should be determined by the unit tests. I do think it tends to make sense to catch and log the exception, since propagating it will probably only cause additional failures, and we know the current stack does not depend on the result of the call (since the execution is not guaranteed to be complete when drain returns).

You could inject an exception handler or some such, though. We do know that none of the tests should cause an exception in the channel executor.
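One way to read the handler-injection idea is sketched below. This is an assumption about what such a design could look like, not the actual gRPC change: HandlerExecutor, RecordingHandler, executeLater(), and the use of java.util.function.Consumer are hypothetical names and choices. Production code would inject a handler that logs, while tests would inject one that records failures and assert afterwards that none occurred.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical executor whose reaction to a failing task is supplied by the caller.
final class HandlerExecutor {
  private final Queue<Runnable> queue = new ArrayDeque<>();
  private final Consumer<Throwable> onError;

  HandlerExecutor(Consumer<Throwable> onError) {
    this.onError = onError;
  }

  void executeLater(Runnable runnable) {
    queue.add(runnable);
  }

  // Runs every queued task and hands any failure to the injected handler.
  void drain() {
    Runnable runnable;
    while ((runnable = queue.poll()) != null) {
      try {
        runnable.run();
      } catch (Throwable t) {
        onError.accept(t);
      }
    }
  }
}

// Hypothetical test-side handler: remembers failures so a test can assert on them.
final class RecordingHandler implements Consumer<Throwable> {
  final List<Throwable> seen = new ArrayList<>();

  @Override
  public void accept(Throwable t) {
    seen.add(t);
  }
}
```

This keeps the user-visible behavior (catch and log) independent of the tests, while still letting a test fail when any runnable throws.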

@dapengzhang0
Contributor Author

Jenkins, retest this please

@dapengzhang0 dapengzhang0 merged commit 18970e6 into grpc:master Aug 1, 2017
@dapengzhang0 dapengzhang0 deleted the fixchannelstateapibug branch August 13, 2017 05:14
@lock lock bot locked as resolved and limited conversation to collaborators Jan 20, 2019
3 participants