Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inprocess: Fix listener race if transport is shutdown while starting #11214

Merged
merged 3 commits into from
May 28, 2024

Conversation

ejona86
Copy link
Member

@ejona86 ejona86 commented May 15, 2024

Returning the runnable did nothing, as both the start method and the runnable are run within the synchronization context. I believe the Runnable used to be required in the previous implementation of ManagedChannelImpl (the lock-based implementation before we created SynchronizationContext).

This fixes a NPE seen in ServerImpl because the server expects proper ordering of transport lifecycle events.

Uncaught exception in the SynchronizationContext. Panic!
java.lang.NullPointerException: Cannot invoke "java.util.concurrent.Future.cancel(boolean)" because "this.handshakeTimeoutFuture" is null
	at io.grpc.internal.ServerImpl$ServerTransportListenerImpl.transportReady(ServerImpl.java:440)
	at io.grpc.inprocess.InProcessTransport$4.run(InProcessTransport.java:215)
	at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:94)

b/338445186

@ejona86 ejona86 requested a review from larry-safran May 15, 2024 23:10
Copy link
Contributor

@larry-safran larry-safran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everywhere already handles a null being returned from start, so LGTM.

@larry-safran
Copy link
Contributor

Some of the tests are checking for transportReady not being called anywhere rather than just not after shutdown.

@ejona86
Copy link
Member Author

ejona86 commented May 16, 2024

Everywhere already handles a null being returned from start

I think after this change it will always be null. But it'll be invasive to make the return value void. (If we did such a change, we could delete this new test I created though.)

Some of the tests are checking for transportReady not being called anywhere rather than just not after shutdown.

That's only in the case that the transport is expected to not connect. In this case, in-process is probably the only transport that would connect so quickly, so the test will probably not catch anything in other transports. But there's also no reason to make it specific to in-process. (AbstractTransportTest actually was created as tests for in-process transport. And then we saw "we should run these tests against all transports" with only a few outliers.)

@larry-safran
Copy link
Contributor

I was referring to:

> Task :grpc-okhttp:test

io.grpc.okhttp.OkHttpTransportTest > clientShutdownBeforeStartRunnable FAILED
    org.mockito.exceptions.verification.VerificationInOrderFailure: 
    Verification in order failure:
    listener.transportReady();
    Wanted 0 times:
    -> at io.grpc.internal.AbstractTransportTest.clientShutdownBeforeStartRunnable(AbstractTransportTest.java:368)
    But was 1 time:
    -> at io.grpc.okhttp.OkHttpClientTransport$ClientFrameHandler.settings(OkHttpClientTransport.java:1298)
        at app//io.grpc.internal.AbstractTransportTest.clientShutdownBeforeStartRunnable(AbstractTransportTest.java:368)

There were more of these against all versions for your initial commit

Returning the runnable did nothing, as both the start method and the
runnable are run within the synchronization context. I believe the
Runnable used to be required in the previous implementation of
ManagedChannelImpl (the lock-based implementation before we created
SynchronizationContext).

This fixes a NPE seen in ServerImpl because the server expects proper
ordering of transport lifecycle events.
```
Uncaught exception in the SynchronizationContext. Panic!
java.lang.NullPointerException: Cannot invoke "java.util.concurrent.Future.cancel(boolean)" because "this.handshakeTimeoutFuture" is null
	at io.grpc.internal.ServerImpl$ServerTransportListenerImpl.transportReady(ServerImpl.java:440)
	at io.grpc.inprocess.InProcessTransport$4.run(InProcessTransport.java:215)
	at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:94)
```

b/338445186
@ejona86 ejona86 force-pushed the inprocess-stop-between-stop branch from 3aa0a0b to eaef45f Compare May 28, 2024 21:47
@ejona86 ejona86 merged commit e4e7f3a into grpc:master May 28, 2024
13 checks passed
@ejona86 ejona86 deleted the inprocess-stop-between-stop branch May 28, 2024 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants