Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QuicConnectionListenerTests.AcceptAsync_NoCertificateCallback_RemovedFromPendingConnections failing with leaked QuicConnection #48678

Closed
amcasey opened this issue Jun 8, 2023 · 3 comments · Fixed by #49378
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions test-failure test-fixed

Comments

@amcasey
Copy link
Member

amcasey commented Jun 8, 2023

I grabbed a dump during the long delay where nothing is happening and the test is waiting for it's timeout. The dump has one QuicConnection rooted still. Which, when you track the reference down, is from QuicConnection._connectedTcs._keepAlive. So basically the QuicConnection is rooting itself. Normally this would be cleaned up by awaiting the task returned via QuicConnection.FinishHandshakeAsync or QuicConnection.FinishConnectAsync so I tracked down which method the rooted connection came from, and it was from FinishHandshakeAsync.

And the reason the ValueTask returned from FinishHandshakeAsync isn't awaited (thus freeing the GCHandle) is because the method throws which keeps the GCHandle alive forever in this case.

The reason the test worked before was because the TCS wasn't disposed before @stephentoub's change which meant there was a 10 second timeout that would eventually cleanup the ValueTask and thus the GCHandle.

So properly disposing the TCS exposed a bug in QuicConnection.

Originally posted by @BrennanConroy in #48558 (comment)

@amcasey
Copy link
Member Author

amcasey commented Jun 8, 2023

The context is best explained by the thread of comments in the original PR but, to summarize, QuicConnectionListenerTests.AcceptAsync_NoCertificateCallback_RemovedFromPendingConnections started failing when we picked up dotnet/runtime#87057, which revealed a pre-existing bug - a QuicConnection that apparently keeps itself alive, preventing Kestrel's ConditionalWeakTable (referred to as CWT in the test) from being cleared, even though an exception prevented the connection.

@amcasey
Copy link
Member Author

amcasey commented Jun 8, 2023

Also, I suppose this is actually a tracking issue, since next steps are probably on System.Net.Quic.

@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions test-failure and removed area-runtime labels Jun 8, 2023
@eerhardt
Copy link
Member

eerhardt commented Jun 8, 2023

cc @wfurt @MihaZupan @karelz

dotnet-maestro bot added a commit that referenced this issue Jun 8, 2023
[main] Update dependencies from dotnet/efcore dotnet/runtime


 - Fix Http Results test to respond to dotnet/runtime#87093

 - Convert to new generic JsonStringEnumConverter<TEnum> to fix JSON source generator warning.

InteractionType is used in a JsonSerializerContext. With dotnet/runtime#87224, there is now a warning to convert source generated usages of JsonStringEnumConverter to JsonStringEnumConverter<TEnum>.

I went ahead and updated the other 2 usages as well, to make them consistent.

 - Skip AcceptAsync_NoCertificateCallback_RemovedFromPendingConnections

See #48678
@ghost ghost locked as resolved and limited conversation to collaborators Aug 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions test-failure test-fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants