Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Subchannel.State and track state inside load balancer #1590

Merged
merged 4 commits into from
Jan 27, 2022

Conversation

JamesNK
Copy link
Member

@JamesNK JamesNK commented Jan 27, 2022

A load balancer shouldn't use Subchannel.State because it can be updated from other threads while a load balancer is building the picker. Instead, a load balancer should track its own copy of states.

Public API change is ok because this isn't final yet.

@JamesNK
Copy link
Member Author

JamesNK commented Jan 27, 2022

Noticed in test failure below. Load balancer creates picker with two ready subchannels even though it has only been notified about one ready subchannel.

  Failed UnaryCall_UnavailableAddress_FallbackToWorkingAddress [73 ms]
  Error Message:
     Expected: some item equal to "127.0.0.1:50051"
  But was:  < "127.0.0.1:50052", "127.0.0.1:50052" >

  Stack Trace:
     at Grpc.AspNetCore.FunctionalTests.Balancer.RoundRobinBalancerTests.UnaryCall_UnavailableAddress_FallbackToWorkingAddress() in /home/runner/work/grpc-dotnet/grpc-dotnet/test/FunctionalTests/Balancer/RoundRobinBalancerTests.cs:line 269
   at NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
   at NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaiter)
   at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
   at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
   at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
   at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
   at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)

  Standard Output Messages:
 0.000s GrpcTestContext - Information: Starting RoundRobinBalancerTests.UnaryCall_UnavailableAddress_FallbackToWorkingAddress
 0.027s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '1' created with addresses: 127.0.0.1:50051
 0.027s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '2' created with addresses: 127.0.0.1:50052
 0.027s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '1' connection requested.
 0.027s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '1' state changed to Connecting. Detail: 'Connection requested.'.
 0.027s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Processing subchannel id '1' state changed to Connecting. Detail: 'Connection requested.'.
 0.027s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel state updated to Connecting.
 0.027s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
 0.027s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '1' connecting to transport.
 0.027s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '2' connection requested.
 0.027s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '2' state changed to Connecting. Detail: 'Connection requested.'.
 0.027s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Processing subchannel id '2' state changed to Connecting. Detail: 'Connection requested.'.
 0.027s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '2' connecting to transport.
 0.027s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel state updated to Ready.
 0.027s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
 0.027s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '2' state changed to Ready. Detail: 'Successfully connected to socket.'.
 0.027s GrpcTestContext - Information: Waiting for subchannel ready count: 2
 0.027s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Processing subchannel id '2' state changed to Ready. Detail: 'Successfully connected to socket.'.
 0.027s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
 0.027s GrpcTestContext - Information: Current subchannel ready count: 2
 0.027s GrpcTestContext - Information: Ready subchannel: Id: 1, Addresses: 127.0.0.1:50051, State: Ready, Current address: 127.0.0.1:50051
 0.027s GrpcTestContext - Information: Ready subchannel: Id: 2, Addresses: 127.0.0.1:50052, State: Ready, Current address: 127.0.0.1:50052
 0.027s GrpcTestContext - Information: Finished waiting for subchannel ready.
 0.027s Grpc.Net.Client.Internal.GrpcCall - Debug: Starting gRPC call. Method type: 'Unary', URI: 'http://loadbalancer.temporary.invalid/DynamicService/UnaryMethod'.
 0.027s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Trace: Pick started.
 0.027s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Successfully picked subchannel id '2' with address 127.0.0.1:50052.
 0.027s Grpc.Net.Client.Internal.GrpcCall - Debug: Sending message.
 0.027s Grpc.Net.Client.Internal.GrpcCall - Trace: Serialized 'Greet.HelloRequest' to 11 byte message.
 0.028s Grpc.Net.Client.Internal.GrpcCall - Trace: Message sent.
 0.028s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '1' state changed to Ready. Detail: 'Successfully connected to socket.'.
 0.028s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Processing subchannel id '1' state changed to Ready. Detail: 'Successfully connected to socket.'.
 0.028s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
 0.029s Grpc.Net.Client.Internal.GrpcCall - Trace: Response headers received.
 0.029s Grpc.Net.Client.Internal.GrpcCall - Debug: Reading message.
 0.029s Grpc.Net.Client.Internal.GrpcCall - Trace: Deserializing 11 byte message to 'Greet.HelloReply'.
 0.029s Grpc.Net.Client.Internal.GrpcCall - Trace: Received message.
 0.029s Grpc.Net.Client.Internal.GrpcCall - Debug: Finished gRPC call.
 0.029s Grpc.Net.Client.Internal.GrpcCall - Debug: Starting gRPC call. Method type: 'Unary', URI: 'http://loadbalancer.temporary.invalid/DynamicService/UnaryMethod'.
 0.029s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Trace: Pick started.
 0.029s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Successfully picked subchannel id '2' with address 127.0.0.1:50052.
 0.030s Grpc.Net.Client.Internal.GrpcCall - Debug: Sending message.
 0.030s Grpc.Net.Client.Internal.GrpcCall - Trace: Serialized 'Greet.HelloRequest' to 11 byte message.
 0.030s Grpc.Net.Client.Internal.GrpcCall - Trace: Message sent.
 0.036s Grpc.Net.Client.Internal.GrpcCall - Trace: Response headers received.
 0.036s Grpc.Net.Client.Internal.GrpcCall - Debug: Reading message.
 0.036s Grpc.Net.Client.Internal.GrpcCall - Trace: Deserializing 11 byte message to 'Greet.HelloReply'.
 0.036s Grpc.Net.Client.Internal.GrpcCall - Trace: Received message.
 0.036s Grpc.Net.Client.Internal.GrpcCall - Debug: Finished gRPC call.
 0.070s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '2' state changed to Idle. Detail: 'Disconnected.'.
 0.070s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Processing subchannel id '2' state changed to Idle. Detail: 'Disconnected.'.
 0.070s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
 0.070s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Refreshing resolver because subchannel id '2' is in state Idle.
 0.070s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Connections unchanged.
 0.070s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Requesting connection for subchannel id '2' because it is in state Idle.
 0.070s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '2' connection requested.
 0.070s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '2' state changed to Connecting. Detail: 'Connection requested.'.
 0.070s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Processing subchannel id '2' state changed to Connecting. Detail: 'Connection requested.'.
 0.070s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
 0.070s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '2' connecting to transport.
 0.071s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '2' state changed to TransientFailure. Detail: 'Error connecting to subchannel.'.
 System.Net.Sockets.SocketException (111): Connection refused
    at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
    at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
    at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|277_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
    at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(CancellationToken cancellationToken) in /home/runner/work/grpc-dotnet/grpc-dotnet/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 138
 0.071s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Processing subchannel id '2' state changed to TransientFailure. Detail: 'Error connecting to subchannel.'.
 System.Net.Sockets.SocketException (111): Connection refused
    at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
    at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
    at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|277_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
    at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(CancellationToken cancellationToken) in /home/runner/work/grpc-dotnet/grpc-dotnet/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 138
 0.071s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
 0.071s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Refreshing resolver because subchannel id '2' is in state TransientFailure.
 0.071s Grpc.Net.Client.Balancer.RoundRobinBalancer - Trace: Connections unchanged.
 0.071s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '2' starting connect backoff of 00:00:01.
 0.073s GrpcTestContext - Information: Finishing RoundRobinBalancerTests.UnaryCall_UnavailableAddress_FallbackToWorkingAddress

@JamesNK JamesNK merged commit 282fa2a into grpc:master Jan 27, 2022
@JamesNK JamesNK deleted the jamesnk/subchannel-state-race branch January 27, 2022 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants