Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New attributes in BalancerAddress are ignored unless Endpoint has changed #2225

Closed
kalduzov opened this issue Aug 3, 2023 · 6 comments · Fixed by #2228
Closed

New attributes in BalancerAddress are ignored unless Endpoint has changed #2225

kalduzov opened this issue Aug 3, 2023 · 6 comments · Fixed by #2228
Labels
bug Something isn't working
Milestone

Comments

@kalduzov
Copy link

kalduzov commented Aug 3, 2023

@JamesNK , hi.

We encountered a very incomprehensible behavior for us in the balancing implementation.

In our implementation, which is inherited from SubchannelsLoadBalancer, we add attributes for BalancerAddress to the Attributes property for balancing by mesh versions.

Our pod address can change its mesh version while the grpc client is running. Then a new list of addresses arrives, in which the Endpoint has not actually changed, but the composition of the attributes for it has changed.

However, the implementation in SubchannelsLoadBalancer does not allow changing the current subchannel address, because if the subchannel matches the one that came in, then the "new" address with new attributes is ignored and the old subchannel is used.

We think that at least the presence of new attributes on the BalancerAddress should be reflected in the existing subchannel, because from the point of view of logic, BalancerAddress is not only an address, but also its additional properties, which are now ignored.

It seems that in general it will be enough to do in the method
UpdateChannelState(ChannelState state) call like
such newOrCurrentSubchannel.Subchannel.UpdateAddresses(new[] { address });

and correct the conditions in UpdateAddresses to update/add at least attributes to an existing address.

Or add a new mechanism to forward these attributes to an existing subchannel.

What version of gRPC and what language are you using?

All versions with client side balancing

@kalduzov kalduzov added the bug Something isn't working label Aug 3, 2023
@JamesNK
Copy link
Member

JamesNK commented Aug 3, 2023

Why are changing the attributes useful for you? I'd like to understand what you're using them for.

What do you think should happen to a connection if its BalancerAddress has different attributes? Should it be recreated (socket closed, and reopened) or should it continue to be connected but with a different BalancerAddress?

@kalduzov
Copy link
Author

kalduzov commented Aug 3, 2023

We have a complex system in k8s where a sub service can be under different mesh versions. It usually has one address that doesn't change, but its mesh version can change. In addition, one pod can run under different mesh versions.

Our resolver receives from the external ServiceDiscovery the service state with all its ip addresses and mesh versions. From the addresses, we create a BalancerAddress and add to it as attributes which mesh versions apply to it.

The state of the system in k8s is constantly changing, for example during the canary release, we switch one pod from the mesh1 version to the default version. But the address itself does not change at this time.

Our balancing implementation takes this into account and creates several pickers, one for each mesh version, into which the subchannels serving this mesh version fall. The same subchannel can fall into different pickers - and this is normal.

I don't think it's necessary to close the connection and recreate the socket. Rather, it is necessary for the subchannel to have some kind of internal data source that can dynamically change and influence the policy of creating pickers by end users.

Right now the BalancerAddress attributes are essentially the only way to pass such a state to a subchannel. But the problem is that we will not be able to get these attributes at the moment when new BalancerAddress has arrived (with the same address, but the attributes have changed) and an attempt to create a new picker based on subchannels in the Ready status has been initiated.

Thus, although the BalancerAddress data itself has changed, we are unable to use this data when creating a picker.

Maybe it makes sense to change only the BalancerAddress, without recreating the transport - after all, in fact, we are only passing a different state that does not affect the physical connection.

@JamesNK
Copy link
Member

JamesNK commented Sep 1, 2023

A version with this change in it is available now - https://www.nuget.org/packages/Grpc.Net.Client/2.57.0-pre1

Please try it out. If you see anything wrong, it can be fixed in a the next week. Otherwise, you'll have to wait a few months for the next release.

@kalduzov
Copy link
Author

kalduzov commented Sep 1, 2023

We'll check it out soon.

@kalduzov
Copy link
Author

kalduzov commented Sep 9, 2023

Unfortunately, we were unable to test the solution this week. :(

@JamesNK
Copy link
Member

JamesNK commented Sep 11, 2023

Ok, no problem. If there are issues, then you'll need to wait for the next release.

2.57.0 is already out with the change. You can try that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants