New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clusterbus extensions and hostname support #9530
Conversation
Nice! If you ever get hit by a bus, it better be a cluster bus. :-) I haven't looked at the code yet. I think SNI verification between nodes might be useful, just as it is useful between client and cluster, for deploying a system in an untrusted network. We use mutual authentication instead though.
If we ever want to add more fields to CLUSTER SLOTS, perhaps consider making the last argument a map. It may be secondary IP addresses (IPv6 and IPv4). A hostname can be resolved to multiple IP addresses though, if DNS is used, so it might not be needed for that use case. |
@madolson great to see this making progress! I didn't look at the implementation yet, but I suppose that any approach we take to support cluster bus upgrades should be flexible enough to support additional upgrades in the future. I'm sure we'll need that when we proceed with the ClusterV2 plans. I support the Agree about not using DNS names for intra-node connectivity, and I think SNI validation is also not really that important there (adding other basic cert validation configuration is easier and just as good IMHO). |
it would be nice if we also could use hostnames in create cluster process. Right now you need to use IPs for that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@madolson I had a quick look at the code (don't consider it a full review yet) and have a couple of small comments.
Some other thoughts/questions:
- You mention this is semi-breaking change, but IIUC if one enables hostnames and delivers extensions to old nodes they'll just be ignored resulting with inconsistent behavior but no other breakage - right?
- I think there's something a bit confusing about the way extensions are implemented. On one hand it's a generic mechanism with a packet-level flag and extensions count. On the other hand, extensions specifically extend the gossip section. Maybe we should consider going all the way to a more generic extensions mechanism?
- The argument for extensions vs. new commands is atomicity of updates, but IIUC that's not the case when a node joins - it will initially receive information about other nodes without hostnames, and only later have hostnames propagated to it directly from other nodes.
Yeah, I said it's breaking but it's really not as long as you're being deliberate. The danger here is that the extension is grouped with the ping/pong messages themselves, so that failure to parse the extensions means that the entire ping will also be rejected.
Is there something specific you have in mind here that is useful? It is an extension, but it's meant to jump on the existing ping/pong structure that already exists to spread data around. The module interface is already extensible in that you can add new messages if you want. (We could implement hostnames that way as well) There is also no strong reason this mechanism couldn't be generalized to add arbitrary additional data to any of the other existing messages. I'll also mention that I think long term this type of gossip isn't very efficient, and we probably want to figure out a better way to distribute this information in the cluster for cluster V2.
This is mostly right, but we do have atomicity because gossip data isn't that comprehensive. The gossiped information (IP, node name, flags, health information) is just enough so that nodes learning about a new node can reach out and ping it, it's not enough to know detailed information about the node. Specifically slots are missing, which disqualifies it from showing up in This is why I made a very specific point about |
38b0bec
to
0175a7b
Compare
@madolson - also any thoughts on that idea you and I discussed to create |
@dmitrypol It's in one of the checkboxes ;)
My thought was to decouple your ask from this specific PR. This is mostly code complete to my satisfaction for the core. (Also, I'll be out for a couple of weeks, so won't respond quickly) |
My mistake, did not notice |
The only issue I had with the extensions is that it's a bit weird to have the flag and count at the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts that came to me, I'll fix them when I'm back and have access to my laptop.
@madolson If we stretch scenario further - the hostname itself may also not be known, or be dynamic and different for different clients. In that case, it could be useful to return something like There is an inherent assumption here that the client only uses ports to distinguish between cluster nodes, and that the hostname/address is identical - but I believe that is becoming the case with some network topologies that involve a service mesh proxy / load balancer / gateway / etc. There's practically no work on the server side for this, it's only about setting a convention and communicating it to clients as part of the hostname support change. Any thoughts about this? |
@yossigo - you are absolutely correct, hostname can be different per node. Cluster can be composed of server1.domain.com:6379, server2.domain.com:6379 and server3.domain.com:6379. |
@yossigo That is a good insight. An alternative to what you proposed is we could add a client config so that a client can tell the cluster the hostname/IP that is should always respond with. I think that would require a bit less client changes, as they would more focus on sending an additional command on startup as opposed to changing how interpreting the Cluster Slots/redirects function. @dmitrypol Not sure I followed your comment, is having the server side not know the hostname a better solution for what we talked about? I'm going to rebase and address my changes today in either case. We should be able to quickly add the changes outlined. Once this has general buy in, I'll close off on the tooling improvement. |
3d72c09
to
eb086e3
Compare
@dmitrypol The example you provide is already part of this work, I was actually referring to something else. For example assume there are clients A and B behind different load balancers, both pointing to the same Redis Cluster. The clients may use different, locally known and locally resolved hostnames to reach those load balancers, but the cluster does not know where to redirect each client.
@madolson This is a good point, it involves less parsing changes. On the other hand, we're anyway introducing parsing changes, not just due to hostnames but potentially also pushing clients to finally move from |
thank you for clarifying @yossigo. I misunderstood. |
67022e5
to
790beea
Compare
…edis#9530) Implement the ability for cluster nodes to advertise their location with extension messages.
This is on 7.0-rc2
mention my self @liuchong for issue filter 👀 |
@liuchong It seems as |
Also seeing this |
@zuiderkwast this is resolved by #10436, right? |
Gossip the cluster node blacklist in ping and pong messages. This means that CLUSTER FORGET doesn't need to be sent to all nodes in a cluster. It can be sent to one or more nodes and then be propagated to the rest of them. For each blacklisted node, its node id and its remaining blacklist TTL is gossiped in a cluster bus ping extension (introduced in #9530).
seen a failure in a test introduced here. i assume timing issue.
|
Gossip the cluster node blacklist in ping and pong messages. This means that CLUSTER FORGET doesn't need to be sent to all nodes in a cluster. It can be sent to one or more nodes and then be propagated to the rest of them. For each blacklisted node, its node id and its remaining blacklist TTL is gossiped in a cluster bus ping extension (introduced in redis#9530).
This PR adds a human readable name to a node in clusters that are visible as part of error logs. This is useful so that admins and operators of Redis cluster have better visibility into failures without having to cross-reference the generated ID with some logical identifier (such as pod-ID or EC2 instance ID). This is mentioned in #8948. Specific nodenames can be set by using the variable cluster-announce-human-nodename. The nodename is gossiped using the clusterbus extension in #9530. Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Gossip the cluster node blacklist in ping and pong messages. This means that CLUSTER FORGET doesn't need to be sent to all nodes in a cluster. It can be sent to one or more nodes and then be propagated to the rest of them. For each blacklisted node, its node id and its remaining blacklist TTL is gossiped in a cluster bus ping extension (introduced in redis#9530).
This PR introduces two changes, it introduces a clusterbus extension system so that we can add additional metadata and then uses that to add hostname support. I've been very slow, and it's been sitting on my laptop for awhile, so would rather publish it with context before I get hit by a bus. I will try to iterate more this week and get the code into a better shape, but would appreciate input from @redis/core-team / @ShooterIT / @zuiderkwast.
Clusterbus extension
Now we can send extra metadata after the end of the gossip information. This is a backwards compatible change, in that you can't use it between nodes of the same cluster version, but you can upgrade all the nodes in a cluster to support it and then start using it. The point of this is we want to send a consistent version of the cluster mode state along with the message, instead of introducing a separate type of message.
An alternative to this would be to add a new type of message, a hostname message. The reason I don't want to introduce this is that it adds yet another way to propagate information throughout the cluster, and introduces periods of time where one of the messages ( the MEET for example) was received but we still don't know the clusters hostname, so we might have to show the IP to incoming clients. A new message also introduces more overhead.
Another use of this extension is that I want to add display names/context names, that can be printed in place of the nodeID (the 40 character hex blob). When debugging, the 40 character hex blob is really annoying.
Hostname support
I've added a new config, "cluster-announce-hostname", which is a hostname that an externally facing client can use to connect to this node. Using the new mechanism we will send an hostname extension to all nodes, so that eventually all nodes in the cluster will know our hostname. NOTE: This is not gossiped, we don't tell other nodes about other nodes hostname's, this is just to reduce message volume. NOTE: Nodes do not talk to each other with the hostname.
You can also add a hostname to a node in existing cluster, and it will be eventually propagated to all nodes.
This hostname will be added as the 4th field to the
CLUSTER SLOTS
output which is the primary way clients will discover it. I'm also proposing we introduce a "cluster-preferred-endpoint-type" option to configure what type of endpoint is shown by default.The hostname will be committed to the cluster nodes file, appended on the end of IP/port/cport information. I think it was a done in a way that supports clients, and it's actually easier to place there then to throw it at the end of the line as a positional argument.
Considerations
CLUSTER SLOTS
will be considered as a first class citizen, butCLUSTER NODES
will be able to support it if clients want to do special work. Right now I am adding the hostname into the cluster nodes file, so that it's loaded on restart, but not considering making it terribly easy to parse. However, I know some clients try to use that to discover the topology, but I don't want to try to do anything special for them with regards to hostname support. A follow up item will be to expose a variant ofCLUSTER NODES
that is more client friendly.Out of scope:
Tasks punted to other PRs:
CLUSTER HEALTH
orCLUSTER STATUS
. It should be able to show the hostnames, but not necessarily be used by clients.CLUSTER SLOTS
really slow. Might want to optimize this in tests.