
Clusterbus extensions and hostname support #9530

Merged (1 commit, Jan 3, 2022)

Conversation

madolson
Contributor

@madolson madolson commented Sep 21, 2021

This PR introduces two changes: a clusterbus extension system that lets us attach additional metadata to messages, and hostname support built on top of it. I've been very slow, and it's been sitting on my laptop for a while, so I would rather publish it with context before I get hit by a bus. I will try to iterate more this week and get the code into better shape, but would appreciate input from @redis/core-team / @ShooterIT / @zuiderkwast.

Clusterbus extension

Now we can send extra metadata after the end of the gossip information. This is a backwards compatible change, in that you can't use it between nodes running different cluster versions, but you can upgrade all the nodes in a cluster to support it and then start using it. The point of this is that we want to send a consistent version of the cluster state along with the message, instead of introducing a separate type of message.
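To make the mechanism concrete, here is a minimal, self-contained sketch in C of walking a type-length-value extension region appended after the gossip entries. The struct and field names are hypothetical, not the actual layout in cluster.h:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical TLV record, packed one after another at the end of the
 * packet. The real implementation keeps a flag and an extension count
 * in the message header; an old node that doesn't know the flag never
 * looks at this region, which keeps the change backward compatible. */
typedef struct {
    uint16_t type;        /* e.g. 1 = hostname */
    uint16_t reserved;
    uint32_t length;      /* total record length, header included */
    unsigned char data[]; /* extension payload */
} msgExt;

/* Walk the extension region and count well-formed records, stopping at
 * the first record that would run past the end of the buffer. */
static int count_extensions(const unsigned char *buf, size_t buflen) {
    int count = 0;
    size_t off = 0;
    while (off + sizeof(msgExt) <= buflen) {
        msgExt ext;
        memcpy(&ext, buf + off, sizeof(ext)); /* avoid unaligned access */
        if (ext.length < sizeof(msgExt) || off + ext.length > buflen) break;
        count++;
        off += ext.length;
    }
    return count;
}
```

An upgraded node that never sets the extension flag simply treats the packet as ending at the gossip section, so mixed-version clusters stay healthy until everyone supports the extension.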

An alternative to this would be to add a new type of message, a hostname message. The reason I don't want to introduce this is that it adds yet another way to propagate information throughout the cluster, and introduces windows of time where one of the messages (the MEET, for example) was received but we still don't know the node's hostname, so we might have to show the IP to incoming clients. A new message also introduces more overhead.

Another use of this extension is that I want to add display names/context names that can be printed in place of the node ID (the 40-character hex blob). When debugging, the 40-character hex blob is really annoying.

Hostname support

I've added a new config, "cluster-announce-hostname", which is a hostname that an externally facing client can use to connect to this node. Using the new mechanism, we send a hostname extension to all nodes, so that eventually every node in the cluster will know our hostname. NOTE: This is not gossiped; we don't tell other nodes about other nodes' hostnames, which keeps message volume down. NOTE: Nodes do not talk to each other using the hostname.
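As a sketch of how this might be configured (the hostname value is a made-up example):

```
# redis.conf (illustrative)
cluster-enabled yes
cluster-announce-hostname redis-node-1.example.com
```

The intent is that the config can also be applied at runtime with CONFIG SET, which is how a hostname can be introduced into an already-running cluster.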

You can also add a hostname to a node in an existing cluster, and it will eventually be propagated to all nodes.

This hostname will be added as the 4th field of the CLUSTER SLOTS output, which is the primary way clients will discover it. I'm also proposing we introduce a "cluster-preferred-endpoint-type" option to configure what type of endpoint is shown by default.
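Illustratively (the reply shape is a sketch of the proposal, with made-up values), a CLUSTER SLOTS node entry would gain a fourth element carrying networking metadata such as the hostname:

```
127.0.0.1:7000> CLUSTER SLOTS
1) 1) (integer) 0
   2) (integer) 5460
   3) 1) "127.0.0.1"
      2) (integer) 7000
      3) "8d0e1b09f6e6812c8e99e1eed82653d4d38a43e4"
      4) 1) "hostname"
         2) "redis-node-1.example.com"
```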

The hostname will be committed to the cluster nodes file, appended to the IP/port/cport information. I think it was done in a way that supports clients, and it's actually easier to place it there than to throw it at the end of the line as a positional argument.
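As a sketch (illustrative values, not an authoritative file format), a line in the cluster nodes file would then look roughly like this, with the hostname riding along after the cport:

```
8d0e1b09f6e6812c8e99e1eed82653d4d38a43e4 127.0.0.1:7000@17000,redis-node-1.example.com myself,master - 0 0 1 connected 0-5460
```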

Considerations

  1. CLUSTER SLOTS will be considered a first-class citizen, but CLUSTER NODES will be able to support it if clients want to do special work. Right now I am adding the hostname to the cluster nodes file so that it's loaded on restart, but I'm not trying to make it terribly easy to parse. I know some clients use that output to discover the topology, but I don't want to do anything special for them with regards to hostname support. A follow-up item will be to expose a variant of CLUSTER NODES that is more client friendly.
  2. Extensions are only added to PING/PONG/MEET right now, but there is nothing blocking future implementation work to add them for other messages.

Out of scope:

  • Intra-node DNS resolution. All of the Redis cluster nodes should be in the same network (or at least I haven't heard a compelling reason to assume otherwise), so all connection establishment is still done through IPs.
  • Doing TLS verification between nodes within the cluster based on the hostname. This might be a useful check, but I'm not convinced as of right now.

Tasks punted to other PRs:

  • All of the tooling should support DNS resolution, especially the redis-cli --cluster commands. Not a critical requirement for the main release.
  • There was a follow up ask to make a version of CLUSTER NODES that is more human readable, like CLUSTER HEALTH or CLUSTER STATUS. It should be able to show the hostnames, but not necessarily be used by clients.
  • A note to myself: a lot of the tests set up non-contiguous slots, which makes CLUSTER SLOTS really slow. We might want to optimize this in the tests.

@madolson madolson linked an issue Sep 21, 2021 that may be closed by this pull request
@zuiderkwast
Contributor

Nice! If you ever get hit by a bus, it better be a cluster bus. :-)

I haven't looked at the code yet.

I think SNI verification between nodes might be useful, just as it is useful between client and cluster, for deploying a system in an untrusted network. We use mutual authentication instead though.

cluster-prefer-hostnames sounds good to me. If redirects use hostnames, that can already break clients, so if that's enabled, we may as well enable it for the first element in CLUSTER SLOTS. But it may be even better to let the client announce its capabilities (e.g. HELLO 3 hostnames).

If we ever want to add more fields to CLUSTER SLOTS, perhaps consider making the last element a map. It could hold secondary IP addresses (IPv6 and IPv4). A hostname can resolve to multiple IP addresses if DNS is used, though, so it might not be needed for that use case.

@yossigo
Member

yossigo commented Sep 23, 2021

@madolson great to see this making progress!

I didn't look at the implementation yet, but I suppose that any approach we take to support cluster bus upgrades should be flexible enough to support additional upgrades in the future. I'm sure we'll need that when we proceed with the ClusterV2 plans.

I support the cluster-prefer-hostnames all-or-nothing approach, so if we use hostnames we use them for everything. There will definitely be some client breakage but I think it's an opportunity to refresh them, and also migrate all to CLUSTER SLOTS while doing so.

Agree about not using DNS names for intra-node connectivity, and I think SNI validation is also not really that important there (adding other basic cert validation configuration is easier and just as good IMHO).

@dmitrypol

It would be nice if we could also use hostnames in the cluster creation process. Right now you need to use IPs for that.

Member

@yossigo yossigo left a comment


@madolson I had a quick look at the code (don't consider it a full review yet) and have a couple of small comments.
Some other thoughts/questions:

  • You mention this is a semi-breaking change, but IIUC if one enables hostnames and delivers extensions to old nodes, they'll just be ignored, resulting in inconsistent behavior but no other breakage - right?
  • I think there's something a bit confusing about the way extensions are implemented. On one hand it's a generic mechanism with a packet-level flag and extensions count. On the other hand, extensions specifically extend the gossip section. Maybe we should consider going all the way to a more generic extensions mechanism?
  • The argument for extensions vs. new commands is atomicity of updates, but IIUC that's not the case when a node joins - it will initially receive information about other nodes without hostnames, and only later have hostnames propagated to it directly from other nodes.

@madolson
Contributor Author

@yossigo

You mention this is a semi-breaking change, but IIUC if one enables hostnames and delivers extensions to old nodes, they'll just be ignored, resulting in inconsistent behavior but no other breakage - right?

Yeah, I said it's breaking, but it's really not as long as you're deliberate about the upgrade. The danger here is that the extension is grouped with the ping/pong message itself, so a failure to parse the extensions means the entire ping will also be rejected.

I think there's something a bit confusing about the way extensions are implemented. On one hand it's a generic mechanism with a packet-level flag and extensions count. On the other hand, extensions specifically extend the gossip section. Maybe we should consider going all the way to a more generic extensions mechanism?

Is there something specific you have in mind that would be useful? It is an extension, but it's meant to piggyback on the existing ping/pong structure that already spreads data around. The module interface is already extensible in that you can add new messages if you want (we could implement hostnames that way as well). There is also no strong reason this mechanism couldn't be generalized to add arbitrary additional data to any of the other existing messages.

I'll also mention that I think long term this type of gossip isn't very efficient, and we probably want to figure out a better way to distribute this information in the cluster for cluster V2.

The argument for extensions vs. new commands is atomicity of updates, but IIUC that's not the case when a node joins - it will initially receive information about other nodes without hostnames, and only later have hostnames propagated to it directly from other nodes.

This is mostly right, but we do get atomicity because gossip data isn't that comprehensive. The gossiped information (IP, node name, flags, health information) is just enough for nodes learning about a new node to reach out and ping it; it's not enough to know detailed information about the node. Specifically, slots are missing, which disqualifies it from showing up in CLUSTER SLOTS. Once a node has exchanged a single ping/pong message, it knows all the information it needs to display the new node in CLUSTER SLOTS, and that exchange is where we can inject the new hostname.

This is why I made a very specific point about CLUSTER NODES as well as SNI for intra-node communication. CLUSTER NODES requires very deliberate parsing to understand the state, which most clients don't do very well, and the node will show up there immediately, without the hostname. We also can't do SNI for intra-node connections with the current implementation, since we reach out to a node before knowing its hostname. There is no hard blocker to gossiping the hostname; it just seems like extra data.

@madolson madolson marked this pull request as ready for review October 1, 2021 22:47
@dmitrypol

@madolson - also, any thoughts on that idea you and I discussed to create a cluster health command, so that users would not have to parse CLUSTER NODES looking for failures?

@madolson
Contributor Author

madolson commented Oct 3, 2021

@dmitrypol It's in one of the checkboxes ;)

There was a follow up ask to make a version of CLUSTER NODES that is more human readable, like CLUSTER HEALTH or CLUSTER STATUS. It should be able to show the hostnames, but not necessarily be used by clients.

My thought was to decouple your ask from this specific PR. This is mostly code complete to my satisfaction for the core. (Also, I'll be out for a couple of weeks, so won't respond quickly)

@dmitrypol

> @dmitrypol It's in one of the checkboxes ;) […]

My mistake, did not notice

@yossigo
Member

yossigo commented Oct 6, 2021

@madolson

The only issue I had with the extensions is that it's a bit weird to have the flag and count at the clusterMsg, but still have to deal with extensions per clusterMsgData, but I suppose that's really the easiest way to maintain backwards compatible ping payloads. And I agree we'll probably want to move away from the gossip as it works right now anyway.

Copy link
Contributor Author

@madolson madolson left a comment


Some thoughts that came to me, I'll fix them when I'm back and have access to my laptop.

@yossigo
Member

yossigo commented Oct 25, 2021

@madolson
Something related to this work came up in a recent discussion: a primary use case for hostnames is to deal with network topologies where the cluster does not have good visibility into what addresses are exposed to clients, but assumes that a hostname will resolve to the right address on the client side.

If we stretch the scenario further: the hostname itself may also not be known, or may be dynamic and different for different clients. In that case, it could be useful to return something like -MOVED ::<port> (just an example) and expect a well-behaved client to reuse the same address/hostname with just a different port.

There is an inherent assumption here that the client only uses ports to distinguish between cluster nodes, and that the hostname/address is identical - but I believe that is becoming the case with some network topologies that involve a service mesh proxy / load balancer / gateway / etc.

There's practically no work on the server side for this, it's only about setting a convention and communicating it to clients as part of the hostname support change. Any thoughts about this?
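To make the proposed convention concrete, here is a hedged client-side sketch in C. The parse_moved helper is hypothetical, and the single-colon empty-endpoint form used here is just one possible reading of the proposal, not an implemented server behavior: when the endpoint portion is empty, the client keeps the hostname it is already using and only switches ports.

```c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Parse "MOVED <slot> <endpoint>:<port>". If <endpoint> is empty, fall
 * back to the host the client is already connected to, so only the
 * port changes. Returns the slot, or -1 on a malformed reply. */
static int parse_moved(const char *reply, const char *current_host,
                       char *host_out, size_t hostlen, int *port_out) {
    int slot;
    char endpoint[256];
    if (sscanf(reply, "MOVED %d %255s", &slot, endpoint) != 2) return -1;
    char *colon = strrchr(endpoint, ':');
    if (!colon) return -1;
    *port_out = atoi(colon + 1);
    if (colon == endpoint) {
        /* Empty endpoint: reuse the locally known hostname. */
        snprintf(host_out, hostlen, "%s", current_host);
    } else {
        *colon = '\0';
        snprintf(host_out, hostlen, "%s", endpoint);
    }
    return slot;
}
```

The design assumption matches the comment above: ports are the only thing distinguishing cluster nodes behind a shared proxy, so the server never needs to know the client-facing hostname at all.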

@dmitrypol

@yossigo - you are absolutely correct, hostname can be different per node. Cluster can be composed of server1.domain.com:6379, server2.domain.com:6379 and server3.domain.com:6379.

@madolson
Contributor Author

madolson commented Oct 25, 2021

@yossigo That is a good insight. An alternative to what you proposed is that we could add a client config, so that a client can tell the cluster the hostname/IP that it should always respond with. I think that would require fewer client changes, as clients would just send an additional command on startup instead of changing how they interpret CLUSTER SLOTS and redirects.

@dmitrypol Not sure I followed your comment; is having the server side not know the hostname a better solution for what we talked about?

I'm going to rebase and address my changes today in either case. We should be able to quickly add the changes outlined. Once this has general buy-in, I'll close out the tooling improvement.

@madolson madolson requested a review from yossigo October 26, 2021 01:49
@madolson madolson added approval-needed Waiting for core team approval to be merged state:major-decision Requires core team consensus labels Oct 26, 2021
@madolson madolson added this to Backlog in 7.0 via automation Oct 26, 2021
@madolson madolson moved this from Backlog to In progress in 7.0 Oct 26, 2021
@yossigo yossigo moved this from In progress to To Do in 7.0 Oct 26, 2021
@madolson madolson moved this from To Do to In progress in 7.0 Oct 26, 2021
@yossigo
Member

yossigo commented Oct 26, 2021

@dmitrypol The example you provide is already part of this work, I was actually referring to something else. For example assume there are clients A and B behind different load balancers, both pointing to the same Redis Cluster. The clients may use different, locally known and locally resolved hostnames to reach those load balancers, but the cluster does not know where to redirect each client.

+-------------+              +-----------+          +----------------+               
|             | hostnameA    |           |--------->|                |               
|  Client A   |------------->|   LB A    |          |                |               
|             |              |           |          |                |               
+-------------+              +-----------+          |      Redis     |               
                                                    |     Cluster    |               
+-------------+              +-----------+          |                |               
|             | hostnameB    |           |          |                |               
|  Client B   |------------->|   LB B    |          |                |               
|             |              |           |--------->|                |               
+-------------+              +-----------+          +----------------+               

@madolson This is a good point; it involves fewer parsing changes. On the other hand, we're introducing parsing changes anyway, not just due to hostnames but potentially also by pushing clients to finally move from CLUSTER NODES to CLUSTER SLOTS, so we could try to get this all done together. I don't feel strongly either way, though.

@dmitrypol

thank you for clarifying @yossigo. I misunderstood.

panjf2000 pushed a commit to panjf2000/redis that referenced this pull request Feb 3, 2022
…edis#9530)

Implement the ability for cluster nodes to advertise their location with extension messages.
@liuchong

This is on 7.0-rc2

root@redis-node-1:/data# redis-cli --cluster create redis-node-1:7001 redis-node-2:7002 redis-node-3:7003 redis-node-4:7004 redis-node-5:7005 redis-node-6:7006 --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica redis-node-5:7005 to redis-node-1:7001
Adding replica redis-node-6:7006 to redis-node-2:7002
Adding replica redis-node-4:7004 to redis-node-3:7003
M: 8d0e1b09f6e6812c8e99e1eed82653d4d38a43e4 redis-node-1:7001
   slots:[0-5460] (5461 slots) master
M: fbc7c451d2681a24deefa06f78e4929b90634bf9 redis-node-2:7002
   slots:[5461-10922] (5462 slots) master
M: b4feae38993df2e1073cfc43d0e6f9ba5a014833 redis-node-3:7003
   slots:[10923-16383] (5461 slots) master
S: 4c3a58c2cd0217967e82f04ee3df187c78f5a84f redis-node-4:7004
   replicates b4feae38993df2e1073cfc43d0e6f9ba5a014833
S: 3a4ad7c47a3f07146581f1ad38fab7e648305865 redis-node-5:7005
   replicates 8d0e1b09f6e6812c8e99e1eed82653d4d38a43e4
S: 30379a24dd87f9b4016d3a25398a02150c1a6977 redis-node-6:7006
   replicates fbc7c451d2681a24deefa06f78e4929b90634bf9
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Node redis-node-2:7002 replied with error:
ERR Invalid node address specified: redis-node-1:7001

Mentioning myself @liuchong for issue filtering 👀

@zuiderkwast
Contributor

@liuchong It seems CLUSTER MEET does not accept a hostname. I guess we need to implement that.

@FarhanSajid1

> This is on 7.0-rc2 […]
> ERR Invalid node address specified: redis-node-1:7001

Also seeing this

@oranagra
Member

@zuiderkwast this is resolved by #10436, right?

@zuiderkwast
Contributor

@oranagra That's right. I created the issue #10433 to track it too.

oranagra pushed a commit that referenced this pull request Jul 26, 2022
Gossip the cluster node blacklist in ping and pong messages.
This means that CLUSTER FORGET doesn't need to be sent to all nodes in a cluster.
It can be sent to one or more nodes and then be propagated to the rest of them.

For each blacklisted node, its node id and its remaining blacklist TTL is gossiped in a
cluster bus ping extension (introduced in #9530).
@oranagra
Member

oranagra commented Sep 4, 2022

Seen a failure in a test introduced here; I assume a timing issue.
https://github.com/redis/redis-extra-ci/runs/8173232234?check_suite_focus=true

*** [err]: Verify the nodes configured with prefer hostname only show hostname for new nodes in tests/unit/cluster/hostnames.tcl
Expected '' to be equal to 'shard-2.com' (context: type eval line 39 cmd {assert_equal [lindex [get_slot_field $slot_result 0 2 3] 1] "shard-2.com"} proc ::test)

Mixficsol pushed a commit to Mixficsol/redis that referenced this pull request Apr 12, 2023
madolson added a commit that referenced this pull request Jun 18, 2023
This PR adds a human-readable name to nodes in a cluster, visible as part of error logs. This is useful so that admins and operators of Redis clusters have better visibility into failures without having to cross-reference the generated ID with some logical identifier (such as a pod ID or an EC2 instance ID). This is mentioned in #8948. Specific node names can be set using the config cluster-announce-human-nodename. The node name is gossiped using the clusterbus extension introduced in #9530.

Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
Labels
approval-needed (waiting for core team approval to be merged), release-notes (needs to be mentioned in the release notes), state:major-decision (requires core team consensus), state:needs-doc-pr (requires a PR to the redis-doc repository)
Successfully merging this pull request may close these issues.

Will hostnames be supported ?
10 participants