[QUESTION] Redis Cluster check command output. #11304

Open
vineelyalamarthy opened this issue Sep 22, 2022 · 4 comments
Comments

@vineelyalamarthy

vineelyalamarthy commented Sep 22, 2022

We are running Redis in cluster mode with 6 pods and with persistence enabled. All the six pods are masters and there are no secondaries replicating the data.

On one of the setups I ran

redis-cli --verbose --cluster check localhost:6379

It shows that

Node <ip1:port> has slots in importing state <slots list here>
Node <ip2:port> has slots in importing state <slots list here>
[WARNING] The following slots are open: <slots list here>
All 16384 slots are covered. 

I am a bit confused about the state of the cluster. What can we make of this output?

  1. Why would there be an "IMPORTING" state and no corresponding "MIGRATING" state for those slots?
  2. It says all 16384 slots are covered, but at the same time it lists a set of open slots. Those two statements sound contradictory to me. Or are they not? What is the difference between a slot being covered and being open?
  3. Apart from the above, redis-cli cluster info also shows that all 16384 slots are covered, yet known nodes is 6 while cluster size is 4.
  4. In Redis Cluster, is a node counted as known but left out of cluster_size until it owns at least one slot that is in neither migrating nor importing state?

What are some situations that lead to the above state? Is there a chance that it can happen because of a failed **rebalance** operation?

Thanks in advance.

I tried running redis-cluster fix with the replace option, after which the problems went away, but I still want to understand this correctly.

@madolson
Contributor

  1. The most common case where I've seen this is when a node comes up with RDB data that has "some" keys for slots that it does not own. In this scenario, the node marks those slots as "importing". Since data is distributed in cluster mode, there is no authoritative owner of data.
  2. My understanding is that the list of open slots includes those that are in migrating or importing state. So that is consistent.
  3. Cluster size is the number of masters that serve at least one slot.
  4. ^

The state could also result from a failed rebalance, but that seems less likely because you would expect the corresponding migrating states to exist as well. I'm not sure redis-cluster fix would have resolved the issues.
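
To make point 3 concrete: as far as I know, the CLUSTER INFO documentation describes cluster_size as the number of master nodes serving at least one hash slot, so a master that owns no slots still counts toward cluster_known_nodes but not toward cluster_size. Below is a minimal Python sketch of that count, working on CLUSTER NODES text. The function names are hypothetical, and the node IDs, addresses, and slot layout in SAMPLE are made up for illustration; they are not taken from the reporter's cluster.

```python
def parse_cluster_nodes(text):
    """Parse CLUSTER NODES output into one dict per node."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip malformed lines
        owned = 0
        for token in fields[8:]:
            if token.startswith("["):
                continue  # importing/migrating marker, not an owned slot
            start, _, end = token.partition("-")
            owned += int(end or start) - int(start) + 1
        nodes.append({
            "id": fields[0],
            "flags": fields[2].split(","),
            "owned": owned,
        })
    return nodes


def cluster_size(nodes):
    """Count masters that own at least one slot."""
    return sum(1 for n in nodes if "master" in n["flags"] and n["owned"] > 0)


# Hypothetical sample: six masters, only four of which own slots.
SAMPLE = """\
a1 10.0.0.1:6379@16379 master - 0 1663840000000 1 connected 0-4095
b2 10.0.0.2:6379@16379 master - 0 1663840000000 2 connected 4096-8191
c3 10.0.0.3:6379@16379 master - 0 1663840000000 3 connected 8192-12287
d4 10.0.0.4:6379@16379 master - 0 1663840000000 4 connected 12288-16383
e5 10.0.0.5:6379@16379 myself,master - 0 0 5 connected [100-<-a1] [101-<-a1]
f6 10.0.0.6:6379@16379 master - 0 1663840000000 6 connected
"""

nodes = parse_cluster_nodes(SAMPLE)
print(len(nodes), cluster_size(nodes))
```

In this sample, the two slotless masters are still known nodes, which is one way to end up with 6 known nodes and a cluster size of 4 while every slot remains assigned.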

@zuiderkwast
Contributor

  1. As long as a slot has an owner, it is counted as covered. Slots in migrating/importing state are owned by the migrating node until the slot is fully migrated, then the ownership is transferred. Only slots which have no owner at all are uncovered.
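
The difference between "covered" and "open" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the CLUSTER NODES line that each node reports about itself (the one flagged myself), because, as far as I know, the importing and migrating markers are only printed on that line. The function name and the sample lines are hypothetical.

```python
TOTAL_SLOTS = 16384


def slot_report(myself_lines):
    """Return (uncovered, open_slots) from each node's own CLUSTER NODES line.

    uncovered:  slots that no node owns.
    open_slots: slots flagged as importing or migrating on some node.
    """
    owned, open_slots = set(), set()
    for line in myself_lines:
        for token in line.split()[8:]:
            if token.startswith("["):
                # "[slot-<-id]" means importing, "[slot->-id]" means migrating
                open_slots.add(int(token[1:].split("-", 1)[0]))
            else:
                start, _, end = token.partition("-")
                owned.update(range(int(start), int(end or start) + 1))
    uncovered = set(range(TOTAL_SLOTS)) - owned
    return uncovered, open_slots


# Hypothetical sample: every slot has an owner, yet two slots are open.
LINES = [
    "a1 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-8191",
    "b2 10.0.0.2:6379@16379 myself,master - 0 0 2 connected 8192-16383",
    "e5 10.0.0.5:6379@16379 myself,master - 0 0 5 connected [100-<-a1] [101-<-a1]",
]

uncovered, open_slots = slot_report(LINES)
print(len(uncovered), sorted(open_slots))
```

Here the uncovered set is empty while the open set is not, which matches a check output that reports open slots and full coverage at the same time.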

@vineelyalamarthy
Author

@zuiderkwast So here we have all 16384 slots covered, but at the same time many slots are shown as open and a couple of pods/nodes are shown with multiple slots in the IMPORTING state.

@antirez any idea when this happens?

@vineelyalamarthy
Author

> 1. The most common case where I've seen this is when a node comes up with RDB data that has "some" keys for slots that it does not own. In this scenario, the node marks those slots as "importing". Since data is distributed in cluster mode, there is no authoritative owner of data.
> 2. My understanding is that the list of open slots includes those that are in migrating or importing state. So that is consistent.
> 3. Cluster size is the number of masters that serve at least one slot.
> 4. ^
>
> The state could also result from a failed rebalance, but that seems less likely because you would expect the corresponding migrating states to exist as well. I'm not sure redis-cluster fix would have resolved the issues.

Yes @madolson, we have an automated rebalance triggered by the Redis leader pod (selected via lease-based leader election using etcd on Kubernetes). In the cases where the rebalance times out, the corresponding MIGRATING states also exist along with the IMPORTING ones.

But having only IMPORTING states is a little weird.

The theory below needs to be verified. I will test a few times and update here.

It can happen when one node (A) goes down and its slots are moved to B and C, but the keys are not. Those keys then get created again on B and C. Now suppose A comes back online hours later. Because persistence is enabled, A loads its RDB into memory, and then perhaps we end up with orphaned IMPORTING states. The remedy seems to be to run the fix command, which assigns slot ownership on a case-by-case basis.

It was also asking me to provide some additional parameters to the cluster fix command.
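
The "IMPORTING without a matching MIGRATING" situation discussed in this thread can be detected mechanically. What follows is a rough Python sketch, not an official tool: it assumes you have collected the myself line of CLUSTER NODES from every node, and it pairs each importing marker with a migrating marker on the source node. The function names, node IDs, and sample lines are hypothetical.

```python
import re

# "[100-<-a1]" = importing slot 100 from a1; "[100->-e5]" = migrating it to e5
MARKER = re.compile(r"\[(\d+)-([<>])-([0-9a-zA-Z]+)\]")


def open_markers(myself_lines):
    """Collect (slot, source_id, target_id) tuples for both marker kinds."""
    importing, migrating = set(), set()
    for line in myself_lines:
        fields = line.split()
        me = fields[0]
        for token in fields[8:]:
            match = MARKER.fullmatch(token)
            if not match:
                continue  # plain slot or slot range
            slot, direction, peer = int(match.group(1)), match.group(2), match.group(3)
            if direction == "<":
                importing.add((slot, peer, me))
            else:
                migrating.add((slot, me, peer))
    return importing, migrating


def orphaned_imports(myself_lines):
    """Importing markers whose source node shows no matching migrating marker."""
    importing, migrating = open_markers(myself_lines)
    return sorted(importing - migrating)


# Hypothetical sample: slot 100 is a paired migration, slot 101 is orphaned.
LINES = [
    "a1 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-8191 [100->-e5]",
    "e5 10.0.0.5:6379@16379 myself,master - 0 0 5 connected 8192-16383 [100-<-a1] [101-<-a1]",
]

print(orphaned_imports(LINES))
```

A paired marker would point to an interrupted reshard or rebalance, while an orphaned one fits the theory of a node loading an RDB that contains keys for slots it no longer owns.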
