[BUG] Slot rebalancing can stop due to loading error #11104

Open
ma2sql opened this issue Aug 11, 2022 · 0 comments · May be fixed by #11105

Describe the bug

When I use redis-cli --cluster rebalance to move all slots from two or more nodes to other nodes, the rebalance can stop partway through.

To reproduce

Given a Redis cluster of 12 master nodes that I want to scale down to 6 masters, I can run the rebalance command like this:

redis-cli --cluster rebalance \
  --cluster-weight {node07_id}=0 \
  --cluster-weight {node08_id}=0 \
  --cluster-weight {node09_id}=0 \
  --cluster-weight {node10_id}=0 \
  --cluster-weight {node11_id}=0 \
  --cluster-weight {node12_id}=0 \
  __worker_node_addr__

The run can then stop with either of the following errors:

  • ERR Please use SETSLOT only with masters.
  • LOADING Redis is loading the dataset in memory

Expected behavior

The LOADING error should also be ignored, so that rebalancing continues instead of stopping.

Additional information

This happens because nodes that redis-cli recognized as masters when it started are converted to replicas once their slot count reaches 0.

Once a former master has become a replica, running the CLUSTER SETSLOT command on that node fails with the following error.

  • ERR Please use SETSLOT only with masters.

While that node is still in the process of becoming a replica and synchronizing with its new master, the CLUSTER SETSLOT command fails with a LOADING error instead.

  • LOADING Redis is loading the dataset in memory

There have been patches for this in the past, but they covered only the source node. The problem I ran into is in the step that notifies the other nodes:

while ((ln = listNext(&li)) != NULL) {

I think the error can be ignored for the following reasons.

  • The error occurs only while a node is changing from master to replica.
  • Nodes that were replicas from the beginning are never notification targets.
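
To make the proposal concrete, here is a minimal sketch of the check it implies. It is not taken from redis-cli: the helper name setslot_error_is_ignorable is hypothetical, and the only assumption is that the error reply text begins with one of the two messages quoted in this issue.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of redis-cli: returns true when an error
 * reply to CLUSTER SETSLOT starts with one of the two messages quoted in
 * this issue, both of which appear while a master turns into a replica. */
bool setslot_error_is_ignorable(const char *err) {
    static const char *const prefixes[] = {
        "ERR Please use SETSLOT only with masters",
        "LOADING ",
    };
    if (err == NULL) return false;
    for (size_t i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        /* Compare only the leading bytes so trailing text does not matter. */
        if (strncmp(err, prefixes[i], strlen(prefixes[i])) == 0) return true;
    }
    return false;
}
```

A caller in the notification loop could consult a check like this before treating an error reply as fatal; any other error would still stop the rebalance.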

Another option is to skip the notification for nodes that do not own any slots. However, I am not sure this is free of side effects.
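
A rough sketch of that alternative follows. The struct node_view and the function should_notify are hypothetical names used only for illustration, not redis-cli's data structures, and the sketch assumes the caller keeps slot_count up to date as slots are moved.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a cluster node as seen by the tool. */
struct node_view {
    bool is_replica;  /* role recorded when the tool started */
    int slot_count;   /* slots the node currently owns (assumed current) */
};

/* Decide whether a node should receive the slot-owner notification.
 * Replicas are skipped, and so are masters that no longer own slots,
 * since those may be in the middle of becoming replicas. */
bool should_notify(const struct node_view *n) {
    if (n->is_replica) return false;
    if (n->slot_count == 0) return false;
    return true;
}
```

Whether skipping emptied masters is safe in every case is exactly the open question raised above; the sketch only shows where such a filter would sit.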

ma2sql pushed a commit to ma2sql/redis that referenced this issue Aug 11, 2022
@ma2sql ma2sql linked a pull request Aug 11, 2022 that will close this issue
@oranagra oranagra linked a pull request Aug 24, 2022 that will close this issue