[CRASH] Assertion Failed when running rebalance command when upgrading from 7.0.11 to 7.2.2 #12695
Comments
Any updates on this? |
@salarali thanks for the report, I am taking a look. Although it's a bit convoluted, I found a way to reproduce it. |
@PingXie since you are here, can you also take a look? This is somewhat like #12805: if the node is a master, we may need to add it to the shard list.
The reason for the issue: if we have A (7.2) -> B (7.0), where B is A's master in node A's view, then A does not know B's shard id, so here we are not able to clear B's slot_info.
Instead, we keep growing B's slot info on every call, eventually hitting the assert:
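The assert in question is the slot-info bounds check in clusterGenNodesSlotsInfo (quoted from the commit message further down in this thread):

```c
/* In clusterGenNodesSlotsInfo(): the buffer is sized once for
 * 2 * numslots entries, but every CLUSTER SHARDS call appends another
 * (start, end) pair without the old pairs being freed, so the count
 * eventually overruns the bound and the assert fires. */
if (!n->slot_info_pairs) {
    n->slot_info_pairs = zmalloc(2 * n->numslots * sizeof(uint16_t));
}
serverAssert((n->slot_info_pairs_count + 1) < (2 * n->numslots));
n->slot_info_pairs[n->slot_info_pairs_count++] = start;
n->slot_info_pairs[n->slot_info_pairs_count++] = i-1;
```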
|
@enjoy-binbin, it looks like your fix for #12805 might resolve this issue too. With that fix in place, every 7.2 node (like

Btw, is re-sharding required to trigger this bug? Generally, I'd lean towards updating all nodes to the same version before doing something as involved as re-sharding. Most folks update the whole cluster first, which is a good call: it keeps things straightforward and avoids the quirks you might run into with a mixed-version setup. |
The fix in #12805 won't help, since we first check whether the sender's shard_id has changed; only if it changed do we add it to the shard list. In this case, if the sender is a master, the shard_id does not change, so we are not able to add it to the shard list. |
When a 7.2 node

What are the repro steps? Or is it possible for you to share the core dump somehow? It is a bit hard to be certain just by looking at the source code. |
My repro steps:
Repeating steps C and D, CLUSTER SHARDS responds with the following (we can see the slots section keeps expanding):
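The output looked like the following (the same fragment is quoted in the commit message below), with the 0-5461 range appended once more on every call:

```
1) 1) "slots"
   2) 1) (integer) 0
      2) (integer) 5461
      3) (integer) 0
      4) (integer) 5461
      5) (integer) 0
```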
The steps to reproduce are quite messy; I upgraded nodes randomly locally. Yeah, with the fix for #12805, the replica will remain in the same shard. However, the shard id of a certain master is never added to the shard id dict, so maybe there is something missing somewhere.

```c
/* The code here checks memcmp, and since the sender's shard id has not
 * changed, we won't add it to the shard id dict. */
static void updateShardId(clusterNode *node, const char *shard_id) {
if (shard_id && memcmp(node->shard_id, shard_id, CLUSTER_NAMELEN) != 0) {
clusterRemoveNodeFromShard(node);
memcpy(node->shard_id, shard_id, CLUSTER_NAMELEN);
clusterAddNodeToShard(shard_id, node);
clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG);
}
if (shard_id && myself != node && myself->slaveof == node) {
if (memcmp(myself->shard_id, shard_id, CLUSTER_NAMELEN) != 0) {
/* shard-id can diverge right after a rolling upgrade
* from pre-7.2 releases */
clusterRemoveNodeFromShard(myself);
memcpy(myself->shard_id, shard_id, CLUSTER_NAMELEN);
clusterAddNodeToShard(shard_id, myself);
clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG|CLUSTER_TODO_FSYNC_CONFIG);
}
}
}
```

That's why I plan to add something like this:

```c
if (ext_shardid == NULL && nodeIsMaster(sender))
    clusterAddNodeToShard(sender->shard_id, sender);
```

So the essential reason is that in the shard id dict (that is, server.cluster->shards), #12805 will only add the replica, but not the master, which causes this problem. |
Got it, this sounds more like a case where a 7.2 node is just observing a 7.0 shard, not actually replicating from it. Makes me wonder if we even need to go through re-sharding to reproduce this bug?
Did you get a chance to try this out with your latest changes in #12805? The earlier commits had this issue, but I thought your last update would've fixed it for both v7.0 primary and replica nodes. |
I feel it's not needed; I just followed the issue's idea, and then I got an environment that reproduces it stably and didn't want to break it.
I did try it (even earlier); it doesn't work. The new commit indeed will call |
You are right. Now I see two potential issues with
Would you like to propose a change? |
I am happy to make the change and test it, but I'm a little confused. Do you have any ideas, or can you elaborate a bit more? |
Thinking about this more, a better and more correct fix would be to update the shard topology when a 7.0 replica is connected to its 7.0 primary for the very first time. More specifically, we need to inject a |
I tried it; it didn't work. Luckily, I found minimal steps to reproduce:
So the reason is that when a 7.2 node is a 7.0 node's slave, the shard id dict does not have the 7.0 node, so when we issue CLUSTER SHARDS on the 7.2 node, it ends up here: #12695 (comment) |
This looks like a different issue from the one observed in your previous #12695 (comment). I think we should still keep the change I proposed in #12695 (comment). We could continue with the fix proposed in #12695 (comment), but an alternative could be fixing the underlying assumption of |
Wow, that's actually what I thought at first: the shard dict should always be in sync with the node's shard_id. The first time I tried it, it didn't fix the issue, so I dropped it and considered fixing it with a smaller diff. I actually agree with this idea. We don't update the shard dict synchronously in some places, which feels like a hidden danger. I will try to open a new PR later with all the changes we mentioned (for better review). |
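For illustration, one hypothetical shape of that invariant (a sketch only, not the final patch, which landed in #12832): make updateShardId re-register the node in server.cluster->shards unconditionally, rather than only when memcmp detects a changed shard id:

```c
/* Hypothetical sketch: keep server.cluster->shards in sync with
 * node->shard_id even when the id itself did not change. This relies on
 * clusterAddNodeToShard() being a no-op when the node is already in the
 * shard's node list (an assumption about that helper). The replica
 * (myself) branch of the original function is omitted for brevity. */
static void updateShardId(clusterNode *node, const char *shard_id) {
    if (shard_id == NULL) return;
    if (memcmp(node->shard_id, shard_id, CLUSTER_NAMELEN) != 0) {
        clusterRemoveNodeFromShard(node);
        memcpy(node->shard_id, shard_id, CLUSTER_NAMELEN);
        clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG);
    }
    /* Re-add unconditionally, so a master that was never registered
     * (e.g. learned from a pre-7.2 node) still ends up in the dict. */
    clusterAddNodeToShard(node->shard_id, node);
}
```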
…e not sync

Crash reported in redis#12695.

In the process of upgrading the cluster from 7.0 to 7.2, because the 7.0 nodes will not gossip shard id, in 7.2 we rely on shard id to build the server.cluster->shards dict.

In some cases, for example with a 7.0 master node and a 7.2 replica node, from the view of the 7.2 replica node the cluster->shards dictionary does not have its master node. In this case, calling CLUSTER SHARDS on the 7.2 replica node may crash.

A CLUSTER SHARDS result output:
```
1) 1) "slots"
   2) 1) (integer) 0
      2) (integer) 5461
      3) (integer) 0
      4) (integer) 5461
      5) (integer) 0
```

We can see that the output contains repeated slots; each call appends a new pair, and then we crash on serverAssert:
```c
void clusterGenNodesSlotsInfo(int filter) {
    ...
    /* Generate slots info when occur different node with start
     * or end of slot. */
    if (i == CLUSTER_SLOTS || n != server.cluster->slots[i]) {
        if (!(n->flags & filter)) {
            if (!n->slot_info_pairs) {
                n->slot_info_pairs = zmalloc(2 * n->numslots * sizeof(uint16_t));
            }
            serverAssert((n->slot_info_pairs_count + 1) < (2 * n->numslots));
            n->slot_info_pairs[n->slot_info_pairs_count++] = start;
            n->slot_info_pairs[n->slot_info_pairs_count++] = i-1;
        }
        if (i == CLUSTER_SLOTS) break;
        n = server.cluster->slots[i];
        start = i;
    }
    ...
}
```

The reason is that in addShardReplyForClusterShards we are not able to clean up the slot_info_pairs corresponding to the 7.0 master node. In the code below, we loop to find the 7.0 master node, and then we call clusterFreeNodesSlotsInfo to clean up slot_info_pairs according to the shard id dict list, but the 7.0 master node is not in the list.
```c
void addShardReplyForClusterShards(client *c, list *nodes) {
    ...
    /* Use slot_info_pairs from the primary only */
    while (n->slaveof != NULL) n = n->slaveof;
    ...
    addReplyBulkCString(c, "nodes");
    addReplyArrayLen(c, listLength(nodes));
    listIter li;
    listRewind(nodes, &li);
    for (listNode *ln = listNext(&li); ln != NULL; ln = listNext(&li)) {
        clusterNode *n = listNodeValue(ln);
        addNodeDetailsToShardReply(c, n);
        clusterFreeNodesSlotsInfo(n);
    }
}
```

We should fix the underlying assumption of updateShardId, which is that the shard dict should always be in sync with the node's shard_id. The fix was suggested by PingXie; see more details in redis#12695.

Co-authored-by: Ping Xie <pingxie@google.com>
@PingXie thanks! I verified that adding it in clusterRenameNode fixes this issue (#12695 (comment)). Please take a look at the fix in #12832. |
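For context, a rough sketch of what that change could look like (a hypothetical shape; see #12832 for the actual patch): after a handshake node is renamed to its real name, also register it in the shards dict under its shard id, so CLUSTER SHARDS can later find and free its slot_info_pairs:

```c
/* Sketch based on the discussion; see #12832 for the real fix. */
void clusterRenameNode(clusterNode *node, char *newname) {
    int retval;
    sds s = sdsnewlen(node->name, CLUSTER_NAMELEN);

    serverLog(LL_DEBUG, "Renaming node %.40s into %.40s",
              node->name, newname);
    retval = dictDelete(server.cluster->nodes, s);
    sdsfree(s);
    serverAssert(retval == DICT_OK);
    memcpy(node->name, newname, CLUSTER_NAMELEN);
    clusterAddNode(node);
    /* New: track the renamed node in the shards dict, keeping
     * server.cluster->shards in sync with node->shard_id. */
    clusterAddNodeToShard(node->shard_id, node);
}
```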
…e not sync (#12832)

Crash reported in #12695. In the process of upgrading the cluster from 7.0 to 7.2, because the 7.0 nodes will not gossip shard id, in 7.2 we rely on shard id to build the server.cluster->shards dict.

In some cases, for example with a 7.0 master node and a 7.2 replica node, from the view of the 7.2 replica node the cluster->shards dictionary does not have its master node. In this case, calling CLUSTER SHARDS on the 7.2 replica node may crash.

We should fix the underlying assumption of updateShardId, which is that the shard dict should always be in sync with the node's shard_id. The fix was suggested by PingXie; see more details in #12695.
Crash report
The "Assertion failed" error keeps recurring over a long period of time.
Additional information
Amazon Linux 2023, redis-version 7.2.2
The rebalance command gets stuck, and on investigation, this assertion failure was found.