Warning commitlog - Cannot parse the version of the file: and coredump after removing node from cluster and adding new node with raft #11228
Coredump
|
Issue reproduced.
Installation details
Kernel Version: 5.15.0-1017-aws
Scylla Nodes used in this run:
OS / Image: Test: Issue description>>>>>>>
Logs:
coredump:
|
The issue continues to reproduce:
|
Installation details
Kernel Version: 5.15.0-1019-aws
Scylla Nodes used in this run:
OS / Image: Test: Issue description>>>>>>>
Logs:
|
For the record, the "cannot parse the version" message just means that the schema commitlog lives in the same directory as the normal commitlog, and the former sees segments left over by the latter. It is not an error as such (though annoying). |
If it's not an error, why is it printed? |
Let's call it a warning. Commitlog expects to be the sole owner of the directory it works in. When scanning for replayable/recyclable items, if it encounters anything that does not match the configured name pattern, it will complain, since this indicates that someone has meddled with the files. For a while now, we have had a separate schema commitlog that uses the same folder but a different name pattern. Thus, when doing a replay scan on a non-clean instance start, we get these messages as the two instances find each other's files. |
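To illustrate why two commitlog instances sharing one directory warn about each other's files, here is a minimal sketch of a name-pattern check. It is not Scylla's actual parser: the `<prefix><version>-<id>.log` layout, the `CommitLog-` and `SchemaLog-` prefixes used in the usage note, and the names `segment_name` and `parse_segment_name` are all assumptions made for illustration.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical parsed form of a segment file name.
struct segment_name {
    int version;
    unsigned long long id;
};

// Returns the parsed name if `filename` matches "<prefix><version>-<id>.log",
// otherwise nullopt. A real scanner would log a warning in the nullopt case.
// Assumption: `prefix` contains no regex metacharacters.
std::optional<segment_name> parse_segment_name(const std::string& prefix,
                                               const std::string& filename) {
    const std::regex pattern(prefix + R"((\d+)-(\d+)\.log)");
    std::smatch m;
    if (!std::regex_match(filename, m, pattern)) {
        return std::nullopt;
    }
    return segment_name{std::stoi(m[1].str()), std::stoull(m[2].str())};
}
```

With these assumed prefixes, an instance configured with `CommitLog-` accepts `CommitLog-2-123.log` but rejects `SchemaLog-2-123.log`, and the reverse holds for the other instance. That is the situation in which each instance would report a file it cannot parse.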
Before this patch:
- having a mapping without an actual IP address was an internal error
- not having a mapping for an IP address was an internal error
- the mapping was removed quickly after the entry is removed from the configuration.
- re-mapping to a new IP address wasn't allowed

After this patch:
- the address map API is, well, inaccurate. It may contain a mapping without an actual IP address, and the caller must be prepared for it: get_inet_address() will throw (QQQ: should we add a new exception type?) while find() will return a nullopt. This happens when we first add an entry to Raft configuration and only later learn its IP address, e.g. via gossip.
- it is allowed to re-map an existing entry to a new address
- When an entry is removed from Raft configuration, it still lingers in the mapping for the expiration interval. I don't know exactly why it's necessary, but it may be relevant to fix issue scylladb#11228, where we observe assertion failures for missing mappings which should never happen.
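The "after this patch" semantics can be sketched roughly as follows. This is a simplified model, not the real Scylla class: only the names get_inet_address() and find() come from the commit message, while `address_map_sketch`, `add_or_update`, `mark_expiring`, `drop_expired`, the use of `std::runtime_error`, and passing the current time explicitly are assumptions made to keep the example self-contained and deterministic.

```cpp
#include <chrono>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

using server_id = int;                           // stand-in for a Raft server id
using sketch_clock = std::chrono::steady_clock;

class address_map_sketch {
    struct entry {
        std::optional<std::string> addr;                      // may be unknown at first
        std::optional<sketch_clock::time_point> expires_at;   // set once removed from config
    };
    std::unordered_map<server_id, entry> _entries;
    sketch_clock::duration _expiry;

public:
    explicit address_map_sketch(sketch_clock::duration expiry) : _expiry(expiry) {}

    // Adds an entry (possibly without an address) or re-maps an existing one.
    void add_or_update(server_id id, std::optional<std::string> addr = std::nullopt) {
        entry& e = _entries[id];
        if (addr) {
            e.addr = std::move(addr);
        }
        e.expires_at.reset();
    }

    // Returns nullopt both for a missing entry and for an entry without an address.
    std::optional<std::string> find(server_id id) const {
        auto it = _entries.find(id);
        if (it == _entries.end()) {
            return std::nullopt;
        }
        return it->second.addr;
    }

    // Throws when no address is known; the exception type is an assumption.
    std::string get_inet_address(server_id id) const {
        auto addr = find(id);
        if (!addr) {
            throw std::runtime_error("no IP address known for server " + std::to_string(id));
        }
        return *addr;
    }

    // Called when the server leaves the configuration: the entry lingers.
    void mark_expiring(server_id id, sketch_clock::time_point now) {
        auto it = _entries.find(id);
        if (it != _entries.end()) {
            it->second.expires_at = now + _expiry;
        }
    }

    // Drops entries whose expiration interval has elapsed.
    void drop_expired(sketch_clock::time_point now) {
        for (auto it = _entries.begin(); it != _entries.end();) {
            if (it->second.expires_at && *it->second.expires_at <= now) {
                it = _entries.erase(it);
            } else {
                ++it;
            }
        }
    }
};
```

A caller that can tolerate a missing address would use find() and check the result, while get_inet_address() suits call sites where a missing address is truly exceptional.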
The test runs the remove_node command with a background DDL workload. It was written in an attempt to reproduce scylladb#11228 but seems to have value on its own. The if_exists parameter has been added to the add_table and drop_table functions, since the driver could retry a request sent to a removed node even though that request might already have completed. The wait_for_host_known function waits until the information about the node reaches the destination node. Since we add new nodes at each iteration in main, this can take some time. A number of abort-related options were added to SCYLLA_CMDLINE_OPTIONS, as this simplifies nailing down problems.
There is a flaw in how the raft rpc endpoints are currently managed. The io_fiber in raft::server is supposed to first add new servers to rpc, then send all the messages and then remove the servers which have been excluded from the configuration. The problem is that the send_messages function isn't synchronous, it schedules send_append_entries to run after all the current requests to the target server, which can happen after we have already removed the server from address_map. Fixes: scylladb#11228
There is a flaw in how the raft rpc endpoints are currently managed. The io_fiber in raft::server is supposed to first add new servers to rpc, then send all the messages and then remove the servers which have been excluded from the configuration. The problem is that the send_messages function isn't synchronous, it schedules send_append_entries to run after all the current requests to the target server, which can happen after we have already removed the server from address_map. In this patch the remove_server function is changed to mark the server_id as expiring rather than synchronously dropping it. This means all currently scheduled requests to that server will still be able to resolve the ip address for that server_id. Fixes: scylladb#11228
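The ordering problem described above can be modelled in a few lines. This is a toy model under stated assumptions, not the Raft implementation: `rpc_sketch`, `remove_server_sync`, `remove_server_expiring`, `run_pending` and `expire` are hypothetical names, and a plain queue of callbacks stands in for the asynchronous scheduling of send_append_entries.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using server_id = int;  // stand-in for a Raft server id

struct rpc_sketch {
    struct entry {
        std::string addr;
        bool expiring = false;
    };
    std::unordered_map<server_id, entry> address_map;
    std::queue<std::function<void()>> pending;  // sends scheduled but not yet run
    std::vector<std::string> sent_to;           // addresses that were resolved
    int unresolved = 0;                         // sends that found no mapping

    // The send is only scheduled here; the address is resolved when it runs.
    void send_append_entries(server_id id) {
        pending.push([this, id] {
            auto it = address_map.find(id);
            if (it == address_map.end()) {
                ++unresolved;  // the real code hit an assertion failure here
                return;
            }
            sent_to.push_back(it->second.addr);
        });
    }

    // Old behaviour: the mapping is dropped immediately.
    void remove_server_sync(server_id id) { address_map.erase(id); }

    // Patched behaviour: the mapping is only marked as expiring.
    void remove_server_expiring(server_id id) {
        auto it = address_map.find(id);
        if (it != address_map.end()) {
            it->second.expiring = true;
        }
    }

    // Runs every scheduled send.
    void run_pending() {
        while (!pending.empty()) {
            auto task = std::move(pending.front());
            pending.pop();
            task();
        }
    }

    // Later cleanup of entries that were marked as expiring.
    void expire() {
        for (auto it = address_map.begin(); it != address_map.end();) {
            it = it->second.expiring ? address_map.erase(it) : std::next(it);
        }
    }
};
```

In this model, scheduling a send, calling remove_server_sync and then running the queue leaves the send unresolved, whereas the same sequence with remove_server_expiring still resolves the address. The entry goes away only when expire() runs afterwards.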
Before this patch:
- having a mapping without an actual IP address was an internal error
- not having a mapping for an IP address was an internal error
- the mapping was removed quickly after the entry is removed from the configuration.
- re-mapping to a new IP address wasn't allowed

After this patch:
- the address map API is, well, inaccurate. It may contain a mapping without an actual IP address, and the caller must be prepared for it: get_inet_address() will throw while find() will return a nullopt. This happens when we first add an entry to Raft configuration and only later learn its IP address, e.g. via gossip.
- it is allowed to re-map an existing entry to a new address
- When an entry is removed from Raft configuration, it still lingers in the mapping for the expiration interval. The reason for it is that a server may be removed from the current Raft configuration, but some messages may still be destined to it, and will need the translation. This fixes scylladb#11228.
1) make address map API flexible

Before this patch:
- having a mapping without an actual IP address was an internal error
- not having a mapping for an IP address was an internal error
- the mapping was removed quickly after the entry is removed from the configuration.
- re-mapping to a new IP address wasn't allowed

After this patch:
- the address map API is, well, inaccurate. It may contain a mapping without an actual IP address, and the caller must be prepared for it: get_inet_address() will throw while find() will return a nullopt. This happens when we first add an entry to Raft configuration and only later learn its IP address, e.g. via gossip.
- it is allowed to re-map an existing entry to a new address
- When an entry is removed from Raft configuration, it still lingers in the mapping for the expiration interval. The reason for it is that a server may be removed from the current Raft configuration, but some messages may still be destined to it, and will need the translation. This fixes scylladb#11228.

2) subscribe to gossip notifications

Learning IP addresses from gossip allows us to adjust the address map whenever a node IP address changes.

3) prompt address map state with app state

Initialize the raft address map with a persistent snapshot of gossip application state, specifically IPs of members of the cluster. With this, we no longer need to store these IPs in Raft configuration (and update them when they change). The obvious drawback of this approach is that a node may join Raft config before it propagates its IP address to the cluster via gossip, so the boot process has to wait until it happens.

raft: (service) move the pinger to the address map

Since the configuration change event and IP address discovery are now disconnected, i.e. we first may have a new raft id in the configuration, and only later learn its address through gossip, we should change the pinger subscription not whenever there is a raft config change, but whenever we know the IP address we'd like to subscribe to.

raft: (service) do not store IPs in Raft configuration

Thanks to the rest of the patches in this series, Raft address map no longer needs to get this information from Raft configuration. Keep the 'server_info' column in the raft_config system table, in case we change our mind or decide to store something else in there.

Note that the pinger is updated on configuration changes, when the IP address may still be missing. When we learn the IP address we don't update the pinger. This is a bug, since the node may be declared down if raft configuration changes ahead of gossip. Should pinger updates be moved to raft address map?
1) make address map API flexible

Before this patch:
- having a mapping without an actual IP address was an internal error
- not having a mapping for an IP address was an internal error
- the mapping was removed quickly after the entry is removed from the configuration.
- re-mapping to a new IP address wasn't allowed

After this patch:
- the address map API is, well, inaccurate. It may contain a mapping without an actual IP address, and the caller must be prepared for it: get_inet_address() will throw while find() will return a nullopt. This happens when we first add an entry to Raft configuration and only later learn its IP address, e.g. via gossip.
- it is allowed to re-map an existing entry to a new address
- When an entry is removed from Raft configuration, it still lingers in the mapping for the expiration interval. The reason for it is that a server may be removed from the current Raft configuration, but some messages may still be destined to it, and will need the translation. This fixes scylladb#11228.

2) subscribe to gossip notifications

Learning IP addresses from gossip allows us to adjust the address map whenever a node IP address changes.

3) prompt address map state with app state

Initialize the raft address map with persistent snapshot of gossip application state, specifically IPs of members of the cluster. With this, we no longer need to store these IPs in Raft configuration (and update them when they change). The obvious drawback of this approach is that a node may join Raft config before it propagates its IP address to the cluster via gossip - so the boot process has to wait until it happens.

Since the configuration change event and ip address discovery are now disconnected, i.e. we first may have a new raft id in the configuration, and only later learn its address through gossip, we should change the pinger subscription not whenever there is a raft config change, but whenever we know the ip address we'd like to subscribe to.

Thanks to the changes above, Raft address map no longer needs to get this information from Raft configuration. Keep the 'server_info' column in the raft_config system table, in case we change our mind or decide to store something else in there.
The test runs the remove_node command with a background DDL workload. It was written in an attempt to reproduce #11228 but seems to have value on its own. The if_exists parameter has been added to the add_table and drop_table functions, since the driver could retry a request sent to a removed node even though that request might already have completed. The wait_for_host_known function waits until the information about the node reaches the destination node. Since we add new nodes at each iteration in main, this can take some time. A number of abort-related options were added to SCYLLA_CMDLINE_OPTIONS, as this simplifies nailing down problems. Closes #11734
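The reason for if_exists is that a retried DDL statement must not fail when its first attempt already succeeded. The sketch below only illustrates the idea by building CQL text; it is not the test's actual add_table and drop_table helpers. The function names, the single-column schema, and the mapping of the flag to IF NOT EXISTS on creation are assumptions.

```cpp
#include <string>

// Hypothetical helper: builds a CREATE TABLE statement. With `if_not_exists`
// set, re-running the statement after it already succeeded is harmless.
std::string create_table_cql(const std::string& keyspace, const std::string& table,
                             bool if_not_exists) {
    return std::string("CREATE TABLE ") + (if_not_exists ? "IF NOT EXISTS " : "") +
           keyspace + "." + table + " (pk int PRIMARY KEY)";
}

// Hypothetical helper: builds a DROP TABLE statement. With `if_exists` set,
// a retry after a successful drop does not raise an error.
std::string drop_table_cql(const std::string& keyspace, const std::string& table,
                           bool if_exists) {
    return std::string("DROP TABLE ") + (if_exists ? "IF EXISTS " : "") +
           keyspace + "." + table;
}
```

For example, `drop_table_cql("ks", "t", true)` produces `DROP TABLE IF EXISTS ks.t`, which is safe to send twice if the driver retries a request whose first attempt reached the cluster before the node was removed.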
1) make address map API flexible

Before this patch:
- having a mapping without an actual IP address was an internal error
- not having a mapping for an IP address was an internal error
- the mapping was removed quickly after the entry is removed from the configuration.
- re-mapping to a new IP address wasn't allowed

After this patch:
- the address map API is, well, inaccurate. It may contain a mapping without an actual IP address, and the caller must be prepared for it: get_inet_address() will throw while find() will return a nullopt. This happens when we first add an entry to Raft configuration and only later learn its IP address, e.g. via gossip.
- it is allowed to re-map an existing entry to a new address
- When an entry is removed from Raft configuration, it still lingers in the mapping for the expiration interval. The reason for it is that a server may be removed from the current Raft configuration, but some messages may still be destined to it, and will need the translation. This fixes scylladb#11228.

2) subscribe to gossip notifications

Learning IP addresses from gossip allows us to adjust the address map whenever a node IP address changes.

3) prompt address map state with app state

Initialize the raft address map with a persistent snapshot of gossip application state, specifically IPs of members of the cluster. With this, we no longer need to store these IPs in Raft configuration (and update them when they change). The obvious drawback of this approach is that a node may join Raft config before it propagates its IP address to the cluster via gossip, so the boot process has to wait until it happens.

Since the configuration change event and IP address discovery are now disconnected, i.e. we first may have a new raft id in the configuration, and only later learn its address through gossip, we should change the pinger subscription not whenever there is a raft config change, but whenever we know the IP address we'd like to subscribe to.

Thanks to the changes above, Raft address map no longer needs to get this information from Raft configuration. Keep the 'server_info' column in the raft_config system table, in case we change our mind or decide to store something else in there.

Only group0 rpc needs to pay attention to its configuration change events, not just any rpc, since we only maintain the mapping when group0 configuration changes.
1) make address map API flexible Before this patch: - having a mapping without an actual IP address was an internal error - not having a mapping for an IP address was an internal error - the mapping was removed quickly after the entry is removed from the configuration. - re-mapping to a new IP address wasn't allowed After this patch: - the address map API is, well, inaccurate. It may contain a mapping without an actual IP address, and the caller must be prepared for it: get_inet_address() will throw while find() will return a nullopt. This happens when we first add an entry to Raft configuration and only later learn its IP address, e.g. via gossip. - it is allowed to re-map an existing entry to a new address - When an entry is removed from Raft configuration, it still lingers in the mapping for the expiration interval. The reason for it is that a server may be removed from the current Raft configuration, but some messages may still be destined to it, and will need the translation. This fixes scylladb#11228. 2) subscribe to gossip notifications Learning IP addresses from gossip allows us to adjust the address map whenever a node IP address changes. 3) prompt address map state with app state Initialize the raft address map with persistent snapshot of gossip application state, specifically IPs of members of the cluster. With this, we no longer need to store these IPs in Raft configuration (and update them when they change). The obvious drawback of this approach is that a node may join Raft config before it propagates its IP address to the cluster via gossip - so the boot process has to wait until it happens. Since the configuration change event and ip address discovery are now disconnected, i.e. we first may have a new raft id in the configuration, and only later learn its address through gossip, we should change the pinger subscription not whenever there is a raft config change, but whenever we know the ip address we'd like to subscribe to. 
Thanks to the changes above, Raft address map no longer needs to get this information from Raft configuration. Keep the 'server_info' column in the raft_config system table, in case we change our mind or decide to store something else in there. Only group0 rpc needs to pay attention to its configuration change events, not just any rpc, since we only maintain the mapping when group0 configuration changes.
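To make the address map semantics from the commit message concrete, here is a minimal sketch. It is not the actual Scylla implementation: the class name `address_map_sketch`, the string-based `server_id` and `inet_address` aliases, and the methods `add_id()`, `set_address()`, `mark_removed()` and `drop_expired()` are all hypothetical. Only `find()` returning a nullopt and `get_inet_address()` throwing are taken from the text above, and the exception type used here is an assumption.

```cpp
#include <chrono>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the real ID and address types.
using server_id = std::string;
using inet_address = std::string;
using time_point = std::chrono::steady_clock::time_point;

class address_map_sketch {
    struct entry {
        // Empty until the address is learned, e.g. via gossip.
        std::optional<inet_address> addr;
        // Set once the server leaves the Raft configuration.
        std::optional<time_point> expires_at;
    };
    std::unordered_map<server_id, entry> _entries;
    std::chrono::steady_clock::duration _expiry;

public:
    explicit address_map_sketch(std::chrono::steady_clock::duration expiry)
        : _expiry(expiry) {}

    // The ID joined the configuration; its address may still be unknown.
    void add_id(const server_id& id) {
        _entries[id].expires_at.reset();
    }

    // Learn an address, or re-map an existing entry to a new one.
    void set_address(const server_id& id, const inet_address& addr) {
        _entries[id].addr = addr;
    }

    // The ID left the configuration: keep the entry for the expiration
    // interval so in-flight messages can still be translated.
    void mark_removed(const server_id& id, time_point now) {
        auto it = _entries.find(id);
        if (it != _entries.end()) {
            it->second.expires_at = now + _expiry;
        }
    }

    // Erase entries whose expiration time has been reached.
    void drop_expired(time_point now) {
        for (auto it = _entries.begin(); it != _entries.end();) {
            if (it->second.expires_at && *it->second.expires_at <= now) {
                it = _entries.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Returns nullopt if the ID is unknown or has no address yet.
    std::optional<inet_address> find(const server_id& id) const {
        auto it = _entries.find(id);
        if (it == _entries.end()) {
            return std::nullopt;
        }
        return it->second.addr;
    }

    // Throws if there is no usable address for the ID.
    inet_address get_inet_address(const server_id& id) const {
        auto addr = find(id);
        if (!addr) {
            throw std::runtime_error("no IP address known for " + id);
        }
        return *addr;
    }
};
```

In this sketch a caller on the send path would use `find()` and skip or retry the message when it returns a nullopt, so a missing mapping is handled as a normal condition rather than as an internal error.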
@kbr-scylla please evaluate for backport |
Included in 5.2, Raft is experimental in earlier releases, no point in backporting. |
@kostja , @elcallio , can you please advise -
Cc @roydahan |
@yarongilor, @elcallio wrote above that the error itself can be expected; it's merely a warning. Can you still reproduce the crash? The crash should be gone by now. |
the warning problem is fixed here, schema commitlog is moved to another directory |
@kostja , no crashes reproduced, only the warnings |
Moving to 5.3, and reducing priority. Just a warning, shouldn't be in newer versions. |
I don't think anything needs to be done for this issue. In 5.3 Raft will be using schema commit log. |
Installation details
Kernel Version: 5.15.0-1015-aws
Scylla version (or git commit hash):
5.1.0~dev-20220802.663f2e2a8f81
with build-id d6f523279872624d89a9a451f02d8276f0073cb5
Cluster size: 6 nodes (i3.4xlarge)
Scylla Nodes used in this run:
OS / Image:
ami-0da67a130b28d1222
(aws: eu-north-1)
Test:
longevity-100gb-4h-with-raft-test
Test id:
4db1e59f-a58b-4a9f-af8e-5b5d9b285260
Test name:
scylla-master/SCT_Raft_Experimental/longevity-100gb-4h-with-raft-test
Test config file(s):
Issue description
Nemesis RemoveNodeandAndNew terminates the target node, then runs nodetool repair on each node, then removes the node with nodetool removenode, and adds a new one.
During the test job, after the repair operations, the nemesis executed nodetool removenode, and node 1 got the following error:
2022-08-04T17:26:27+00:00 longevity-100gb-4h-master-db-node-4db1e59f-1 !ERR | scylla[7861]: [shard 0] raft_group_registry - Destination raft server not found with id a27f46c0-127a-4eca-acd8-5135943046d3, at: 0x4fa031e 0x4fa0810 0x4fa0b18 0x4be9835 0x3aeef3e 0x3bc4195 0x3bc4f78 0x4c165d4 0x4c179b7 0x4c16c0c 0x4bbccf8 0x4bbc1d1 0x1066fcf 0x106451a /opt/scylladb/libreloc/libc.so.6+0x27b74 0x10633ed
Then, when new node 7 was being added to the cluster, the following errors and coredump happened on the cluster:
$ hydra investigate show-monitor 4db1e59f-a58b-4a9f-af8e-5b5d9b285260
$ hydra investigate show-logs 4db1e59f-a58b-4a9f-af8e-5b5d9b285260
Logs:
Jenkins job URL