New node won't join the cluster and replace the lost node automatically #114

Closed
vladiceanu opened this issue Jun 2, 2020 · 5 comments · Fixed by #258
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@vladiceanu

Describe the bug
We tried to simulate a node failure where a node gets forcefully removed and a new node is provisioned. But when the new node was trying to join the cluster, we saw the following error message:

INFO  2020-05-15 10:24:36,589 [shard 0] init - Shutdown database started
INFO  2020-05-15 10:24:36,589 [shard 0] compaction_manager - Asked to stop
INFO  2020-05-15 10:24:36,589 [shard 0] compaction_manager - Stopped
INFO  2020-05-15 10:24:36,691 [shard 0] init - Shutdown database finished
INFO  2020-05-15 10:24:36,691 [shard 0] init - stopping prometheus API server
INFO  2020-05-15 10:24:36,691 [shard 0] init - Startup failed: std::runtime_error (A node with address 10.100.136.228 already exists, cancelling join. Use replace_address if you want to replace this node.)

where the address 10.100.136.228 is the old IP.

After running kubectl exec into a healthy pod and executing nodetool removenode <node_id> (sketched after the output below), the new node was able to join the cluster, but nodetool gossipinfo now shows the following:

/10.100.136.228 < ---- this is the node name? 
  RPC_ADDRESS:10.100.136.228
 ...
  INTERNAL_IP:172.26.29.218 <---- this is the real new IP
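
For reference, a minimal sketch of that manual workaround, assuming a cluster deployed by the scylla-operator in namespace scylla (the pod name below is hypothetical):

  # Exec into a healthy Scylla pod (pod and namespace names are hypothetical)
  kubectl -n scylla exec -it simple-cluster-us-east-1-us-east-1a-0 -c scylla -- bash

  # Inside the pod: find the Host ID of the dead member (status DN),
  # then remove it so the new node can join
  nodetool status
  nodetool removenode <host_id_of_dead_node>

  # Verify the ring state afterwards
  nodetool gossipinfo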

To Reproduce
Steps to reproduce the behavior:

  1. Create a Scylla cluster using the scylla-operator;
  2. Remove a node (a Kubernetes node in this case) and provide a new, empty node (usually the Autoscaler will do that; one way to simulate this is sketched after this list);
  3. See the error logs in the new pod;
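
As an illustration of step 2 on EKS, a rough sketch of forcing the failure by terminating the backing EC2 instance (the instance ID is hypothetical; the autoscaler then provisions a replacement):

  # Terminate the EC2 instance backing the Kubernetes node (hypothetical ID)
  aws ec2 terminate-instances --instance-ids i-0123456789abcdef0
  # The autoscaler brings up a fresh, empty node shortly afterwards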

Expected behavior
A (VM) node is removed from the cluster, a new empty node becomes available to host the Scylla Pod, and the new Pod/node joins the cluster automatically, with no user action required.

Config Files
Default;

Logs
(see the description above; please let me know if additional logs are required)

Environment:

  • Platform: EKS
  • Kubernetes version: 1.15.11
  • Scylla version: 3.2.1
  • Scylla-operator version: v0.1.6
@vladiceanu vladiceanu added the kind/bug Categorizes issue or PR as related to a bug. label Jun 2, 2020
@dahankzter
Contributor

This is not implemented yet, but there is an issue for it: #48. Scylla requires special handling if you want to reuse the IP address.

@vladiceanu
Author

Scylla requires special handling if you want to reuse the IP address.

The thing is that we don't really want to reuse the same IP address; we just want the new Pod to join the cluster and replace the old Pod that was on the dead node. The error is misleading because the new Pod has a different IP, whereas 10.100.136.228 is the IP of the old Pod.

@mmatczuk mmatczuk added this to the 1.0 milestone Sep 22, 2020
@jkarjala

I ran into a related problem with the latest scylla-operator and the example EKS cluster configuration.

I am simulating a failure by terminating one Kubernetes node hosting a Scylla node via the AWS API. The AWS autoscaler brings up a new node, and Kubernetes adopts it fine.

However, the replacement Scylla pod cannot be scheduled to the new node because the old pod's PVC still points to the PV on the lost Kubernetes node. Manually deleting the PVC resets the situation: the new pod gets scheduled, joins the Scylla cluster, and nodetool status shows it as UP.

According to kubernetes/kubernetes#61620, this situation should be managed by the (scylla) operator. Is this going to be fixed as part of this ticket (or #48), or is this a separate issue?

@zimnx
Collaborator

zimnx commented Oct 21, 2020

@jkarjala do you have Operator logs from this situation?

@jkarjala

The operator log tail is attached (the EC2 instance was terminated around 12:30).

scylla-op-40.log

The output of kubectl describe pod for the pending Scylla pod shows these Events:

Warning FailedScheduling 2m7s default-scheduler 0/6 nodes are available: 6 Insufficient cpu.

Warning FailedScheduling 49s (x4 over 99s) default-scheduler 0/7 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 6 Insufficient cpu.

Warning FailedScheduling 44s (x3 over 47s) default-scheduler 0/7 nodes are available: 1 node(s) had volume node affinity conflict, 6 Insufficient cpu.

Once I delete the PVC for that pod, as well as the pod itself, the new pod gets scheduled with a new PVC (pointing to the new local PV on the new node). According to the Kubernetes issue above, the scylla-operator should take care of this in case of node failure.
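
For completeness, a rough sketch of that manual cleanup (the PVC and pod names are hypothetical and depend on the cluster name):

  # Delete the PVC that still has node affinity to the lost node,
  # then delete the pending pod so it is recreated with a fresh PVC
  kubectl -n scylla delete pvc data-simple-cluster-us-east-1-us-east-1a-1
  kubectl -n scylla delete pod simple-cluster-us-east-1-us-east-1a-1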

--

My previous comment claiming that the new pod joins the cluster was actually wrong.
Scylla in the new pod exits due to an IP address conflict, which seems to be the topic of #48:

ERROR 2020-10-21 13:03:17,613 [shard 0] init - Startup failed: std::runtime_error (A node with address 10.100.0.108 already exists, cancelling join. Use replace_address if you want to replace this node.)

The new EC2 node has a new IP address, but it seems that at the Kubernetes level the pod still gets an IP that already exists in the Scylla cluster.

zimnx added a commit that referenced this issue Nov 20, 2020
When k8s node is gone, PVC might still have node affinity pointing
to lost node. In this situation, PVC is deleted by the Operator
and node replacement logic is triggered to restore cluster RF.

Fixes #215
Fixes #114