In relation to comments 8/9 in issue 375.
Rancher 2.2.7, Longhorn 0.5
Two nodes; on one node Docker is stopped and started again after about 10 minutes. All volumes that were previously assigned to that node stay detached forever.
Sometimes scaling the affected workload down and up helps, sometimes not.
What always helps is manually attaching the volume back.
The error messages are the same as in 375. Happy to provide details.
I think it's quite important to address this; it should be a fairly common scenario that the whole node is not rebooted, but Docker goes down for one reason or another.
Cheers,
Chris
If we want to detect whether the Docker daemon was restarted, we need to find a way to know that it is the reason the controller went down, like what we did for node restarts with the Kubernetes NodeBootID.
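To illustrate the point about NodeBootID: Kubernetes exposes a node's boot ID in its status (`status.nodeInfo.bootID`), and a changed value proves a full node reboot, while a Docker-only restart leaves it unchanged. The sketch below is purely illustrative decision logic under that assumption; the function names are hypothetical, not actual Longhorn code.

```python
# Hypothetical sketch: using the Kubernetes NodeBootID to tell a node
# reboot apart from a container-runtime (Docker) restart.
# Function names are illustrative, not real Longhorn identifiers.

def node_was_rebooted(recorded_boot_id: str, current_boot_id: str) -> bool:
    """A new BootID means the kernel restarted, so every container on the
    node (including the Longhorn engine) went down with it."""
    return recorded_boot_id != current_boot_id

def classify_engine_down(recorded_boot_id: str, current_boot_id: str) -> str:
    # If the BootID changed, the whole node rebooted. If it did not,
    # only the runtime was restarted (or the engine crashed on its own);
    # Kubernetes has no equivalent field that distinguishes those cases,
    # which is exactly the detection gap described above.
    if node_was_rebooted(recorded_boot_id, current_boot_id):
        return "node-reboot"
    return "runtime-restart-or-crash"
```

A Docker restart therefore looks identical to an engine crash from this vantage point, which is why detecting it needs a different signal.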
Hmm, I'm a newbie here, but why would it hurt if we always at least tried to reattach when the restarting workload (typically a StatefulSet) asks for it, independent of the potential reason why it was detached?
Sometimes, when the workload is manually scaled down and up again after a Docker restart, Longhorn seems to do exactly that.
Thanks for the explanation.
Chris
We need to check whether Kubernetes asked for reattaching or not.
In this case, the engine crashed but we don't know why, so the volume was automatically detached. If Kubernetes asked for the volume to be attached, we should just do that. But my guess is that the Kubernetes VolumeAttachment didn't reflect that the volume was detached, so Kubernetes still thinks the volume is attached and thus forgoes the request to reattach it. That's the reason the pod got stuck. We're going to check whether that's the case.
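The stale-record hypothesis above can be sketched as a small piece of decision logic. This is an assumption-laden illustration, not Longhorn's or Kubernetes's actual code: it just shows why a VolumeAttachment that still reads "attached" suppresses the reattach request.

```python
# Hypothetical sketch of the stale-VolumeAttachment scenario described
# above. All logic here is illustrative, not real controller code.

def kubernetes_requests_reattach(va_says_attached: bool,
                                 volume_actually_attached: bool) -> bool:
    """Kubernetes only issues a new attach request when its own
    VolumeAttachment record says the volume is detached. If the record is
    stale (still 'attached') while the volume was auto-detached after the
    engine crash, no reattach request is ever made."""
    if volume_actually_attached:
        return False  # nothing to do, the volume is fine
    # A reattach request happens only when Kubernetes knows it is needed.
    return not va_says_attached
```

The stuck case from the report is `kubernetes_requests_reattach(True, False)`: the volume is really detached, but the stale record says attached, so the function returns `False` and the pod waits forever.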