Volumes stay detached after docker restart on node #686

Closed
chrisbulgaria opened this issue Aug 17, 2019 · 4 comments
Labels: component/longhorn-manager (Longhorn manager, control plane), kind/bug

Comments

@chrisbulgaria

This relates to comments 8/9 in issue 375.
Rancher 2.2.7, Longhorn 0.5
Two nodes. On one node, Docker is stopped and then started again after about 10 minutes. All volumes that were previously assigned to that node stay detached forever.
Sometimes scaling the affected workload down and up helps, sometimes not.
What always helps is manually attaching the volume again.
The error messages are the same as in 375. Happy to provide details.
I think it's quite important to address this; it should be a fairly common scenario that Docker goes down for one reason or another without the whole node being rebooted.
Cheers,
Chris

@yasker yasker added component/longhorn-manager Longhorn manager (control plane) kind/bug labels Aug 19, 2019
@yasker
Member

yasker commented Aug 19, 2019

If we want to detect that the Docker daemon was restarted, we need a way to know that the restart is the reason the controller went down, like what we did for node restarts with the Kubernetes NodeBootID.
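
To illustrate the boot-ID idea mentioned above, here is a minimal sketch, not Longhorn's actual code. It assumes a boot ID was recorded when the engine started and compares it with the node's current boot ID. `EngineRecord`, `DownReason`, and `classify_engine_down` are hypothetical names introduced only for this example. Note that a Docker restart leaves the boot ID unchanged, which is why it cannot be told apart from any other crash this way.

```python
from dataclasses import dataclass
from enum import Enum


class DownReason(Enum):
    NODE_REBOOT = "node_reboot"
    UNKNOWN = "unknown"


@dataclass
class EngineRecord:
    # Hypothetical record: boot ID of the node as seen when the engine started.
    node_boot_id: str


def classify_engine_down(record: EngineRecord, current_boot_id: str) -> DownReason:
    """Guess why an engine went down by comparing boot IDs.

    A changed boot ID means the node itself rebooted. An unchanged boot ID
    only tells us the node stayed up; the cause (for example a Docker
    restart) remains unknown.
    """
    if record.node_boot_id != current_boot_id:
        return DownReason.NODE_REBOOT
    return DownReason.UNKNOWN
```

With this check alone, the Docker-restart case from this issue falls into `UNKNOWN`, so some additional signal would be needed to detect it.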

@chrisbulgaria
Author

Hmm, I’m a newbie here, but why would it hurt to always at least try to reattach when the restarting workload (typically a StatefulSet) asks for it, regardless of why the volume was detached?
Sometimes, when I manually scale the workload down and up again after a Docker restart, Longhorn seems to do exactly that.
Thanks for the explanation.
Chris

@yasker
Member

yasker commented Sep 4, 2019

@chrisbulgaria

We need to check if Kubernetes asked for reattaching or not.

In this case, the engine crashed but we don't know why, so the volume was automatically detached. If Kubernetes asks for the volume to be attached, we should just do that. But my guess is that Kubernetes's volume attachment didn't reflect that the volume was detached, so Kubernetes still thinks the volume is attached and therefore never sends a request to reattach it. That's why the pod got stuck. We're going to check whether that's the case.
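
The suspected mismatch can be sketched as a simple comparison between what Kubernetes believes and what the storage side reports. This is only an illustration under stated assumptions, not how Longhorn or Kubernetes implement it: `VolumeView`, its fields, and `needs_reattach` are hypothetical names, and the two inputs would have to come from the Kubernetes volume attachment record and the volume's real state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VolumeView:
    name: str
    # Hypothetical field: what Kubernetes' volume attachment record says.
    k8s_reports_attached: bool
    # Hypothetical field: "attached" or "detached", as the storage side sees it.
    actual_state: str


def needs_reattach(volume: VolumeView) -> bool:
    """True when Kubernetes believes the volume is attached but it is not.

    In that situation Kubernetes sends no new attach request, so the
    storage side would have to reattach on its own initiative.
    """
    return volume.k8s_reports_attached and volume.actual_state == "detached"


def find_stale_attachments(volumes: List[VolumeView]) -> List[str]:
    """Return the names of volumes whose recorded and actual states disagree."""
    return [v.name for v in volumes if needs_reattach(v)]
```

A volume that was auto-detached after an engine crash while its attachment record still says "attached" would be returned by `find_stale_attachments`; volumes that are consistently attached or consistently detached would not.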

@yasker yasker added this to the v0.7.0 milestone Sep 4, 2019
@yasker yasker added the kind/poc Potential feature request but need POC label Sep 23, 2019
@yasker yasker removed the kind/poc Potential feature request but need POC label Sep 24, 2019
@yasker yasker modified the milestones: v0.7.0, v0.8.0 Oct 22, 2019
@yasker yasker modified the milestones: v0.8.0, v0.7.0 Nov 14, 2019
@yasker
Member

yasker commented Nov 14, 2019

Done as a part of #851

@yasker yasker closed this as completed Nov 14, 2019