[BUG] DRV stuck in attaching state when restoring is interrupted by rebooting attached node #1328
Comments
Validated after the fix for #1260 was merged; this issue is still reproducible.
This issue is almost the same as the reopened issue #1270. The fix for that issue would handle this.
@khushboo-rancher In this case, did you reboot the node that the engine or the replica is running on?
@yasker The node where the replica was running was rebooted.
@khushboo-rancher And it's not the node that the volume was attached to? If so, this is a different case from #1270. IIUC, #1270 only deals with engine detachment.
Yes, the rebooted node was not the attached one.
@yasker Ignore my previous comment. I am seeing different behavior in the latest master build: the volume no longer gets stuck in the attaching state.
@khushboo-rancher That's good news. So which case is this issue originally about? I want to make sure we have separate issues for separate cases.
Originally, the issue was that if a node running a replica (not the node the volume was attached to) rebooted while a restore was in progress, the volume stayed stuck in the attaching state forever. That case no longer has a problem, but if the attached node is rebooted, the volume gets stuck in the attaching/detaching state.
@khushboo-rancher Let's either open a new issue for the case of rebooting the attached node, or repurpose this issue for that. If we want to repurpose this issue, you can update the first comment to reflect that, make sure it's about
Manual test 1:
Manual test 2:
Rebooting the attached node causes the engine to crash unexpectedly. @khushboo-rancher Can you verify whether this issue is still reproducible after the PR gets merged?
Verified in the master build. Tested the reboot scenarios during both the initial and the incremental restore of a DR volume; they worked fine.
Describe the bug
Creating a DRV gets stuck in the attaching state when the volume restoration is interrupted by rebooting one of the nodes.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The DRV should be created with 2 replicas if 1 node is rebooting.
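For anyone re-running this check, a minimal sketch of how "stuck in attaching" could be detected automatically rather than by watching the UI. It is not taken from Longhorn: `wait_until_settled` and the `get_state` callable are hypothetical names, and how the volume state is actually read (API client, `kubectl`, etc.) is left to the caller. The state strings are only the ones mentioned in this thread, and the timeout is an arbitrary assumption.

```python
import time
from typing import Callable, Iterable


def wait_until_settled(
    get_state: Callable[[], str],
    transitional: Iterable[str] = ("attaching", "detaching"),
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a volume state until it leaves the transitional states.

    get_state is a hypothetical callable supplied by the caller; it should
    return the current volume state as a string. Returns the first
    non-transitional state seen, or raises TimeoutError if the volume is
    still in a transitional state once timeout_s has elapsed.
    """
    transitional = set(transitional)
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state not in transitional:
            return state
        if clock() >= deadline:
            raise TimeoutError(
                f"volume still in {state!r} after {timeout_s} seconds"
            )
        sleep(poll_s)
```

A test run would reboot the node mid-restore, then call `wait_until_settled` with a `get_state` that reads the DRV state; a `TimeoutError` corresponds to the stuck behavior described in this issue.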
Log
longhorn-support-bundle_e8d217c6-3283-4664-9018-a84ee301ee96_2020-05-12T22-10-40Z.zip
Environment: