Multipass can't delete LXD instances that go into Error state #2492

Closed
ricab opened this issue Mar 24, 2022 · 8 comments · Fixed by #1946

@ricab
Collaborator

ricab commented Mar 24, 2022

Describe the bug
When using the LXD backend, after a failed launch where the instance gets into an error state (112 from LXD, translated to unknown in multipass), multipass is unable to delete the instance:

$ multipass delete --purge --all
[2022-03-24T17:29:10.900] [error] [lxd request] Operation completed with error: (400) The instance cannot be cleanly shutdown as in Error status
delete failed: Operation completed with error: (400) The instance cannot be cleanly shutdown as in Error status

We would need to give LXD the equivalent of the --force flag in such cases.
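For reference, LXD's REST API takes a `force` field when changing an instance's state, which is the API-level counterpart of `lxc stop --force`. A minimal sketch in shell, assuming the `multipass` project; the function names are made up for illustration, and this is not how Multipass itself issues the call:

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Build the JSON body for PUT /1.0/instances/<name>/state asking for a forced stop.
force_stop_payload() {
    printf '{"action":"stop","force":true}\n'
}

# Print the API path for an instance's state in a project (default: multipass).
state_path() {
    printf '/1.0/instances/%s/state?project=%s\n' "$1" "${2:-multipass}"
}

# Send the request through `lxc query` (requires the LXD client to be installed).
force_stop() {
    lxc query --request PUT --data "$(force_stop_payload)" "$(state_path "$1")"
}
```

Calling `force_stop <instance>` would then ask LXD to stop the instance without a clean shutdown, after which a normal delete may go through.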

To Reproduce
Because of https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1935880 , a pretty reliable way to get an LXD VM into an error state is:

  1. multipass launch --cpus 2
  2. multipass delete --purge <instance> then fails as above
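To confirm on the LXD side that the launch left the instance in an error state, something like the following could be used. It is a sketch only: the `error_instances` helper is hypothetical, and the status is compared case-insensitively because the exact casing printed by `lxc list` is an assumption here.

```shell
#!/bin/sh
# Read "name,status" CSV lines on stdin and print the names whose status
# is "error" (case-insensitive comparison).
error_instances() {
    awk -F, 'tolower($2) == "error" { print $1 }'
}

# Usage (requires the LXD client):
#   lxc list --project multipass --format csv -c ns | error_instances
```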

Expected behavior
Multipass would be able to delete the instance.

@ricab ricab added the bug label Mar 24, 2022
@oscarmparedes

oscarmparedes commented May 7, 2022

I had the same issue and restarting lxd via snap fixed the issue.
Update: Although it initially seemed to work, I realised I was using multipass 1.9.1-rc.1+gcdd5686e, so I removed it via snap and installed the stable release, 1.9.0. I then tried launching a 2G instance, but it hung on creating, then its status was unknown, and then the issue you just described happened again.
The reason why I am running lxd driver is because I wanted to automatically assign an ip from my lan to every instance.

@aatrcoutinho

The same problem happened to me today. How can I remove the VM now?

@ricab
Collaborator Author

ricab commented Jul 26, 2022

Hi @aatrcoutinho, until we get this fixed, you can delete instances in LXD directly: lxc delete <instance-name> --project=multipass

@holta

holta commented Nov 5, 2022

> Hi @aatrcoutinho, until we get this fixed, you can delete instances in LXD directly: lxc delete <instance-name> --project=multipass

Lifesaver !! This worked for me too:

lxc delete deb12 --project=multipass --force
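A small wrapper can combine the two commands from this thread: try Multipass first and fall back to the forced LXD delete only if that fails. This is a sketch under assumptions: `purge_instance` and the `DRY_RUN` switch are invented here for illustration, and it does not check whether Multipass's own records need further cleanup afterwards.

```shell
#!/bin/sh
# Hypothetical wrapper: try Multipass first, then fall back to LXD with --force.
# Set DRY_RUN=1 to print the commands instead of running them.
purge_instance() {
    name=$1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "multipass delete --purge $name"
        echo "lxc delete $name --project=multipass --force"
        return 0
    fi
    # The LXD fallback only runs when the Multipass delete exits non-zero.
    multipass delete --purge "$name" || lxc delete "$name" --project=multipass --force
}
```

For example, `DRY_RUN=1 purge_instance deb12` prints the two commands without touching anything.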

Huge thanks, @ricab. Multipass accidentally does serious damage (blocking deletion of the VM, and even blocking the host machine from rebooting) when it effectively loses control of VMs like this one:

multipass launch -n deb12 https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2

Clarification: The above worked in recent weeks but no longer works today, painfully freezing out most Multipass actions. Even snap restart multipass.multipassd does not help. Not even a forced reboot of the entire host PC helped.

Hence the need to force the deletion of a Multipass VM using LXD, as Multipass itself is unfortunately not capable of reliably deleting its own VMs:

# multipass info deb12
info failed: ssh connection failed: 'Connection refused'

# multipass stop deb12
Stopping deb12 [2022-11-05T12:19:34.936] [error] [lxd request] Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/94e4dd5b-67a6-47d1-ae91-1083e0e96044/wait?project=multipass
[2022-11-05T12:19:34.937] [error] [lxd request] Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operationstop failed: Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/94e4dd5b-67a6-47d1-ae91-1083e0e96044/wait?project=multipass

# multipass delete deb12
[2022-11-05T12:35:55.949] [error] [lxd request] Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/8d951713-031f-461f-8512-93d907fa1d09/wait?project=multipass
[2022-11-05T12:35:55.949] [error] [lxd request] Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/8d951713-031f-461f-8512-93d907fa1d09/wait?project=multipass
delete failed: Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/8d951713-031f-461f-8512-93d907fa1d09/wait?project=multipass
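The timeouts above name the LXD operation that Multipass was waiting on. A sketch for pulling that UUID out of the error text, so the operation can be looked up on the LXD side (for instance with `lxc operation list` or `lxc operation show`, which I have not verified against this particular failure); the `operation_id` helper is hypothetical:

```shell
#!/bin/sh
# Print the first LXD operation UUID found in ".../operations/<uuid>/wait"
# within the text read from stdin.
operation_id() {
    sed -n 's#.*operations/\([0-9a-f-]\{36\}\)/wait.*#\1#p' | head -n 1
}

# Usage:
#   multipass delete deb12 2>&1 | operation_id
```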

This lifesaving trick should be better documented within https://multipass.run/docs e.g. at remove-an-instance or somewhere similar. Thanks, all!

Background:

@holta

holta commented Nov 5, 2022

To clarify the above, even this very latest Edge Channel version of Multipass cannot reliably delete its own VMs/instances:

# multipass --version
multipass   1.12.0-dev.379+g6cfdc875
multipassd  1.12.0-dev.379+g6cfdc875

# snap info multipass
...
  latest/edge:      1.12.0-dev.379+g6cfdc875 2022-10-30 (8154) 112MB -
installed:          1.12.0-dev.379+g6cfdc875            (8154) 112MB -

@ricab
Collaborator Author

ricab commented Nov 8, 2022

@holta, Multipass is designed to launch Ubuntu VMs, not Debian. You can try, of course, but you're on your own then. It might work, but it is no surprise that it doesn't.

@holta

holta commented Nov 8, 2022

Just FYI Debian 12 instances worked fine with Multipass in the past.

The much larger point (this ticket!) is that Multipass accidentally sabotages the host PC (preventing the "reboot" command from working, etc.) when any instance (Debian, Ubuntu or whatever; that's not the point) hits such everyday testing errors.

So your original request (thank you, @ricab) is quite important: Multipass indeed needs to evolve to be able to delete dysfunctional instances according to the instructions it provides, which for the moment are at https://multipass.run/docs/remove-an-instance

@holta

holta commented Apr 2, 2023

Just FYI this bug (or something extremely similar!) has become substantially more severe in the past ~10 days (seemingly manifesting in new ways, e.g. known-to-be-reliable Ubuntu instances often-and-intermittently cannot boot). I don't know why.

Certainly known-to-be-reliable Multipass instances (e.g. Ubuntu 22.04, Ubuntu 23.04 pre-releases, and others) frequently fail to start. FYI the Host PC is Ubuntu Server 22.04.

The only workaround I've found so far is to repeatedly reboot the Host PC (rebooting it twice is sometimes necessary) until the Multipass instance in question...finally boots cleanly.

Clarification 1: Running sudo snap restart multipass is not enough to resolve the problem. Full reboot(s) of the Host PC unfortunately appear to be the only way forward, so far.

Clarification 2: I just reverted from Multipass's "edge" channel (1.12.x) to its "beta" channel (1.11.1) to see whether that helps going forward.

# multipass version
multipass   1.11.1
multipassd  1.11.1

# lxd version
5.12
