
Multipass becomes unresponsive when purging multiple machines simultaneously #357

Closed
morphis opened this issue Aug 29, 2018 · 4 comments

morphis commented Aug 29, 2018

I was removing two machines from Multipass today and it became unresponsive when I called the purge command:

+ multipass delete ams0
+ multipass delete lxd0
+ multipass purge
purge failed: Socket closed
$ multipass purge
purge failed: Connect Failed
$ multipass ls
list failed: Connect Failed

After waiting a moment, Multipass was able to answer my requests again. Multipass seems to hang on parallel requests quite often: for example, a multipass delete <container name> will cause a concurrently executed multipass ls to hang until the delete operation has finished.

To improve the user experience, none of the multipass commands should hang or fail with a bare Connect Failed message. If certain operations cannot be executed in parallel, Multipass should surface that with a proper error message.

townsend2010 (Contributor) commented

@morphis,

Thanks for the report. Based on that, it actually seems like the daemon died and was then re-spawned by snapd. Is there anything in journalctl to shed light on what might have happened at that time?


morphis commented Aug 29, 2018

@townsend2010 That is what I thought too, but the multipassd daemon was alive, and the last message shown in the journal output was

Aug 29 08:08:33 saturn multipassd[22278]: gRPC listening on unix:/run/multipass_socket

I've verified that once the daemon was responsive again, it hadn't been restarted in between.

It feels a bit like the RPC endpoint in multipassd is single-threaded, or a lock somewhere is preventing requests from being processed in parallel.
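
To illustrate the hypothesis (this is a minimal, hypothetical sketch, not Multipass's actual code): if every RPC handler takes one daemon-wide lock, a quick list request queues up behind a slow delete that is already holding it, which would produce exactly the observed hang.

// Hypothetical illustration only. A single coarse mutex around handlers
// serializes them: "list" blocks until the slow "delete" releases the lock.
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex daemon_lock; // stands in for a daemon-wide lock

void handle_delete() {
    std::lock_guard<std::mutex> guard{daemon_lock};
    std::cout << "delete: started\n";
    std::this_thread::sleep_for(std::chrono::seconds(3)); // slow VM teardown
    std::cout << "delete: finished\n";
}

void handle_list() {
    std::lock_guard<std::mutex> guard{daemon_lock}; // blocks behind delete
    std::cout << "list: answered\n";
}

int main() {
    std::thread del{handle_delete};
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    std::thread ls{handle_list}; // issued while delete is in flight; hangs ~3s
    del.join();
    ls.join();
}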

townsend2010 self-assigned this Aug 29, 2018
townsend2010 (Contributor) commented

This could possibly be a race between a delete operation that had not yet fully completed and the subsequent purge operation working on the same data in the internal DB. I will look further into it.
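
A sketch of the kind of race meant here (all names invented for illustration; this is not Multipass's actual data model): if delete marks a record in the shared instance DB and purge erases marked records, a purge running between delete's steps pulls state out from under it.

// Hypothetical sketch of the suspected race: delete and purge both mutate
// the same instance record without coordination.
#include <map>
#include <string>

struct InstanceRecord {
    bool deleted = false; // delete marks this; purge erases the record
};

std::map<std::string, InstanceRecord> db; // shared in-memory instance DB

void do_purge() {
    for (auto it = db.begin(); it != db.end();) {
        if (it->second.deleted)
            it = db.erase(it); // invalidates state delete may still need
        else
            ++it;
    }
}

int main() {
    db["ams0"] = {};
    // Simulate the unlucky interleaving deterministically:
    db["ams0"].deleted = true; // delete, step 1: mark the record
    do_purge();                // purge runs before delete finishes...
    // ...delete's remaining steps now operate on an erased instance
}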

townsend2010 (Contributor) commented

This should be fixed by the recent asynchronous work put into multipassd.
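
One common pattern for such asynchronous work, sketched generically (this is not the actual multipassd change): long-running operations are handed off to worker threads so the thread accepting RPCs stays free to answer other requests, such as a list issued while a delete is still in flight.

// Generic sketch of offloading a slow operation to a worker thread.
#include <chrono>
#include <future>
#include <iostream>
#include <string>
#include <thread>

std::future<void> async_delete(const std::string& name) {
    return std::async(std::launch::async, [name] {
        std::this_thread::sleep_for(std::chrono::seconds(3)); // slow teardown
        std::cout << "deleted " << name << "\n";
    });
}

int main() {
    auto pending = async_delete("ams0");      // returns immediately
    std::cout << "list: still responsive\n";  // served while delete runs
    pending.get(); // wait for completion later (or poll per request)
}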
