savevm hangs #523

Closed
Kojoley opened this issue Jan 8, 2020 · 12 comments

Comments

@Kojoley
Contributor

Kojoley commented Jan 8, 2020

It works with a year-old PANDA build, so it was most probably broken by a QEMU update.

@nathanjackson
Contributor

savevm seems to work okay for me; I actually use it quite often. Any chance you could share the guest image and configuration?

@Kojoley
Contributor Author

Kojoley commented Jan 28, 2020

The guest image is http://panda.moyix.net/~moyix/wheezy_panda2.qcow2, and I use loadvm, so I guess it is probably a QEMU snapshot migration issue.

@Kojoley
Contributor Author

Kojoley commented Feb 5, 2020

It hangs even without loadvm, when running just panda-system-i386 wheezy_panda2.qcow2 -serial stdio.

@AndrewFasano
Member

AndrewFasano commented Apr 6, 2020

Are you still having this issue? It sounds like your qcow might've gotten corrupted because that exact qcow is fairly well tested and I haven't had any issues with it lately. If you are still having problems, can you try downloading it again?

@Kojoley
Contributor Author

Kojoley commented Apr 6, 2020

A corrupted qcow should not hang the program; if it does, that is a bug :-)
sha1sum for wheezy_panda2.qcow2 is 1a944531ba019541a1242b9c2b450daf72518206

P.S. There are upstream commits about a savevm hang: https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04135.html
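
To compare a local copy of the image against a digest such as the one quoted above, something like the following could be used. This is only a sketch: verify_sha1 is a hypothetical helper name (not part of PANDA), and it assumes GNU coreutils sha1sum is available.

```shell
#!/usr/bin/env bash
# verify_sha1 FILE EXPECTED
# Hypothetical helper (not part of PANDA): returns 0 if the SHA-1 of FILE
# equals EXPECTED (lowercase hex), 1 on mismatch, 2 if FILE cannot be hashed.
verify_sha1() {
    local file=$1 expected=$2 out
    # Hash via stdin so the output always has the form "<digest>  -".
    out=$(sha1sum < "$file") || return 2
    # Keep only the digest (everything before the first space).
    [ "${out%% *}" = "$expected" ]
}

# Example (assumes the image is in the current directory):
# verify_sha1 wheezy_panda2.qcow2 1a944531ba019541a1242b9c2b450daf72518206 \
#     && echo "digest matches" || echo "digest differs or file unreadable"
```

A mismatch only shows that the local file differs from the one the digest was taken from; it does not say which copy is the intended one.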

@AndrewFasano
Member

We generally consider upstream QEMU issues out of scope for us to fix. QEMU bugs should be reported to the (much larger) QEMU dev team and once they're fixed upstream, we'll eventually get them merged into PANDA.

In cases like this where it looks like the bug may have been fixed upstream, you can backport those commits and make a PR if you'd like it to get fixed sooner. Otherwise, we'll get those changes merged in once we tackle #570.

@pcworld
Contributor

pcworld commented May 15, 2021

I can reproduce this in the official Docker image.

$ sudo docker images
REPOSITORY                      TAG       IMAGE ID       CREATED         SIZE
pandare/panda                   latest    dfa6dc8a896d   3 days ago      2GB
$ sudo docker run --rm -it pandare/panda bash
root@08c7f6ba3115:/# cd /tmp                                                                             
root@08c7f6ba3115:/tmp# wget --no-check-certificate https://panda.moyix.net/~moyix/wheezy_panda2.qcow2
…
root@08c7f6ba3115:/tmp# sha256sum wheezy_panda2.qcow2 
890cbcc65fbce7ef65f5260eca36009f94207ec684b2da21f12531f1b1906f28  wheezy_panda2.qcow2
root@08c7f6ba3115:/tmp# sha1sum wheezy_panda2.qcow2 
a3f437109634612cf3a834fd59c36f16379357b5  wheezy_panda2.qcow2   # note that this differs from the one posted above
root@08c7f6ba3115:/tmp# panda-system-i386 wheezy_panda2.qcow2 -curses

The system boots into grub and switches to graphic mode, as expected. Press Alt-2 to switch to the monitor, then run loadvm root.
At this point, the panda/qemu process appears to hang (for at least several minutes): there is no way to switch back to the VGA or serial console via Alt-num, SIGTERM does nothing, and only SIGKILL is able to kill the panda/qemu process.
Note that even an invalid tag, such as loadvm asdf, hangs the process. (savevm does too.)
I have tried applying the commits from https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04135.html ("block: fix 'savevm' hang with -object iothread"), but that has not helped.

The host OS (beneath the Docker) is Fedora 34.
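
Because the hung process reportedly ignores SIGTERM and only dies on SIGKILL, an unattended reproduction attempt can be bounded with GNU coreutils timeout, which can escalate from SIGTERM to SIGKILL after a grace period. A minimal sketch, assuming GNU timeout is installed; run_with_deadline is a hypothetical wrapper name, and the durations in the example are arbitrary:

```shell
#!/usr/bin/env bash
# run_with_deadline DEADLINE GRACE COMMAND [ARGS...]
# Hypothetical wrapper: send SIGTERM to COMMAND after DEADLINE seconds and,
# if it is still running GRACE seconds later, send SIGKILL.
run_with_deadline() {
    local deadline=$1 grace=$2
    shift 2
    timeout --kill-after="$grace" "$deadline" "$@"
}

# Example (not run here; image path and durations are assumptions):
# run_with_deadline 300 10 panda-system-i386 wheezy_panda2.qcow2 -serial stdio
# echo "exit status: $?"
```

GNU timeout documents exit status 124 when the command times out, and 128+9 (137) when SIGKILL had to be used, so a status of 137 would be consistent with the SIGTERM-resistant hang described above. This only bounds the run; with -curses the monitor command still has to be typed by hand.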

@AndrewFasano
Member

Thanks for the reproducibility details, pcworld. I wouldn't have expected the invalid tag to hang PANDA. That will make things easier to track down if I find the cycles to try fixing this.

@github-actions

Stale issue message

@github-actions

This issue has gone stale! If you believe it is still a problem, please comment on this issue or it will be closed in 30 days

@pcworld
Contributor

pcworld commented Oct 30, 2021

Regarding my comment #523 (comment) (which might be different from the original issue): I found out later that loadvm works fine when, instead of using -curses, the serial console is redirected to telnet (e.g., -serial telnet:127.0.0.1:4444,server,wait) and the monitor is put on stdio with -monitor stdio. Knowing this, it's not really a problem (though it should perhaps be documented if a fix is not feasible).
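
The workaround described above could be scripted roughly as follows. This is a sketch only: panda_args is a hypothetical helper, the option strings and port 4444 are taken from the comment above, and the image name in the example is assumed to be the one used earlier in the thread.

```shell
#!/usr/bin/env bash
# panda_args IMAGE [PORT]
# Hypothetical helper: print, one per line, the arguments for a
# panda-system-i386 run with the serial console on telnet and the
# monitor on stdio (i.e. without -curses).
panda_args() {
    local image=$1 port=${2:-4444}
    printf '%s\n' "$image" \
        -serial "telnet:127.0.0.1:${port},server,wait" \
        -monitor stdio
}

# Example (not run here):
# mapfile -t args < <(panda_args wheezy_panda2.qcow2)
# panda-system-i386 "${args[@]}"
# # In a second terminal: telnet 127.0.0.1 4444
# # Then, at the (qemu) prompt on stdio: loadvm root
```

With server,wait, QEMU waits for a telnet client to connect before the guest starts, which is why the second terminal is needed.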

@github-actions

This issue has gone stale! If you believe it is still a problem, please comment on this issue or it will be closed in 30 days
