I have been running into an issue where QEMU/PANDA appears to hang right after I issue the command to start my VM: there is no output and nothing happens. This may be a race condition, as it only happens sometimes. I have narrowed the issue down to using a network tap with the vexpress-a9 ARM board. I am attaching a sample VM and script with which this behavior can be reproduced.
To reproduce this issue with vm_sample.tar.gz:
# first setup a network tap
sudo tunctl -t tap1 -u ${USER}
sudo ip link set tap1 up
sudo ip addr add 192.168.1.2/24 dev tap1
# extract sample VM
tar xvf vm_sample.tar.gz
# run PANDA, need sudo for net tap
sudo ./run.sh
I've written a simple bash script (run.sh) that starts the VM over and over again and exits when it fails to start. Attaching gdb to the PANDA pid when it hangs gives the following backtrace:
Attaching to process 40651
[New LWP 40653]
[New LWP 40655]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__lll_lock_wait (futex=futex@entry=0x7fab8fc91340 <qemu_global_mutex>,
private=0) at lowlevellock.c:52
52 lowlevellock.c: No such file or directory.
(gdb) bt
#0 __lll_lock_wait
(futex=futex@entry=0x7fab8fc91340 <qemu_global_mutex>, private=0)
at lowlevellock.c:52
#1 0x00007fab882490a3 in __GI___pthread_mutex_lock
(mutex=0x7fab8fc91340 <qemu_global_mutex>)
at ../nptl/pthread_mutex_lock.c:80
#2 0x00007fab8f2936bd in qemu_mutex_lock
(mutex=mutex@entry=0x7fab8fc91340 <qemu_global_mutex>)
at /home/user/panda/util/qemu-thread-posix.c:60
#3 0x00007fab8eed2e81 in qemu_mutex_lock_iothread ()
at /home/user/panda/cpus.c:1459
#4 0x00007fab8f28ffd3 in os_host_main_loop_wait (timeout=<optimized out>)
at /home/user/panda/util/main-loop.c:262
#5 main_loop_wait (nonblocking=<optimized out>)
at /home/user/panda/util/main-loop.c:540
#6 0x00007fab8f004334 in main_loop () at /home/user/panda/vl.c:1971
#7 0x00007fab8f0095c8 in main_aux
(argc=<optimized out>, argv=<optimized out>, envp=<optimized out>, pmm=PANDA_NORMAL) at /home/user/panda/vl.c:5070
#8 0x00007fab8e926083 in __libc_start_main (main=
0x55fc1ea37060 <main>, argc=27, argv=0x7ffe1298b038, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe1298b028)
at ../csu/libc-start.c:308
#9 0x000055fc1ea3709e in _start ()
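For readers without the attachment, a retry loop of the kind described above might look roughly like the sketch below. This is not the actual run.sh: the function name boot_once, the variables BOOT_MARKER, BOOT_TIMEOUT and LAST_PID, and the idea of detecting a successful boot by waiting for a marker string in the console output are all assumptions made for illustration. The PANDA command line is not reproduced here and has to be passed in as arguments.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a "boot until it hangs" loop; NOT the author's run.sh.
# boot_once, BOOT_MARKER, BOOT_TIMEOUT and LAST_PID are illustrative names.

# boot_once MARKER TIMEOUT CMD [ARGS...]
# Starts CMD in the background and polls its output for MARKER for up to
# TIMEOUT seconds. On success the process is stopped and 0 is returned.
# On timeout the process is left running (so gdb can attach), its PID is
# left in LAST_PID, and 1 is returned.
boot_once() {
    local marker=$1 timeout=$2 log i
    shift 2
    log=$(mktemp)
    "$@" >"$log" 2>&1 &
    LAST_PID=$!
    for ((i = 0; i < timeout * 10; i++)); do
        if grep -qF -- "$marker" "$log"; then
            kill "$LAST_PID" 2>/dev/null || true
            wait "$LAST_PID" 2>/dev/null || true
            rm -f "$log"
            return 0
        fi
        sleep 0.1
    done
    rm -f "$log"
    return 1
}

# Only loop when a command was given on the command line.
if [[ $# -gt 0 ]]; then
    attempt=1
    # "login:" is an assumed default marker; use whatever your guest prints.
    while boot_once "${BOOT_MARKER:-login:}" "${BOOT_TIMEOUT:-60}" "$@"; do
        echo "attempt $attempt booted"
        attempt=$((attempt + 1))
    done
    echo "attempt $attempt printed no marker; PID $LAST_PID left running"
fi
```

Invoked as the script name followed by the full emulator command line, this keeps booting until one attempt stays silent, then leaves that process alive. Note that if the command is wrapped in sudo, LAST_PID is the PID of sudo, and the emulator is its child.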
I'm getting the same behavior in both my local PANDA build and the pandare/panda:latest Docker container image. The issue only happens when using the vexpress-a9 QEMU ARM machine: if I switch to -M virt or -M versatilepb, it seems to boot fine every time. It also only happens with the tap network device: if I change the net config to -net nic -net user, it boots fine every time, whereas either -netdev tap or -net tap seems to trigger this behavior on the vexpress-a9 board. It happens whether the filesystem is an initramfs or a -drive device. I have reproduced this with several different versions of Linux targeting vexpress-a9.
We have run into this same issue but the fix still has some race conditions: #1232
Not sure if your issue is arriving at the race condition in the same way, but one workaround we had was to add an lpj= kernel argument. Just observe a boot without the network tap to grab a sane value from the console.
Interesting, this is probably a duplicate of your issue then. If I boot with lpj=2912256 it seems to boot reliably.
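For anyone else trying the workaround, here is a minimal sketch of pulling the value out of a saved console log. It assumes the kernel prints its usual delay-loop calibration line containing "(lpj=NNN)". The function name extract_lpj and the log file name good_boot.log are made up for illustration. As far as I know, presetting lpj makes the kernel skip the delay-loop calibration at boot, which would fit a timing-related hang, but that link is a guess.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract_lpj is an illustrative name, not part of PANDA.

# Read a kernel console log on stdin and print the first lpj value found
# in a "(lpj=NNN)" calibration message. Prints nothing if none is found.
extract_lpj() {
    sed -n '/(lpj=[0-9][0-9]*)/{s/.*(lpj=\([0-9][0-9]*\)).*/\1/p;q;}'
}

# Example use (good_boot.log is an assumed file name holding the console
# output of a boot that succeeded without the network tap):
#   lpj=$(extract_lpj < good_boot.log)
#   then add "lpj=$lpj" to the kernel command line passed via -append
```

The value is specific to the guest kernel and emulated board, so it is probably safest to take it from a good boot of the same image instead of copying a number from another setup.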