xen-netfront crash when detaching network of a VM that uses two network interfaces. #2796
ptitdoc commented May 8, 2017

Qubes OS version (e.g., R3.2): R3.2
Affected TemplateVMs (e.g., fedora-23, if applicable): fedora-23

Expected behavior:
A PV guest VM is customized to use two network interfaces. When the PV guest is shut down or its netvm is changed, the current netvm should clean up the network properly.

Actual behavior:
When the PV guest's network is detached, the xendriverdomain service of the netvm (xl devd) crashes completely (core dump).

Steps to reproduce the behavior:
1/ Customize an AppVM to use two network interfaces and the default netvm
2/ Start the AppVM
3/ Change the AppVM's netvm to none

General notes:
This problem occurs only when the Qubes OS Xen configuration is modified. The problem occurs both when using two bridged interfaces and when using two interfaces with the vif-route-qubes vif configuration.

Related issues:
https://patchwork.kernel.org/patch/8092491/
Comments
marmarek commented May 8, 2017 (Member)
The xendriverdomain service crash is probably unrelated to the xen-netfront kernel driver problem. Do you have a backtrace of that crash?
andrewdavidwong added the bug and C: xen labels on May 9, 2017
andrewdavidwong added this to the Release 3.2 updates milestone on May 9, 2017
ptitdoc commented May 9, 2017
I will try to generate one as soon as systemd lets me generate a coredump.
ptitdoc commented May 10, 2017
May 10 10:01:06 sys-net kernel: bridge0: port 2(vif33.0) entered disabled state
May 10 10:01:06 sys-net NetworkManager[541]: <info> (vif33.0): link disconnected
May 10 10:01:06 sys-net kernel: bridge0: port 2(vif33.0) entered disabled state
May 10 10:01:06 sys-net kernel: device vif33.0 left promiscuous mode
May 10 10:01:06 sys-net kernel: bridge0: port 2(vif33.0) entered disabled state
May 10 10:01:06 sys-net audit: ANOM_PROMISCUOUS dev=vif33.0 prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
May 10 10:01:06 sys-net NetworkManager[541]: <info> (bridge0): bridge port vif33.0 was detached
May 10 10:01:06 sys-net NetworkManager[541]: <info> (vif33.0): released from master bridge0
May 10 10:01:06 sys-net NetworkManager[541]: <warn> (vif33.0): failed to disable userspace IPv6LL address handling
May 10 10:01:06 sys-net NetworkManager[541]: <warn> (vif33.1): failed to disable userspace IPv6LL address handling
May 10 10:01:06 sys-net audit[29086]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 pid=29086 comm="xl" exe="/usr/sbin/xl" sig=11
May 10 10:01:06 sys-net kernel: traps: xl[29086] general protection ip:7f207dc2b37d sp:7fff62256490 error:0 in libxenlight.so.4.6.0[7f207dc11000+91000]
May 10 10:01:06 sys-net abrt-hook-ccpp[29384]: Process 29086 (xl) of user 0 killed by SIGSEGV - dumping core
May 10 10:01:06 sys-net systemd[1]: xendriverdomain.service: Main process exited, code=dumped, status=11/SEGV
May 10 10:01:06 sys-net systemd[1]: xendriverdomain.service: Unit entered failed state.
May 10 10:01:06 sys-net audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=xendriverdomain comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
May 10 10:01:06 sys-net systemd[1]: xendriverdomain.service: Failed with result 'core-dump'.
May 10 10:01:07 sys-net abrt-server[29389]: Deleting problem directory ccpp-2017-05-10-10:01:06-29086 (dup of ccpp-2017-05-08-12:58:50-569)
[root@sys-net ccpp-2017-05-08-12:58:50-569]# cat core_backtrace
{ "signal": 11
, "executable": "/usr/sbin/xl"
, "only_crash_thread": true
, "stacktrace":
[ { "crash_thread": true
, "frames":
[ { "address": 140309223891837
, "build_id": "14b6785eb9d1706f54c3ea508ef8bff639bcfd31"
, "build_id_offset": 107389
, "function_name": "backend_watch_callback"
, "file_name": "/usr/lib64/libxenlight.so.4.6.0"
}
, { "address": 140309224107048
, "build_id": "14b6785eb9d1706f54c3ea508ef8bff639bcfd31"
, "build_id_offset": 322600
, "function_name": "watchfd_callback"
, "file_name": "/usr/lib64/libxenlight.so.4.6.0"
}
, { "address": 140309224114319
, "build_id": "14b6785eb9d1706f54c3ea508ef8bff639bcfd31"
, "build_id_offset": 329871
, "function_name": "afterpoll_internal"
, "file_name": "/usr/lib64/libxenlight.so.4.6.0"
}
, { "address": 140309224115170
, "build_id": "14b6785eb9d1706f54c3ea508ef8bff639bcfd31"
, "build_id_offset": 330722
, "function_name": "eventloop_iteration"
, "file_name": "/usr/lib64/libxenlight.so.4.6.0"
}
, { "address": 140309224116204
, "build_id": "14b6785eb9d1706f54c3ea508ef8bff639bcfd31"
, "build_id_offset": 331756
, "function_name": "libxl__ao_inprogress"
, "file_name": "/usr/lib64/libxenlight.so.4.6.0"
}
, { "address": 140309223934413
, "build_id": "14b6785eb9d1706f54c3ea508ef8bff639bcfd31"
, "build_id_offset": 149965
, "function_name": "libxl_device_events_handler"
, "file_name": "/usr/lib64/libxenlight.so.4.6.0"
}
, { "address": 4306657
, "build_id": "fdb209bbadba71e82f19d6a3060471da74fadb54"
, "build_id_offset": 112353
, "function_name": "main_devd"
, "file_name": "/usr/sbin/xl"
}
, { "address": 4225998
, "build_id": "fdb209bbadba71e82f19d6a3060471da74fadb54"
, "build_id_offset": 31694
, "function_name": "main"
, "file_name": "/usr/sbin/xl"
} ]
} ]
}
Core was generated by `/usr/sbin/xl devd --pidfile=/var/run/xldevd.pid'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f9c4972e37d in backend_watch_callback () from /lib64/libxenlight.so.4.6
[Current thread is 1 (Thread 0x7f9c49d2f8c0 (LWP 569))]
Missing separate debuginfos, use: dnf debuginfo-install xen-qubes-vm-4.6.4-26.fc23.x86_64
(gdb) bt
#0 0x00007f9c4972e37d in backend_watch_callback () from /lib64/libxenlight.so.4.6
#1 0x00007f9c49762c28 in watchfd_callback () from /lib64/libxenlight.so.4.6
#2 0x00007f9c4976488f in afterpoll_internal () from /lib64/libxenlight.so.4.6
#3 0x00007f9c49764be2 in eventloop_iteration () from /lib64/libxenlight.so.4.6
#4 0x00007f9c49764fec in libxl__ao_inprogress () from /lib64/libxenlight.so.4.6
#5 0x00007f9c497389cd in libxl_device_events_handler () from /lib64/libxenlight.so.4.6
#6 0x000000000041b6e1 in main_devd ()
#7 0x0000000000407bce in main ()
I'm checking if I can get more information from a xen debuginfo package.
ptitdoc commented May 10, 2017
Core was generated by `/usr/sbin/xl devd --pidfile=/var/run/xldevd.pid'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 search_for_guest (ddomain=0x7ffca1d026c0, domid=14) at libxl.c:4363
4363 if (dguest->domid == domid)
[Current thread is 1 (Thread 0x7f9c49d2f8c0 (LWP 569))]
(gdb) bt
#0 search_for_guest (ddomain=0x7ffca1d026c0, domid=14) at libxl.c:4363
#1 backend_watch_callback (egc=0x7ffca1d025f0, watch=0x7ffca1d026c8, watch_path=<optimized out>, event_path=<optimized out>)
at libxl.c:4549
#2 0x00007f9c49762c28 in watchfd_callback (egc=0x7ffca1d025f0, ev=<optimized out>, fd=<optimized out>, events=<optimized out>,
revents=<optimized out>) at libxl_event.c:577
#3 0x00007f9c4976488f in afterpoll_internal (egc=egc@entry=0x7ffca1d025f0, poller=poller@entry=0xe8c920, nfds=2, fds=0xe8cc20,
now=...) at libxl_event.c:1271
#4 0x00007f9c49764be2 in eventloop_iteration (egc=egc@entry=0x7ffca1d025f0, poller=0xe8c920) at libxl_event.c:1716
#5 0x00007f9c49764fec in libxl__ao_inprogress (ao=ao@entry=0xe8c890, file=file@entry=0x7f9c4977f737 "libxl.c",
line=line@entry=4689, func=func@entry=0x7f9c497823b0 <__func__.21345> "libxl_device_events_handler") at libxl_event.c:2001
#6 0x00007f9c497389cd in libxl_device_events_handler (ctx=<optimized out>, ao_how=ao_how@entry=0x0) at libxl.c:4689
#7 0x000000000041b6e1 in main_devd (argc=2, argv=0x7ffca1d02950) at xl_cmdimpl.c:8153
#8 0x0000000000407bce in main (argc=2, argv=0x7ffca1d02950) at xl.c:361
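For context: frame 0 above shows that search_for_guest() walks the driver domain's list of guest entries and compares each entry's domid (the quoted line 4363), so a fault on that comparison means the walk reached a stale or corrupted entry. The following is a loose Python model of that bookkeeping, purely illustrative: the real code is C in libxl.c, and the Guest/backend_event names here are invented.

    # Loose Python model of xl devd's guest bookkeeping (hypothetical names;
    # the real implementation is C in libxl.c).
    class Guest(object):
        def __init__(self, domid):
            self.domid = domid
            self.num_devices = 0          # active backend devices for this guest

    guests = []                           # one entry per guest served by this domain

    def search_for_guest(domid):
        # Mirrors the loop around libxl.c:4363: compare each entry's domid.
        for dguest in guests:
            if dguest.domid == domid:     # the comparison that faults in the core
                return dguest
        return None

    def backend_event(domid, added):
        # Invoked for each backend add/remove event watched by xl devd.
        dguest = search_for_guest(domid)
        if dguest is None:
            if not added:
                return                    # remove event for an unknown guest
            dguest = Guest(domid)
            guests.append(dguest)
        dguest.num_devices += 1 if added else -1
        if dguest.num_devices == 0:
            # The C code frees the entry at this point; any pointer still
            # referring to it afterwards would fault exactly like frame 0.
            guests.remove(dguest)

With two vifs per guest, teardown delivers two remove events for the same domid in quick succession, which may be where the real accounting goes wrong.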
marmarek commented May 10, 2017 (Member)
Thanks, that is exactly what I need :)
ptitdoc commented May 10, 2017
Thanks for your help.
marmarek commented May 12, 2017 (Member)
Anything else special about your setup? I cannot reproduce the crash or find what is wrong based on the above backtrace. Also, info locals on that coredump could be useful, both in this frame and after switching to the previous one (frame 1).
ptitdoc commented May 12, 2017
I essentially changed the behavior of AppVMs in order to bridge them to the netvm using two interfaces; however, the same crash occurs when I use two standard NATed vifs. Another change is that the second interface uses a new MAC address (I only increment the third byte of the MAC).
from qubes.qubes import QubesAppVm, register_qubes_vm_class, QubesVmLabels
from qubes.qubes import defaults
from qubes.qubes import QubesException, dry_run, vmm
import psutil
import sys


class QubesBridgeableVm(QubesAppVm):
    # In which order to load this VM type from qubes.xml
    load_order = 1000

    @property
    def type(self):
        return "BridgeableVM"

    bridge_template = """
        <interface type='bridge'>
            <source bridge='bridge0'/>
            <mac address='{mac}'/>
            <script path='vif-bridge'/>
            <backenddomain name='{backend}'/>
        </interface>
        """

    nat_template = """
        <interface type='ethernet'>
            <mac address='{mac}'/>
            <ip address='{ip}'/>
            <script path='vif-route-qubes'/>
            <backenddomain name='{backend}'/>
        </interface>
        """

    inibit_detach_bridge = False

    def _format_net_dev(self, ip, mac, backend):
        # If the netvm is a proxyvm, use the default network configuration
        if self.netvm is None or self.netvm.is_proxyvm():
            return QubesAppVm._format_net_dev(self, ip, mac, backend)
        # If the netvm is not a proxyvm, consider that we have to set up a bridge
        else:
            # Create libvirt VIF XML for two bridged VIFs.
            # Don't give the IP address, so that the Xen bridge scripts will
            # allow all forwarding (in and out) for our interface. If an
            # address is given, rules are set up to allow all outgoing traffic
            # but only incoming DHCP traffic.
            #   <ip address='{ip}'/>
            #   <source bridge='{bridge}'/>
            #   <bridge if='{bridge}'/>
            net_dev = self.bridge_template.format(ip=ip, mac=mac, backend=backend)
            net_dev += self.nat_template.format(ip=ip, mac=self.next_mac(self.mac), backend=backend)
            return net_dev

    def next_mac(self, mac):
        # Create a 2nd MAC address by incrementing the third MAC byte
        # (UID-based increments are already used to differentiate VMs)
        next_mac = mac.split(":")
        if next_mac[2].upper() == "FF":
            next_mac[2] = "00"
        else:
            next_mac[2] = chr(ord(next_mac[2].decode("hex")) + 1).encode("hex").upper()
        next_mac = ":".join(next_mac)
        return next_mac


register_qubes_vm_class(QubesBridgeableVm)
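For illustration, the increment implemented by next_mac() behaves as follows (a standalone sketch with a made-up sample address; 00:16:3E is the XenSource OUI):

    # Standalone sketch of the same MAC increment (illustrative values).
    def next_mac(mac):
        parts = mac.split(":")
        if parts[2].upper() == "FF":
            parts[2] = "00"                           # wrap around at 0xFF
        else:
            parts[2] = "%02X" % (int(parts[2], 16) + 1)
        return ":".join(parts)

    print(next_mac("00:16:3E:5E:6C:00"))              # -> 00:16:3F:5E:6C:00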
ptitdoc commented May 12, 2017
Of course, there is a bridge0 set up in the netvm (I use NetworkManager, but the same can be achieved with brctl).
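For reference, a minimal sketch of an equivalent manual bridge setup, driven from Python to match the snippet above (the uplink interface name is an assumption; requires root and the bridge-utils brctl tool):

    import subprocess

    def setup_bridge(bridge="bridge0", uplink="eth0"):
        # Create the bridge, enslave the uplink, and bring both interfaces up.
        subprocess.check_call(["brctl", "addbr", bridge])
        subprocess.check_call(["brctl", "addif", bridge, uplink])
        subprocess.check_call(["ip", "link", "set", uplink, "up"])
        subprocess.check_call(["ip", "link", "set", bridge, "up"])

    setup_bridge()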