Can't suspend with 3.1 (netvm kernel panic) #1644

Closed
talex5 opened this Issue Jan 16, 2016 · 12 comments

talex5 commented Jan 16, 2016

My laptop now often doesn't suspend when I close the lid. If I log back in and shut down net-vm manually, then it does suspend. The logs for net-vm show:

[11461.894148] ------------[ cut here ]------------
[11461.894168] WARNING: CPU: 3 PID: 38 at /home/user/rpmbuild/BUILD/kernel-4.1.13/linux-4.1.13/drivers/net/wireless/iwlwifi/iwl-trans.h:962 iwl_trans_pcie_wait_txq_empty+0x439/0x480 [iwlwifi]()
[11461.894179] Modules linked in: ccm fuse iptable_raw xt_nat xt_REDIRECT nf_nat_redirect xen_netback xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat ip6table_filter nf_conntrack ip6_tables arc4 iwlmvm mac80211 iwlwifi e1000e cfg80211 intel_rapl iosf_mbi ptp x86_pkg_temp_thermal rfkill pps_core coretemp crct10dif_pclmul crc32_pclmul crc32c_intel pcspkr xen_pcifront dummy_hcd udc_core xenfs xen_blkback u2mfn(O) xen_privcmd xen_blkfront
[11461.894233] CPU: 3 PID: 38 Comm: xenwatch Tainted: G        W  O    4.1.13-7.pvops.qubes.x86_64 #1
[11461.894241]  0000000000000000 0000000036ad0d4c ffff880010bdf888 ffffffff8176baff
[11461.894250]  0000000000000000 0000000000000000 ffff880010bdf8c8 ffffffff8109fe2a
[11461.894260]  ffff880010bdf8d8 0000000000000001 0000000000a02e38 ffff88000095c000
[11461.894270] Call Trace:
[11461.894278]  [<ffffffff8176baff>] dump_stack+0x45/0x57
[11461.894284]  [<ffffffff8109fe2a>] warn_slowpath_common+0x8a/0xc0
[11461.894291]  [<ffffffff8109ff5a>] warn_slowpath_null+0x1a/0x20
[11461.894303]  [<ffffffffa01cc299>] iwl_trans_pcie_wait_txq_empty+0x439/0x480 [iwlwifi]
[11461.894315]  [<ffffffffa02c6ce0>] iwl_mvm_prepare_mac_removal+0x80/0xc0 [iwlmvm]
[11461.894327]  [<ffffffffa02c6d47>] iwl_mvm_mac_remove_interface+0x27/0x110 [iwlmvm]
[11461.894348]  [<ffffffffa01fb370>] ieee80211_do_stop+0x540/0x930 [mac80211]
[11461.894357]  [<ffffffff816746a1>] ? dev_deactivate_many+0x211/0x260
[11461.894373]  [<ffffffffa01fb77a>] ieee80211_stop+0x1a/0x20 [mac80211]
[11461.894380]  [<ffffffff81647e35>] __dev_close_many+0xa5/0x110
[11461.894387]  [<ffffffff81647f38>] dev_close_many+0x98/0x140
[11461.894393]  [<ffffffff8164986d>] rollback_registered_many+0x10d/0x310
[11461.894399]  [<ffffffff81649af9>] unregister_netdevice_many+0x19/0x60
[11461.894414]  [<ffffffffa01fd112>] ieee80211_remove_interfaces+0x122/0x1b0 [mac80211]
[11461.894426]  [<ffffffffa01e48df>] ieee80211_unregister_hw+0x5f/0x120 [mac80211]
[11461.894435]  [<ffffffffa02cd102>] iwl_op_mode_mvm_stop+0x32/0x110 [iwlmvm]
[11461.894441]  [<ffffffffa01c086e>] _iwl_op_mode_stop.isra.12+0x2e/0x60 [iwlwifi]
[11461.894451]  [<ffffffffa01c15e0>] iwl_drv_stop+0x30/0x80 [iwlwifi]
[11461.894460]  [<ffffffffa01c50b1>] iwl_pci_remove+0x21/0x30 [iwlwifi]
[11461.894467]  [<ffffffff813f5ddf>] pci_device_remove+0x3f/0xc0
[11461.894475]  [<ffffffff814e86b7>] __device_release_driver+0x87/0x120
[11461.894482]  [<ffffffff814e8773>] device_release_driver+0x23/0x30
[11461.894489]  [<ffffffff813eed1c>] pci_stop_bus_device+0x9c/0xb0
[11461.894497]  [<ffffffff813eee16>] pci_stop_and_remove_bus_device+0x16/0x30
[11461.894505]  [<ffffffffa0021d45>] pcifront_backend_changed+0x5c5/0x6d1 [xen_pcifront]
[11461.894515]  [<ffffffff8147efbd>] xenbus_otherend_changed+0xad/0x110
[11461.894524]  [<ffffffff810e3d9a>] ? prepare_to_wait_event+0x8a/0xf0
[11461.894531]  [<ffffffff8177341f>] ? _raw_spin_unlock_irqrestore+0x1f/0x50
[11461.894539]  [<ffffffff8147e160>] ? unregister_xenbus_watch+0x1d0/0x1d0
[11461.894544]  [<ffffffff814814d3>] backend_changed+0x13/0x20
[11461.894552]  [<ffffffff8147e1b5>] xenwatch_thread+0x55/0x160
[11461.894558]  [<ffffffff810e3e00>] ? prepare_to_wait_event+0xf0/0xf0
[11461.894565]  [<ffffffff810bf3a8>] kthread+0xd8/0xf0
[11461.894570]  [<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0
[11461.894577]  [<ffffffff81773f22>] ret_from_fork+0x42/0x70
[11461.894582]  [<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0
[11461.894587] ---[ end trace c850fc74def63c20 ]---

talex5 commented Jan 16, 2016

Also, I can't restart net-vm after this (I have to reboot the laptop). Trying it from virsh shows:

virsh # list
 Id    Name                           State
----------------------------------------------------
 0     Domain-0                       running

virsh # start sys-net
error: Failed to start domain sys-net
error: Requested operation is not valid: PCI device 0000:00:19.0 is in use by driver xenlight, domain sys-net
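
For reference, here is a rough dom0 diagnostic sketch for this state - standard xl and sysfs commands, not anything confirmed in this thread - to see what is still holding the device named in the error:

#!/bin/sh
# Hypothetical diagnostic only: check who currently holds the NIC that
# libxl refuses to reassign.  BDF taken from the error message above.
BDF=0000:00:19.0

# Which dom0 driver (if any) is bound to the device?
readlink "/sys/bus/pci/devices/$BDF/driver"

# Is the device on Xen's assignable list?
sudo xl pci-assignable-list

# Does any running domain still have it attached?
for dom in $(sudo xl list | awk 'NR>1 {print $1}'); do
    echo "== $dom =="
    sudo xl pci-list "$dom"
done
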
marmarek commented Jan 17, 2016

The logs for net-vm show:

Is that log from the suspend attempt, or from shutting down sys-net? To me it looks like the latter...

Also try what happens when you suspend sys-net directly from libvirt:

virsh -c xen:/// dompmsuspend sys-net mem

(resume with dompmwakeup)
If that puts the domain to sleep correctly, there is some problem with the Qubes suspend scripts; otherwise, there is a kernel driver problem.
Do not try to suspend the whole system while sys-net is manually suspended - it will most likely fail.
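
A minimal sketch of that test, assuming only the dom0 commands that already appear in this thread (virsh -c xen:/// and sudo xl list):

#!/bin/sh
# Suspend sys-net via libvirt, check its state, then wake it again.
set -e

virsh -c xen:/// dompmsuspend sys-net mem
sleep 2
# A suspended domain should show 's' flags in the State column.
sudo xl list sys-net

virsh -c xen:/// dompmwakeup sys-net
sleep 2
# After wakeup it should be back to 'b' (blocked/idle) or 'r' (running).
sudo xl list sys-net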

@marmarek marmarek added this to the Release 3.1 milestone Jan 17, 2016

talex5 commented Jan 17, 2016

I don't remember exactly - I probably tried shutting it down after it failed to suspend. Here's an earlier panic from the log (the first one after booting):

[11122.388053] IPv6: ADDRCONF(NETDEV_CHANGE): wlp0s1: link becomes ready
[11122.765213] e1000e: enp0s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[11122.765284] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s0: link becomes ready
[11156.111919] wlp0s1: deauthenticating from 34:12:98:07:1f:e5 by local choice (Reason: 3=DEAUTH_LEAVING)
[11156.266497] e1000e: enp0s0 NIC Link is Down
[11156.576046] Freezing user space processes ... (elapsed 0.001 seconds) done.
[11156.577497] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[11156.579180] PM: freeze of devices complete after 0.457 msecs
[11156.579300] PM: late freeze of devices complete after 0.108 msecs
[11156.590825] PM: noirq freeze of devices complete after 11.513 msecs
[11156.591879] xen:grant_table: Grant tables using version 1 layout
[11156.591879] Using NULL legacy PIC
[11156.591946] PM: noirq thaw of devices complete after 0.415 msecs
[11156.591946] PM: early thaw of devices complete after 0.213 msecs
[11156.592363] PM: thaw of devices complete after 0.586 msecs
[11156.592441] Restarting tasks ... done.
[11157.080927] IPv6: ADDRCONF(NETDEV_UP): wlp0s1: link is not ready
[11157.081909] iwlwifi 0000:00:01.0: L1 Enabled - LTR Enabled
[11157.082618] iwlwifi 0000:00:01.0: L1 Enabled - LTR Enabled
[11157.145033] iwlwifi 0000:00:01.0: L1 Enabled - LTR Enabled
[11157.145640] iwlwifi 0000:00:01.0: L1 Enabled - LTR Enabled
[11157.168747] IPv6: ADDRCONF(NETDEV_UP): wlp0s1: link is not ready
[11157.169831] IPv6: ADDRCONF(NETDEV_UP): enp0s0: link is not ready
[11157.373161] IPv6: ADDRCONF(NETDEV_UP): enp0s0: link is not ready
[11157.446796] IPv6: ADDRCONF(NETDEV_UP): wlp0s1: link is not ready
[11160.309710] wlp0s1: authenticate with 34:12:98:07:1f:e5
[11160.316635] wlp0s1: send auth to 34:12:98:07:1f:e5 (try 1/3)
[11160.336237] wlp0s1: authenticated
[11160.337068] wlp0s1: associate with 34:12:98:07:1f:e5 (try 1/3)
[11160.338203] wlp0s1: RX AssocResp from 34:12:98:07:1f:e5 (capab=0x1011 status=0 aid=6)
[11160.348368] wlp0s1: associated
[11160.348470] IPv6: ADDRCONF(NETDEV_CHANGE): wlp0s1: link becomes ready
[11161.561367] e1000e: enp0s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[11161.561440] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s0: link becomes ready
[11405.978142] e1000e 0000:00:00.0 enp0s0: removed PHC
[11406.075363] e1000e: enp0s0 NIC Link is Down
[11406.194664] pcifront pci-0: Rescanning PCI Frontend Bus 0000:00
[11406.427478] wlp0s1: deauthenticating from 34:12:98:07:1f:e5 by local choice (Reason: 3=DEAUTH_LEAVING)
[11406.428443] iwlwifi 0000:00:01.0: Failed to wake NIC for hcmd
[11406.670692] iwlwifi 0000:00:01.0: Error sending MAC_CONTEXT_CMD: enqueue_hcmd failed: -5
[11406.670692] iwlwifi 0000:00:01.0: Failed to send MAC context (action:2): -5
[11406.670724] iwlwifi 0000:00:01.0: failed to update MAC 4c:34:88:55:cb:75
[11406.671766] iwlwifi 0000:00:01.0: Failed to wake NIC for hcmd
[11406.910392] iwlwifi 0000:00:01.0: Error sending TXPATH_FLUSH: enqueue_hcmd failed: -5
[11406.910405] iwlwifi 0000:00:01.0: Failed to send flush command (-5)
[11406.910405] iwlwifi 0000:00:01.0: flush request fail
[11408.912064] iwlwifi 0000:00:01.0: fail to flush all tx fifo queues Q 0
[11408.912090] iwlwifi 0000:00:01.0: Current SW read_ptr 5 write_ptr 6
[11408.913040] ------------[ cut here ]------------
[11408.913040] WARNING: CPU: 3 PID: 38 at /home/user/rpmbuild/BUILD/kernel-4.1.13/linux-4.1.13/drivers/net/wireless/iwlwifi/pcie/trans.c:1429 iwl_trans_pcie_grab_nic_access+0x110/0x120 [iwlwifi]()
[11408.913040] Timeout waiting for hardware access (CSR_GP_CNTRL 0x080403d8)
[11408.913040] Modules linked in: ccm fuse iptable_raw xt_nat xt_REDIRECT nf_nat_redirect xen_netback xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat ip6table_filter nf_conntrack ip6_tables arc4 iwlmvm mac80211 iwlwifi e1000e cfg80211 intel_rapl iosf_mbi ptp x86_pkg_temp_thermal rfkill pps_core coretemp crct10dif_pclmul crc32_pclmul crc32c_intel pcspkr xen_pcifront dummy_hcd udc_core xenfs xen_blkback u2mfn(O) xen_privcmd xen_blkfront
[11408.913040] CPU: 3 PID: 38 Comm: xenwatch Tainted: G           O    4.1.13-7.pvops.qubes.x86_64 #1
[11408.913040]  0000000000000000 0000000036ad0d4c ffff880010bdf4d8 ffffffff8176baff
[11408.913040]  0000000000000000 ffff880010bdf530 ffff880010bdf518 ffffffff8109fe2a
[11408.913040]  0000000000000200 ffff88000095c000 ffff88000095fcb0 ffff880010bdf5c8
[11408.913040] Call Trace:
[11408.913040]  [<ffffffff8176baff>] dump_stack+0x45/0x57
[11408.913040]  [<ffffffff8109fe2a>] warn_slowpath_common+0x8a/0xc0
[11408.913040]  [<ffffffff8109feb5>] warn_slowpath_fmt+0x55/0x70
[11408.913040]  [<ffffffffa01cd8a0>] iwl_trans_pcie_grab_nic_access+0x110/0x120 [iwlwifi]
[11408.913040]  [<ffffffffa01cb9fe>] iwl_trans_pcie_read_mem+0x3e/0xd0 [iwlwifi]
[11408.913040]  [<ffffffffa01cc034>] iwl_trans_pcie_wait_txq_empty+0x1d4/0x480 [iwlwifi]
[11408.913040]  [<ffffffff810c8c1b>] ? finish_task_switch+0x6b/0x1a0
[11408.913040]  [<ffffffffa02c548b>] iwl_mvm_mac_flush+0x14b/0x180 [iwlmvm]
[11408.913040]  [<ffffffffa02160bb>] __ieee80211_flush_queues+0xbb/0x1c0 [mac80211]
[11408.913040]  [<ffffffffa02161d3>] ieee80211_flush_queues+0x13/0x20 [mac80211]
[11408.913040]  [<ffffffffa02358a8>] ieee80211_set_disassoc+0x3f8/0x520 [mac80211]
[11408.913040]  [<ffffffffa0239bb3>] ieee80211_mgd_deauth+0xf3/0x190 [mac80211]
[11408.913040]  [<ffffffffa0200d78>] ieee80211_deauth+0x18/0x20 [mac80211]
[11408.913040]  [<ffffffffa00e899e>] cfg80211_mlme_deauth+0x9e/0x150 [cfg80211]
[11408.913040]  [<ffffffffa00e8c6b>] cfg80211_mlme_down+0x6b/0x90 [cfg80211]
[11408.913040]  [<ffffffff810d51ce>] ? account_entity_dequeue+0xbe/0xe0
[11408.913040]  [<ffffffffa00ecbcb>] cfg80211_disconnect+0x18b/0x1b0 [cfg80211]
[11408.913040]  [<ffffffff81004f72>] ? xen_mc_flush+0x172/0x1a0
[11408.913040]  [<ffffffffa00c1078>] __cfg80211_leave+0x98/0x160 [cfg80211]
[11408.913040]  [<ffffffffa00c116f>] cfg80211_leave+0x2f/0x50 [cfg80211]
[11408.913040]  [<ffffffffa00c14cb>] cfg80211_netdev_notifier_call+0x33b/0x5f0 [cfg80211]
[11408.913040]  [<ffffffff816c9b97>] ? inetdev_event+0x37/0x530
[11408.913040]  [<ffffffff810c047e>] notifier_call_chain+0x4e/0x80
[11408.913040]  [<ffffffff810c05b6>] raw_notifier_call_chain+0x16/0x20
[11408.913040]  [<ffffffff81647aa9>] call_netdevice_notifiers_info+0x39/0x70
[11408.913040]  [<ffffffff81647df1>] __dev_close_many+0x61/0x110
[11408.913040]  [<ffffffff81647f38>] dev_close_many+0x98/0x140
[11408.913040]  [<ffffffff8164986d>] rollback_registered_many+0x10d/0x310
[11408.913040]  [<ffffffff81649af9>] unregister_netdevice_many+0x19/0x60
[11408.913040]  [<ffffffffa01fd112>] ieee80211_remove_interfaces+0x122/0x1b0 [mac80211]
[11408.913040]  [<ffffffffa01e48df>] ieee80211_unregister_hw+0x5f/0x120 [mac80211]
[11408.913040]  [<ffffffffa02cd102>] iwl_op_mode_mvm_stop+0x32/0x110 [iwlmvm]
[11408.913040]  [<ffffffffa01c086e>] _iwl_op_mode_stop.isra.12+0x2e/0x60 [iwlwifi]
[11408.913040]  [<ffffffffa01c15e0>] iwl_drv_stop+0x30/0x80 [iwlwifi]
[11408.913040]  [<ffffffffa01c50b1>] iwl_pci_remove+0x21/0x30 [iwlwifi]
[11408.913040]  [<ffffffff813f5ddf>] pci_device_remove+0x3f/0xc0
[11408.913040]  [<ffffffff814e86b7>] __device_release_driver+0x87/0x120
[11408.913040]  [<ffffffff814e8773>] device_release_driver+0x23/0x30
[11408.913040]  [<ffffffff813eed1c>] pci_stop_bus_device+0x9c/0xb0
[11408.913040]  [<ffffffff813eee16>] pci_stop_and_remove_bus_device+0x16/0x30
[11408.913040]  [<ffffffffa0021d45>] pcifront_backend_changed+0x5c5/0x6d1 [xen_pcifront]
[11408.913040]  [<ffffffff8147efbd>] xenbus_otherend_changed+0xad/0x110
[11408.913040]  [<ffffffff810e3d9a>] ? prepare_to_wait_event+0x8a/0xf0
[11408.913040]  [<ffffffff8177341f>] ? _raw_spin_unlock_irqrestore+0x1f/0x50
[11408.913040]  [<ffffffff8147e160>] ? unregister_xenbus_watch+0x1d0/0x1d0
[11408.913040]  [<ffffffff814814d3>] backend_changed+0x13/0x20
[11408.913040]  [<ffffffff8147e1b5>] xenwatch_thread+0x55/0x160
[11408.913040]  [<ffffffff810e3e00>] ? prepare_to_wait_event+0xf0/0xf0
[11408.913040]  [<ffffffff810bf3a8>] kthread+0xd8/0xf0
[11408.913040]  [<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0
[11408.913040]  [<ffffffff81773f22>] ret_from_fork+0x42/0x70
[11408.913040]  [<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0
[11408.913040] ---[ end trace c850fc74def63bf3 ]---
[11409.152000] iwl data: 00000000: 80 18 39 0e 00 88 ff ff 00 00 00 00 00 00 00 00  ..9.............
[11409.391797] iwlwifi 0000:00:01.0: FH TRBs(0) = 0x5a5a5a5a
[11409.631974] iwlwifi 0000:00:01.0: FH TRBs(1) = 0x5a5a5a5a
[11409.872074] iwlwifi 0000:00:01.0: FH TRBs(2) = 0x5a5a5a5a
[11410.111891] iwlwifi 0000:00:01.0: FH TRBs(3) = 0x5a5a5a5a
[11410.350516] iwlwifi 0000:00:01.0: FH TRBs(4) = 0x5a5a5a5a
[11410.588520] iwlwifi 0000:00:01.0: FH TRBs(5) = 0x5a5a5a5a
[11410.826845] iwlwifi 0000:00:01.0: FH TRBs(6) = 0x5a5a5a5a
[11411.065141] iwlwifi 0000:00:01.0: FH TRBs(7) = 0x5a5a5a5a

I'll try the dompmsuspend thing next time it happens.

talex5 commented Feb 7, 2016

Suspending sys-net via virsh seems to work:

[tal@dom0 ~]$ virsh -c xen:/// dompmsuspend sys-net mem
Domain sys-net successfully suspended
[tal@dom0 ~]$ sudo xl list
Name                                        ID   Mem VCPUs      State   Time(s)
dom0                                         0  3906     4     r-----    4276.5
sys-net                                      1   298     4     ---ss-     365.4
mirage-firewall                              2    20     1     -b----     125.0
untrusted                                    7  1192     4     -b----    1057.4
comms                                        8  2289     4     -b----    1218.3
[tal@dom0 ~]$ virsh -c xen:/// dompmwakeup sys-net 
Domain sys-net successfully woken up
[tal@dom0 ~]$ sudo xl list
Name                                        ID   Mem VCPUs      State   Time(s)
dom0                                         0  3906     4     r-----    4278.0
sys-net                                      1   298     4     -b----     365.5
mirage-firewall                              2    20     1     -b----     125.0
untrusted                                    7  1192     4     -b----    1057.5
comms                                        8  2289     4     -b----    1218.5

However, closing the lid or suspending using the menu (Leave -> Sleep) just turns on the screen lock.

marmarek commented Feb 7, 2016

Check /var/log/messages after such a failed suspend attempt. Are there any failures related to the suspend services (qubes-suspend, systemd-suspend, etc.)? Also check the kernel messages in dom0...
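
Illustrative dom0 commands for those checks (a sketch only; exact log locations and unit names may differ on a given install):

# Suspend-related failures in the system log:
grep -iE 'suspend|qubes-suspend|systemd-suspend' /var/log/messages | tail -n 50

# State of the Qubes suspend hooks and the systemd suspend unit:
systemctl status qubes-suspend.service -l
journalctl -b -u systemd-suspend.service

# Kernel messages in dom0 around the failed attempt:
dmesg | tail -n 100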

talex5 commented Feb 7, 2016

Hmm... this is interesting:

[tal@dom0 log]$ systemctl status qubes-suspend.service -l
qubes-suspend.service - Qubes suspend hooks
   Loaded: loaded (/usr/lib/systemd/system/qubes-suspend.service; enabled)
   Active: deactivating (stop-post) since Sun 2016-02-07 16:22:32 GMT; 3h 10min ago
 Main PID: 8885 (code=exited, status=0/SUCCESS);         : 9060 (01qubes-sync-vm)
   CGroup: /system.slice/qubes-suspend.service
       └─control
     ├─9060 /bin/sh /usr/lib64/pm-utils/sleep.d/01qubes-sync-vms-clock resume suspend
     ├─9063 /usr/bin/python2 /usr/bin/qvm-run --all -u root date -s "Sun  7 Feb 16:22:35 GMT 2016"
     └─9091 /usr/lib/qubes/qrexec-client -d mirage-firewall root:date -s "Sun  7 Feb 16:22:35 GMT 2016" -e

Feb 07 16:22:35 dom0 51qubes-suspend-netvm[9014]: method return time=1454862155.181976 sender=:1.4 -> destination=:1.216 serial=13470 reply_serial=2
Feb 07 16:22:35 dom0 01qubes-sync-vms-clock[9060]: Syncing VMs clock to: Sun  7 Feb 16:22:35 GMT 2016
Feb 07 16:22:35 dom0 01qubes-sync-vms-clock[9060]: Running command on VM: 'sys-net'...
Feb 07 16:22:35 dom0 01qubes-sync-vms-clock[9060]: Running command on VM: 'untrusted'...
Feb 07 16:22:35 dom0 01qubes-sync-vms-clock[9060]: Running command on VM: 'comms'...
Feb 07 16:22:35 dom0 01qubes-sync-vms-clock[9060]: Running command on VM: 'personal'...
Feb 07 16:22:36 dom0 01qubes-sync-vms-clock[9060]: Running command on VM: 'dev'...
Feb 07 16:22:36 dom0 01qubes-sync-vms-clock[9060]: Running command on VM: 'mirage-firewall'...
Feb 07 18:38:29 dom0 systemd[1]: Starting Qubes suspend hooks...
Feb 07 19:31:16 dom0 systemd[1]: Starting Qubes suspend hooks...

The mirage-firewall logs show:

INF [qubes.rexec] Execute "root:date -s \"Sun  7 Feb 16:22:35 GMT 2016\"\000"
WRN [command] << Unknown command "date -s \"Sun  7 Feb 16:22:35 GMT 2016\""

(this seems different from the normal time-setting call, which looks like INF [qubes.rexec] Execute "root:QUBESRPC qubes.SetDateTime dom0\000")

So it appears that mirage-firewall received the request (but presumably replied with an error). qrexec-client was invoked with -e, which says it should exit after sending, so I'm not sure what it's waiting for. Perhaps it should time out by itself? Killing the stuck qrexec-client caused the system to suspend.
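
One possible stop-gap (a hypothetical sketch only, not the actual Qubes hook or its eventual fix) would be to bound the clock-sync broadcast with coreutils timeout, so a single stuck qrexec-client cannot block suspend indefinitely:

#!/bin/sh
# Hypothetical variant of the clock-sync sleep hook: give the broadcast a
# deadline, then carry on with suspend even if one VM never answers.
DATE="$(date)"
echo "Syncing VMs clock to: $DATE"
timeout 30 qvm-run --all -u root "date -s \"$DATE\"" \
    || echo "Clock sync timed out or failed; continuing with suspend" >&2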

marmarek commented Feb 7, 2016

Ouch, there should be a qubes.SetDateTime service call there... #1728

Anyway, -e means that it will not wait for the command to terminate and will not pass stdin/out/err, but it will still wait for qrexec-agent to report that the process was started. In the MSG_JUST_EXEC case, qrexec-agent should connect the "data" vchan and just send an exit code (zero if the process was started, non-zero in case of some error).
This probably wouldn't matter after fixing #1728.

talex5 commented Feb 7, 2016

Note that it normally works:

[tal@dom0 ~]$ cat clock-test.sh
#!/bin/sh
DATE=$(date)
echo "Syncing VMs clock to: $DATE"
exec qvm-run --all -u root "date -s \"$DATE\""
[tal@dom0 ~]$ ./clock-test.sh
Syncing VMs clock to: Sun  7 Feb 21:27:50 GMT 2016
Running command on VM: 'sys-net'...
Running command on VM: 'untrusted'...
Running command on VM: 'comms'...
Running command on VM: 'mirage-firewall'...
[tal@dom0 ~]$

Must be a race somewhere...
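
A per-VM variant of that script (a debugging sketch, with the VM names from above hard-coded) could help catch which qrexec call hangs when the race does hit:

#!/bin/sh
# Run the date command against each VM separately, with a timeout, so a
# single hung qrexec-client is visible and doesn't block the others.
DATE="$(date)"
for vm in sys-net untrusted comms mirage-firewall; do
    printf '%s: ' "$vm"
    if timeout 10 qvm-run -u root "$vm" "date -s \"$DATE\"" >/dev/null 2>&1; then
        echo ok
    else
        echo "failed or timed out"
    fi
done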

talex5 commented Feb 7, 2016

Other possibly relevant things in the mirage-firewall log:

A while before the 16:22:35 date command:

INF [qubes.rexec] Execute "root:QUBESRPC qubes.SetDateTime dom0\000"
WRN [qubes.rexec] Error setting up vchan (domain 0, port 513): Failure("Invalid ring order: 0")

(either left_order or right_order was read as 0 from the interface C struct, which seems odd)

There are also quite a few WRN [qubes.rexec] End-of-file while closing flow (with exit status 0) messages (closing a flow involves sending an empty stdout packet and the exit code - I guess this means dom0 closed the vchan without waiting).

talex5 referenced this issue in talex5/qubes-mirage-firewall on Feb 9, 2016: Invalid ring order: 0 #3 (open)

talex5 commented Feb 13, 2016

Here's the output after it failed to suspend today - this one doesn't seem to involve mirage-firewall:

[tal@dom0 ~]$ systemctl status qubes-suspend.service
qubes-suspend.service - Qubes suspend hooks
   Loaded: loaded (/usr/lib/systemd/system/qubes-suspend.service; enabled)
   Active: deactivating (stop-post) since Sat 2016-02-13 22:20:56 GMT; 4min 39s ago
  Process: 10172 ExecStop=/usr/lib64/pm-utils/sleep.d/52qubes-pause-vms resume suspend (code=exited, status=0/SUCCESS)
  Process: 10037 ExecStart=/usr/lib64/pm-utils/sleep.d/52qubes-pause-vms suspend suspend (code=exited, status=0/SUCCESS)
  Process: 10004 ExecStartPre=/usr/lib64/pm-utils/sleep.d/51qubes-suspend-netvm suspend suspend (code=exited, status=0/SUCCESS)
  Process: 10000 ExecStartPre=/usr/lib64/pm-utils/sleep.d/01qubes-sync-vms-clock suspend suspend (code=exited, status=0/SUCCESS)
 Main PID: 10037 (code=exited, status=0/SUCCESS);         : 10178 (51qubes-suspend)
   CGroup: /system.slice/qubes-suspend.service
       └─control
     ├─10178 /bin/sh /usr/lib64/pm-utils/sleep.d/51qubes-suspend-netvm resume suspend
     ├─10203 /usr/bin/python2 /usr/bin/qvm-run -u root --pass-io sys-net QUBESRPC qubes.SuspendPost dom0
     └─10208 /usr/lib/qubes/qrexec-client -d sys-net root:QUBESRPC qubes.SuspendPost dom0

Feb 13 22:19:48 dom0 51qubes-suspend-netvm[10004]: method return time=1455401989.322433 sender=:1.4 -> destination=:1.217 serial=11589 reply_serial=2
Feb 13 22:20:48 dom0 52qubes-pause-vms[10037]: Failed to suspend VM sys-net: internal error: Failed to suspend domain '1'
Feb 13 22:20:48 dom0 systemd[1]: Started Qubes suspend hooks.
Feb 13 22:20:56 dom0 systemd[1]: Unit qubes-suspend.service is not needed anymore. Stopping.
Feb 13 22:20:56 dom0 systemd[1]: Stopping Qubes suspend hooks...

net-vm log:

[13417.991589] IPv6: ADDRCONF(NETDEV_CHANGE): vif10.0: link becomes ready
[17581.753584] e1000e: enp0s0 NIC Link is Down
[17582.136464] Freezing user space processes ... (elapsed 0.001 seconds) done.
[17582.138047] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[17582.139894] PM: freeze of devices complete after 0.517 msecs
[17645.503308] timekeeping watchdog: Marking clocksource 'tsc' as unstable, because the skew is too large:
[17645.503319]  'xen' wd_now: 101813121264 wd_last: 101789ac7219 mask: ffffffffffffffff
[17645.503326]  'tsc' cs_now: be94e38c cs_last: 8cce560c232 mask: ffffffffffffffff

I had to reboot, as restarting the netvm seems to be impossible.

unman commented Apr 24, 2017

@talex5 Does this issue still arise for you in 3.2? If so, please update the milestone to 3.2. If not, please close.

talex5 commented Apr 25, 2017

Seems mostly OK with 3.2.

talex5 closed this Apr 25, 2017
