Fix suspend netvm #146
Comments
marmarek assigned rootkovska on Mar 8, 2015
marmarek added the bug, C: core, and P: major labels on Mar 8, 2015
marmarek
Mar 8, 2015
Member
Comment by marmarek on 28 Mar 2011 10:55 UTC
Instead of using pci-detach, we can bring the interface down before suspend. For example, stop NetworkManager before suspend (with qvm-run) and start it again after resume.
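A minimal sketch of this idea as a pm-utils sleep hook in dom0. The VM name `netvm`, the `QVM_RUN` override variable, and the use of `qvm-run -u root` to run the service command as root are assumptions for illustration; this is not the code from the commits referenced in this ticket.

```shell
#!/bin/sh
# Hypothetical pm-utils hook (e.g. placed under /usr/lib64/pm-utils/sleep.d/).
# NETVM and QVM_RUN are illustrative names; override them to match your setup.
NETVM="${NETVM:-netvm}"
QVM_RUN="${QVM_RUN:-qvm-run}"

# Run a NetworkManager service action (stop/start) inside the net VM.
netvm_nm() {
    # Assumption: qvm-run accepts "-u root" to run the command as root.
    $QVM_RUN -u root "$NETVM" "service NetworkManager $1"
}

# pm-utils passes the sleep phase as the first argument.
case "${1:-}" in
    suspend|hibernate) netvm_nm stop ;;
    resume|thaw)       netvm_nm start ;;
esac
```

Setting `QVM_RUN=echo` turns the hook into a dry run that only prints the command it would have executed.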
marmarek assigned marmarek and unassigned rootkovska on Mar 8, 2015
Modified by joanna on 28 Mar 2011 10:59 UTC
marmarek added this to the Release 1 Beta 1 milestone on Mar 8, 2015
marmarek
Mar 8, 2015
Member
Comment by marmarek on 29 Mar 2011 09:24 UTC
Running pm-scripts in netvm fixes the problem.
marmarek
Mar 8, 2015
Member
Comment by marmarek on 29 Mar 2011 10:40 UTC
Done.
http://git.qubes-os.org/gitweb/?p=marmarek/core.git;a=commit;h=2bcbc1742ea68dae5b55d7e5cdb3b65a0befae4a
http://git.qubes-os.org/gitweb/?p=marmarek/core.git;a=commit;h=464337a24e1279b99dec2abfe6cd90d69ceeddcf
http://git.qubes-os.org/gitweb/?p=marmarek/core.git;a=commit;h=c2e0a84c222be070449c6d679ed43d0f0f48759e
marmarek closed this on Mar 8, 2015
marmarek
Mar 8, 2015
Member
Comment by joanna on 29 Mar 2011 22:51 UTC
The script doesn't work as expected!
First of all, the qvm-get-default-netvm script, under normal circumstances, returns firewallvm, not the actual hardware-attached netvm!
I think we should also retrieve all currently running netvms and apply the command to all of them, not just the default one (the user might, e.g., have two netvms for each hardware NIC).
Also, I see the following error in /var/log/pm-suspend.log:
/usr/lib64/pm-utils/sleep.d/01qubes-suspend-netvm resume suspend: /usr/lib64/pm-utils/sleep.d/01qubes-suspend-netvm: line 19: [: missing `]'
method return sender=:1.46 -> dest=:1.45 reply_serial=2
marmarek added the P: critical label and removed the P: major label on Mar 8, 2015
marmarek reopened this on Mar 8, 2015
marmarek
Mar 8, 2015
Member
Comment by joanna on 29 Mar 2011 23:07 UTC
After I manually hardcoded NETVM=netvm in the pm script, and also manually added the missing whitespace before the closing bracket, the script still doesn't work as expected. Specifically, my netvm consumes some 99% of CPU for about 30 seconds after resume, during which time it outputs lots of dramatic messages into its dmesg (MAC in deep sleep! Hardware Error! etc.). After these 30 seconds or so, it reinitializes the NIC and all is fine...
I'm using iwlagn.
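The bracket error reported for line 19 is the kind the shell prints when the closing `]` of a test is not a separate word. The snippet below is a generic illustration of that mistake and its fix, not the actual line from the hook; the `NETVM` variable and its value are placeholders.

```shell
#!/bin/sh
NETVM="netvm"   # placeholder value for illustration

# Broken: without a space, "]" is glued to the last operand, so the "["
# command never sees its closing argument and reports a missing "]".
#   if [ "$NETVM" = "netvm"]; then ...

# Fixed: "]" must be its own word, separated by whitespace.
if [ "$NETVM" = "netvm" ]; then
    echo "match"
fi
```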
marmarek
Mar 8, 2015
Member
Comment by joanna on 29 Mar 2011 23:10 UTC
In case you like some fetish:
[ 2410.492733] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.545825] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.593919] iwlagn 0000:00:01.0: Desc Time data1 data2 line
[ 2410.593924] iwlagn 0000:00:01.0: ADVANCED SYSASSERT (#-1515870816) 2779096480 0xA5A5A5A0 0xA5A5A5A0 2779096480
[ 2410.593927] iwlagn 0000:00:01.0: blink1 blink2 ilink1 ilink2
[ 2410.593929] iwlagn 0000:00:01.0: 0xA5A5A5A0 0xA5A5A5A0 0xA5A5A5A0 0xA5A5A5A0
[ 2410.593933] iwlagn 0000:00:01.0: CSR values:
[ 2410.593935] iwlagn 0000:00:01.0: (2nd byte of CSR_INT_COALESCING is CSR_INT_PERIODIC_REG)
[ 2410.593964] iwlagn 0000:00:01.0: CSR_HW_IF_CONFIG_REG: 0X00000000
[ 2410.593989] iwlagn 0000:00:01.0: CSR_INT_COALESCING: 0X0000ff00
[ 2410.594014] iwlagn 0000:00:01.0: CSR_INT: 0X20000000
[ 2410.594039] iwlagn 0000:00:01.0: CSR_INT_MASK: 0X00000000
[ 2410.594064] iwlagn 0000:00:01.0: CSR_FH_INT_STATUS: 0X00000000
[ 2410.594089] iwlagn 0000:00:01.0: CSR_GPIO_IN: 0X0000000f
[ 2410.594114] iwlagn 0000:00:01.0: CSR_RESET: 0X00000002
[ 2410.594139] iwlagn 0000:00:01.0: CSR_GP_CNTRL: 0X080003d0
[ 2410.594164] iwlagn 0000:00:01.0: CSR_HW_REV: 0X00000074
[ 2410.594189] iwlagn 0000:00:01.0: CSR_EEPROM_REG: 0X00000000
[ 2410.594214] iwlagn 0000:00:01.0: CSR_EEPROM_GP: 0X90000001
[ 2410.594240] iwlagn 0000:00:01.0: CSR_OTP_GP_REG: 0X00030001
[ 2410.594265] iwlagn 0000:00:01.0: CSR_GIO_REG: 0X00080040
[ 2410.594290] iwlagn 0000:00:01.0: CSR_GP_UCODE_REG: 0X00000000
[ 2410.594315] iwlagn 0000:00:01.0: CSR_GP_DRIVER_REG: 0X00000000
[ 2410.594341] iwlagn 0000:00:01.0: CSR_UCODE_DRV_GP1: 0X00000000
[ 2410.594366] iwlagn 0000:00:01.0: CSR_UCODE_DRV_GP2: 0X00000000
[ 2410.594391] iwlagn 0000:00:01.0: CSR_LED_REG: 0X00000018
[ 2410.594416] iwlagn 0000:00:01.0: CSR_DRAM_INT_TBL_REG: 0X00000000
[ 2410.594442] iwlagn 0000:00:01.0: CSR_GIO_CHICKEN_BITS: 0X27800200
[ 2410.594467] iwlagn 0000:00:01.0: CSR_ANA_PLL_CFG: 0X00000000
[ 2410.594492] iwlagn 0000:00:01.0: CSR_HW_REV_WA_REG: 0X0001001a
[ 2410.594517] iwlagn 0000:00:01.0: CSR_DBG_HPET_MEM_REG: 0X82000510
[ 2410.594520] iwlagn 0000:00:01.0: FH register values:
[ 2410.597907] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.647533] iwlagn 0000:00:01.0: FH_RSCSR_CHNL0_STTS_WPTR_REG: 0Xa5a5a5a0
[ 2410.651522] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.700440] iwlagn 0000:00:01.0: FH_RSCSR_CHNL0_RBDCB_BASE_REG: 0Xa5a5a5a0
[ 2410.704429] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.753333] iwlagn 0000:00:01.0: FH_RSCSR_CHNL0_WPTR: 0Xa5a5a5a0
[ 2410.757320] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.806298] iwlagn 0000:00:01.0: FH_MEM_RCSR_CHNL0_CONFIG_REG: 0Xa5a5a5a0
[ 2410.810286] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.858716] iwlagn 0000:00:01.0: FH_MEM_RSSR_SHARED_CTRL_REG: 0Xa5a5a5a0
[ 2410.862703] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.911470] iwlagn 0000:00:01.0: FH_MEM_RSSR_RX_STATUS_REG: 0Xa5a5a5a0
[ 2410.915459] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2410.964485] iwlagn 0000:00:01.0: FH_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0Xa5a5a5a0
[ 2410.968475] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2411.017786] iwlagn 0000:00:01.0: FH_TSSR_TX_STATUS_REG: 0Xa5a5a5a0
[ 2411.021772] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2411.070656] iwlagn 0000:00:01.0: FH_TSSR_TX_ERROR_REG: 0Xa5a5a5a0
[ 2411.074644] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2411.127341] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2411.178625] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2411.230143] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2411.277577] iwlagn 0000:00:01.0: Log capacity -1515870816 is bogus, limit to 512 entries
[ 2411.277579] iwlagn 0000:00:01.0: Log write index -1515870816 is bogus, limit to 512
[ 2411.277581] iwlagn 0000:00:01.0: Start IWL Event Log Dump: display last 20 entries
[ 2411.281572] iwlagn 0000:00:01.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0x080003D8
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.281572] iwlagn 0000:00:01.0: EVT_LOGT:2779096480:0xa5a5a5a0:2779096480
[ 2411.329946] iwlagn 0000:00:01.0: Hardware error detected. Restarting.
...
Again, after about 30 seconds of this dramatic complaining, the interface goes back to normal and is usable... But that happens with or without the new pm script. Perhaps there is still something wrong with this script?
marmarek
Mar 8, 2015
Member
Comment by joanna on 29 Mar 2011 23:18 UTC
Ha! Indeed, after I replaced the mysterious arguments to qvm-run (the command to be executed on the netvm) with plain stupid:
/etc/init.d/NetworkManager stop
and
/etc/init.d/NetworkManager start
Now it works! (Although the NM restart causes ip_forward to be zeroed...)
So, we need a solution that:
- Actually works! (NM start/stop seems to work, the other one not)
- Targets all running hardware-attached netvms in the system
- Takes care of ip_forward in case we decide to restart NM directly.
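One way the three requirements above could be combined, as a rough sketch only: loop over a list of net VMs, save ip_forward before NetworkManager is stopped, and write it back after NetworkManager is started again. The `NETVMS` list, the `QVM_RUN` override, the state file under `/var/run`, and `qvm-run -u root` are all assumptions; working out which running VMs actually have hardware attached is left out.

```shell
#!/bin/sh
# Hypothetical names: NETVMS (space-separated VM list) and QVM_RUN.
NETVMS="${NETVMS:-netvm}"
QVM_RUN="${QVM_RUN:-qvm-run}"

# Commands executed inside each net VM. The saved-state path is an assumption.
STOP_CMD='cat /proc/sys/net/ipv4/ip_forward > /var/run/ip_forward.saved; service NetworkManager stop'
START_CMD='service NetworkManager start; cat /var/run/ip_forward.saved > /proc/sys/net/ipv4/ip_forward'

# Apply one command string to every VM in NETVMS.
for_each_netvm() {
    for vm in $NETVMS; do
        $QVM_RUN -u root "$vm" "$1"
    done
}

# pm-utils passes the sleep phase as the first argument.
case "${1:-}" in
    suspend|hibernate) for_each_netvm "$STOP_CMD" ;;
    resume|thaw)       for_each_netvm "$START_CMD" ;;
esac
```

If NetworkManager resets ip_forward some time after it has started, the restore step may need a delay or a retry; that behavior is not verified here.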
marmarek
Mar 8, 2015
Member
Comment by marmarek on 31 Mar 2011 00:43 UTC
http://git.qubes-os.org/gitweb/?p=marmarek/core.git;a=commit;h=212fd13957fff8eac1d39dc610cbda4575e58826
marmarek closed this on Mar 8, 2015
marmarek
Mar 8, 2015
Member
Comment by rafal on 11 Apr 2011 13:16 UTC
Unfortunately, it seems that stopping NetworkManager does not guarantee that all interfaces are down. In one problem case the netvm uses eth0, and during pm-suspend NM prints
Apr 11 14:28:48 localhost NetworkManager[1358]: caught signal 15, shutting down normally.
Apr 11 14:28:49 localhost NetworkManager[1358]: (wlan0): now unmanaged
Apr 11 14:28:49 localhost NetworkManager[1358]: (wlan0): device state change: 3 -> 1 (reason 36)
Apr 11 14:28:49 localhost NetworkManager[1358]: (wlan0): cleaning up...
Apr 11 14:28:49 localhost NetworkManager[1358]: (wlan0): taking down device.
Apr 11 14:28:49 localhost NetworkManager[1358]: exiting (success)
Apparently, there is nothing about eth0. After resume, eth0 does not work; one needs to run ifconfig down, then up, manually.
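A sketch of taking every non-loopback interface down explicitly before suspend, rather than relying on NetworkManager to do it, intended to run inside the net VM. It uses `ip link set` in place of the ifconfig commands mentioned above. Interface names are read from `/sys/class/net`; the `SYS_NET` and `IP_CMD` override variables are hypothetical and exist only so the function can be dry-run.

```shell
#!/bin/sh
SYS_NET="${SYS_NET:-/sys/class/net}"   # where the kernel lists interfaces
IP_CMD="${IP_CMD:-ip}"                 # set to "echo" for a dry run

# Set every interface except loopback to the given state ("down" or "up").
set_all_links() {
    for path in "$SYS_NET"/*; do
        [ -e "$path" ] || continue      # directory may be empty
        dev=${path##*/}
        [ "$dev" = "lo" ] && continue
        $IP_CMD link set dev "$dev" "$1"
    done
}

# pm-utils passes the sleep phase as the first argument.
case "${1:-}" in
    suspend|hibernate) set_all_links down ;;
    resume|thaw)       set_all_links up ;;
esac
```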
marmarek modified the milestones: Release 1 Beta 2, Release 1 Beta 1 on Mar 8, 2015
marmarek added the P: minor label and removed the P: critical label on Mar 8, 2015
marmarek reopened this on Mar 8, 2015
marmarek
Mar 8, 2015
Member
Comment by marmarek on 11 Apr 2011 13:23 UTC
Apparently "service NetworkManager stop" and "service NetworkManager start" are also sufficient to repair eth0.
marmarek
Mar 8, 2015
Member
Comment by rafal on 14 Apr 2011 08:37 UTC
Actually, I believe that not bringing eth0 down may lead to a hang during suspend. I have just experienced it once; it never happened before with the pci-detach mechanism. Raising to major.
marmarek added the P: major label and removed the P: minor label on Mar 8, 2015
Modified by marmarek on 19 Apr 2011 10:47 UTC
marmarek
Mar 8, 2015
Member
Comment by marmarek on 19 Apr 2011 10:58 UTC
http://git.qubes-os.org/gitweb/?p=marmarek/core.git;a=commit;h=ae661a614814daaef6bcfdbb5e28d3e8236011a4
marmarek commented Mar 8, 2015
Reported by marmarek on 28 Mar 2011 10:46 UTC
After suspend, network driver cannot allocate PCI memory.
Migrated-From: https://wiki.qubes-os.org/ticket/146