sys-firewall unresponsive after suspend/resume #8139

@ghost

Description

Qubes OS release

R4.1.2 on a desktop PC (MSI Z690-A pro ddr4 with 13th gen CPU - using MSI's bios until Dasharo/coreboot supports Raptor Lake).

Running kernel-latest/6.1.12 (with 5.15.94 there is a huge display refresh lag and the machine can't resume after suspend). Everything works fine with kernel-latest except suspend/resume.

Brief summary

sys-firewall is unresponsive after resume, so qubes that use it as their netvm have no network connectivity. Some qvm-* commands work, others don't. There's nothing in the logs, and things get back to normal on their own after a few minutes.

Details

  • unlike typical resume problems:

    • sys-net doesn't have any issues after resume, it just works
    • most qvm-* commands in dom0 work
  • when sys-firewall is unresponsive:

    • qvm-ls shows sys-firewall isn't paused. Pausing it and unpausing it has no effect
    • there's nothing in /var/log/qubes/*sys-firewall*.log
    • xl console sys-firewall doesn't show anything unusual, but I can't log in (I can type the username, I guess because xl has "input feedback", but nothing happens after pressing the Enter key).
    • in a sys-firewall terminal, a command that was producing output before suspend (e.g. watch -n1 date) doesn't show anything after resume (and, obviously, nothing can be typed in the terminal).
    • in dom0: qvm-run work xterm works, but qvm-run -q -a --service -- work qubes.StartApp+xterm doesn't work (nothing happens)
    • in other qubes that use sys-firewall as netvm: ping someip always returns one successful echo reply, whatever the IP, with timeout=0ms (which is obviously not possible). ping then hangs (even with the -w or -W flags).
    • powering off the machine takes a very long time, which is typical when something has gone wrong with qubesd
  • after a while (I've just measured 9 minutes today), things magically get back to normal. When that happens there are the following log entries (which look pretty normal):

    • in dom0:
    dom0 qrexec-policy-daemon[2626]: qrexec: qubes.GetDate+: sys-firewall -> @default: allowed to dom0
    dom0 qrexec-policy-daemon[2626]: qrexec: qubes.GetDate+: work -> @default: allowed to dom0
    dom0 qrexec-policy-daemon[2626]: qrexec: qubes.GetDate+: dispBrowser -> @default: allowed to dom0
    
    • in sys-net:
    systemd[1]: Starting systemd-tmpfiles-clean.service - Cleanup of Temporary Directories...
    systemd[1]: Starting systemd-hostnamed.service - Hostname Service...
    systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
    systemd[1]: Finished systemd-tmpfiles-clean.service - Cleanup of Temporary Directories.
    systemd[1]: Started systemd-hostnamed.service - Hostname Service.
    
    • in sys-firewall:
    systemd-resolved[436]: Clock change detected. Flushing caches.
    systemd[1]: qubes-sync-time.service: Deactivated successfully
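
To put a number on the recovery delay (the 9 minutes above was measured by hand), a small dom0 helper along these lines could be used. This is only a sketch: the function name wait_responsive and the INTERVAL variable are made up for illustration, and the suggested probe (qvm-run --pass-io wrapped in coreutils timeout) is an assumption about what makes a reasonable "is the qube answering?" check, not something confirmed in this report.

```shell
#!/bin/sh
# Hypothetical helper: re-run a probe command until it succeeds, then print
# how many seconds that took. The probe is passed as arguments, so nothing
# Qubes-specific is hard-coded here.
wait_responsive() {
    start=$(date +%s)
    # Retry the probe, discarding its output, until it exits with status 0.
    until "$@" >/dev/null 2>&1; do
        # Pause between attempts; INTERVAL is an assumed knob (default 5 s).
        sleep "${INTERVAL:-5}"
    done
    end=$(date +%s)
    echo $((end - start))
}

# Possible use in dom0 right after resume (assumption: a trivial command run
# through qvm-run --pass-io only succeeds once sys-firewall answers qrexec;
# timeout keeps a hung call from blocking the loop):
#   wait_responsive timeout 10 qvm-run --pass-io sys-firewall true
```

The printed value is the elapsed time in whole seconds between starting the helper and the first successful probe, so it should be started as soon as possible after resume.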
    

Any ideas? I've been using Qubes since R3.0, and this is the first time I've had such an issue (usually sys-net is the culprit, and/or none of the qvm-* commands work, etc.).

(might be related to issue #8086, but in my case qvm-run vm cmd works)

Metadata

    Labels

    • C: kernel (This issue pertains to kernels in Qubes OS.)
    • P: default (Priority: default. Default priority for new issues, to be replaced given sufficient information.)
    • affects-4.1 (This issue affects Qubes OS 4.1.)
    • diagnosed (Technical diagnosis of this issue has been performed.)