Cannot start VMs with initial memory too low #3581

Closed
qavenull opened this Issue Feb 13, 2018 · 6 comments


Qubes OS version:

R4.0

Affected TemplateVMs:

N/A


Steps to reproduce the behavior:

Reduce initial memory of a VM to a low value (for a fedora-26 template, 200 MB was low enough).
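
For reference, one way to set this from dom0 on R4.0 (the VM name testvm is illustrative):

```shell
# Lower the startup allocation to 200 MB, then try to start the VM.
# "memory" is the initial allocation; "maxmem" is the ballooning ceiling.
qvm-prefs testvm memory 200
qvm-start testvm
```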

Expected behavior:

Either the VM starts, or a popup telling me I need to add more memory is displayed.

Actual behavior:

The VM does not start; I get a popup saying 'Cannot connect to qrexec agent: No such process' (and the same message goes into the VM logs).

General notes:

I thought that I would not be able to reclaim any memory given to the VM at startup, so I tried to lower it... Since then I've noticed that having more memory than needed at VM startup isn't so bad.


Related issues:

taradiddles commented Feb 14, 2018

@andrewdavidwong - is it really a bug? It's somewhat expected that a VM without enough memory would crash. Granted, displaying the cause of the failure to connect to the qrexec agent would be really helpful, but is it technically feasible? Some out-of-memory errors can be seen in Xen's hypervisor.log (e.g. p2m_pod_demand_populate: Dom... out of PoD memory), but I doubt kernel panics, a missing qrexec agent, or other in-VM errors can be seen from dom0 (short of parsing the VM's console).

FWIW, the log files listed in Qubes Manager aren't the ones I usually look at when debugging: /var/log/qubes/vm-vmname.log, /var/log/xen/console/hypervisor.log, and /var/log/xen/console/guest-vmname-dm.log are missing...

Maybe a list of relevant log files should be displayed in the 'can't connect to qrexec' error message so that users would know where to look.
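
Something along these lines from dom0 would cover the usual suspects (the VM name testvm is illustrative; paths as above):

```shell
# Qubes' per-VM log, plus the Xen console/hypervisor logs mentioned above.
tail -n 50 /var/log/qubes/vm-testvm.log
tail -n 50 /var/log/xen/console/guest-testvm.log
grep -i 'pod memory' /var/log/xen/console/hypervisor.log
```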

@qavenull: out of curiosity, do you see anything in the logs mentioned above that is relevant to your VM when it fails to start?
BTW, the only reason (IMO) to decrease the minimum memory is when memory balancing isn't enabled (e.g. VMs with PCI passthrough like sys-net). Otherwise the balancing algorithm works pretty well (you'll notice that a lot of memory is allocated after the VM boots, but that it decreases when the whole system is under memory "pressure"; for instance, my sys-firewall VM drops from ~800M to ~200M when I have a lot of VMs started).
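
To see the balancing at work, a quick check from dom0 (domain names and numbers will of course vary):

```shell
# The "Mem" column shows each domain's current allocation in MB;
# rerun this while starting more VMs to watch running domains shrink.
sudo xl list
```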


qavenull commented Feb 14, 2018

I have nothing in /var/log/qubes/vm-vmname.log or /var/log/xen/console/hypervisor.log. However, /var/log/xen/console/guest-vmname-dm.log is more interesting:

log_oom_killer.txt

Two attempts are recorded in the file. The OOM killer is mentioned, and the last messages don't seem to appear in normal VM startups.

I had only checked the /var/log/qubes directory; now that I know I should look into /var/log/xen, debugging such problems will be easier! (Maybe displaying the list of files in the 'can't connect to qrexec' message is indeed enough to help people deal with such issues?)
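
A quick way to confirm OOM-killer activity in that console log (VM name illustrative):

```shell
# Kernel OOM events typically log "invoked oom-killer" or "Out of memory".
grep -iE 'oom|out of memory' /var/log/xen/console/guest-testvm.log
```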


marmarek (Member) commented Feb 14, 2018

> However, /var/log/xen/console/guest-vmname-dm.log is more interesting,

This file really looks like /var/log/xen/console/guest-vmname.log (no -dm).
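
To check which console logs Xen actually wrote for a given VM (testvm is illustrative; a -dm log should only appear for VMs that run a device model, i.e. HVM qubes):

```shell
# List the console log files for this VM in dom0.
ls -l /var/log/xen/console/ | grep testvm
```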


qavenull commented Feb 15, 2018

Oops, you're right, there's no -dm file.


andrewdavidwong (Member) commented Feb 15, 2018

> is it really a bug? It's somewhat expected that a VM without enough memory would crash. Granted, displaying the cause of the failure to connect to the qrexec agent would be really helpful, but is it technically feasible? Some out-of-memory errors can be seen in Xen's hypervisor.log (e.g. p2m_pod_demand_populate: Dom... out of PoD memory), but I doubt kernel panics, a missing qrexec agent, or other in-VM errors can be seen from dom0 (short of parsing the VM's console).

The absence of a sufficiently informative error message could be a UX bug. However, for the reasons you point out, it may be that this is the best we can do. I leave it to the experts to determine whether this is the case.


marmarta commented Jul 15, 2018

Unfortunately, it is the best we can do. The popup will now not vanish and will include the path to the log file (commit QubesOS/qubes-desktop-linux-manager@0e52962).


marmarta closed this Jul 15, 2018
