qmemman prevents creation of VMs with large amounts of RAM #1136
Can you provide an error message and/or the qmemman log from this problem?
Right now it's even giving me "ERROR: insufficient memory to start VM" for a 16GB HVM when dom0 has 56GB assigned (of which 38GB is completely free according to "free"). The log has this (not totally sure whether it's related, though):
dmesg and xl dmesg seem to say nothing relevant except "xen_balloon: Initialising balloon driver". The kernel is 3.19.8-100.fc20.x86_64, Xen is 4.4.2-7.fc20. Disabling memory balancing and setting dom0 memory manually to 8GB makes the error go away. After re-enabling memory balancing and letting dom0 grow back to 50GB, it worked once with a 16GB HVM, then failed with the same error, then worked again, then failed. When it works, it shows messages like this:
When it says "insufficient memory", no messages are shown. I can't reproduce it right now, but I also managed to get qvm-start to start the HVM, only for it to end up in ---sc--- state. A PV AppVM with 16GB RAM also fails with "insufficient memory". Killing and restarting qmemman doesn't seem to change things.
Ok, I think I know the cause.
Check the stubdomain log in such a case (
The "insufficient memory" error seems to be solved by raising MAX_TRIES in do_balloon, both for HVM and PV VMs. It should probably check whether it is making progress in freeing memory rather than use a fixed limit (and ideally a progress bar should be shown to the user). Right now on my machine, the ballooning rate for dom0 seems to be around 5GB/second, although I suppose this varies a lot between machines, states and configurations. Can't reproduce the crashed HVM at the moment.
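The progress-based check suggested above could look something like this sketch (`get_free_xen_memory` and `trigger_balloon` are hypothetical stand-ins for qmemman's internals, not its real API):

```python
import time

def do_balloon(request_bytes, get_free_xen_memory, trigger_balloon,
               no_progress_limit=3, poll_interval=0.1):
    """Balloon until request_bytes is free; give up only when no
    forward progress is made for several consecutive iterations."""
    prev_free = get_free_xen_memory()
    stalled = 0
    while True:
        free = get_free_xen_memory()
        if free >= request_bytes:
            return True                  # enough memory freed
        if free <= prev_free:
            stalled += 1                 # no progress this round
            if stalled >= no_progress_limit:
                return False             # give up only when truly stuck
        else:
            stalled = 0                  # progress made, reset the counter
        prev_free = free
        trigger_balloon(request_bytes - free)
        time.sleep(poll_interval)
```

Unlike a fixed MAX_TRIES, this keeps going as long as memory is actually being freed, so a slow machine ballooning tens of GB no longer hits an arbitrary iteration cap.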
Still, two problems happen when the VM size is very close to the maximum available memory:
So basically there seems to be an issue where an HVM with 52GB RAM fails to start, either with a xenlight internal error, or with the stubdomain just not appearing in "xl list". However, if I hack qmemman to add an extra 512MB to the memory request, then it starts (but 256MB is not enough). The stubdomain is 44 MB in "xl list". Once it is running, "xl list" reports 3443 MB of memory for dom0, but "free" reports only 2176664 kB total memory. When the large VM is stopped, the dom0 memory from "xl list" and "free" match again more closely. So basically:
On Fri, Oct 09, 2015 at 04:04:23AM -0700, qubesuser wrote:
Check /var/log/libvirt/libxl/VMNAME.log. And probably also stubdom log
Take a look at #959. Do you have
Best Regards,
On Fri, Oct 09, 2015 at 04:19:37AM -0700, qubesuser wrote:
Logs mentioned earlier should also show how much memory is needed for
Maybe something related to page tables or other in-kernel memory "metadata"?
Best Regards,
I think the problem behind the xenlight internal error is that Xen has memory overhead for VMs, but Qubes ignores it. At https://wiki.openstack.org/wiki/XenServer/Overhead there is a table showing that, for instance, a 122GB VM requires an extra 1GB. For small VMs, the extra few MBs that Qubes leaves free are enough, but that's not the case for large VMs. OpenStack uses the following code to estimate the overhead; Qubes should probably imitate it (from nova/virt/xenapi/driver.py):
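The nova estimate is roughly guest RAM times a per-MB factor, plus a per-vCPU cost, plus a fixed base, rounded up. A minimal sketch, with constants as they appeared in nova/virt/xenapi/driver.py around that time (treat them as approximations, not Xen guarantees):

```python
import math

# Per-VM memory overhead estimate, OpenStack-style (constants are
# approximations copied from nova's XenAPI driver of that era).
OVERHEAD_BASE = 3          # MB, fixed cost per domain
OVERHEAD_PER_MB = 0.00781  # MB of overhead per MB of guest RAM (~1/128)
OVERHEAD_PER_VCPU = 1.5    # MB per virtual CPU

def estimate_overhead_mb(memory_mb, vcpus):
    overhead = (memory_mb * OVERHEAD_PER_MB +
                vcpus * OVERHEAD_PER_VCPU +
                OVERHEAD_BASE)
    return int(math.ceil(overhead))

# A 120 GB VM with 8 vCPUs needs roughly 1 GB extra:
estimate_overhead_mb(120 * 1024, 8)  # -> 975
```

Because the per-MB term dominates for large guests, the few fixed MBs Qubes leaves free are swamped once VM sizes reach tens of GB.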
BTW, I have properly patched qmemman to check for progress; I should submit a pull request soon.
Pull request at marmarek/qubes-core-admin#6. I think I won't attempt to fix the missing overhead calculation myself, because it looks like qmemman might need to become aware of it, and I'm not quite sure how qmemman balancing works exactly. So, to sum up:
On Fri, Oct 09, 2015 at 04:57:28AM -0700, qubesuser wrote:
I think it would be enough to just include the overhead during VM
This doesn't solve anything about the "xl list"/"free" gap, but I think those
Thanks!
I'll add requesting 1% more memory than the VM has assigned.
May be fixed by limiting dom0 to 4GB. Even if there is some Linux kernel
Best Regards,
I added a pull request to try to tackle the memory overhead at marmarek/qubes-core-admin#7, on top of the code in the previous pull request. It tries to minimize the impact on the codebase by dividing the Xen total and free memory by the overhead factor, instead of multiplying everything else by it. I guess this should be enough, since transferring memory between VMs basically doesn't change the overhead, but I'm not completely sure. The memory gap issue makes this hard to test, since qmemman attempts to OOM dom0; trying to figure out what's happening there now.
On Fri, Oct 09, 2015 at 10:54:28AM -0700, qubesuser wrote:
Thanks :)
The question is whether that overhead is taken from VM memory, or added
Have you tried limiting dom0 memory with
Best Regards,
So the problem is that the kernel seems to have something like 1/64 of RAM + ~256MB of overhead, which is not counted as "normal" memory and which qmemman doesn't consider when computing prefmem for dom0, instead just multiplying by 1.3 and adding a fixed 350MB. This means that on machines with at least 32-64GB RAM and no dom0 memory limit, qmemman should consistently OOM dom0 when asked to balloon out the maximum possible memory. PV VMs have the same problem, although it seems that less RAM is reserved there, so the issue is less visible. The fundamental problem is that meminfo-writer should find out the amount of overhead RAM so that it can be accounted for in qmemman's calculations (either by having meminfo-writer add it to MemTotal, or by adding a new MemOverhead/MemReserved field that qmemman adds in). Also, the Linux kernel should be fixed so that it doesn't unnecessarily keep metadata around (I assume it's "struct page" structs) for memory that has not been plugged in. It seems quite possible that the Linux kernel already supports this (since doing otherwise is not that smart) but is not properly configured, or maybe the Xen developers failed to implement it properly. Setting dom0 maxmem obviously mitigates the issue, but it shouldn't be required for things to work.
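A rough model of the accounting described above, assuming the observed 1/64-of-maxmem plus ~256MB figures (these are empirical guesses from this thread, not a kernel-documented formula):

```python
# Rough model of the dom0 memory "gap": ~1/64 of maxmem goes to
# struct-page metadata plus a few hundred MB of fixed reservations.
def dom0_kernel_overhead_kb(maxmem_kb, fixed_reserved_kb=256 * 1024):
    return maxmem_kb // 64 + fixed_reserved_kb

# qmemman could add this to the MemTotal reported by meminfo-writer,
# so prefmem is computed against memory dom0 can actually use:
def corrected_memtotal_kb(reported_memtotal_kb, maxmem_kb):
    return reported_memtotal_kb + dom0_kernel_overhead_kb(maxmem_kb)
```

On a 64GB machine the modeled overhead is about 1.25GB, which matches the order of magnitude of the "xl list"/"free" gap reported above.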
1.3 is greater than 1/64, so theoretically it shouldn't be a problem. Am I missing something?
Since this looks like a Linux-specific thing, it should be handled by meminfo-writer. But since I don't know the exact formula, I don't want to introduce an estimation which could negatively affect users with smaller systems (the majority of them).
I think it is supported as memory hotplug (CONFIG_XEN_BALLOON_MEMORY_HOTPLUG), but the last time I tested it (AFAIR around 3.14), I got some crashes.
This is the configuration recommended by Xen best practices; actually, it is recommended to set a fixed memory size for dom0. And there is also an explanation why, along the lines of what we've seen here.
It's 1/64 of the maximum possible dom0 RAM, e.g. 1GB for 64GB RAM without a dom0_mem max set, plus around 300MB on my machine. The 1.3 factor is applied to the used RAM instead, so in a typical scenario where dom0 is using 1GB the margin will be 300MB. And then 300MB + the 350MB bonus = 650MB < 1.3GB, which results in either OOM or swapping if qmemman is asked to balloon dom0 down exactly to the minimum possible.
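The arithmetic above can be checked numerically for a 64GB machine (the 300MB figure is the empirical observation from this thread):

```python
GB_MB = 1024                                     # MB per GB
maxmem_mb = 64 * GB_MB                           # dom0 maxmem on a 64 GB box
overhead_mb = maxmem_mb // 64 + 300              # 1/64 of maxmem + ~300 MB observed
used_mb = 1 * GB_MB                              # typical dom0 usage
margin_mb = int(used_mb * 1.3) - used_mb + 350   # qmemman slack: 30% of used + 350 MB
# overhead_mb comes out to 1324 while margin_mb is only 657, so
# qmemman's slack cannot cover the kernel's hidden reservation.
```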
Yes; the problem is, I can't find a good way to get the info at the moment. In theory one just needs to get the domain size from Xen and subtract the total memory from /proc/meminfo, but the problem is that if ballooning is happening, this is racy.
On Fri, Oct 09, 2015 at 02:18:24PM -0700, qubesuser wrote:
Ah, indeed.
Still, setting dom0 max mem to 4GB seems to be a sensible move.
Looking at
The race at least can be detected, by reading the ballooning target before
Best Regards,
I think I found a way:
The value at /sys/devices/system/xen_memory/xen_memory0/info/current_kb is the "xl list" value, and it could be exported by meminfo-writer as "DomTotal" and used by qmemman instead of MemTotal when available.
On Fri, Oct 09, 2015 at 02:53:14PM -0700, qubesuser wrote:
Generally I think meminfo-writer should export just one number: used
Anyway, in its current shape, it should be added to "MemTotal". No new
Best Regards,
Pull request for meminfo-writer at marmarek/qubes-linux-utils#1. Haven't done much testing yet on all the patches combined.
It seems to work (after fixing marmarek/qubes-core-admin#7). There's a final minor issue: on my machine, "xl list" reports that the stubdomain takes 44 MB, but the QubesHVm code only reserves 32 MB, which should thus probably be raised. It would also probably be good to write a system that does automatic stress tests of VM creation/destruction (not planning to do it myself). Also added a commit to marmarek/qubes-linux-utils#1 to use 64-bit integers in meminfo-writer, so that it works on machines with more than 2 TB of RAM (of course other changes might be needed, since I assume no one has ever tested that).
On Fri, Oct 09, 2015 at 04:34:14PM -0700, qubesuser wrote:
Thanks!
That was changed recently, and indeed not all the places were updated.
Best Regards,
* origin/pr/6: Support large VMs by removing the fixed balloon iteration limit QubesOS/qubes-issues#1136
The Linux kernel has some memory overhead depending on maxmem. Dom0 isn't meant to use that much memory (most should be assigned to AppVMs), so on big systems this would be pure waste. QubesOS/qubes-issues#1136 Fixes QubesOS/qubes-issues#1313
Launching a 16GB HVM domain on a machine with 64GB RAM (and not many other VMs running) doesn't work, while an 8GB HVM seems to start reliably.
It seems that there is some issue in qmemman that makes it fail.
Disabling qmemman temporarily works around the issue.
For anyone with the same problem, here is a reliable way to do so: