Initial memory assignment strategy leads to PoD exhaustion (crashes domains without memory balloon drivers) #4135

Open

jpouellet opened this issue Jul 24, 2018 · 0 comments

jpouellet (Contributor) commented Jul 24, 2018

When the qvm-prefs memory value != maxmem (domain XML currentMemory != memory), Xen advertises maxmem worth of memory to the guest, but only allocates/assigns enough MFNs to back currentMemory worth of pages. The difference is made up by a pool of "Populate on Demand" (PoD) pages. More info at: https://blog.xenproject.org/2014/02/14/ballooning-rebooting-and-the-feature-youve-never-heard-of/
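
For concreteness, a minimal sketch of prefs that produce this split (the VM name "testvm" and the sizes, in MiB, are hypothetical):

```
# With these prefs, the generated domain XML ends up with
# memory (maxmem) = 4000 and currentMemory (memory) = 400, so Xen
# populates only ~400 MiB of backing pages and covers the remaining
# guest-visible RAM from the PoD pool.
qvm-prefs testvm memory 400
qvm-prefs testvm maxmem 4000
```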

This carries an implicit assumption: whenever memory != maxmem, the guest actually supports memory ballooning, will initialize its balloon driver, and will reserve maxmem - memory worth of memory that the guest then never uses. If that is not the case (e.g., the guest operating system does not support Xen memory ballooning) and maxmem > memory, the guest inevitably runs out of PoD pages and crashes:

```
(XEN) p2m_pod_demand_populate: Dom24 out of PoD memory! (tot=102385 ents=921600 dom24)
(XEN) domain_crash called from p2m-pod.c:1218
(XEN) Domain 24 (vcpu#0) crashed on cpu#3:
(XEN) ----[ Xen-4.8.3  x86_64  debug=n   Not tainted ]----
(XEN) CPU:    3
(XEN) RIP:    0008:[<ffffffff811e7bf0>]
(XEN) RFLAGS: 0000000000010206   CONTEXT: hvm guest (d24v0)
(XEN) rax: 0000000000001e00   rbx: ffff80002216cfe8   rcx: 00000000000001c0
(XEN) rdx: 0000000000899b80   rsi: 0000000000898d80   rdi: ffff80000b107000
(XEN) rbp: ffff80002216cd90   rsp: ffff80002216cd50   r8:  00007f7fffffc000
(XEN) r9:  0000000000000002   r10: 0000004000058848   r11: ffffffff811e7c20
(XEN) r12: ffff80000b106000   r13: 0000000000001e00   r14: ffff80002216cf58
(XEN) r15: 0000000000001e00   cr0: 0000000080010031   cr4: 00000000001406b0
(XEN) cr3: 00000001076ed000   cr2: 000000000087c5bc
(XEN) fsb: 000000026b735000   gsb: ffffffff81863ff0   gss: ffffffff81863ff0
(XEN) ds: 0023   es: 0023   fs: 0023   gs: 0023   ss: 0010   cs: 0008
(XEN) p2m_pod_demand_populate: Dom24 out of PoD memory! (tot=102385 ents=921600 dom24)
(XEN) domain_crash called from p2m-pod.c:1218
```

Unselecting "Include in memory balancing" in qubes-vm-settings greys out the maxmem UI and excludes the domain from qmemman balancing, but the generated config still advertises whatever maxmem value was set previously, so such domains still crash.
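
For reference, the CLI equivalent of unchecking that box (syntax assumed from the R4.0 qvm-service tool; "testvm" is hypothetical):

```
# Disable qmemman balancing for the VM, i.e. turn off the
# meminfo-writer service; equivalent to unchecking
# "Include in memory balancing" in qubes-vm-settings.
qvm-service --disable testvm meminfo-writer

# The previously-set maxmem pref is untouched, so the next start
# still advertises it to the guest:
qvm-prefs testvm maxmem
```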

One possible solution (which creates potentially undesired coupling between prefs and services): if "Include in memory balancing" is unselected (qvm-service meminfo-writer off), set the memory domain XML value to the memory qvm-pref instead of maxmem (as we currently do for VMs with PCI devices), or potentially omit the currentMemory domain XML value entirely.
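
One hedged way to check which variant a given domain actually got is to inspect the generated XML from dom0 (dumpxml is standard libvirt; the exact xen:/// URI and the VM name here are assumptions):

```
# Show the memory/currentMemory elements libvirt generated for a
# running domain. Under the proposed change, a VM with meminfo-writer
# off would show equal values (or no currentMemory element at all).
virsh -c xen:/// dumpxml testvm | grep -iE '<(current)?memory'
```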

Relevant libvirt docs for memory config at: https://libvirt.org/formatdomain.html#elementsMemoryAllocation

Relevant libvirt xml template code in core-admin at: https://github.com/QubesOS/qubes-core-admin/blob/c3d287a33cc0cf5c2038e715eb4da66d78d2703d/templates/libvirt/xen.xml#L5-L10

If anyone wants to reproduce this, OpenBSD is one such OS that does not (yet) support Xen memory ballooning.

Workaround: simply set memory and maxmem prefs to the same value.
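
E.g., for a hypothetical 2000 MiB VM:

```
# With memory == maxmem, libvirt emits matching memory/currentMemory
# values, Xen populates all guest RAM up front, and no PoD pool (and
# hence no balloon driver) is needed.
qvm-prefs testvm memory 2000
qvm-prefs testvm maxmem 2000
```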

@jpouellet jpouellet changed the title from Xen config generation allows PoD exhaustion (crashes domains without memory balloon drivers) to Initial memory assignment strategy leads to PoD exhaustion (crashes domains without memory balloon drivers) Jul 24, 2018

@andrewdavidwong andrewdavidwong added this to the Release 4.0 updates milestone Jul 25, 2018
