Hugetlbfs allocations not completely freed by Qemu on kernels 6.6 to 6.8 #49269

Closed
abhocetabhac opened this issue Mar 13, 2024 · 1 comment
Labels: bug, needs-testing

abhocetabhac commented Mar 13, 2024

Is this a new report?

Yes

System Info

Void 6.6.x, 6.7.x and 6.8.x, musl, AMD Ryzen 5 3600, 32 GB RAM

Package(s) Affected

linux6.6 and linux6.7

Does a report exist for this bug with the project's home (upstream) and/or another distro?

I was unable to find anything on the web about the issue.

Expected behaviour

All allocated hugetlbfs memory should be freed upon the exit of the virtual machines, similar to the behavior observed with kernels up to 6.5.

Actual behaviour

With kernels 6.6 to 6.8, 1/4 of the memory allocated to Qemu virtual machines from the hugetlbfs filesystem is not freed upon their exit.

Steps to reproduce

I use hugetlbfs with 1GB huge pages as the memory backend of Qemu virtual machines. This worked fine up to kernel 6.5, but with kernels 6.6 to 6.8, 1/4 of the memory allocated to virtual machines is not freed when they exit.

My kernel command line is:

linux /boot/vmlinuz root=LABEL=0_voidlinux resume=LABEL=0_swap add_efi_memmap loglevel=7 rootfstype=ext4 rootflags=ro,noatime,data=ordered transparent_hugepage=never hugepagesz=1G hugepages=16 default_hugepagesz=2M transparent_hugepage=always hugepagesz=2M hugepages=2048 psi=1

Setting 'transparent_hugepage=always' or 'madvise' for 1GB huge pages does not affect the issue.

The hugetlbfs filesystem is mounted with the commands:

sysctl -w vm.hugetlb_shm_group=$(getent group kvm | cut -d: -f3)
mkdir -p -m 1777 /var/tmp/hugepages
mount -t hugetlbfs hugetlbfs /var/tmp/hugepages -o mode=1770,gid=$(getent group kvm | cut -d: -f3),pagesize=1g,size=100%
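
Before launching a VM, it can help to confirm that the mount took effect. The following is a minimal sketch that lists hugetlbfs mount points and their options from a mounts table; the function name list_hugetlbfs_mounts and its optional file argument are illustrative and not part of the original setup.

```sh
# List hugetlbfs mount points and their mount options.
# Reads a mounts table in /proc/mounts format (default: /proc/mounts).
# Usage: list_hugetlbfs_mounts [mounts_file]
list_hugetlbfs_mounts() {
    mounts_file="${1:-/proc/mounts}"
    # Fields per line: device mountpoint fstype options dump pass
    awk '$3 == "hugetlbfs" { print $2, $4 }' "$mounts_file"
}
```

With the mount above in place, /var/tmp/hugepages should appear in the output together with its options.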

The virtual machines are launched with commands like:

qemu-system-x86_64 -snapshot -drive file=vm.qcow2,if=virtio,index=0,media=disk,cache=directsync,discard=unmap,detect-zeroes=unmap -boot c -device virtio-net-pci,netdev=tap0,mac=52:66:66:00:00:30 -netdev tap,ifname=tap0,id=tap0,script=no,downscript=no -enable-kvm -machine type=q35,accel=kvm -cpu host,kvm=off -smp 4 -m 4G -object memory-backend-file,id=mem,size=4G,mem-path=/var/tmp/hugepages,share=on -numa node,memdev=mem -machine acpi=off -rtc clock=host -device virtio-vga,max_outputs=1 -device ich9-intel-hda -device hda-duplex -usb -device qemu-xhci,id=xhci -device virtio-tablet,wheel-axis=true -display default,show-cursor=on -device virtio-serial-pci -spice unix=on,addr=/tmp/qemu-spice-bb.sock,disable-ticketing=on -device virtserialport,chardev=spicechannel0,name=com.redhat.spice.0 -chardev spicevmc,id=spicechannel0,name=vdagent -monitor unix:/tmp/qemu-monitor-bb.sock,server,nowait -fsdev local,id=share0,path=/tmp/mnt-qemu,security_model=none,writeout=immediate -device virtio-9p-pci,fsdev=share0,mount_tag=share-mtt0 -daemonize

Or, for short,

qemu-system-x86_64 ... -enable-kvm -machine type=q35,accel=kvm -cpu host,kvm=off -smp 4 -m 4G -object memory-backend-file,id=mem,size=4G,mem-path=/var/tmp/hugepages,share=on -numa node,memdev=mem ...

One in four of the 1GB huge pages allocated for each virtual machine is never recovered: after exiting a virtual machine with 4GB of memory, 1GB remains allocated, and for a VM with 8GB, 2GB remain allocated. This can be observed with the command 'df -h'.
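
The one-in-four ratio can be written down as a small calculation. This is only a sketch of the arithmetic reported here; the helper name expected_leaked_pages is hypothetical, and the ratio has only been observed for the 4GB and 8GB cases, so results for other sizes are an assumption.

```sh
# Expected number of 1 GB huge pages left allocated after a VM exits,
# assuming the 1-in-4 ratio observed in this report holds.
expected_leaked_pages() {
    vm_pages="$1"    # VM memory size in 1 GB pages (e.g. 4 for -m 4G)
    echo $(( vm_pages / 4 ))
}

expected_leaked_pages 4   # prints 1
expected_leaked_pages 8   # prints 2
```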

After exiting virtual machines, the command

grep -R "" /sys/kernel/mm/hugepages/ /proc/sys/vm/*huge*

returns, for example,

/sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages:2048
/sys/kernel/mm/hugepages/hugepages-2048kB/resv_hugepages:0
/sys/kernel/mm/hugepages/hugepages-2048kB/surplus_hugepages:0
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages_mempolicy:2048
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages:2048
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_overcommit_hugepages:0
/sys/kernel/mm/hugepages/hugepages-1048576kB/demote_size:2048kB
/sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages:14
/sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages:0
/sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages:0
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages_mempolicy:16
grep: /sys/kernel/mm/hugepages/hugepages-1048576kB/demote: Permission denied
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages:16
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages:0
/proc/sys/vm/hugetlb_optimize_vmemmap:0
/proc/sys/vm/hugetlb_shm_group:24
/proc/sys/vm/nr_hugepages:2048
/proc/sys/vm/nr_hugepages_mempolicy:2048
/proc/sys/vm/nr_overcommit_hugepages:0
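
The number of pages still held can be derived from these counters as nr_hugepages minus free_hugepages. A minimal sketch follows; the function name pages_in_use and its optional directory argument are illustrative, and the default path is the 1 GB pool from the listing above.

```sh
# Print how many huge pages of one pool are still in use (total - free).
# pool_dir defaults to the 1 GB pool used in this report.
# Usage: pages_in_use [pool_dir]
pages_in_use() {
    pool_dir="${1:-/sys/kernel/mm/hugepages/hugepages-1048576kB}"
    total=$(cat "$pool_dir/nr_hugepages")
    free=$(cat "$pool_dir/free_hugepages")
    echo $(( total - free ))
}
```

For the values listed above (nr_hugepages:16, free_hugepages:14) this gives 2, even though no virtual machine is running any more.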

Attempting to forcefully free the memory with the following commands does not resolve the issue:

sudo umount /var/tmp/hugepages
sudo bash -c 'echo 0 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages'
sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches'
sudo bash -c 'echo 1 > /proc/sys/vm/compact_memory'

Then, the command

grep -R "" /sys/kernel/mm/hugepages/ /proc/sys/vm/*huge*

returns

/sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages:2048
/sys/kernel/mm/hugepages/hugepages-2048kB/resv_hugepages:0
/sys/kernel/mm/hugepages/hugepages-2048kB/surplus_hugepages:0
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages_mempolicy:2048
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages:2048
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_overcommit_hugepages:0
/sys/kernel/mm/hugepages/hugepages-1048576kB/demote_size:2048kB
/sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages:0
/sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages:0
/sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages:2
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages_mempolicy:2
grep: /sys/kernel/mm/hugepages/hugepages-1048576kB/demote: Permission denied
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages:2
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages:0
/proc/sys/vm/hugetlb_optimize_vmemmap:0
/proc/sys/vm/hugetlb_shm_group:24
/proc/sys/vm/nr_hugepages:2048
/proc/sys/vm/nr_hugepages_mempolicy:2048
/proc/sys/vm/nr_overcommit_hugepages:0

Attempting to recover all the 1 GB huge pages with the command

sudo bash -c 'echo 16 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages'

does not work either. In this example, the pool comes back with '/sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages:14', so the two 'locked' 1 GB huge pages are still not free.

abhocetabhac added the bug and needs-testing labels Mar 13, 2024
abhocetabhac changed the title from "Hugetlbfs allocations not completely freed by Qemu on kernels 6.6 and 6.7" to "Hugetlbfs allocations not completely freed by Qemu on kernels 6.6 to 6.8" Mar 26, 2024
abhocetabhac (Author) commented

Solved in kernel 6.9.1.
