Incorrect RAM usage report in containers #175

Closed
theredcat opened this Issue Feb 22, 2017 · 9 comments


Hello,

I have a problem with the lxcfs RAM report in containers. It seems that files cached from disk are shown as used memory.

Here is an example with a little one-liner that:

  • Lists all containers on the system
  • Runs lxc-attach to exec free -m and greps the used column of - buffers/cache (i.e. the real used memory)
  • Sums the results

Then, for comparison, the same value from free -m on the host.

14:12:56 root@hypervisor-01:~ # for i in $(lxc-ls -1); do lxc-attach -n $i -- free -m|grep 'buffers/cache' |awk '{print $3}'; done| awk '{s+=$1} END {printf "%.0f\n", s}'
97365
14:12:59 root@hypervisor-01:~ # free -m |grep 'buffers/cache' |awk '{print $3}'
46074
14:14:06 root@hypervisor-01:~ # lxc-ls --version
2.0.5
14:14:09 root@hypervisor-01:~ # lxcfs --version                                                                                                                                            
2.0.5
14:14:13 root@hypervisor-01:~ #

varuzam commented Mar 24, 2017

I have the same problem in all my LXC containers.

# free -h
             total       used       free     shared    buffers     cached
Mem:          5.9G       5.8G        52M       687M         0B       4.6G
-/+ buffers/cache:       1.2G       4.7G
Swap:           0B         0B         0B
# cat /proc/meminfo |grep -E 'Mem|Cached|Buff'
MemTotal:        6144000 kB
MemFree:           11956 kB
MemAvailable:      11956 kB
Buffers:               0 kB
Cached:          4914860 kB

MemAvailable should be roughly MemFree + Cached + Buffers.
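That rule of thumb can be sanity-checked with a small awk sketch. The snapshot below copies the numbers from the /proc/meminfo output in this comment; on a live system you would point awk at /proc/meminfo itself (the /tmp path is only a stand-in):

```shell
# Recompute an approximate MemAvailable as MemFree + Buffers + Cached.
# Sample snapshot of the fields posted above; substitute /proc/meminfo on a live host.
cat > /tmp/meminfo.sample <<'EOF'
MemTotal:        6144000 kB
MemFree:           11956 kB
Buffers:               0 kB
Cached:          4914860 kB
EOF
awk '/^MemFree:/  {free = $2}
     /^Buffers:/  {buf = $2}
     /^Cached:/   {cache = $2}
     END {printf "approx MemAvailable: %d kB\n", free + buf + cache}' /tmp/meminfo.sample
# → approx MemAvailable: 4926816 kB
```

That is several gigabytes more than the 11956 kB lxcfs reports, which matches the symptom.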

Contributor

tssge commented May 15, 2017

I think this is due to the kernel cgroup file memory.usage_in_bytes exporting the memory usage of the container as a whole, including cache. lxcfs uses this file to collect memory usage data. This behavior changed relatively recently in the kernel, so lxcfs was correct before, but now it needs to be fixed.

The cgroup file memory.stat should be used instead to parse these values. According to the kernel documentation, memory.stat is also more accurate than memory.usage_in_bytes, though I am not sure whether using it raises any performance concerns.

Please see https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt for more information. I am looking into working on a patch for this, but no guarantees.

Could someone more knowledgeable confirm if I am right on this?
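For illustration, a rough sketch of that approach on a cgroup v1 host could look like the following. The memory.stat field names come from the kernel document linked above, but the sample numbers and the /tmp path are made up; the real file lives under a path such as /sys/fs/cgroup/memory/lxc/<container>/memory.stat:

```shell
# Sketch: subtract the page cache reported in memory.stat from the usage that
# memory.usage_in_bytes would report, leaving the "real" used memory.
cat > /tmp/memory.stat.sample <<'EOF'
cache 5033840640
rss 1288490188
total_cache 5033840640
total_rss 1288490188
EOF
usage_in_bytes=6322330828   # illustrative memory.usage_in_bytes value (rss + cache)
cache=$(awk '$1 == "total_cache" {print $2}' /tmp/memory.stat.sample)
echo "usage without page cache: $((usage_in_bytes - cache)) bytes"
# → usage without page cache: 1288490188 bytes
```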

snowfag commented Jun 8, 2017

I have this same issue. Cached RAM on the host even caused an LXC container to start swapping because it thought 100% of its RAM was used.

theredcat commented Jun 8, 2017

I've upgraded to 2.0.7 and those issues are resolved.

Validation procedure to ensure the issue was indeed resolved:

  • Open term 1 with htop in the container => shows 100% used RAM (green bar) and 1% buffers/cache (yellow bar)
  • Update lxcfs
  • Open term 2 on the hypervisor and run kill -USR1 $LXCFS_PID
  • In term 1, htop instantly shows used RAM at 75% (green bar) and 25% buffers/cache (yellow bar)

I have the same issue with lxcfs 2.0.7 (CentOS 7, kernel 3.10.0-514.16.1.el7, Docker 1.11.1).

cat /proc/meminfo
MemTotal:        6291456 kB
MemFree:             376 kB
MemAvailable:        376 kB
Buffers:               0 kB
Cached:          1768584 kB
SwapCached:            0 kB
Active:          4628400 kB
Inactive:        1662128 kB
Active(anon):    4521944 kB
Inactive(anon):        0 kB
Active(file):     106456 kB
Inactive(file):  1662128 kB
Unevictable:           0 kB
Mlocked:          463948 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:              1664 kB
Writeback:             0 kB
AnonPages:      63824080 kB
Mapped:           579460 kB
Shmem:             14972 kB
Slab:               0 kB
SReclaimable:          0 kB
SUnreclaim:            0 kB
KernelStack:      120416 kB
PageTables:       202012 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    98901132 kB
Committed_AS:   128121380 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      857852 kB
VmallocChunk:   34258080768 kB
HardwareCorrupted:     0 kB
AnonHugePages:  59121664 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      293624 kB
DirectMap2M:    10096640 kB
DirectMap1G:    192937984 kB
Owner

hallyn commented Jul 14, 2017

Note that any results using Centos 7's 3.10 kernel cannot be trusted, since
that kernel does not properly convert ucreds across pid namespaces.

visit1985 commented Jul 20, 2017

I have a similar issue on Ubuntu 16.04.

lxd 2.14
lxcfs 2.0.7
kernel 4.4.0

container # grep -E 'Mem|^Cached|Buff' /proc/meminfo
MemTotal:       65925368 kB
MemFree:        37637724 kB
MemAvailable:   37637724 kB
Buffers:               0 kB
Cached:         24778996 kB
container # free
              total        used        free      shared  buff/cache   available
Mem:       65925368     3508832    37637540      159508    24778996    37637540
Swap:       3903484      990660     2912824

host # grep -E 'Mem|^Cached|Buff' /proc/meminfo
MemTotal:       65925368 kB
MemFree:         6461628 kB
MemAvailable:   54547988 kB
Buffers:         5199720 kB
Cached:         38429564 kB
host # free
              total        used        free      shared  buff/cache   available
Mem:       65925368    10464248     6461628      159508    48999492    54547988
Swap:       3903484      990664     2912820

MemAvailable is somehow reporting the same value as MemFree in lxcfs. It seems that Cached is not being included in MemAvailable.

Related to hishamhm/htop#599
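The symptom is easy to check for mechanically: in an affected container MemAvailable equals MemFree exactly even though Cached is large. A sketch using the container snapshot from this comment (the sample file path is illustrative; on a real container you would read /proc/meminfo):

```shell
# Detect the bug: MemAvailable == MemFree while Cached is non-zero.
cat > /tmp/meminfo.container <<'EOF'
MemFree:        37637724 kB
MemAvailable:   37637724 kB
Cached:         24778996 kB
EOF
awk '/^MemFree:/      {f = $2}
     /^MemAvailable:/ {a = $2}
     /^Cached:/       {c = $2}
     END {if (a == f && c > 0) print "bug present: MemAvailable ignores Cached"}' /tmp/meminfo.container
```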

towe75 commented Sep 20, 2017

Hi,

I am also affected by this problem on

ubuntu 16.04
lxcfs 2.0.7
lxc 2.0.8
kernel 4.4.0 (various patch levels)

By looking at binding.c in the 2.0 branch (and master) I can see that MemFree and MemAvailable indeed use the same calculation:

    } else if (startswith(line, "MemFree:")) {
            snprintf(lbuf, 100, "MemFree:        %8lu kB\n", memlimit - memusage);
            printme = lbuf;
    } else if (startswith(line, "MemAvailable:")) {
            snprintf(lbuf, 100, "MemAvailable:   %8lu kB\n", memlimit - memusage);
            printme = lbuf;
    }

@theredcat are you sure the issue is resolved for you in lxcfs 2.0.7? The code snippet clearly shows that there is no difference between the MemFree and MemAvailable calculations.

Should MemAvailable not add in Cached and Buffers, as @varuzam suggested above?
Those values are already available in this code block, so it seems simple to do.

Is this the correct approach?
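In shell arithmetic, the suggested change amounts to adding the cgroup's page cache back into the MemAvailable line while leaving MemFree alone. The variable names and kB values below are illustrative, not taken from the lxcfs source:

```shell
# Current code:  MemFree = MemAvailable = memlimit - memusage.
# Suggested:     MemAvailable = (memlimit - memusage) + cached (+ buffers).
memlimit=4194304     # cgroup memory limit in kB (illustrative)
memusage=4191816     # cgroup memory usage in kB (illustrative)
cached=2097152       # page cache charged to the cgroup, in kB (illustrative)
memfree=$((memlimit - memusage))
memavailable=$((memlimit - memusage + cached))
printf 'MemFree:        %8d kB\nMemAvailable:   %8d kB\n' "$memfree" "$memavailable"
```

With a nearly-full cgroup and a large cache, MemFree stays tiny while MemAvailable grows by the reclaimable cache, which is the behavior being asked for.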

Same here.

Ubuntu 16.04.3 LTS
lxd 2.18
lxcfs 2.0.7

container# grep -E 'Mem|^Cached|Buff' /proc/meminfo
MemTotal:        4194304 kB
MemFree:            4760 kB
MemAvailable:       4760 kB
Buffers:               0 kB
Cached:               12 kB

container# free
              total        used        free      shared  buff/cache   available
Mem:        4194304     4191804        2488       72532          12        2488
Swap:       8388604      155992     8232612

host# grep -E 'Mem|^Cached|Buff' /proc/meminfo
MemTotal:       32911468 kB
MemFree:        23752392 kB
MemAvailable:   31387156 kB
Buffers:          274976 kB
Cached:          7566692 kB

host# free
              total        used        free      shared  buff/cache   available
Mem:       32911468      889052    23752140       72532     8270276    31386912
Swap:       8388604      155988     8232616

And the container also starts swapping because it thinks 100% of its RAM is used.

cronnelly referenced this issue in lxc/lxd on Oct 26, 2017

Closed

OOM killers in LXC #3337

asokoloski added a commit to asokoloski/lxcfs that referenced this issue Dec 4, 2017

Change MemAvailable figure in /proc/meminfo to include cache memory --
Fixes #175 I think.

MemAvailable represents roughly how much more memory we can use before
we start swapping.  Page cache memory can be reclaimed if it's needed
for something else, so it should count as available memory.  This
change should also fix the "available" column of the "free" command,
as well as the "avail Mem" value in "top", both of which come from
MemAvailable.

Note that this isn't perfectly accurate.  On a physical machine, the
value for MemAvailable is the result of a calculation that takes into
account that when memory gets low (but before it's completely
exhausted), kswapd wakes up and starts paging things out.  See:

https://github.com/torvalds/linux/blob/a0908a1b7d68706ee52ed4a039756e70c8e956e9/mm/page_alloc.c#L4553
(si_mem_available function)

I tried to think of a way to be more exact, but this calculation
includes figures that we don't have available for a given cgroup
hierarchy, such as reclaimable slab memory and the low watermark for
zones.  So it's not really feasible to reproduce it exactly.

For a more detailed understanding how MemAvailable comes about one
should look at 34e431b0ae398fc54ea69ff85ec700722c9da773 in the Linux
kernel tree.

But anyway, since the kernel calculation itself is just an estimation,
it doesn't seem too bad that we're a little bit off.  Adding in the
amount of memory used for page cache seems much better than what we
were doing before (just copying the free memory figure), because that
can be wrong by gigabytes.

Signed-off-by: Aaron Sokoloski <asokoloski@gmail.com>

brauner closed this in ad19b86 on Dec 4, 2017

brauner added a commit that referenced this issue Dec 4, 2017

Merge pull request #228 from asokoloski/fix-175-c
Change MemAvailable figure in /proc/meminfo to include cache memory -- Fixes #175 I think.