Docker stats memory usage is misleading #10824

Closed
SamSaffron opened this issue Feb 16, 2015 · 37 comments

Comments

@SamSaffron commented Feb 16, 2015

# free -h
             total       used       free     shared    buffers     cached
Mem:          2.0G       1.9G        95M       273M        97M       879M
-/+ buffers/cache:       930M       1.0G
Swap:         1.0G        15M       1.0G
# docker stats app
CONTAINER           CPU %               MEM USAGE/LIMIT       MEM %               NET I/O
app                 51.20%              1.311 GiB/1.955 GiB   67.06%              58.48 MiB/729.5 MiB

So I have 1GB of RAM to play with, yet docker stats reports 1.3G used on a 2G system. It is counting pages used for disk caching as "used", which is technically true, but the OS will free those pages as soon as memory runs low.

Instead, stats should report (used - buffers/cache), with the buffers/cache figure in brackets.

Otherwise people relying on this information may be misled about how bad things really are ( http://www.linuxatemyram.com/ ).

Also, I recommend using standard unix suffixes here, so it's 1.311G not 1.311 GiB.
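
For illustration, here is a minimal Go sketch of that calculation under cgroup v1, reading usage from memory.usage_in_bytes and subtracting the cache line from memory.stat, then printing decimal units. The paths and the exact choice of fields are assumptions for the example, not Docker's actual implementation.

```go
// memused.go: sketch of "usage minus reclaimable cache" for one container.
// Assumes the cgroup v1 layout /sys/fs/cgroup/memory/docker/<full-id>/.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

func readUint(path string) (uint64, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
}

// cacheBytes returns the "cache" value from memory.stat.
func cacheBytes(path string) (uint64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	for s.Scan() {
		fields := strings.Fields(s.Text())
		if len(fields) == 2 && fields[0] == "cache" {
			return strconv.ParseUint(fields[1], 10, 64)
		}
	}
	return 0, s.Err()
}

func main() {
	base := "/sys/fs/cgroup/memory/docker/" + os.Args[1] // full container ID
	usage, err := readUint(base + "/memory.usage_in_bytes")
	if err != nil {
		panic(err)
	}
	cache, err := cacheBytes(base + "/memory.stat")
	if err != nil {
		panic(err)
	}
	// Report usage with the reclaimable page cache split out, in decimal units.
	gb := func(v uint64) float64 { return float64(v) / 1e9 }
	fmt.Printf("mem used: %.3fG (cache: %.3fG)\n", gb(usage-cache), gb(cache))
}
```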

cc @crosbymichael

@crosbymichael (Contributor) commented Feb 23, 2015

+1 I think it would make sense to make this change.

@blackfader commented Feb 26, 2015

Yes, the memory figure is not right.

@mattva01 commented Mar 22, 2015

@spf13
This is a little harder than it seems, as it requires modifying the libcontainer code here: https://github.com/docker/libcontainer/blob/master/cgroups/fs/memory.go
Do we use memory.stat's rss, memory.usage_in_bytes minus memory.stat's cache, or some other combination of stats...?
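
To make the candidates concrete, here is a small hedged Go sketch with illustrative numbers (not real measurements) showing how the cgroup v1 fields relate; field names follow the kernel's memory controller documentation.

```go
// candidates.go: compare the "memory used" figures being debated, using
// made-up example values in place of a real container's cgroup numbers.
package main

import "fmt"

func main() {
	usageInBytes := uint64(1342 << 20) // memory.usage_in_bytes (~1.311 GiB)
	rss := uint64(420 << 20)           // memory.stat "rss": anonymous memory held by processes
	cache := uint64(900 << 20)         // memory.stat "cache": page cache charged to the cgroup

	fmt.Printf("raw usage_in_bytes:       %4d MiB\n", usageInBytes>>20)
	fmt.Printf("candidate A, rss:         %4d MiB\n", rss>>20)
	fmt.Printf("candidate B, usage-cache: %4d MiB\n", (usageInBytes-cache)>>20)
	// The kernel documents usage_in_bytes as a fuzzy value that is roughly
	// rss + cache, so the two candidates usually land close together but are
	// not guaranteed to be identical.
}
```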

@jessfraz (Contributor) commented Mar 22, 2015

I switched to the proficient label


@imleon commented Mar 23, 2015

Try setting the --memory parameter on docker run, then check your
/sys/fs/cgroup/memory/docker/<container_id>/memory.usage_in_bytes
It should be right.

@resouer (Contributor) commented Mar 27, 2015

#dibs ping @wonderflow, I think this is what you are trying to fix. Just send a PR.

@nhsiehgit (Contributor) commented Mar 27, 2015

Hi @resouer @wonderflow,
Please remember to include the keywords closes or fixes once you've submitted your PR to help make tracking of fixes easier. Thanks!

@wonderflow (Contributor) commented Apr 3, 2015

"Also recommend using standard unix postfixes here, so it 1.311G not 1.311 GiB"
This is related using function HumanSize or BytesSize in size.go.

I'm not sure if I should change. @crosbymichael
cc @SamSaffron
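
For reference, a short sketch of the difference between those two formatters. This uses the present-day github.com/docker/go-units module, which is where docker's size.go helpers ended up, so treat the import path as an assumption; in 2015 the same functions lived under docker's pkg/units.

```go
package main

import (
	"fmt"

	units "github.com/docker/go-units"
)

func main() {
	const mem = 1.311 * 1024 * 1024 * 1024 // roughly 1.311 GiB, expressed in bytes

	// HumanSize formats with decimal (SI) units: kB, MB, GB, ...
	fmt.Println(units.HumanSize(mem)) // prints something like "1.408GB"
	// BytesSize formats with binary units: KiB, MiB, GiB, ...
	fmt.Println(units.BytesSize(mem)) // prints something like "1.311GiB"
}
```

The docker stats output quoted at the top of this issue uses the binary variant, which is why the values show up as GiB.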

@SamSaffron (Author) commented Apr 14, 2015

@wonderflow thanks for the fixes!

@wonderflow (Contributor) commented Jul 21, 2015

I can’t find your comment on github.

Can you tell me more details about it? Thanks.

On Jul 21, 2015, at 12:14 AM, Etienne Bruines notifications@github.com wrote:

Since this PR (#12172) was merged April 14th, and I'm running 1.6.2 (released May 13th), I believe I'm running a version that should have this fix.

However, docker stats still returns the memory usage including the buffers.

containername 0.00% 218.4 MiB/512 MiB 42.66% 17.22 KiB/130.1 KiB

$ free -m
             total       used       free     shared    buffers
Mem:           971        889         82         26         46
-/+ buffers:               842        129
Swap:         2047          0       2047
I like the fact that it's not saying the usage is 889M, but 889 - 842 != 218.

Any way this can be fixed?

(Posting in this issue because it feels like the same issue, not 100% fixed, rather than a new one.)



@pdericson (Contributor) commented Aug 13, 2015

Just want to point out that this has had me chasing phantom memory leaks for more hours than I care to admit ;)

top says "18.4" MB, docker stats says "192.9" MB.


Update:

Here's the part that worries me: if I set a memory limit of 32MB on the container, it looks like my container will be killed even though resident memory never goes above 18.4 MB.

@akailash commented Jan 13, 2017

Is this resolved? I have the same problem as @pdericson with my apps running in Docker containers (on my local machine): top shows 15MB in RES, while docker stats shows memory usage increasing slowly and linearly, climbing to around 250MB over 12 hours.
I am using Docker version 1.12.5.

@Sergeant007 commented May 13, 2017

Hi @farcaller, @MadMub

My comments are below (however I'm not a docker representative and not an expert in system internals).

Does that mean that page cache will account for the memory limit...

I think not. Memory limits (including the killing of oversized apps) are the responsibility of cgroups, not Docker; Docker kills nothing. And most probably the cgroups developers got this right and do not count disk cache against the limit.

we have been chasing real and phantom memory leaks over the past month... Or will the container always release the cache if the PID needs more memory?

Please share more information if you still have the problem. The container does not release the IO cache; cached memory is released by the operating system. cgroups merely attribute cached memory to each cgroup and keep statistics about it, and Docker just picks those statistics up and reports them (and, as this issue argues, cached memory should not be counted in the docker stats report).

P.S. The fix is still hanging in the PR.

yadutaf added a commit to yadutaf/ctop that referenced this issue Jul 27, 2017
Arguably, cache space is also reclaimable, meaning that, under pressure, it can be reclaimed instead of triggering an OOM. Whether to subtract it or not would be a debate in itself. I chose to do so here to follow Docker's lead. Even if this is not the best decision (and I have no opinion on that), doing the opposite would introduce confusion, and confusion rarely does good.

See: moby/moby#10824
yadutaf added a commit to yadutaf/ctop that referenced this issue Jul 27, 2017
yadutaf added a commit to yadutaf/ctop that referenced this issue Jul 28, 2017
@coolljt0725 (Contributor) commented Nov 29, 2017

@theonlydoo If I understand correctly, the memory usage in docker stats is read straight from the container's memory cgroup: you can see the value matches the 490270720 you read from cat /sys/fs/cgroup/memory/docker/665e99f8b760c0300f10d3d9b35b1a5e5fdcf1b7e4a0e27c1b6ff100981d9a69/memory.usage_in_bytes, and the limit is the memory cgroup limit set with -m when you create the container. The RES and memory cgroup statistics differ: RES does not take caches into account, but the memory cgroup does, which is why MEM USAGE in docker stats is much higher than RES in top.

Hope this helps :)
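
To see the same numbers programmatically, here is a hedged sketch using the Docker Engine API via the official Go client; the type and field names follow recent client releases (they may have moved between api/types and api/types/container in newer versions), and the "cache" key assumes cgroup v1.

```go
// statsgap.go: fetch a container's memory stats through the Docker API and
// show how much of the reported usage is page cache.
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func main() {
	ctx := context.Background()
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}
	resp, err := cli.ContainerStats(ctx, os.Args[1], false) // one-shot, no streaming
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var v types.StatsJSON
	if err := json.NewDecoder(resp.Body).Decode(&v); err != nil {
		panic(err)
	}

	usage := v.MemoryStats.Usage          // what MEM USAGE in `docker stats` is based on
	cache := v.MemoryStats.Stats["cache"] // page cache charged to the cgroup (cgroup v1 key)
	fmt.Printf("cgroup usage:   %d bytes\n", usage)
	fmt.Printf("of which cache: %d bytes\n", cache)
	fmt.Printf("usage - cache:  %d bytes (closer to top's RES)\n", usage-cache)
}
```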

@theonlydoo commented Nov 30, 2017

@coolljt0725 So the cache grows as the app writes logs to the host's filesystem, and that triggers OOM reaping even though the app isn't actually using that memory; it's only cache.

@thaJeztah (Member) commented Dec 3, 2017

OOM reaping is handled by the kernel though

@axelabs commented Apr 6, 2018

@Sergeant007, your suggested container memory usage formula does not match the sum of process memory usage for me. When I compare memory used from cgroups (minus cache) to the sum of process rss, I get a not so small difference:

# RSS_USED=$(ps -eo rss --sort -rss --no-headers|awk '{RSS+=$1}END{print RSS*1024}')
# MEM_USED=$((`cat /sys/fs/cgroup/memory/memory.usage_in_bytes` - `awk '/^cache/{print $2}' /sys/fs/cgroup/memory/memory.stat`))
# echo $RSS_USED - $MEM_USED = $(($RSS_USED-$MEM_USED))
27811590144 - 27780145152 = 31444992

Does ps not reflect actual memory usage of processes in the container like some other tools?

This is in a CentOS 7.3 container running on Amazon Linux 2016.09.

@Sergeant007 commented Apr 7, 2018

@axelabs, I'm not sure ps is 100% accurate, but even the method I described above does not pretend to be fully correct. It just fixes the original behavior, which was really misleading.

So in your particular measurement the difference was 32MB out of 27GB of consumed memory, about 0.1%. I'm not sure there is an easy way to achieve anything better, and I'm not even sure which of the two (RSS_USED or MEM_USED) was correct.
