Question about memory limits and page cache #21759

Closed
taviLaies opened this Issue Apr 5, 2016 · 13 comments


taviLaies commented Apr 5, 2016

I'm running my microservice containers with a memory limit, and memory usage keeps going up until the containers are killed.
Running top inside a container shows that most of the memory is used for page cache and would be available if needed. However, docker stats reports usage very close to the limit; the reported value counts the page cache as used memory.

I must be missing something: if the OS uses most of the available free memory for page cache, there is no conceivable way Docker's (or the cgroup's) memory limit can work unless the page cache is excluded from the calculation.

What am I missing? I looked all over for documentation but can't find any on this subject. All I can find is that the memory usage reported by cgroups and docker includes the page cache.
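
For reference, one way to see how much of the reported usage is reclaimable cache versus resident memory is to read the cgroup counters directly. A rough sketch, assuming cgroup v1 with the memory controller mounted under /sys/fs/cgroup and a container named my-service (a placeholder); the exact cgroup path differs between distros:

    CID=$(docker inspect --format '{{.Id}}' my-service)
    # rss vs. cache breakdown for the container's memory cgroup
    grep -E '^total_(rss|cache) ' /sys/fs/cgroup/memory/docker/$CID/memory.stat
    # roughly the figure docker stats reports against the limit
    cat /sys/fs/cgroup/memory/docker/$CID/memory.usage_in_bytes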

My containers are running in ECS.

Docker version 1.9.1, build a34a1d5/1.9.1

Containers: 2
Images: 21
Server Version: 1.9.1
Storage Driver: devicemapper
Pool Name: docker-docker--pool
Pool Blocksize: 524.3 kB
Base Device Size: 107.4 GB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 469.8 MB
Data Space Total: 9.437 GB
Data Space Available: 8.967 GB
Metadata Space Used: 221.2 kB
Metadata Space Total: 25.17 MB
Metadata Space Available: 24.94 MB
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.93-RHEL7 (2015-01-28)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.1.13-19.30.amzn1.x86_64
Operating System: Amazon Linux AMI 2015.09
CPUs: 4
Total Memory: 7.308 GiB
Name: qa-tier-user-service-us-east-1c-i-d5c7b34e
ID: 6XZJ:Z5U3:BTAH:AJ6W:4TRM:KUX7:PSMZ:XVYE:MB73:VSIA:RNFB:YI3I

GordonTheTurtle commented Apr 5, 2016

If you are reporting a new issue, make sure that we do not have any duplicates already open. You can ensure this by searching the issue list for this repository. If there is a duplicate, please close your issue and add a comment to the existing issue instead.

If you suspect your issue is a bug, please edit your issue description to include the BUG REPORT INFORMATION shown below. If you fail to provide this information within 7 days, we cannot debug your issue and will close it. We will, however, reopen it if you later provide the information.

For more information about reporting issues, see CONTRIBUTING.md.

You don't have to include this information if this is a feature request

(This is an automated, informational response)


BUG REPORT INFORMATION

Use the commands below to provide key information from your environment:

docker version:
docker info:

Provide additional environment details (AWS, VirtualBox, physical, etc.):

List the steps to reproduce the issue:
1.
2.
3.

Describe the results you received:

Describe the results you expected:

Provide additional info you think is important:

----------END REPORT ---------

#ENEEDMOREINFO

marklieberman commented Apr 6, 2016

I believe I am having the same issue and want to add my bug report.

I have two containers, both running Play 2 framework Java applications, but one is being killed and the other is not. Both containers run in the same Amazon ECS cluster on ECS-optimized container instances. The one that is being killed constantly produces, processes, and deletes temporary files of about 750k each. Eventually the memory consumption reported by ECS hits 100% and my container is killed. However, my in-JVM metrics do not show any excessive memory consumption, and my JVM memory limits are well below my container limits.

Excerpts from docker inspect on my container:

Command used to run JVM:
"Args": [
        "-J-server",
        "-J-Xms1792m",
        "-J-Xmx1792m",
        "-J-XX:MaxMetaspaceSize=256m",
        "-Dhttp.port=9100",
        "-Dconfig.file=conf/prod.conf",
        "-J-javaagent:lib/aspectjweaver-1.8.7.jar",
        "-Dkamon.auto-start=true"
    ],

Memory fields from HostConfig:
        "Memory": 2684354560,
        "MemoryReservation": 0,
        "MemorySwap": 5368709120,
        "KernelMemory": 0,
        "CpuShares": 256,

You can see that I've limited my JVM to 1792m and set the container memory limit to 2560m. My JVM statistics show that used/committed memory never exceeds max.
[screenshot, 2016-04-06 10:29: JVM memory metrics showing used/committed memory staying below max]

However, Docker (via the ECS agent) is reporting that close to 100% of the allocated container memory is being used. In these charts my container is killed and restarted around 10:45 UTC (6:45 EDT).
[screenshot, 2016-04-06 10:30: ECS memory utilization chart showing the container killed and restarted around 10:45 UTC]


Docker version:

Client:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5/1.9.1
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5/1.9.1
 Built:        
 OS/Arch:      linux/amd64

Docker info:

Containers: 5
Images: 192
Server Version: 1.9.1
Storage Driver: devicemapper
 Pool Name: docker-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 107.4 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 6.477 GB
 Data Space Total: 11.32 GB
 Data Space Available: 4.848 GB
 Metadata Space Used: 2.929 MB
 Metadata Space Total: 25.17 MB
 Metadata Space Available: 22.24 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.93-RHEL7 (2015-01-28)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.1.17-22.30.amzn1.x86_64
Operating System: Amazon Linux AMI 2015.09
CPUs: 2
Total Memory: 7.8 GiB
Name: ip-172-31-22-137
ID: HIBA:PW45:BB52:SBAQ:IEDO:HJIH:OZGN:E7XM:SPLI:MHRH:44E5:6X6I

taviLaies commented Apr 6, 2016

This does seem related to I/O.
We are generating a significant volume of log messages - service logs and access logs.
Service logs are written both to their own file and to stdout (so they are also captured in the docker log). All of them are rotated (including the docker log, which uses the json-file driver).
The service log and access log files are written to a mounted volume that is a folder on the host.
We have found that turning our logs off completely resolves the problem. Obviously this is not a permanent option.

Taking memory snapshots at different stages, the service inside the container is stable at peak load. However, cached memory in the container keeps growing until Docker reports 100% memory usage.

taviLaies commented Apr 6, 2016

I created a script inside the container:
while true; do echo "A message" >> /var/log/test.log; done
I tested variants with a sleep of 0.1 and 1 second, with the log file written both to the mounted volume and to the container's filesystem; in all cases the container reaches 100% memory utilization.
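
For reference, the same reproduction as a parameterized script (a sketch; the default log path is just an example target):

    #!/bin/sh
    # Append a line forever; both 0.1 s and 1 s intervals reproduce the growth.
    SLEEP="${1:-0.1}"
    LOGFILE="${2:-/var/log/test.log}"   # point at the mounted volume or the container filesystem
    while true; do
        echo "A message" >> "$LOGFILE"
        sleep "$SLEEP"
    done

Watching docker stats on the host while this runs should show usage climbing toward the limit even though the writer's own RSS stays tiny; the growth is page cache for the log file.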

The OS used by the container:
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.2.3
PRETTY_NAME="Alpine Linux v3.2"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"

unclejack commented Apr 11, 2016

You're using an older version of Docker. The current version is 1.10.3, and 1.11 will be released soon. You might also want to try an up-to-date version of Docker.

The problem you've described doesn't seem to be related to Docker itself. I'll investigate this.

unclejack self-assigned this Apr 11, 2016

marklieberman commented Apr 11, 2016

Hi, thanks for looking into this.

The current official Amazon ECS-optimized AMI is still on Docker 1.9.1 so upgrading the Docker version isn't possible. (I need to be compatible with the amazon-ecs-agent.)

Even if the problem cannot be fixed, any insight into how to mitigate the effects would be appreciated. Amazon only moved to 1.9 a couple of months ago, so I am probably stuck with it for a while.

taviLaies commented Apr 18, 2016

We investigated further on our end, and the page caching itself is not the problem; that is just the OS using available memory to improve I/O performance.
There are a few cases where processes get killed due to memory constraints.

Docker itself does not kill containers or tasks in case of memory issues.

The kernel of the EC2 instance (http://stackoverflow.com/questions/18786209/what-is-the-relationship-between-the-docker-host-os-and-the-container-base-image) kills tasks in that cgroup (container) in case of OOM. The only case where it may kill the whole container is when the EC2 instance itself is running out of memory; then the OOM killer chooses a process to kill and may select the docker daemon or other processes that are essential to containers. However, it is very unlikely that the docker processes will be selected, because there are most likely other processes that use more memory than the docker daemon.

It is documented that the ECS agent kills the container if it goes over its memory limit. AWS referenced a couple of docker issues that might cause a container to go over:
#8769
#14487
#14399

@marklieberman - it was either the ECS agent that killed the container, or the kernel because the EC2 instance was running out of memory. Look in /var/log/messages or /var/log/ecs/* for more information.
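
For reference, a rough way to check both places on the container instance (a sketch; exact log file names and kernel message wording vary):

    # cgroup (per-container) OOM kills are logged by the kernel
    dmesg | grep -i -E 'out of memory|oom'
    grep -i oom /var/log/messages
    # the ECS agent's own logs
    grep -i -E 'oom|memory' /var/log/ecs/*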

unclejack commented May 4, 2016

I've looked into this a bit while researching some other performance issues. It doesn't seem to be a problem with Docker. It's mostly about the memory usage of the services running in the containers and how much memory those containers end up using.

Docker makes it easy to spawn a lot of containers and get better density, but it also makes it easy to run too many services on one machine or to run services which require way too much RAM.

The official documentation lists devicemapper (direct-lvm) as a production-ready storage driver (https://docs.docker.com/engine/userguide/storagedriver/selectadriver/), but devicemapper does not use memory very efficiently, and the documentation doesn't claim otherwise: multiple identical containers will each keep their own copy of shared data in the page cache, increasing memory usage.

To improve this and get better performance, the following should help, just as it does outside of Docker and containers in general:

  • make containers smaller for long running services & applications (e.g. smaller binaries, smaller images, optimize memory usage, etc)
  • VERY IMPORTANT: use volumes and bind mounts instead of storing data inside the container (see the sketch after this list)
  • VERY IMPORTANT: make sure to run a system with a maintained kernel, up to date Docker and devicemapper libraries (e.g. fully updated CentOS 7 / RHEL 7 / Ubuntu 14.04 / Ubuntu 16.04)
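
As a minimal sketch of the bind-mount recommendation above (the host path, container path, and image name are placeholders):

    # keep heavy log/data churn on a bind-mounted host directory instead of the
    # container's devicemapper-backed filesystem
    docker run -d -v /srv/myservice/logs:/var/log/myservice my-service-image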

Since I couldn't discover any major issue during investigation, this issue seems to be entirely about the fact that devicemapper isn't very memory efficient due to its duplication of the page cache across identical containers. I don't think there's more to do here. I'll close the issue. Please feel free to comment.

unclejack closed this May 4, 2016

ytwu commented Jun 13, 2016

We have the same issue, and we just found the root cause.
The kernel should free cache/buffers as much as possible when the system is short of free memory.
But our Java process was killed by the OOM killer while more than 1.2 GB of cache/buffers would not be released.

It turns out that /run is a memory file system (tmpfs), and systemd-journald keeps writing logs into it, consuming more and more memory. Once OOM is hit, the processes get killed.

df -ha shows:
tmpfs 95G 689M 94G 1% /run

cd /run/log/journal/be5a6a3715bb421091ef4b2929612c95
total 688M
1744823 drwxr-sr-x 2 root systemd-journal 160 Jun 13 16:55 .
1744822 drwxr-sr-x 3 root systemd-journal 60 Jun 2 08:41 ..
492516318 -rw-r----- 1 root systemd-journal 128M Jun 8 12:29 system@153662a2d2df4a35baac25c0cf03e5d7-0000000000091b68-000534a41be2d8cd.journal
614732463 -rw-r----- 1 root systemd-journal 128M Jun 9 17:08 system@153662a2d2df4a35baac25c0cf03e5d7-00000000000b6163-000534bcc0a807e9.journal
735063910 -rw-r----- 1 root systemd-journal 128M Jun 10 21:45 system@153662a2d2df4a35baac25c0cf03e5d7-00000000000da5df-000534d4c4036107.journal
843918801 -rw-r----- 1 root systemd-journal 128M Jun 12 02:52 system@153662a2d2df4a35baac25c0cf03e5d7-00000000000fecdf-000534ecc009c2f5.journal
950046129 -rw-r----- 1 root systemd-journal 128M Jun 13 07:43 system@153662a2d2df4a35baac25c0cf03e5d7-00000000001238d9-0005350526e3efa7.journal
1054326538 -rw-r----- 1 root systemd-journal 48M Jun 13 17:02 system.journal

Deleting each of the system@153662a2d2df4a35baac25c0cf03e5d7-xxx files frees 128M of memory from the cache.
Change the systemd-journald configuration to avoid this.
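
For reference, a minimal sketch of such a configuration (the values are examples to tune, not recommendations):

    # /etc/systemd/journald.conf (excerpt)
    [Journal]
    # cap the volatile journal kept on the /run tmpfs
    RuntimeMaxUse=64M
    # or keep the journal on disk instead of tmpfs
    Storage=persistent

On newer systemd versions the existing files can also be trimmed in place, e.g. journalctl --vacuum-size=200M, instead of deleting them by hand.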

ytwu commented Jun 13, 2016

After we deleted all the unnecessary journal files under /run/log/ in each container on the same bare-metal host, we freed more than 100 GB of memory from the page cache!

marklieberman commented Jun 13, 2016

@ytwu
I am running on Amazon Linux, which does not use systemd-journald, so unfortunately that is not the solution for me.

@taviLaies
Thanks for the tip, but I already know the kernel is killing my process due to a cgroup OOM. I just don't know why the memory consumption outside of the JVM process in my container continues to grow. For now I've just told ECS to give my container about 4x the memory it requires, and it rarely gets killed. The only difference I can think of between my problem container and other containers built on the same Java framework is the temporary file churn.

andyxning commented Aug 21, 2016

@ytwu I am trying to find out under what conditions the kernel triggers an OOM.

The kernel should free cache/buffers as much as possible when the system is short of free memory.

I have tested that in some circumstances the kernel really does reclaim page cache. However, it is still not clear to me, as I cannot find any official documentation about what triggers an OOM. :(
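
For what it's worth, a rough experiment that shows the difference under a container memory limit (a sketch, assuming a reasonably recent Docker; alpine is just an example image). File-backed page cache is reclaimable, so writing more data than the limit does not trigger an OOM; tmpfs pages are charged to the cgroup and cannot be reclaimed without swap, so the same write does:

    # writes 500 MB of file data under a 100 MB limit: cache gets reclaimed, no OOM expected
    docker run --rm -m 100m alpine dd if=/dev/zero of=/tmp/big bs=1M count=500

    # the same amount into tmpfs with cgroup swap disabled: expected to be OOM-killed
    docker run --rm -m 100m --memory-swap 100m --shm-size 512m alpine \
        dd if=/dev/zero of=/dev/shm/big bs=1M count=500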

thaJeztah commented Aug 21, 2016

as I cannot find any official documentation about what triggers an OOM. :(

An OOM is literally that: out of memory. In other words, if the system as a whole is running out of memory, the kernel starts killing processes to make resources (memory) available. When doing so, it takes the --oom-score-adj weight into account to decide which process to kill first. If --oom-score-adj is not provided, there's a chance that the docker daemon is killed before containers are killed (although on Docker 1.12 we set an OOM score on the daemon itself to make this less likely to happen).
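
For reference, the weight can also be set per container (a minimal sketch; the image name is a placeholder, and --oom-score-adj requires a newer Docker than 1.9):

    # make this container a less likely OOM-killer target than the default score of 0
    docker run -m 512m --oom-score-adj=-500 my-image
    # or exempt it from the OOM killer entirely (use with care, together with a memory limit)
    docker run -m 512m --oom-kill-disable my-image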
