Open files after docker containers removed #8693
Comments
Just a bit more context: we're calling docker logs --follow on the container, and after the container is removed we also do a kill -TERM on the docker logs --follow process, since there is an existing bug where docker logs --follow hangs forever if the container doesn't exist.
Are you able to reproduce this on the latest version of Docker? If so, can you please give exact steps to do so?
Closing as stale. Please ping me with exact steps to reproduce on the latest version and I will reopen.
Not so quick on the draw @jfrazelle, here is your repro: https://bugzilla.redhat.com/show_bug.cgi?id=1189028 ; it happens on 1.5 too.
Do we open a socket to talk to dockerinit, or does that happen through some other means?
I can't seem to reproduce this using your test case. Once the for loop exits, the number of open fds remains the same. I am using the latest dynamically built docker binary on an lvm thin pool as the graphdriver.
@rhvgoyal not sure if there are other parts to the puzzle; it seemed pretty cut and dried to me. We have several other reports noted in https://bugzilla.redhat.com/show_bug.cgi?id=1189028
That report also says "It doesn't happen on RHEL7 + docker 1.4.1", so it looks like it does not happen on all combinations. I guess that's a good thing, as it should help us pinpoint the problem. BTW, I am using upstream kernel 4.0-rc2 and the latest docker. Here is what my docker info looks like:
Containers: 0
Same issue here, using the latest docker running on RHEL 6.6.
Containing this stuff:
We are experiencing the same issue with docker 1.4.1 on RHEL 6.5 (2.6.32-504.8.1.el6.x86_64). Is there a better recommended workaround?
I'm experiencing the exact same problem with Docker 1.6 and RHEL 6.6. We ran into this when the docker user reached its limit on open files. I took a look at
Here's the
@iangkent: Another workaround is to attach to the docker process and tell it to close the file descriptors. I'm doing something like this:
Note: what I'm doing here is attaching to the docker daemon using gdb and telling it to close the file descriptors that have 'deleted' in the file listing. It's ugly, but better than restarting the containers that I want to keep running. UPDATE: I noticed that killing eventfd descriptors can cause a tainted panic; I updated the command to not close those eventfd descriptors.
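The exact command wasn't captured in this thread. As a rough illustration of the approach described above (and honouring the warning about leaving eventfd descriptors alone), a small helper like the following could enumerate the daemon's leaked descriptors via /proc and print the gdb calls to run against it. This is a sketch, not the author's one-liner; the program name, the pid argument, and the "(deleted)" filter are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Sketch: walk /proc/<pid>/fd for the docker daemon and print a
// "call close(N)" line for every descriptor whose target has been deleted,
// skipping anything that looks like an eventfd (closing those was reported
// above to cause problems). The output is meant to be fed to gdb, e.g.:
//   gdb -p <pid> -batch -ex 'call close(N)' ...
func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: leakedfds <docker-daemon-pid>")
		os.Exit(1)
	}
	fdDir := filepath.Join("/proc", os.Args[1], "fd")

	entries, err := os.ReadDir(fdDir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, e := range entries {
		target, err := os.Readlink(filepath.Join(fdDir, e.Name()))
		if err != nil {
			continue
		}
		if strings.Contains(target, "(deleted)") && !strings.Contains(target, "eventfd") {
			fmt.Printf("call close(%s)\n", e.Name())
		}
	}
}
```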
It's pretty weird that there are fds from cgroups :/ I'll try to reproduce.
@LK4D4: it always seems to be the 'memory.oom_control' file that is left open for each cgroup instance.
Hmm, I really can't reproduce this on master, even though I thought I had seen it before.
It's present at least as recently as docker 1.6. If I look at the file descriptors with sudo ls /proc/$(cat /var/run/docker.pid)/fd -l --time-style=+'%s' I end up with:
Seeing this on RHEL 6.6 (2.6.32-504.el6.x86_64) too.
I'm guessing this hasn't been fixed in Docker 1.7? 😢
I've had to downgrade to 1.3, as the issue doesn't exist there. I haven't bothered doing a git bisect to find the exact commit that introduced it yet, as I simply haven't had the time. It may simply be that the bug won't get a lot of attention, since most people running docker are doing so on RHEL 7.
@amaltson it is not fixed as of current master (1.7.1 -> 1.8-in-devel) on EL 6.6.
@rhvgoyal This issue is caused by a spurious read on the eventfd in the OOM-notification loop. Because the read occurs before the cgroup's vfs entries have been unlinked, there is a race condition: the blocking channel send is reached before the container is cleaned up; the container is then destroyed and the channel read; and finally the for loop restarts reading the eventfd. No additional events occur after this first "spurious" one, and thus the goroutine/fd leaks. Adding an Lstat test inside the loop avoids the leak. I can post my patch if it would illustrate this more clearly, although I suspect someone will want to do a more thorough job of refactoring the loop.
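For readers following along, here is a minimal sketch of the shape of the fix being described; it is not the libcontainer code or the actual patch, and watchOOM, cgroupPath, eventfdFile and oomControlFile are illustrative names. The idea is that after handling an event the goroutine checks whether the cgroup directory still exists before blocking on the eventfd again, and returns (closing its descriptors) once the cgroup has been removed:

```go
package oomwatch

import "os"

// watchOOM is an illustrative sketch, not the libcontainer implementation or
// the proposed patch. eventfdFile wraps the eventfd registered for OOM
// notifications and oomControlFile holds memory.oom_control open; both stay
// open for as long as the goroutine runs, which is exactly why it must
// terminate once the cgroup goes away.
func watchOOM(cgroupPath string, eventfdFile, oomControlFile *os.File) <-chan struct{} {
	ch := make(chan struct{})
	go func() {
		defer eventfdFile.Close()
		defer oomControlFile.Close()
		defer close(ch)

		buf := make([]byte, 8) // eventfd counters are always 8 bytes
		for {
			if _, err := eventfdFile.Read(buf); err != nil {
				return
			}
			ch <- struct{}{}
			// Cgroup teardown fires one final, spurious, event. If the
			// cgroup directory has since been removed, no further events
			// will ever arrive, so return here instead of blocking on
			// Read forever and leaking the goroutine and both fds.
			if _, err := os.Lstat(cgroupPath); err != nil {
				return
			}
		}
	}()
	return ch
}
```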
Bump.
@unixist (& others interested): I suggest you test under EL/CentOS 6.7. I am unable to replicate the problem on 2.6.32-573.3.1.el6, and the kernel-2.6.32-573.3.1.el6 rpm has an official RH bugfix that looks suspiciously related.
meanwhile …
(cgroupoom is just a simple utility that uses the same notification code as docker 1.7's libcontainer)
@unixist can you check whether that newer kernel resolves it for you?
@thaJeztah I hope to be able to test soon; I don't yet have a -573 kernel in a position to test. Can anybody else comment?
Still happening. Fresh install from the docker repo.
I believe you might be hitting a separate, but very potentially related, issue. There is a race condition in the kernel where a write to a memory cgroup's
If this is what is happening to you, then a simple … To aid those wishing to reproduce the tests I have been running, I will post my test code here (again, it's the same algorithm found in 1.7's libcontainer, wrapped in some debug output).
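The test program mentioned above was not captured in this thread. For readers who want to reproduce the descriptor behaviour themselves, this is a rough sketch of the cgroup v1 OOM-notification registration that docker 1.7's libcontainer performs, not the author's test code or the libcontainer source; waitForOOM is an illustrative name, the cgroup path is a placeholder, and golang.org/x/sys/unix is used here for the eventfd call.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"

	"golang.org/x/sys/unix"
)

// waitForOOM sketches the cgroup v1 notification mechanism:
//  1. open <cgroup>/memory.oom_control,
//  2. create an eventfd,
//  3. write "<eventfd> <oom_control fd>" into <cgroup>/cgroup.event_control,
//  4. block reading 8 bytes from the eventfd; it fires on an OOM event and
//     also when the cgroup is removed.
func waitForOOM(cgroupPath string) error {
	oomControl, err := os.Open(filepath.Join(cgroupPath, "memory.oom_control"))
	if err != nil {
		return err
	}
	defer oomControl.Close()

	efd, err := unix.Eventfd(0, unix.EFD_CLOEXEC)
	if err != nil {
		return err
	}
	eventfd := os.NewFile(uintptr(efd), "eventfd")
	defer eventfd.Close()

	ctl := fmt.Sprintf("%d %d", efd, oomControl.Fd())
	if err := os.WriteFile(filepath.Join(cgroupPath, "cgroup.event_control"), []byte(ctl), 0700); err != nil {
		return err
	}

	buf := make([]byte, 8)
	if _, err := eventfd.Read(buf); err != nil {
		return err
	}
	fmt.Println("OOM event (or cgroup removal) signalled for", cgroupPath)
	return nil
}

func main() {
	// Hypothetical memory-cgroup path for one container; adjust for your host.
	if err := waitForOOM("/sys/fs/cgroup/memory/docker/<container-id>"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```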
Experiencing the same in CoreOS 766.4.0: a huge container log file remains open after stopping a container started with --rm.
@pmoust -- different problem. The problem being discussed here applies specifically to docker keeping file handles open to the memory.oom_control cgroup control file. Anything else (docker logfiles, container logfiles, etc.) is unrelated to this bug.
@elfchief not really; I experience the same behavior as the original issue reporter.
We are getting hit by the same issue. The file locks are not released as expected when the logs are rotated. The locks are (naturally) released when restarting the docker service. Sample output from
Output from
@sebdah (edit: I just noticed that you're on Ubuntu, so disregard my notes below on RHEL 6 :) I don't think this issue is going to get much traction, as there are other problems running docker on RHEL 6 and the current docker docs recommend running docker only on RHEL 7. I've read somewhere that if RHEL 6 is needed, it can safely be used as the docker image base for containers running on a RHEL 7 host. That said, my workaround cronjob has kept the machine from crashing and it's been up for months... even though open files are no longer a problem, I'm noticing inconsistent network-stack issues in the containers, so moving to RHEL 7 is on the horizon for me. Here's the gist of the cronjob that runs every night on my RHEL 6 host; not a solution, but a stopgap:
@JohnMorales Thanks! We are on Ubuntu 14.04.02.
Is this issue supposed to be resolved on RHEL 7? I'm running CentOS 7 with docker 1.9.1, and I'm having the problem reported by the original submitter (a large number of deleted json.log files being held open).
The xfs-formatted root disk is 30 GB. There are no other disks on the system.
I have the same problem on Ubuntu 14.04.4.
Same issue here with 1.11.2. This was causing our long-running service to fail periodically, because it had no free file descriptors left...
@Majkl578 what storage driver are you using? If you're using overlay, that may be a different issue.
Let's close this; it's likely been different issues over time and I do not see a reason to keep it open. @thaJeztah WDYT?
@cpuguy83 agreed. For those following this: we're closing this issue because it's been collecting various unrelated problems. If you're still having this issue, please open a new one and include detailed information about it.
docker version:
Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070
I ran out of file descriptors running docker run, and I realized the docker daemon is holding on to a lot of file handles (11110 of them), while docker ps -a shows only 6 containers.
Lots of them look like this:
6c665c78a63396955dc76873bd8636bb642b38b43ec0e6-json.log (deleted)
docker 6937 root 329u unix 0xffff880166560e00 0t0 353408770 /var/run/docker.sock
docker 6937 root 330r REG 9,2 0 1627517 /var/lib/docker/containers/0734b3b41a4b039cf983cb690b1ad2814a8434289752f8082fd312e837f04672/0734b3b41a4b039cf983cb690b1ad2814a8434289752f8082fd312e837f04672-json.log (deleted)
docker 6937 root 331u unix 0xffff880166560a80 0t0 353427790 /var/run/docker.sock
docker 6937 root 332r REG 9,2 219 1627529 /var/lib/docker/containers/933d4e44aa370529284c40b5a1fae6ecf5e887e285d9860ba44ce61439c692f0/933d4e44aa370529284c40b5a1fae6ecf5e887e285d9860ba44ce61439c692f0-json.log (deleted)
docker 6937 root 333u unix 0xffff881024179c00 0t0 353428260 /var/run/docker.sock
docker 6937 root 334r REG 9,2 220 1627499 /var/lib/docker/containers/70607073f3aa18cfcd05d8a3ef52b15bc59d4dbafaacd2f45f39890c7eee19e3/70607073f3aa18cfcd05d8a3ef52b15bc59d4dbafaacd2f45f39890c7eee19e3-json.log (deleted)
docker 6937 root 335u unix 0xffff8809fd121f80 0t0 353441825 /var/run/docker.sock
docker 6937 root 336u unix 0xffff880166562300 0t0 353426334 /var/run/docker.sock
docker 6937 root 337r REG 9,2 75 1627531 /var/lib/docker/containers/6695d56daf8b312e34b613c24e995d3c71d14fba0ee1189db74a189572f01a11/6695d56daf8b312e34b613c24e995d3c71d14fba0ee1189db74a189572f01a11-json.log (deleted)
docker 6937 root 338r REG 9,2 107 1627524 /var/lib/docker/containers/e9305143bc65c98b116a38aa56d80813e74923d213a1d0c8c2096293fce6814b/e9305143bc65c98b116a38aa56d80813e74923d213a1d0c8c2096293fce6814b-json.log (deleted)
docker 6937 root 339u unix 0xffff8809fd124d00 0t0 353442787 /var/run/docker.sock
docker 6937 root 340r REG 9,2 74 1627530 /var/lib/docker/containers/f6ba7a20569957ccee8ea51fb3ea181e1107572217312645e003472933b47fce/f6ba7a20569957ccee8ea51fb3ea181e1107572217312645e003472933b47fce-json.log (deleted)
docker 6937 root 341u unix 0xffff881024179180 0t0 353455659 /var/run/docker.sock
docker 6937 root 342r REG 9,2 107 1627520 /var/lib/docker/containers/7b692bbda1e77df58b75c78394f7cee05ab476bf665c9d5ea8e6c454be1253e7/7b692bbda1e77df58b75c78394f7cee05ab476bf665c9d5ea8e6c454be1253e7-json.log (deleted)
docker 6937 root 343u unix 0xffff8809fd123800 0t0 353473830 /var/run/docker.sock
docker 6937 root 344u unix 0xffff880a2eda3100 0t0 353485041 /var/run/docker.sock
docker 6937 root 345r REG 9,2 107 1627532 /var/lib/docker/containers/7f9706c3eb50d0047f0a254fedd8cfcf3c24d5ce6beb30ac17e488c54e600b11/7f9706c3eb50d0047f0a254fedd8cfcf3c24d5ce6beb30ac17e488c54e600b11-json.log (deleted)
docker 6937 root 346r REG 9,2 219 1627536 /var/lib/docker/containers/93e745632e8ced642b9f87a2a6991aace88e4e36172d430e09715d3d093ab0bd/93e745632e8ced642b9f87a2a6991aace88e4e36172d430e09715d3d093ab0bd-json.log (deleted)
docker 6937 root 347u unix 0xffff88015c303100 0t0 353485124 /var/run/docker.sock
docker 6937 root 348u unix 0xffff8801b046c600 0t0 353487224 /var/run/docker.sock
docker 6937 root 349r REG 9,2 75 1627538 /var/lib/docker/containers/5951bc184d7678055eb63f0b561657e12831b04d40c1c8b8548ca29d32d8ea85/5951bc184d7678055eb63f0b561657e12831b04d40c1c8b8548ca29d32d8ea85-json.log (deleted)
docker 6937 root 350u unix 0xffff880166567380 0t0 353476902 /var/run/docker.sock
docker 6937 root 351r REG 9,2 107 1627534 /var/lib/docker/containers/4e3be7bcc0f2e3cc59f68191ed005ef2152821845cca2a16f94c1ea84d4cfb2f/4e3be7bcc0f2e3cc59f68191ed005ef2152821845cca2a16f94c1ea84d4cfb2f-json.log (deleted)
docker 6937 root 352r REG 9,2 75 1627526 /var/lib/docker/containers/c1f591f680628cc90f0a7bde12d9defc17911cbaf2c8b4a01414cdeff2f30d90/c1f591f680628cc90f0a7bde12d9defc17911cbaf2c8b4a01414cdeff2f30d90-json.log (deleted)