New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update arbiter.py to not leave tmp files around (fix #1327) #1428

Merged
merged 1 commit into from Oct 27, 2017

Conversation

Projects
None yet
5 participants
@bsergean
Contributor

bsergean commented Jan 11, 2017

(deleted) tmp files left around #1327

@tilgovi

This comment has been minimized.

Collaborator

tilgovi commented Jan 11, 2017

This patch helped us diagnose the problem, but I don't think it is the best solution.

It would be less costly to set the close on exec flag for the temp file after it's inherited by its worker, right after the fork. This way, we never have to worry about it again and it won't be inherited by any other process that might get spawned for any reason.

@bsergean

This comment has been minimized.

Contributor

bsergean commented Jan 11, 2017

Found those references:

http://stackoverflow.com/questions/6125068/what-does-the-fd-cloexec-fcntl-flag-do

flags = fcntl.fcntl(fd, fcntl.F_GETFD)
fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
@bsergean

This comment has been minimized.

Contributor

bsergean commented Jan 11, 2017

Oh it looks like there's already a utility for that: close_on_exec

@bsergean

This comment has been minimized.

Contributor

bsergean commented Jan 11, 2017

There's a worker.tmp in the code, is it that temp file we're talking about ?

@tilgovi

This comment has been minimized.

Collaborator

tilgovi commented Jan 12, 2017

I am wrong, close_on_exec won't work because the workers don't exec.

@tilgovi tilgovi self-assigned this Mar 15, 2017

@tilgovi

This comment has been minimized.

Collaborator

tilgovi commented Aug 6, 2017

@benoitc @berkerpeksag I'd like to merge this, but would appreciate one other review. I don't see a better way right now and I've been putting this off for weeks.

@berkerpeksag

I think your solution is correct, but I don't quite understand why file descriptors of a worker are inheritable by default especially in Python 3.4+.

@@ -565,6 +565,10 @@ def spawn_worker(self):
self.WORKERS[pid] = worker
return pid
# Do not inherit the temporary files of other workers

This comment has been minimized.

@berkerpeksag

berkerpeksag Oct 26, 2017

Collaborator

I like @jkemp101's suggestion in #1327:

# Close inherited temporary files of other workers
@@ -565,6 +565,10 @@ def spawn_worker(self):
self.WORKERS[pid] = worker
return pid
# Do not inherit the temporary files of other workers
for sibling in self.WORKERS.values():
sibling.tmp.close()

This comment has been minimized.

@berkerpeksag

berkerpeksag Oct 26, 2017

Collaborator

Perhaps a more elegant way would be:

import os
import resource

fd_high = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
os.closerange(3, fd_high)

Or we could store FDs of temp files of workers in a list and pass the highest number as fd_high.

This comment has been minimized.

@tilgovi

tilgovi Oct 26, 2017

Collaborator

I'm not sure it would be safe to assume that all open FDs are WorkerTmp files. We could have any number of log files open at this point, no?

This comment has been minimized.

@berkerpeksag

berkerpeksag Oct 26, 2017

Collaborator

You have a point :) So unless we keep tracking file descriptors of those WorkerTmp files I don't see a better way to solve this problem either.

@benoitc

This comment has been minimized.

Owner

benoitc commented Oct 26, 2017

I'm confused we already set close on exec there : https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/base.py#L106

and we already unlink it there:
https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/workertmp.py#L32

Normally these files shouldn't be let on the fs then. So the actual question for me is on which system is it reproducible? How do you launch gunicorn ?

Anyway why not using os.set_inheritable(fd, inheritable) when creating a temporary file on python 3 with inheritable=false? Wouldn't it solve the issue?

@benoitc benoitc self-requested a review Oct 26, 2017

@tilgovi

This comment has been minimized.

Collaborator

tilgovi commented Oct 26, 2017

@benoitc:

I'm confused we already set close on exec

The workers don't call exec.

and we already unlink it there

We don't close it, though.

Normally these files shouldn't be let on the fs then. So the actual question for me is on which system is it reproducible? How do you launch gunicorn ?

As far as I understand, this is not OS-specific and it doesn't matter how you launch Gunicorn.

Anyway why not using os.set_inheritable(fd, inheritable) when creating a temporary file on python 3 with inheritable=false? Wouldn't it solve the issue?

We need the file to be inheritable. The arbiter unlinks the file, so passing the fd through the fork is the only way the worker can access it.

Also, per the documentation [1], "On UNIX, non-inheritable file descriptors are closed in child processes at the execution of a new program, other file descriptors are inherited." The workers do not call exec.

[1] https://docs.python.org/3/library/os.html#inheritance-of-file-descriptors

@benoitc

This comment has been minimized.

Owner

benoitc commented Oct 26, 2017

That's true. We want them inheritable, like i said i'm confused by that ticket :). I don't reproduce the issue there. I'm not sure how you can still have tmp file on disk if they are unlinked. So I guess it's OS specific, or maybe just another systemd crap as it's apparently used there. I would be interrested in knowing more what does systemd or whatever. Anyway the patch is OK, let's merge it.

@tilgovi

This comment has been minimized.

Collaborator

tilgovi commented Oct 27, 2017

I don't reproduce the issue there. I'm not sure how you can still have tmp file on disk if they are unlinked.

They are not in the filesystem, but in the list of open files for the worker processes. You have to call lsof on a worker to see that it will have some other workers tmp files open, too.

@tilgovi tilgovi merged commit a3e258f into benoitc:master Oct 27, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@berkerpeksag

This comment has been minimized.

Collaborator

berkerpeksag commented Oct 27, 2017

Note that I think we can use os.register_at_fork() under Python 3.7+. Unlike the close-on-exec flag, this doesn't require the child to call exec* functions to work.

(Python 3.7 is not released yet :))

@benoitc

This comment has been minimized.

Owner

benoitc commented Oct 27, 2017

Still i was not reproducing it when the process is restarted using HUP or USR2 in my systems so something is triggering it.... I guess a better fix would be finally replacing that temporary file trick :) I need to focus on that.

@berkerpeksag you mean to have a function cleaning on fork?

@berkerpeksag

This comment has been minimized.

Collaborator

berkerpeksag commented Oct 27, 2017

@berkerpeksag you mean to have a function cleaning on fork?

Yes, os.register_at_fork() may help us to close temp files.

@boazin

This comment has been minimized.

boazin commented Dec 23, 2017

Is this fix merged into a release? When is it expected in an official release?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment