Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update arbiter.py to not leave tmp files around (fix #1327) #1428

Merged
merged 1 commit into from Oct 27, 2017

Conversation

bsergean
Copy link
Contributor

(deleted) tmp files left around #1327

@tilgovi
Copy link
Collaborator

tilgovi commented Jan 11, 2017

This patch helped us diagnose the problem, but I don't think it is the best solution.

It would be less costly to set the close on exec flag for the temp file after it's inherited by its worker, right after the fork. This way, we never have to worry about it again and it won't be inherited by any other process that might get spawned for any reason.

@bsergean
Copy link
Contributor Author

Found those references:

http://stackoverflow.com/questions/6125068/what-does-the-fd-cloexec-fcntl-flag-do

flags = fcntl.fcntl(fd, fcntl.F_GETFD)
fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

@bsergean
Copy link
Contributor Author

Oh it looks like there's already a utility for that: close_on_exec

@bsergean
Copy link
Contributor Author

There's a worker.tmp in the code, is it that temp file we're talking about ?

@tilgovi
Copy link
Collaborator

tilgovi commented Jan 12, 2017

I am wrong, close_on_exec won't work because the workers don't exec.

@tilgovi tilgovi self-assigned this Mar 15, 2017
@tilgovi
Copy link
Collaborator

tilgovi commented Aug 6, 2017

@benoitc @berkerpeksag I'd like to merge this, but would appreciate one other review. I don't see a better way right now and I've been putting this off for weeks.

Copy link
Collaborator

@berkerpeksag berkerpeksag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your solution is correct, but I don't quite understand why file descriptors of a worker are inheritable by default especially in Python 3.4+.

@@ -565,6 +565,10 @@ def spawn_worker(self):
self.WORKERS[pid] = worker
return pid

# Do not inherit the temporary files of other workers
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like @jkemp101's suggestion in #1327:

# Close inherited temporary files of other workers

@@ -565,6 +565,10 @@ def spawn_worker(self):
self.WORKERS[pid] = worker
return pid

# Do not inherit the temporary files of other workers
for sibling in self.WORKERS.values():
sibling.tmp.close()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a more elegant way would be:

import os
import resource

fd_high = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
os.closerange(3, fd_high)

Or we could store FDs of temp files of workers in a list and pass the highest number as fd_high.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it would be safe to assume that all open FDs are WorkerTmp files. We could have any number of log files open at this point, no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a point :) So unless we keep tracking file descriptors of those WorkerTmp files I don't see a better way to solve this problem either.

@benoitc
Copy link
Owner

benoitc commented Oct 26, 2017

I'm confused we already set close on exec there : https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/base.py#L106

and we already unlink it there:
https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/workertmp.py#L32

Normally these files shouldn't be let on the fs then. So the actual question for me is on which system is it reproducible? How do you launch gunicorn ?

Anyway why not using os.set_inheritable(fd, inheritable) when creating a temporary file on python 3 with inheritable=false? Wouldn't it solve the issue?

@benoitc benoitc self-requested a review October 26, 2017 06:48
@tilgovi
Copy link
Collaborator

tilgovi commented Oct 26, 2017

@benoitc:

I'm confused we already set close on exec

The workers don't call exec.

and we already unlink it there

We don't close it, though.

Normally these files shouldn't be let on the fs then. So the actual question for me is on which system is it reproducible? How do you launch gunicorn ?

As far as I understand, this is not OS-specific and it doesn't matter how you launch Gunicorn.

Anyway why not using os.set_inheritable(fd, inheritable) when creating a temporary file on python 3 with inheritable=false? Wouldn't it solve the issue?

We need the file to be inheritable. The arbiter unlinks the file, so passing the fd through the fork is the only way the worker can access it.

Also, per the documentation [1], "On UNIX, non-inheritable file descriptors are closed in child processes at the execution of a new program, other file descriptors are inherited." The workers do not call exec.

[1] https://docs.python.org/3/library/os.html#inheritance-of-file-descriptors

@benoitc
Copy link
Owner

benoitc commented Oct 26, 2017

That's true. We want them inheritable, like i said i'm confused by that ticket :). I don't reproduce the issue there. I'm not sure how you can still have tmp file on disk if they are unlinked. So I guess it's OS specific, or maybe just another systemd crap as it's apparently used there. I would be interrested in knowing more what does systemd or whatever. Anyway the patch is OK, let's merge it.

@tilgovi
Copy link
Collaborator

tilgovi commented Oct 27, 2017

I don't reproduce the issue there. I'm not sure how you can still have tmp file on disk if they are unlinked.

They are not in the filesystem, but in the list of open files for the worker processes. You have to call lsof on a worker to see that it will have some other workers tmp files open, too.

@tilgovi tilgovi merged commit a3e258f into benoitc:master Oct 27, 2017
@berkerpeksag
Copy link
Collaborator

berkerpeksag commented Oct 27, 2017

Note that I think we can use os.register_at_fork() under Python 3.7+. Unlike the close-on-exec flag, this doesn't require the child to call exec* functions to work.

(Python 3.7 is not released yet :))

@benoitc
Copy link
Owner

benoitc commented Oct 27, 2017

Still i was not reproducing it when the process is restarted using HUP or USR2 in my systems so something is triggering it.... I guess a better fix would be finally replacing that temporary file trick :) I need to focus on that.

@berkerpeksag you mean to have a function cleaning on fork?

@berkerpeksag
Copy link
Collaborator

@berkerpeksag you mean to have a function cleaning on fork?

Yes, os.register_at_fork() may help us to close temp files.

@boazin
Copy link

boazin commented Dec 23, 2017

Is this fix merged into a release? When is it expected in an official release?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants