-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a cleanup entrypoint, move docker GC to it #97
Conversation
8e23ee0
to
5120d24
Compare
This allows external services to schedule periodic worker cleanup tasks on every node, and removes the functionality of running docker garbage collection on every docker task run.
5120d24
to
13cec40
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the GC specification (schedule, frequency) something that we'd want to add to the Profiles/CeleryApplicationConfiguration for apps to centralize description of an app? Then our Ansible could read that standard config for the app and set up a cron accordingly.
the same directory as this file. After that, deletes all images that are | ||
no longer used by any containers. | ||
""" | ||
print('Garbage collecting docker containers and images.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we start moving away from print
to using loggers for better control of output? is there a G-W policy here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should, though that will involve squashing it out across many places. Until now we've just been redirecting stdout to a file, but getting decent logging is important, so we should open an issue for it.
Not sure. |
It occurs to me that we could avoid adding a new cleanup entrypoint and have added deployment complexity, by going with our original plan of moving the GC operation into another task. However, rather than having to use a periodic task we can just schedule it at the end of the worker task, something like: gc_task.delay() This would mean the work task doesn't incur the GC cost but the clean up will be done as needed. |
Would that also schedule it on an arbitrary worker, or would it always run on the same one? |
Yes, your are right. Sorry, what I had in mind was using one of the celery handlers ( I miss wrote, shouldn't append to PRs on a Friday evening!), in particular after_return. This is executed in the same worker ( actually in the same process ) but after the task function has returned, so I think it meets our needs. So I would propose that we add a new register_XXX(...) function here, to allow executors to register cleanup code ( we could keep the concept generic if we want, may be after_run ). Then we could change the task definition here to something like this @app.task(name='girder_worker.run', after_return=cleanup) Then we would implement the cleanup function so that it would look to see if the executor has a cleanup callback registered and call it if necessary. Then in the docker plugin we could just need to register our GC code and everything should work out. Does that make sense? If so I can probably carve out a few hour this week to work on it if that would help? |
I like that idea, as long as we are sure it runs on the same worker. We
should add some sort of check to the docker cleanup in particular so we can
enforce a minimum duration between running the cleanup script (touch a
stamp file in the temp dirx and compare the mtime).
…On Feb 5, 2017 1:28 PM, "Chris Harris" ***@***.***> wrote:
Yes, your are right. What I had in mind was using one of the celery
handlers ( I miss wrote, shouldn't append to PRs on a Friday evening!), in
particular after_return
<http://docs.celeryproject.org/en/latest/_modules/celery/app/task.html#Task.after_return>.
This is executed in the same worker ( actually in the same process ) but
after the task function has returned, so I think it meets our needs. So I
would propose that we add a new register_XXX(...) function here
<https://github.com/girder/girder_worker/blob/master/girder_worker/core/__init__.py>,
to allow executors to register cleanup code ( we could keep the concept
generic if we want, may be after_run ). Then we could change the task
definition here
<https://github.com/girder/girder_worker/blob/master/girder_worker/tasks.py#L6>
to something like this
@app.task(name='girder_worker.run', after_return=cleanup)
Then we would implement the cleanup function so that it would look to see
if the executor has a cleanup callback registered and call it if necessary.
Then in the docker plugin we could just need to register our GC code and
everything should work out. Does that make sense? If so I can probably
carve out a few hour this week to work on it if that would help?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#97 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAh7t4sgOYYRCWxwOsvczqjcjlJTsRb3ks5rZhTmgaJpZM4L0VkQ>
.
|
e2adf93
to
72071b6
Compare
72071b6
to
08a9ac2
Compare
I think this is ready for final QA |
@zachmullen I will try to QA today ... |
|
||
|
||
def main(): | ||
events.trigger('cleanup') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we keeping this around in case we want to trigger cleanup external ( crontab ) still?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my intention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I have tested the manually to ensure the after_return hook is fired.
Deleted the comment, was a phishing attack against email.
|
This allows external services to schedule periodic worker
cleanup tasks on every node, and removes the functionality
of running docker garbage collection on every docker task run.
ping @brianhelba @cjh1 @kotfic @mgrauer
The tradeoff here is that it moves the burden of scheduling cleanup to the deployment level. We should update our ansible roles for girder-worker to include configuration of a cronjob that occasionally runs this new entrypoint.