[Proposal] Proof of concept refactor to allow other tasks to be run by girder_worker #29

kotfic · 2016-05-16T18:53:04Z

Overview

This is a PR that demonstrates a way of refactoring girder_worker to allow for celery tasks other than run, convert and validators to be run by external systems like girder's worker plugin It is not really intended to be merged but instead to provide a concrete proof of concept for discussion.

Motivation

Girder worker (previously romanesco) provides a lot of functionality that is specific to running chains of analyses in different languages as well as converting data formats between these analyses. This involves specifications (or just 'specs') which include inputs, outputs and script locations. While these are critical components in the context of certain projects, there are other projects that require distributed job management that do not need these features or are hampered by these features. In these cases those projects are obligated to design their own solution which i would argue is sub-optimal.

This PR refactors the current girder_worker application to provide a lower level API that provides access to celery for projects that do not need the higher level spec/input/output API. The changes are intentionally minimal because this PR is intended to spark conversation about what other needs exist in terms of distributed job management.

Structure

In addition to this PR there is a trivial example of a separate package (or 'plugin', or 'extension') that adds additional functionality to girder_worker.

This PR moves the application out of the main() function and into the girder_worker init file. This allows for app to be imported and for additional tasks to be decorated (e.g. here in the example extension) .

The primary code retained in the girder_worker main() function is the call to app.worker_main() and new functionality that uses setuptools entry points to discover available girder_worker tasks. This can be see here, where the example extension defines its tasks, as well as here where girder defines the location of its internal tasks.

To test this you can checkout and install this branch of girder_worker in a virtual environment, then checkout and install the gwexample repository. From the girder_worker root directory running girder-worker -l info should produce the following output:

(gwtest) kotfic@minastirith:~/src/girder_worker_test/girder_worker$ girder-worker -l info

...

 -------------- celery@minastirith v3.1.20 (Cipater)
---- **** ----- 
--- * ***  * -- Linux-4.5.2-1-ARCH-x86_64-with-glibc2.2.5
-- * - **** --- 
- ** ---------- [config]
- ** ---------- .> app:         girder_worker:0x7faf653aaad0
- ** ---------- .> transport:   amqp://guest:**@localhost:5672//
- ** ---------- .> results:     amqp://guest@localhost/
- *** --- * --- .> concurrency: 32 (prefork)
-- ******* ---- 
--- ***** ----- [queues]
 -------------- .> celery           exchange=celery(direct) key=celery


[tasks]
  . girder_worker.convert
  . girder_worker.run
  . girder_worker.validators
  . gwexample.analyses.tasks.fib

[2016-05-16 13:54:47,311: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2016-05-16 13:54:47,321: INFO/MainProcess] mingle: searching for neighbors
[2016-05-16 13:54:48,331: INFO/MainProcess] mingle: all alone
[2016-05-16 13:54:48,355: WARNING/MainProcess] celery@minastirith ready.

We can see that gwexample.analyses.tasks.fib is now available as a registered task in the celery process.

Changes to girder's worker plugin

For this to be successful the worker plugin will also need to be slightly refactored. The actual task queued by the worker plugin is currently hard coded but this could be easily changed to provide generic access (as in the now defunct celery_jobs plugin)

Considerations

A refactor along these line would allow us to essentially remove the celery_jobs plugin (the new worker plugin will support the same functionality). It would provide an API for access through the worker plugin and the girder_worker process directly into the mechanics of celery which should allow for a wider set of uses cases. This means individual extensions would be responsible for implementing updates back to the girder job. I have some ideas about this based on celery Signals but will save that for further discussion.

kotfic · 2016-05-16T18:56:05Z

@girder/developers PTAL

I'm hoping to spark some conversation here about unifying infrastructure for distributed job management with girder and girder_worker.

Thanks!

zachmullen · 2016-05-16T18:59:13Z

The only thing that might be controversial here is that, prior to this change, celery was not a hard dependency of girder_worker, and now it is. Curious what @jeffbaumes thinks about that.

zachmullen · 2016-05-16T19:02:54Z

Also worth noting is that we now have an infinite loop/stack overflow situation on travis ;)

kotfic · 2016-05-16T19:03:02Z

@zachmullen could we fix that by moving the definition of app and the tasks inside a function and then doing something like:

from girder_worker import get_worker_app
app = get_worker_app()

@app.task
def fib(n):
    ....

zachmullen · 2016-05-16T19:03:46Z

That works for me, and the __main__ could import that, but __init__ would not.

jeffbaumes · 2016-05-16T19:19:29Z

Celery was already in requirements.txt so I assumed it was already a hard dependency. I suppose it was not in that you could checkout the girder_worker repo and perform import girder_worker without pulling in celery, but I do not see that as a big concern.

jeffbaumes · 2016-05-16T19:28:43Z

One way I see this proposal is that some folks may like or need the Worker specification metadata, while others just want to use arbitrary celery through Girder and want an easy way to enable running/exposing these tasks. @kotfic's proposal seems to be: why not manage these two mechanisms in the same repository, even if they are a bit independent of each other, instead of having two separate repos or frameworks? I can get on board with that logic (assuming I interpreted the logic correctly). Even if they seem pretty independent now, having them together will probably make things easier down the road by allowing interdependencies and bridging between different execution modes, etc.

cjh1 · 2016-05-16T19:35:29Z

As @kotfic mentioned we have bunch of logic to reflect the state of a celery task into a Girder model ( using Signals ) which could be a natural extension of this and make girder_work more compelling to use with standard Celery tasks.

kotfic · 2016-05-16T20:37:16Z

@jeffbaumes I think your assessment is correct. As @cjh1 points out, both the standard worker specification and generic celery tasks could benefit from using signals to update job metadata.

signals work like hooks and allow arbitrary functions to be run on

before_task_publish
after_task_publish
task_prerun
task_postrun
task_retry
task_success
task_failure
task_revoked

they are defined independent of individual tasks and so could apply across the board to spec based tasks, as well as custom tasks. e.g.

import requests
from celery.signals import after_task_publish

@after_task_publish.connect
def task_sent_handler(sender=None, body=None, **kwargs):
    jobInfo = kwargs['headers']['jobInfo']
    requests.patch("jobs/%s/status" % jobInfo['_id'], 
                   headers={'Girder-Token': jobInfo['token']},
                   data= {'status': 'published'})

kotfic · 2016-05-17T16:30:10Z

@brianhelba I'm wondering if you could weigh in on this approach, would something like this meet/screw-up your distributed job management needs?

kotfic · 2016-10-31T19:46:30Z

Closing this, actual work based on this proposal is taking place in #72

kotfic added 2 commits May 16, 2016 14:02

Proof of concept refactor to allow non 'run' jobs

071dca5

Fix console script

87a8463

Fixing tests

d9d676e

kotfic mentioned this pull request Jul 12, 2016

Plugin framework for Gaia OpenDataAnalytics/gaia#69

Closed

kotfic mentioned this pull request Oct 31, 2016

Refactor core girder worker functionality into 'core' submodule #72

Merged

kotfic closed this Oct 31, 2016

zachmullen deleted the gw-proposal branch January 11, 2018 19:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Proposal] Proof of concept refactor to allow other tasks to be run by girder_worker #29

[Proposal] Proof of concept refactor to allow other tasks to be run by girder_worker #29

kotfic commented May 16, 2016

kotfic commented May 16, 2016

zachmullen commented May 16, 2016

zachmullen commented May 16, 2016

kotfic commented May 16, 2016

zachmullen commented May 16, 2016

jeffbaumes commented May 16, 2016

jeffbaumes commented May 16, 2016

cjh1 commented May 16, 2016 •

edited

Loading

kotfic commented May 16, 2016 •

edited

Loading

kotfic commented May 17, 2016

kotfic commented Oct 31, 2016

[Proposal] Proof of concept refactor to allow other tasks to be run by girder_worker #29

[Proposal] Proof of concept refactor to allow other tasks to be run by girder_worker #29

Conversation

kotfic commented May 16, 2016

Overview

Motivation

Structure

Changes to girder's worker plugin

Considerations

kotfic commented May 16, 2016

zachmullen commented May 16, 2016

zachmullen commented May 16, 2016

kotfic commented May 16, 2016

zachmullen commented May 16, 2016

jeffbaumes commented May 16, 2016

jeffbaumes commented May 16, 2016

cjh1 commented May 16, 2016 • edited Loading

kotfic commented May 16, 2016 • edited Loading

kotfic commented May 17, 2016

kotfic commented Oct 31, 2016

cjh1 commented May 16, 2016 •

edited

Loading

kotfic commented May 16, 2016 •

edited

Loading