Feature Request: Pre-post task hooks #4789
Comments
This is already doable with scheduler plugins, which provide a general way to hook into the scheduler for custom feedback. I suspect the information exposed in the …
@jcrist scheduler plugins only run on the scheduler, not on the worker, or do I misunderstand that? The use cases listed above (e.g. Prometheus metric push and interpreter cleanup) would require the code to be executed on the worker.
I'm not sure what you mean by Prometheus metric push, but if you're just trying to push a log every time a task completes, a scheduler plugin may be sufficient. For running code actually on the workers, you'd currently need to implement this by wrapping a task. You'd have two options here:
```python
from dask.core import istask

def wrapper(func, *args):
    print("Before task")
    out = func(*args)
    print("After task")
    return out

def wrap_tasks(dsk):
    # Note that this only handles top-level tasks, and not nested tasks.
    # Rewrites (func, a, b, c) -> (wrapper, func, a, b, c)
    return {k: (wrapper,) + v if istask(v) else v
            for k, v in dsk.items()}
```
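To illustrate what this rewrite does, here is a self-contained sketch that reimplements `istask` using dask's task convention (a tuple whose first element is callable), so it runs without dask installed; the graph contents are made up for the example:

```python
from operator import add

def istask(x):
    # dask's convention: a task is a tuple whose first element is callable
    return type(x) is tuple and bool(x) and callable(x[0])

def wrapper(func, *args):
    print("Before task")
    out = func(*args)
    print("After task")
    return out

def wrap_tasks(dsk):
    # Prefix every top-level task with the wrapper; pass literals through
    return {k: (wrapper,) + v if istask(v) else v
            for k, v in dsk.items()}

dsk = {"x": 1, "y": (add, "x", 10)}
wrapped = wrap_tasks(dsk)
```

Here `wrapped["x"]` stays `1`, while `wrapped["y"]` becomes `(wrapper, add, "x", 10)`, so the scheduler executes `wrapper(add, ...)` and the pre/post prints fire around the task body.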
Worker plugins could be added, but we'd want to determine what plugin points were needed. If you think this is something you'd be interested in, it would be useful to better understand your use cases. In particular, why do you need to run code on the worker pre/post each task? Can you elaborate on your applications listed above?
First of all, thanks for the very detailed response and helpful implementation ideas.
Fair. Let me elaborate a bit:

Prometheus Metric Push

I'm talking about this feature. It harvests the metrics registered in the interpreter-wide global registry and pushes them to a pushgateway. The metrics I'm thinking about are registered on the worker, e.g.:
So you can now argue that the distributed worker already exposes Prometheus metrics and that these can be harvested. The problem here is the Prometheus data model, which works a bit like Hollywood: Prometheus calls you, and you cannot influence when you're called. Since we use our distributed cluster for a diverse set of tasks, we would like to know these metrics on a per-task level (ideally on the per-graph-node level) so we can better track the mentioned metrics on production systems. This requires us to push the metrics to a pushgateway after a node finishes.

Logging

It would be helpful if we could set up logging before a graph node starts so we can add tags to the logging messages depending on who submitted the computation to the distributed scheduler. These submissions can overlap, and multiple diverse computations might be active/running at the same time. Tagging the logging messages with "who submitted this and what is this computation called" is essential for debugging and even for finding the right logging messages in systems like Graylog.

Interpreter Setup

Sometimes you want to wipe/clean the interpreter state before a graph node runs, e.g. for debugging or for determinism. The state may include:
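The "push metrics after each graph node" idea can be combined with the task-wrapping approach described earlier in the thread. A rough sketch, using only the standard library: `push_metrics_to_gateway` here is a stand-in for a real push (e.g. `prometheus_client.push_to_gateway`), and the job name is made up for the example:

```python
import functools
import time

def push_metrics_to_gateway(job, metrics):
    # Stand-in for a real pushgateway call; a real implementation would
    # use prometheus_client.push_to_gateway with a CollectorRegistry.
    print(f"pushing {metrics} for job {job!r}")

def with_metric_push(func, job):
    """Wrap a task so its runtime is pushed after the node finishes."""
    @functools.wraps(func)
    def inner(*args, **kwargs):
        start = time.monotonic()
        try:
            return func(*args, **kwargs)
        finally:
            # runs even if the task raises, so failed nodes are recorded too
            elapsed = time.monotonic() - start
            push_metrics_to_gateway(job, {"task_runtime_seconds": elapsed})
    return inner

wrapped = with_metric_push(lambda a, b: a + b, job="etl-pipeline")
result = wrapped(1, 2)
```

The `try`/`finally` mirrors the per-node push requirement: the metric leaves the worker immediately after each node, instead of waiting for Prometheus to scrape on its own schedule.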
I don't quite understand this, and how it relates to the issue. Taking the title, "pre-post task hooks" literally, a scheduler plugin should suffice. @crepererum could you try to clear up my misunderstanding? FYI, worker plugins were recently added to distributed: dask/distributed#2453
I think he's looking for something more like Scheduler Plugins (which have hooks for a variety of events) but on the Worker. The recently added Worker Plugins only support hooks for setup/teardown currently, but more could be added for things like task transitions and other events.
@TomAugspurger sure. I think I kinda missed the two modes Prometheus offers. So the metrics are present on the worker and have to be gathered/pushed/collected from there. You have two ways of doing so:
Does this help clarify things, @TomAugspurger?
@crepererum is this something that you would be interested in doing?
Sure. Might take a week or two.
I think we can close this since dask/distributed#2994 was merged which basically covers the use cases I had in mind via the transition hook. |
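For reference, the transition hook from dask/distributed#2994 lets a worker plugin observe per-task state changes, which covers the pre/post use case. A minimal sketch, where the class name and timing logic are illustrative (only the `setup`/`transition`/`teardown` method names follow distributed's worker plugin interface), driven by hand here so no cluster is needed:

```python
import time

class TaskTimerPlugin:
    """Illustrative worker plugin timing each task via state transitions.

    In a real deployment this would be registered with
    client.register_worker_plugin(TaskTimerPlugin()); distributed then
    calls transition() for every task state change on the worker.
    """

    def setup(self, worker):
        self.starts = {}

    def transition(self, key, start, finish, **kwargs):
        if finish == "executing":
            # pre-task hook: the worker is about to run this key
            self.starts[key] = time.monotonic()
        elif start == "executing" and key in self.starts:
            # post-task hook: the key just left the executing state
            elapsed = time.monotonic() - self.starts.pop(key)
            print(f"{key} executed in {elapsed:.4f}s")

    def teardown(self, worker):
        self.starts.clear()

# Driving the hooks by hand to show the flow:
plugin = TaskTimerPlugin()
plugin.setup(worker=None)
plugin.transition("inc-1", "ready", "executing")
plugin.transition("inc-1", "executing", "memory")
plugin.teardown(worker=None)
```

The same transition pair is where the pushgateway call or per-computation logging setup from earlier in the thread would go.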
Great. Closing. Thanks @marco-neumann-jdas. @crepererum if you disagree please speak up and we'll reopen.
Abstract
I would like to request that dask (and therefore distributed) support pre-/post-task hooks. These are functions that can be executed before/after a node of the computation graph is executed on the target system (subprocess, distributed worker).
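The requested hook shape can be sketched as a context manager wrapped around each node's execution; the `pre`/`post` callables and key names below are illustrative, not an existing dask API:

```python
import contextlib

@contextlib.contextmanager
def task_hooks(key, pre, post):
    # Run the pre-hook, execute the node body, and guarantee the
    # post-hook fires even if the node raises.
    pre(key)
    try:
        yield
    finally:
        post(key)

events = []
with task_hooks("node-1",
                pre=lambda k: events.append(("pre", k)),
                post=lambda k: events.append(("post", k))):
    events.append(("run", "node-1"))
```

After the `with` block, `events` holds `("pre", ...)`, `("run", ...)`, `("post", ...)` in order, matching the before/after-node semantics requested here.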
Applications
Technical Details
Alternatives
This should currently be possible by writing a custom "optimization" pass, but it kinda feels like this would abuse the system. On the other hand, it is a generic system, and we could just extend the documentation to mention that such "bending" is possible.