Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Task/Workflow hooks or modifiers #258

Closed
JacobHayes opened this issue Sep 14, 2022 · 0 comments · Fixed by #351
Closed

Support for Task/Workflow hooks or modifiers #258

JacobHayes opened this issue Sep 14, 2022 · 0 comments · Fixed by #351
Assignees

Comments

@JacobHayes
Copy link
Contributor

Hi, I'd like to be able to set some "global" hooks that will be called during Task or Workflow creation that can modify them. For example, my company has:

  • custom GKE selectors and tolerations required to use high mem instances that must be set in any code that may provision high mem jobs
  • custom GKE volume storage classes that set WaitForFirstConsumer to prevent volumes from being provisioned in 1 zone, but the pod in another (leading to unschedulable jobs).
  • labels that we apply to most jobs for cost and other attribution; we'd like to either add defaults if not set OR error to catch missing ones

For all tasks/workflows submitted within our projects, we'd like these to be handled automatically, rather than every use needing the extra logic. Additionally, I've been thinking about writing some code that wraps hera and may not expose Tasks/Workflows directly (ie: it will create those for the user) - this layer would make it a bit more challenging to apply these customizations, but hooks might be an easy way to bypass.

--

A couple design points / questions:

  • I think it'd be useful for these to run on Task/Workflow creation (ie: at the end of __init__), but another option might be to call them on .build (just before generating the spec). I think on __init__ would be beneficial because:
    • it would allow folks to inspect the generated object and tweak further if desired
    • any hook errors, intentionally or not, will show tracebacks right at the source, not later where .build is finally is called
  • the hooks could probably be registered to a specific WorkflowService instance, rather than be "true" globals. However, setting them on a WorkflowService instance would force us to use the .build approach from the item above, otherwise the Task won't have the service available to identify the hooks. If they were true global hooks, then Task.__init__ could just find and call them at the end.
  • What is the right name for this concept, "hook", "modifier", etc?
    • I somewhat like "hook" because nothing forces the implementation to modify (though I don't know what else they'd do 😅) and it hints at the mechanism (that things are being "intercepted")

--

Spitballing the updates for WorkflowService, maybe something like:

TaskHook = Callable[[Task], None]
WorkflowHook = Callable[[Workflow], None]

# Add instance attributes tracking the hooks (pretending this is a pydantic model, so setting fields this way). dicts instead of lists may be useful in case we want to de-register or override.
class WorkflowService(...):
    task_hooks = list[TaskHook] # An ordered set of hooks that will be called on Task creation in FIFO order.
    workflow_hooks = list[WorkflowHook] # An ordered set of hooks that will be called on Workflow creation in FIFO order.

    def register_{task,workflow}_hook(self, hook: {Task,Workflow}Hook) -> {Task,Workflow}Hook:
        self.{task,workflow}_hooks.append(hook)
        return hook # Allow use as a @decorator

Then, perhaps Workflow.build would lookup the hooks for the provided (or defaulted) WorkflowService and call all the appropriate hooks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants