From 12557a3112d893ccc2062f17510510d316a2e032 Mon Sep 17 00:00:00 2001
From: Ville Tuulos
Date: Mon, 14 Jul 2025 18:59:19 +0300
Subject: [PATCH 1/3] add exit hook docs

---
 docs/index.md                                 |  4 +-
 docs/metaflow/composing-flows/introduction.md |  3 +-
 .../scheduling-with-argo-workflows.md         |  8 +++
 docs/scaling/failures.md                      | 55 +++++++++++++++++++
 4 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/docs/index.md b/docs/index.md
index 1a642622..0339205a 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -40,8 +40,8 @@ Metaflow makes it easy to build and manage real-life data science, AI, and ML pr
 
 - [Introduction to Scalable Compute and Data](scaling/introduction)
 - [Computing at Scale](scaling/remote-tasks/introduction)
-- [Managing Dependencies](scaling/dependencies) ✨*New support for `uv`*✨
-- [Dealing with Failures](scaling/failures)
+- [Managing Dependencies](scaling/dependencies) ✨*New: support for `uv`*✨
+- [Dealing with Failures](scaling/failures) ✨*New: support for `@exit_hook`*✨
 - [Checkpointing Progress](scaling/checkpoint/introduction) ✨*New*✨
 - [Loading and Storing Data](scaling/data)
 - [Organizing Results](scaling/tagging)
diff --git a/docs/metaflow/composing-flows/introduction.md b/docs/metaflow/composing-flows/introduction.md
index 4f4fb1f9..e936811e 100644
--- a/docs/metaflow/composing-flows/introduction.md
+++ b/docs/metaflow/composing-flows/introduction.md
@@ -13,7 +13,8 @@ steps and flows. For example, you might define shared, project-specific patterns
  - Tracking data and model lineage,
  - Performing feature engineering and transformations,
  - Training and evaluating a model,
- - Accessing an external service, e.g. an LLM endpoint through a model router.
+ - Accessing an external service, e.g. an LLM endpoint through a model router,
+ - Making tools available for agentic workflows.
 
 You can handle cases like these by developing a shared library that encapsulates the logic and importing it in your steps.
Metaflow will [package the
diff --git a/docs/production/scheduling-metaflow-flows/scheduling-with-argo-workflows.md b/docs/production/scheduling-metaflow-flows/scheduling-with-argo-workflows.md
index b52396a2..2ee3b90f 100644
--- a/docs/production/scheduling-metaflow-flows/scheduling-with-argo-workflows.md
+++ b/docs/production/scheduling-metaflow-flows/scheduling-with-argo-workflows.md
@@ -308,6 +308,14 @@ production.
 
 On Argo Workflows we support sending notifications on a successful or failed flow. To enable notifications, supply the `--notify-on-success/--notify-on-error` flags while deploying your flow. You must also configure the notification provider. The ones currently supported are
 
+### Custom notifications
+
+:::info
+New in Metaflow 2.16
+:::
+
+You can set up a custom function to be called on success or failure on Argo Workflows using [exit hooks](/scaling/failures#exit-hooks-executing-a-function-upon-success-or-failure).
+
 ### Slack notifications
 
 In order to enable Slack notifications, we need to first create a webhook endpoing that Metaflow can send the notifications to by following the instructions at https://api.slack.com/messaging/webhooks
diff --git a/docs/scaling/failures.md b/docs/scaling/failures.md
index 2f438d7b..54316e65 100644
--- a/docs/scaling/failures.md
+++ b/docs/scaling/failures.md
@@ -329,6 +329,60 @@ if __name__ == '__main__':
 
 This example handles a timeout in `start` gracefully without showing any exceptions.
 
+## Exit hooks: Executing a function upon success or failure
+
+:::info
+This is a new feature in Metaflow 2.16. Exit hooks work with local runs and when
+[deployed on Argo Workflows](/production/scheduling-metaflow-flows/scheduling-with-argo-workflows).
+:::
+
+Exit hooks let you define a special function that runs at the end of a flow, regardless
+of whether the flow succeeds or fails. Unlike the end step, which is skipped if the flow
+fails, exit hooks always run.
This makes them suitable for tasks like sending notifications
+or cleaning up resources. However, since they run outside of steps, they cannot be used to
+produce artifacts.
+
+You can attach one or more exit hook functions to a flow using the `@exit_hook` decorator. For example:
+
+```python
+from metaflow import step, FlowSpec, Parameter, exit_hook, Run
+
+def success_print():
+    print("✅ Flow completed successfully!")
+
+def failure_print(run):
+    if run:
+        print(f"💥 Run {run.pathspec} failed. Failed tasks:")
+        for step in run:
+            for task in step:
+                if not task.successful:
+                    print(f" → {task.pathspec}")
+    else:
+        print(f"💥 Run failed during initialization")
+
+@exit_hook(on_error=[failure_print], on_success=[success_print])
+class ExitHookFlow(FlowSpec):
+    should_fail = Parameter(name="should-fail", default=False)
+
+    @step
+    def start(self):
+        print("Starting 👋")
+        print("Should fail?", self.should_fail)
+        if self.should_fail:
+            raise Exception("failing as expected")
+        self.next(self.end)
+
+    @step
+    def end(self):
+        print("Done! 🏁")
+
+if __name__ == "__main__":
+    ExitHookFlow()
+```
+
+Note that when deployed on Argo Workflows, exit hook functions execute as separate
+containers (pods), so they will execute even if steps fail e.g. due to out of memory condition.
+
 ## Summary
 
 Here is a quick summary of failure handling in Metaflow:
@@ -341,4 +395,5 @@ Here is a quick summary of failure handling in Metaflow:
   safely](failures.md#how-to-prevent-retries). It is a good idea to use `times=0` for
   `retry` in this case.
 * Use `timeout` with any of the above if your code can get stuck.
+* Use `@exit_hook` to execute custom functions upon success or failure.
From 4f988c1e470d84bf229da7ae7a1201218d620412 Mon Sep 17 00:00:00 2001
From: Ville Tuulos
Date: Mon, 14 Jul 2025 20:17:47 +0300
Subject: [PATCH 2/3] add a note about @exit_hook(options=)

---
 docs/scaling/failures.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/docs/scaling/failures.md b/docs/scaling/failures.md
index 54316e65..07147e60 100644
--- a/docs/scaling/failures.md
+++ b/docs/scaling/failures.md
@@ -383,6 +383,15 @@ if __name__ == "__main__":
 Note that when deployed on Argo Workflows, exit hook functions execute as separate
 containers (pods), so they will execute even if steps fail e.g. due to out of memory condition.
 
+### Custom dependencies in exit hooks
+
+Since exit hook functions are not steps, you can't use `@pypi` or `@conda` to manage
+their dependencies.
+Instead, you can provide a custom image in `options={'image': ...}` like here:
+```
+@exit_hook(on_error=[failure_print], options={"image": URL_TO_AN_IMAGE})
+```
+
 ## Summary
 
 Here is a quick summary of failure handling in Metaflow:

From 8591ee83cd60c12c6b46bcd97cf6d150070b438f Mon Sep 17 00:00:00 2001
From: Ville Tuulos
Date: Mon, 14 Jul 2025 20:26:51 +0300
Subject: [PATCH 3/3] improve custom deco docs

---
 docs/metaflow/composing-flows/custom-decorators.md | 4 +++-
 docs/metaflow/composing-flows/mutators.md          | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/metaflow/composing-flows/custom-decorators.md b/docs/metaflow/composing-flows/custom-decorators.md
index d85d68af..45b5eaa1 100644
--- a/docs/metaflow/composing-flows/custom-decorators.md
+++ b/docs/metaflow/composing-flows/custom-decorators.md
@@ -107,7 +107,9 @@ python waiterflow.py run --with myprofiler.my_profile
 Custom decorators don't require special treatment when [executing tasks in the cloud](/scaling/remote-tasks/introduction) or when [deploying flows to production](/production/introduction) -
-they will get [packaged automatically by Metaflow](/scaling/dependencies#unpacking-a-metaflow-project).
Try it:
+they will get [packaged automatically by Metaflow](/scaling/dependencies#unpacking-a-metaflow-project). You don't
+even need to `import` the decorators if you add them with `--with`. Try it:
+
 ```
 python waiterflow.py run --with myprofiler.my_profile --with kubernetes
 ```
diff --git a/docs/metaflow/composing-flows/mutators.md b/docs/metaflow/composing-flows/mutators.md
index 80248365..0829a953 100644
--- a/docs/metaflow/composing-flows/mutators.md
+++ b/docs/metaflow/composing-flows/mutators.md
@@ -122,6 +122,8 @@ You can test the effect of the options with `@robust_flow` above. You can see th
 python failflow.py dump RUN_ID/start
 ```
 
+Note that the same options apply to adding flow-level decorators as well.
+
 ## Introspecting a flow and applying configs
 
 Let's walk through a more advanced mutator that shows how you can