From 12557a3112d893ccc2062f17510510d316a2e032 Mon Sep 17 00:00:00 2001
From: Ville Tuulos
Date: Mon, 14 Jul 2025 18:59:19 +0300
Subject: [PATCH] add exit hook docs

---
 docs/index.md                                 |  4 +-
 docs/metaflow/composing-flows/introduction.md |  3 +-
 .../scheduling-with-argo-workflows.md         |  8 +++
 docs/scaling/failures.md                      | 55 +++++++++++++++++++
 4 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/docs/index.md b/docs/index.md
index 1a642622..0339205a 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -40,8 +40,8 @@ Metaflow makes it easy to build and manage real-life data science, AI, and ML pr
 
 - [Introduction to Scalable Compute and Data](scaling/introduction)
 - [Computing at Scale](scaling/remote-tasks/introduction)
-- [Managing Dependencies](scaling/dependencies) ✨*New support for `uv`*✨
-- [Dealing with Failures](scaling/failures)
+- [Managing Dependencies](scaling/dependencies) ✨*New: support for `uv`*✨
+- [Dealing with Failures](scaling/failures) ✨*New: support for `@exit_hook`*✨
 - [Checkpointing Progress](scaling/checkpoint/introduction) ✨*New*✨
 - [Loading and Storing Data](scaling/data)
 - [Organizing Results](scaling/tagging)
diff --git a/docs/metaflow/composing-flows/introduction.md b/docs/metaflow/composing-flows/introduction.md
index 4f4fb1f9..e936811e 100644
--- a/docs/metaflow/composing-flows/introduction.md
+++ b/docs/metaflow/composing-flows/introduction.md
@@ -13,7 +13,8 @@ steps and flows. For example, you might define shared, project-specific patterns
  - Tracking data and model lineage,
  - Performing feature engineering and transformations,
  - Training and evaluating a model,
- - Accessing an external service, e.g. an LLM endpoint through a model router.
+ - Accessing an external service, e.g. an LLM endpoint through a model router,
+ - Making tools available for agentic workflows.
 
 You can handle cases like these by developing a shared library that encapsulates the logic and
 importing it in your steps. Metaflow will [package the
diff --git a/docs/production/scheduling-metaflow-flows/scheduling-with-argo-workflows.md b/docs/production/scheduling-metaflow-flows/scheduling-with-argo-workflows.md
index b52396a2..2ee3b90f 100644
--- a/docs/production/scheduling-metaflow-flows/scheduling-with-argo-workflows.md
+++ b/docs/production/scheduling-metaflow-flows/scheduling-with-argo-workflows.md
@@ -308,6 +308,14 @@ production.
 
 On Argo Workflows we support sending notifications on a successful or failed flow. To enable notifications, supply the `--notify-on-success/--notify-on-error` flags while deploying your flow. You must also configure the notification provider. The ones currently supported are
 
+### Custom notifications
+
+:::info
+New in Metaflow 2.16
+:::
+
+You can set up a custom function to be called on success or failure on Argo Workflows using [exit hooks](/scaling/failures#exit-hooks-executing-a-function-upon-success-or-failure).
+
 ### Slack notifications
 
 In order to enable Slack notifications, we need to first create a webhook endpoing that Metaflow can send the notifications to by following the instructions at https://api.slack.com/messaging/webhooks
diff --git a/docs/scaling/failures.md b/docs/scaling/failures.md
index 2f438d7b..54316e65 100644
--- a/docs/scaling/failures.md
+++ b/docs/scaling/failures.md
@@ -329,6 +329,60 @@ if __name__ == '__main__':
 
 This example handles a timeout in `start` gracefully without showing any exceptions.
 
+## Exit hooks: Executing a function upon success or failure
+
+:::info
+This is a new feature in Metaflow 2.16. Exit hooks work with local runs and when
+[deployed on Argo Workflows](/production/scheduling-metaflow-flows/scheduling-with-argo-workflows).
+:::
+
+Exit hooks let you define a special function that runs at the end of a flow, regardless
+of whether the flow succeeds or fails. Unlike the end step, which is skipped if the flow
+fails, exit hooks always run. This makes them suitable for tasks like sending notifications
+or cleaning up resources. However, since they run outside of steps, they cannot be used to
+produce artifacts.
+
+You can attach one or more exit hook functions to a flow using the `@exit_hook` decorator. For example:
+
+```python
+from metaflow import step, FlowSpec, Parameter, exit_hook, Run
+
+def success_print():
+    print("✅ Flow completed successfully!")
+
+def failure_print(run):
+    if run:
+        print(f"💥 Run {run.pathspec} failed. Failed tasks:")
+        for step in run:
+            for task in step:
+                if not task.successful:
+                    print(f" → {task.pathspec}")
+    else:
+        print("💥 Run failed during initialization")
+
+@exit_hook(on_error=[failure_print], on_success=[success_print])
+class ExitHookFlow(FlowSpec):
+    should_fail = Parameter(name="should-fail", default=False)
+
+    @step
+    def start(self):
+        print("Starting 👋")
+        print("Should fail?", self.should_fail)
+        if self.should_fail:
+            raise Exception("failing as expected")
+        self.next(self.end)
+
+    @step
+    def end(self):
+        print("Done! 🏁")
+
+if __name__ == "__main__":
+    ExitHookFlow()
+```
+
+Note that when deployed on Argo Workflows, exit hook functions execute as separate
+containers (pods), so they run even if a step fails, e.g. due to an out-of-memory condition.
+
 ## Summary
 
 Here is a quick summary of failure handling in Metaflow:
@@ -341,4 +395,5 @@ Here is a quick summary of failure handling in Metaflow:
   safely](failures.md#how-to-prevent-retries). It is a good idea to use `times=0` for `retry` in
   this case.
 * Use `timeout` with any of the above if your code can get stuck.
+* Use `@exit_hook` to execute custom functions upon success or failure.
 