feature: add argo workflows list-runs command #1416

saikonen · 2023-05-16T13:29:07Z

add command for listing runs (and filtering by status) for argo-workflows

addresses part of #1387

Note: due to API limitations, this feature required adding a custom label to workflows with the workflow-template name as its value, as labels are the only available filter for custom fields on objects. This makes the feature not backwards compatible, as older runs do not have the necessary label.

savingoyal · 2023-06-05T15:19:25Z

metaflow/plugins/argo/argo_client.py

+        # including the following: spec.workflowTemplateRef.name, status.phase
+        # Therefore we use labels for filtering the runs, but this has required us to add a label for the workflow-template name
+        # making the solution not backwards compatible.
+        filters = ["workflows.argoproj.io/workflow-template=%s" % name] + [


can you verify that this solution works when the "name" is rather large?

I'll change this to a hash of the name as discussed. Kubernetes limits label values to 63 characters, but Argo workflow names can be up to 253 characters long.
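For reference, the label-based filtering in the diff above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `build_label_selector` is hypothetical, and it assumes that Argo exposes the run status through the `workflows.argoproj.io/phase` label, so that both the template and the status can go into a single Kubernetes label selector.

```python
# Hypothetical helper, not part of this PR: builds a Kubernetes label selector
# string for listing the workflows (runs) that belong to one workflow template.
TEMPLATE_LABEL = "workflows.argoproj.io/workflow-template"
# Assumption: Argo mirrors status.phase into this label on each workflow.
PHASE_LABEL = "workflows.argoproj.io/phase"


def build_label_selector(template_name, phases=None):
    # Equality-based requirement on the workflow template label.
    selectors = ["%s=%s" % (TEMPLATE_LABEL, template_name)]
    if phases:
        # Set-based requirement, so several statuses can be matched at once.
        # Sorting keeps the output deterministic.
        selectors.append("%s in (%s)" % (PHASE_LABEL, ",".join(sorted(phases))))
    # Requirements joined with a comma are ANDed together by Kubernetes.
    return ",".join(selectors)


selector = build_label_selector("helloflow", phases=["Running", "Failed"])
```

A string like this could then be passed as the `label_selector` argument when listing the workflow custom objects through the Kubernetes client. As noted in the comment above, the template name itself may not fit in a label value, which is why a hash is used instead.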

metaflow/plugins/argo/argo_workflows.py

metaflow/plugins/argo/argo_workflows_cli.py

savingoyal

lgtm! few minor comments

metaflow/plugins/argo/argo_workflows.py

saikonen · 2023-07-05T14:21:45Z

Some thoughts and observations on this feature before shipping.
We currently truncate the names of our deployed Argo workflows to the maximum that Kubernetes allows for metadata names (253 characters). This has some problems, the main one being that these workflows cannot be launched through Argo's web UI, as it tries to populate a label with the workflow template name.

Creating workflows through the Argo web UI triggers a validation that checks that the metadata name does not exceed 63 characters (so that it fits in a label). Argo is therefore much stricter with naming than we are with submitted workflows. For the Metaflow use case, however, we cannot start truncating workflow names all the way down to 63 characters, mainly because of how project branches are named. As an example, the workflow template names of these two flows would collide if truncated to 63 characters:

averylongprojectname.user.verylongfirstname.verylonglastnamecompanyemailaddress.com.FirstFlow
averylongprojectname.user.verylongfirstname.verylonglastnamecompanyemailaddress.com.SecondFlow

which is a very real possibility given the recent change that allows email addresses as usernames for deployed flows.
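The collision can be checked with a few lines of Python. This is only an illustration of the naming problem using the two example names above; the `truncate` helper is hypothetical and is not how Metaflow generates names.

```python
import os.path

# Maximum length of a Kubernetes label value.
LABEL_MAX_LENGTH = 63

first = "averylongprojectname.user.verylongfirstname.verylonglastnamecompanyemailaddress.com.FirstFlow"
second = "averylongprojectname.user.verylongfirstname.verylonglastnamecompanyemailaddress.com.SecondFlow"


def truncate(name, limit=LABEL_MAX_LENGTH):
    # Naive truncation: keep only the first `limit` characters.
    return name[:limit]


# The two names only differ after their shared project/branch prefix.
shared_prefix_length = len(os.path.commonprefix([first, second]))

# True whenever the shared prefix is at least as long as the limit.
collides = truncate(first) == truncate(second)
```

Because the shared prefix is longer than 63 characters, both names truncate to the same string even though the full names differ.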

For the scope of this PR, the only solution for adding discoverability to workflows via the Kubernetes API is to introduce a custom label that uses a hash of the workflow template name, as was implemented.

saikonen · 2023-07-05T14:24:54Z

metaflow/plugins/argo/argo_workflows.py

+    def _label_hash(name):
+        # Hash a name for use as a Kubernetes label.
+        # Use the maximum allowed 63 characters for the hash to minimize collisions.
+        # Preserve part of the name for legibility purposes.


after some thought, preserving legibility is probably not a priority for a label, as labels are not meant for human consumption to begin with. If we want to store the template name as well, this can be done in the annotations, which do not have the same length limitations. I don't see a use case for storing the full template name at this point though.
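A rough sketch of what a purely hash-based label could look like, with the full name kept in an annotation if it is ever needed. This is not the `_label_hash` implementation from this PR: the function names, the choice of SHA-256, and the `example.com/...` keys are all assumptions made for illustration.

```python
import hashlib
import re

# Kubernetes label values: at most 63 characters, alphanumeric at both ends,
# with dashes, underscores and dots allowed in between.
LABEL_MAX_LENGTH = 63
LABEL_VALUE_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9_.-]*[A-Za-z0-9])?$")


def label_hash(name):
    # A SHA-256 hex digest is 64 characters, one more than a label allows,
    # so keep the first 63. Hex digits are always valid label characters.
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return digest[:LABEL_MAX_LENGTH]


def workflow_metadata(template_name):
    # Hypothetical keys: the label carries the fixed-length hash used for
    # filtering, the annotation carries the untruncated, human-readable name.
    return {
        "labels": {"example.com/workflow-template-hash": label_hash(template_name)},
        "annotations": {"example.com/workflow-template": template_name},
    }


metadata = workflow_metadata("a" * 253)
```

With this approach the label value has the same length regardless of how long the template name is, and lookups only need to recompute the hash from the name.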

saikonen · 2023-09-04T12:36:29Z

this feature is somewhat blocked until #1521 is solved. After that we have a canonical way of generating the workflow-template name label, and can use that for finding the runs.

saikonen added 5 commits May 15, 2023 19:30

wip: list executions for argo

e34806a

filtering by status works.

4cffe37

add custom label to argo workflows for the workflow template name

648a2c8

use custom labels for filtering runs from argo workflows

7c695c8

remove limit from listing query

09d462d

saikonen requested a review from savingoyal May 16, 2023 13:29