How to create a workflow with 10k tasks #151

Closed
dmerrick opened this issue May 4, 2022 · 13 comments

@dmerrick
Contributor

dmerrick commented May 4, 2022

I'd like to have a workflow with many tasks, but I'm running into the 1.5MB k8s/etcd object size limit. The workflow isn't complicated: it's basically 10k very short bash commands that all run on the same image with the same resources.
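
To make the size problem concrete, here is a rough sketch that builds a plain-dict stand-in for a workflow manifest with one inline template per command and measures its serialized size. The field layout is a simplified assumption loosely modeled on an Argo Workflow spec, not what Hera actually emits, and the 1.5MB figure is simply the limit quoted above.

```python
import json

# Limit quoted in this issue; treated as an assumption, not a measured value.
ETCD_LIMIT_BYTES = int(1.5 * 1024 * 1024)


def inline_manifest(commands):
    """Build a simplified, hypothetical manifest with one inline template per command."""
    templates = []
    dag_tasks = []
    for i, cmd in enumerate(commands):
        name = f"cmd{i}"
        templates.append({
            "name": name,
            "container": {
                "image": "python:3.7",
                "command": ["bash", "-c", cmd],
                "resources": {
                    "requests": {"cpu": "1", "memory": "4Gi"},
                    "limits": {"cpu": "1", "memory": "4Gi"},
                },
            },
        })
        dag_tasks.append({"name": name, "template": name})
    # The entrypoint template fans out to every per-command template.
    templates.append({"name": "main", "dag": {"tasks": dag_tasks}})
    return {"spec": {"entrypoint": "main", "templates": templates}}


def manifest_size(manifest):
    """Size in bytes of the JSON-serialized manifest."""
    return len(json.dumps(manifest).encode("utf-8"))


commands = [f"echo {i}" for i in range(10_000)]
size = manifest_size(inline_manifest(commands))
print(f"{size} bytes, over limit: {size > ETCD_LIMIT_BYTES}")
```

Even with this stripped-down layout, repeating the image and resources for every command pushes the serialized manifest past the quoted limit.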

I think the solution here is to use a WorkflowTemplate, but I haven't figured out how to use input parameters with hera.

I have something like this:

# set up the WorkflowTemplate
wt = WorkflowTemplate(...)

# is this the right way to pass input in?
t = Task("cmd", lambda: _, command=["bash", "-c", "{{inputs.parameters.cmd}}"], ...)
wt.add_task(t)
wt.create()

# how do I get these commands into the workflow as parameters?
commands = ["echo foo", "echo bar"]

# create the Workflow
workflow = Workflow(..., workflow_template_ref=wt.name)
workflow.create()
@szdr
Contributor

szdr commented May 5, 2022

Is this issue relevant?
#138

@dmerrick
Contributor Author

dmerrick commented May 5, 2022

It does seem relevant and I looked it over before submitting this issue, but unfortunately I'm still feeling stuck.

I think maybe my issue is easier? If Hera already supports WorkflowTemplates, then I imagine it supports passing in params too. I just haven't found an example yet.

@szdr
Contributor

szdr commented May 6, 2022

How about using template_ref?

First, create a WorkflowTemplate.

def echo(value: str):
    import subprocess
    subprocess.run(["echo", value])


wts = WorkflowTemplateService(
    host=host,
    token=token,
    namespace=namespace,
)

wt = WorkflowTemplate(
    "echo-template", wts, namespace=namespace, service_account_name=service_account
)
t = Task(
    "echo-task",
    func=echo,
    func_params=[{"value": "DEFAULT_VALUE"}],
    image=image
)
wt.add_task(t)
wt.create(namespace=namespace)

Then specify the created WorkflowTemplate in Task.template_ref:

ws = WorkflowService(
    host=host,
    token=token,
    namespace=namespace,
)

w = Workflow(
    "use-echo-template",
    ws,
    namespace=namespace,
    service_account_name=service_account,
)

echo_values = ["hoge", "fuga", "piyo"]
tasks = [
    Task(
        f"task-{v}",
        template_ref=TemplateRef(name="echo-template", template="echo-task"),
        variables=[VariableAsEnv(name="value", value=f'"{v}"')]
    )
    for v in echo_values
]
w.add_tasks(*tasks)
w.create()

Here is the result.

STEP                  TEMPLATE                 PODNAME                       DURATION  MESSAGE
 ✔ use-echo-template  use-echo-template
 ├─✔ task-fuga        echo-template/echo-task  use-echo-template-2648658388  12s
 ├─✔ task-hoge        echo-template/echo-task  use-echo-template-2431862676  10s
 └─✔ task-piyo        echo-template/echo-task  use-echo-template-715656806   11s

If I'm wrong, please point it out. (CC: @flaviuvadan)

@flaviuvadan
Collaborator

@dmerrick the issue of submitting large workflows is still an open problem for Argo. I recognize the pain point, as we run into it at @dynotx as well. For now, the typical approach is to split the work into separate workflows and submit them independently, at least until Argo can use a persistence layer other than etcd, or works around the etcd limit via paged insertions or whatever the end solution turns out to be. What @szdr suggested may also help! If the template is pretty big, particularly because of all the code that can go into it, it's most helpful to store the template on Argo and reuse it via a template ref.
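
The batching approach can be sketched in plain Python as follows. This is only an illustration of splitting the command list. The `plan_batches` helper, the `cmd-batch-N` naming scheme, and the batch size of 1,000 are all hypothetical, and the actual workflow construction and submission is left as a placeholder.

```python
from typing import List, Sequence, Tuple


def plan_batches(commands: Sequence[str], batch_size: int) -> List[Tuple[str, List[str]]]:
    """Split commands into consecutive batches, each paired with a workflow name."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    plan = []
    for start in range(0, len(commands), batch_size):
        name = f"cmd-batch-{start // batch_size}"  # hypothetical naming scheme
        plan.append((name, list(commands[start:start + batch_size])))
    return plan


commands = [f"echo {i}" for i in range(10_000)]
plan = plan_batches(commands, batch_size=1_000)  # batch size is an arbitrary choice
for name, batch in plan:
    # Placeholder: build one workflow named `name` from `batch` and submit it.
    print(name, len(batch))
```

A batch size would need to be chosen so that each resulting workflow stays comfortably under the size limit.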

@dmerrick
Contributor Author

I'm definitely getting closer, thanks to @szdr's suggestions. I'll follow up when I have more to say.

@dmerrick
Contributor Author

Okay @szdr's example worked for me, but it brought some other issues to my attention. I think this is the biggest one: TemplateRef Tasks don't include all parameters from the original Task.

For instance, using the above example, the Task attached to the WorkflowTemplate has image=image defined. That image correctly appears in the WorkflowTemplate, but it does not appear in the subsequent Workflow. Other Task attributes like resources, image_pull_policy, etc. are also missing from the final workflow.

Put differently, my final workflow has a number of container sections that look like this:

  - container:
      command:
      - python
      env:
      - name: cmd
        value: '{{inputs.parameters.cmd}}'
      env_from: []
      image: python:3.7
      image_pull_policy: Always
      resources:
        limits:
          cpu: '1'
          memory: 4Gi
        requests:
          cpu: '1'
          memory: 4Gi
      volume_mounts: []
    daemon: false
    inputs:
      artifacts: []
      parameters:
      - name: cmd
        value: '"echo 9"'
    metadata:
      annotations: {}
      labels: {}
    name: cmd9
    outputs:
      artifacts: []
    tolerations: []
  volume_claim_templates: []
  volumes: []

Not only are many of these values incorrect (e.g. image and resources), but many of them could be excluded to make the final workflow YAML much smaller.

@dmerrick
Contributor Author

(It does work okay if I add the params to the Task that includes template_ref=TemplateRef(...), but then I end up duplicating all of the params over and over in the final workflow spec)
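
As a rough illustration of the duplication cost, the sketch below compares the serialized size of a task that only references the stored template against a per-task template shaped like the YAML above. The `templateRef` and `arguments.parameters` field names follow the Argo Workflow spec as I understand it. Everything else (helper names, values) is hypothetical, and Hera's real output may differ.

```python
import json


def ref_task(index, cmd):
    """A lean DAG task: a reference to the stored template plus its one argument."""
    return {
        "name": f"cmd{index}",
        "templateRef": {"name": "echo-template", "template": "echo-task"},
        "arguments": {"parameters": [{"name": "cmd", "value": cmd}]},
    }


def duplicated_template(index, cmd):
    """A per-task template mirroring the fields of the YAML excerpt above."""
    resources = {"cpu": "1", "memory": "4Gi"}
    return {
        "name": f"cmd{index}",
        "container": {
            "command": ["python"],
            "env": [{"name": "cmd", "value": "{{inputs.parameters.cmd}}"}],
            "env_from": [],
            "image": "python:3.7",
            "image_pull_policy": "Always",
            "resources": {"limits": resources, "requests": resources},
            "volume_mounts": [],
        },
        "daemon": False,
        "inputs": {"artifacts": [], "parameters": [{"name": "cmd", "value": cmd}]},
        "metadata": {"annotations": {}, "labels": {}},
        "outputs": {"artifacts": []},
        "tolerations": [],
    }


ref_bytes = len(json.dumps(ref_task(9, "echo 9")))
dup_bytes = len(json.dumps(duplicated_template(9, "echo 9")))
print(f"reference only: {ref_bytes} bytes, duplicated template: {dup_bytes} bytes")
```

The per-task difference is small in isolation, but it is multiplied by the number of tasks in the workflow.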

@szdr
Contributor

szdr commented May 10, 2022

Not only are many of these values incorrect (e.g. image and resources), but many of them could be excluded to make the final workflow YAML much smaller.

This PR will avoid the addition of unnecessary templates.
#152

@dmerrick
Contributor Author

Oh terrific! I'll try that out

@dmerrick
Contributor Author

That helped me a lot, thanks. With @szdr's example above and #152, I'm a lot closer to 10k tasks

@flaviuvadan
Collaborator

One critical thing brought up by @kromanow94 is how Hera adds duplicated templates (#154 (comment)). Perhaps fixing this will also help with more scaling!
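
For illustration only, here is one way template deduplication could work on plain dicts: templates that are identical apart from their name are collapsed into one, and tasks are repointed at the survivor. This is a hypothetical sketch (`dedupe_templates` is not a Hera function) and says nothing about how Hera itself should or does fix the problem.

```python
import json


def dedupe_templates(templates, tasks):
    """Collapse templates that differ only by name and repoint tasks at the survivor."""
    canonical = {}  # serialized template body -> name of the first template seen
    rename = {}     # original template name -> surviving template name
    unique = []
    for tpl in templates:
        body = {k: v for k, v in tpl.items() if k != "name"}
        key = json.dumps(body, sort_keys=True)  # stable key for structural equality
        if key not in canonical:
            canonical[key] = tpl["name"]
            unique.append(tpl)
        rename[tpl["name"]] = canonical[key]
    new_tasks = [{**task, "template": rename[task["template"]]} for task in tasks]
    return unique, new_tasks


templates = [
    {"name": "a", "container": {"image": "python:3.7"}},
    {"name": "b", "container": {"image": "python:3.7"}},  # same body as "a"
    {"name": "c", "container": {"image": "alpine"}},
]
tasks = [{"name": f"t-{n}", "template": n} for n in ("a", "b", "c")]
unique, new_tasks = dedupe_templates(templates, tasks)
print([t["name"] for t in unique], [t["template"] for t in new_tasks])
```

This only helps when the per-task differences live in arguments rather than in the template body itself.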

@dmerrick
Contributor Author

Thank you for keeping this issue in mind. I considered closing it, but I figured it might be a good place to record ongoing cruft-removal projects.

That being said, feel free to close if you're tired of looking at it

@stale

stale bot commented Jun 10, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 10, 2022
@stale stale bot closed this as completed Jun 18, 2022