How to create a workflow with 10k tasks #151
Comments
Is this issue relevant?
It does seem relevant, and I looked it over before submitting this issue, but unfortunately I'm still stuck. I think my issue may be simpler: if Hera already supports WorkflowTemplates, then I imagine it supports passing in params; I just haven't found an example yet.
How about using template_ref?

First, create a WorkflowTemplate:

    def echo(value: str):
        import subprocess
        subprocess.run(["echo", value])

    wts = WorkflowTemplateService(
        host=host,
        token=token,
        namespace=namespace,
    )
    wt = WorkflowTemplate(
        "echo-template", wts, namespace=namespace, service_account_name=service_account
    )
    # The task name ("echo-task") is what the TemplateRef points at later.
    t = Task(
        "echo-task",
        func=echo,
        func_params=[{"value": "DEFAULT_VALUE"}],
        image=image,
    )
    wt.add_task(t)
    wt.create(namespace=namespace)

Then specify the created WorkflowTemplate in Task.template_ref:

    ws = WorkflowService(
        host=host,
        token=token,
        namespace=namespace,
    )
    w = Workflow(
        "use-echo_template",
        ws,
        namespace=namespace,
        service_account_name=service_account,
    )
    echo_values = ["hoge", "fuga", "piyo"]
    # One task per value; each task reuses the stored template and only passes "value".
    tasks = [
        Task(
            f"task-{v}",
            template_ref=TemplateRef(name="echo-template", template="echo-task"),
            variables=[VariableAsEnv(name="value", value=f'"{v}"')],
        )
        for v in echo_values
    ]
    w.add_tasks(*tasks)
    w.create()

Here is the result.
If I'm wrong, please point it out. (CC: @flaviuvadan)
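To get a feel for why referencing a stored template helps with manifest size, here is a rough sketch that builds two plain-dict Workflow manifests and compares their serialized sizes. It does not use Hera at all. The helper names, the `bash -c` container, and the resource values are assumptions made for illustration; only the `templateRef` and `arguments.parameters` field names follow the Argo Workflows schema, and the template and parameter names are taken from the example above.

```python
import json

# Assumed resource values, used only to give the inline templates a realistic size.
RESOURCES = {
    "limits": {"cpu": "1", "memory": "4Gi"},
    "requests": {"cpu": "1", "memory": "4Gi"},
}


def _workflow(templates):
    """Wrap a list of templates in a minimal Workflow manifest."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "many-tasks-"},
        "spec": {"entrypoint": "main", "templates": templates},
    }


def inline_manifest(commands):
    """One full container template per command, all stored in the Workflow."""
    templates = [
        {
            "name": f"cmd{i}",
            "container": {
                "image": "python:3.7",
                "command": ["bash", "-c", cmd],
                "resources": RESOURCES,
            },
        }
        for i, cmd in enumerate(commands)
    ]
    tasks = [{"name": t["name"], "template": t["name"]} for t in templates]
    return _workflow([{"name": "main", "dag": {"tasks": tasks}}] + templates)


def template_ref_manifest(commands):
    """Each task only references the stored template and passes one parameter."""
    tasks = [
        {
            "name": f"cmd{i}",
            "templateRef": {"name": "echo-template", "template": "echo-task"},
            "arguments": {"parameters": [{"name": "value", "value": cmd}]},
        }
        for i, cmd in enumerate(commands)
    ]
    return _workflow([{"name": "main", "dag": {"tasks": tasks}}])


def size_in_bytes(manifest):
    """Serialized JSON size, a rough proxy for what the API server stores."""
    return len(json.dumps(manifest).encode("utf-8"))


commands = [f"echo {i}" for i in range(10_000)]
inline_size = size_in_bytes(inline_manifest(commands))
ref_size = size_in_bytes(template_ref_manifest(commands))
print(f"inline templates: {inline_size} bytes")
print(f"template refs:    {ref_size} bytes")
```

The JSON size is only an approximation of what ends up in etcd, and even the reference-only manifest grows linearly with the number of tasks, so very large task counts may still need to be split across several workflows.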
@dmerrick the issue of submitting large workflows is still an open problem for Argo. I recognize the pain point, as we run into it at @dynotx as well. For now, the typical approach is to split the work into several smaller workflows and submit them independently, until Argo either supports a persistence layer other than etcd or works around the etcd limit (via paged insertions, for example), whatever the end solution turns out to be. What @szdr suggested may also help! If the template is pretty big, particularly because of all the code that can go into it, it's most helpful to store the template on Argo and reuse it via a template ref.
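The batching approach can be sketched in plain Python. This is a minimal illustration, not Hera code: `batched` is a hypothetical helper, the batch size of 1,000 is an arbitrary assumption, and the step that turns each batch into its own Workflow and submits it is left as a comment because it depends on the client being used.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# 10k short commands, as in the original question.
commands = [f"echo {i}" for i in range(10_000)]

# Assumed batch size; in practice it would be tuned so that each
# resulting Workflow manifest stays below the size limit.
batches = list(batched(commands, 1_000))

for index, batch in enumerate(batches):
    # Here each batch would become its own Workflow (for example
    # "commands-batch-0", "commands-batch-1", ...) and be submitted on its own.
    print(f"batch {index}: {len(batch)} commands")
```

Because the batches are independent, a failure in one workflow does not block the others, at the cost of tracking several workflows instead of one.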
I'm definitely getting closer, thanks to @szdr's suggestions. I'll follow up when I have more to say.
Okay, @szdr's example worked for me, but it brought some other issues to my attention. I think this is the biggest one:
For instance, using the above example, the Task attached to the WorkflowTemplate has
Put differently, my final workflow has a number of

    - container:
        command:
        - python
        env:
        - name: cmd
          value: '{{inputs.parameters.cmd}}'
        env_from: []
        image: python:3.7
        image_pull_policy: Always
        resources:
          limits:
            cpu: '1'
            memory: 4Gi
          requests:
            cpu: '1'
            memory: 4Gi
        volume_mounts: []
      daemon: false
      inputs:
        artifacts: []
        parameters:
        - name: cmd
          value: '"echo 9"'
      metadata:
        annotations: {}
        labels: {}
      name: cmd9
      outputs:
        artifacts: []
      tolerations: []
      volume_claim_templates: []
      volumes: []

Not only are many of these values incorrect (i.e.
(It does work okay if I add the params to the Task that includes
This PR will avoid the addition of unnecessary templates.
Oh terrific! I'll try that out.
One critical thing brought up by @kromanow94 is how Hera adds duplicated templates (#154 (comment)). Perhaps fixing this will also help with more scaling!
Thank you for keeping this issue in mind. I considered closing it, but I figured it might be a good place to record ongoing cruft-removal projects. That being said, feel free to close if you're tired of looking at it.
I'd like to have a workflow with many tasks, but I'm running into the 1.5MB k8s/etcd size limit. The workflow isn't complicated: it's basically 10k very short bash commands that all run on the same image with the same resources, etc.
I think the solution here is to use a WorkflowTemplate, but I haven't figured out how to use input parameters with Hera.
I have something like this: