Automatic workflow duration prediction #2717

Closed
alexec opened this issue Apr 16, 2020 · 16 comments · Fixed by #4091
Labels: area/controller (Controller issues, panics), type/feature (Feature request)

@alexec
Contributor

alexec commented Apr 16, 2020

Summary

Users would like to see how long similar workflows took, so they can tell whether workflows are starting to take longer.

Motivation

When I'm using workflow templates, is this template taking longer to run?

Proposal

[Image: "Image from iOS"]

Add summary data about recent runs to the template status?

Related to #1658, #3557


Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

@alexec
Contributor Author

alexec commented Apr 16, 2020

@dtaniwaki @sarabala1979 do you think this would be possible?

@alexec
Contributor Author

alexec commented Apr 16, 2020

Could we solve this by just adding duration to the list view?

@dtaniwaki
Member

I think we can keep stats for workflow templates in an RDB or somewhere similar, calculate the average, median, etc., and show them in the list view.

But it may affect the performance of the controller. We may need to have an option to disable the feature.
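
As a rough illustration of the stats described above, here is a minimal sketch. It assumes past runs are already available as (template name, duration in seconds) pairs; the function name `summarize_durations` and the data shape are hypothetical and not part of Argo.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_durations(runs):
    """Group (template_name, duration_seconds) pairs and compute per-template stats.

    Hypothetical helper: the input shape is an assumption, not an Argo API.
    """
    by_template = defaultdict(list)
    for template, seconds in runs:
        by_template[template].append(seconds)
    return {
        template: {
            "count": len(durations),
            "mean": mean(durations),
            "median": median(durations),
        }
        for template, durations in by_template.items()
    }


# Example data (made up for illustration).
runs = [("build", 120.0), ("build", 180.0), ("build", 150.0), ("deploy", 60.0)]
stats = summarize_durations(runs)
```

Computing these on demand, rather than inside the controller's reconcile loop, would be one way to limit the performance impact mentioned above.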

@dtaniwaki
Member

Actually, we want to use these stats in the scheduler to optimize the scheduling algorithm. For example, suppose few resources are available and a high-priority pod is waiting when a low-priority pod arrives. If we can predict how long the low-priority pod will run, we can let it run before the waiting high-priority pod.
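
The scheduling idea above resembles what is often called backfilling. A minimal sketch of the decision follows, assuming the scheduler knows the currently free CPU and how long the high-priority pod must still wait; `PendingPod`, `can_backfill`, and all fields are hypothetical names for illustration, not an existing scheduler API.

```python
from dataclasses import dataclass


@dataclass
class PendingPod:
    name: str
    cpu_request: float        # CPU cores the pod asks for
    predicted_seconds: float  # assumed estimate derived from past runs


def can_backfill(low, free_cpu, seconds_until_high_can_start):
    """Return True if the low-priority pod fits in the free resources now
    and is predicted to finish before the high-priority pod could start."""
    fits_now = low.cpu_request <= free_cpu
    finishes_in_time = low.predicted_seconds <= seconds_until_high_can_start
    return fits_now and finishes_in_time


low = PendingPod("batch-job", cpu_request=2.0, predicted_seconds=300.0)
allowed = can_backfill(low, free_cpu=4.0, seconds_until_high_can_start=600.0)
denied = can_backfill(low, free_cpu=4.0, seconds_until_high_can_start=120.0)
```

The quality of the decision depends entirely on how reliable the predicted duration is, which is why the stats matter here.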

@alexec
Contributor Author

alexec commented Apr 20, 2020

The specific use case I want to solve, seeing historical durations, is already shown in the UI.

@alexec
Contributor Author

alexec commented Apr 20, 2020

There might be other things we want to aggregate, so I'll leave this open.

alexec changed the title from "Workflow template aggregate status" to "Automatic workflow duration prediction" on Jul 20, 2020
@alexec
Contributor Author

alexec commented Jul 23, 2020

#3558 produces a way to determine likely execution time and to dig into problems.

@alexec
Contributor Author

alexec commented Sep 18, 2020

Similar to #1658.

alexcapras pushed a commit to alexcapras/argo that referenced this issue Nov 12, 2020
Signed-off-by: Alex Capras <alexcapras@gmail.com>
@tszngai

tszngai commented Mar 22, 2021

How do I enable this feature to see the estimated time in my UI?

@alexec
Contributor Author

alexec commented Mar 22, 2021

It is automatically enabled. It estimates based on the most recently completed workflow that used the same workflow template or cron workflow.
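
The rule described above can be sketched roughly as follows. This is not the controller's actual code; `CompletedRun` and `estimate_duration` are hypothetical names, and the sketch assumes a list of completed runs is available in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CompletedRun:
    template: str          # workflow template or cron workflow name
    started_at: datetime
    finished_at: datetime


def estimate_duration(template: str, history: List[CompletedRun]) -> Optional[timedelta]:
    """Use the most recently completed run of the same template as the estimate.

    Returns None when no earlier run exists, so no estimate is available.
    """
    candidates = [run for run in history if run.template == template]
    if not candidates:
        return None
    latest = max(candidates, key=lambda run: run.finished_at)
    return latest.finished_at - latest.started_at


history = [
    CompletedRun("build", datetime(2021, 3, 20, 10, 0), datetime(2021, 3, 20, 10, 5)),
    CompletedRun("build", datetime(2021, 3, 21, 10, 0), datetime(2021, 3, 21, 10, 8)),
    CompletedRun("deploy", datetime(2021, 3, 21, 11, 0), datetime(2021, 3, 21, 11, 2)),
]
estimate = estimate_duration("build", history)
```

Under this rule, a workflow with no earlier completed run of the same template has nothing to base an estimate on.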

@tszngai

tszngai commented Mar 22, 2021

Thanks for the quick reply. I'm using v2.12.5 and I don't see the estimated time in it:
[image]

@tobisinghania
Contributor

Is it also possible to retrieve those metrics via the Prometheus metrics endpoint?

@alexec
Contributor Author

alexec commented Apr 9, 2021

Short answer - no.
Long answer - there are metrics related to workflow execution that might help you.

@tobisinghania
Contributor

Thanks for your answer, although it's not the one I was hoping for ;)
Unfortunately, I have not been able to create a diagram similar to the one in the UI using the workflow duration metric together with a gauge in Grafana (though this might only be due to a lack of skill on my end).

Maybe I will create the metric via a workaround that uses your API and pushes to Prometheus...

Thank you
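
One piece of the workaround mentioned above could look like the sketch below, which renders an estimated duration as a gauge in the Prometheus text exposition format. It covers only the formatting step; fetching the data from the API and delivering it to Prometheus are left out. The metric name `workflow_estimated_duration_seconds` and the `render_gauge` helper are hypothetical, not names Argo exposes.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render (labels, value) samples as a gauge in Prometheus text format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric name and made-up sample value.
text = render_gauge(
    "workflow_estimated_duration_seconds",
    "Estimated duration from the most recent run.",
    [({"template": "build"}, 150.0)],
)
```

The resulting text could then be served from a small exporter or sent to a Pushgateway, depending on the setup.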

argoproj locked as resolved and limited conversation to collaborators on Mar 11, 2024
agilgur5 added the area/controller (Controller issues, panics) label on Mar 11, 2024