Performance Improvement Calculator #226

benthomasson · 2020-08-04T18:06:03Z

Description

A common request from Tower operators is to improve the performance of their playbooks when applied to an inventory. This feature attempts to help them do that by pointing out where improvements would have the greatest effect on the overall playbook run.

The feature works a bit like the ROI calculator: it shows operators the current state of their system, and they can then tweak it to see what individual performance improvements would do to the overall performance of their playbooks. This is a visualization of Amdahl's Law as applied to Ansible playbooks.

The visualization could be based on this chart.

Where A and B would be different tasks in a playbook.

We can present a bar chart showing the duration of the tasks in a playbook and provide a speedup field (1.0X by default) for each task. Users can then tweak the per-task speedups to see the overall speedup calculated by Amdahl's Law.
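As a rough illustration of the calculation behind those fields, here is a minimal sketch of Amdahl's Law generalized to per-task speedups: the overall speedup is the original total duration divided by the sum of each task's duration over its speedup. The function name, the dict layout, and the sample durations are assumptions for illustration, not part of any existing API.

```python
def overall_speedup(durations, speedups):
    """Overall playbook speedup for hypothetical per-task speedups.

    durations: dict of task name -> seconds in the current run
    speedups:  dict of task name -> speedup factor (missing tasks default to 1.0)
    """
    total_before = sum(durations.values())
    total_after = sum(
        seconds / speedups.get(task, 1.0) for task, seconds in durations.items()
    )
    return total_before / total_after


# Hypothetical durations in seconds for two tasks, A and B.
durations = {"A": 30.0, "B": 10.0}

# Doubling the speed of the short task B helps less than doubling the long task A.
print(overall_speedup(durations, {"B": 2.0}))  # 40 / 35
print(overall_speedup(durations, {"A": 2.0}))  # 40 / 25
```

With all speedups left at the 1.0X default the function returns 1.0, which matches the "current state" view of the chart.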

Additionally, we can show task durations per host to graphically identify slow hosts. This could be in the same chart, with expandable bars that open to show one bar for each host that ran that task. We can pre-expand some bars if the variance between host durations is larger than some threshold, which could be user defined as well.
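One way the pre-expansion rule could work is sketched below, assuming per-host durations are available as a nested dict and that "variance" means the population variance of those durations. The data shape, the function name, and the threshold value are all hypothetical.

```python
from statistics import pvariance


def tasks_to_expand(host_durations, threshold):
    """Return the tasks whose per-host duration variance exceeds the threshold.

    host_durations: dict of task name -> dict of host name -> seconds
    threshold:      variance (seconds squared) above which a bar is pre-expanded
    """
    return [
        task
        for task, per_host in host_durations.items()
        # A single host has no spread, so there is nothing worth expanding.
        if len(per_host) > 1 and pvariance(list(per_host.values())) > threshold
    ]


# Hypothetical per-host durations in seconds.
host_durations = {
    "install packages": {"web1": 10.0, "web2": 11.0, "db1": 40.0},
    "copy config": {"web1": 2.0, "web2": 2.1, "db1": 1.9},
}

print(tasks_to_expand(host_durations, threshold=25.0))
```

A relative measure such as the coefficient of variation might be a better default than raw variance, since long tasks naturally have larger absolute spread; that choice is left open here.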

This calculator can be used to compare the current state of a playbook run to hypothetical playbook runs based on user-provided speedups. It can also be used to compare the performance of two runs of the same playbook by calculating the per-task speedups and the overall playbook speedup.
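The two-run comparison could be sketched as follows, assuming each run is summarized as a dict of task name to total duration. The names and sample numbers are hypothetical, and only tasks present in both runs are compared, which is an assumption about how renamed or removed tasks should be handled.

```python
def compare_runs(before, after):
    """Per-task and overall speedup between two runs of the same playbook.

    before, after: dict of task name -> seconds
    Returns (per_task_speedups, overall_speedup).
    """
    # Only compare tasks that ran in both runs with a measurable duration.
    common = [task for task in before if task in after and after[task] > 0]
    per_task = {task: before[task] / after[task] for task in common}
    overall = sum(before[t] for t in common) / sum(after[t] for t in common)
    return per_task, overall


# Hypothetical durations in seconds for two runs.
before = {"A": 30.0, "B": 10.0}
after = {"A": 15.0, "B": 10.0}

per_task, overall = compare_runs(before, after)
print(per_task)
print(overall)
```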

Mock up

Add mock up here when ready

Related PRs

Add PRs here when ready

Verification

Screenshot

Add screenshot of implementation here when implemented

Steps

Add verification steps here when ready for QE

Ladas · 2020-08-04T20:03:12Z

@benthomasson we should be able to get the avg task time distribution for a template from the event explorer API (after some tweaks and adding the real duration of tasks into rollups)

Then it's all UI magic to drag these, to compute possible speedups.

Btw. we should show the avg task duration in the selected time period and maybe the distribution, e.g. with a quartile chart

https://github.com/RedHatInsights/tower-analytics-backend/issues/478
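For the quartile view of task durations mentioned above, a minimal sketch using only the Python standard library might look like this. The input is assumed to be a flat list of durations for one task over the selected time period; the function name and sample data are hypothetical and do not reflect the event explorer API.

```python
from statistics import mean, quantiles


def duration_summary(durations):
    """Average and quartiles of one task's durations (needs at least two samples)."""
    q1, median, q3 = quantiles(durations, n=4)
    return {"avg": mean(durations), "q1": q1, "median": median, "q3": q3}


# Hypothetical durations in seconds for one task across several runs.
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
summary = duration_summary(samples)
print(summary)
```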

Ladas · 2020-08-05T13:32:24Z

@benthomasson currently we track these task states (similar to tower)

ok
failed
unreachable
skipped
retry
changed
ignored_failed
ignored_unreachable
rescued_failed
rescued_unreachable

I'll expose the duration of each and we should show the distribution. And we should probably allow the user to filter on only some of these? E.g. unreachable and failed will be eliminated if we filter down to only successful jobs.

Then this brings more useful insight, e.g. seeing some task taking a long time but always being skipped or never changing anything or having a lot of retries, etc... Each of these will provide a hint how we can optimize the task.

And we'd probably be showing e.g. average run of this task as changed for 1 host vs. average run of this task as skipped for 1 host
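The per-state averages and the state filter discussed above could be sketched like this. The event records and their field names (`task`, `host`, `state`, `duration`) are assumptions for illustration; only the state names come from the list in this comment.

```python
from collections import defaultdict
from statistics import mean


def avg_duration_by_state(events, states=None):
    """Average per-host duration for each (task, state) pair.

    events: iterable of dicts with hypothetical keys task, host, state, duration
    states: optional set of states to keep; None keeps every state
    """
    grouped = defaultdict(list)
    for event in events:
        if states is None or event["state"] in states:
            grouped[(event["task"], event["state"])].append(event["duration"])
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical events: one record per task per host.
events = [
    {"task": "restart service", "host": "web1", "state": "changed", "duration": 12.0},
    {"task": "restart service", "host": "web2", "state": "changed", "duration": 14.0},
    {"task": "restart service", "host": "web3", "state": "skipped", "duration": 0.2},
    {"task": "restart service", "host": "db1", "state": "failed", "duration": 30.0},
]

everything = avg_duration_by_state(events)
successful = avg_duration_by_state(events, states={"ok", "changed", "skipped"})
print(everything)
print(successful)
```

Comparing the `changed` and `skipped` averages for the same task gives the "changed for 1 host vs. skipped for 1 host" view directly.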

cswiii · 2020-09-01T15:39:19Z

Perhaps we could call this something snazzy like "Performance Profiler"?

benthomasson · 2020-09-01T16:23:34Z

I have changed the name a few times myself. I was calling it "Performance Planner" in my head recently. Performance Profiler sounds good.

benthomasson · 2020-11-06T15:17:55Z

How do customers find the long-running templates? Do we need a visualization or table of the longest-running templates?

benthomasson · 2020-11-06T15:18:20Z

This would be useful for developers or architects.

jctanner · 2021-01-07T15:41:31Z

migrated to https://issues.redhat.com/projects/AA/issues/AA-163

benthomasson added the JIRA label Jan 6, 2021

jctanner closed this as completed Jan 7, 2021
