Test coverage for aaff004 Static code analysis report
@mthaddon @jdkandersson you may want to review this PR. I have decided not to detect active runners between reconciliation events, as this could lead to a confusing visualisation (even if they are no longer active, they would still be shown as active in the panel until the next reconciliation event updates the status).
Isn't that also the case for active runners? That is, the active runners could finish their job within seconds of us reading the metric, and they would then be counted as active until the next reconciliation event, even though that information becomes outdated quite quickly. That metric should be interpreted as the number of runners that were active within the reconciliation window.
jdkandersson left a comment
Looks good; it would be good to get this into the edge deploy on Monday. Whether or not the active metric stays I'll leave up to you: there isn't a clear use case for it at this point, and it looks like it would be easy to add back in.
As there is no real use case (other than to see if active + idle adds up to the expected total or not), I fear that the metric will add more confusion than value to the dashboard viewer. The panel is not about real-time data, and the user needs to understand that the data points refer to the sum of the data from the last reconciliation events on each of the units. I have added an explanation to the panel, but I think most dashboard users will simply ignore it.
Applicable spec: n/a
Overview
Clean up the dashboard:
- Active Runner gauge and rename the graph to Idle Runners after Reconciliation.
- Idle Runners after Reconciliation graph.
- Idle Runners after Reconciliation to 60m instead of 10m for better reconciliation event consideration.
- Idle Runners count.
- Lifecycle Status graph counters as monotonically increasing within the selected range, disregarding values outside it.
Rationale
Clearer Representation:
Enhanced Visualization: Improved visual representation for better user comprehension.
Optimized Time Range: The 10m range leads to missing values due to the 30-minute reconciliation period in our deployment. Upgrading to 60m better aligns with the latest reconciliation events for units in normal scenarios.
Improved Idle Runner Detection: Corrects issues where idle runners weren't accurately detected post-spawning due to API detection delays.
Relevant Data Presentation: Focus on presenting pertinent data within the selected range, catering to user interests.
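To illustrate the time-range point, here is a minimal Python sketch of how a panel value could be derived as the sum of each unit's latest reconciliation event inside a lookback window. The `ReconciliationEvent` record, the `idle_after_reconciliation` helper, and the sample data are hypothetical and only serve the illustration; they are not taken from the charm or the dashboard query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReconciliationEvent:
    """Hypothetical record of one reconciliation event on one unit."""

    unit: str
    minute: int  # minutes since an arbitrary start
    idle_runners: int


def idle_after_reconciliation(events, now_min, window_min):
    """Sum the idle count of each unit's latest event inside the window."""
    latest = {}
    for event in events:
        # Only events inside (now - window, now] are visible to the panel.
        if now_min - window_min < event.minute <= now_min:
            current = latest.get(event.unit)
            if current is None or event.minute > current.minute:
                latest[event.unit] = event
    return sum(e.idle_runners for e in latest.values())


# Two units reconciling every 30 minutes (assumed sample data).
events = [
    ReconciliationEvent("unit/0", 0, 2),
    ReconciliationEvent("unit/1", 10, 3),
    ReconciliationEvent("unit/0", 30, 1),
    ReconciliationEvent("unit/1", 40, 4),
]

# At minute 45, a 10-minute window only sees unit/1; a 60-minute window sees both.
narrow = idle_after_reconciliation(events, now_min=45, window_min=10)
wide = idle_after_reconciliation(events, now_min=45, window_min=60)
```

With a window shorter than the reconciliation period, units whose last event falls outside the window drop out of the sum, which is the missing-values effect described above.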
Juju Events Changes
Module Changes
Modify the runner_manager module to calculate the metrics accordingly.
Library Changes
Checklist
src-docs
(urgent, trivial, complex)