Skip to content

COS: Clean up dashboard#157

Merged
cbartz merged 14 commits intomainfrom
ISD-1389-update-dashboard
Dec 1, 2023
Merged

COS: Clean up dashboard#157
cbartz merged 14 commits intomainfrom
ISD-1389-update-dashboard

Conversation

@cbartz
Copy link
Copy Markdown
Collaborator

@cbartz cbartz commented Nov 29, 2023

Applicable spec: n/a

Overview

Clean up the dashboard :

  1. Remove the Active Runner gauge and rename the graph to Idle Runners after Reconciliation.
  2. Display only whole numbers (integers) for the Idle Runners after Reconciliation graph.
  3. Adjust range aggregation in Idle Runners after Reconciliation to 60m instead of 10m for better reconciliation event consideration.
  4. Include offline but healthy runners in the Idle Runners count.
  5. Showcase Lifecycle Status graph counters as monotonically increasing within the selected range, disregarding values outside it.

Screenshot from 2023-12-01 08-05-25

Rationale

  1. Clearer Representation:

    • Focused on observing overall trends of idle runners for user provisioning.
    • Active Runner display during reconciliation can mislead user interpretation.
  2. Enhanced Visualization: Improved visual representation for better user comprehension.

  3. Optimized Time Range: 10m range leads to missing values due to our 30-minute reconciliation period in our deployment. Upgrading to 60m better aligns with the latest reconciliation events for units in normal scenarios.

  4. Improved Idle Runner Detection: Corrects issues where idle runners weren't accurately detected post-spawning due to API detection delays.

  5. Relevant Data Presentation: Focus on presenting pertinent data within the selected range, catering to user interests.

Juju Events Changes

Module Changes

Modify the runner_manager module to calculate the metrics accordingly.

Library Changes

Checklist

@cbartz cbartz changed the title [WIP] COS: Clean up dashboard COS: Clean up dashboard Dec 1, 2023
@cbartz cbartz marked this pull request as ready for review December 1, 2023 07:53
@cbartz cbartz requested a review from a team as a code owner December 1, 2023 07:53
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Dec 1, 2023

Test coverage for aaff004

Name                    Stmts   Miss Branch BrPart  Cover   Missing
-------------------------------------------------------------------
src/charm.py              429     98    107     25    75%   79-80, 86-88, 112-113, 117-119, 155-157, 167->179, 177, 220-228, 240-244, 275, 352-357, 378, 395-396, 405-428, 446-448, 456-459, 466, 469-470, 486, 491-496, 541->544, 557-558, 560-561, 595-602, 626-627, 639-641, 656-657, 682-683, 685-686, 688-689, 763->765, 826-827, 843, 861-863, 867
src/charm_state.py         50      6      6      0    86%   76-83
src/errors.py              34      0      0      0   100%
src/event_timer.py         42      8      4      0    74%   94-97, 117-120
src/firewall.py            43     25     10      0    38%   38-42, 64-67, 75-149
src/github_type.py         36      0      0      0   100%
src/lxd_type.py            37      0      2      0   100%
src/metrics.py             67      2     10      1    96%   60->63, 150-151
src/runner.py             334     70    108     26    74%   49->51, 105, 210-217, 223-229, 256-257, 301-310, 334-339, 344, 364, 368-378, 427->430, 433-435, 442, 456, 466, 470, 472, 487, 535-540, 556-569, 580-612, 617, 655, 701-703, 707, 725, 760, 786, 791-803, 817, 837, 853
src/runner_manager.py     273     26    102      9    90%   166, 223-225, 238-239, 251-253, 259-264, 268-269, 279-280, 299, 422, 441-445, 470, 553, 702, 722
src/runner_metrics.py     112      5     22      2    95%   133-134, 146, 215-216
src/runner_type.py         55      0     12      0   100%
src/shared_fs.py           80     10     12      0    89%   73-74, 160-161, 166-167, 173-174, 196-197
src/utilities.py           68      6     20      8    84%   74->76, 78->84, 89-91, 121, 135, 173, 226
-------------------------------------------------------------------
TOTAL                    1660    256    415     71    82%

Static code analysis report

Run started:2023-12-01 07:54:04.660047

Test results:
  No issues identified.

Code scanned:
  Total lines of code: 3720
  Total lines skipped (#nosec): 0
  Total potential issues skipped due to specifically being disabled (e.g., #nosec BXXX): 10

Run metrics:
  Total issues (by severity):
  	Undefined: 0
  	Low: 0
  	Medium: 0
  	High: 0
  Total issues (by confidence):
  	Undefined: 0
  	Low: 0
  	Medium: 0
  	High: 0
Files skipped (0):

@cbartz
Copy link
Copy Markdown
Collaborator Author

cbartz commented Dec 1, 2023

@mthaddon @jdkandersson you may want to review this PR. I have decided not to detect active runners between reconciliation events, as this could lead to confusing visualisation (even if they are no longer active, they would still be shown as active in the panel until the next reconciliation event updates the status).

@jdkandersson
Copy link
Copy Markdown
Contributor

@mthaddon @jdkandersson you may want to review this PR. I have decided not to detect active runners between reconciliation events, as this could lead to confusing visualisation (even if they are no longer active, they would still be shown as active in the panel until the next reconciliation event updates the status).

Isn't that also the case for active runners, i.e., the active runners could finish their job within seconds of us reading the metric and then they would be counted as active until the next reconciliation event even though that information is outdated quite quickly. That metric should be interpreted as the number of runners that were active within the reconciliation window

Copy link
Copy Markdown
Contributor

@jdkandersson jdkandersson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, it would be good to get this into the edge deploy on Monday. Whether or not the active stays I'll leave up to you, there isn't a clear use case for it at this point and it looks like it would be easy to add back in

@cbartz
Copy link
Copy Markdown
Collaborator Author

cbartz commented Dec 1, 2023

Looks good, it would be good to get this into the edge deploy on Monday. Whether or not the active stays I'll leave up to you, there isn't a clear use case for it at this point and it looks like it would be easy to add back in

As there is no real use case (other than to see if active + idle adds up to the expected total or not), I fear that the metric will add more confusion than value to the dashboard viewer.

The panel is not about real-time data, and the user needs to understand that the data points refer to the sum of the data from the last reconciliation events on each of the units. I have added an explanation to the panel, but I think most dashboard users will simply ignore it.

@cbartz cbartz merged commit 5873a66 into main Dec 1, 2023
@cbartz cbartz deleted the ISD-1389-update-dashboard branch December 1, 2023 10:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants