Runner Counts Incorrect in metrics in 0.11.0 #4013

Open
4 tasks done
AngellusMortis opened this issue Apr 3, 2025 · 0 comments
Labels
bug Something isn't working gha-runner-scale-set Related to the gha-runner-scale-set mode needs triage Requires review from the maintainers

AngellusMortis commented Apr 3, 2025

Checks

Controller Version

0.11.0

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

1. Enable all listenerMetrics as mentioned in the 0.11.0 [release notes](https://github.com/actions/actions-runner-controller/releases/tag/gha-runner-scale-set-0.11.0) and [related issue](https://github.com/actions/actions-runner-controller/issues/3993)
2. Port-forward the metrics container port to localhost
3. Run some jobs and monitor metrics
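
Steps 2 and 3 can be scripted. The following is a minimal sketch, assuming the metrics endpoint has been port-forwarded to `http://localhost:8080/metrics`; the port and path are assumptions (use whatever your install is configured with), and the helper names `fetch_metrics` and `select_samples` are hypothetical.

```python
import urllib.request

# Gauges discussed in this issue.
RUNNER_GAUGES = ("gha_assigned_jobs", "gha_busy_runners", "gha_idle_runners")

# ASSUMPTION: local address of the port-forwarded metrics endpoint.
METRICS_URL = "http://localhost:8080/metrics"


def fetch_metrics(url=METRICS_URL, timeout=5.0):
    """Return the raw Prometheus text exposition served at `url`."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def select_samples(text, names=RUNNER_GAUGES):
    """Keep only sample lines (not # HELP / # TYPE) for the given metrics."""
    selected = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the label block or at the first space.
        metric = line.split("{", 1)[0].split(" ", 1)[0]
        if metric in names:
            selected.append(line)
    return selected
```

Calling `print("\n".join(select_samples(fetch_metrics())))` while jobs start and finish gives a compact view of the three gauges to compare against the runner list in GitHub.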

Describe the bug

The gha_assigned_jobs, gha_busy_runners, and gha_idle_runners gauges never match the actual values or the values displayed in GitHub (which are correct). This started after upgrading to 0.11.0. On 0.10.1, before the change to how metrics work, the values were correct.

With no jobs running, the gauges correctly report 0 assigned jobs, but they also report 5 busy runners and 5 idle runners. The correct counts are 0 assigned jobs, 0 busy runners, and 3 idle runners.

# HELP gha_assigned_jobs Number of jobs assigned to this scale set.
# TYPE gha_assigned_jobs gauge
gha_assigned_jobs{name="cicd-pipelines-x2bqb"} 0
# HELP gha_busy_runners Number of registered runners running a job.
# TYPE gha_busy_runners gauge
gha_busy_runners{name="cicd-pipelines-x2bqb"} 5
# HELP gha_idle_runners Number of registered runners not running a job.
# TYPE gha_idle_runners gauge
gha_idle_runners{name="cicd-pipelines-x2bqb"} 5
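
To make the discrepancy explicit, the scrape above can be compared with the counts GitHub shows. This is a small sketch rather than anything from the controller itself: `parse_gauges` and `mismatches` are hypothetical helpers, and the parser ignores labels, which is only safe here because a single scale set is being scraped.

```python
import re

# The scrape quoted in this issue.
SCRAPE = """\
# HELP gha_assigned_jobs Number of jobs assigned to this scale set.
# TYPE gha_assigned_jobs gauge
gha_assigned_jobs{name="cicd-pipelines-x2bqb"} 0
# HELP gha_busy_runners Number of registered runners running a job.
# TYPE gha_busy_runners gauge
gha_busy_runners{name="cicd-pipelines-x2bqb"} 5
# HELP gha_idle_runners Number of registered runners not running a job.
# TYPE gha_idle_runners gauge
gha_idle_runners{name="cicd-pipelines-x2bqb"} 5
"""

# Counts reported as correct in this issue (what GitHub shows).
EXPECTED = {
    "gha_assigned_jobs": 0.0,
    "gha_busy_runners": 0.0,
    "gha_idle_runners": 3.0,
}

# metric name, optional {labels}, then the sample value
SAMPLE_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)")


def parse_gauges(text):
    """Map metric name -> value, skipping comment lines and ignoring labels."""
    values = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match:
            values[match.group(1)] = float(match.group(2))
    return values


def mismatches(observed, expected):
    """Return {metric: (observed, expected)} for every value that differs."""
    return {
        name: (observed.get(name), want)
        for name, want in expected.items()
        if observed.get(name) != want
    }


print(mismatches(parse_gauges(SCRAPE), EXPECTED))
# → {'gha_busy_runners': (5.0, 0.0), 'gha_idle_runners': (5.0, 3.0)}
```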

Describe the expected behavior

The gha_assigned_jobs, gha_busy_runners, and gha_idle_runners gauges should match the numbers visible in GitHub.

Additional Context

The listenerMetrics values used for the runner scale set are shown below. They are just the defaults, uncommented.


listenerMetrics:
  counters:
    gha_started_jobs_total:
      labels:
        ["repository", "job_name", "event_name"]
    gha_completed_jobs_total:
      labels:
        [
          "repository",
          "job_name",
          "event_name",
          "job_result",
        ]
  gauges:
    gha_assigned_jobs:
      labels: ["name"]
    gha_running_jobs:
      labels: ["name"]
    gha_registered_runners:
      labels: ["name"]
    gha_busy_runners:
      labels: ["name"]
    gha_min_runners:
      labels: ["name"]
    gha_max_runners:
      labels: ["name"]
    gha_desired_runners:
      labels: ["name"]
    gha_idle_runners:
      labels: ["name"]
  histograms:
    gha_job_startup_duration_seconds:
      labels:
        ["repository", "organization", "enterprise", "job_name", "event_name"]
      buckets:
        [
          0.01,
          0.05,
          0.1,
          0.5,
          1.0,
          2.0,
          3.0,
          4.0,
          5.0,
          6.0,
          7.0,
          8.0,
          9.0,
          10.0,
          12.0,
          15.0,
          18.0,
          20.0,
          25.0,
          30.0,
          40.0,
          50.0,
          60.0,
          70.0,
          80.0,
          90.0,
          100.0,
          110.0,
          120.0,
          150.0,
          180.0,
          210.0,
          240.0,
          300.0,
          360.0,
          420.0,
          480.0,
          540.0,
          600.0,
          900.0,
          1200.0,
          1800.0,
          2400.0,
          3000.0,
          3600.0,
        ]
    gha_job_execution_duration_seconds:
      labels:
        [
          "repository",
          "organization",
          "enterprise",
          "job_name",
          "event_name",
          "job_result",
        ]
      buckets:
        [
          0.01,
          0.05,
          0.1,
          0.5,
          1.0,
          2.0,
          3.0,
          4.0,
          5.0,
          6.0,
          7.0,
          8.0,
          9.0,
          10.0,
          12.0,
          15.0,
          18.0,
          20.0,
          25.0,
          30.0,
          40.0,
          50.0,
          60.0,
          70.0,
          80.0,
          90.0,
          100.0,
          110.0,
          120.0,
          150.0,
          180.0,
          210.0,
          240.0,
          300.0,
          360.0,
          420.0,
          480.0,
          540.0,
          600.0,
          900.0,
          1200.0,
          1800.0,
          2400.0,
          3000.0,
          3600.0,
        ]

Controller Logs

Controller logs: https://gist.github.com/AngellusMortis/db74503c1c9717ad4d3dc9975b6b73ab

Runner Pod Logs

Listener logs: https://gist.github.com/AngellusMortis/66227f21bad83088e3edc9b307a8642e

AngellusMortis added the bug, gha-runner-scale-set, and needs triage labels on Apr 3, 2025