The metrics section

This is one of the features in GARM that I really love having. For one thing, it's community-contributed, and for another, it really adds value to the project. It allows us to create some pretty nice visualizations of what is happening with GARM.

Common metrics

| Metric name | Type | Labels | Description |
|---|---|---|---|
| `garm_health` | Gauge | `controller_id=<controller id>`<br>`name=<hostname>` | This is a gauge that is set to 1 if GARM is healthy and 0 if it is not. This is useful for alerting. |
| `garm_webhooks_received` | Counter | `controller_id=<controller id>`<br>`name=<hostname>` | This is a counter that increments every time GARM receives a webhook from GitHub. |
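
Since `garm_health` is meant for alerting, here is a minimal sketch of how a script could pick unhealthy controllers out of a scrape. It assumes the endpoint returns the standard Prometheus text exposition format. The sample payload, the controller IDs and the helper names (`parse_samples`, `unhealthy_controllers`) are made up for illustration and are not part of GARM.

```python
import re

# One sample line of the Prometheus text format: name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# One label pair inside the braces; allows backslash escapes in the value.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (name, labels, value) per sample line; skip comments and blanks."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = SAMPLE_RE.match(line)
        if m is None:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        yield m.group("name"), labels, float(m.group("value"))


def unhealthy_controllers(text):
    """Return the controller_id of every garm_health sample that is not 1."""
    return [
        labels.get("controller_id", "")
        for name, labels, value in parse_samples(text)
        if name == "garm_health" and value != 1
    ]


# Hypothetical scrape output, for illustration only.
SAMPLE = '''\
# TYPE garm_health gauge
garm_health{controller_id="aaaa-1111",name="garm-1"} 1
garm_health{controller_id="bbbb-2222",name="garm-2"} 0
garm_webhooks_received{controller_id="aaaa-1111",name="garm-1"} 42
'''

print(unhealthy_controllers(SAMPLE))  # prints ['bbbb-2222']
```

In a real setup you would more likely express this as a Prometheus alerting rule. The script is only meant to show what the gauge looks like on the wire.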

Enterprise metrics

| Metric name | Type | Labels | Description |
|---|---|---|---|
| `garm_enterprise_info` | Gauge | `id=<enterprise id>`<br>`name=<enterprise name>` | This is a gauge that is set to 1 and exposes enterprise information. |
| `garm_enterprise_pool_manager_status` | Gauge | `id=<enterprise id>`<br>`name=<enterprise name>`<br>`running=<true\|false>` | This is a gauge that is set to 1 if the enterprise pool manager is running and set to 0 if not. |

Organization metrics

| Metric name | Type | Labels | Description |
|---|---|---|---|
| `garm_organization_info` | Gauge | `id=<organization id>`<br>`name=<organization name>` | This is a gauge that is set to 1 and exposes organization information. |
| `garm_organization_pool_manager_status` | Gauge | `id=<organization id>`<br>`name=<organization name>`<br>`running=<true\|false>` | This is a gauge that is set to 1 if the organization pool manager is running and set to 0 if not. |

Repository metrics

| Metric name | Type | Labels | Description |
|---|---|---|---|
| `garm_repository_info` | Gauge | `id=<repository id>`<br>`name=<repository name>` | This is a gauge that is set to 1 and exposes repository information. |
| `garm_repository_pool_manager_status` | Gauge | `id=<repository id>`<br>`name=<repository name>`<br>`running=<true\|false>` | This is a gauge that is set to 1 if the repository pool manager is running and set to 0 if not. |

Provider metrics

| Metric name | Type | Labels | Description |
|---|---|---|---|
| `garm_provider_info` | Gauge | `description=<provider description>`<br>`name=<provider name>`<br>`type=<internal\|external>` | This is a gauge that is set to 1 and exposes provider information. |

Pool metrics

| Metric name | Type | Labels | Description |
|---|---|---|---|
| `garm_pool_info` | Gauge | `flavor=<flavor>`<br>`id=<pool id>`<br>`image=<image name>`<br>`os_arch=<defined OS arch>`<br>`os_type=<defined OS name>`<br>`pool_owner=<owner name>`<br>`pool_type=<repository\|organization\|enterprise>`<br>`prefix=<prefix>`<br>`provider=<provider name>`<br>`tags=<concatenated list of pool tags>` | This is a gauge that is set to 1 and exposes pool information. |
| `garm_pool_status` | Gauge | `enabled=<true\|false>`<br>`id=<pool id>` | This is a gauge that is set to 1 if the pool is enabled and set to 0 if not. |
| `garm_pool_bootstrap_timeout` | Gauge | `id=<pool id>` | This is a gauge that is set to the pool bootstrap timeout. |
| `garm_pool_max_runners` | Gauge | `id=<pool id>` | This is a gauge that is set to the pool max runners. |
| `garm_pool_min_idle_runners` | Gauge | `id=<pool id>` | This is a gauge that is set to the pool min idle runners. |

Runner metrics

| Metric name | Type | Labels | Description |
|---|---|---|---|
| `garm_runner_status` | Gauge | `controller_id=<controller id>`<br>`hostname=<hostname>`<br>`name=<runner name>`<br>`pool_owner=<owner name>`<br>`pool_type=<repository\|organization\|enterprise>`<br>`provider=<provider name>`<br>`runner_status=<running\|stopped\|error\|pending_delete\|deleting\|pending_create\|creating\|unknown>`<br>`status=<idle\|pending\|terminated\|installing\|failed\|active>` | This is a gauge value that gives us details about the runners GARM spawns. |
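
To give an idea of what you can do with the runner labels, the sketch below counts `garm_runner_status` series by their `runner_status` and `status` labels. It assumes the standard Prometheus text exposition format and counts every series regardless of the gauge value. The sample lines, the runner names, the `example-org` owner, the `lxd_local` provider name and the `runner_counts` helper are all invented for the example.

```python
import re
from collections import Counter

# Only look at garm_runner_status lines; capture everything inside the braces.
LINE_RE = re.compile(r'^garm_runner_status\{(?P<labels>.*)\}\s+(?P<value>\S+)')
# One label pair; allows backslash escapes inside the quoted value.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def runner_counts(text):
    """Count garm_runner_status series per (runner_status, status) pair."""
    counts = Counter()
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels")))
        counts[(labels.get("runner_status"), labels.get("status"))] += 1
    return counts


# Hypothetical scrape output, for illustration only.
SAMPLE = '''\
garm_runner_status{controller_id="aaaa-1111",hostname="garm-1",name="runner-a",pool_owner="example-org",pool_type="organization",provider="lxd_local",runner_status="running",status="idle"} 1
garm_runner_status{controller_id="aaaa-1111",hostname="garm-1",name="runner-b",pool_owner="example-org",pool_type="organization",provider="lxd_local",runner_status="running",status="idle"} 1
garm_runner_status{controller_id="aaaa-1111",hostname="garm-1",name="runner-c",pool_owner="example-org",pool_type="organization",provider="lxd_local",runner_status="error",status="failed"} 1
'''

for (runner_status, status), n in sorted(runner_counts(SAMPLE).items()):
    print(f"{runner_status}/{status}: {n}")
```

The same breakdown is usually easier to get with a query in Prometheus or Grafana. The point here is only to show how the two status labels combine.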

More metrics will be added in the future.

Enabling metrics

Metrics are disabled by default. To enable them, add the following to your config file:

[metrics]
# Toggle metrics. If set to false, the API endpoint for metrics collection will
# be disabled.
enable = true
# Toggle to disable authentication (not recommended) on the metrics endpoint.
# If you do disable authentication, I encourage you to put a reverse proxy in front
# of garm and limit which systems can access that particular endpoint. Ideally, you
# would enable some kind of authentication using the reverse proxy, if the built-in auth
# is not sufficient for your needs.
disable_auth = false

You can disable authentication if you wish. However, authentication is not terribly difficult to set up, so I generally advise against disabling it.

Configuring Prometheus

The following section assumes that your GARM instance is running at garm.example.com and has TLS enabled.

First, generate a new JWT token valid only for the metrics endpoint:

garm-cli metrics-token create

Note: The token validity is equal to the TTL you set in the JWT config section.

Copy the resulting token and add it to your Prometheus config file. The following is an example of how to add GARM as a target:

scrape_configs:
  - job_name: "garm"
    # Connect over https. If you don't have TLS enabled, change this to http.
    scheme: https
    static_configs:
      - targets: ["garm.example.com"]
    authorization:
      credentials: "superSecretTokenYouGeneratedEarlier"
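
If you want to check the token before pointing Prometheus at GARM, a small script can make the same request by hand. This is a sketch under a few assumptions: the scrape config above does not override `metrics_path`, so Prometheus will use its default of `/metrics`, and its `authorization` block defaults to the `Bearer` type. The function names are my own and not part of GARM or Prometheus.

```python
import urllib.request


def build_metrics_request(base_url, token):
    """Build a GET request for <base_url>/metrics carrying a Bearer token."""
    url = base_url.rstrip("/") + "/metrics"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(base_url, token, timeout=10):
    """Fetch the metrics page and return it as text.

    Raises urllib.error.HTTPError on a non-2xx response, for example when the
    token is rejected.
    """
    req = build_metrics_request(base_url, token)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example usage (needs a reachable GARM instance, so it is left commented out):
# text = fetch_metrics("https://garm.example.com", "superSecretTokenYouGeneratedEarlier")
# print(text[:500])
```

If the request fails with an authorization error, generate a fresh token and check the TTL in your JWT config section.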