
Jenkins

This directory consolidates all the metadata associated with the Jenkins plugin for collectd. The relevant code for the plugin can be found here.

DESCRIPTION

This is the SignalFx Jenkins plugin. Follow these instructions to install the Jenkins plugin for collectd.

The collectd-jenkins plugin collects metrics from Jenkins instances by querying these endpoints: ../api/json (job metrics) and metrics/<MetricsKey>/.. (default and optional Codahale/Dropwizard JVM metrics).
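
As a rough illustration of the two endpoints, the sketch below builds the URLs that would be queried for one instance. The helper name `build_endpoints`, the `http` scheme default, and the use of the full `/metrics/<MetricsKey>/metrics` path (as written in the Configuration section) are assumptions for illustration, not the plugin's actual code.

```python
def build_endpoints(host, port, metrics_key, scheme="http"):
    """Return the job and Dropwizard metrics URLs for one Jenkins instance.

    Hypothetical helper: the plugin's real URL handling may differ.
    """
    base = "{0}://{1}:{2}".format(scheme, host, port)
    return {
        # Job metrics come from the Jenkins JSON API.
        "jobs": base + "/api/json",
        # JVM metrics come from the Metrics Plugin, guarded by an access key.
        "metrics": "{0}/metrics/{1}/metrics".format(base, metrics_key),
    }


if __name__ == "__main__":
    for name, url in sorted(build_endpoints("localhost", 8080, "MY_KEY").items()):
        print(name, url)
```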

FEATURES

Built-in dashboards

  • Jenkins: Provides a high-level overview of metrics for a Jenkins cluster.

  • Jenkins Master: Provides metrics from Jenkins instance(s) on a particular host.

REQUIREMENTS AND DEPENDENCIES

Version information

| Software | Version |
|----------|---------|
| collectd | 4.9 or later |
| python | 2.6 or later |
| Jenkins | 1.580.3 or later |
| Python plugin for collectd | (included with SignalFx collectd agent) |

INSTALLATION

  1. Download collectd-jenkins. Place the jenkins.py file in /usr/share/collectd/collectd-jenkins.

  2. Copy the sample configuration file for this plugin to /etc/collectd/managed_config.

  3. Modify the sample configuration file as described in Configuration, below.

  4. Install the Metrics Plugin in Jenkins: go to Manage Jenkins > Manage Plugins > Available and search for "Metrics Plugin".

  5. Install the Python requirements with sudo pip install -r requirements.txt.

  6. Restart collectd.

CONFIGURATION

Using the example configuration file 10-jenkins.conf as a guide, provide values for the configuration options listed below that make sense for your environment and allow you to connect to the Jenkins instances.

Metrics from the /metrics/<MetricsKey>/metrics endpoint can be enabled through the configuration file. Note that SignalFx does not support the histogram, meter, and timer metric types, because they are too verbose in Jenkins, nor values of type string or list; such metrics are skipped even if they are named in the configuration.
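
The skipping rule described above could look roughly like the following sketch. It assumes the endpoint returns the usual Dropwizard JSON layout (top-level `gauges`, `counters`, `histograms`, `meters`, and `timers` sections, with gauges carrying `value` and counters carrying `count`). The function name `supported_metrics` and the `example.*` metric names are hypothetical, and the plugin's real logic may differ.

```python
# Sections that are skipped entirely (assumed Dropwizard section names).
UNSUPPORTED_SECTIONS = ("histograms", "meters", "timers")


def supported_metrics(payload):
    """Flatten a Dropwizard-style payload into {name: value}.

    Skips histogram, meter, and timer sections, plus any metric whose
    value is a string or a list.
    """
    result = {}
    for section, metrics in payload.items():
        if section in UNSUPPORTED_SECTIONS or not isinstance(metrics, dict):
            continue
        for name, body in metrics.items():
            if not isinstance(body, dict):
                continue
            # Assumption: gauges expose "value", counters expose "count".
            value = body.get("value", body.get("count"))
            if value is None or isinstance(value, (str, list)):
                continue
            result[name] = value
    return result


sample = {
    "version": "3.0.0",
    "gauges": {
        "vm.daemon.count": {"value": 12},
        "example.text": {"value": "jvm"},   # string value: skipped
        "example.list": {"value": [1, 2]},  # list value: skipped
    },
    "counters": {"example.counter": {"count": 0}},
    "timers": {"example.timer": {"count": 5}},  # timer section: skipped
}
print(supported_metrics(sample))
```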

| Configuration option | Definition | Example value |
|----------------------|------------|---------------|
| ModulePath | Path on disk where collectd can find this module | "/usr/share/collectd/collectd-jenkins/" |
| Host | Host name of the Jenkins instance | "localhost" |
| Port | Port at which the instance can be reached | "8080" |
| MetricsKey | Access key required to fetch Codahale metrics | "6ZHwGBkGR91dxbFenpfz_g2h0-ocmK-CvdHLdmg" |
| Username | User with security access, if configured | "admin" |
| APIToken | API token of the user | "f04fff7c860d884f2ef00a2b2d481c2f" |
| EnhancedMetrics | Boolean indicating whether advanced stats from /metrics/<MetricsKey>/metrics are needed | "false" |
| IncludeMetric | Metric name from the /metrics/<MetricsKey>/metrics endpoint to include (valid when EnhancedMetrics is "false") | "vm.daemon.count" |
| ExcludeMetric | Metric name from the /metrics/<MetricsKey>/metrics endpoint to exclude (valid when EnhancedMetrics is "true") | "vm.terminated.count" |
| Dimension | Space-separated key-value pair for a user-defined dimension | dimension_name dimension_value |
| Interval | Number of seconds between calls to the Jenkins API | 10 |
| ssl_keyfile | Path to the keyfile | "path/to/file" |
| ssl_certificate | Path to the certificate | "path/to/file" |
| ssl_ca_certs | Path to the CA file | "path/to/file" |
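
One way to read the interaction between EnhancedMetrics, IncludeMetric, and ExcludeMetric is sketched below: with EnhancedMetrics enabled, every optional metric is reported except the excluded ones; with it disabled, only the explicitly included ones are reported. The function `select_metrics` and the `vm.count` sample name are illustrative assumptions; the plugin's actual selection code is not shown in this document.

```python
def select_metrics(available, enhanced, include=(), exclude=()):
    """Pick which optional metric names to report.

    enhanced=True  -> everything in `available` except names in `exclude`
    enhanced=False -> only names in `available` that appear in `include`
    """
    if enhanced:
        excluded = set(exclude)
        return [name for name in available if name not in excluded]
    included = set(include)
    return [name for name in available if name in included]


available = ["vm.daemon.count", "vm.terminated.count", "vm.count"]

# EnhancedMetrics "false" with one IncludeMetric entry.
print(select_metrics(available, False, include=["vm.daemon.count"]))

# EnhancedMetrics "true" with one ExcludeMetric entry.
print(select_metrics(available, True, exclude=["vm.terminated.count"]))
```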

Example configuration:

LoadPlugin python
<Plugin python>
    ModulePath "/usr/share/collectd/collectd-jenkins"
    Import jenkins
    <Module jenkins>
        Host "127.0.0.1"
        Port "8080"
        Username "admin"
        APIToken "f04fff7c860d884f2ef00a2b2d481c2f"
        MetricsKey "6ZHwGBkGR91dxbFenpfz_g2h0-ocmK-CvdHLdmg"
        Interval 60
        ssl_keyfile "/etc/cert/jenkins.key"
        ssl_certificate "/etc/cert/jenkins.crt"
        ssl_ca_certs "/etc/cert/ca.crt"
    </Module>
</Plugin>
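
To show how the Host, Port, Username, and APIToken values might be used together, the sketch below builds (but does not send) an authenticated request for the job endpoint. Jenkins accepts a user name and API token as HTTP Basic credentials; everything else here, including the function name `build_request`, the `http` default scheme, and the use of Python 3's `urllib.request`, is an assumption and not necessarily how the plugin does it.

```python
import base64
from urllib.request import Request


def build_request(host, port, username=None, api_token=None, scheme="http"):
    """Build a request for <scheme>://<host>:<port>/api/json.

    Adds an HTTP Basic Authorization header when both a user name and
    an API token are supplied. Hypothetical helper for illustration.
    """
    request = Request("{0}://{1}:{2}/api/json".format(scheme, host, port))
    if username and api_token:
        credentials = "{0}:{1}".format(username, api_token).encode("utf-8")
        encoded = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", "Basic " + encoded)
    return request


if __name__ == "__main__":
    req = build_request("127.0.0.1", "8080", "admin", "my-api-token")
    print(req.full_url)
    print(req.has_header("Authorization"))
```

Sending the request (for example with `urllib.request.urlopen`) is left out so the sketch runs without a Jenkins instance.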

The plugin can be configured to collect metrics from multiple instances in the following manner.

LoadPlugin python
<Plugin python>
    ModulePath "/usr/share/collectd/collectd-jenkins"
    Import jenkins
    <Module jenkins>
        Host "127.0.0.1"
        Port "8080"
        Username "admin"
        APIToken "f04fff7c860d884f2ef00a2b2d481c2f"
        MetricsKey "6ZHwGBkGR91dxbFenpfz_g2h0-ocmK-CvdHLdmg"
        Interval 10
    </Module>
    <Module jenkins>
        Host "127.0.0.1"
        Port "8010"
        Username "admin"
        APIToken "f04bbb7c860d8b4f1ef00a2b2d481c2f"
        MetricsKey "6Z76HwGBHOj4uBOlsxbFenpfz_g2UAh0-ocmK-CvdHLSRdmg"
        EnhancedMetrics False
        IncludeMetric "vm.daemon.count"
        IncludeMetric "vm.terminated.count"
    </Module>
    <Module jenkins>
        Host "127.0.0.1"
        Port "8000"
        MetricsKey "6Z95HwOj4uBOakGR91dxbFenpfz_g2wBlUAh0-ocmK-CvdSvE1LGRdmg"
        EnhancedMetrics True
        ExcludeMetric "vm.terminated.count"
        ExcludeMetric "vm.daemon.count"
        Dimension foo bar
    </Module>
</Plugin>

USAGE

Interpreting Built-in dashboards

  • Jenkins:

    • Alive Status: Shows the number of Jenkins Masters that are alive.

    • Health Score: Shows the mean health score of each Jenkins instance on all hosts.

    • Job Failure Rate: Shows the rate of failed jobs over the past day.

    • Executor Usage: Shows the usage pattern of the executors. Gives an overview of the load on the Jenkins instances.

    • Top 5 Failed Jobs: Shows the top 5 failed jobs over the past day based on the total failure count.

    • Busy Executors vs Pending Jobs: A line graph comparing in-use executors with pending jobs in the queue. Comparing this chart with the two above helps narrow down the reason for job failures more quickly.

    • Average Duration - Past Day: Shows the average duration of the 5 jobs that take the most time.

    • Slave Status: Shows the number of slave agents that are alive.

    • VM Memory Utilization: Area graph of the memory used by each Jenkins JVM.

    • Heap Usage: Line graph of the utilization percentage of Heap memory by each Jenkins instance.

    • Non-Heap Used: Line graph of the non-heap memory used by each Jenkins instance.

  • Jenkins Master:

    • Top 5 Failed Jobs: Shows the top 5 failed jobs of the instance(s) over the past day, based on the total failure count.

    • Health Checks: The status of each health check as reported by Dropwizard Metrics. This gives a quick overview of what's wrong with the instance.

    • Slave Status: Shows the number of slave agents of the instance(s) that are alive.

    • Busy Executors vs Pending Jobs: A line chart comparing in-use executors with pending jobs in the queue for the instance(s). Comparing this chart with the two above helps narrow down the reason for job failures more quickly.

    • VM Memory Utilization: Area chart of the memory used by the Jenkins JVM instance(s) on a host.

By default, the Dropwizard metrics reported by the Jenkins collectd plugin do not contain any dimensions, whereas the job metrics contain the following dimensions:

  • Job, name of the job
  • Result, the status of the job

A few other details:

  • plugin is always set to jenkins
  • plugin_instance will contain the IP address and the port of the instance given in the configuration
  • To add metrics from the /metrics/<MetricsKey>/metrics endpoint, use the options described in Configuration. If metrics are included individually, make sure the names are valid, for example vm.daemon.count or vm.terminated.count
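
The dimension and identifier rules above can be pictured with the following sketch, which assembles the metadata for one job metric. The function name `job_metadata`, the dictionary layout, and the `host:port` format for plugin_instance are assumptions made for illustration; the document only states that plugin_instance contains the IP address and port, not how they are formatted.

```python
def job_metadata(host, port, job_name, result, extra_dimensions=None):
    """Assemble illustrative metadata for a job metric.

    Job and Result are the default dimensions; extra_dimensions stands in
    for user-defined Dimension entries from the configuration.
    """
    dimensions = {"Job": job_name, "Result": result}
    dimensions.update(extra_dimensions or {})
    return {
        "plugin": "jenkins",  # always "jenkins"
        # Assumed formatting: the real plugin may encode this differently.
        "plugin_instance": "{0}:{1}".format(host, port),
        "dimensions": dimensions,
    }


if __name__ == "__main__":
    print(job_metadata("127.0.0.1", "8000", "nightly-build", "SUCCESS", {"foo": "bar"}))
```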

METRICS

By default, metrics about jobs and instances are provided. Click here for details. Metrics from the /metrics/<MetricsKey>/metrics endpoint can be enabled through the configuration file. Note that SignalFx does not support the histogram, meter, and timer metric types, because they are too verbose in Jenkins, nor values of type string or list; such metrics are skipped even if they are named in the configuration. See Usage for details.

Metric naming

Default metric names reported by the plugin follow the format <metric type>.jenkins.node.<name of metric>. Optional metrics keep the names they have on the /metrics/<MetricsKey>/metrics endpoint.
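
As a small sketch of the naming rule, the helper below prefixes default metrics and leaves optional metrics unchanged. The function name `reported_name` and the sample values `gauge` and `executor.count` are hypothetical and chosen only to show the format; they are not taken from the plugin's metric list.

```python
def reported_name(name, metric_type=None, optional=False):
    """Return the reported metric name.

    Default metrics: <metric type>.jenkins.node.<name of metric>
    Optional metrics: the name as given by the metrics endpoint.
    """
    if optional:
        return name
    if not metric_type:
        raise ValueError("default metrics need a metric type")
    return "{0}.jenkins.node.{1}".format(metric_type, name)


if __name__ == "__main__":
    print(reported_name("executor.count", metric_type="gauge"))
    print(reported_name("vm.daemon.count", optional=True))
```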

LICENSE

This integration is released under the Apache 2.0 license. See LICENSE for more details.