RabbitMQ Check

RabbitMQ Dashboard

Overview

This check monitors RabbitMQ through the Datadog Agent. It allows you to:

  • Track queue-based stats: queue size, consumer count, unacknowledged messages, redelivered messages, etc.
  • Track node-based stats: waiting processes, used sockets, used file descriptors, etc.
  • Monitor vhosts for aliveness and number of connections

And more.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The RabbitMQ check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Edit the rabbitmq.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory to start collecting your RabbitMQ metrics and logs. See the sample rabbitmq.d/conf.yaml for all available configuration options.

Prepare RabbitMQ

Enable the RabbitMQ management plugin. See RabbitMQ's documentation to enable it.

The Agent user then needs at least the monitoring tag and these required permissions:

Permission   Pattern
conf         ^aliveness-test$
write        ^amq\.default$
read         .*

Create an Agent user for your default vhost with the following commands:

rabbitmqctl add_user datadog <SECRET>
rabbitmqctl set_permissions  -p / datadog "^aliveness-test$" "^amq\.default$" ".*"
rabbitmqctl set_user_tags datadog monitoring

Here, / refers to the default vhost. Replace it with the name of your own virtual host if you use one. See the RabbitMQ documentation for more information.
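To see what the three permission patterns allow, the following minimal sketch tests resource names against them. It is an illustration only: the is_allowed helper and the sample resource names are hypothetical, and it assumes the patterns are applied as unanchored regex searches, which is why they carry their own ^ and $ anchors.

```python
import re

# Patterns copied from the set_permissions command above.
PERMISSIONS = {
    "conf": r"^aliveness-test$",
    "write": r"^amq\.default$",
    "read": r".*",
}


def is_allowed(permission: str, resource: str) -> bool:
    """Return True if the resource name matches the pattern for this permission.

    Assumption: matching is an unanchored search, so the anchors in the
    patterns are what restrict the match to the exact name.
    """
    return re.search(PERMISSIONS[permission], resource) is not None


if __name__ == "__main__":
    # Hypothetical resource names, used only to exercise the patterns.
    for permission, resource in [
        ("conf", "aliveness-test"),
        ("conf", "my-queue"),
        ("write", "amq.default"),
        ("read", "my-queue"),
    ]:
        print(permission, resource, is_allowed(permission, resource))
```

The effect is that the datadog user can configure only the aliveness-test queue, can write only to the default exchange, and can read everything.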

Metric collection

  • Add this configuration block to your rabbitmq.d/conf.yaml file to start gathering your RabbitMQ metrics:
init_config:

instances:
  - rabbitmq_api_url: http://localhost:15672/api/
  #  username: <username> # if your RabbitMQ API requires auth; default is guest
  #  password: <password> # default is guest
  #  tag_families: true           # default is false
  #  vhosts:
  #    - <YOUR_VHOST>             # don't set if you want all vhosts

If you don't set vhosts, the Agent sends the following for EVERY vhost:

  1. rabbitmq.aliveness service check
  2. rabbitmq.connections metric

If you do set vhosts, the Agent sends this check and metric only for the vhosts you list.

There are options for queues and nodes that work similarly. The Agent checks all queues and nodes by default, but you can provide lists or regexes to limit this. See the rabbitmq.d/conf.yaml for examples.
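The vhost selection described above can be sketched as follows. This is not the Agent's implementation: vhosts_to_check and aliveness_url are hypothetical helpers, and the sketch assumes the aliveness endpoint of the management API lives at aliveness-test/<vhost> under rabbitmq_api_url, with the vhost name URL-encoded.

```python
from typing import Iterable, List, Optional
from urllib.parse import quote


def vhosts_to_check(all_vhosts: Iterable[str],
                    configured: Optional[List[str]] = None) -> List[str]:
    """Return the configured vhosts, or every vhost when none are configured."""
    if configured:
        return list(configured)
    return list(all_vhosts)


def aliveness_url(api_url: str, vhost: str) -> str:
    """Build the aliveness URL for one vhost (assumed endpoint layout).

    The vhost is fully percent-encoded, so the default vhost "/" becomes "%2F".
    """
    return f"{api_url.rstrip('/')}/aliveness-test/{quote(vhost, safe='')}"


if __name__ == "__main__":
    api_url = "http://localhost:15672/api/"
    # "staging" is a hypothetical vhost name.
    for vhost in vhosts_to_check(["/", "staging"]):
        print(aliveness_url(api_url, vhost))
```

Note the encoding of the default vhost: passing / unencoded would change the path of the request instead of naming the vhost.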

Configuration Options:

Option Required Description
rabbitmq_api_url Yes Points to the API URL of the RabbitMQ Management Plugin.
tls_verify No Set to false to skip verification of the TLS certificate chain when the rabbitmq_api_url uses HTTPS. The default is true.
username No User name, defaults to 'guest'
password No Password, defaults to 'guest'
tag_families No Tag queue "families" based off of regex matching, defaults to false
nodes or nodes_regexes No Use these parameters to specify the nodes you want to collect metrics on (up to 100). If you have fewer than 100 nodes, you don't have to set this parameter. The metrics are collected for all nodes by default.
queues or queues_regexes No Use these parameters to specify the queues you want to collect metrics on (up to 200). If you have fewer than 200 queues, you don't have to set this parameter. The metrics are collected for all queues by default. If you have set up vhosts, set the queue names as vhost_name/queue_name. If you have tag_families enabled, the first captured group in the regex is used as the queue_family tag.
exchanges or exchanges_regexes No Use these parameters to specify the exchanges you want to collect metrics on (up to 50). If you have fewer than 50 exchanges, you don't have to set this parameter. The metrics are collected for all exchanges by default.
vhosts No By default a list of all vhosts is fetched and each one is checked using the aliveness API. If you prefer only certain vhosts to be monitored, list the vhosts you care about.
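The queue_family behavior described for tag_families can be illustrated with a short sketch. The queue_family function, the sample pattern, and the queue names are all hypothetical, and the use of an unanchored regex search is an assumption rather than a statement about how the Agent matches.

```python
import re
from typing import List, Optional


def queue_family(queue_name: str, queues_regexes: List[str]) -> Optional[str]:
    """Return the first captured group of the first regex that matches.

    Returns None when no regex matches, or when the matching regex has no
    capture group to take the family name from.
    """
    for pattern in queues_regexes:
        match = re.search(pattern, queue_name)
        if match:
            return match.group(1) if match.groups() else None
    return None


if __name__ == "__main__":
    regexes = [r"(orders)-.*"]  # hypothetical pattern
    for name in ["orders-eu-1", "orders-us-2", "billing"]:
        print(name, "->", queue_family(name, regexes))
```

With a pattern like this, orders-eu-1 and orders-us-2 would share one family tag, which is the point of the option: grouping many similarly named queues under a single tag value.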

Restart the Agent to begin sending RabbitMQ metrics, events, and service checks to Datadog.

Log collection

Available for Agent >6.0

  1. To modify the default log file location either set the RABBITMQ_LOGS environment variable or add the following to your RabbitMQ configuration file (/etc/rabbitmq/rabbitmq.conf):

      log.dir = /var/log/rabbit
      log.file = rabbit.log
    
  2. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

      logs_enabled: true
  3. Add this configuration block to your rabbitmq.d/conf.yaml file to start collecting your RabbitMQ logs:

      logs:
          - type: file
            path: /var/log/rabbit/*.log
            source: rabbitmq
            service: myservice
            log_processing_rules:
              - type: multi_line
                name: logs_starts_with_equal_sign
                pattern: "="
  4. Restart the Agent.
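The multi_line rule in step 3 treats a line beginning with = as the start of a new log entry and folds the following lines into it. The sketch below shows that grouping on its own terms. It is not the Agent's code: group_multiline is a hypothetical helper, the sample lines are illustrative, and it assumes the pattern is matched at the start of each line.

```python
import re
from typing import Iterable, List


def group_multiline(lines: Iterable[str], pattern: str = "=") -> List[str]:
    """Join raw lines into entries, starting a new entry at each matching line."""
    start = re.compile(pattern)
    entries: List[str] = []
    current: List[str] = []
    for line in lines:
        # re.match anchors at the beginning of the line (assumed behavior).
        if start.match(line) and current:
            entries.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        entries.append("\n".join(current))
    return entries


if __name__ == "__main__":
    # Illustrative lines in the older "=LEVEL REPORT====" log layout.
    sample = [
        "=INFO REPORT==== 1-Jan-2020::00:00:00 ===",
        "accepting AMQP connection",
        "=ERROR REPORT==== 1-Jan-2020::00:00:05 ===",
        "closing AMQP connection",
        "missed heartbeats from client",
    ]
    for entry in group_multiline(sample):
        print(repr(entry))
```

If your RabbitMQ version writes one event per line, or starts entries with a timestamp instead, the pattern would need to change accordingly.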

Validation

Run the Agent's status subcommand and look for rabbitmq under the Checks section.

Data Collected

Metrics

See metadata.csv for a list of metrics provided by this integration.

The Agent tags rabbitmq.queue.* metrics by queue name and rabbitmq.node.* metrics by node name.

Events

For performance reasons, the RabbitMQ check limits the number of exchanges, queues, and nodes it collects metrics for. If the check nears this limit, it emits a warning-level event to your event stream.

If you require an increase in the number of exchanges, queues, or nodes, contact Datadog support.
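As a rough way to anticipate these events, you could compare your own object counts against the limits listed under Configuration Options. In the sketch below the limits come from that table, but the 80% warning ratio and the near_limit helper are hypothetical: the text above does not say how close to the limit the check must get before it emits the event.

```python
# Limits taken from the Configuration Options table.
LIMITS = {"exchanges": 50, "queues": 200, "nodes": 100}

# Hypothetical threshold: the actual point at which the check warns is not
# stated in this document.
WARN_RATIO = 0.8


def near_limit(kind: str, count: int, ratio: float = WARN_RATIO) -> bool:
    """Return True when count reaches the given fraction of the limit for kind."""
    return count >= LIMITS[kind] * ratio


if __name__ == "__main__":
    # Hypothetical counts for a single cluster.
    for kind, count in [("queues", 190), ("nodes", 10), ("exchanges", 40)]:
        print(kind, count, near_limit(kind, count))
```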

Service Checks

rabbitmq.aliveness:
The Agent submits this service check for all vhosts (if vhosts is not configured) OR a subset of vhosts (those configured in vhosts). Each service check is tagged with vhost:<vhost_name>. Returns CRITICAL if the aliveness check failed, otherwise returns OK.

rabbitmq.status:
Returns CRITICAL if the Agent cannot connect to RabbitMQ to collect metrics, otherwise returns OK.
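The mapping from an aliveness result to a service check can be sketched like this. The function names are hypothetical, and the sketch assumes the management API's aliveness endpoint answers with a JSON body of the form {"status": "ok"} on success; anything else is treated as a failure.

```python
from typing import List


def aliveness_status(payload: dict) -> str:
    """Map a parsed aliveness response to a service check status."""
    return "OK" if payload.get("status") == "ok" else "CRITICAL"


def aliveness_tags(vhost: str) -> List[str]:
    """Build the vhost:<vhost_name> tag attached to each service check."""
    return [f"vhost:{vhost}"]


if __name__ == "__main__":
    # Hypothetical responses for two vhosts.
    responses = {"/": {"status": "ok"}, "staging": {"error": "not_found"}}
    for vhost, payload in responses.items():
        print(aliveness_tags(vhost), aliveness_status(payload))
```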

Troubleshooting

Need help? Contact Datadog support.

Further Reading

Additional helpful documentation, links, and articles:

Datadog Blog

FAQ
