# MQ Exporter for Prometheus monitoring
This directory contains the code for a monitoring solution that exports queue manager data to a Prometheus data collection system. It also contains configuration files to run the monitor program.
The monitor collects metrics published by an MQ V9 queue manager or the MQ Appliance. Prometheus then calls the monitor program at regular intervals to pull those metrics into its database, where they can then be queried directly or used by other packages such as Grafana.
You can see data such as disk or CPU usage, queue depths, and MQI call counts. Channel status is also reported.
Example Grafana dashboards are included, to show how queries might be constructed. To use the dashboard, create a data source in Grafana called "MQ Prometheus" that points at your database server, and then import the JSON file. This dashboard was built using Grafana v5.3.1.
There is also a script to start the collector so that it processes the statistics generated by the MQ Bridge for Salesforce, included from MQ V9.0.2.
## Building

- You need to have the MQ client libraries installed first.
- Set up an environment for compiling Go programs:

  ```shell
  export GOPATH=~/go              # or wherever you want to put it
  export GOROOT=/usr/lib/golang   # or wherever you have installed it
  mkdir -p $GOPATH/src
  cd $GOPATH/src
  ```
- Clone this GitHub repository for the monitoring programs into your GOPATH. The repository contains the prerequisite packages, at a suitable version, in its vendor tree.

  ```shell
  git clone https://github.com/ibm-messaging/mq-metric-samples ibm-messaging/mq-metric-samples
  ```
- From the root of your GOPATH you can then compile the code:

  ```shell
  cd $GOPATH
  export CGO_LDFLAGS_ALLOW='-Wl,-rpath.*'
  go build -o bin/mq_prometheus src/ibm-messaging/mq-metric-samples/cmd/mq_prometheus/*.go
  ```
It is convenient to run the monitor program as a queue manager service whenever possible. This directory contains an MQSC script to define the service. The service definition points at a simple script which sets up any necessary environment and builds the command line parameters for the real monitor program. Because the last line of that script is "exec", the process id of the script is inherited by the monitor program; the queue manager can then check on its status, and can drive a suitable STOP SERVICE operation during queue manager shutdown.
Edit the MQSC script and the shell script to point at appropriate directories where the program exists, and where you want to put stdout/stderr. Ensure that the ID running the queue manager has permission to access the programs and output files.
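As a sketch, the service definition might look like the following MQSC. The service name, script path and output locations here are illustrative assumptions to adjust for your own system, not the values shipped in this directory:

```
* Illustrative service definition; adjust names and paths to your system.
DEFINE SERVICE(MQPROMETHEUS) +
       CONTROL(QMGR) +
       SERVTYPE(SERVER) +
       STARTCMD('/usr/local/bin/mq_prometheus.sh') +
       STARTARG(+QMNAME+) +
       STOPCMD('/usr/bin/kill') +
       STOPARG(+MQ_SERVER_PID+) +
       STDOUT('/var/mqm/errors/mq_prometheus.out') +
       STDERR('/var/mqm/errors/mq_prometheus.err')
```

CONTROL(QMGR) starts and stops the service with the queue manager, and the +MQ_SERVER_PID+ insertion is what lets STOP SERVICE signal the monitor process directly, since the "exec" in the wrapper script means that pid is the monitor itself.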
If you cannot run the monitor as a service, for example when trying to monitor the MQ Appliance which does not support service definitions, or when monitoring another platform without Go support, then you can run it as an MQ client connecting remotely. Setting the `-ibmmq.client` flag to `true` forces client connections. Then all the usual MQ client configuration comes into play (the MQSERVER environment variable, use of CCDT files, etc).
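For example, a minimal client setup might use the MQSERVER environment variable; the channel, hostname and port below are placeholders for your own values:

```shell
# Point the MQ client at a remote queue manager's listener.
# SYSTEM.DEF.SVRCONN, qmhost.example.com and 1414 are example values.
export MQSERVER='SYSTEM.DEF.SVRCONN/TCP/qmhost.example.com(1414)'
echo "MQSERVER is set to: $MQSERVER"
```

For anything beyond a single channel/host pair, a CCDT gives more control (TLS, multiple connection names, heartbeats).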
The monitor listens for calls from Prometheus on a TCP port. The default port, reserved for MQ's use in the Prometheus list of default port allocations, is 9157. If you want to use a different number, then use the `-ibmmq.httpListenPort` flag.
The monitor always collects all of the available queue manager-wide metrics.
It can also be configured to collect statistics for specific sets of queues.
The sets of queues can be given either directly on the command line with the `-ibmmq.monitoredQueues` flag, or put into a separate file which is also named on the command line, with the `-ibmmq.monitoredQueuesFile` flag. An example is included in the startup shell script.
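As a sketch of the file-based form, assuming the file holds one pattern per line, such a file could be built like this (the path and queue patterns are made-up examples):

```shell
# Create an example patterns file that could then be passed to the
# collector via -ibmmq.monitoredQueuesFile (path and patterns illustrative).
cat > /tmp/monitored_queues.txt <<'EOF'
APP.*
DEV.QUEUE.1
EOF
cat /tmp/monitored_queues.txt
```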
Note that the queue patterns are expanded only at startup of the monitor program. If you want to change the patterns, or if new queues are defined that match an existing pattern, the monitor must be restarted with a STOP SERVICE and START SERVICE pair of commands.
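In runmqsc, that restart might look like the following, where the service name is an illustrative assumption matching whatever your MQSC script defines:

```
STOP SERVICE(MQPROMETHEUS)
START SERVICE(MQPROMETHEUS)
```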
The monitor program can now process channel status, reporting that back into Prometheus.
The channels to be monitored are set on the command line, similarly to the queue patterns, with the `-ibmmq.monitoredChannels` or `-ibmmq.monitoredChannelsFile` flags.
Unlike the queue monitoring, wildcards are handled automatically by the channel
status API. So you do not need to restart this monitor in order to pick up newly-defined
channels that match an existing pattern.
Another command line parameter is `pollInterval`, which determines how frequently the channel status is collected. You may want to collect it at a different rate from the queue data, as extracting the channel status may be more expensive. The default `pollInterval` is 0, which means that the channel status is collected every time Prometheus asks for the queue and queue manager statistics. Setting it to `1m` means that a minimum of one minute will elapse between requests for channel status, even if the queue statistics are gathered more frequently.
A short-lived channel that connects and then disconnects in between collection intervals will leave no trace in the status or metrics.
A few of the responses from the DISPLAY CHSTATUS command have been selected as metrics. The key values returned are the status and number of messages processed.
The message count for SVRCONN channels is the number of MQI calls made by the client program.
There are actually two versions of the channel status returned. The `channel_status` metric has the value corresponding to one of the MQCHS_* values; there are about 15 of these possible values. There is also a `channel_status_squash` metric which returns one of only three values, compressing the full set into a simpler value that is easier to assign colours to in Grafana. From this squashed set, you can readily see whether a channel is stopped, running, or somewhere in between.
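As a sketch, a Grafana panel might query the squashed metric with PromQL like this; the `ibmmq_` prefix assumes the default metric namespace, and the channel name is an example:

```
ibmmq_channel_status_squash{channel="TO.PARTNER.QM"}
```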
## Channel Instances and Labels
Channel metrics are given labels to assist in distinguishing them. These can be displayed in Grafana or used as part of the filtering. When there is more than one instance of an active channel, the combination of channel name, connection name and job name will be unique (though see the z/OS section below for caveats on that platform).
The channel type (SENDER, SVRCONN etc) and the name of the remote queue manager are also given as labels on the metric.
## Channel Dashboard Panels
The example Grafana dashboard shows how these labels and metrics can be combined to show some channel status. The Channel Status table panel demonstrates a couple of features. It uses the labels to select unique instances of channels. It also uses a simple number-to-text map to show the channel status as a word (and colour the cell) instead of a raw number.
The metrics for the table are selected and have '0' added to them. This may be a workaround for a Grafana bug, or it may really be how Grafana is designed to work; without that '+0' on the metric line, the table showed multiple versions of the status for each channel. With it, the table combines multiple metrics on the same line.
This monitor can be configured to authenticate to the queue manager, sending a userid and password. The userid is configured using the `-ibmmq.userid` flag. The password can be set either by using the `-ibmmq.password` flag, or by passing it via stdin, which allows it to be piped from an external stash file or some other mechanism. Command line flags for controlling passwords are not recommended!
The Prometheus server has to know how to contact the MQ monitor. The simplest way is just to add a reference to the monitor in the server's configuration file, for example by adding this block to /etc/prometheus/prometheus.yml:
```yaml
# Adding a reference to an MQ monitor. All we have to do is
# name the host and port on which the monitor is listening.
# Port 9157 is the reserved default port for this monitor.
- job_name: 'ibmmq'
  scrape_interval: 15s
  static_configs:
    - targets: ['hostname.example.com:9157']
```
The server documentation has information on more complex options, including the ability to pull information on which hosts should be monitored from a variety of discovery tools.
Once the monitor program has been started, and Prometheus has been refreshed to connect to it, you will see the metrics become available in the Prometheus console. All of the metrics are given a prefix, which by default is `ibmmq`. The prefix can be configured with a flag on the command line.
The queue and queue manager metrics shown in the Prometheus console are named after the descriptions that you can see when running the amqsrua sample program, but with some minor modifications to match the required style.
The channel metrics all begin with `ibmmq_channel_`.
Because the DIS QSTATUS and DIS CHSTATUS commands can be used on z/OS, the Prometheus monitor can now show some limited information from a z/OS queue manager. Nothing special is needed to configure it, beyond the client connectivity that allows an application to connect to the z/OS system. The `-ibmmq.qStatus` parameter must be set to `true` to use the DIS QSTATUS command.
## Channel Metrics on z/OS
On z/OS, there is no guaranteed way to distinguish between multiple instances of the same channel name, for example multiple users of the same SVRCONN definition. On Distributed platforms, the JOBNAME attribute does that job; for z/OS, the channel start date/time is used in this package as a discriminator, and appears as the `jobname` label in the metrics.
That may cause the stats to be slightly wrong if two instances of the same channel
are started simultaneously from the same remote address. The sample dashboard showing z/OS
status includes counts of the unique channels seen over the monitoring period.