likwid agent

TomTheBear edited this page Jun 12, 2015 · 2 revisions

likwid-agent: System monitoring agent for LIKWID

Introduction

likwid-agent is a daemon application that uses likwid-perfctr to measure hardware performance counters and write them to various output back-ends. The basic configuration is kept in a global configuration file that must be given on the command line. The hardware event sets are configured in separate group files, one set for each architecture. Besides the hardware event configuration, the raw data can be transformed into metrics of interest using formulas. To limit the amount of output, the data can be further filtered or aggregated. likwid-agent provides multiple storage back-ends such as logfiles, RRD (Round Robin Database) or gmetric (Ganglia Monitoring System).

Notice

Currently there is no common init script for likwid-agent. You have to write one that is suitable for your OS.

Configuration file

The configuration file has the following options:

  • GROUPPATH <path>: Path to the group files containing event set and output definitions. See section Group files for information.
  • EVENTSET <group1> <group2> ...: Space separated list of groups (without .txt) that should be monitored.
  • DURATION <time>: Measurement duration in seconds for each group.
  • LOGPATH <path>: Sets the output logfile path for the measured data. Each monitoring group logs to its own file, likwid.<group>.log.
  • LOGSTYLE <update/log>: Specifies whether new data should be appended to the files (log) or the file should be emptied first (update). update is commonly used when the data is read afterwards by a monitoring tool such as Cacti or Nagios. Default is log.
  • GMETRIC <True/False>: Activates the output to gmetric.
  • GMETRICPATH <path>: Set path to the gmetric executable.
  • GMETRICCONFIG <path>: Set path to custom gmetric config file.
  • RRD <True/False>: Activates the output to RRD files (Round Robin Database).
  • RRDPATH <path>: Output path for the RRD files. The files are named according to the group, and each output metric is saved as a data source (DS) of type GAUGE. The RRD is configured with RRA entries that store the average, minimum and maximum over 10-minute intervals for one hour, over 60-minute intervals for one day, and over daily intervals for one month.
  • SYSLOG <True/False>: Activates the output to system log using logger.
  • SYSLOGPRIO <prio>: Set the priority for the system log. The default priority is 'local0.notice'.
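
To make the option list more concrete, the following minimal Python sketch reads a configuration in the "KEY value" form described above. It is not the parser used by likwid-agent. The sample paths, the group names FLOPS_DP and MEM, and the helper names parse_config and log_files are hypothetical. Only the two defaults stated in the list (LOGSTYLE and SYSLOGPRIO) are filled in.

```python
import os

# Hypothetical configuration text; all paths and values are illustrative only.
SAMPLE_CONFIG = """\
GROUPPATH /etc/likwid-agent/groups
EVENTSET FLOPS_DP MEM
DURATION 10
LOGPATH /var/log/likwid-agent
LOGSTYLE update
GMETRIC False
RRD True
RRDPATH /var/lib/likwid-agent/rrd
SYSLOG False
"""

# Defaults stated in the option list; other options have no documented default.
DEFAULTS = {"LOGSTYLE": "log", "SYSLOGPRIO": "local0.notice"}
BOOLEAN_KEYS = {"GMETRIC", "RRD", "SYSLOG"}


def parse_config(text):
    """Parse 'KEY value...' lines into a dict (a sketch, not the agent's parser)."""
    config = dict(DEFAULTS)
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip empty lines
        key, values = parts[0], parts[1:]
        if key == "EVENTSET":
            config[key] = values  # space-separated list of group names
        elif key in BOOLEAN_KEYS:
            config[key] = bool(values) and values[0].lower() == "true"
        elif key == "DURATION":
            config[key] = int(values[0])  # seconds per group
        else:
            config[key] = " ".join(values)
    return config


def log_files(config):
    """Return the per-group log file paths, named likwid.<group>.log."""
    return [os.path.join(config["LOGPATH"], "likwid.%s.log" % group)
            for group in config.get("EVENTSET", [])]


config = parse_config(SAMPLE_CONFIG)
print(config["EVENTSET"])
print(log_files(config))
```

With this sample, each group named in EVENTSET maps to its own log file below LOGPATH, while RRD output is switched on and gmetric and syslog output are switched off.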

Group files

The group files are adapted performance group files as used by likwid-perfctr. This makes it easy to use the predefined and commonly used performance groups as a basis for the monitoring. The folder structure for the groups is /<SHORT_ARCH_NAME>/, with <SHORT_ARCH_NAME> similar to the names used for the performance groups, like 'sandybridge' or 'haswellEP'. A group file has the following layout:
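
As a rough illustration of this folder structure, the Python sketch below builds the presumed location of a single group file. It assumes that the architecture folder sits directly below GROUPPATH and that each file is named <group>.txt (the EVENTSET option lists groups "without .txt"). The function name group_file_path and the example path are hypothetical.

```python
import posixpath


def group_file_path(grouppath, short_arch_name, group):
    """Build the presumed location of one group file (assumed layout)."""
    return posixpath.join(grouppath, short_arch_name, group + ".txt")


# Hypothetical values, for illustration only.
print(group_file_path("/etc/likwid-agent/groups", "haswellEP", "MEM"))
```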

SHORT <string>

EVENTSET
<counter1> <event1>
<counter2>:<option1>:<option2> <event2>

METRICS
<metricname> <formula>
<filter> <metricname> <formula>

LONG
<multi-line string>

The SHORT keyword introduces a short description of the group. The definition of the EVENTSET is similar to the performance groups. Each line contains a counter name and the desired event name. If you want to set further options, you can append them to the counter name, separated by colons. See likwid-perfctr for details.

The METRICS keyword starts the definitions of the output metrics. The syntax follows the METRICS definition of the performance groups as used by likwid-perfctr. If no filter is set at the beginning of the line, <formula> is evaluated for every CPU and sent to the output back-ends. In that case the <metricname> gets the prefix T<cpuid>. To avoid writing too much data to the back-ends, the data can be reduced by <filter>. The possible filter options are MIN, MAX, AVG, SUM and ONCE. The ONCE filter sends only the data from the first CPU to the output back-ends; it is commonly used for the measurement duration.

The last part, started with the LONG keyword, allows you to add further information about the monitoring group. There is currently no output function for this in likwid-agent.
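
The effect of the filters can be sketched in Python as follows. This is an illustration of the semantics described above, not code from likwid-agent. It assumes that the formula contains no spaces and is the last token of a METRICS line, that "first CPU" means the lowest CPU id, and that the T<cpuid> prefix is separated from the metric name by a space. The function names and the example metric ipc are hypothetical, and the formula itself is not evaluated here.

```python
# Reduction applied by each filter to the list of per-CPU values.
FILTERS = {
    "MIN": min,
    "MAX": max,
    "SUM": sum,
    "AVG": lambda values: sum(values) / len(values),
    "ONCE": lambda values: values[0],  # value of the first CPU only
}


def parse_metric_line(line):
    """Split a METRICS line into (filter or None, metric name, formula)."""
    parts = line.split()
    filter_name = parts[0] if parts[0] in FILTERS else None
    start = 1 if filter_name else 0
    # Assumption: the formula has no spaces and is the last token.
    return filter_name, " ".join(parts[start:-1]), parts[-1]


def reduce_metric(filter_name, metric_name, per_cpu_values):
    """Return the (name, value) pairs that would go to the output back-ends.

    per_cpu_values maps a CPU id to the already evaluated formula result.
    """
    cpus = sorted(per_cpu_values)
    if filter_name is None:
        # No filter: one value per CPU; the 'T<cpuid> ' spelling is an assumption.
        return [("T%d %s" % (cpu, metric_name), per_cpu_values[cpu])
                for cpu in cpus]
    values = [per_cpu_values[cpu] for cpu in cpus]
    return [(metric_name, FILTERS[filter_name](values))]


# Hypothetical metric line and per-CPU results.
filter_name, name, formula = parse_metric_line("AVG ipc FIXC0/FIXC1")
print(reduce_metric(filter_name, name, {0: 1.0, 1: 3.0}))
print(reduce_metric(None, name, {0: 1.0, 1: 3.0}))
```

With a filter, a single value per metric reaches the back-ends regardless of the number of CPUs; without one, the output grows linearly with the CPU count, which is why filters matter on larger nodes.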
