
Structured Metrics

deejay1 edited this page Oct 6, 2014 · 15 revisions

What?

Graph-Explorer connects to an ElasticSearch instance, which functions as a multi-dimensional database of structured (tagged) metrics.

Mandatory tags: "unit" (formerly called "what") and "target_type". There are some guidelines for naming the tag keys and values, so that the tags are somewhat standardized and easily guessable, and so that you can leverage features such as unit conversion.

Populating / maintaining the database

  • structured_metrics plugins: the update_metrics.py script. This allows you to keep using old-style metrics ("proto1"). It downloads metrics.json from graphite and uses plugins which use regular expressions to enrich them with tags. The plugins will catch all your metrics, but not in the most optimal way (the tag keys won't always be useful). Review the plugins and/or add your own for a better experience. Protip: put it in cron:
*/20 * * * * /path/to/update_metrics.py /path/to/config.cfg &>/dev/null

(Note: if you have hundreds of thousands of metrics or more, this can take a few minutes; there's some low-hanging optimization fruit there, though. See http://es_host:es_port/graphite_metrics/_count for the current count.)

  • use carbon-tagger: if you submit metrics with the new convention ("proto2") they will be automatically added to elasticsearch, in realtime, without any further work.

The second option is preferred, but for quick testing the first one is adequate too. You can also combine both approaches and point them at the same ElasticSearch database.
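To illustrate the idea behind proto2: the tags are encoded as key=value pairs in the metric name itself, so no separate enrichment step is needed. The parser below is only a hedged sketch of that idea; carbon-tagger defines the real grammar (including how untagged segments and edge cases are handled).

```python
def parse_proto2(name):
    """Split a proto2-style metric name (dot-separated key=value pairs)
    into a tag dictionary. Illustrative sketch only; carbon-tagger
    defines the real grammar."""
    tags = {}
    for i, part in enumerate(name.split('.')):
        if '=' in part:
            key, value = part.split('=', 1)
        else:
            # untagged segment: keep it under a positional key
            key, value = 'n%d' % (i + 1), part
        tags[key] = value
    return tags
```

For example, a name like "service=apache.server=web1.unit=Req/s.target_type=rate" would yield the mandatory "unit" and "target_type" tags plus the descriptive ones, without any regex plugin involved.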

Structured metrics plugins

Plugins define rules which match metric names, parse them, and yield a target with associated metadata:

  • tags from fields in the metric name (server, service, interface_name, etc) by using named groups in a regex.
  • target_type (count, rate, gauge, ...)
  • unit (MB, queries/s, ...)
  • plugin (i.e. 'cpu')

The configuration also provides settings:

  • configure: a function or list of functions to further enhance the target dynamically (given the match object and the target along with its config), in addition to the default configure function, which can also be overridden.
  • sanitize: a callback to clean up tags, e.g. to properly set the "what" and "type" tags from a "wt" tag and then delete the "wt" tag.
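As a hedged illustration of a configure function (the names here are hypothetical, not Graph-Explorer's exact API): it receives the regex match and the target, and can either mutate the target's tags in place or return a dict to be merged in.

```python
def configure_mountpoint(match, target):
    """Hypothetical configure function: normalize a tag in place.
    Per the text above, you could alternatively return a dict that
    gets merged into the target instead of mutating it directly."""
    mp = target['tags'].get('mountpoint', '')
    # e.g. turn a flattened path like "_var_log" back into "/var/log"
    target['tags']['mountpoint'] = mp.replace('_', '/')
```

A list of such functions would be run in order after the default configure step, each refining the tags further.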

Stock plugins

Graph-Explorer comes with a bunch of plugins by default: the catchall plugins, obviously, and also plugins to work with metrics from various common tools and daemons.

Writing your own plugins

Definitely read 'Enhanced Metrics' above, and preferably the entire readme. A simple plugin such as diskspace.py is a good starting point. Notice:

  • targets is a list of rules (and 'subrules' under the 'targets' key which inherit from their parent)
  • the match regex must match a metric for it to yield an enhanced metric, all named groups will become tags
  • target_type must be one of the predefined values (see above)
  • one or more configure functions can be applied, or you can override default_configure_target, which always gets called. In these functions you can return a dict which will get merged into your target (or just alter the target directly); use this to change tags, the target, etc.
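Pulling the points above together, a minimal plugin might look roughly like this. Treat it as a hedged sketch following the conventions described here (a targets list of rules, a match regex whose named groups become tags, and the mandatory target_type and unit); check diskspace.py for the real plugin API.

```python
import re

# Sketch of a rule list in the style described above: named groups in
# the match regex become tags, target_type must be one of the
# predefined values, and unit names the measured quantity.
targets = [
    {
        'match': r'^servers\.(?P<server>[^.]+)\.diskspace\.(?P<mountpoint>[^.]+)\.byte_free$',
        'target_type': 'gauge',
        'unit': 'B',
    },
]

def upgrade_metric(metric):
    """Try each rule in turn; the first matching regex wins.
    Returns an enriched target, or None if nothing matched."""
    for rule in targets:
        m = re.match(rule['match'], metric)
        if m:
            tags = dict(m.groupdict(),
                        target_type=rule['target_type'],
                        unit=rule['unit'])
            return {'id': metric, 'tags': tags}
    return None
```

So "servers.web1.diskspace.root.byte_free" would come out with server=web1, mountpoint=root, target_type=gauge and unit=B tags, while any non-matching metric falls through to the next plugin.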

backend.update_data() loads your metrics and gets matching targets by calling list_metrics(metrics) on a structured_metrics object.

  • list_metrics goes over every metric, and for each goes over every plugin (ordered by priority), and

    • calls plugin_object.upgrade_metric(metric), which goes over all target configs in the plugin and tries each out.
    • each target config can have multiple match regexes. first one wins. (it gets created, sanitized, and the configure functions are run)
    • if any of a target config's no_match regexes matches, or its limit has been reached, the metric doesn't get yielded.
    • first plugin that yields a proto2 metric wins for this metric (no other plugins are tried for that metric). that's why catchall plugins have lowest priority.
  • run ./check_update_metric.py my.example.metric to validate the behavior of your plugin. (Even better: write unit tests for it! See the other test_structured_metrics_plugin_*.py examples.)
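The matching flow above can be sketched as a priority-ordered, first-plugin-wins loop. This is illustrative only (the real logic lives in structured_metrics); the plugin objects here are stand-ins with just a priority and an upgrade_metric callable.

```python
def list_metrics(plugins, metrics):
    """For each metric, try plugins from highest to lowest priority;
    the first plugin that yields a proto2 target wins, which is why
    catchall plugins carry the lowest priority (they get tried last)."""
    upgraded = {}
    ordered = sorted(plugins, key=lambda p: p['priority'], reverse=True)
    for metric in metrics:
        for plugin in ordered:
            target = plugin['upgrade_metric'](metric)
            if target is not None:
                upgraded[metric] = target
                break  # no other plugins are tried for this metric
    return upgraded
```

With a specific cpu plugin and a catchall registered, a cpu metric is claimed by the cpu plugin and everything else falls through to the catchall.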