graphite-beacon

Simple alerting system for Graphite metrics.

Features:

  • Simple installation
  • No external software dependencies (databases, AMQP, etc.)
  • Light and fully asynchronous
  • SMTP, HipChat, Slack, PagerDuty, HTTP handlers (PRs for additional handlers are welcome!)
  • Easily configurable and supports historical values

Example:

{
"graphite_url": "http://g.server.org",
"smtp": {
    "from": "beacon@server.org",
    "to": ["me@gmail.com"]
},
"alerts": [
    {   "name": "MEM",
        "format": "bytes",
        "query": "aliasByNode(sumSeriesWithWildcards(collectd.*.memory.{memory-free,memory-cached}, 3), 1)",
        "rules": ["critical: < 200MB", "warning: < 400MB", "warning: < historical / 2"] },
    {   "name": "CPU",
        "format": "percent",
        "query": "aliasByNode(sumSeriesWithWildcards(collectd.*.cpu-*.cpu-user, 2), 1)",
        "rules": ["critical: >= 80%", "warning: >= 70%"] }
]}

Requirements

  • python (2.7, 3.3, 3.4)
  • tornado
  • funcparserlib
  • pyyaml

Installation

Python package

graphite-beacon can be installed using pip:

pip install graphite-beacon

Debian package

From the command line, add the package repository to your /etc/apt/sources.list file:

echo "deb http://dl.bintray.com/klen/deb /" | sudo tee -a /etc/apt/sources.list
echo "deb-src http://dl.bintray.com/klen/deb /" | sudo tee -a /etc/apt/sources.list

Install the package using apt-get:

apt-get update
apt-get install graphite-beacon

Ansible role

An Ansible role is available to install the package: https://github.com/Stouts/Stouts.graphite-beacon

Docker

Create a config.json file and run:

docker run -v /path/to/config.json:/srv/alerting/etc/config.json deliverous/graphite-beacon

Usage

Just run graphite-beacon:

$ graphite-beacon
[I 141025 11:16:23 core:141] Read configuration
[I 141025 11:16:23 core:55] Memory (10minute): init
[I 141025 11:16:23 core:166] Loaded with options:
...

Configuration


Time units:

'2second', '3.5minute', '4hour', '5.2day', '6week', '7month', '8year'

Short formats are also accepted: '2s', '3m', '4.1h' ...
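To make the time unit strings concrete, here is a minimal sketch of how such values could be converted to seconds. The `to_seconds` helper and its unit table are hypothetical and are not taken from graphite-beacon's source. Month and year are left out on purpose, since the length the project assigns to them is not stated here.

```python
import re

# Seconds per unit: long names plus the short forms shown above.
# This table is an assumption made for illustration only.
SECONDS = {
    "second": 1, "s": 1,
    "minute": 60, "m": 60,
    "hour": 3600, "h": 3600,
    "day": 86400, "d": 86400,
    "week": 604800,
}

# A number (optionally with a fractional part) followed by a unit name.
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*$")


def to_seconds(text):
    """Convert a string such as '3.5minute' or '4.1h' to seconds (hypothetical helper)."""
    match = _PATTERN.match(text.lower())
    if not match:
        raise ValueError("not a time value: %r" % text)
    number, unit = match.groups()
    if unit not in SECONDS:
        raise ValueError("unknown time unit: %r" % unit)
    return float(number) * SECONDS[unit]
```

For example, `to_seconds("10minute")` gives the default query interval in seconds.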

Value units:

short: '2K', '3Mil', '4Bil', '5Tri'

bytes: '2KB', '3MB', '4GB'

bits: '2Kb', '3Mb', '4Gb'

bps: '2Kbps', '3Mbps', '4Gbps'

time: '2s', '3m', '4h', '5d'
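The value units can be read the same way. The sketch below handles the "short" and "bytes" families only. The multipliers are assumptions (decimal steps for short values, 1024-based steps for bytes), and `to_number` is a hypothetical helper rather than part of graphite-beacon, so check the project source before relying on exact thresholds.

```python
import re

# Assumed multipliers, for illustration only:
# decimal steps for "short" values, binary (1024-based) steps for bytes.
MULTIPLIERS = {
    "K": 1e3, "Mil": 1e6, "Bil": 1e9, "Tri": 1e12,
    "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3,
}

# A number followed by an optional unit suffix.
_VALUE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*$")


def to_number(text):
    """Convert a string such as '200MB' or '3Mil' to a plain number (hypothetical helper)."""
    match = _VALUE.match(text)
    if not match:
        raise ValueError("not a value: %r" % text)
    number, unit = match.groups()
    if not unit:
        return float(number)
    if unit not in MULTIPLIERS:
        raise ValueError("unknown unit: %r" % unit)
    return float(number) * MULTIPLIERS[unit]
```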

The default options are:

Note: comments are not allowed in standard JSON, but graphite-beacon strips them.

    {
        // Graphite server URL
        "graphite_url": "http://localhost",

        // Public graphite server URL
        // Used when notifying handlers, defaults to graphite_url
        "public_graphite_url": null,

        // HTTP AUTH username
        "auth_username": null,

        // HTTP AUTH password
        "auth_password": null,

        // Path to a pidfile
        "pidfile": null,

        // Default values format (none, bytes, s, ms, short)
        // Can be redefined for each alert.
        "format": "short",

        // Default query interval
        // Can be redefined for each alert.
        "interval": "10minute",

        // Default time window for Graphite queries
        // Defaults to query interval, can be redefined for each alert.
        "time_window": "10minute",

        // Notification repeat interval
        // If an alert stays in a failed state, its notification is repeated at the interval below
        "repeat_interval": "2hour",

        // Default end time for Graphite queries
        // Defaults to the current time, can be redefined for each alert.
        "until": "0second",

        // Default loglevel
        "logging": "info",

        // Default method (average, last_value, sum, minimum, maximum).
        // Can be redefined for each alert.
        "method": "average",

        // Default alert to send when no data received (normal = no alert)
        // Can be redefined for each alert
        "no_data": "critical",

        // Default alert to send when loading failed (timeout, server error, etc)
        // (normal = no alert)
        // Can be redefined for each alert
        "loading_error": "critical",

        // Default prefix (used for notifications)
        "prefix": "[BEACON]",

        // Default handlers (log, smtp, hipchat, http, slack, pagerduty)
        "critical_handlers": ["log", "smtp"],
        "warning_handlers": ["log", "smtp"],
        "normal_handlers": ["log", "smtp"],

        // Send initial values (Send current values when reactor starts)
        "send_initial": true,

        // Used together to control how missing (NaN) values are handled
        "default_nan_value": -1,
        "ignore_nan": false,

        // Default alerts (see configuration below)
        "alerts": [],

        // Path to other configuration files to include
        "include": []
    }

You can set options in a configuration file. See the examples directory for JSON and YAML samples.

A config.json file in the same directory that you run graphite-beacon from will be used automatically.
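Because the configuration shown above contains `//` comments, a standard JSON parser would reject it as written. The following is a minimal sketch of one way to strip such comments before parsing. It is not graphite-beacon's own loader: `strip_comments` and `load_config` are hypothetical names, and only `//` line comments are handled.

```python
import json
import re

# Match either a complete JSON string or a // comment running to the end of
# the line. Strings are consumed whole, so the // in "http://localhost" is
# never mistaken for a comment.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|//[^\n]*')


def strip_comments(text):
    """Remove // comments that sit outside string literals."""
    def keep_strings(match):
        token = match.group(0)
        return token if token.startswith('"') else ""
    return _TOKEN.sub(keep_strings, text)


def load_config(text):
    """Parse commented JSON text into a dictionary."""
    return json.loads(strip_comments(text))
```

A naive approach that deletes everything after the first `//` on each line would corrupt URL values, which is why the pattern matches string literals first.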

Setup alerts

Currently two types of alerts are supported:

  • Graphite alert (default) - check graphite metrics
  • URL alert - load a URL over HTTP and check the response status

Note: comments are not allowed in standard JSON, but graphite-beacon strips them.

  "alerts": [
    {
      // (required) Alert name
      "name": "Memory",

      // (required) Alert query
      "query": "*.memory.memory-free",

      // (optional) Alert type (graphite, url)
      "source": "graphite",

      // (optional) Default values format (none, bytes, s, ms, short)
      "format": "bytes",

      // (optional) Alert method (average, last_value, sum, minimum, maximum)
      "method": "average",

      // (optional) Alert interval [eg. 15second, 30minute, 2hour, 1day, 3month, 1year]
      "interval": "1minute",

      // (optional) What kind of alert to send when no data received (normal = no alert)
      "no_data": "warning",

      // (optional) Alert interval end time (see "Alert interval" for examples)
      "until": "5second",

      // (required) Alert rules
      // Rule format: "{level}: {operator} {value}"
      // Level one of [critical, warning, normal]
      // Operator one of [>, <, >=, <=, ==, !=]
      // Value (an absolute value such as 3000000, or a short form such as 3MB or 12minute)
      // Multiple conditions can be combined with AND or OR
      "rules": [ "critical: < 200MB", "warning: < 300MB" ]
    }
  ]

Historical values

graphite-beacon supports "historical" values in a rule. For example, you may want to get a warning when CPU usage is greater than 150% of normal usage:

"warning: > historical * 1.5"

Or memory is less than half the usual value:

"warning: < historical / 2"

Historical values for each query are kept. A historical value represents the average of all values in history. Rules using a historical value will only work after enough values have been collected (see history_size).

History values are kept for 1 day by default. You can change this with the history_size option.
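The behaviour described above can be sketched as a fixed-size window of recent values whose average is compared against the current value. This is an illustration under stated assumptions, not the project's implementation: the `History` class and `breaches` function are hypothetical, the window size is assumed to be history_size divided by the query interval, and "enough values" is assumed to mean a full window.

```python
from collections import deque
import operator

# Comparison operators allowed in rules.
OPERATORS = {
    "<": operator.lt, ">": operator.gt, "<=": operator.le,
    ">=": operator.ge, "==": operator.eq, "!=": operator.ne,
}


class History:
    """Keeps the most recent values of one query (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        self.values = deque(maxlen=size)  # old values fall off automatically

    def add(self, value):
        self.values.append(value)

    def average(self):
        # Assumption: the rule cannot be evaluated until the window is full.
        if len(self.values) < self.size:
            return None
        return sum(self.values) / len(self.values)


def breaches(value, op, factor, history):
    """Check `value <op> historical * factor`; False until history is full."""
    historical = history.average()
    if historical is None:
        return False
    return OPERATORS[op](value, historical * factor)
```

With this sketch, the rule "warning: < historical / 2" corresponds to `breaches(value, "<", 0.5, history)`.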

The example below sends a warning when today's new user count is less than 80% of the average for the last 10 days:

"alerts": [
  {
    "name": "Registrations",
    // Run once per day
    "interval": "1day",
    "query": "Your graphite query here",
    // Average over the last 10 days
    "history_size": "10day",
    "rules": [
      // Warning if today's new user count is below 80% of the 10-day average
      "warning: < historical * 0.8",
      // Critical if today's new user count is below 50% of the 10-day average
      "critical: < historical * 0.5"
    ]
  }
]

Handlers

Handlers allow for notifying an external service or process of an alert firing.

Email Handler

Sends an email (enabled by default).

{
    // SMTP default options
    "smtp": {
        "from": "beacon@graphite",
        "to": [],                   // List of email addresses to send to
        "host": "localhost",        // SMTP host
        "port": 25,                 // SMTP port
        "username": null,           // SMTP user (optional)
        "password": null,           // SMTP password (optional)
        "use_tls": false,           // Use TLS?
        "html": true,               // Send HTML emails?

        // Graphite link for emails (By default is equal to main graphite_url)
        "graphite_url": null
    }
}

HipChat Handler

Sends a message to a HipChat room.

{
    "hipchat": {
        // (optional) Custom HipChat URL
        "url": "https://api.custom.hipchat.my",

        "room": "myroom",
        "key": "mykey"
    }
}

Webhook Handler (HTTP)

Triggers a webhook.

{
    "http": {
        "url": "http://myhook.com",
        "params": {},                 // (optional) Additional query(data) params
        "method": "GET"               // (optional) HTTP method
    }
}

Slack Handler

Sends a message to a user or channel on Slack.

{
    "slack": {
        "webhook": "https://hooks.slack.com/services/...",
        "channel": "#general",          // #channel or @user (optional)
        "username": "graphite-beacon"
    }
}
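To show how the three Slack options fit together, here is a minimal sketch that builds (but does not send) a request for a Slack incoming webhook using the Python 3 standard library. It is not the handler's actual code, which runs on tornado. `build_slack_request` is a hypothetical name, and the message text is an invented example.

```python
import json
from urllib.request import Request


def build_slack_request(webhook, text, channel=None, username=None):
    """Build a POST request for a Slack incoming webhook (hypothetical helper)."""
    payload = {"text": text}
    # channel and username are optional overrides, as in the config above.
    if channel:
        payload["channel"] = channel
    if username:
        payload["username"] = username
    return Request(
        webhook,
        data=json.dumps(payload).encode("utf-8"),
        # An explicit JSON content type helps Slack-compatible servers
        # that do not guess the body format.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would deliver the message. The sketch stops short of that so it can be inspected without network access.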

Command Line Handler

Runs a command.

{
    "cli": {
        // Command to run (required)
        // Several variables that will be substituted by values are allowed:
        //  ${level} -- alert level
        //  ${name} -- alert name
        //  ${value} -- current metrics value
        //  ${limit_value} -- metrics limit value
        "command": "./myscript ${level} ${name} ${value} ...",

        // Whitelist of alerts that will trigger this handler (optional)
        // All alerts will trigger this handler if absent.
        "alerts_whitelist": ["..."]
    }
}
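The `${...}` placeholders use the same syntax as Python's `string.Template`, so the substitution and whitelist check can be sketched as follows. This is an illustration, not the handler's real code, and `render_command` is a hypothetical name.

```python
from string import Template


def render_command(command, alert_name, level, value, limit_value, whitelist=None):
    """Fill in the command variables, or return None if the alert is filtered out."""
    # An absent whitelist means every alert triggers the handler.
    if whitelist is not None and alert_name not in whitelist:
        return None
    # safe_substitute leaves unknown ${...} placeholders untouched
    # instead of raising KeyError.
    return Template(command).safe_substitute(
        level=level, name=alert_name, value=value, limit_value=limit_value)
```

If the rendered command is passed to a shell, values such as the alert name should be quoted first (for example with `shlex.quote`), since alert names may contain spaces.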

PagerDuty Handler

Triggers a PagerDuty incident.

{
    "pagerduty": {
        "subdomain": "yoursubdomain",
        "apitoken": "apitoken",
        "service_key": "servicekey"
    }
}

Telegram Handler

Sends a Telegram message.

{
    "telegram": {
        "token": "telegram bot token",
        "bot_ident": "token you choose to activate bot in a group",
        "chatfile": "path to file where chat ids are saved, optional field"
    }
}

Command Line Usage

  $ graphite-beacon --help
  Usage: graphite-beacon [OPTIONS]

  Options:

    --config                         Path to an configuration file (JSON/YAML)
                                     (default config.json)
    --graphite_url                   Graphite URL (default http://localhost)
    --help                           show this help information
    --pidfile                        Set pid file

    --log_file_max_size              max size of log files before rollover
                                     (default 100000000)
    --log_file_num_backups           number of log files to keep (default 10)
    --log_file_prefix=PATH           Path prefix for log files. Note that if you
                                     are running multiple tornado processes,
                                     log_file_prefix must be different for each
                                     of them (e.g. include the port number)
    --log_to_stderr                  Send log output to stderr (colorized if
                                     possible). By default use stderr if
                                     --log_file_prefix is not set and no other
                                     logging is configured.
    --logging=debug|info|warning|error|none
                                     Set the Python log level. If 'none', tornado
                                     won't touch the logging configuration.
                                     (default info)

Bug tracker

If you have any suggestions, bug reports, or annoyances, please report them to the issue tracker at https://github.com/klen/graphite-beacon/issues

Contributors

License

Licensed under the MIT license.

If you wish to express your appreciation for the project, you are welcome to send a postcard to:

Kirill Klenov
pos. Severny 8-3
MO, Istra, 143500
Russia