Banshee is a real-time anomaly (outlier) detection system for periodic metrics.
For example, a website API's response time is reported to banshee from statsd every 10 seconds:
20, 21, 21, 22, 23, 19, 18, 21, 22, 20, ..., 300
The latest value, 300, will be caught as an anomaly.
- Designed for periodic metrics.
- Dynamic threshold analysis via 3-sigma.
- Also supports a fixed-threshold alert option.
- Provides an alert rule management panel.
- No extra storage services required.
- Go >= 1.5.
- Node and gulp.
- Statsd.
It is strongly recommended to use statsd as the banshee client.
- Clone this repo and check out the latest release.
- Build the binary by running make.
- Build the static files by running make static.
$ ./banshee -c <config-filename>
An example configuration file is provided at config/exampleConfig.yaml.
- Install statsd-banshee to forward metrics to banshee:
  $ cd path/to/statsd
  $ npm install statsd-banshee
- Add statsd-banshee to the statsd backends in config.js:
  {
  , backends: ['statsd-banshee']
  , bansheeHost: 'localhost'
  , bansheePort: 2015
  }
- timers: timer.mean_90.*, timer.upper_90.*, timer.count_ps.*
- counters: counter.*
- gauges: gauge.*
Detection should work for any metric name delimited by dots, but the types above are better supported and are the recommended banshee input.
statsd-banshee formats metric names for banshee before the data is sent out.
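To check whether a metric name falls under one of the recommended types, the patterns listed above can be matched with shell-style wildcards. This is only an illustrative sketch: the helper recommended_type and the RECOMMENDED_PATTERNS table are hypothetical names, and banshee's own rule matching may behave differently.

```python
from fnmatch import fnmatchcase

# Patterns copied from the recommended metric types above.
RECOMMENDED_PATTERNS = {
    "timer": ("timer.mean_90.*", "timer.upper_90.*", "timer.count_ps.*"),
    "counter": ("counter.*",),
    "gauge": ("gauge.*",),
}


def recommended_type(name):
    """Return 'timer', 'counter' or 'gauge' if name matches a recommended
    pattern, otherwise None. Hypothetical helper, not part of banshee."""
    for kind, patterns in RECOMMENDED_PATTERNS.items():
        # fnmatchcase is case-sensitive and lets '*' span several dot segments.
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            return kind
    return None
```

For instance, recommended_type("timer.mean_90.api.get") falls under the timer patterns, while a name such as "cpu.load" matches none of them and yields None.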
The web panel manuals are available in English and 简体中文.
Banshee is a single-host program. Its detection is fast enough for our use case, and we currently have no plans to scale it out.
We use a Python script (deploy.py, via fabric) to deploy it to a remote host:
python deploy.py -u hit9 -H remote-host:22 --remote-path "/service/banshee"
Just pull the latest tagged release. Please don't use the master branch directly; check out a tag instead.
We generally don't release backward-incompatible versions. If we ever do, the related notes will be added to the changelog.
Banshee requires a command, normally a script, to send alert messages.
It is called from the command line like this:
$ ./alert-command <JSON-String>
An example of the JSON string can be found in alerter/exampleCommand/echo.go.
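As a rough illustration of such a command, the sketch below reads the JSON string from its single argument and prints it back in a stable form. It deliberately makes no assumption about which fields the JSON contains (see alerter/exampleCommand/echo.go for the real shape); parse_alert and format_alert are hypothetical names, and a real script would replace the print with a call to a mail, SMS or chat service.

```python
import json
import sys


def parse_alert(arg):
    """Decode the JSON string passed as the command's single argument."""
    return json.loads(arg)


def format_alert(payload):
    """Render the decoded payload as one line of text.

    No field names are assumed here; the payload is dumped as-is.
    """
    return "banshee alert: " + json.dumps(payload, sort_keys=True)


# Only act when run as a script with exactly one argument (the JSON string).
if __name__ == "__main__" and len(sys.argv) == 2:
    print(format_alert(parse_alert(sys.argv[1])))
```

Saved as an executable script, it could stand in for alert-command in the invocation shown above.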
But how does banshee actually detect anomalous metrics? Via 3-sigma:
>>> import numpy as np
>>> x = np.array([40, 52, 63, 44, 54, 43, 67, 54, 49, 45, 48, 54, 57, 43, 58])
>>> mean = np.mean(x)
>>> std = np.std(x)
>>> (80 - mean) / (3 * std)
1.2608052883472445 # anomaly, too big
>>> (20 - mean) / (3 * std)
-1.3842407711224991 # anomaly, too small
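The same check can be wrapped into a small function. The following is a minimal sketch using only the Python standard library; three_sigma_score and is_anomaly are illustrative names rather than banshee APIs, and the handling of a zero standard deviation is an assumption made here. Banshee's actual detector may differ in detail.

```python
import statistics


def three_sigma_score(history, value):
    """Return (value - mean) / (3 * std) of value against history."""
    mean = statistics.fmean(history)
    # Population standard deviation, matching the default of np.std.
    std = statistics.pstdev(history)
    if std == 0:
        # Assumption: a perfectly flat history gives no basis for a score.
        return 0.0
    return (value - mean) / (3 * std)


def is_anomaly(history, value):
    """A value is anomalous when it lies more than 3 sigma from the mean."""
    return abs(three_sigma_score(history, value)) > 1
```

With the sample array from the session above as history, is_anomaly reports both 80 and 20 as anomalies, while a value close to the mean such as 55 is not.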
For more details on the implementation, please see docs/algorithms.md.
If you are using statsd as the banshee client, please see statsd-banshee.
The network protocol is line based:
<NAME> <STAMP> <VALUE> '\n'
where NAME is a string, STAMP is an integer timestamp in seconds, and VALUE is a floating-point number.
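A client that speaks this protocol directly can be quite small. The sketch below is written under several assumptions that are not stated above: the transport is taken to be TCP, the default host and port are borrowed from the statsd-banshee example (localhost, 2015), and the helper names format_line and send_metric are made up for illustration.

```python
import socket
import time


def format_line(name, stamp, value):
    """Build one protocol line: '<NAME> <STAMP> <VALUE>' plus a newline."""
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("metric name must be non-empty and contain no whitespace")
    return f"{name} {int(stamp)} {float(value)}\n"


def send_metric(name, value, host="localhost", port=2015, stamp=None):
    """Send a single metric line to banshee (assumes a TCP listener)."""
    if stamp is None:
        stamp = int(time.time())  # current time in whole seconds
    line = format_line(name, stamp, value)
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("utf-8"))
```

A call such as send_metric("timer.mean_90.api", 21.5) would then emit a single line stamped with the current time.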
For the web API, please see docs/web-api.md.
For running banshee with Docker, please see docker/README.md.
Thanks to our contributors.
MIT Copyright (c) 2015 - 2016 Eleme, Inc.