A modern, distributed monitoring system written in Go.
Another monitoring system? Why?
While there are a bunch of solutions for monitoring and alerting using time series data, there aren't many (or any?) modern solutions for 'regular'/'old-skool' remote monitoring similar to Nagios and Icinga.
9volt offers the following things out of the box:
- Single binary deploy
- Fully distributed
- Incredibly easy to scale to hundreds of thousands of checks
etcdfor all configuration
- Real-time configuration pick-up (update etcd -
9voltimmediately picks up the change)
- Support for assigning checks to specific (groups of) nodes
- Helpful for getting around network restrictions (or requiring certain checks to run from a specific region)
- Interval based monitoring (ie. run check XYZ every 1s, 1y, 1d or even 1ms)
- Natively supported monitors:
- Natively supported alerters:
- RESTful API for querying current monitoring state and loaded configuration
- Comes with a built-in, react based UI that provides another way to view and manage the cluster
- Access the UI by going to http://hostname:8080/ui
- Comes with a built-in monitor and alerter config management util (that parses and syncs YAML-based configs to etcd)
./9volt cfg --help
- Download latest
- Start server:
./9volt server -e http://etcd-server-1.example.com:2379 -e http://etcd-server-2.example.com:2379 -e http://etcd-server-3.example.com:2379
- Optional: use
9volt cfgfor managing configs
- Optional: add
9voltto be managed by
upstartor some other process manager
- Optional: Several configuration params can be passed to
9voltvia env vars
... or, if you prefer to do things via Docker, check out these docs.
H/A and scaling
9volt is incredibly simple. Launch another
9volt service on a separate host and point it to the same
etcd hosts as the main
9volt node will produce output similar to this when it detects a node join:
Checks will be automatically divided between the all
If one of the nodes were to go down, a new leader will be elected (if the node that went down was the previous leader) and checks will be redistributed among the remaining nodes.
This will produce output similar to this (and will be also available in the event stream via the API and UI):
API documentation can be found here.
Minimum requirements (can handle ~1,000-3,000 <10s interval checks)
- 1 x 9volt instance (1 core, 256MB RAM)
- 1 x etcd node (1 core, 256MB RAM)
Note In the minimum configuration, you could run both
etcd on the same node.
Recommended (production) requirements (can handle 10,000+ <10s interval checks)
- 3 x 9volt instances (2+ cores, 512MB RAM)
- 3 x etcd nodes (2+ cores, 1GB RAM)
While you can manage 9volt alerter and monitor configs via the API, another approach to config management is to use the built-in config utility (
9volt cfg <flags>).
This utility allows you to scan a given directory for any YAML files that resemble 9volt configs (the file must contain either 'monitor' or 'alerter' sections) and it will automatically parse, validate and push them to your etcd server(s).
By default, the utility will keep your local configs in sync with your etcd server(s). In other words, if the utility comes across a config in etcd that does not exist locally (in config(s)), it will remove the config entry from etcd (and vice versa). This functionality can be turned off by flipping the
You can look at an example of a YAML based config here.
Read through the docs dir.
Got a suggestion/idea? Something that is preventing you from using
9volt over another monitoring system because of a missing feature? Submit an issue and we'll see what we can do!