Web-based server monitoring and reporting, Scout-compatible.

Sheriff is a web-based tool for server monitoring and reporting.

  • keeps track of what was reported (historic values)
  • keeps track of who reported (hostname/ip)
  • distributes scout-compatible plugins to deputies (see deputy)
  • alerts via logging / email / sms when something goes wrong

Development server

git clone git@github.com:dawanda/sheriff.git
cd sheriff
bundle
cp config/config.yml.example config/config.yml
cp config/database.yml.example config/database.yml
rake db:create
rake db:migrate
rake #run tests
rails s

Generating test data

curl "http://localhost:3000/notify?group=Cron.count_users&value=123"
# open "http://localhost:3000/reports/1"
# add a value validation (value: 1, warn via: email)
curl "http://localhost:3000/notify?group=Cron.count_users&value=123"
# open "http://localhost:3000/reports"
# you should see an error => group (Cron) and subgroup (count_users) are marked as error

Reporting

Values are pushed to Sheriff via HTTP GET, e.g. with curl, but preferably via deputy:

curl "http://localhost:3000/notify?group=Cron.count_users&value=123"
deputy Cron.count_users 123

# report the success/failure of script execution
./database_backup ; deputy Cron.db_backup $?
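
The same GET request can also be sent from Ruby. The following is a minimal sketch, not part of Sheriff or deputy: the helper names notify_uri and report are made up for illustration, and only the /notify path and the group/value parameters come from the curl example above.

```ruby
require "net/http"
require "uri"

# Build the /notify URL for one reported value.
# (hypothetical helper, not part of Sheriff or deputy)
def notify_uri(base_url, group, value)
  uri = URI.join(base_url, "/notify")
  uri.query = URI.encode_www_form(group: group, value: value)
  uri
end

# Send the report with a plain HTTP GET and return the response status code.
# (hypothetical helper; needs a running Sheriff at base_url)
def report(base_url, group, value)
  Net::HTTP.get_response(notify_uri(base_url, group, value)).code
end

# Usage, with the development server running:
# report("http://localhost:3000", "Cron.count_users", 123)
```

Building the query with URI.encode_www_form keeps values with spaces or special characters safe, which a hand-assembled URL string would not.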

Validations

Sheriff validates reported values against a set of validations to see if someone should be notified.

  • ValueValidation -- reported value matches 'x', 1, 1..5, /foo/
  • RunEveryValidation -- reported every 10 minutes / only once per day
  • RunBetweenValidation -- reported between 00:00 and 02:00
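
How Sheriff implements ValueValidation internally is not shown in this Readme. As a rough sketch of the idea, assuming reported values arrive as strings (they come from a query parameter), a hypothetical value_matches? helper could handle the four kinds of expectations listed above like this:

```ruby
# Hypothetical helper, not Sheriff's actual code: checks a reported value
# against an expected string, number, range or regular expression.
def value_matches?(expected, reported)
  case expected
  when Regexp
    !(expected =~ reported.to_s).nil?
  when Range, Numeric
    # reported values are assumed to be strings, so convert before comparing
    number = begin
      Float(reported)
    rescue ArgumentError, TypeError
      nil
    end
    !number.nil? && expected === number
  else
    expected.to_s == reported.to_s
  end
end

value_matches?("x", "x")         # string expectation
value_matches?(1, "1")           # numeric expectation
value_matches?(1..5, "3")        # range expectation
value_matches?(/foo/, "foobar")  # regular expression expectation
```

A value that does not match would then trigger the configured notification (logging, email or sms).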

Plugins

Plugins can be stored and assigned to deputies/servers to run every x minutes/hours/days. These plugins are compatible with Scout, so you can use the 50+ existing Scout plugins or build your own.

class Redis < Scout::Plugin
  def build_report
    report :memory => `/opt/redis/redis-cli info | grep used_memory: | sed s/used_memory://`.strip
  end
end
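
To try out a plugin on your own machine before storing it in Sheriff, a tiny stand-in for Scout::Plugin is enough. The stub below is an assumption made for this sketch: it only implements report and is not the real Scout or deputy class. The TmpEntries plugin is likewise just an illustration.

```ruby
require "tmpdir"

# Hypothetical stand-in for Scout::Plugin, for local experiments only.
# It collects everything passed to `report` instead of sending it anywhere.
module Scout
  class Plugin
    def reports
      @reports ||= []
    end

    def report(data)
      reports << data
    end
  end
end

# Example plugin: reports how many entries the temp directory contains.
class TmpEntries < Scout::Plugin
  def build_report
    report :tmp_entries => Dir.glob(File.join(Dir.tmpdir, "*")).size
  end
end

plugin = TmpEntries.new
plugin.build_report
puts plugin.reports.inspect
```

With the stub in place, the plugin can be run with plain ruby and its output inspected before it is assigned to any server.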

Plugins are executed via deputy --run-plugins. deputy queries Sheriff for the plugins assigned to this host and runs those that are due. The Sheriff host is configured e.g. in:

#/etc/deputy.yml
sheriff_url: localhost:3000
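
For scripts that want to reuse the same configuration, the file can be read with Ruby's YAML library. This is a sketch under assumptions, not deputy's own code: the helper names, the lookup order of /etc/deputy.yml before ~/.deputy.yml, and the defaulting to http:// when no scheme is given are all guesses made for illustration.

```ruby
require "yaml"

# Candidate config locations; the order of precedence is an assumption.
def default_config_paths
  ["/etc/deputy.yml", File.join(Dir.home, ".deputy.yml")]
end

# Hypothetical helper: returns sheriff_url from the first config file found,
# prefixing http:// when the configured value has no scheme.
def sheriff_url(paths = default_config_paths)
  path = paths.find { |candidate| File.exist?(candidate) }
  raise "no deputy config found in #{paths.join(', ')}" unless path

  url = YAML.safe_load(File.read(path)).fetch("sheriff_url")
  url =~ %r{\Ahttps?://} ? url : "http://#{url}"
end

# Usage, once one of the config files exists:
# puts sheriff_url
```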

Resque

To keep Sheriff responsive, report processing should be queued in Resque.
Install Redis on localhost and set resque: true in config.yml:

# config.yml
resque: true

If activated, Resque workers are started on cap deploy, and the Resque status can be seen at your-sheriff-url.com/resque/overview
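
For readers new to Resque: a job is a plain Ruby class with a @queue class instance variable and a self.perform method, and the web request only enqueues the job instead of doing the work itself. The class below is a hypothetical illustration of that convention. Its name, queue name and body are assumptions and not Sheriff's actual job. The group/subgroup split follows the Cron.count_users example used elsewhere in this Readme.

```ruby
# Hypothetical job class following the Resque convention; not Sheriff's code.
class ProcessReportJob
  @queue = :reports # queue name is an assumption

  # Called by a Resque worker with the arguments given to Resque.enqueue.
  def self.perform(group, value)
    main_group, subgroup = group.split(".", 2)
    { "group" => main_group, "subgroup" => subgroup, "value" => value }
  end
end

# Enqueueing needs the resque gem and a running Redis:
# Resque.enqueue(ProcessReportJob, "Cron.count_users", "123")
```

Because perform is an ordinary class method, it can be called directly in tests without Redis.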

Hoptoad

Add hoptoad_api_key to config.yml to get errors reported to Hoptoad.

Newrelic

If you want performance analysis via Newrelic, add your config/newrelic.yml

Demo / Heroku

You can play around with the demo at sheriff.heroku.com. It is public, so people will make crazy/dangerous plugins.
Do not run plugins via deputy.
Only ValueValidations work, since there are no cron jobs.

# configure deputy via /etc/deputy.yml or ~/.deputy.yml
sheriff_url: http://sheriff.heroku.com

# report a value
deputy Foo.bar 111

# run plugins written by anonymous pranksters
deputy --run-plugins --no-wait

To run your own setup
Set up your Heroku account:

git clone https://github.com/dawanda/sheriff.git
cd sheriff
heroku create my-sheriff

Make a config in config/config.heroku.yml

sh/configure_heroku.rb
git push heroku master

Setup on normal server

Sheriff is a Rails app deployed via Capistrano. It needs:

  • Relational database (tested with MySQL/PostgreSQL)
  • Rack server (tested with passenger)
  • Mail setup in e.g. sheriff/shared/config/initializers/mail.rb
  • (Optional) Resque for higher responsiveness / no timeouts
  • (Optional) goyyamobile.com account for sms notifications
  • (Optional) Newrelic account for performance analysis
  • (Optional) Hoptoad account for error reporting

Commands

For user 'deploy' in group 'users', deploying to /srv/sheriff:

# on server:
sudo su
cd /srv
mkdir sheriff
chown -R deploy:users sheriff

sudo su deploy
cd /srv/sheriff
mkdir -p shared/config
mkdir -p shared/log
mkdir -p shared/pids
# add customized shared/config/config.yml + database.yml [+ newrelic.yml]

# from your box
bundle exec cap deploy

Server

Use anything Rack-compatible, e.g. Passenger (passenger start [OPTIONS]):

passenger start --port 3000 --address myhost.com --environment production --max-pool-size 1

or add it via a normal Apache/nginx config.

Logrotate

Don't let the log files grow!

sudo ln -s /srv/sheriff/current/config/logrotate /etc/cron.d/sheriff

Cron

To notice when a report is missing, a cron job has to check for it every minute:

* * * * * cd /srv/sheriff/current && RAILS_ENV=production ruby sh/cron_minute.rb && deputy Cron.sheriff
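
What sh/cron_minute.rb does in detail is not described here. The underlying idea can be sketched as follows, under assumptions: a report that is expected every interval seconds counts as missing once it is overdue by more than a grace period. The helper name and the 60 second default grace period (one cron tick) are made up for this example.

```ruby
# Hypothetical helper, not taken from sh/cron_minute.rb.
# interval and grace are in seconds; a report never seen counts as missing.
def report_missing?(last_reported_at, interval, now = Time.now, grace = 60)
  return true if last_reported_at.nil?

  (now - last_reported_at) > interval + grace
end

now = Time.at(10_000)
report_missing?(Time.at(10_000 - 300), 600, now) # reported 5 minutes ago, expected every 10
report_missing?(Time.at(10_000 - 700), 600, now) # overdue by more than the grace period
```

The grace period avoids false alarms when a report arrives a few seconds after the minutely check has already run.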

TODO

  • remove capistrano-ext dependency
  • make sms provider configurable (create a gem for that ?)
  • make 1.9 compatible
  • highlight and notify any new error/alert message <-> set them to default email -> user can adjust down
  • make plugin OPTIONS configurable