Analyzes cron jobs (managed by puppet) to identify clashes, hot spots, etc.. and maps out an infrastructure-wide schedule visualization.
Python
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
.gitignore
Makefile
README
cron-analyze.py
cron-parse.py
cronlib.py
puppet.py

README

/!\ This is a work in process:

    What works:
        Parsing/storage of puppet catalogs / crons.
        Regex search of all crons.
        Output to ics (iCal).

    TODO:
        Analysis of cron clashes, etc.
        Graphical output per day/week.

/!\

Puppet Cron Analyzer

@Author: Charlie Schluting <charlie@schluting.com>

Purpose:
Analyze crons across an infrastructure, and visualize scheduled run times to identify
clashes or hot spots. Also, identifies duplicate crons and allows searching across
all crons in puppet.

Basically, I've found that puppet-managed cron jobs tend to grow and grow...
especially when developers (in addition to ops) are creating them :)

Questions I needed answered:
Are all nightly or hourly map/reduce jobs running at the same time?
Are all backups running at the same time?
Are all multiple jobs that hit mysql or cassandra heavily, running at the same time?
Where can I better schedule some of these clashes?!
Where are we running crons with the string 'foobar'?
What crons are running at 3am on this box (like, we see a graph showing high disk IO
  and want to know if it's a cron).

Running:
./puppet.py (on master) to generate json blob of all puppet catalogs to catalogs/
./cron-parse.py to parse all catalogs and create host-specific files in parse-output/
./cron-analyze.py to run basic analysis

You can now search across all crons with existing data (super fast):
./cron-analyze.py -e -f '.*'

Display all crons on a specific host:
./cron-analyze.py -e -f '.*' --host $PUPPET_CERTNAME

Or generate an ics file for visualization in calendar apps:
./cron-analyze.py -e -i ical -n 7
 (view from beginning of current year. this gets insane.. use a small -n!)

You can also manually run cron-analyze.py on specific host's file:
./cron-analyze.py ./analyze-output/hostname.fqdn

See --help for the latest available options.

Bonus:
This required the implementation of cronlib, which normalizes cron entries into
strictly lists of integers. It can also return a list of all timestamps in the
next year a cron will run.

Requires:
Python 2.7
icalendar module