Automated Process Reaper for Unix Systems

Satan does one thing and one thing only: it triggers restarts of failed SMF services. Satan was designed to work with Solaris' SMF self-healing properties. Let Satan restart/kill while SMF revives.

The Satan name is a play on the God monitor, http://god.rubyforge.org/
Satan was developed because God overlaps too much in functionality with SMF, so it is not practical to run on Solaris.

Features
- No dependencies aside from Ruby
- Email notification on reaped processes
- Easy-to-use DSL to define reaping rules
- HTTP checks to reap based on non-200 response code

INSTALLATION
- Install satan on your run path: /opt/bin;/opt/sfw/bin;/usr/bin
- Create a satan config file, look at satan.cfg as an example
- Edit satan.smf to your liking and import: svccfg import satan.smf
  - be sure to set up the PATH in smf properly
  - run satan with credentials that allow it to call svcadm
  - make satan dependent on the application that it monitors, this is crucial!

HOW TO USE
- /opt/bin/satan ~/satan.cfg
OR
- via SMF, see installation block.

The configuration is all done in Ruby, clean and simple.

    Satan.watch do |s|
      s.name          = "jvm instances"                  # name of job
      s.fmri          = "applications/myapp"             # solaris FMRI of the SMF service to watch
      s.debug         = true                             # whether to write out debug information
      s.safe_mode     = false                            # if in safe mode, satan will not kill ;-(
      s.interval      = 10.seconds                       # interval to run at to collect statistics
      s.contact       = "firstname.lastname@example.org" # admin contact, optional if you want email alerts
      s.restart_grace = 30.seconds                       # grace period for svcadm restart before kill -9 is used

      s.kill_if do |process|
        process.daemon = "java"     # (optional) name of the process that rules should be applied to
        process.args   = "myapp"    # (optional) substring to match in the arguments string
        process.user   = "webservd" # (optional) effective user (owner) of the process

        process.condition(:cpu) do |cpu|       # on cpu condition
          cpu.above = 48.percent               # if above certain percentage
          cpu.times = 5                        # how many times we can hit this condition before killing
        end

        process.condition(:memory) do |memory| # on memory condition
          memory.above = 850.megabytes         # limit for memory use
          memory.times = 5                     # how many times we can hit this condition before killing
        end

        # ActiveMQ tends to die on us under heavy load so we need the power of satan!
        process.condition(:http) do |http|     # on http condition
          http.uri     = "http://localhost:8161/admin/queues.jsp" # the URI
          http.timeout = 5.seconds             # timeout
          http.times   = 5                     # how many times before the kill
        end

        # Checks free space in the old generation of a JVM heap
        # requires $JDK_HOME/bin/jstat to be on the PATH
        process.condition(:jvm_free_heap) do |free_heap|
          free_heap.below = 200.megabytes      # minimum free old generation space
          free_heap.times = 5                  # how many times we can hit this condition before killing
        end
      end
    end

HOW IT WORKS
- satan is started by SMF after the monitored app starts (because of the smf dependency)
- satan will periodically invoke all the configured rules/conditions
- if a rule failure is detected, a counter for this rule is increased
- if a rule subsequently succeeds, the counter is decreased (down to a minimum of 0)
- if the failure counter reaches the value defined via the 'times' property, satan asks SMF to restart the monitored service
- if SMF fails to restart the service within the 'restart_grace' period, satan will use kill -9 to kill all processes of the service (to avoid the SMF maintenance state)
- once the monitored service shuts down, SMF will temporarily shut satan down as well (because of the dependency)
- monitoring will be resumed once the monitored app is started again
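
The counter behaviour described above can be sketched in a few lines of Ruby. This is only an illustration of the increase/decrease/threshold logic, not satan's actual source: the RuleCounter class and its record method are hypothetical names made up for this example, and the sketch assumes the counter moves by exactly one per check.

```ruby
# Hypothetical sketch of the per-rule failure counter; not taken from satan itself.
class RuleCounter
  attr_reader :failures

  # times: the value of the rule's 'times' property
  def initialize(times)
    @times = times
    @failures = 0
  end

  # Record the result of one check (failed is true when the rule failed).
  # Returns true once the counter has reached 'times', which is the point
  # at which satan would ask SMF to restart the service.
  def record(failed)
    if failed
      @failures += 1
    else
      # A successful check decreases the counter, but never below 0.
      @failures = [@failures - 1, 0].max
    end
    @failures >= @times
  end
end

counter = RuleCounter.new(3)
checks = [true, true, false, true, true] # true means the check failed

checks.each_with_index do |failed, i|
  restart = counter.record(failed)
  puts "check #{i + 1}: failures=#{counter.failures} restart=#{restart}"
end
```

With times set to 3, the single successful check in the middle pushes the restart back: the counter goes 1, 2, 1, 2, 3, so the restart is only requested on the fifth check. A process that misbehaves only occasionally may therefore never reach the limit.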