Only for Spawnfest 2012. Canonical repository at:

README
This is The CHAOS MONKEY.  It will kill your processes.

"What kills you, makes you stronger" -- The Chaos Monkey

The purpose of The Chaos Monkey is to find out if your system is
stable or not.  What will your system do when things start to go wrong
and your processes die randomly?  The Chaos Monkey will show you.
With a stick.

Start by including chaos_monkey among your applications.  The
Chaos Monkey sits in its cage and eats bananas.  Let it out by running
chaos_monkey:on(), or have it kill a single process using
chaos_monkey:kill().  More installation instructions can be found in
INSTALL.
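
As a minimal sketch of that first step, assuming a hypothetical
application called my_app and a standard OTP application resource
file, the dependency could be declared roughly like this.  Everything
except the chaos_monkey application name is illustrative, and INSTALL
remains the authoritative reference.

```erlang
%% my_app.app.src (hypothetical application, for illustration only)
{application, my_app,
 [{description, "Example application that lets The Chaos Monkey in"},
  {vsn, "0.1.0"},
  {registered, []},
  %% Listing chaos_monkey here means it must be started before my_app.
  {applications, [kernel, stdlib, chaos_monkey]}]}.
```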


chaos_monkey:on() -> {ok, started} | {error, already_running}
chaos_monkey:on(Opts) ->
    {ok, started}
  | {error, already_running}
  | {error, badarg}
chaos_monkey:off() -> {ok, stopped} | {error, not_running}

  Types:

    Opts :: [Opt]
    Opt :: {ms, non_neg_integer()}
         | {apps, all | all_but_otp | [atom()]}

  Will let The Chaos Monkey wreak reasonable havoc over time on your
  system.  This is the command to use if you want The Chaos Monkey
  running all the time.

  It will stay away from system processes and supervisors like a good
  monkey.

  Opts default to [{ms, 5000}, {apps, all_but_otp}], which allows The
  Chaos Monkey to kill one process every five seconds on average.  It
  can deviate from this number by 30%.  If your restart frequency
  settings don't allow for this, you could be in for a surprise.
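
  A small sketch of how these calls might be wrapped, using only the
  functions and return values listed above.  The module name
  monkey_session and both helpers are hypothetical.  interval_bounds/1
  assumes that "deviate by 30%" means the delay between kills stays
  within plus or minus 30% of the ms value, which the text above does
  not spell out.

```erlang
-module(monkey_session).
-export([interval_bounds/1, run_for/2]).

%% Assumed reading of the 30% deviation: the delay between two kills
%% falls somewhere between 70% and 130% of the configured ms value.
interval_bounds(Ms) when is_integer(Ms), Ms >= 0 ->
    {Ms * 7 div 10, Ms * 13 div 10}.

%% Let The Chaos Monkey out for DurationMs milliseconds, then put it
%% back in its cage.  Opts is passed to chaos_monkey:on/1 unchanged.
run_for(Opts, DurationMs) ->
    case chaos_monkey:on(Opts) of
        {ok, started} ->
            timer:sleep(DurationMs),
            chaos_monkey:off();
        {error, Reason} ->
            %% already_running or badarg, as documented above
            {error, Reason}
    end.
```

  Under that reading, monkey_session:interval_bounds(5000) gives
  {3500, 6500}, so a supervisor should tolerate a child dying roughly
  every 3.5 seconds.  A call such as
  monkey_session:run_for([{ms, 2000}, {apps, [my_app]}], 60000) would
  harass the hypothetical my_app for one minute.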


chaos_monkey:almost_kill() ->
    {ok, NumberOfKilledProcesses} | {error, Error}
chaos_monkey:almost_kill(Applications) ->
    {ok, NumberOfKilledProcesses} | {error, Error}

  Types:

    Applications :: all | all_but_otp | [application()]
    NumberOfKilledProcesses :: non_neg_integer()
    Error :: term()

  Synchronous.
  
  This function call will almost kill your system.  If it works as
  published, The Chaos Monkey should stay one process away from
  bringing your system down.  Can you recover from that?

  The Chaos Monkey will randomly walk through processes belonging to
  the list of applications and kill things.  Supervisors are too
  strong for The Chaos Monkey, so it will kill their children instead,
  aiming to kill them by going above the restart threshold.  The Chaos
  Monkey is not suicidal so it will respect restart thresholds of
  permanent top level supervisors.

  Supervisors are not the only ones that are too strong: system
  processes and processes in the kernel application are off limits
  too.  As mentioned above, The Chaos Monkey will avoid suicide and,
  by extension, will not kill its siblings or its parent.

  The Applications argument tells The Chaos Monkey to focus its
  killing spree on:

    all -- All applications are available for killing.

    all_but_otp -- The Chaos Monkey will stay away from applications
                   in OTP.  Everything else is fair game.  Default.

    [ListOfApps] -- Sic The Chaos Monkey on the list of applications.
                    Remember that the Monkey will always see lonesome
                    processes that don't have the protection of an
                    application as available for harassment.
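
  One way to answer "Can you recover from that?" is to check
  afterwards whether the targeted application is still running.  The
  sketch below assumes a hypothetical module monkey_drill.  It relies
  only on the documented chaos_monkey:almost_kill/1 and on the
  standard application:which_applications/0.

```erlang
-module(monkey_drill).
-export([drill/1, still_running/1]).

%% Almost kill a single application, then report how many processes
%% were killed and whether the application survived the ordeal.
drill(App) when is_atom(App) ->
    case chaos_monkey:almost_kill([App]) of
        {ok, Killed} ->
            {ok, Killed, still_running(App)};
        {error, Error} ->
            {error, Error}
    end.

%% true if App appears in the list of currently running applications.
still_running(App) ->
    lists:keymember(App, 1, application:which_applications()).
```

  Since almost_kill is synchronous, still_running/1 is evaluated only
  after the killing spree is over.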


chaos_monkey:kill() -> {ok, ProcData}

  Kills a single random non-OTP process in your system.

  ProcData contains information about the process that was killed.


chaos_monkey:find_orphans() -> [Pid]

  The Chaos Monkey will smell your processes and find the ones which
  lack protection from an application.  It gladly hands them over to
  you to do with as you please.  A well-behaved system should return
  [] when calling this function.
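
  To see what an orphan actually is before deciding its fate, the
  returned pids can be paired with some basic process information.
  This is only a sketch: the module orphan_report is hypothetical, and
  it assumes the pids returned by find_orphans/0 are local to the node
  so that erlang:process_info/2 can be used on them.

```erlang
-module(orphan_report).
-export([report/0, describe/1]).

%% Describe every process The Chaos Monkey considers an orphan.
%% A well-behaved system yields [] here.
report() ->
    [describe(Pid) || Pid <- chaos_monkey:find_orphans()].

%% Pair a pid with its registered name, initial call and current
%% function.  The info part is 'undefined' if the process has died.
describe(Pid) when is_pid(Pid) ->
    {Pid, erlang:process_info(Pid, [registered_name,
                                    initial_call,
                                    current_function])}.
```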