Well, actually DRAX is a reverse acronym inspired by the Guardians of the Galaxy character Drax the Destroyer.
You might have heard of Netflix's Chaos Monkey or its containerized variant. Maybe you've seen a gaming version of it or stumbled upon a lower-level species. In any case, I assume you're somewhat familiar with chaos-based resilience testing.
DRAX is a DC/OS-specific resilience testing tool that works mainly at the task level. Future work may extend it to the node level and up to the cluster level.
Installation and usage
Note that DRAX assumes a running DC/OS 1.9 cluster.
Launch DRAX using the DC/OS CLI via the Marathon app spec provided:
$ dcos marathon app add marathon-drax.json
Now you can (modulo the public node of your cluster) do the following:
If you launched DRAX via Marathon, you can also trigger a POST to the /rampage endpoint on a schedule by deploying a DC/OS job. The example job triggers the destruction every business hour from Monday to Friday:
$ dcos job add metronome-drax.json
Testing and development
Get DRAX and build from source:
$ go get github.com/dcos-labs/drax
$ go build
$ MARATHON_URL=http://localhost:8080 ./drax
INFO This is DRAX in version 0.4.0 main=init
INFO Listening on port 7777 main=init
INFO On destruction level 0 main=init
INFO Using Marathon at http://localhost:8080 main=init
INFO I will destroy 2 tasks on a rampage main=init
And in a different terminal session:
For Go development, be aware of the following dependencies (not using explicit vendoring ATM):
- github.com/gambol99/go-marathon, an API library for working with Marathon.
- github.com/Sirupsen/logrus, a logging library.
Note that the following environment variables are pre-set in the Marathon app spec and yours to overwrite.
Number of target tasks
To specify how many tasks DRAX is supposed to destroy in one rampage, use NUM_TARGETS. For example, NUM_TARGETS=5 drax means that (up to) 5 tasks will be destroyed, unless the overall number of tasks is less, of course.
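The "up to" rule can be sketched in Go as follows. This is a minimal illustration, not DRAX's actual code: the function name targetCount and the fallback handling are hypothetical, and the fallback of 2 is only taken from the startup log shown earlier ("I will destroy 2 tasks on a rampage").

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// targetCount parses a NUM_TARGETS-style value and caps it at the number of
// currently running tasks. If the value is empty, not an integer, or
// negative, the fallback is used instead (an assumption of this sketch).
func targetCount(raw string, running, fallback int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 {
		n = fallback
	}
	// Never try to destroy more tasks than actually exist.
	if n > running {
		n = running
	}
	return n
}

func main() {
	raw := os.Getenv("NUM_TARGETS")
	running := 3 // hypothetical number of tasks reported by Marathon
	fmt.Printf("would destroy %d of %d tasks\n", targetCount(raw, running, 2), running)
}
```

With NUM_TARGETS=5 and only 3 running tasks, the sketch settles on 3 targets, which matches the behavior described above.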
Log level
To influence the log level, use the LOG_LEVEL env variable; for example, LOG_LEVEL=DEBUG drax would give you fine-grained log messages (defaults to
Will return an HTTP 200 code and I am Groot if DRAX is healthy.
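A client-side check of that contract might look like the Go sketch below. The helper names isHealthy and checkHealth are hypothetical, the URL is passed in rather than assumed, and matching the body with a substring test is an assumption about how strictly the response should be compared. The stub server in main only stands in for a running DRAX instance so the example runs on its own.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// isHealthy applies the rule described above: HTTP 200 plus "I am Groot".
// Using a substring match is an assumption of this sketch.
func isHealthy(status int, body string) bool {
	return status == http.StatusOK && strings.Contains(body, "I am Groot")
}

// checkHealth fetches the given health URL and evaluates the response.
func checkHealth(url string) (bool, error) {
	resp, err := http.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return isHealthy(resp.StatusCode, string(body)), nil
}

func main() {
	// Local stub that mimics a healthy instance; it is not DRAX itself.
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "I am Groot")
	}))
	defer stub.Close()

	ok, err := checkHealth(stub.URL)
	fmt.Println("healthy:", ok, "error:", err)
}
```

Separating the pure check from the HTTP call keeps the rule easy to test without a cluster.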
Will return runtime statistics, such as killed containers or apps, and will report from the beginning of time (well, the beginning of time for DRAX, anyway).
Will trigger a destruction. Invoke with:
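From Go, one way to send the POST to /rampage is sketched here. The helpers rampageURL and triggerRampage are hypothetical names, the base URL is whatever address your DRAX instance is reachable at, and the response is returned as raw text because its format is not assumed. The stub server in main replaces a real DRAX instance so that nothing is actually destroyed when you run the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// rampageURL joins a base URL such as http://localhost:7777 with /rampage.
func rampageURL(base string) string {
	return strings.TrimRight(base, "/") + "/rampage"
}

// triggerRampage sends an empty POST and returns the status code and raw body.
func triggerRampage(base string) (int, string, error) {
	req, err := http.NewRequest(http.MethodPost, rampageURL(base), nil)
	if err != nil {
		return 0, "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	// Stub that accepts only POST on /rampage; the reply text is made up.
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/rampage" {
			http.Error(w, "unexpected request", http.StatusBadRequest)
			return
		}
		fmt.Fprint(w, "stub: rampage accepted")
	}))
	defer stub.Close()

	status, body, err := triggerRampage(stub.URL)
	fmt.Println(status, body, err)
}
```

To target a real instance, pass its base URL to triggerRampage instead of the stub's address.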