Skip to content
Go to file

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

[ALPHA] etcd-mesos

This is an Apache Mesos framework that runs an etcd cluster. It performs periodic health checks to ensure that the cluster has a stable leader and that raft is making progress. It replaces nodes that die.



  • runs, monitors, and administers an etcd cluster of your desired size
  • recovers from n/2-1 failures by reconfiguring the etcd cluster and launching replacement nodes
  • recovers from up to n-1 simultaneous failures by picking a survivor to re-seed a new cluster (ranks survivors by raft index, prefering the replica with the highest commit)
  • etcd proxy configurer (etcd-mesos-proxy) and optional SRV record support via mesos-dns


Marathon spec:

  "id": "etcd",
  "container": {
    "docker": {
      "forcePullImage": true,
      "image": "mesosphere/etcd-mesos:0.1.0-alpha-target-23-24-25"
    "type": "DOCKER"
  "cpus": 0.2,
  "env": {
    "FRAMEWORK_NAME": "etcd",
    "WEBURI": "http://etcd.marathon.mesos:$PORT0/stats",
    "MESOS_MASTER": "zk://master.mesos:2181/mesos",
    "ZK_PERSIST": "zk://master.mesos:2181/etcd",
    "AUTO_RESEED": "true",
    "RESEED_TIMEOUT": "240",
    "CLUSTER_SIZE": "3",
    "CPU_LIMIT": "1",
    "DISK_LIMIT": "4096",
    "MEM_LIMIT": "2048",
    "VERBOSITY": "1"
  "healthChecks": [
      "gracePeriodSeconds": 60,
      "intervalSeconds": 30,
      "maxConsecutiveFailures": 0,
      "path": "/healthz",
      "portIndex": 0,
      "protocol": "HTTP"
  "instances": 1,
  "mem": 128.0,
  "ports": [


First, check out the tagged release for your version of Mesos.

For Mesos versions 23, 24, or 25, check out v0.1.0-alpha-target-23-24-25

For Mesos versions 22 and below (the farther below, the less the chances of compatibility), check out v0.1.0-alpha-target-22

Next, built it!


The important binaries (etcd-mesos-scheduler, etcd-mesos-proxy, etcd-mesos-executor) are now present in the bin subdirectory.

A typical production invocation will look something like this:

/path/to/etcd-mesos-scheduler \
    -log_dir=/var/log/etcd-mesos \
    -master=zk://zk1:2181,zk2:2181,zk3:2181/mesos \
    -framework-name=etcd \
    -cluster-size=5 \
    -executor-bin=/path/to/etcd-mesos-executor \
    -etcd-bin=/path/to/etcd \
    -etcdctl-bin=/path/to/etcdctl \

If you'd like to build a new docker container, change the DOCKER_ORG and VERSION in the Makefile, and then run:

make docker

service discovery

Options for finding your etcd nodes on mesos:

  • Run the included proxy binary locally on systems that use etcd. It retrieves the etcd configuration from mesos and starts an etcd proxy node. Note that this it not a good idea on clusters with lots of tasks running, as the master will iterate through each task and spit out a fairly large chunk of JSON, so this approach should be avoided in favor of mesos-dns on larger clusters.
etcd-mesos-proxy --master=zk://localhost:2181/mesos --framework-name=etcd
  • Use mesos-dns or another system that creates SRV records and have an etcd proxy use SRV discovery:
etcd --proxy=on --discovery-srv=etcd.mesos
  • Use Mesos DNS or another DNS SRV system and have clients resolve _etcd-server._client.<framework name>.mesos

  • Use another system that builds configuration from mesos's state.json endpoint. This is how #1 works, so check out the code for it in cmd/etcd-mesos-proxy/app.go if you want to go this route. Be sure to minimize calls to the master for state.json on larger clusters, as this becomes an expensive operation that can easily DDOS your master if you are not careful.

  • Current membership may be queried from the etcd-mesos-scheduler's /members http endpoint that listens on the --admin-port (default 23400)

You can’t perform that action at this time.