stolon - PostgreSQL cloud native HA replication manager
stolon is a cloud native PostgreSQL manager for PostgreSQL high availability. It's cloud native because it lets you keep a highly available PostgreSQL inside your containers (Kubernetes integration) but also on every other kind of infrastructure (cloud IaaS, old style infrastructures, etc.).
For an introduction to stolon you can also take a look at this post
- Leverages PostgreSQL streaming replication.
- Resilient to any kind of partitioning. While trying to keep the maximum availability, it prefers consistency over availability.
- Kubernetes integration letting you achieve PostgreSQL high availability.
- Uses a cluster store like etcd or consul as a highly available data store and for leader election.
- Asynchronous (default) and synchronous replication.
- Full cluster setup in minutes.
- Easy cluster administration.
- Automatic service discovery and dynamic reconfiguration (handles postgres and stolon processes changing their addresses).
- Can use pg_rewind for fast instance resynchronization with the current master.
Stolon is composed of 3 main components:
- keeper: it manages a PostgreSQL instance converging to the clusterview provided by the sentinel(s).
- sentinel: it discovers and monitors keepers and calculates the optimal clusterview.
- proxy: the client's access point. It enforces connections to the right PostgreSQL master and forcibly closes connections to unelected masters.
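To make the "converging to the clusterview" idea more concrete, here is a minimal sketch in Go. The `ClusterView` struct, the `Role` type and the `DesiredRole` function are hypothetical names invented for this illustration; they are not stolon's actual types or store format. The sketch only shows the general shape: the sentinel publishes a view naming the master, and each keeper derives from it the role its PostgreSQL instance should converge to.

```go
package main

import "fmt"

// ClusterView is a hypothetical, simplified stand-in for the view the
// sentinel publishes. It is NOT stolon's real on-store format.
type ClusterView struct {
	Version int    // incremented by the sentinel on every change
	Master  string // ID of the keeper currently elected as master
}

// Role is the state a keeper should drive its PostgreSQL instance to.
type Role string

const (
	RoleMaster  Role = "master"
	RoleStandby Role = "standby"
)

// DesiredRole illustrates the "converge" step: a keeper compares its own ID
// with the master named in the view and derives the role it must reach.
func DesiredRole(keeperID string, cv ClusterView) Role {
	if cv.Master == keeperID {
		return RoleMaster
	}
	return RoleStandby
}

func main() {
	cv := ClusterView{Version: 2, Master: "keeper1"}
	for _, id := range []string{"keeper0", "keeper1"} {
		fmt.Printf("%s -> %s\n", id, DesiredRole(id, cv))
	}
}
```

In this sketch a keeper never decides on its own who the master is; it only reacts to the view it reads, which mirrors the split of responsibilities described in the list above.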
Stolon is under active development and used in different environments. Its on-disk format (store hierarchy and key contents) will probably change in the future to support new features. If a breaking change is needed, it will be documented in the release notes and an upgrade path will be provided.
Anyway it's quite easy to reset a cluster from scratch keeping the current master instance working and without losing any data.
- golang 1.5
- postgresql >= 9.4
- etcd >= 2.0 or consul >= 0.6
We need a compatible etcd version. The easiest way to get it and set it up is:
# install some tools for local development
make install-dev-tools

# restore deps
godep restore ./...

# build etcd
cd $GOPATH/src/github.com/coreos/etcd
./build

# create dev cluster with tls
cd hack/tls-setup
make
goreman start
And then in project dir you can launch:
This launches a setup similar to the simple cluster example.
Basic go tests can be launched like:
Also you might want to launch integration tests locally:
Note: These tests use real backends; in this example I'll use etcd and PostgreSQL 9.4. If you are able to run stolon locally, for example using the simple cluster example, you can launch the tests like this:
$ PATH=/usr/lib/postgresql/9.4/bin/:$PATH INTEGRATION=1 STOLON_TEST_STORE_BACKEND=etcd ETCD_BIN=$GOPATH/src/github.com/coreos/etcd/bin/etcd ./test
Your postgres service should be stopped before this. Check it with:
$ sudo systemctl status postgresql
# Stop if needed
$ sudo systemctl stop postgresql
If your local system is loaded, tests can hang or leave daemonized postgresql processes, so to be on the safe side, check them with:
$ pidof postgres
# Kill if needed
$ sudo kill -TERM $(pidof postgres)
You can additionally check the process tree by running:
Quick start and examples
Stolon tries to be resilient to any partitioning problem. The cluster view is computed by the leader sentinel and is used to avoid data loss (the most important example: it prevents an old dead master that comes back from being elected as the new master).
There can be many different partitioning cases. The primary ones are covered by various integration tests, and more will be added in the future.
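The old-master case mentioned above can be illustrated with a small Go sketch. This is one possible way to express the idea, not stolon's actual election algorithm: the `Keeper` struct, its `ViewVersion` field and the `ElectMaster` function are all assumptions made up for this example. A keeper is only eligible if it is healthy and has converged to the current view version, so a master that was partitioned away and returns with a stale version is skipped. When no keeper qualifies, no master is elected, which reflects the preference for consistency over availability.

```go
package main

import "fmt"

// Keeper is a hypothetical summary of what a sentinel knows about a keeper.
type Keeper struct {
	ID          string
	Healthy     bool
	ViewVersion int // last cluster view version this keeper converged to
}

// ElectMaster returns the first healthy keeper that is up to date with
// currentVersion. A keeper with a stale version (for example an old master
// coming back after a partition) is never chosen. If nobody is eligible it
// returns ok == false instead of picking a possibly outdated instance.
func ElectMaster(keepers []Keeper, currentVersion int) (id string, ok bool) {
	for _, k := range keepers {
		if k.Healthy && k.ViewVersion == currentVersion {
			return k.ID, true
		}
	}
	return "", false
}

func main() {
	keepers := []Keeper{
		{ID: "keeper0", Healthy: true, ViewVersion: 3}, // old master, back but stale
		{ID: "keeper1", Healthy: true, ViewVersion: 4}, // up to date standby
	}
	id, ok := ElectMaster(keepers, 4)
	fmt.Println(id, ok)
}
```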
Why should clients use the stolon proxy?
Since stolon by default favors consistency over availability, clients need to be connected to the currently elected master and disconnected from unelected ones. For example, if you are connected to the currently elected master and the cluster subsequently elects a new master (for any valid reason, like network partitioning), the client needs to be disconnected from the old master to preserve consistency; otherwise it will write data to the old master that will be lost when that instance resyncs. This is the purpose of the stolon proxy.
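The behaviour described above can be sketched in a few lines of Go. This is a simplified illustration under stated assumptions, not the stolon proxy's implementation: the `Proxy` type, `Track`, `SetMaster` and the `fakeConn` test double are hypothetical names, and real network handling and store watching are left out. The point is only that the proxy remembers which backend each client connection goes to and closes every connection to a backend that is no longer the elected master.

```go
package main

import (
	"fmt"
	"io"
	"sync"
)

// Proxy is a hypothetical connection tracker keyed by backend address.
type Proxy struct {
	mu    sync.Mutex
	conns map[string][]io.Closer
}

func NewProxy() *Proxy {
	return &Proxy{conns: make(map[string][]io.Closer)}
}

// Track records a client connection that was forwarded to backend.
func (p *Proxy) Track(backend string, c io.Closer) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.conns[backend] = append(p.conns[backend], c)
}

// SetMaster is called when a new elected master is observed. It forcibly
// closes every tracked connection to any other backend and returns how many
// connections were closed.
func (p *Proxy) SetMaster(master string) int {
	p.mu.Lock()
	defer p.mu.Unlock()
	closed := 0
	for backend, cs := range p.conns {
		if backend == master {
			continue
		}
		for _, c := range cs {
			c.Close() // error ignored in this sketch
			closed++
		}
		delete(p.conns, backend)
	}
	return closed
}

// fakeConn stands in for a real client connection in this example.
type fakeConn struct{ closed bool }

func (f *fakeConn) Close() error { f.closed = true; return nil }

func main() {
	p := NewProxy()
	p.SetMaster("10.0.0.1:5432")
	c := &fakeConn{}
	p.Track("10.0.0.1:5432", c)

	// The cluster elects a new master: the client on the old one is dropped.
	n := p.SetMaster("10.0.0.2:5432")
	fmt.Println("closed connections:", n, "client closed:", c.closed)
}
```

A client whose connection is closed this way simply reconnects through the proxy and lands on the new master, so it never keeps writing to an instance that is about to be resynced.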
Why didn't you use an already existing proxy like haproxy?
Because we need to forcibly close connections to unelected masters and to handle keepers/sentinels that can come and go and change their addresses, we implemented a dedicated proxy that reads its state directly from the store. Thanks to Go goroutines it's very fast.
We are open to alternative solutions (PRs are welcome), like using haproxy, if they can meet the above requirements. For example, a hypothetical haproxy based proxy needs a way to work with changing IP addresses, to get the current cluster information, and to forcibly close a connection when an haproxy backend is marked as failed (as a note, to achieve the latter, a possible solution that needs testing would be to use the on-marked-down shutdown-sessions haproxy server option).
Contributing to stolon
stolon is an open source project under the Apache 2.0 license, and contributions are gladly welcomed!