stolon - PostgreSQL cloud native HA replication manager

Chat: https://gitter.im/sorintlab/stolon

stolon is a cloud native PostgreSQL manager for PostgreSQL high availability. It's cloud native because it lets you keep a highly available PostgreSQL inside your containers (kubernetes integration), but also on any other kind of infrastructure (cloud IaaS, traditional infrastructures, etc.).

For an introduction to stolon, you can also take a look at this post.

Features

  • Leverages PostgreSQL streaming replication.
  • Resilient to any kind of partitioning. While trying to keep the maximum availability, it prefers consistency over availability.
  • kubernetes integration, letting you achieve PostgreSQL high availability.
  • Uses a cluster store like etcd or consul as a highly available data store and for leader election.
  • Asynchronous (default) and synchronous replication.
  • Full cluster setup in minutes.
  • Easy cluster administration.
  • Automatic service discovery and dynamic reconfiguration (handles postgres and stolon processes changing their addresses).
  • Can use pg_rewind for fast instance resynchronization with the current master.
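As a sketch of how a feature like synchronous replication can be toggled at runtime, the cluster configuration can be patched through stolonctl. The cluster name and the exact flag/key names below are illustrative assumptions and may differ between stolon versions; check `stolonctl --help` for your build:

```shell
# Illustrative sketch: "stolon-cluster" and the config key name are
# assumptions; adjust to your cluster name and stolonctl version.
stolonctl --cluster-name stolon-cluster config patch \
  '{ "synchronous_replication": true }'

# Inspect the resulting cluster configuration:
stolonctl --cluster-name stolon-cluster config get
```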

Architecture

Stolon is composed of three main components:

  • keeper: manages a PostgreSQL instance, converging to the clusterview provided by the sentinel(s).
  • sentinel: discovers and monitors keepers and calculates the optimal clusterview.
  • proxy: the client's access point. It enforces connections to the right PostgreSQL master and forcibly closes connections to unelected masters.
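As a rough sketch of how the three components fit together on a single host, they could be launched like this. The binary names come from this repository, but the flags shown are illustrative assumptions that may differ between versions; check each command's `--help`:

```shell
# Illustrative only: flag names vary between stolon versions.
# A keeper managing a local PostgreSQL data directory:
stolon-keeper --cluster-name stolon-cluster --store-backend etcd \
  --data-dir data/postgres0 &

# A sentinel monitoring the keepers and computing the clusterview:
stolon-sentinel --cluster-name stolon-cluster --store-backend etcd &

# A proxy routing clients to the current elected master
# (port 25432 is an assumption taken from the simple cluster example):
stolon-proxy --cluster-name stolon-cluster --store-backend etcd \
  --port 25432 &
```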

Stolon architecture

Project Status

Stolon is under active development and is used in different environments. Its on-disk format (store hierarchy and key contents) will probably change in the future to support new features. If a breaking change is needed, it will be documented in the release notes and an upgrade path will be provided.

In any case, it's quite easy to reset a cluster from scratch while keeping the current master instance working and without losing any data.

Development

Requirements

  • golang 1.5
  • postgresql >= 9.4
  • etcd >= 2.0 or consul >=0.6

Build

make

Run

We need a compatible etcd version; the easiest way to get and set it up is:

# install some tools for local development
make install-dev-tools

# restore deps
godep restore ./...

# build etcd
cd $GOPATH/src/github.com/coreos/etcd
./build

# create dev cluster with tls
cd hack/tls-setup
make
goreman start

Then, in the project dir, you can launch:

make start

It will launch a setup similar to the simple cluster example.

Tests

Basic go tests can be launched with:

make test

You might also want to launch the integration tests locally:

make test-integration

Note: these tests use real backends; in this example we'll use etcd and postgresql 9.4. If you can run stolon locally, for example using the simple cluster example, you can launch the tests like:

$ PATH=/usr/lib/postgresql/9.4/bin/:$PATH INTEGRATION=1 STOLON_TEST_STORE_BACKEND=etcd ETCD_BIN=$GOPATH/src/github.com/coreos/etcd/bin/etcd ./test

Your postgres service should be stopped before running them. Check it with:

$ sudo systemctl status postgresql
# Stop if needed
$ sudo systemctl stop postgresql

If your local system is under load, tests can hang or leave daemonized postgresql processes behind; to be on the safe side, check for them with:

$ pidof postgres
# Kill if needed
$ sudo kill -TERM $(pidof postgres)

You can also check the process tree by running ps fauxww.

Quick start and examples

Documentation

High availability

Stolon tries to be resilient to any partitioning problem. The cluster view is computed by the leader sentinel and helps avoid data loss (one example above all: preventing old dead masters that come back from being elected as the new master).

There can be many different partitioning cases. The primary ones are covered (and more will be added in the future) by various integration tests.

FAQ

Why should clients use the stolon proxy?

Since stolon by default prefers consistency over availability, clients need to be connected to the current elected cluster master and disconnected from unelected ones. For example, if you are connected to the current elected master and the cluster subsequently (for any valid reason, like a network partition) elects a new master, then to achieve consistency the client needs to be disconnected from the old master (or it will write data to it that will be lost when it resyncs). This is the purpose of the stolon proxy.
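In practice this means clients point their connection string at the proxy rather than at any individual PostgreSQL instance. The host, port, user, and database below are hypothetical assumptions (the port matches the one commonly used in the simple cluster example); adjust them to your setup:

```shell
# Hypothetical example: connect through the stolon proxy instead of a
# specific postgres instance. 127.0.0.1:25432, user and database names
# are assumptions; adjust to your proxy configuration.
psql --host 127.0.0.1 --port 25432 --username stolon --dbname postgres
```

If a new master is elected, the proxy closes this connection; the client simply reconnects to the same address and transparently reaches the new master.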

Why didn't you use an already existing proxy like haproxy?

Given our need to forcibly close connections to unelected masters and to handle keepers/sentinels that can come and go and change their addresses, we implemented a dedicated proxy that reads its state directly from the store. Thanks to Go goroutines, it's very fast.

We are open to alternative solutions (PRs are welcome), like using haproxy, if they can meet the above requirements. For example, a hypothetical haproxy-based proxy needs a way to work with changing IP addresses, to get the current cluster information, and to forcibly close a connection when an haproxy backend is marked as failed (as a note, to achieve the latter, a possible solution that needs testing would be to use the on-marked-down shutdown-sessions haproxy server option).

Contributing to stolon

stolon is an open source project under the Apache 2.0 license, and contributions are gladly welcomed!
