Use celluloid and celluloid-zmq.
No DCell. DCell is nice, but not a perfect fit for this case: we want to manage our own keepalives, authentication, and RPC.
Servers broadcast checks and keepalives with PUB sockets.
Clients use SUB sockets to subscribe to topics matching their configured subscriptions.
Clients also subscribe to two additional topics: the system topic, which servers use to broadcast to all clients, and a client-specific topic, which servers use to address a single client.
Clients push check results to servers with PUSH sockets; servers handle them with PULL sockets.
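The SUB side of this topology amounts to topic-prefix matching, which is how ZeroMQ SUB sockets filter. A minimal pure-Ruby sketch of which topics a client ends up subscribed to; the topic names `system` and `client:<name>` are illustrative assumptions, not fixed by this design:

```ruby
# Compute the topic prefixes a client subscribes to: its configured
# subscriptions plus the system topic and its own client-specific topic.
# The topic names ("system", "client:<name>") are assumptions.
def subscription_topics(client_name, subscriptions)
  subscriptions + ["system", "client:#{client_name}"]
end

# ZeroMQ SUB sockets filter on message prefixes; emulate that check.
def delivered?(topics, message_topic)
  topics.any? { |t| message_topic.start_with?(t) }
end

topics = subscription_topics("web01", ["webservers"])
delivered?(topics, "webservers")    # delivered: configured subscription
delivered?(topics, "client:web01")  # delivered: client-specific topic
delivered?(topics, "client:web02")  # not delivered: addressed to another client
```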
Use a public-key handshake. Nodes authenticate with certificates, verify them, then exchange secret keys for further communication. This allows the network to run over WANs and unprotected links.
Every node, client and server alike, needs a private key and a certificate. If peer verification is enabled, each node also needs a CA certificate.
Authentication handshakes use REQ/REP sockets.
Handshake procedure (inspired by Salt's authentication):
When clients connect, they handshake and get a key. The data sockets are not set up until the key has been retrieved. When clients disconnect, they close all sockets and handshake again upon reconnecting.
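The verify-then-exchange step can be sketched with stdlib OpenSSL. This is a simplified sketch only: raw RSA keypairs and a signed nonce stand in for the full X.509 certificate verification against the CA that the real handshake would perform over REQ/REP.

```ruby
require "openssl"

# Hypothetical sketch of the handshake payloads. Real nodes would
# present X.509 certificates and verify them against the CA
# certificate; raw RSA keys stand in here.
client_key = OpenSSL::PKey::RSA.new(2048)

# Server challenges the client with a nonce; the client signs it
# to prove possession of its private key.
nonce     = OpenSSL::Random.random_bytes(32)
signature = client_key.sign(OpenSSL::Digest.new("SHA256"), nonce)

# Server verifies the signature with the client's public key.
client_pub = OpenSSL::PKey::RSA.new(client_key.public_key.to_pem)
verified   = client_pub.verify(OpenSSL::Digest.new("SHA256"), signature, nonce)

# On success, the server wraps a fresh secret key for the client;
# subsequent traffic is encrypted symmetrically with this key.
secret  = OpenSSL::Random.random_bytes(32)
wrapped = client_pub.public_encrypt(secret)
client_key.private_decrypt(wrapped) == secret
```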
When a client receives a message it cannot decrypt, it re-handshakes with the server.
When a server rotates the key, it broadcasts a ping encrypted with the new key. This causes all clients to re-handshake.
When a key is rotated, the old key remains valid for a short grace period, during which messages encrypted with it are still decrypted. After the grace period, any messages encrypted with the old key are discarded.
Eventually, every client receiving broadcasts from the server will obtain the new key, because failing to decrypt a ping triggers a re-handshake.
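The grace-period rule can be sketched in pure Ruby with stdlib AES-GCM: try the current key, fall back to the old key only inside the grace window. The 30-second grace period and the `KeyRing` name are assumptions for illustration:

```ruby
require "openssl"

# Sketch of key rotation with a decryption grace period.
# GRACE and the KeyRing name are illustrative assumptions.
class KeyRing
  GRACE = 30 # seconds the old key stays valid after rotation (assumed)

  def initialize(key)
    @current, @old, @rotated_at = key, nil, nil
  end

  def rotate!(new_key, now = Time.now)
    @old, @current, @rotated_at = @current, new_key, now
  end

  # Try the current key first; fall back to the old key only
  # while the grace period is still running.
  def decrypt(iv, tag, ciphertext, now = Time.now)
    open(@current, iv, tag, ciphertext) ||
      ((@old && (now - @rotated_at) < GRACE) ?
        open(@old, iv, tag, ciphertext) : nil)
  end

  def self.seal(key, plaintext)
    c = OpenSSL::Cipher.new("aes-256-gcm").encrypt
    c.key = key
    iv = c.random_iv
    ct = c.update(plaintext) + c.final
    [iv, c.auth_tag, ct]
  end

  private

  # Returns the plaintext, or nil when the auth tag rejects the key.
  def open(key, iv, tag, ciphertext)
    c = OpenSSL::Cipher.new("aes-256-gcm").decrypt
    c.key = key
    c.iv = iv
    c.auth_tag = tag
    c.update(ciphertext) + c.final
  rescue OpenSSL::Cipher::CipherError
    nil
  end
end
```

A message sealed with the pre-rotation key still decrypts inside the grace window and is rejected after it, which is exactly what forces lagging clients to re-handshake.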
Clients send keepalives to servers via the result PUSH sockets. Servers handle keepalives with phi accrual failure detectors, similar to Cassandra. Failure detector arrays are backed by Redis lists so servers can remain stateless.
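The detector math can be sketched in pure Ruby. This uses the common exponential-interarrival simplification of phi accrual (as in Cassandra's detector) rather than the full normal-distribution estimate from the original paper; the window size and the threshold mentioned are assumptions:

```ruby
# Minimal phi accrual failure detector sketch (exponential
# approximation): phi = (now - last_heartbeat) / (mean_interval * ln 10).
# In Zensu the interval window would live in a Redis list so a
# stateless server can rebuild it; a plain array stands in here.
class PhiDetector
  WINDOW = 1000 # max heartbeat intervals to remember (assumed)

  def initialize
    @intervals = []
    @last = nil
  end

  def heartbeat(now = Time.now.to_f)
    @intervals << (now - @last) if @last
    @intervals.shift while @intervals.size > WINDOW
    @last = now
  end

  # Higher phi means more suspicion; Cassandra's conventional
  # default threshold is around 8.
  def phi(now = Time.now.to_f)
    return 0.0 if @last.nil? || @intervals.empty?
    mean = @intervals.sum / @intervals.size
    (now - @last) / (mean * Math.log(10))
  end
end
```

With keepalives arriving every second, phi stays well below 1; a 10-second silence pushes it past 4, and suspicion keeps growing the longer the client stays quiet.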
Servers use leader election. The leader is the only server that broadcasts; all servers handle responses. Eventually, servers could divide broadcasting responsibilities among themselves.
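Since state already lives in Redis, leader election can be sketched as a lease: Redis's `SET key value NX PX ttl`, re-taken whenever the holder's lease lapses. A minimal in-memory stand-in for that one command keeps the sketch self-contained; the key name, TTL, and class names are assumptions:

```ruby
# In-memory stand-in for Redis `SET key value NX PX ttl`, so the
# sketch runs without a Redis connection.
class FakeLeaseStore
  def initialize
    @store = {}
  end

  # Returns true if the key was set (no live holder), as SET ... NX does.
  def set_nx_px(key, value, ttl_ms, now = Time.now.to_f)
    holder, expires = @store[key]
    return false if holder && expires > now
    @store[key] = [value, now + ttl_ms / 1000.0]
    true
  end

  def get(key, now = Time.now.to_f)
    holder, expires = @store[key]
    holder if holder && expires > now
  end
end

# Each server periodically tries to take the lease; whoever holds
# it is the one server allowed to broadcast.
class LeaderElector
  KEY = "zensu:leader" # assumed key name

  def initialize(store, id, ttl_ms = 5000)
    @store, @id, @ttl = store, id, ttl_ms
  end

  def try_acquire(now = Time.now.to_f)
    @store.set_nx_px(KEY, @id, @ttl, now) || leader?(now)
  end

  def leader?(now = Time.now.to_f)
    @store.get(KEY, now) == @id
  end
end
```

If the leader dies and stops renewing, the lease expires and any other server's next attempt succeeds, so leadership moves without coordination beyond Redis itself.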
To be truly redundant, we must support automatic Redis failover.
Redis failover is coordinated over a separate pub/sub channel shared only by the servers.
The strategy is similar to this project: https://github.com/jbaudanza/redis-failover
Servers constantly ping Redis. If a server notices that Redis is down, it broadcasts a query asking whether the other servers can still see the master. The Redis server is then said to be on probation.
If the querying server receives an ack, it forwards any responses it was handling to the server that replied and disconnects until it can reach Redis again.
If it receives no reply before the end of the probation period (or receives a sufficient number of nacks), it promotes a slave and broadcasts which slave was promoted.
The servers continue to ping the former master, and when it comes back online the leader tells it to become a slave of the current master.
When a server receives a query broadcast, it first pings the suspected Redis server. If it gets a reply, it responds with an ack; if it cannot connect, it responds with a nack. It takes no action against Redis while the probation timeout is running.
When a server receives a promotion broadcast, it disconnects from the master and connects to the slave named in the broadcast.
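The probation logic above is a small state machine per server. A pure-Ruby sketch of the ack/nack paths; the state names, quorum handling, and returned action symbols are illustrative assumptions, and the probation-timeout path (promote when no reply arrives in time) would drive the same machine from a timer:

```ruby
# Sketch of the per-server probation state machine for Redis
# failover. States, quorum, and action symbols are assumptions.
class FailoverMonitor
  attr_reader :state

  def initialize(nack_quorum)
    @nack_quorum = nack_quorum
    @state = :connected
    @nacks = 0
  end

  # Called when this server's own pings to the master start failing:
  # put the master on probation and (in the real system) broadcast
  # a query to the other servers.
  def master_unreachable!
    @state = :probation
    @nacks = 0
    :broadcast_query
  end

  def handle_reply(reply)
    return :noop unless @state == :probation
    case reply
    when :ack
      # Another server still sees the master: forward our in-flight
      # responses to it and disconnect until Redis is reachable.
      @state = :deferring
      :forward_and_disconnect
    when :nack
      @nacks += 1
      if @nacks >= @nack_quorum
        @state = :failed_over
        :promote_slave
      else
        :wait
      end
    end
  end
end
```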
NOTE: This should be reconsidered to ensure that servers which are not connected shut themselves down or are shut down by the other nodes (node fencing or STONITH). ZeroMQ makes it impossible to detect whether a server is running, so servers must be very good about killing themselves.
It is possible that servers could run without connecting to Redis directly. Servers that are connected to Redis could expose a limited Redis RPC channel that remote servers use to communicate with Redis. These remote servers could never be leaders or participate in Redis failover.
Plugins are run by actors. Actors run permanently and only fork if they need to.
Handlers are also actors and run permanently. Handlers could potentially be stateful, but since they are not sticky and may be restarted at any time, they are encouraged to use Redis stashes for storing state.
The API is a Reel server that implements the Sensu API. Ideally it is API-compatible with Sensu so that the Sensu dashboard can be used.
Ideally, Zensu is configured the same as (or similarly to) Sensu so automation can carry over. There may be some differences, but the overall idea should be the same.
The server needs the following actors:
The client needs the following actors: