grantr edited this page May 18, 2012 · 9 revisions

Replace amqp in sensu with zeromq

Advantages:

  • fewer moving parts
  • no SPOF
  • fun!

Use celluloid and celluloid-zmq.

No DCell. DCell is nice but not a perfect fit for this case: we want to manage our own keepalives, authentication, and rpc.

Broadcast

Servers broadcast checks and keepalives with pub sockets.

Clients use sub sockets to subscribe to topics matching their configured subscriptions.

Clients also subscribe to two topics:

  • the system topic
  • a topic matching the client name

Servers broadcast on the system topic to all clients. Servers broadcast on client-specific topics to address a single client.
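A minimal sketch of how a client might derive its topic list. ZeroMQ sub sockets filter by message prefix, so a subscription to "web" would also receive messages for "webserver" unless topics are terminated somehow. The names `SYSTEM_TOPIC`, `DELIMITER`, `topics_for` and `delivered?`, and the ":" terminator, are assumptions for illustration and not part of the design above.

```ruby
# Hypothetical names throughout; the design does not fix a wire format.
SYSTEM_TOPIC = "system"
DELIMITER = ":" # assumed terminator so "web" does not also match "webserver"

# Build the list of topic prefixes a client would subscribe to:
# its configured subscriptions, the system topic, and its own name.
def topics_for(client_name, subscriptions)
  names = subscriptions + [SYSTEM_TOPIC, client_name]
  names.uniq.map { |name| name + DELIMITER }
end

# Mimic sub socket filtering: a message is delivered when it
# starts with any subscribed prefix.
def delivered?(message, topics)
  topics.any? { |topic| message.start_with?(topic) }
end
```

With celluloid-zmq each returned prefix would be passed to the sub socket's subscribe call. A client name that equals a subscription name would share a topic in this sketch, so a real format may want separate namespaces.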

Results gathering

Clients push check results to servers with push sockets; servers receive them with pull sockets.

Authentication

Use a public key handshake. Nodes authenticate with certificates, verify each other, then exchange a secret key for further communication. This allows the network to run over WANs and unprotected links.

Every node, client and server, needs a private key and a certificate. If peer verification is enabled each node also needs a CA certificate.

Authentication handshakes use REQ/REP sockets.

Handshake procedure (inspired by salt authentication):

  • client sends its key/cert to the server
  • server optionally verifies client cert
  • server encrypts shared key using the client public key
  • server sends its key/cert and encrypted shared key to the client
  • client optionally verifies server cert and decrypts shared key
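The key-wrapping part of the handshake can be sketched with Ruby's standard openssl library. This is only an illustration under stated assumptions: it exchanges bare RSA public keys and skips the optional certificate verification, the 32-byte shared key and OAEP padding are choices made here, and `wrap_shared_key` and `unwrap_shared_key` are hypothetical names. In the real design the request and reply would travel over the REQ/REP sockets.

```ruby
require "openssl"

# Server side (hypothetical helper): encrypt the shared key with the
# client's RSA public key.
def wrap_shared_key(shared_key, client_public_pem)
  client_public = OpenSSL::PKey::RSA.new(client_public_pem)
  client_public.public_encrypt(shared_key, OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING)
end

# Client side (hypothetical helper): recover the shared key with the
# client's private key.
def unwrap_shared_key(wrapped, client_private)
  client_private.private_decrypt(wrapped, OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING)
end

client_key = OpenSSL::PKey::RSA.new(2048)      # client's private key
shared_key = OpenSSL::Random.random_bytes(32)  # server's current secret (size assumed)

request  = client_key.public_key.to_pem            # client sends its public key
wrapped  = wrap_shared_key(shared_key, request)     # server encrypts and replies
received = unwrap_shared_key(wrapped, client_key)   # client decrypts
```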

When clients connect, they handshake and get a key. Apart from the handshake socket, no sockets are opened until the key is retrieved. When clients disconnect, they close all sockets and handshake again upon reconnecting.

When a client receives a message it cannot decrypt, it re-handshakes with the server.

When a server rotates the key, it broadcasts a ping encrypted with the new key. This causes all clients to re-handshake.

When a key is rotated, the old key remains valid for a short grace period, and messages encrypted with it are still decrypted. After the grace period, any messages encrypted with the old key are discarded.

Eventually all clients that are receiving broadcasts from the server will get the new key when they fail to decrypt pings.
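One way to picture rotation with a grace period is a keyring that holds the current key plus the previous one until the grace period ends. The sketch below is an assumption-laden illustration: the `Keyring` class, its method names, the AES-256-GCM cipher and the iv + tag + ciphertext layout are all invented here. It relies on GCM's authentication tag to notice a message that was encrypted with the wrong key.

```ruby
require "openssl"

# Hypothetical keyring: the current key, plus the previous key while
# the grace period lasts.
class Keyring
  attr_reader :current

  def initialize(grace_seconds)
    @grace_seconds = grace_seconds
    @current = OpenSSL::Random.random_bytes(32)
    @previous = nil
    @rotated_at = nil
  end

  def rotate(now = Time.now)
    @previous = @current
    @current = OpenSSL::Random.random_bytes(32)
    @rotated_at = now
  end

  # Keys that are acceptable at time `now`.
  def valid_keys(now = Time.now)
    keys = [@current]
    keys << @previous if @previous && now - @rotated_at <= @grace_seconds
    keys
  end

  # Returns the plaintext, or nil when no valid key fits (message discarded).
  def decrypt(message, now = Time.now)
    valid_keys(now).each do |key|
      plaintext = Keyring.unseal(key, message)
      return plaintext if plaintext
    end
    nil
  end

  # Assumed layout: 12-byte iv, 16-byte auth tag, then ciphertext.
  # Assumes a non-empty plaintext.
  def self.seal(key, plaintext)
    cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
    cipher.key = key
    iv = cipher.random_iv
    ciphertext = cipher.update(plaintext) + cipher.final
    iv + cipher.auth_tag + ciphertext
  end

  def self.unseal(key, message)
    cipher = OpenSSL::Cipher.new("aes-256-gcm").decrypt
    cipher.key = key
    cipher.iv = message.byteslice(0, 12)
    cipher.auth_tag = message.byteslice(12, 16)
    cipher.update(message.byteslice(28..-1)) + cipher.final
  rescue OpenSSL::Cipher::CipherError
    nil # wrong key or tampered message
  end
end
```

On the client side, a nil result from a single-key version of `unseal` would be the trigger to handshake again.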

Keepalives

Clients send keepalives to servers via the result push sockets. Servers handle keepalives with phi accrual failure detectors similar to cassandra. Failure detector arrays are backed by redis lists so servers can remain stateless.
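A rough sketch of a phi accrual failure detector, assuming keepalive intervals are exponentially distributed (a common simplification). Under that assumption the probability that a keepalive is still to come after `t` seconds is `exp(-t / mean)`, so `phi = t / (mean * ln 10)`. The class name, the window size, and the in-memory Array standing in for the redis list are all assumptions.

```ruby
# Hypothetical detector; a plain Array stands in for the redis list
# of keepalive arrival intervals.
class PhiAccrualDetector
  def initialize(window_size = 100)
    @window_size = window_size
    @intervals = []
    @last_arrival = nil
  end

  # Record a keepalive that arrived at time `now` (in seconds).
  def heartbeat(now)
    if @last_arrival
      @intervals << (now - @last_arrival)
      @intervals.shift if @intervals.size > @window_size
    end
    @last_arrival = now
  end

  # Suspicion level at time `now`. Returns nil until at least one
  # interval with a non-zero mean has been sampled.
  def phi(now)
    return nil if @intervals.empty?
    mean = @intervals.sum / @intervals.size.to_f
    return nil if mean.zero?
    (now - @last_arrival) / (mean * Math.log(10))
  end
end
```

A server would compare `phi` against a configured threshold and treat the client as failed once the value exceeds it. Each increase of 1 in phi corresponds to a tenfold drop in the estimated probability that the client is still alive.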

Multiple servers

Servers use leader election. The leader is the only server that broadcasts. All servers handle responses. Eventually servers could divide responsibilities for broadcasting.

Redis failover

To be truly redundant, we must support automatic redis failover.

Redis failover is coordinated using a separate pubsub channel between servers only.

The strategy is similar to this project: https://github.com/jbaudanza/redis-failover

Servers constantly ping redis. If a server notices that redis is down, it broadcasts a query to ask if other servers have seen the master. The redis server is said to be on probation.

If that server receives a reply, it forwards any responses it was handling to the server that responded and disconnects until it can reach redis again.

If the server does not receive a reply before the end of the probation period (or receives a sufficient number of nacks), it promotes a slave and broadcasts the slave that was promoted.

The servers continue to ping the former master, and when it comes back online the leader tells that master to become a slave of the current master.

If a server receives a query broadcast, it first pings the suspected redis server. If it receives a reply, it responds with ack. If it is unable to connect, it responds with nack. It avoids doing anything to redis while the probation period is in effect.

If a server receives a promotion broadcast, it disconnects from master redis and connects to the slave mentioned.
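The probation rules above can be condensed into a small decision helper. This is a sketch only: the function names, the symbols, and the `nack_quorum` parameter (the "sufficient number of nacks") are hypothetical, and the actual messaging between servers is left out.

```ruby
# Hypothetical decision helper for the server that noticed the outage.
# `replies` is the list of :ack / :nack symbols received so far.
def probation_outcome(replies, probation_expired:, nack_quorum:)
  # Another server still sees the master: hand work over and disconnect.
  return :defer_to_peer if replies.include?(:ack)
  # Enough peers agree the master is gone: promote without waiting.
  return :promote_slave if replies.count(:nack) >= nack_quorum
  # Nobody vouched for the master before the probation period ended.
  return :promote_slave if probation_expired
  :wait
end

# Reply a peer sends after pinging the suspected redis server itself.
def query_reply(master_reachable)
  master_reachable ? :ack : :nack
end
```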

NOTE: This should be reconsidered to ensure that servers which are not connected shut themselves down or are shut down by the other nodes (node fencing or STONITH). Zeromq makes it impossible to detect whether a server is running, so servers must be very good about killing themselves.

Redis forwarding

It is possible that servers could be run without connecting to redis directly. Servers that are connected to redis could expose some redis rpc channel that allows remote servers to communicate with redis in a limited way. These remote servers could never be leaders or participate in redis failover.

Plugins

Plugins are run by actors. Actors run permanently and only fork if they need to.

Handlers

Handlers are also actors and run permanently. Handlers could potentially be stateful but since they are not sticky and might be restarted at any time, they are encouraged to use redis stashes for storing state.

API

The API is a reel server that implements the sensu api. Ideally it is api-compatible with sensu so that the sensu dashboard can be used.

Configuration

Ideally Zensu is configured the same as or similarly to sensu so automation can cross over. There may be some differences but the overall idea should be the same.

Components

The server needs the following actors:

  • A Publisher. This schedules broadcasts of subscriptions. Linked to Elector.
  • A Puller. This receives results and routes them to the appropriate handlers.
  • An Authenticator. This receives handshake requests and replies to them. (other possible names: Keymaster, Recognizer)
  • An Elector. This handles leader election. Starts Publisher when it becomes the leader, exits (killing Publisher) when it loses the lead.
  • A Fencer. This ensures that the server fences itself if it is in an exceptional state. Linked to Elector.
  • A Persister. This handles talking to redis or a redis-like backing store (fakeredis, zookeeper?).
  • A RedisMonitor. This handles redis failover duties. Linked to Persister.
  • A KeepaliveHandler. A built-in handler for client keepalives.
  • A set of Client actors. These are linked to the KeepaliveHandler so if it dies they die. Each one maintains a failure detector and state.
  • An actor for each handler.

The client needs the following actors:

  • An Authenticator. This handshakes with the server to get the AES key. (other possible names: Keyslave, ?)
  • A Subscriber. This receives broadcasts from the publisher and routes them to plugins.
  • A Pusher. This receives results from plugins and pushes them to servers.
  • An actor for each plugin.