A simple leadership election service for distributed systems.
This solves the problem of distributed schedulers executing at the same time, when only one execution is desired.
This is an HTTP wrapper over redis's global locks, with HTTP GET requests for ease of use.
gone itself can be scaled out to multiple instances and still return a single leader, thanks to the guarantees of redsync.
Spin up a redis instance, listening at localhost:6379:

    docker run -p 6379:6379 --name rds -d redis

Run the code. The HTTP server runs on port 8080 by default:

    go run main.go

No user setup is required; keys are assumed to never collide.
- Choose a system identifier. This self-chosen string is reused across your distributed service: e.g. if there are 5 redundant copies of a "cleaning scheduler" service, then this would be something like `cleaning_scheduler`.
- Choose a round identifier. This self-chosen string is ephemeral, but must match across your distributed service: e.g. if a tmpfs cleaner runs hourly, this would be something like `2020-10-18-0800-tmpfscleaner`. All copies of the redundant service must generate the same value, so coarse timestamps are recommended.
- Whoever receives the `202 Accepted` HTTP response wins! Losers receive `204 No Content`.
- Optional: upon completion, the winner sends a GET request to the `complete` endpoint to indicate successful execution.
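
Because every replica has to derive the same round identifier on its own, it helps to build it from a truncated timestamp rather than the exact start time. The following is a minimal sketch, assuming an hourly job; the `roundID` helper, the `exampleTime` value and the `tmpfscleaner` suffix are illustrative names, not part of gone.

```go
package main

import (
	"fmt"
	"time"
)

// exampleTime is a fixed, illustrative timestamp used to show the output format.
var exampleTime = time.Date(2020, 10, 18, 8, 17, 42, 0, time.UTC)

// roundID builds a round identifier from a timestamp truncated to the hour (UTC),
// so replicas that start a few seconds or minutes apart still agree on the value.
// The hourly granularity is an assumption; pick one that matches your schedule.
func roundID(now time.Time, job string) string {
	return now.UTC().Truncate(time.Hour).Format("2006-01-02-1504") + "-" + job
}

func main() {
	// Fixed example: prints 2020-10-18-0800-tmpfscleaner
	fmt.Println(roundID(exampleTime, "tmpfscleaner"))

	// Identifier for the current hour.
	fmt.Println(roundID(time.Now(), "tmpfscleaner"))
}
```

Replicas whose clocks straddle an hour boundary would still produce different identifiers, so the schedule should leave some slack around the truncation boundary.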

HTTP GET to `/api/v1/elect/{system_identifier}/{round_identifier}`
- Returns a `202 Accepted` if the requester is the elected leader.
- Returns a `204 No Content` if the requester is not the elected leader. These requests will take longer (approx. 1.5s) due to mutex timeouts.
- In case of `500 Internal Server Error`, an error occurred when recording or verifying leadership. It is safe to retry.
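
A caller might translate the three documented status codes into an explicit outcome before deciding whether to run the job. This is a sketch under assumptions: the `Outcome` type, `classifyElect`, `elect`, the base URL `http://localhost:8080` and the 10 second client timeout are all illustrative choices, not part of gone.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// Outcome is the caller-side reading of an elect response (hypothetical type).
type Outcome int

const (
	Leader   Outcome = iota // 202 Accepted: this caller won the round
	Follower                // 204 No Content: someone else won
	Retry                   // 500 Internal Server Error: documented as safe to retry
)

// classifyElect maps the documented status codes to an Outcome.
// Any other status is reported as an error.
func classifyElect(status int) (Outcome, error) {
	switch status {
	case http.StatusAccepted:
		return Leader, nil
	case http.StatusNoContent:
		return Follower, nil
	case http.StatusInternalServerError:
		return Retry, nil
	default:
		return Follower, fmt.Errorf("unexpected status %d", status)
	}
}

// client uses a generous timeout because losing requests take about 1.5s.
var client = &http.Client{Timeout: 10 * time.Second}

// elect performs a single GET against the elect endpoint.
func elect(baseURL, system, round string) (Outcome, error) {
	u := fmt.Sprintf("%s/api/v1/elect/%s/%s",
		baseURL, url.PathEscape(system), url.PathEscape(round))
	resp, err := client.Get(u)
	if err != nil {
		return Follower, err
	}
	defer resp.Body.Close()
	return classifyElect(resp.StatusCode)
}

func main() {
	outcome, err := elect("http://localhost:8080", "cleaning_scheduler", "2020-10-18-0800-tmpfscleaner")
	if err != nil {
		fmt.Println("election request failed:", err)
		return
	}
	switch outcome {
	case Leader:
		fmt.Println("elected: run the job")
	case Follower:
		fmt.Println("not elected: skip this round")
	case Retry:
		fmt.Println("server error: retry the request")
	}
}
```

On any transport error the sketch treats the caller as not elected, which errs on the side of skipping a run rather than running twice.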

HTTP GET to `/api/v1/elected/{system_identifier}/{round_identifier}`
- Returns a `200 OK` alongside the `Request.RemoteAddr` of the elected leader.
- If no leader was elected (e.g. the round hasn't occurred yet), `204 No Content` is returned.
- In case of `500 Internal Server Error`, a `redis` communication error has occurred. It is safe to retry.
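
Looking up the current leader could be sketched as follows. The description above does not say how the address is encoded, so this sketch assumes it arrives as plain text in the response body; that assumption, along with the `parseElected` and `currentLeader` helpers and the base URL, should be checked against the actual service.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// parseElected interprets a response from the elected endpoint.
// ASSUMPTION: on 200 OK the body holds the leader's RemoteAddr as plain text.
func parseElected(status int, body []byte) (addr string, found bool, err error) {
	switch status {
	case http.StatusOK:
		return strings.TrimSpace(string(body)), true, nil
	case http.StatusNoContent:
		// No leader recorded for this round (for example, it has not run yet).
		return "", false, nil
	default:
		return "", false, fmt.Errorf("unexpected status %d", status)
	}
}

// currentLeader performs one GET against the elected endpoint.
func currentLeader(baseURL, system, round string) (string, bool, error) {
	resp, err := http.Get(fmt.Sprintf("%s/api/v1/elected/%s/%s", baseURL, system, round))
	if err != nil {
		return "", false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", false, err
	}
	return parseElected(resp.StatusCode, body)
}

func main() {
	addr, found, err := currentLeader("http://localhost:8080", "cleaning_scheduler", "2020-10-18-0800-tmpfscleaner")
	switch {
	case err != nil:
		fmt.Println("lookup failed:", err)
	case !found:
		fmt.Println("no leader elected for this round")
	default:
		fmt.Println("leader:", addr)
	}
}
```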

HTTP GET to `/api/v1/complete/{system_identifier}/{round_identifier}`
- Returns a `200 OK` upon successful recording.
- Returns a `400 Bad Request` if another leader(!!!) has already recorded their completion. If this is encountered, the calling system is in a bad state.
- In case of `500 Internal Server Error`, an error occurred when recording or verifying the completion. It is safe to retry.
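
Since only the `500` response is described as safe to retry, a winner recording completion might retry on `500` and stop immediately on `400`. The sketch below is one way to do that; `completeWithRetry`, `ErrConflict`, the attempt count and the wait between attempts are hypothetical choices, not behaviour provided by gone.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// ErrConflict signals the documented 400 response: another leader has already
// recorded completion for this round, so the calling system is in a bad state.
var ErrConflict = errors.New("another leader already recorded completion")

// completeWithRetry calls do until it reports 200 OK, retrying only on 500.
// do performs one request and returns the HTTP status code.
func completeWithRetry(do func() (int, error), attempts int, wait time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(wait)
		}
		status, err := do()
		switch {
		case err != nil:
			return err // transport errors are not retried in this sketch
		case status == http.StatusOK:
			return nil
		case status == http.StatusBadRequest:
			return ErrConflict
		case status == http.StatusInternalServerError:
			lastErr = fmt.Errorf("attempt %d: server returned 500", i+1)
		default:
			return fmt.Errorf("unexpected status %d", status)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}

func main() {
	// The base URL and identifiers are illustrative.
	url := "http://localhost:8080/api/v1/complete/cleaning_scheduler/2020-10-18-0800-tmpfscleaner"
	err := completeWithRetry(func() (int, error) {
		resp, err := http.Get(url)
		if err != nil {
			return 0, err
		}
		resp.Body.Close()
		return resp.StatusCode, nil
	}, 3, 500*time.Millisecond)
	if errors.Is(err, ErrConflict) {
		fmt.Println("bad state: completion was already recorded by another leader")
	} else if err != nil {
		fmt.Println("could not record completion:", err)
	} else {
		fmt.Println("completion recorded")
	}
}
```

Passing the request as a function keeps the retry logic independent of the HTTP client, which makes it easy to exercise with a fake.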

HTTP GET to `/api/v1/completed/{system_identifier}/{round_identifier}`
- Returns a `204 No Content` if no completion was found for the given round.
- Returns a `200 OK` alongside the `Request.RemoteAddr` of the completer.
- In case of `500 Internal Server Error`, a `redis` communication error has occurred. It is safe to retry.