An application-agnostic tool to represent node availability.
What is this nonsense?
In many of our deployment pipelines we had to gather credentials for any number of systems, such as load balancers and monitoring, to disable nodes. Each system might have different authentication requirements, varying quality of APIs or automation libraries, and so on. With stricter RBAC requirements we sometimes ran into a system that required admin access to make a simple state change. Aside from the cumbersome credential management, such access levels can be a problem in more compliance-oriented environments.
Our original goals:
- Remove a node from being active in a load balancer pool
- Ensure monitoring knows the node is not active
- Determine if the node is done draining connections before continuing maintenance
Most of these external systems did have some means of internally managing the state, whether via health checks or some other function. Thus our initial approach was to work with our developers to enable some kind of health state in their applications that we could configure our systems to utilize. However, this path leaves "off the shelf" software without a path.
Once we tried to cover all of our use cases, we decided the best route would be to split this functionality out into a standalone web service.
But what about that 3rd goal??
Early on we thought we could easily expose active connections on the host through this service. But really it's a separate function, and only tangentially related. If this is something you are interested in, we'd recommend checking out where we did implement it: the Ansible wait_for module, as state=drained. A standalone implementation is available, but not really encouraged, as it was more of a proof of concept.
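To illustrate what "done draining" means, here is a minimal shell sketch that counts established TCP connections on a local port and waits until none remain. It is not how Plight or the Ansible module does it; the function names `count_established` and `wait_until_drained` are hypothetical, and the parsing assumes the usual `ss -tn` column layout (state, queues, local address:port, peer address:port).

```shell
#!/bin/sh
# Hypothetical helper: count ESTAB lines on stdin whose local port is $1.
# Input is expected in `ss -tn` format; reading stdin keeps the parser testable.
count_established() {
  awk -v port="$1" '
    $1 == "ESTAB" {
      n = split($4, parts, ":")      # local address:port is the 4th column
      if (parts[n] == port) count++  # last ":"-separated piece is the port
    }
    END { print count + 0 }
  '
}

# Hypothetical helper: poll until no established connections remain on port $1.
wait_until_drained() {
  while [ "$(ss -tn | count_established "$1")" -gt 0 ]; do
    sleep 2
  done
}
```

For example, `wait_until_drained 80` would block until the local port 80 has no established connections, assuming `ss` is installed on the host.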
Why is it called Plight?
a dangerous, difficult, or otherwise unfortunate situation
Originally it was called 'nodestatus', so we took to the thesaurus. Given where we were with our original approach to this problem, we landed on plight.
Fedora or EL-based:
We have a COPR that is kept current with releases. After enabling that repository, install using:
yum install plight
- TODO: push our puppet module to be public and publish to forge
By default Plight comes with 3 explicit states, which as of the 0.1.0 series are configurable in
|State|HTTP status code|Description|
|---|---|---|
|Enabled|200|node is available|
|Disabled|404|node is unavailable|
|Offline|503|node is offline|
Long term we are going to change the status codes to default to 200 for these three states. We maintained the states from the previous release for compatibility purposes.
Enable the service
chkconfig plightd on
service plightd start
systemctl enable plightd
systemctl start plightd
The default port configured for plight is 10101. In our examples directory there is a service entry for firewalld.
Running plight --help from the CLI will give you a list of all valid plight
commands, which include start, stop, and a dynamically generated list based on
Put a node into maintenance mode
Put a node into offline mode
Return a node to active mode
List the configured states
Checking the current state of a node
curl http://localhost:10101 -D -
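Because each default state maps to a distinct HTTP status code, a script can infer the state from the response code alone. The following is a minimal sketch under that assumption; the function names `plight_state_from_code` and `plight_check` are hypothetical, and the mapping only holds while the default status codes listed above are in use.

```shell
#!/bin/sh
# Hypothetical helper: translate a default Plight status code into a state name.
plight_state_from_code() {
  case "$1" in
    200) echo "enabled" ;;
    404) echo "disabled" ;;
    503) echo "offline" ;;
    *)   echo "unknown" ;;   # custom codes, or no response at all
  esac
}

# Hypothetical helper: query a node (default localhost:10101) and print its state.
plight_check() {
  host="${1:-localhost}"
  port="${2:-10101}"
  # -w '%{http_code}' prints only the status code; curl reports 000 on failure.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://${host}:${port}/") || true
  plight_state_from_code "$code"
}
```

Calling `plight_check` with no arguments queries the local node; `plight_check web01.example.com` would query a remote one (the hostname here is only a placeholder).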
All files contained within this distribution are licensed either under the Apache License v2.0 or the GNU General Public License v2.0. You must agree to the terms of these licenses and abide by them before viewing, utilizing, modifying, or distributing the source code contained within this distribution.
- Install via makefile
sudo make install
- Generate the RPM
- Requires buildsys-macros installed
COPR (publishing RPMs)
- Generate SRPM
- Publish SRPM to a publicly available HTTP or FTP repo
- Load the build into COPR
copr-cli build Plight http://example.com/path/to/plight.src.rpm
- Generate deb package
PPA (publishing DEBs)
- Generate source package bits
- Change to ./artifacts/debs/ and generate signed source with changes
cd artifacts/debs
debuild -S -sa
- Push to PPA
dput ppa:gregswift/plight plight_VERSION_source.changes