A collection of cluster reliability tools built for Kubernetes
Governor is a collection of tools for improving the stability of large Kubernetes clusters, packaged as a single Docker image.
Two common problems observed in large Kubernetes clusters are:
- Node failure due to underlying cloud provider issues.
- Pods stuck in the "Terminating" state that cannot be cleaned up.
node-reaper lets worker nodes be force-terminated so that replacement nodes come up. pod-reaper force-terminates pods that have been stuck in the Terminating state for a given amount of time.
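To make the pod-reaper idea concrete, here is a minimal shell sketch of the selection step only. It is not Governor's actual implementation: the function name `find_stuck_pods`, the input format, and the threshold value are assumptions made for illustration. It relies on the fact that Kubernetes sets `metadata.deletionTimestamp` on a pod once deletion has been requested, and it assumes that timestamp has already been converted to epoch seconds.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Governor.
# Usage: find_stuck_pods NOW_EPOCH THRESHOLD_SECONDS < pod_list
# Each input line: "<namespace> <name> <deletion_epoch>"
# A deletion_epoch of "-" means no deletion was requested for that pod.
find_stuck_pods() {
  local now="$1" threshold="$2" ns name deleted
  while read -r ns name deleted; do
    case "$deleted" in
      ''|-) continue ;;   # skip blank lines and pods that are not terminating
    esac
    # A pod counts as stuck when deletion was requested more than
    # THRESHOLD_SECONDS before NOW_EPOCH.
    if [ $(( now - deleted )) -gt "$threshold" ]; then
      printf '%s/%s\n' "$ns" "$name"
    fi
  done
}

# Example with made-up data: only web-1 has been terminating for over 600s.
printf '%s\n' \
  'default web-1 1000' \
  'default web-2 -' \
  'kube-system dns-1 1500' | find_stuck_pods 1700 600
# prints: default/web-1
```

A pod selected this way could then be removed by hand with `kubectl delete pod <name> -n <namespace> --grace-period=0 --force`; how pod-reaper itself performs the deletion is not described here.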
Assuming a running Kubernetes cluster hosted on AWS:
    kubectl create namespace governor

    # Using a CronJob
    kubectl apply -n governor -f https://raw.githubusercontent.com/orkaproj/governor/master/examples/node-reaper.yaml
    kubectl apply -n governor -f https://raw.githubusercontent.com/orkaproj/governor/master/examples/pod-reaper.yaml
| node-reaper | terminates nodes in scaling groups | node-reaper |
| pod-reaper | force terminates stuck pods | pod-reaper |
- Release alpha version of governor
❤ Contributing ❤
Please see CONTRIBUTING.md.
Please see DEVELOPER.md.