New Features and Enhancements
- Adds Chaos Events (across all Litmus components, i.e., operator, runner, and experiment job) to indicate the experiment lifecycle
- Enhances ChaosResult with the experiment failure reason (step) provided in the CR status
- Adds the Node Memory Hog experiment to the generic/kubernetes suite
- Adds the OpenEBS pool disk loss experiment for GKE/AWS
- Adds support for Amazon EKS platform for generic chaos experiments
- Introduces a new chart category based on chaostoolkit with initial pod chaos experiments
- Supports override of default runner properties such as imagePullPolicy & entrypoint/args
- Extends cleanupPolicy enforcement to chaos-runner pods (apart from just the experiment job) with improved reconciliation flow
- Improves the experiment chaoslib, which now uses jobs instead of daemonsets to reduce the number of chaos resources (pods) used in an experiment, with the chaos injection commands built into the job templates
- Adds support for RAMP_UP / RAMP_DOWN periods during the course of a chaos experiment
- Homogenizes the time units used across experiments for chaos duration and other parameters (seconds instead of milliseconds)
- Improves the e2e suite with Ginkgo-based BDD tests for newly added experiments and operator functionality
- Refactors the test-tools repository structure based on tool type
- Introduces an NFS liveness tool to lay foundation for NFS storage chaos experiments
- Adds governance artefacts (Maintainers, Governance) along with the project roadmap and an initial set of public adopters of LitmusChaos
- Adds license dependencies and scan reports obtained via fossa
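The ramp periods and the move to seconds mentioned in the list above can be pictured with a small shell sketch. It is only an illustration under stated assumptions, not the chaoslib implementation: the helper names run_with_ramp and ms_to_sec are hypothetical, and the sketch assumes that RAMP_UP and RAMP_DOWN are waiting periods, in seconds, before and after chaos injection.

```shell
#!/bin/sh
# Hypothetical helper (not part of Litmus): wait RAMP_UP seconds, run the
# given chaos command, then wait RAMP_DOWN seconds. Both variables default
# to 0 and are assumed to be expressed in seconds.
run_with_ramp() {
  ramp_up="${RAMP_UP:-0}"
  ramp_down="${RAMP_DOWN:-0}"
  sleep "$ramp_up"        # ramp-up period before injection
  "$@"                    # the chaos injection command itself
  status=$?
  sleep "$ramp_down"      # ramp-down period after injection
  return "$status"        # preserve the exit status of the chaos command
}

# Hypothetical helper: convert a millisecond value to whole seconds,
# rounding up, for parameters that were previously given in msec.
ms_to_sec() {
  echo $(( ($1 + 999) / 1000 ))
}
```

For example, RAMP_UP=10 RAMP_DOWN=10 run_with_ramp ./inject-chaos.sh would wait 10 seconds on either side of a (hypothetical) injection script.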
Major Bug Fixes
- Fixes the hardcoded total chaos/job wait duration in the node-cpu-hog experiment
- Verifies the state of application pods (health check) before proceeding with subsequent iterations of pod-delete chaos
- Adds a unique instance_id/run_id (hash) to the names and labels of chaos jobs started by the experiment, to aid identification and prevent conflicts upon parallel or repeated runs in a given namespace
- Fixes execution workflow of chaos experiments when run as a standalone job without orchestration by the chaos operator
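The run_id fix in the list above can be sketched as follows. This is a minimal illustration of the idea of suffixing job names with a short random hash. The function names, the 6-character length, and the naming pattern are assumptions for the example, not the scheme the experiments actually use.

```shell
#!/bin/sh
# Hypothetical helper: produce a 6-character lowercase hex string from
# 3 random bytes, suitable as a suffix in a Kubernetes object name.
new_run_id() {
  od -An -N3 -tx1 /dev/urandom | tr -d ' \n'
}

# Hypothetical helper: append the run ID to a base job name so that
# parallel or repeated runs in one namespace get distinct names.
job_name_with_run_id() {
  printf '%s-%s\n' "$1" "$2"
}
```

A caller might use run_id=$(new_run_id) and then job_name_with_run_id pod-delete "$run_id", applying the same value as a label so the job can be selected later.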
Prerequisites to install
- Make sure you have a healthy Kubernetes cluster
- Make sure Kubernetes 1.11+ is installed
Installation
kubectl apply -f https://litmuschaos.github.io/pages/litmus-operator-v1.2.0.yaml
Verify your installation
Verify that the chaos operator is running:
kubectl get pods -n litmus
Verify that the chaos CRDs are installed:
kubectl get crds | grep chaos
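The two checks above can also be scripted. The following is a sketch, not an official verification tool: the function names are hypothetical, and it assumes the CRDs are named chaosengines.litmuschaos.io, chaosexperiments.litmuschaos.io, and chaosresults.litmuschaos.io. Both functions read kubectl output on stdin, so they do not contact the cluster themselves.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin.
# Succeeds only if at least one pod is listed and the STATUS column
# (third field) of every listed pod is "Running".
all_pods_running() {
  awk 'BEGIN { n = 0; bad = 0 }
       NF >= 3 { n++; if ($3 != "Running") bad++ }
       END { exit ((n > 0 && bad == 0) ? 0 : 1) }'
}

# Hypothetical helper: reads `kubectl get crds` output on stdin and prints
# each expected chaos CRD (assumed names) that does not appear in it.
missing_chaos_crds() {
  input=$(cat)
  for crd in chaosengines chaosexperiments chaosresults; do
    printf '%s\n' "$input" | grep -q "^${crd}\.litmuschaos\.io" \
      || echo "${crd}.litmuschaos.io"
  done
}
```

Usage would look like kubectl get pods -n litmus --no-headers | all_pods_running, and kubectl get crds | missing_chaos_crds, where empty output from the second command means all three assumed CRDs were found.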
For more details, refer to the Litmus documentation (Docs).