Added support to kill prometheus pods
yashashreesuresh committed Apr 22, 2020
1 parent b745a04 commit 3ceb399
Showing 3 changed files with 29 additions and 2 deletions.
5 changes: 4 additions & 1 deletion README.md
@@ -23,6 +23,7 @@ kraken:
        - scenarios/etcd.yml
        - scenarios/openshift-kube-apiserver.yml
        - scenarios/openshift-apiserver.yml
        - scenarios/prometheus.yml
tunings:
    wait_duration: 60 # Duration to wait between each chaos scenario
@@ -37,7 +38,7 @@ $ python3 run_kraken.py --config <config_file_location>
The report is generated in the run directory and it contains the information about each chaos scenario injection along with timestamps.

#### Checking if the cluster is sane after failures injection
[Cerberus](https://github.com/openshift-scale/cerberus) can be used to monitor the cluster under test, and the aggregated go/no-go signal it generates can be consumed by Kraken to determine pass/fail, i.e., to make sure the Kubernetes/OpenShift cluster recovered after the failure injection.
[Cerberus](https://github.com/openshift-scale/cerberus) can be used to monitor the cluster under test, and the aggregated go/no-go signal it generates can be consumed by Kraken to determine pass/fail, i.e., to make sure the Kubernetes/OpenShift cluster recovered after the failure injection. Using Cerberus to check cluster health is recommended. After Cerberus is installed, enable the integration by setting `cerberus_enabled` to True and `cerberus_url` to the URL where Cerberus publishes the go/no-go signal in the config file.

### Kubernetes/OpenShift chaos scenarios supported
Following are the components of Kubernetes/OpenShift for which a basic chaos scenario config exists today. Currently only pod-based scenarios are supported; more will be added soon. Adding a new pod-based scenario is as simple as adding a new config under the scenarios directory and listing it in the config.
@@ -46,3 +47,5 @@ Component | Description
------------------------ | ---------------------------------------------------------------------------------------------------| ------------------------- |
Etcd | Kills a single/multiple etcd replicas for the specified number of times in a loop | :heavy_check_mark: |
Kube ApiServer | Kills a single/multiple kube-apiserver replicas for the specified number of times in a loop | :heavy_check_mark: |
ApiServer | Kills a single/multiple apiserver replicas for the specified number of times in a loop | :heavy_check_mark: |
Prometheus | Kills a single/multiple prometheus replicas for the specified number of times in a loop | :heavy_check_mark: |
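
The Cerberus integration described in the README change above can be pictured with a minimal sketch. It assumes the go/no-go signal is served as the plain text `True` or `False` at `cerberus_url`; the function names `parse_go_no_go` and `cluster_is_healthy` are hypothetical and are not taken from Kraken's code.

```python
from urllib.request import urlopen


def parse_go_no_go(body: bytes) -> bool:
    """Interpret a signal body (assumption: the literal text 'True' means go)."""
    return body.decode("utf-8").strip() == "True"


def cluster_is_healthy(cerberus_url: str, timeout: float = 10.0) -> bool:
    """Fetch the signal from cerberus_url and return True for go, False for no-go."""
    with urlopen(cerberus_url, timeout=timeout) as response:
        return parse_go_no_go(response.read())
```

A caller would check `cluster_is_healthy(cerberus_url)` after each scenario and treat a `False` result as a failed run. Keeping the parsing separate from the network call makes the decision logic easy to check without a running Cerberus instance.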
3 changes: 2 additions & 1 deletion config/config.yaml
@@ -4,6 +4,7 @@ kraken:
        - scenarios/etcd.yml
        - scenarios/openshift-kube-apiserver.yml
        - scenarios/openshift-apiserver.yml
        - scenarios/prometheus.yml

tunings:
    wait_duration: 60 # Duration to wait between each chaos scenario
    wait_duration: 60 # Duration to wait between each chaos scenario
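
As a rough illustration of how the scenario list and `wait_duration` in this config might be consumed, the sketch below runs each scenario in order and pauses between consecutive scenarios. It assumes the list lives under a `scenarios` key inside `kraken` (the key itself is outside the hunk shown); `run_scenarios`, the `run_scenario` callback, and the in-memory `example_config` dict are hypothetical stand-ins, not Kraken's actual code.

```python
import time
from typing import Callable, Dict, List


def run_scenarios(
    config: Dict,
    run_scenario: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run every configured scenario, waiting wait_duration seconds between them."""
    scenarios = config["kraken"]["scenarios"]
    wait_duration = config["tunings"]["wait_duration"]
    executed = []
    for index, scenario in enumerate(scenarios):
        run_scenario(scenario)
        executed.append(scenario)
        # Pause only between scenarios, not after the last one.
        if index < len(scenarios) - 1:
            sleep(wait_duration)
    return executed


# In-memory equivalent of the config after this commit (assumed layout).
example_config = {
    "kraken": {
        "scenarios": [
            "scenarios/etcd.yml",
            "scenarios/openshift-kube-apiserver.yml",
            "scenarios/openshift-apiserver.yml",
            "scenarios/prometheus.yml",
        ]
    },
    "tunings": {"wait_duration": 60},
}
```

Passing `sleep` as a parameter is a design choice for the sketch only: it lets the loop be exercised without actually waiting 60 seconds per scenario.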
23 changes: 23 additions & 0 deletions scenarios/prometheus.yml
@@ -0,0 +1,23 @@
config:
  loopsNumber: 1
  minSecondsBetweenRuns: 1
  maxSecondsBetweenRuns: 30

# the scenarios describing actions on kubernetes pods
podScenarios:
  - name: "delete prometheus pods"

    match:
      - labels:
          namespace: "openshift-monitoring"
          selector: "app=prometheus"

    filters:
      - randomSample:
          size: 1

    # The actions will be executed in the order specified
    actions:
      - kill:
          probability: 1
          force: true
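
To make the match, filter, and action stages of this scenario concrete, here is a small simulation over in-memory pod records. It is only a sketch of the semantics as read from the YAML (match by namespace and label selector, sample `size` pods at random, kill each with the given probability); it does not talk to a cluster, and `select_pods`, `pods_to_kill`, and the pod names are hypothetical.

```python
import random
from typing import Dict, List


def select_pods(pods: List[Dict], namespace: str, selector: str,
                sample_size: int, rng: random.Random) -> List[Dict]:
    """Match pods by namespace and a single key=value label, then sample."""
    key, value = selector.split("=", 1)
    matched = [
        pod for pod in pods
        if pod["namespace"] == namespace and pod["labels"].get(key) == value
    ]
    # randomSample: pick at most sample_size of the matched pods.
    return rng.sample(matched, min(sample_size, len(matched)))


def pods_to_kill(selected: List[Dict], probability: float,
                 rng: random.Random) -> List[Dict]:
    """Keep each selected pod with the given probability (1 keeps all)."""
    return [pod for pod in selected if rng.random() < probability]


# Hypothetical pod inventory used only for illustration.
example_pods = [
    {"name": "prometheus-k8s-0", "namespace": "openshift-monitoring",
     "labels": {"app": "prometheus"}},
    {"name": "prometheus-k8s-1", "namespace": "openshift-monitoring",
     "labels": {"app": "prometheus"}},
    {"name": "alertmanager-main-0", "namespace": "openshift-monitoring",
     "labels": {"app": "alertmanager"}},
]
```

With the values from the scenario (`size: 1`, `probability: 1`), exactly one of the matching prometheus pods is chosen on each run and it is always killed.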
