Add node level chaos scenarios for bastion node #68

pravin-dsilva · 2021-02-03T18:49:41Z

Scenario: a node-level test that shuts down and then starts the bastion node. Once the node is back up, each service listed in the scenario is checked to confirm that it is running.

Signed-off-by: Pravin Dsilva <pravin.d-silva@ibm.com>
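The post-startup service check described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: `find_inactive_services` and the `run_command` callable are hypothetical names, and the sketch assumes the bastion node uses systemd, so that `systemctl is-active` prints `active` for a running unit.

```python
import shlex


def find_inactive_services(run_command, services):
    """Return the services that are not reported as active.

    run_command is a hypothetical callable that executes a shell command
    on the bastion node (for example over SSH) and returns its stdout.
    """
    inactive = []
    for service in services:
        # Quote the unit name so it is passed to the shell as one argument.
        output = run_command("systemctl is-active " + shlex.quote(service))
        if output.strip() != "active":
            inactive.append(service)
    return inactive
```

An empty result means every listed service came back up after the node restart; anything else can be logged as a scenario failure.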

rht-perf-ci · 2021-02-03T18:51:02Z

Can one of the admins verify this patch?

paigerube14 · 2021-02-04T17:37:56Z

docs/node_scenarios.md

@@ -25,7 +26,7 @@ Following node chaos scenarios are supported:
 A google service account is required to give proper authentication to GCP for node actions. See [here](https://cloud.google.com/docs/authentication/getting-started) for how to create a service account.

 **NOTE**: A user with 'resourcemanager.projects.setIamPolicy' permission is required to grant project-level permissions to the service account.
-
+ r


Need to remove this extra line.

pravin-dsilva · 2021-02-04

I've fixed it along with a few other minor changes.

paigerube14 · 2021-02-04T20:03:01Z

kraken/node_actions/common_node_functions.py

+            time.sleep(sleeper)
+            i += sleeper            
+            logging.info("Trying to ssh to instance: %s" % (node))
+            connection = ssh.connect(node, username='root', key_filename='/root/.ssh/id_rsa', timeout=800, banner_timeout=400)


Would we be able to expose this SSH key path as a parameter in the node scenario YAML file?

paigerube14 · 2021-02-08T16:08:42Z

run_kraken.py

@@ -49,6 +49,8 @@ def inject_node_scenario(action, node_scenario, node_scenario_object):
    node_name = node_scenario.get("node_name", "")
    label_selector = node_scenario.get("label_selector", "")
    timeout = node_scenario.get("timeout", 120)
+    service = node_scenario.get("service", "")
+    ssh_private_key = node_scenario.get("ssh_private_key", "")


Maybe let's add "~/.ssh/id_rsa" as the default value for the ssh_private_key parameter.

pravin-dsilva · 2021-02-08

Sure, added it.
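A minimal sketch of how such a default could be resolved, assuming the scenario is a plain dict as in the diff above. The helper name `resolve_ssh_private_key` is hypothetical and not part of this PR. The path is expanded explicitly because a literal `~` may not be expanded by the SSH library when a key file is passed in.

```python
import os

# Default suggested in the review comment above.
DEFAULT_SSH_PRIVATE_KEY = "~/.ssh/id_rsa"


def resolve_ssh_private_key(node_scenario):
    """Return the SSH key path for a scenario, falling back to the default.

    An unset or empty ssh_private_key value is treated as "use the default".
    """
    key_path = node_scenario.get("ssh_private_key") or DEFAULT_SSH_PRIVATE_KEY
    # Expand "~" so the caller receives a path usable as-is.
    return os.path.expanduser(key_path)
```

Treating an empty value as unset also covers the case where the YAML key is present but left blank, as in the example scenario file.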

paigerube14 · 2021-02-08T17:15:08Z

I’m not very familiar with helper nodes. Would the helper node name and IP address be returned by oc get nodes when a specific label_selector is set in the config? If so, I think we could reuse the stop/start node scenario code and just add the service check at the end for this specific action, i.e. no need for helper_node_start_scenario and helper_node_stop_scenario in openstack_node_scenarios. Thoughts?

Also, not sure if this is all from code you have added but can you please take a look at the spacing issues outlined using tox.

To run:
pip install tox (if not already installed; might want to add this to the requirements)

Run “tox” and fix all spacing/import issues

pravin-dsilva · 2021-02-09T18:00:52Z

I’m not very familiar with helper nodes. Would the helper node name and IP address be returned by oc get nodes when a specific label_selector is set in the config? If so, I think we could reuse the stop/start node scenario code and just add the service check at the end for this specific action, i.e. no need for helper_node_start_scenario and helper_node_stop_scenario in openstack_node_scenarios. Thoughts?

Yes, we intended to reuse the code, but since the helper node is an external entity and not part of the OpenShift cluster, there is no way to get its IP or hostname from oc get nodes.

Also, not sure if this is all from code you have added but can you please take a look at the spacing issues outlined using tox.

To run:
pip install tox (if not already installed; might want to add this to the requirements)

Run “tox” and fix all spacing/import issues

Sure, I have created another commit in this PR that fixes all the indentation issues, including those that are not part of my code. Thanks.

paigerube14 · 2021-02-09T20:46:39Z

scenarios/node_scenarios_example.yml

    node_name:                                                      # node on which scenario has to be injected
    label_selector: node-role.kubernetes.io/worker                  # when node_name is not specified, a node with matching label_selector is selected for node chaos scenario injection
    instance_kill_count: 1                                          # number of times to inject each scenario under actions
    timeout: 120                                                    # duration to wait for completion of node scenario injection
    cloud_type: aws                                                 # cloud type on which Kubernetes/OpenShift runs
+    helper_node_ip:                                                 # ip address of the helper node


Are we able to move these next couple of new parameters to the example part of docs/node_scenarios.md? That might be less confusing, since those items aren't going to be used in the actions that are run by default.

pravin-dsilva · 2021-02-10

Yeah, moved them to the docs.

paigerube14 · 2021-02-09T20:46:57Z

Is there a plan to add this type of scenario to all cloud types? If not, it would be nice to give a better error when trying to use, for example, aws with the stop_start_helper_node_scenario, instead of getting this ugly error message:
AttributeError: 'aws_node_scenarios' object has no attribute 'helper_node_stop_scenario'

Other than that LGTM
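One way to get a friendlier message is to check for the method before calling it. The sketch below is only an illustration of that idea: `run_helper_node_action` is a hypothetical helper, not something in this PR, and the arguments are passed through unchanged because the real method signatures are not shown here.

```python
import logging


def run_helper_node_action(node_scenario_object, method_name, *args):
    """Call a helper-node method only if the cloud implementation has it.

    Returns True when the method was called and False when the cloud type
    does not support it, logging a readable error instead of raising
    AttributeError.
    """
    method = getattr(node_scenario_object, method_name, None)
    if not callable(method):
        cloud_class = type(node_scenario_object).__name__
        logging.error(
            "%s is not supported by %s; skipping this action",
            method_name,
            cloud_class,
        )
        return False
    method(*args)
    return True
```

With this guard, a cloud class that lacks `helper_node_stop_scenario` produces a single log line naming the unsupported action rather than a traceback.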

paigerube14 · 2021-02-15T18:09:34Z

LGTM

mffiedler · 2021-02-15T18:53:42Z

@pravin-dsilva please squash the commits and we'll do a final review. Thanks for working with us on this.

pravin-dsilva · 2021-02-16T17:07:05Z

Thanks, please take a look.

mffiedler

LGTM

pravin-dsilva mentioned this pull request Feb 3, 2021

Add node level scenarios for Helper node #67

Closed

pravin-dsilva force-pushed the bastion_scenario branch 3 times, most recently from 3454d20 to 3852918 on February 4, 2021 17:20

paigerube14 reviewed Feb 4, 2021

pravin-dsilva force-pushed the bastion_scenario branch 2 times, most recently from ed73d0a to 81c0df7 on February 4, 2021 17:42

paigerube14 reviewed Feb 4, 2021

paigerube14 reviewed Feb 4, 2021

pravin-dsilva force-pushed the bastion_scenario branch 3 times, most recently from 5037e88 to 5b971a0 on February 5, 2021 14:21

paigerube14 reviewed Feb 8, 2021

pravin-dsilva force-pushed the bastion_scenario branch from 5b971a0 to 0c730ca on February 8, 2021 16:54

pravin-dsilva force-pushed the bastion_scenario branch from 0c730ca to cf7703c on February 9, 2021 17:51

paigerube14 reviewed Feb 9, 2021

pravin-dsilva force-pushed the bastion_scenario branch 3 times, most recently from 5a9acf1 to 152a1c1 on February 15, 2021 18:02

pravin-dsilva force-pushed the bastion_scenario branch from 152a1c1 to 0149dd1 on February 15, 2021 18:10

Add node level chaos scenarios for bastion node

918b5fb

Signed-off-by: Pravin Dsilva <pravin.d-silva@ibm.com>

pravin-dsilva force-pushed the bastion_scenario branch from 0149dd1 to 918b5fb on February 16, 2021 17:05

mffiedler self-requested a review February 16, 2021 20:30

mffiedler approved these changes Feb 16, 2021

mffiedler merged commit bfe9448 into krkn-chaos:master Feb 16, 2021
