Added support to track pod crash/restart during sleep #52

yashashreesuresh · 2020-05-13T18:28:05Z

This commit tracks pod crashes/restarts during the wait time between iterations. This prevents Cerberus from missing a crash/restart when the pod returns to the running phase before the next iteration.

Fixes: #51
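
To illustrate the idea, here is a minimal sketch of how a crash/restart between two polls might be detected by comparing each pod's creation timestamp and restart count. The keys `creation_timestamp` and `restart_count` appear in the diff under review; the function name `find_crashed_restarted` and the snapshot layout are hypothetical and not taken from the PR.

```python
def find_crashed_restarted(pods_tracker, current):
    """Return pods that changed or disappeared since the previous poll.

    pods_tracker: dict of pod name -> {"creation_timestamp": ..., "restart_count": int}
                  recorded at the previous poll (hypothetical layout).
    current:      dict with the same layout, taken from the latest poll.
    The tracker is updated in place to the latest snapshot.
    """
    crashed_restarted = []
    for pod, seen in pods_tracker.items():
        now = current.get(pod)
        if now is None:
            # The pod no longer exists, so it crashed or was replaced.
            crashed_restarted.append(pod)
        elif (now["creation_timestamp"] != seen["creation_timestamp"]
              or now["restart_count"] != seen["restart_count"]):
            # Same name, but it was recreated or a container restarted.
            crashed_restarted.append(pod)
    # Remember the latest snapshot for the next comparison.
    pods_tracker.clear()
    pods_tracker.update(current)
    return crashed_restarted
```

A pod that is Running at both polls would still be reported here if its restart count went up in between, which is the case the PR description is concerned with.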

yashashreesuresh · 2020-05-13T18:28:47Z

Please have a look @chaitanyaenr @mffiedler

rht-perf-ci · 2020-05-13T19:08:01Z

Can one of the admins verify this patch?

mffiedler · 2020-05-14T13:56:24Z

start_cerberus.py

@@ -165,6 +167,10 @@ def main(cfg):
                                           watch_namespaces_status, failed_nodes,
                                           failed_pods_components)

+            if iteration != 1 and crashed_restarted_pods:
+                logging.info("Pods that were crashed/restarted during the sleep: %s"


Suggestion: "Pods that crashed/restarted during iteration %s : %s" % (iteration, crashed_restarted_pods)
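
For reference, a sketch of the suggested message using the `logging` module's deferred formatting, where the arguments are passed to `logging.info` instead of being pre-formatted with `%`. The helper name `report_crashed_restarted` and its return value are hypothetical, added only so the sketch is self-contained.

```python
import logging


def report_crashed_restarted(iteration, crashed_restarted_pods):
    """Log crashed/restarted pods; return True if a message was logged.

    The first iteration is skipped because no sleep has happened before it.
    """
    if iteration != 1 and crashed_restarted_pods:
        # logging fills in the %s placeholders only if the record is emitted.
        logging.info("Pods that crashed/restarted during iteration %s: %s",
                     iteration, crashed_restarted_pods)
        return True
    return False
```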

mffiedler · 2020-05-14T15:23:13Z

cerberus/kubernetes/client.py

+                    pods_tracker[pod]["creation_timestamp"] = pod_creation_timestamp
+                    pods_tracker[pod]["restart_count"] = pod_restart_count
+            else:
+                crashed_restarted_pods.append(pod)


The list of crashed/restarted pods currently contains only pod names. A possible improvement is to append <pod_namespace>:<pod_name> to the list to make it clear where each pod lives. This is minor though; I am fine if we move forward as it is now.
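
A small sketch of what the namespace-qualified entries could look like. The helper `qualified_pod_name` and the namespace and pod names used below are illustrative assumptions, not part of the PR.

```python
def qualified_pod_name(namespace, pod_name):
    """Build a '<pod_namespace>:<pod_name>' label for reporting."""
    return f"{namespace}:{pod_name}"


crashed_restarted_pods = []
# Example values only; a real run would use the namespace being watched.
crashed_restarted_pods.append(qualified_pod_name("openshift-etcd", "etcd-master-0"))
```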

mffiedler · 2020-05-14T15:24:16Z

start_cerberus.py

@@ -165,6 +167,10 @@ def main(cfg):
                                           watch_namespaces_status, failed_nodes,
                                           failed_pods_components)

+            if iteration != 1 and crashed_restarted_pods:
+                logging.info("Pods that were crashed/restarted during the sleep: %s"


Suggestion: "Pods that were crashed/restarted during iteration %s : %s" % (iteration, crashed_restarted_pods)

mffiedler · 2020-05-14T15:29:21Z

Not sure why the comment was duplicated. In any case, my comments are nitpicks. I tested this and it looks good to me.

chaitanyaenr

Minor nits.

chaitanyaenr · 2020-05-14T15:54:03Z

cerberus/kubernetes/client.py


 # Load kubeconfig and initialize kubernetes python client
 def initialize_clients(kubeconfig_path):
-    global cli
+    global cli, pods_tracker


Maybe initialize pods_tracker outside this function, since the function is meant for initializing the client?

chaitanyaenr · 2020-05-14T15:57:22Z

cerberus/kubernetes/client.py

+    crashed_restarted_pods = []
+    for pod in pods:
+        try:
+            pod_info = cli.read_namespaced_pod_status(pod, namespace,


Could we make this a separate function that returns the status of a given pod? That way we can reuse it in other places as well. Thoughts?
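
A possible shape for such a helper, as a sketch. The name `get_pod_status` is hypothetical. The client is passed in explicitly and is only assumed to expose `read_namespaced_pod_status(name, namespace)`, as `kubernetes.client.CoreV1Api` does. The broad `except` keeps the sketch free of the kubernetes dependency; real code would more likely catch the client's `ApiException`.

```python
import logging


def get_pod_status(cli, pod, namespace):
    """Return the pod object reported by the API, or None if the lookup fails."""
    try:
        return cli.read_namespaced_pod_status(pod, namespace)
    except Exception as e:  # assumption: real code would catch ApiException instead
        logging.error("Failed to read status of pod %s in namespace %s: %s",
                      pod, namespace, e)
        return None
```

Callers can then treat a `None` result as a pod that could not be found, for example by adding it to the crashed/restarted list.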

chaitanyaenr · 2020-05-14T15:59:06Z

cerberus/kubernetes/client.py

@@ -80,6 +82,41 @@ def check_sdn_namespace():
        please specify the correct networking namespace in config file")


+def namespace_sleep_track(namespace):


Let's add a comment to describe the functionality of this function.

yashashreesuresh · 2020-05-15T07:39:27Z

I have made all the above changes. PTAL @mffiedler @chaitanyaenr


chaitanyaenr

LGTM

chaitanyaenr · 2020-05-15T16:16:43Z

Nice job @yashashreesuresh.

mffiedler

/LGTM

mffiedler reviewed May 14, 2020


chaitanyaenr reviewed May 14, 2020


mffiedler mentioned this pull request May 14, 2020

Ability to add/enable collections of optional monitors #53

Open

yashashreesuresh force-pushed the sleep_tracker branch 2 times, most recently from e664d2b to f22f53f · May 15, 2020 07:32

Added support to track pod crash/restart during sleep

456aff2


yashashreesuresh force-pushed the sleep_tracker branch from f22f53f to 456aff2 · May 15, 2020 15:27

chaitanyaenr approved these changes May 15, 2020


mffiedler approved these changes May 15, 2020


chaitanyaenr merged commit e749679 into krkn-chaos:master May 15, 2020

yashashreesuresh deleted the sleep_tracker branch May 17, 2020 13:48

chaitanyaenr mentioned this pull request May 20, 2020

Monitor apiserver availability #56

Merged
