CT-384: Downgrade missing pod exception to a debug log #742

yanscalyr · 2021-04-06T19:19:55Z

This error occurs when the Kubernetes events API returns an event for a pod that is already gone, which is common if the agent is restarted. It does not affect us in any meaningful way and goes away on its own, but it shows a big red error to the user. This PR logs that error only at debug level so it doesn't spook anybody.
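As a rough illustration of the pattern described above, here is a minimal sketch using only the standard `logging` module. The names `PodNotFoundError`, `fetch_pod`, and `process_event` are hypothetical stand-ins invented for this example; they are not the agent's actual API, and the real change uses `k8s_utils.K8sApiNotFoundException` and the agent's own logger.

```python
import logging

log = logging.getLogger("k8s_events_sketch")


class PodNotFoundError(Exception):
    """Hypothetical stand-in for k8s_utils.K8sApiNotFoundException."""


def fetch_pod(pods, namespace, name):
    """Hypothetical lookup that raises if the pod no longer exists."""
    try:
        return pods[(namespace, name)]
    except KeyError:
        raise PodNotFoundError(
            "The resource at location `/api/v1/namespaces/%s/pods/%s` was not found"
            % (namespace, name)
        )


def process_event(pods, event):
    """Return the pod for an event, or None if the pod is already gone."""
    try:
        return fetch_pod(pods, event["namespace"], event["name"])
    except PodNotFoundError as e:
        # Stale events can reference deleted pods (for example after an
        # agent restart), so this is logged at debug level, not as an error.
        log.debug("Skipping event for missing pod: %s", e)
        return None
```

With this shape, an event for a deleted pod produces a single debug record and the caller simply moves on to the next event.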

codecov · 2021-04-06T19:38:41Z

Codecov Report

Merging #742 (11c1756) into master (81bde11) will increase coverage by 0.02%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #742      +/-   ##
==========================================
+ Coverage   78.32%   78.34%   +0.02%     
==========================================
  Files         154      154              
  Lines       36650    36654       +4     
  Branches     4318     4318              
==========================================
+ Hits        28704    28714      +10     
+ Misses       6878     6873       -5     
+ Partials     1068     1067       -1

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...gent/builtin_monitors/kubernetes_events_monitor.py | `50.17% <0.00%> (-0.71%)` | ⬇️ |
| ...s/unit/builtin_monitors/kubernetes_monitor_test.py | `98.13% <0.00%> (-1.12%)` | ⬇️ |
| ...calyr_agent/builtin_monitors/kubernetes_monitor.py | `63.79% <0.00%> (+0.07%)` | ⬆️ |
| scalyr_agent/builtin_monitors/docker_monitor.py | `75.60% <0.00%> (+0.24%)` | ⬆️ |
| scalyr_agent/monitor_utils/k8s.py | `79.55% <0.00%> (+1.23%)` | ⬆️ |

Kami · 2021-04-07T09:24:02Z

scalyr_agent/builtin_monitors/kubernetes_events_monitor.py

+                                            current_time,
+                                            query_options=k8s_events_query_options,
+                                        )
+                                    except k8s_utils.K8sApiNotFoundException as e:


The change looks reasonable to me, as long as we don't mask other errors that should actually be logged under info.

Also, do you perhaps have access to an error message that arises in this situation?

Just wondering if we can scope this except clause further based on the error message.

yanscalyr · 2021-04-07

K8sApiNotFoundException: The resource at location `/api/v1/namespaces/scalyr/pods/<pod name>` was not found

Not sure if there is any meaningful way to scope this down further: the exception is specifically about a queried resource being missing, and the try wraps only the code where we attempt to query a pod and nothing else.
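To make the scoping argument concrete, here is a small sketch under stated assumptions. `ApiError`, `NotFoundError`, and `lookup_pod` are hypothetical names made up for illustration, not the agent's real classes. The point is only that a try block wrapping nothing but the pod query, with an except clause limited to the not-found type, lets every other failure propagate unchanged.

```python
import logging

log = logging.getLogger("k8s_scope_sketch")


class ApiError(Exception):
    """Hypothetical base class for API failures."""


class NotFoundError(ApiError):
    """Hypothetical stand-in for K8sApiNotFoundException."""


def lookup_pod(query_pod, namespace, name):
    """Query a pod; downgrade only the 'not found' case to a debug log."""
    try:
        # Only the pod query sits inside the try block.
        pod = query_pod(namespace, name)
    except NotFoundError as e:
        log.debug("Pod lookup failed because the pod is gone: %s", e)
        return None
    # Code after the query runs outside the try, so its errors are not
    # downgraded, and other ApiError subclasses are never caught here.
    return pod
```

Because the except clause names only the not-found type, a different API failure raised by `query_pod` still reaches the caller and can be logged at its usual level.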

Kami · 2021-04-07T09:24:22Z

scalyr_agent/builtin_monitors/kubernetes_events_monitor.py

+                                    except k8s_utils.K8sApiNotFoundException as e:
+                                        global_log.log(
+                                            scalyr_logging.DEBUG_LEVEL_1,
+                                            "Failed to process single k8s event line due to following exception: %s, %s, %s"


Perhaps we should also log other info (namespace, name, query_options)?

yanscalyr · 2021-04-07

The namespace and name already end up in the message: they are part of the URL we query, which gets logged as part of the exception. As far as I can tell, the query options have little to do with the actual query sent to k8s; they cover the retry count, what we do with unknown errors, and which rate limiter we use.

I don't think there is much use in logging any of those, plus I don't know how we could log the rate limiter in a meaningful way.

Kami

LGTM 👍

Please just add a changelog entry.

yanscalyr added 2 commits April 6, 2021 12:17

CT-384: Downgrade missing pod exception to a debug log

6374c24

Linting

b8fbcaf

Kami reviewed Apr 7, 2021

yanscalyr requested a review from Kami April 12, 2021 19:36

Kami approved these changes Apr 12, 2021

Add changelog entry

11c1756

yanscalyr merged commit af40d4e into master Apr 12, 2021

yanscalyr deleted the downgrade_error_missing_pod branch April 12, 2021 21:40
