CT-384: Downgrade missing pod exception to a debug log #742
Conversation
Codecov Report
@@ Coverage Diff @@
## master #742 +/- ##
==========================================
+ Coverage 78.32% 78.34% +0.02%
==========================================
Files 154 154
Lines 36650 36654 +4
Branches 4318 4318
==========================================
+ Hits 28704 28714 +10
+ Misses 6878 6873 -5
+ Partials 1068 1067 -1
```python
    current_time,
    query_options=k8s_events_query_options,
)
except k8s_utils.K8sApiNotFoundException as e:
```
The change looks reasonable to me, as long as we don't mask other errors that should actually be logged at info level.
Also, do you perhaps have access to the error message that arises in such a situation?
Just wondering if we can scope this except clause more narrowly based on the error message.
K8sApiNotFoundException: The resource at location `/api/v1/namespaces/scalyr/pods/<pod name>` was not found
I'm not sure there is any meaningful way to scope this down further, since the exception is about a queried resource being missing, and the try block wraps only the code where we query a pod and nothing else.
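To illustrate the trade-off being discussed, here is a minimal, self-contained sketch of scoping the except clause by inspecting the exception message. All names here (the exception class, `query_pod`, `handle_event`) are hypothetical stand-ins, not the agent's actual code:

```python
# Hypothetical stand-in for k8s_utils.K8sApiNotFoundException.
class K8sApiNotFoundException(Exception):
    pass


def query_pod(namespace, pod_name):
    # Simulate the k8s API returning 404 for a pod that no longer exists.
    raise K8sApiNotFoundException(
        "The resource at location `/api/v1/namespaces/%s/pods/%s` was not found"
        % (namespace, pod_name)
    )


def handle_event(namespace, pod_name):
    try:
        return query_pod(namespace, pod_name)
    except K8sApiNotFoundException as e:
        # Re-raise anything that is not the expected "missing pod" case,
        # so genuinely unexpected not-found errors still surface.
        if "/pods/" not in str(e):
            raise
        return None
```

As the comment above notes, this extra check buys little here: the try block only ever queries a pod, so every `K8sApiNotFoundException` raised inside it is already the "missing pod" case.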
```python
except k8s_utils.K8sApiNotFoundException as e:
    global_log.log(
        scalyr_logging.DEBUG_LEVEL_1,
        "Failed to process single k8s event line due to following exception: %s, %s, %s"
```
Perhaps we should also log other info (namespace, name, query_options)?
The namespace and name already end up in the message: they are part of the URL we query, which gets logged as part of the exception. As far as I can tell, the query options have little to do with the actual query sent to k8s; they cover the retry count, how we handle unknown errors, and which rate limiter we use.
I don't think there is much use in logging any of those, and I don't know how we could log the rate limiter in a meaningful way.
LGTM 👍
Please just add a changelog entry.
This error occurs when the Kubernetes events API returns an event for a pod that is already gone, which is common if the agent is restarted. It does not affect us in any meaningful way and goes away on its own, but it produces a big red error for the user. This PR downgrades that error to a debug log so it does not spook anybody.
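The downgrade pattern described above can be sketched with the standard `logging` module. This is an illustrative sketch only; the agent actually uses its own `scalyr_logging` levels and `global_log`, and the names below (`process_event`, the exception class) are hypothetical:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("k8s_events")


class K8sApiNotFoundException(Exception):
    """Illustrative stand-in for the agent's exception class."""


def process_event(query_pod, namespace, pod_name):
    try:
        return query_pod(namespace, pod_name)
    except K8sApiNotFoundException as e:
        # A missing pod is expected after an agent restart: log at debug
        # instead of error so a transient, self-healing condition does not
        # show up as a big red error for the user.
        log.debug("Failed to process single k8s event line: %s", e)
        return None
```

The key design choice is that only the specific, expected exception type is caught; any other failure still propagates and is logged at its original severity.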