Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CT-384: Downgrade missing pod exception to a debug log #742

Merged
merged 3 commits into from Apr 12, 2021

Conversation

yanscalyr
Copy link
Contributor

This error occurs if the Kubernetes events API returns an event for a pod that is already gone, common if the agent is restarted, this does not affect us in any meaningful way and goes away on its own but makes a big red error for the user. This PR will make that error only get logged as a debug log to not spook anybody.

@codecov
Copy link

codecov bot commented Apr 6, 2021

Codecov Report

Merging #742 (11c1756) into master (81bde11) will increase coverage by 0.02%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #742      +/-   ##
==========================================
+ Coverage   78.32%   78.34%   +0.02%     
==========================================
  Files         154      154              
  Lines       36650    36654       +4     
  Branches     4318     4318              
==========================================
+ Hits        28704    28714      +10     
+ Misses       6878     6873       -5     
+ Partials     1068     1067       -1     
Impacted Files Coverage Δ
...gent/builtin_monitors/kubernetes_events_monitor.py 50.17% <0.00%> (-0.71%) ⬇️
...s/unit/builtin_monitors/kubernetes_monitor_test.py 98.13% <0.00%> (-1.12%) ⬇️
...calyr_agent/builtin_monitors/kubernetes_monitor.py 63.79% <0.00%> (+0.07%) ⬆️
scalyr_agent/builtin_monitors/docker_monitor.py 75.60% <0.00%> (+0.24%) ⬆️
scalyr_agent/monitor_utils/k8s.py 79.55% <0.00%> (+1.23%) ⬆️

current_time,
query_options=k8s_events_query_options,
)
except k8s_utils.K8sApiNotFoundException as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks reasonable to me as long as we don't potentially mask some other errors which should actually be logged under info.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do you perhaps have access to some error message which arises in such situation?

Just wondering if we can scope this except some more based on the error message..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K8sApiNotFoundException: The resource at location `/api/v1/namespaces/scalyr/pods/<pod name>` was not found

Not sure if there is any meaningful way we can scope this down further, since this exception is about a queried resource being missing, and the try is around the code where we attempt to query a pod and nothing else.

except k8s_utils.K8sApiNotFoundException as e:
global_log.log(
scalyr_logging.DEBUG_LEVEL_1,
"Failed to process single k8s event line due to following exception: %s, %s, %s"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should also log other info (namespace, name, query_options)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The namespace and name end up in the message as it is, they are present in the URL we query which does get logged as part of the exception. The query options seem to not have much to do with the actual query sent to k8s as far as I can tell, retry amount, what we do with unknown errors, and what rate limiter we use.

I don't think there is much use in logging any of those, plus I don't know how we could log the rate limiter in a meaningful way.

@yanscalyr yanscalyr requested a review from Kami April 12, 2021 19:36
Copy link
Contributor

@Kami Kami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Please just add a changelog entry.

@yanscalyr yanscalyr merged commit af40d4e into master Apr 12, 2021
@yanscalyr yanscalyr deleted the downgrade_error_missing_pod branch April 12, 2021 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants