Enabling RuntimeMetrics often results in 'Socket send would block' error #7728
Comments
@Alex-Wauters:
Doesn't appear to be included directly; it could be a transitive dependency. It occurs on all 3 services that we tried. Requirements.txt's:
- svc 1 - flask
- shared module from all svc's
- svc 2 - fastapi based
Hi @Alex-Wauters, do you know when you started to see this issue? Were you using an earlier tracer version before and didn't see these logs? We updated the DogStatsD code that we vendor to the latest (0.47) two weeks ago, which correlates with the 2.2.0 release that you're on. PR here. I'll look into this further, but that info would definitely be helpful.
We only just enabled Python runtime metrics with that version; we haven't used it before, so I don't have any stats for earlier versions.
Hi @Alex-Wauters, could you tell me with what frequency you're seeing that message? If it's less than multiple times per second, it shouldn't be an issue. This happens when we try to send a payload but the agent is too busy to accept it. A non-blocking socket is a least-common-denominator way of sending metrics that works in all scenarios without negatively impacting the user application. That message should be a debug log; are you running in debug mode? If so, and you want to avoid seeing this log, you can grab the logger and change its level, e.g.:
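The original code snippet was lost in this copy of the thread; a minimal sketch of what it could look like, assuming the message is emitted through the vendored DogStatsD client's logger (the logger name below is an assumption, so verify it against your installed ddtrace version):

```python
import logging

# Assumed logger name for the DogStatsD client vendored by ddtrace;
# check your installed ddtrace version before relying on it.
dogstatsd_logger = logging.getLogger("ddtrace.vendor.dogstatsd")

# Raise the level so the debug-level "Socket send would block" message
# is no longer emitted, while real errors still get through.
dogstatsd_logger.setLevel(logging.ERROR)
```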
Let me know if you have any questions about this!
Closing this, but please re-open if the above does not help.
Summary of problem
When RuntimeMetrics is enabled, we often get the following error on our kubernetes pods:
```
Socket send would block: [Errno 11] Resource temporarily unavailable, dropping the packet
```
We can see the runtime metrics in our dashboard (DogStatsD is enabled on our agent and used for other metrics), but the frequent occurrence of these errors made us roll back the change in case it was contributing to resource exhaustion. We're aware the feature is in public beta, hence our report.
https://docs.datadoghq.com/tracing/metrics/runtime_metrics/python/
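For context, errno 11 (EAGAIN) is what a non-blocking socket send raises when the OS send buffer cannot accept the payload right away. A standalone sketch of that mechanism (not ddtrace's actual code; the address and payload below are illustrative):

```python
import errno
import socket

# Standalone sketch (not ddtrace code): a non-blocking datagram send
# raises BlockingIOError (errno 11, EAGAIN) when the OS send buffer is
# full; a client that catches it and moves on "drops the packet".
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setblocking(False)
try:
    # Illustrative DogStatsD-style payload sent to the default agent port.
    sock.sendto(b"runtime.metric:1|g", ("127.0.0.1", 8125))
except BlockingIOError as exc:
    # This branch corresponds to "Socket send would block: [Errno 11]".
    assert exc.errno == errno.EAGAIN
finally:
    sock.close()
```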
Which version of dd-trace-py are you using?
2.2.0
Runtime metrics enabled via:
```python
from ddtrace.runtime import RuntimeMetrics

RuntimeMetrics.enable()
```
How can we reproduce your problem?
It would require running several pods with runtime metrics enabled and monitoring them over time.
What is the result that you get?
```
Socket send would block: [Errno 11] Resource temporarily unavailable, dropping the packet
```
What is the result that you expected?
Preferably no errors