-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Too many errors for endpoint 'https://...': retrying later #19360
Comments
Hey @rossigee, First, thanks for the time you've spent investigating this to open a meaningful issue 👍 I'll explain a bit this piece of code but TL;DR: For a more detailed answer: In order to reach the Datadog intake, there is a part of the Agent called the When everything is working fine, the payload sent by the However, if there is an error reaching an endpoint, an error message is thrown, there are multiple possible errors, this one, that one, or that one (the actual call to print that one out is later on in the branch). These contain useful information to better understand what's not working with the connection and you should be able to see a few of them in your Agent logs. I hope this helps. |
Hi @remeh, Thanks for taking the time to elaborate publicly on the issue. I'm sure others that encounter this will appreciate the additional context. I certainly do. While it took me a moment, I did eventually figure out that the repeated "Too many errors" meant that there was an earlier repeated failure. It then clicked that I should look much much further back in the logs to identify the error that was the actual root cause. This led me quickly to identify and solve my misconfiguration and move on. Now, I should probably have realised this all sooner, but the wording used here threw me on a bit of a wild goose chase. I guess we there is a little room for improvement regarding the terminology and function, as per your suggestion. I would perhaps rename Cheers 👍 |
👋 |
I would like to understand better the cause of the following error message...
The agent is running on Windows and is configured to use an haproxy serving TLS endpoints with a self-signed certificate. The certificate has been added to the Windows system trust chain.
Looking into the source code, it appears that actually it (seems to) mean that the target has been blocked by a 'circuit breaker' intended to blacklist certain targets (?!).
Further investigation is still required, but I'm filing this as a 'support' request, even though I feel it is bordering on a 'bug report' as the error message is just far too ambiguous, and lacks any detail that would help normal users understand what needs to be investigated and fixed. Why has my target been blocked?!
Also, doing some Googling I'm not the only person that's scratched their head over this one. The usual advice seems to be to send Datadog Support a flare. Really, that's not the best advice. If everyone has to file private support requests and generate flares every time they come across a poorly written error message, that's going to create a huge amount of unnecessary resistance, work and lost time for both Datadog support and us customers. So, I'm filing this here as a 'support request' on GitHub instead to try to save more people's precious time.
Please feel free to move to 'bug report' if I've misjudged it.
The text was updated successfully, but these errors were encountered: