Erroneous request method drives /metrics unreadable by Prometheus #1821
Comments
@ldez is this possibly our compression bug again? |
Compression is only active when it is enabled on the entry point:
[entryPoints]
  [entryPoints.http]
  compress = true
@ajardan Have you enabled compression? |
@ajardan are you redirecting HTTP to HTTPS? Could do a |
@ldez As far as I can see, we don't have compression enabled. Unless it is enabled by default, of course. We are running traefik as a docker swarm service with the params I specified, no config file. And unfortunately, I don't think I can provide a debug output, since this is in production. Also, if I restart the service, I don't know when the issue will reappear: it could be tomorrow, it could be in a month... If there's any other way I can gather useful info, I would be happy to. Here is the curl output:
|
Just want to update and say this still happens from time to time. Had to restart the service today to fix metrics collection... |
/cc @marco-jantke |
@ajardan we only record the "method" information as we receive it, so I think you have a client that sends an invalid "method". |
That could be the case. So I guess there will be no fix from traefik, and I should open a bug with Prometheus? Or maybe it makes sense to filter bogus methods, since those have no effect anyway and are blocked? |
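For illustration, the "filter bogus methods" idea from the comment above could look roughly like the sketch below. It is not Traefik code: `methodLabel`, `knownMethods`, and the `"OTHER"` placeholder are hypothetical names chosen for this example. It assumes that only the standard HTTP methods are worth recording as label values.

```go
package main

import (
	"fmt"
	"net/http"
)

// knownMethods is an allow-list of standard HTTP methods (hypothetical helper).
var knownMethods = map[string]bool{
	http.MethodGet:     true,
	http.MethodHead:    true,
	http.MethodPost:    true,
	http.MethodPut:     true,
	http.MethodPatch:   true,
	http.MethodDelete:  true,
	http.MethodConnect: true,
	http.MethodOptions: true,
	http.MethodTrace:   true,
}

// otherMethodLabel is an assumed placeholder label for anything not on the list.
const otherMethodLabel = "OTHER"

// methodLabel returns the method unchanged if it is a standard HTTP method,
// and the placeholder otherwise, so arbitrary client bytes never reach a label.
func methodLabel(method string) string {
	if knownMethods[method] {
		return method
	}
	return otherMethodLabel
}

func main() {
	fmt.Println(methodLabel("GET"))
	fmt.Println(methodLabel("\xff\xfeBOGUS"))
}
```

An allow-list also keeps the number of distinct label values bounded, at the cost of hiding which unusual methods clients actually sent.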
I think it's more a Prometheus issue: sending all data without additional behavior seems normal for Traefik, Prometheus must filter and display these specific cases. |
I'll close this issue, because I think the question is answered. |
I think this has to be re-opened, since according to Prometheus documentation (https://prometheus.io/docs/instrumenting/exposition_formats/), the text in /metrics page has to be UTF-8. So traefik is violating the format, which seems like a bug in traefik? |
Also, expanding on this prometheus/prometheus#2983 (comment), handling of partial scrapes has so many issues that it won't be supported. For example, if we have a histogram and we ingest the data only partially, then the queries will start giving weird results. |
@ajardan I made a quick fix, but I need to explore other ways. |
Whoa, that was fast, thanks a lot @ldez ! |
@ldez have you moved forward with this so far? I think the bug is actually "located" in the Prometheus client library for Go and is already documented as an issue here: prometheus/client_golang#274 Personally, I have the feeling that Traefik should not do any additional validation of inputs in this case. That's what the client library is for, in the end. @ajardan obviously you have some misbehaving client. Might it be possible for you to track them down and fix it on their side? |
@marco-jantke Unfortunately I don't. And even if I could do that, this is not a proper fix, since nothing stops another random guy sending requests like this and breaking monitoring. I guess traefik was not designed as an intranet tool where we all have control over our clients, so this could be used by "bad guys". And yes, fixing this in the Prometheus client library makes sense if it is used by traefik, since this will fix the root cause, not the consequences. |
I asked what an implementation of label value validation in the Prometheus client library should look like. The maintainer of the project suggested that the lib should immediately error out on receiving invalid UTF-8 data. For our concrete usage of it in Traefik, this would mean that we have to validate all data that could be invalid (user input) before passing it to the lib. This should only be the |
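As a rough sketch of what "validate user input before passing it to the lib" could mean, the snippet below checks a value with `utf8.ValidString` from the Go standard library and substitutes a fixed label when the check fails. The function name `safeLabelValue` and the fallback string are assumptions made for this example, not necessarily what Traefik or the Prometheus client library do.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// invalidMethodLabel is an assumed fallback used in place of non-UTF-8 input.
const invalidMethodLabel = "NON_UTF8_HTTP_METHOD"

// safeLabelValue returns raw if it is valid UTF-8, and a fixed fallback
// otherwise, so the value is always safe to expose on a text /metrics page.
func safeLabelValue(raw string) string {
	if utf8.ValidString(raw) {
		return raw
	}
	return invalidMethodLabel
}

func main() {
	fmt.Println(safeLabelValue("GET"))
	// "\xc3\x28" is an invalid two-byte UTF-8 sequence.
	fmt.Println(safeLabelValue("\xc3\x28"))
}
```

Unlike rejecting the request, this keeps counting the traffic while making sure the exposed text stays valid UTF-8.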
@marco-jantke I have already done that. Please don't make a PR for that. |
Ok great. Let me know when you need someone to review :) |
Closed by #2081. |
Do you want to request a feature or report a bug?
bug
What did you do?
Collected Prometheus stats from traefik /metrics page
What did you expect to see?
Metrics being added to Prometheus
What did you see instead?
None of the metrics were collected by Prometheus, because of erroneous request method, which contained unreadable characters
Output of traefik version: (What version of Traefik are you using?)
What is your environment & configuration (arguments, toml, provider, platform, ...)?
Here is what /metrics bogus entries look like: