
Add an error counter for internal errors in the HTTP handler #594

Merged: 2 commits into master on Jun 6, 2019

Conversation

@beorn7 (Member) commented Jun 5, 2019:

The doc comments explain the rationale in quite some detail.

Fixes #543 and #542

I'll add a test once we have decided this is the way to go.

@beorn7 requested a review from brian-brazil on June 5, 2019 17:33
Review thread on prometheus/promhttp/http.go (outdated):
    errCnt.WithLabelValues("encoding")
    if err := opts.Registry.Register(errCnt); err != nil {
        if are, ok := err.(prometheus.AlreadyRegisteredError); ok {
            errCnt = are.ExistingCollector.(*prometheus.CounterVec)
@brian-brazil (Contributor) commented:

Using a global would seem cleaner here, rather than poking around error types to the same effect.

@beorn7 (Member, Author) replied:

A global CounterVec wouldn't help here; we would still need the same error-type exercise. Even if you re-register the same collector, you'll get an error.

However, if we used a global here, you couldn't use separate counters when passing different registries.
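
For reference, the registration idiom under discussion is client_golang's standard way of tolerating re-registration. A minimal, self-contained sketch, assuming a counter modeled on this PR's metric (the exact name, help string, and label are illustrative):

    package main

    import (
        "fmt"

        "github.com/prometheus/client_golang/prometheus"
    )

    func main() {
        reg := prometheus.NewRegistry()

        // Counter modeled on the PR's handler error metric; the name and
        // label here are illustrative, not authoritative.
        errCnt := prometheus.NewCounterVec(
            prometheus.CounterOpts{
                Name: "promhttp_metric_handler_errors_total",
                Help: "Total number of internal errors encountered by the promhttp metric handler.",
            },
            []string{"cause"},
        )

        // Register returns an AlreadyRegisteredError even when an identical
        // collector is re-registered, so recover the existing collector and
        // keep using it rather than failing.
        if err := reg.Register(errCnt); err != nil {
            if are, ok := err.(prometheus.AlreadyRegisteredError); ok {
                errCnt = are.ExistingCollector.(*prometheus.CounterVec)
            } else {
                panic(err)
            }
        }

        errCnt.WithLabelValues("encoding").Inc()
        fmt.Println("counter registered and incremented")
    }

Registering an identical second CounterVec against the same registry would again return prometheus.AlreadyRegisteredError, and the type assertion recovers the collector already in use, which is why a global would not remove the need for this dance.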

@brian-brazil (Contributor) replied:

Yeah, that had just occurred to me.

A second hunk in the same file changed the default handler's wiring:

    -   prometheus.DefaultRegisterer, HandlerFor(prometheus.DefaultGatherer, HandlerOpts{}),
    +   prometheus.DefaultRegisterer, HandlerFor(
    +       prometheus.DefaultGatherer,
    +       HandlerOpts{Registry: prometheus.DefaultRegisterer},
@brian-brazil (Contributor) commented:

I'm not sure this makes sense for the default handler, as gathering will fail the whole scrape anyway, and once we have OpenMetrics it has an explicit EOF that'll catch encoding issues. Given that, I'm a little wary of adding two more series for ~every target, for cases that don't/won't happen.

@beorn7 (Member, Author) replied:

That's a good point. 99% of the usage for the new metric will be with ContinueOnError. However, the incident that triggered #542 would happen even with the default handler.

But yeah, adding metrics for the default case might be confusing. I'll remove it here.
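
To illustrate the non-default case where the new counter matters: a minimal sketch of wiring HandlerFor with ContinueOnError and an explicit Registry, assuming the HandlerOpts.Registry field this PR introduces (the listen address and registry contents are arbitrary):

    package main

    import (
        "log"
        "net/http"

        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    func main() {
        reg := prometheus.NewRegistry()

        // With ContinueOnError, the handler serves whatever it could gather
        // and encode; setting Registry lets it register and increment an
        // internal error counter instead of silently dropping such errors.
        handler := promhttp.HandlerFor(reg, promhttp.HandlerOpts{
            ErrorHandling: promhttp.ContinueOnError,
            Registry:      reg,
        })

        http.Handle("/metrics", handler)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }

With the default HTTPErrorOnError handling, the counter is less interesting because the scrape itself fails, which is the point made above for the default handler.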

Commit 1 pushed by @beorn7 (Signed-off-by: beorn7 <beorn@grafana.com>).
@brian-brazil (Contributor): 👍

@beorn7 (Member, Author) commented Jun 6, 2019:

Thanks. Working on a test now.

Commit 2 pushed by @beorn7 (Signed-off-by: beorn7 <beorn@grafana.com>).
@beorn7 (Member, Author) commented Jun 6, 2019:

Test added, will merge on green.

@beorn7 merged commit 7ed96b3 into master on Jun 6, 2019, and deleted the beorn7/promhttp branch on June 6, 2019 at 15:41.
Successfully merging this pull request may close: promhttp: Consider a metric for HTTP exposition errors.

2 participants