Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error "gauge should be a Counter" when scraping metrics for varnish 4.0 #47

Closed
glennslaven opened this issue Jul 25, 2019 · 4 comments · Fixed by #48
Closed

Error "gauge should be a Counter" when scraping metrics for varnish 4.0 #47

glennslaven opened this issue Jul 25, 2019 · 4 comments · Fixed by #48

Comments

@glennslaven
Copy link
Contributor

glennslaven commented Jul 25, 2019

  • Varnish 4.0.5
  • prometheus_varnish_exporter 1.5

prometheus_varnish_exporter runs and reports a Successful scrape, however, if you curl the /metrics endpoint this is returned:

error gathering metrics: 4 error(s) occurred:
* collected metric varnish_main_sessions label:<name:"type" value:"herd" > gauge:<value:0 >  should be a Counter
* collected metric varnish_main_sessions label:<name:"type" value:"pipeline" > gauge:<value:0 >  should be a Counter
* collected metric varnish_main_sessions label:<name:"type" value:"closed" > gauge:<value:0 >  should be a Counter
* collected metric varnish_main_sessions label:<name:"type" value:"readahead" > gauge:<value:0 >  should be a Counter

Only happens on Varnish 4.0, Varnish 4.1 and higher are fine. This did not happen prior to the 1.5 release.

I have a gist reproing the behaviour here https://gist.github.com/glennslaven/cafb50fa21a45b7798ce9d05a7430d7f .

@jonnenauha
Copy link
Owner

Thanks for the great repro gist. That was very useful as its hard to keep multiple versions of varnish running for normal dev.

I group session metrics starting with sess_ into varnish_main_sessions where the metrics label type is set into sess_<type>.

4.0 varnishstat exports this

"MAIN.sess_herd": {
  "type": "MAIN",
  "value": 0, 
  "flag": "a", 
  "description": "Session herd"
},

6.0.1 varnishstat exports this

"MAIN.sess_herd": {
    "description": "Session herd",
    "flag": "c", "format": "i",
    "value": 0
  },

The idea is that those are then grouped into a single metric name with the type counter.

# HELP varnish_main_sessions Number of sessions
# TYPE varnish_main_sessions counter
varnish_main_sessions{type="herd"} 0
varnish_main_sessions{type="readahead"} 0
varnish_main_sessions{type="closed"} 0

I think the bug is that in 4.0 specifically some of those sess_ metrics are resolved as counters and some as gauges. This mixes multiple types in a single varnish_main_sessions prom metric which probably gives the error out. This is my best guess looking at this quickly. Probably =>4.0.1 they fixed them all to be counters.

In the full 4.0 prom metrics you can see the flag a. Here is another sess_ one "MAIN.sess_queued": {"type": "MAIN", "value": 0, "flag": "c", "description": "Sessions queued for thread"}, that has the flag c.

The semi hack fix is to check the first metrics "parent" created (in this case varnish_main_sessions), and not allow any other "child" type to deviate from that metric type. I will look into this a bit more when I have time.

In the meanwhile have you downgraded the exporter or upgraded varnish so you have a working setup? :)

@jonnenauha
Copy link
Owner

This is the new code in latest version that breaks this https://github.com/jonnenauha/prometheus_varnish_exporter/blob/master/varnish.go#L141-L149

It does not handle the a type and defaults that to a gauge. I did not know that a existed, there are no good varnish docs what these flag values mean and what are the possible values.

Looking at the full 6.x dump, it does not have any a values. So if we just handle that as a counter I think that should cover the old versions. Again I suspect in 4.0.1 they either removed a or just changed the flag to c like the other sess_ ones.

@glennslaven
Copy link
Contributor Author

Thanks, yes we can downgrade the exporter. Unfortunately upgrading Varnish isn't an option at this time, we have a number of legacy instances. Working on getting them updated, but not there yet.

Looks like handling the a type would fix the issue.

@glennslaven
Copy link
Contributor Author

According to the varnish 4.0 code https://github.com/varnishcache/varnish-cache/blob/4.0/include/tbl/vsc_f_main.h#L36 it's Accumulator (deprecated, use 'c')

I've pushed a PR that looks like it fixes this issue, I've tested the compiled binary against my test above. #48

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants