introduce metrics for object discovery #24
vsphere/endpoint.go
Outdated
m.datacenters = prometheus.NewGauge(prometheus.GaugeOpts{
	Namespace: "vmx",
	Subsystem: "discovery",
	Name:      "datacenter_count",
Should we rename this to datacenter_total in case we change it to a counter?
See the above comment. datacenter_total seems like a good name for a metric that increments each time we discover a datacenter (even if it's the same datacenters each time). In that case, I'd need to decide what to name the metrics that hold only the last discovery's totals.
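The naming question comes down to gauge versus counter semantics. A minimal sketch of the difference, using only the standard library; the discoveryStats type and its field names are hypothetical stand-ins and are not part of this PR:

```go
package main

import "fmt"

// discoveryStats is a hypothetical stand-in for the two metric types under
// discussion; it is not the exporter's actual code.
type discoveryStats struct {
	datacenterCount float64 // gauge-like: overwritten on every discovery run
	datacenterTotal float64 // counter-like: only ever increases
}

// record stores the result of one discovery run that found n datacenters.
func (s *discoveryStats) record(n int) {
	s.datacenterCount = float64(n)
	s.datacenterTotal += float64(n)
}

func main() {
	var s discoveryStats
	s.record(3) // first discovery finds 3 datacenters
	s.record(3) // second discovery finds the same 3 again
	fmt.Println(s.datacenterCount, s.datacenterTotal) // prints "3 6"
}
```

The gauge-like value answers "how many datacenters did the last discovery see", while the counter-like value keeps growing even when the same datacenters are rediscovered, which is why the two would need different names.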
	Help: "Count of datastores discovered during last object discovery.",
})

m.duration = prometheus.NewHistogram(prometheus.HistogramOpts{
Should we also add histograms/timers to measure the time that the different getObjects methods take?
Yes, this is a good idea 👍
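One way the per-method timing could look, as a rough sketch rather than the PR's implementation. The observer interface below mirrors the Observe(float64) method that a Prometheus histogram exposes, so the example needs only the standard library; timeCall, sliceObserver, and getDatastores are hypothetical names:

```go
package main

import (
	"fmt"
	"time"
)

// observer mirrors the Observe method of a Prometheus histogram so this
// sketch needs only the standard library.
type observer interface {
	Observe(float64)
}

// timeCall runs fn and reports its wall-clock duration, in seconds, to obs.
func timeCall(obs observer, fn func() error) error {
	start := time.Now()
	err := fn()
	obs.Observe(time.Since(start).Seconds())
	return err
}

// sliceObserver is a test double that records every observation.
type sliceObserver struct{ samples []float64 }

func (s *sliceObserver) Observe(v float64) { s.samples = append(s.samples, v) }

func main() {
	obs := &sliceObserver{}
	// getDatastores is a hypothetical stand-in for one of the getObjects methods.
	getDatastores := func() error { time.Sleep(10 * time.Millisecond); return nil }
	if err := timeCall(obs, getDatastores); err != nil {
		fmt.Println("discovery failed:", err)
	}
	fmt.Printf("observed %d sample(s)\n", len(obs.samples))
}
```

Wrapping each getObjects call this way would give one histogram per object type, or a single histogram with a label per type; which of the two fits better is left open here.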
vsphere/endpoint.go
Outdated
m.duration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Namespace: "vmx",
	Subsystem: "discovery",
	Name:      "duration",
I think the usual best practice here is to add the unit to the name (is it seconds or milliseconds?).
This should be seconds, I'll update this.
vsphere/exporter.go
Outdated
// create http server
topMux := http.NewServeMux()
topMux.Handle(cfg.TelemetryPath, newHandler(log.With(logger, "component", "handler"), registry))
if cfg.EnableMetaMetrics {
	topMux.Handle("/-/metrics", newHandler(log.With(logger, "component", "meta_handler"), mReg))
Is this a common practice? Why not expose the meta metrics in the same endpoint/registry?
I don't think this is common practice; I was on the fence about this. The reason I decided to move this to a different endpoint is that, if we have customers testing internally, they may not be comfortable forwarding their vsphere metrics back to Grafana Cloud for us to analyze, as labels will likely eventually contain things like hostnames. The alternative could be to use relabel_configs that drop vsphere_* metrics and send us only the meta metrics.
I think using another registry is fine, but we can use the same handler: https://github.com/prometheus/node_exporter/blob/master/node_exporter.go#L136
> they may not be comfortable forwarding their vsphere metrics back to Grafana cloud for us to analyze as labels will likely eventually contain things like hostnames
Integration code could behave differently, but I think the exporter should export meta metrics in the same handler.
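The "separate registries, one handler" suggestion can be sketched with the standard library alone. Here the source type is a hypothetical stand-in for a metrics registry, and combinedHandler, newMux, scrape, and the vsphere_example_metric line are all made up for illustration. In client_golang the merging role would, as far as I understand the linked node_exporter code, be played by prometheus.Gatherers, but this sketch does not depend on that library:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// source is a hypothetical stand-in for a metrics registry: it writes its
// metrics in text form to w.
type source func(w io.Writer)

// combinedHandler serves every source from one endpoint, in order.
func combinedHandler(sources ...source) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, write := range sources {
			write(w)
		}
	})
}

// newMux wires a single telemetry path; meta metrics are appended to the
// same endpoint only when enableMeta is true.
func newMux(path string, enableMeta bool) *http.ServeMux {
	vsphere := func(w io.Writer) { fmt.Fprintln(w, "vsphere_example_metric 1") }
	meta := func(w io.Writer) { fmt.Fprintln(w, "vmx_discovery_datacenter_count 3") }
	sources := []source{vsphere}
	if enableMeta {
		sources = append(sources, meta)
	}
	mux := http.NewServeMux()
	mux.Handle(path, combinedHandler(sources...))
	return mux
}

// scrape performs an in-memory GET against h and returns the response body.
func scrape(h http.Handler, path string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Print(scrape(newMux("/metrics", true), "/metrics"))
}
```

With this shape the meta metrics stay in their own registry, so they can still be toggled or filtered independently, while scrapers only need to know about one path.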
This is a start on collecting some metrics for object discovery; it currently covers only duration and total counts.