vsphere resource discovery #10

rlankfo · 2022-03-29T17:50:36Z

This PR ports code from the Telegraf vSphere plugin to discover vSphere resources. The code has a few minimal changes; for example, resource filtering is currently excluded. As we begin collecting Prometheus metrics for discovered resources, the code is expected to change somewhat, in particular to generate the appropriate labels for the resource hierarchy.

Closes #1

LICENSE

vsphere/client.go

vsphere/collector.go

vsphere/endpoint.go

vsphere/finder.go

rfratto · 2022-03-29T19:01:14Z

vsphere/vsphere.go

+	MetricLookback:          3,
+	ForceDiscoverOnInit:     true,
+	ObjectDiscoveryInterval: time.Second * 300,
+	Timeout:                 time.Second * 60,


I'm concerned about how the Timeout currently works (and I find it a little hard to reason about when a timeout is or isn't set in the context). As far as I can tell, the timeout currently gets reset per object/metric as the discovery process progresses or during metric collection. This makes the total timeout fairly unpredictable and makes it difficult to put tight constraints on how fast things should complete.

IMO, instead of having a per-operation Timeout, we should have a timeout for the entire discovery/scrape. This would give users tighter control over how long things take rather than letting them go on unbounded. You could even consider using the X-Prometheus-Scrape-Timeout-Seconds header that Prometheus sets.

I think making the timeout easy for users to reason about is pretty important. We should be careful to cancel scrapes and avoid piling up pending /metrics calls from Prometheus that get ignored because Prometheus already timed out well before the exporter did.

rlankfo · 2022-03-31

Agree with you here; this is something I'll note to come back to once I'm further along and testing in live environments.

rfratto · 2022-03-29T19:02:52Z

(FWIW I understand that most of the comments I had above are relevant to code we copied and not something you wrote directly :)

vsphere/client.go

vsphere/endpoint.go

mattdurham

I think there are a lot of places where we are swallowing or not returning errors. It feels like we should have comments explaining why we do that.

rlankfo added 2 commits March 4, 2022 13:23

port discovery code from telegraf

bf67c02

start discovery

b20f689

rlankfo requested review from rfratto and mattdurham March 29, 2022 17:50

mattdurham reviewed Mar 29, 2022

LICENSE (outdated, resolved)

rfratto approved these changes Mar 29, 2022

mattdurham reviewed Mar 31, 2022

vsphere/client.go (outdated, resolved)

mattdurham reviewed Mar 31, 2022

vsphere/endpoint.go (outdated, resolved)

mattdurham reviewed Mar 31, 2022

vsphere/endpoint.go (resolved)

mattdurham requested changes Mar 31, 2022

rlankfo force-pushed the telegraf-discovery branch from 85fe04e to b20f689 Compare March 31, 2022 16:28

rlankfo added 5 commits March 31, 2022 11:32

tidy up exposed API

68594ce

adjust GetClient comment

2abc764

refactor initialDiscovery

a0a836a

no need to set nil client

c98e46a

assign vars outside init

19cb14f

rlankfo merged commit f1870fa into grafana:main Apr 1, 2022
