[receiver/vcenter]: Optimize vCenter Receiver to use concurrency #31837
schmikei added the `enhancement` (New feature or request) and `needs triage` (New item requiring triage) labels on Mar 19, 2024
Pinging code owners for receiver/vcenter: @djaglowski @schmikei. See Adding Labels via Comments if you do not have permissions to add labels yourself.
@StefanKurek and I plan to take this effort on; feel free to assign us to it.
Maybe this spike can help: #30624
djaglowski pushed a commit that referenced this issue on Apr 17, 2024
**Description:** Changes the method for collecting VMs used by the `vcenterreceiver` to the more time-efficient `CreateContainerView` method. This is the first step to addressing the issue linked below.
**Link to tracking Issue:** #31837
**Testing:** These changes were tested on an environment with 200+ virtual machines. The original collection time was ~80 seconds. Collection times with these changes are ~40 seconds.
**Documentation:** N/A
Co-authored-by: Stefan Kurek <stefan.kurek@observiq.com>
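The win here comes from replacing one network round trip per VM with a single bulk retrieval over a container view. A minimal stdlib-only sketch of the difference, with hypothetical `fetchOne`/`fetchAll` functions standing in for the real govmomi calls:

```go
package main

import "fmt"

// Hypothetical inventory; in the real receiver this lives in vCenter.
var inventory = map[string]string{"vm-1": "running", "vm-2": "stopped", "vm-3": "running"}

// fetchOne models the old pattern: one network round trip per VM.
func fetchOne(id string) string { return inventory[id] }

// fetchAll models the CreateContainerView approach: a single round trip
// returning every VM in the container, so latency cost no longer scales
// with the number of VMs.
func fetchAll() map[string]string {
	out := make(map[string]string, len(inventory))
	for k, v := range inventory {
		out[k] = v
	}
	return out
}

func main() {
	// Old: len(inventory) round trips.
	for _, id := range []string{"vm-1", "vm-2", "vm-3"} {
		_ = fetchOne(id)
	}
	// New: one round trip.
	all := fetchAll()
	fmt.Println(len(all))
}
```

With thousands of VMs, collapsing N round trips into one is what turns the ~80s scrape into ~40s, even before any concurrency is added.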
rimitchell pushed a commit to rimitchell/opentelemetry-collector-contrib that referenced this issue on May 8, 2024 (open-telemetry#32201, the same change as above)
djaglowski pushed a commit that referenced this issue on May 13, 2024
**Description:** Earlier improvements to how network calls are made for Virtual Machines decreased collection times from ~90s to ~27s in an environment with 1 Cluster, 2 Hosts, and 280 VMs. Making similar changes for all resource types decreased collection time further, from ~27s to under ~3s in the same environment. A general list of the changes:
- Makes all network calls (per datacenter) first and stores the returned data.
- Processes this data afterwards to convert it to OTEL resources/metrics (refactored into a new file).
- Moves all metric recording to metrics.go for consistency.
- Moves all resource builder creation to resources.go for consistency.
- Updates/fixes tests.

**Link to tracking Issue:** #31837 (although this issue prescribes a solution, goroutines, which ended up not being necessary)
**Testing:** Unit tests and integration tests passing, as well as manual testing in local environments.
**Documentation:** N/A
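The restructuring above separates I/O from processing: all network calls happen up front and the stored results are then converted to metrics with no I/O in the loop. A rough sketch of that two-phase shape, with illustrative type and function names that are not the actual vcenterreceiver API:

```go
package main

import "fmt"

// Hypothetical stand-ins for the receiver's real data types.
type vmData struct {
	Name   string
	CPUMHz int
}

type scrapeData struct {
	Datacenter string
	VMs        []vmData
}

// Phase 1: make all network calls for a datacenter up front and store
// the results. In the real receiver this is where the govmomi calls go.
func fetchDatacenter(name string) scrapeData {
	return scrapeData{
		Datacenter: name,
		VMs:        []vmData{{"vm-1", 2400}, {"vm-2", 1200}},
	}
}

// Phase 2: convert the stored data into metric points; no I/O involved,
// so this part is cheap and easy to test in isolation.
func buildMetrics(d scrapeData) []string {
	out := make([]string, 0, len(d.VMs))
	for _, vm := range d.VMs {
		out = append(out, fmt.Sprintf("%s/%s cpu=%d", d.Datacenter, vm.Name, vm.CPUMHz))
	}
	return out
}

func main() {
	data := fetchDatacenter("dc-1") // phase 1: network
	for _, m := range buildMetrics(data) { // phase 2: pure processing
		fmt.Println(m)
	}
}
```

Keeping the phases separate also explains why goroutines turned out to be unnecessary: once the per-datacenter data is fetched in bulk, the processing phase is no longer request-bound.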
jlg-io pushed a commit to jlg-io/opentelemetry-collector-contrib that referenced this issue on May 14, 2024 (open-telemetry#32991, the same change as above)
@schmikei I think this can be closed now with the recent performance enhancements. I'll let you decide.
Component(s)
receiver/vcenter
Is your feature request related to a problem? Please describe.
I'm led to believe that larger environments have a harder time using the receiver due to the large number of requests that need to be made.
In larger environments, the volume of requests shows that the current synchronous code has room for improvement. Once we scaled up, it was pretty evident that we were bottlenecking on requests, particularly for VMs.
Describe the solution you'd like
From my recollection, the MetricsBuilder doesn't maintain state that would prevent easy parallelization, so we could parallelize/batch the requests to get better performance.
The tool mitmproxy really helped us identify the issue: the massive number of individually made requests was stacking up to be quite a time sink.
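The parallelization proposed here is a standard fan-out with bounded concurrency. A minimal sketch under stated assumptions (`fetchVM` is a hypothetical stand-in for one per-VM request, and the worker limit guards against overwhelming the vCenter API):

```go
package main

import (
	"fmt"
	"sync"
)

// fetchVM stands in for one per-VM request; purely illustrative.
func fetchVM(id int) string { return fmt.Sprintf("vm-%d", id) }

// fetchConcurrently fans the requests out across goroutines, with a
// semaphore channel bounding how many are in flight at once.
func fetchConcurrently(ids []int, workers int) []string {
	results := make([]string, len(ids))
	sem := make(chan struct{}, workers) // limit in-flight requests
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i, id int) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			results[i] = fetchVM(id) // each index written by exactly one goroutine
		}(i, id)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(len(fetchConcurrently([]int{1, 2, 3, 4, 5}, 2)))
}
```

Writing each result to its own slice index keeps the goroutines from needing a mutex, and the bounded semaphore keeps request batches at a size the server can tolerate.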
Describe alternatives you've considered
No response
Additional context
Based on early experiments, a collection for an environment of 2000 VMs sometimes took minutes to complete; we're hoping to cut that down to a collection interval of seconds.