Inventory Optimisation: Querying for services and interfaces in bulk rather than by device #143

DouglasHeriot · 2020-03-25T11:56:00Z

ISSUE TYPE

Feature Idea

SUMMARY

Right now, when the inventory plugin fetches the services and interfaces associated with devices and VMs, it makes a separate request for each device's services and interfaces. In some circumstances it could be more efficient to query for all interfaces and all services (similar to how everything else is queried) and then match them up with each device once the whole list has been downloaded.

This method would be more efficient when the entire contents of Netbox are wanted in the Ansible inventory. The downside is that if there are query_filters in place that limit which devices are required, there would be no way to filter the services or interfaces by device.
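As a rough illustration of the "fetch everything, then match up" idea, here is a minimal Python sketch. It assumes each record in the bulk response carries a nested "device" object with an "id" field; the function name and the sample data are hypothetical and are not taken from the plugin.

```python
from collections import defaultdict


def group_by_device(records):
    """Index bulk-fetched records (interfaces or services) by owning device ID."""
    grouped = defaultdict(list)
    for record in records:
        device = record.get("device")
        if device is None:
            # Assumption: records not attached to a device (e.g. VM-owned) are skipped here.
            continue
        grouped[device["id"]].append(record)
    return grouped


# Hypothetical sample data, shaped like one bulk list response.
interfaces = [
    {"name": "eth0", "device": {"id": 1, "name": "HIL-01"}},
    {"name": "eth1", "device": {"id": 1, "name": "HIL-01"}},
    {"name": "eth0", "device": {"id": 2, "name": "HIL-02"}},
    {"name": "eth0", "device": None},
]
by_device = group_by_device(interfaces)
```

With an index like this, attaching interfaces to a host becomes a dictionary lookup per device instead of an HTTP request per device.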

Right now I've got 2000 devices in Netbox, fewer than 1000 services defined, and 10,000 interfaces. And this is less than 10% of our current infrastructure - we're only getting started putting stuff into Netbox. Getting the services and interfaces of each device currently requires 4000 HTTP requests (two per device), when it could be achieved with just a couple of batch requests. I haven't even looked at what sort of server CPU and database load this creates.

Alternatively - should I make a request to the Netbox project to return interfaces and services as part of the main /dcim/devices/ query?

EXPECTED RESULTS

Fetching: https://netbox/api/ipam/services/?limit=0

ACTUAL RESULTS

Fetching: https://netbox/api/ipam/services/?device=HIL-01
Fetching: https://netbox/api/ipam/services/?device=HIL-02
Fetching: https://netbox/api/ipam/services/?device=HIL-03
Fetching: https://netbox/api/ipam/services/?device=HIL-04
Fetching: https://netbox/api/ipam/services/?device=HIL-05
Fetching: https://netbox/api/ipam/services/?device=HIL-06
Fetching: https://netbox/api/ipam/services/?device=HIL-07
Fetching: https://netbox/api/ipam/services/?device=HIL-08
...

FragmentedPacket · 2020-03-26T02:59:10Z

Could you provide any numbers for this? Possibly print the time before and after the fetches, or time a complete run with each of the two options?

Thinking about this quickly, maybe add an if/else to either fetch all or fetch per device depending on the query_filters? If there are no query filters, and we want to fetch those, then it would make sense to just fetch everything with limit=0.
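The if/else suggested above could look something like the following sketch. The function name and the returned strategy labels are hypothetical; only query_filters comes from the discussion.

```python
def choose_fetch_strategy(query_filters, want_interfaces=True, want_services=True):
    """Pick how to fetch interfaces/services; all names here are illustrative."""
    if not (want_interfaces or want_services):
        return "none"  # nothing extra to fetch
    if query_filters:
        # Only a subset of devices is wanted, so per-device requests avoid
        # downloading records for devices that were filtered out.
        return "per_device"
    # No filters: everything is wanted anyway, so one limit=0 request per endpoint.
    return "bulk"


strategy = choose_fetch_strategy(query_filters=[])
```

This only captures the decision itself; whether per-device fetching is actually faster for a given filter depends on how many devices the filter matches.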

I don't have a big data set to test the change against to see whether it's worth implementing.

I'll look for more discussion on this and the PR when it's put in. Don't have a strong opinion on it, but would like to see if there is a performance improvement.

DouglasHeriot · 2020-03-26T06:50:49Z

Here's some more specific numbers in our setup. Ansible is being run from a VM on-prem, and Netbox is hosted in EC2. We have a ping of just 2ms from Ansible to Netbox.

2236 Devices
4 services (total - yeah, we haven't really filled that out yet)
10403 Interfaces
3283 IP Addresses

Summary

| Interfaces | Services | Time  | HTTP Requests |
|------------|----------|-------|---------------|
| Yes        | Yes      | 7m14s | 6787          |
| No         | Yes      | 3m28s | 2274          |
| No         | No       | 17s   | 12            |

Interfaces and Services enabled

$ time ansible-inventory -vvv --inventory inventory/netbox.yml --graph 2>&1 | tee inventory.txt
...
real    7m14.441s
user    0m28.269s
sys     0m4.395s

6787 HTTP requests, counted from the self.display.v("Fetching: " + url) output in _fetch_information. (I removed the one from get_resource_list to avoid duplicate logs.)

That's a 7 minute overhead for running any Ansible playbook.

Interfaces disabled, Services enabled by default

real    3m28.690s
user    0m9.222s
sys     0m1.342s

2274 HTTP requests

Neither interfaces nor services

I had to comment out the services group extractor "services": self.extract_services,

real    0m17.436s
user    0m2.181s
sys     0m0.166s

12 HTTP requests
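For a sense of scale, the three timings above can be turned into an average wall-clock cost per request. This is only a back-of-the-envelope sketch: it divides total "real" time by request count, so it also includes inventory processing time and is not a pure measure of HTTP latency.

```python
def avg_ms_per_request(minutes, seconds, requests):
    """Average wall-clock milliseconds per HTTP request for one run."""
    return (minutes * 60 + seconds) / requests * 1000


# (minutes, seconds, HTTP requests) taken from the three runs above.
runs = {
    "interfaces and services": (7, 14.441, 6787),
    "services only": (3, 28.690, 2274),
    "neither": (0, 17.436, 12),
}

for label, (minutes, seconds, requests) in runs.items():
    print(f"{label}: {avg_ms_per_request(minutes, seconds, requests):.0f} ms per request")
```

The two per-device runs work out to roughly 64 ms and 92 ms per request, so the total run time scales with the number of requests rather than with the amount of data transferred.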

Caching?

Caching does help significantly - but not in all use cases. We plan to set up webhooks from Netbox to trigger a CI server to run the Ansible playbook automatically. Caching won't be helpful here as you're always looking to get the latest data from Netbox. Caching would only help in the development situation where you're running playbooks multiple times against the same set of inventory.

Thoughts

Both interfaces and services have a large impact on performance, which limits where this inventory plugin can be used. Right now there's not even an option to disable services - I'll be adding one so we can continue to use this as we did before services were added.

I like the idea of giving users a choice of which option to take - query per device, or just fetch all interfaces/services/ip-addresses. I'm pretty sure that for most use cases, weighing a couple of large requests against hundreds or thousands of smaller ones, it's going to be quicker to just get all of them.

I'll consider adding more data to the test deployment scripts, but I'm not sure what the performance of Travis CI is like or whether it's worth risking slowing that down too much.

FragmentedPacket · 2020-03-26T12:18:40Z

Great! Thanks for those tests. This improvement will definitely be welcomed. I agree with the choice of toggling between bulk GET and individual calls to allow some flexibility. Looking forward to the PR!

Querying every single device's services as a separate HTTP request can be very slow. Allow users to disable this (similar to interfaces) if it is not required.

…unity#143) In most cases this isn't too bad, but for the interfaces and services extractors it would have been resulting in twice as many HTTP requests?

FragmentedPacket · 2020-04-27T13:10:00Z

@DouglasHeriot Do you think it would be worth it to do a fetch of all interfaces, ip addresses, etc. as well when those options are specified? Just put all these optimizations into the PR for this.

DouglasHeriot · 2020-04-27T13:35:50Z

@FragmentedPacket which options are you thinking of?

I'm planning for this to introduce an option, fetch_all, that would determine whether the interfaces and services options both fetch everything or fetch per device.

I would be curious about different people's use cases for this - I guess the default should be to fetch all, but I wonder if anyone will ever query such a small portion of their database that it makes sense to turn it off?

FragmentedPacket · 2020-04-27T14:06:22Z

Pretty much just those at this point; interfaces, services, ip addresses. I think those are the main ones that have to be fetched outside of the normal device/VM lookups.

I don't use the inventory plugin at this point so my input isn't use-case driven or anything.

… interfaces/services. A side effect is that it resolves netbox-community#142 (fetching services for VMs). Includes starting to better support virtual chassis - should only take the master device and not the children. Some work on this started in ansible/ansible#60642. See the TODO comments for work still to be done before being ready to merge.

… interfaces/services. A side effect is that it resolves netbox-community#142 (fetching services for VMs). Includes better support for virtual chassis - only take the master device and not the children. Some of this is from ansible/ansible#60642. See the TODO comments for work still to be done before being ready to merge.

* Allow tuning for largest size permitted by your webserver, for optimum performance reducing number of required HTTP requests.
* Handle exceptions from within threads and raise on the main thread after being joined. This ensures that any HTTP errors from refresh_ methods will stop execution of the rest of the plugin. For example, you'll notice if you receive a HTTP 414 URI Too Long.

Added unit tests for these things.
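The thread-handling point can be sketched roughly as follows. This is not the plugin's code: run_in_threads and its wrapper are hypothetical names, and the sketch re-raises only the first collected exception once every thread has been joined.

```python
import threading


def run_in_threads(tasks):
    """Run zero-argument callables in threads; re-raise the first failure on the caller's thread."""
    errors = []
    lock = threading.Lock()

    def wrapper(task):
        try:
            task()
        except Exception as exc:
            # An exception raised inside a thread does not propagate to the
            # thread that started it, so record it for later.
            with lock:
                errors.append(exc)

    threads = [threading.Thread(target=wrapper, args=(task,)) for task in tasks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    if errors:
        raise errors[0]  # raised on the calling thread, after all joins
```

Without something like this, a failed fetch inside a worker thread would only print a traceback and the rest of the run would carry on with incomplete data.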

In my install I was getting HTTP 400 errors with a length of 8000; a length of 4000 works. This may depend on the web server, CDN, etc.
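One way to respect a length limit like this is to split the list of device IDs across several request URLs. The sketch below is an illustration under assumptions: build_chunked_urls is a hypothetical helper, the device_id filter is assumed to be repeatable in the query string, and the default of 4000 simply mirrors the figure mentioned above.

```python
def build_chunked_urls(base_url, param, ids, max_uri_length=4000):
    """Build URLs like base_url?limit=0&param=1&param=2, each within max_uri_length characters."""
    urls = []
    prefix = base_url + "?limit=0"
    current = prefix
    has_ids = False
    for item in ids:
        piece = f"&{param}={item}"
        # Start a new URL when adding this ID would exceed the limit.
        if has_ids and len(current) + len(piece) > max_uri_length:
            urls.append(current)
            current = prefix
            has_ids = False
        current += piece
        has_ids = True
    if has_ids:
        urls.append(current)
    return urls


# Hypothetical usage: 500 device IDs with a deliberately small limit.
urls = build_chunked_urls(
    "https://netbox/api/dcim/interfaces/", "device_id", range(1, 501), max_uri_length=200
)
```

A larger limit means fewer requests, so the value is worth tuning to whatever the web server in front of Netbox accepts.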

FragmentedPacket added the Discussion and Enhancement labels Mar 26, 2020

DouglasHeriot mentioned this issue Apr 3, 2020

Inventory: make it optional to query for services (#143) #154

Merged

DouglasHeriot added a commit to hillsong/ansible_modules that referenced this issue Apr 6, 2020

Inventory: bump "version_added" of services flag netbox-community#143

6283b6a

FragmentedPacket pushed a commit that referenced this issue Apr 6, 2020

Enhancement - Inventory: make it optional to query for services (#143) (#154)

609a854

This was referenced Apr 21, 2020

Inventory: Remove unnecessary lists around singular host vars #141 #155

Merged

Inventory bug: services queried by device name doesn't work for VMs, or names with spaces #142

Closed

DouglasHeriot mentioned this issue May 6, 2020

Inventory: Fix fetching services from VMs, and devices with spaces in the name #196

Closed

DouglasHeriot mentioned this issue May 11, 2020

Inventory performance improvements and fixes fetching interfaces and services #202

Merged

DouglasHeriot closed this as completed May 20, 2020

bsmeding mentioned this issue May 20, 2020

netbox_device_interface - type or form_factor not accepted #208

Closed

DouglasHeriot mentioned this issue Jul 14, 2020

Update Netbox inventory plugin add region grouping, optional interfaces&vlans gathering, slug and prefix length #280

Closed
