large inventory loading performance enhancement. avoid calling glob.glob() search for .py files in all() function for each host. #32609

wants to merge 1 commit into
base: devel

@skamithi skamithi commented Nov 7, 2017


I am working with a customer who has a very large inventory. Recent fixes to devel helped reduce the inventory load time from 4 minutes to 30 seconds. I ran vmprof to look for additional performance savings and noticed that glob.glob() is being called several times in the lib/ansible/plugins/ all() function. I did not see the need for this, because the .py files being polled are the same for each host listed in the inventory. Preventing glob.glob() from being called for each host listed in the inventory improves performance by 30%.
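To illustrate the idea (this is not the actual patch), here is a minimal sketch that hoists the glob out of the per-host loop. The function names, the `scan_count` counter, and the file name `example_plugin.py` are all made up for the example; the real code lives in the plugin loader and is more involved.

```python
import glob
import os
import tempfile

scan_count = 0  # number of filesystem scans performed (illustration only)


def find_plugin_files(directory):
    """Return the .py files in a directory, counting each scan."""
    global scan_count
    scan_count += 1
    return sorted(glob.glob(os.path.join(directory, "*.py")))


def plugins_per_host_slow(hosts, directory):
    # One glob.glob() call per host, even though the result never changes.
    return {host: find_plugin_files(directory) for host in hosts}


def plugins_per_host_fast(hosts, directory):
    # Scan once, then reuse the same list for every host.
    plugin_files = find_plugin_files(directory)
    return {host: plugin_files for host in hosts}


with tempfile.TemporaryDirectory() as plugin_dir:
    # Hypothetical plugin file so that the glob has something to find.
    open(os.path.join(plugin_dir, "example_plugin.py"), "w").close()
    hosts = ["server%d" % i for i in range(1, 101)]

    slow = plugins_per_host_slow(hosts, plugin_dir)
    slow_scans = scan_count

    scan_count = 0
    fast = plugins_per_host_fast(hosts, plugin_dir)
    fast_scans = scan_count

print(slow_scans, fast_scans)
```

Both variants return the same mapping; only the number of directory scans differs.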

  • Bugfix Pull Request


ansible 2.5.0
  config file = None
  configured module search path = [u'/home/ansible/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible
  executable location = /home/ansible/inv-test/bin/ansible
  python version = 2.7.12 (default, Nov 19 2016, 06:48:10) [GCC 5.4.0 20160609]

Because I cannot use the customer's inventory output, I created a simulation of their output. Here is the script to generate the inventory used to test this patch. The script generates 1001 groups with an average group size of 19 and a total host count of 19999. Each host has 99 host variables defined.


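import json

# Skeleton of the JSON document that a dynamic inventory script returns.
json_output = {
    'all': {'hosts': []},
    '_meta': {'hostvars': {}}
}
base_name = 'server'

# 99 identical host variables (blah1 .. blah99) shared by every host.
metadata = {}

for i in range(1, 100):
    _key = "blah" + str(i)
    metadata[_key] = "blahblahblah"

groups = 1000
host_count = 20000
group_size = host_count // groups  # integer division on Python 2 and 3
group_number = 0
_group_size = 0
_new_group_name = 'group_' + str(group_number)

# range(1, host_count) yields 19999 hosts: server1 .. server19999.
for i in range(1, host_count):
    # Start a new group once the current one is full.
    if _group_size >= group_size:
        group_number += 1
        _new_group_name = 'group_' + str(group_number)
        _group_size = 0
    servername = base_name + str(i)
    json_output['_meta']['hostvars'][servername] = metadata

    if not json_output.get(_new_group_name):
        json_output[_new_group_name] = {'hosts': []}
    # Record the host as a member of the current group.
    json_output[_new_group_name]['hosts'].append(servername)
    _group_size += 1

# Emit the inventory as JSON on stdout.
print(json.dumps(json_output))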
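The group and host counts quoted above can be checked with a little arithmetic, assuming each host is placed in the group that is current when the host is created. The variable names below are chosen for illustration only.

```python
import math

host_count = 20000
groups = 1000

# range(1, N) stops at N - 1, so one host fewer than host_count is created.
hosts = len(range(1, host_count))

# Hosts per full group.
group_size = host_count // groups

# Numbered groups needed to hold all hosts (the last one is not quite full).
named_groups = int(math.ceil(hosts / float(group_size)))

# The numbered groups plus the 'all' group.
total_groups = named_groups + 1

# blah1 .. blah99
vars_per_host = len(range(1, 100))

print(hosts, total_groups, vars_per_host)
```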

The vmprof profile before the patch looks as follows:

vmprof output:
%:      name:                                       location:
100.0%  run_path                                    /usr/lib/python2.7/
100.0%  _run_module_code                            /usr/lib/python2.7/
100.0%  _run_module_as_main                         /usr/lib/python2.7/
100.0%  _run_code                                   /usr/lib/python2.7/
100.0%  main                                        /home/ansible/inv-test/lib/python2.7/site-packages/vmprof/
100.0%  <module>                          
100.0%  <module>                                    /home/ansible/inv-test/lib/python2.7/site-packages/vmprof/
98.7%   <module>                                    /home/ansible/inv-test/bin/ansible:21
98.2%   run                               
65.7%   json_inventory                    
65.1%   _get_host_variables               
65.0%   get_vars                                    /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/vars/
28.9%   all                                         /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/plugins/
28.0%   _plugins_play                               /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/vars/
27.6%   _plugins_inventory                          /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/vars/
22.6%   _get_plugin_vars                            /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/vars/
22.4%   get_vars                                    /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/plugins/vars/
20.1%   glob                                        /usr/lib/python2.7/
18.7%   iglob                                       /usr/lib/python2.7/
16.0%   parse_sources                               /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/inventory/
16.0%   __init__                                    /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/inventory/
16.0%   _play_prereqs                               /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/cli/
15.3%   parse_source                                /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/inventory/
15.2%   parse                                       /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/plugins/inventory/
14.7%   n:PyObject_Call:0:-                        
13.2%   dump                              
13.2%   dumps                                       /usr/lib/python2.7/json/
12.8%   realpath                                    /home/ansible/inv-test/lib/python2.7/
12.8%   encode                                      /usr/lib/python2.7/json/
12.1%   glob1                                       /usr/lib/python2.7/
11.9%   _iterencode                                 /usr/lib/python2.7/json/
11.0%   n:PyEval_EvalCodeEx:0:-                    
10.6%   _iterencode_dict                            /usr/lib/python2.7/json/
10.0%   _joinrealpath                               /home/ansible/inv-test/lib/python2.7/
9.8%    all_plugins_play                            /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/vars/
9.2%    all_plugins_inventory                       /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/vars/
9.2%    groups_plugins_play                         /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/vars/
9.0%    groups_plugins_inventory                    /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/vars/
8.8%    populate_host_vars                          /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/plugins/inventory/
7.2%    set_variable                                /home/ansible/inv-test/local/lib/python2.7/site-packages/ansible/inventory/
7.2%    n:<native symbol 0x4ddd31>:0:-             
6.4%    n:<native symbol 0x51e191>:0:-             
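vmprof is a third-party sampling profiler. A similar hot spot can be reproduced with the standard library's cProfile, which is a different (deterministic) profiler. The sketch below is not the profiling run from this PR: `scan_for_each_host` is a hypothetical stand-in for the per-host plugin lookup, and it globs an empty temporary directory.

```python
import cProfile
import glob
import io
import os
import pstats
import tempfile


def scan_for_each_host(directory, host_count):
    """Repeat the same glob once per simulated host."""
    for _ in range(host_count):
        glob.glob(os.path.join(directory, "*.py"))


def profile_scan(host_count=200):
    """Profile the repeated glob and return the report as text."""
    with tempfile.TemporaryDirectory() as directory:
        profiler = cProfile.Profile()
        profiler.enable()
        scan_for_each_host(directory, host_count)
        profiler.disable()

    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time so the outermost callers come first.
    stats.sort_stats("cumulative").print_stats(10)
    return buf.getvalue()


report = profile_scan()
print(report)
```

The glob functions should show up near the top of the cumulative listing, much as glob, iglob and glob1 do in the vmprof output above.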

Before the patch

Loading this inventory takes 25.01s user 2.61s system 99% cpu 27.824 total on my laptop.

After the patch

The inventory loads in 19.25s user 1.56s system 99% cpu 20.948 total.
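For reference, the relative improvement can be worked out from the two timings above. This is a small sketch; the helper name `percent_reduction` is made up for illustration.

```python
def percent_reduction(before, after):
    """Percentage of the original time that was saved."""
    return (before - after) / before * 100.0


# Wall-clock ("total") and user CPU times from the two runs, in seconds.
wall_before, wall_after = 27.824, 20.948
user_before, user_after = 25.01, 19.25

wall_saving = percent_reduction(wall_before, wall_after)
user_saving = percent_reduction(user_before, user_after)
speedup = wall_before / wall_after

print("wall-clock time saved: %.1f%%" % wall_saving)
print("user CPU time saved:   %.1f%%" % user_saving)
print("speedup factor:        %.2fx" % speedup)
```

On these numbers the wall-clock time drops by roughly a quarter, which corresponds to a speedup of about 1.33x.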

@skamithi skamithi changed the title from "large inventory loading performance enhancement. avoid calling glob.glob() search for .py files in inventory var module search for each host." to "large inventory loading performance enhancement. avoid calling glob.glob() search for .py files in all() function for each host." Nov 7, 2017
@ansibot ansibot added affects_2.5 bugfix_pull_request needs_triage python3 support:core labels Nov 7, 2017

@ansibot ansibot commented Nov 7, 2017

The test ansible-test sanity --test pep8 failed with the following error:

lib/ansible/plugins/ E303 too many blank lines (2)


@ansibot ansibot added the needs_revision label Nov 7, 2017
@skamithi skamithi force-pushed the inventory_script_performance_enhancement branch from e6748e5 to bd70ab9 Compare Nov 7, 2017
@jborean93 jborean93 removed the needs_triage label Nov 8, 2017
@ansibot ansibot removed the needs_revision label Nov 16, 2017
@ansibot ansibot added the stale_ci label Nov 24, 2017
@ansibot ansibot added bug performance and removed bugfix_pull_request labels Mar 2, 2018
@abadger abadger removed the python3 label Mar 20, 2018
@ansibot ansibot added needs_rebase needs_revision labels Mar 28, 2018
@ansibot ansibot added the new_plugin label May 23, 2018
@ansibot ansibot added support:community and removed support:core labels Sep 20, 2018
@ansibot ansibot added support:core and removed support:community labels Nov 26, 2018
@bcoca bcoca requested a review from s-hertel Aug 23, 2019

@bcoca bcoca commented Aug 23, 2019

This should be solved by our directory cache at this point. Each task should not retrigger glob.glob(); that should happen only when an include/import adds a new directory to the search path.

Assigned to verify.
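As a rough picture of what a directory cache of this kind does (a sketch only, not Ansible's implementation), the class below globs a directory the first time it is seen and serves later lookups from memory. `DirectoryCache` and its `scans` counter are hypothetical names used for the example.

```python
import glob
import os
import tempfile


class DirectoryCache(object):
    """Remember the .py files of each directory after the first scan."""

    def __init__(self):
        self._files_by_dir = {}
        self.scans = 0  # how many times glob.glob() actually ran

    def files(self, directory):
        if directory not in self._files_by_dir:
            # Only directories that have not been seen before hit the disk.
            self.scans += 1
            pattern = os.path.join(directory, "*.py")
            self._files_by_dir[directory] = sorted(glob.glob(pattern))
        return self._files_by_dir[directory]


cache = DirectoryCache()
with tempfile.TemporaryDirectory() as first_dir:
    with tempfile.TemporaryDirectory() as second_dir:
        # Many lookups against the same directory: a single scan.
        for _ in range(100):
            cache.files(first_dir)
        scans_after_first = cache.scans

        # A newly added directory triggers exactly one more scan.
        cache.files(second_dir)
        cache.files(second_dir)
        scans_after_second = cache.scans

print(scans_after_first, scans_after_second)
```

A cache like this never notices files added to a directory after its first scan, which is the usual trade-off of this approach.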

@bcoca bcoca added the needs_verified label Aug 23, 2019
@ansibot ansibot removed the stale_ci label Dec 6, 2020
@ansibot ansibot added the pre_azp label Dec 6, 2020