10x performance increase; remove uneeded deepcopy field #13673

Closed
wants to merge 2 commits into from

chrismeyersfsu
Member

#./main.yml

---
- hosts: all
  gather_facts: false
  vars:
    hello: "Hello World"
  tasks:
    - debug: msg="Hello world {{ hello }}"
      with_sequence: count=1000

time ansible-playbook -i inventory main.yml
...

PLAY RECAP *********************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0


real    0m29.533s
user    0m29.919s
sys     0m1.436s

Hmm, that "feels" slow. Let's take a deeper look at why.

python -m cProfile -o outme /Users/meyers/ansible/ansible/bin/ansible-playbook -i inventory main.yml >> stdout
pyprof2calltree -i outme
qcachegrind outme.log

Note: replace qcachegrind with kcachegrind if on Linux.

image

deepcopy is costly. What is calling it and what can we do about it?
image
_process_items() in lib/ansible/plugins/callback/__init__.py is costly.

    def _process_items(self, result):
        for res in result._result['results']:
            newres = self._copy_result(result)
            res['item'] = self._get_item(res)
            newres._result = res
            if 'failed' in res and res['failed']:
                self.v2_playbook_item_on_failed(newres)
            elif 'skipped' in res and res['skipped']:
                self.v2_playbook_item_on_skipped(newres)
            else:
                self.v2_playbook_item_on_ok(newres)

_copy_result() is basically a wrapper for deepcopy(). The deep copy of result._result is immediately overwritten by newres._result = res, so the work spent copying it is wasted on every iteration. What happens if we don't deep copy _result?
image
Wowzers!

time ansible-playbook -i inventory main.yml
...
PLAY RECAP *********************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0


real    0m4.617s
user    0m4.518s
sys     0m0.458s

=~ 6.4x speedup in wall-clock time (29.5s down to 4.6s)
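
A rough way to see the effect outside Ansible is the sketch below. FakeResult and both helper functions are hypothetical stand-ins, not Ansible code. The point is only that deep copying _result on every iteration makes the loop quadratic in the number of items, while skipping it does not.

```python
import copy
import timeit


class FakeResult:
    """Hypothetical stand-in for a task result; not Ansible's TaskResult."""

    def __init__(self, n_items):
        self._host = "localhost"
        # One dict per loop item, like result._result['results'].
        self._result = {
            "results": [{"item": i, "msg": "Hello world"} for i in range(n_items)]
        }


def per_item_deepcopy(result):
    """Mimics the original loop: deep copy everything, then overwrite _result."""
    out = []
    for res in result._result["results"]:
        newres = copy.deepcopy(result)  # copies all n items on every iteration
        newres._result = res            # ...and that copy is discarded here
        out.append(newres)
    return out


def per_item_skip_result(result):
    """Deep copies every attribute except _result, which is replaced anyway."""
    out = []
    for res in result._result["results"]:
        newres = copy.copy(result)  # shallow copy: new object, shared attributes
        for name, value in vars(result).items():
            if name != "_result":
                setattr(newres, name, copy.deepcopy(value))
        newres._result = res
        out.append(newres)
    return out


if __name__ == "__main__":
    r = FakeResult(200)
    slow = timeit.timeit(lambda: per_item_deepcopy(r), number=1)
    fast = timeit.timeit(lambda: per_item_skip_result(r), number=1)
    print(f"deepcopy all: {slow:.3f}s, skip _result: {fast:.3f}s")
```

The absolute numbers will differ from the playbook timings above; only the gap between the two variants is of interest.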

Implementation Thoughts

Rather than creating a separate deepcopy_exclude() method, modify _copy_result() to accept an optional exclude parameter. If exclude is passed in the call, behave like deepcopy_exclude().
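
One possible shape for that merged helper is sketched here. The class names, the exclude default, and the choice to set excluded attributes to None on the copy are all assumptions made for illustration; this is not the code that was merged.

```python
from copy import copy, deepcopy


class Result:
    """Hypothetical result object with one cheap and one expensive field."""

    def __init__(self):
        self._task = {"name": "debug"}
        self._result = {"results": [{"item": i} for i in range(3)]}


class CallbackSketch:
    """Hypothetical callback base; only the copy helper is shown."""

    def _copy_result(self, result, exclude=None):
        # No exclude given: plain deep copy, no shallow-copy step needed.
        if not exclude:
            return deepcopy(result)
        # Shallow copy first so the new object exists, then deep copy
        # every attribute that is not excluded.
        newres = copy(result)
        for name, value in vars(result).items():
            if name in exclude:
                # Assumption: the caller overwrites this field right away.
                setattr(newres, name, None)
            else:
                setattr(newres, name, deepcopy(value))
        return newres
```

With this shape, _process_items() could call self._copy_result(result, exclude=('_result',)) and then assign newres._result = res as before.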

@chrismeyersfsu chrismeyersfsu changed the title 30x performance increase; remove uneeded deepcopy field 10x performance increase; remove uneeded deepcopy field Dec 26, 2015
@amenonsen
Contributor

Excellent work.

@abadger
Contributor

abadger commented Dec 26, 2015 via email

@abadger
Contributor

abadger commented Dec 26, 2015

If we merge it into _copy_result(), we can also avoid doing the shallow copy when exclude is not given. That's probably quicker for that case.

@abadger
Contributor

abadger commented Dec 26, 2015

Note: if we keep _copy_result() as just a wrapper around deepcopy, I think we can still shave a small amount of time off by doing it this way:

-    def _copy_result(self, result):
-        ''' helper for callbacks, so they don't all have to include deepcopy '''
-        return deepcopy(result)
+    # helper for callbacks so they don't all have to include deepcopy
+    _copy_result = deepcopy

Function calls are expensive in Python. Setting _copy_result = deepcopy gets rid of the extra intermediate function call. If this code gets called frequently (I'm not sure how much of the cost is just due to this being a microbenchmark with a loop of 1000x and how much applies to real-world playbooks), then eliminating the extra call could be valuable.
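
A small sketch of the wrapper-versus-alias difference follows; the function and class names are hypothetical. One caveat worth checking when the alias lives inside a class: copy.deepcopy is a plain Python function, so as a bare class attribute it binds like a method and receives the instance as its first argument. Wrapping it in staticmethod() is one way to avoid that.

```python
import timeit
from copy import deepcopy


def copy_result_wrapper(result):
    """Adds one extra Python-level call on top of deepcopy."""
    return deepcopy(result)


# Alias: the same function object under another name, no extra call.
copy_result_alias = deepcopy


class Callback:
    # staticmethod() keeps deepcopy from being bound to the instance,
    # so self._copy_result(x) really means deepcopy(x).
    _copy_result = staticmethod(deepcopy)


if __name__ == "__main__":
    payload = {"msg": "Hello world"}
    for label, fn in (("wrapper", copy_result_wrapper), ("alias", copy_result_alias)):
        elapsed = timeit.timeit(lambda: fn(payload), number=100_000)
        print(f"{label}: {elapsed:.3f}s")
```

The saving per call is small, so it only matters if the helper sits on a hot path.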

@chrismeyersfsu
Member Author

@abadger

  • Assigned _copy_result = deepcopy
  • Created _copy_result_exclude()
    • Used @jimi-c's proposed method of popping properties and re-assigning them afterwards (avoiding copy()). This seems more efficient, since the exclude list is smaller than the full set of properties that copy() would touch.
  • Unit tests for _copy_result_exclude()

* _copy_result = deepcopy for better performance
* _copy_result_exclude to deepcopy but exclude certain fields. Pop
fields that do not need to be deep copied. Re-assign popped fields
after deep copy so we don't modify the original, to be copied, object.
* _copy_result_exclude unit tests
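
The pop-and-re-assign approach described in the commit message might look roughly like this. It is a sketch under assumptions, not the merged code: the standalone function, the Result class, and the use of try/finally are illustrative choices.

```python
from copy import deepcopy


class Result:
    """Hypothetical result object; not Ansible's TaskResult."""

    def __init__(self):
        self._host = "localhost"
        self._result = {"results": [{"item": i} for i in range(3)]}


def copy_result_exclude(result, exclude):
    """Deep copy `result`, skipping the attributes named in `exclude`."""
    popped = {}
    try:
        # Temporarily detach the excluded attributes so deepcopy never sees them.
        for name in exclude:
            popped[name] = result.__dict__.pop(name)
        newres = deepcopy(result)
    finally:
        # Put them back so the original object ends up unchanged.
        result.__dict__.update(popped)
    return newres
```

The copy comes back without the excluded attributes, so the caller is expected to assign them (as _process_items() does with newres._result = res). Because the original is briefly mutated, this sketch would not be safe if another thread read the same result object at the same time.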
@jimi-c
Member

jimi-c commented Dec 29, 2015

Merged into devel as squashed commit 2d11cfa.

@jimi-c jimi-c closed this Dec 29, 2015
@ansible ansible locked and limited conversation to collaborators Apr 26, 2019