Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy: Reduce calls to stat() by 50% in chown_recursive #75741

Open
wants to merge 1 commit into
base: devel
Choose a base branch
from

Conversation

moreati
Copy link
Contributor

@moreati moreati commented Sep 21, 2021

SUMMARY

This reduces the number of times lstat() or chown() are called by the copy module. Reductions come from

  • combining loops for owner and group, thus not looping twice
  • not looping over the list of directories returned by os.walk(), each directory is already handled when os.walk() descends into it
ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME

copy

ADDITIONAL INFORMATION

Possible further reductions could be achieved by

  • exiting early from the check_mode loop
  • Replacing m.set_owner_if_different() & m.set_group_if_different() calls with a single method.

I propose to do these seperately, if there is interest.

@ansibot ansibot added affects_2.12 core_review In order to be merged, this PR must follow the core review workflow. feature This issue/PR relates to a feature request. module This issue/PR relates to a module. needs_triage Needs a first human triage before being processed. support:core This issue/PR relates to code supported by the Ansible Engineering Team. labels Sep 21, 2021
moreati added a commit to moreati/ansible that referenced this pull request Sep 21, 2021
…rent

This merges implementations of `AnsibleModule.set_owner_if_different()`
& `AnsibleModule.set_group_if_different()` into a single method. The
benefits are
- reduction in code duplication
- makes possible halving the number of calls to `os.lchown()` when the
  copy module sets both owner and group.

There should be no outward change in behavior. The two original methods
become thin wrappers around the new one.

This PR is related to ansible#75741. If both are merged, then the second
benefit above can be realised - by replacing calls in
`copy.chown_recursive()`.

Open questions:

- Should this new method be public (without a leading underscore)?
- Is there a better name?
moreati added a commit to moreati/ansible that referenced this pull request Sep 21, 2021
…rent

This merges implementations of `AnsibleModule.set_owner_if_different()`
& `AnsibleModule.set_group_if_different()` into a single method. The
benefits are
- reduction in code duplication
- makes possible halving the number of calls to `os.lchown()` when the
  copy module sets both owner and group.

There should be no outward change in behavior. The two original methods
become thin wrappers around the new one.

This PR is related to ansible#75741. If both are merged, then the second
benefit above can be realised - by replacing calls in
`copy.chown_recursive()`.

Open questions:

- Should this new method be public (without a leading underscore)?
- Is there a better name?
@s-hertel s-hertel removed the needs_triage Needs a first human triage before being processed. label Sep 21, 2021
@moreati moreati changed the title copy: Reduce calls to stat() in chown_recursive copy: Reduce calls to lstat() in chown_recursive Sep 22, 2021
@moreati
Copy link
Contributor Author

moreati commented Sep 22, 2021

Some profiling of an extreme case, calls to lstat are cut by 50%

Setup

Create a deep directory tree of approx 18 000 nodes

# rm -rf /tmp/{src,dst} && mkdir -p /tmp/src/{a..z}/{a..z}/{a..z}
# ls -l /tmp/src | head -n2
drwxr-xr-x 28 root root 4096 Sep 22 21:24 a
drwxr-xr-x 28 root root 4096 Sep 22 21:24 b
# find /tmp/src/ | wc -l
18279

Without

# rm -rf /tmp/{src,dst} && mkdir -p /tmp/src/{a..z}/{a..z}/{a..z}
# python3 -m cProfile -s calls -m ansible.modules.copy<<<'{"ANSIBLE_MODULE_ARGS": {"src":"/tmp/src", "dest":"/tmp/dst", "owner":"alex", "group":"alex", "remote_src":true}}' | egrep '(os|posix)\.(chown|lchown|lstat|stat)'
   219396    0.517    0.000    0.517    0.000 {built-in method posix.lstat}
    92156    0.189    0.000    0.189    0.000 {built-in method posix.stat}
    36560    0.143    0.000    0.143    0.000 {built-in method posix.lchown}

With

# rm -rf /tmp/{src,dst} && mkdir -p /tmp/src/{a..z}/{a..z}/{a..z}
# python3 -m cProfile -s calls -m ansible.modules.copy<<<'{"ANSIBLE_MODULE_ARGS": {"src":"/tmp/src", "dest":"/tmp/dst", "owner":"alex", "group":"alex", "remote_src":true}}' | egrep '(os|posix)\.(chown|lchown|lstat|stat)'
   109806    0.261    0.000    0.261    0.000 {built-in method posix.lstat}
    92158    0.191    0.000    0.191    0.000 {built-in method posix.stat}
    36560    0.137    0.000    0.137    0.000 {built-in method posix.lchown}

without.cprof.gz
with.cprof.gz

@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Sep 30, 2021
@moreati moreati closed this Nov 12, 2021
@moreati moreati reopened this Nov 12, 2021
@ansibot ansibot removed the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Nov 12, 2021
@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Nov 24, 2021
@mattclay mattclay added the P3 Priority 3 - Approved, No Time Limitation label Jul 13, 2022
@moreati moreati changed the title copy: Reduce calls to lstat() in chown_recursive copy: copy: Reduce calls to stat() by 50% in chown_recursive Jul 29, 2022
@moreati
Copy link
Contributor Author

moreati commented Jul 29, 2022

Rebased on devel, and included the headline number in the summary

@ansibot ansibot removed the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Jul 29, 2022
moreati added a commit to moreati/ansible that referenced this pull request Jul 29, 2022
…rent

This merges implementations of `AnsibleModule.set_owner_if_different()`
& `AnsibleModule.set_group_if_different()` into a single method. The
benefits are
- reduction in code duplication
- makes possible halving the number of calls to `os.lchown()` when the
  copy module sets both owner and group.

There should be no outward change in behavior. The two original methods
become thin wrappers around the new one.

This PR is related to ansible#75741. If both are merged, then the second
benefit above can be realised - by replacing calls in
`copy.chown_recursive()`.
@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Aug 6, 2022
moreati added a commit to moreati/ansible that referenced this pull request Jul 8, 2023
…rent

This merges implementations of `AnsibleModule.set_owner_if_different()`
& `AnsibleModule.set_group_if_different()` into a single method. The
benefits are
- reduction in code duplication
- makes possible halving the number of calls to `os.lchown()` when the
  copy module sets both owner and group.

There should be no outward change in behavior. The two original methods
become thin wrappers around the new one.

This PR is related to ansible#75741. If both are merged, then the second
benefit above can be realised - by replacing calls in
`copy.chown_recursive()`.
@moreati moreati changed the title copy: copy: Reduce calls to stat() by 50% in chown_recursive copy: Reduce calls to stat() by 50% in chown_recursive Jul 8, 2023
@ansibot ansibot removed the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Jul 8, 2023
@ansibot
Copy link
Contributor

ansibot commented Jul 8, 2023

The test ansible-test sanity --test pylint [explain] failed with 2 errors:

lib/ansible/modules/copy.py:370:21: disallowed-name: Disallowed name "_"
lib/ansible/modules/copy.py:383:17: disallowed-name: Disallowed name "_"

click here for bot help

@ansibot ansibot added ci_verified Changes made in this PR are causing tests to fail. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels Jul 8, 2023
This reduces the number of times `stat()` or `chown()` are called by the
copy module. Reductions come from

- combining loops for owner and group, thus not looping twice
- not looping over the list of directories returned by `os.walk()`, each
  directory is already handled when `os.walk()` descends into it

Possible further reductions could be achieved by
- exiting early from the check_mode loop
- Replacing `m.set_owner_if_different()` & `m.set_group_if_different()`
  calls with a single method.

I propose to do these separately, if there is interest.
@ansibot ansibot removed ci_verified Changes made in this PR are causing tests to fail. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels Jul 8, 2023
@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects_2.12 core_review In order to be merged, this PR must follow the core review workflow. feature This issue/PR relates to a feature request. module This issue/PR relates to a module. P3 Priority 3 - Approved, No Time Limitation stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. support:core This issue/PR relates to code supported by the Ansible Engineering Team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants