Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] file.recurse is > 5x slower than file.managed #61998

Closed
ManofWax opened this issue Apr 28, 2022 · 12 comments · Fixed by #64977
Closed

[BUG] file.recurse is > 5x slower than file.managed #61998

ManofWax opened this issue Apr 28, 2022 · 12 comments · Fixed by #64977
Assignees
Labels
Bug broken, incorrect, or confusing behavior Performance

Comments

@ManofWax
Copy link

Description
file.recurse is > 5x times slower than file.managed even if the directory contains only one file.

Setup

Create this state file:

file_managed_test:
    file.managed:
        - source: salt://mypath/etc/systemd/resolved.conf
        - name: /tmp/test1.txt

file_recurse_test:
    file.recurse:
        - source: salt://sensor/etc/systemd
        - name: /tmp/testfolder

The directory used as source contains only one file:

#~/mypath/etc/systemd$ ls -l
-rw-rw-r-- 1 user1 user1 350 nov 12 11:30 resolved.conf

Steps to Reproduce the behavior

Run the state file:

$ sudo salt myserver.local state.apply sensor.file_recurse_test
myserver.local:
----------
          ID: file_managed_test
    Function: file.managed
        Name: /tmp/test1.txt
      Result: True
     Comment: File /tmp/test1.txt is in the correct state
     Started: 16:00:18.816785
    Duration: 256.539 ms
     Changes:   
----------
          ID: file_recurse_test
    Function: file.recurse
        Name: /tmp/testfolder
      Result: True
     Comment: The directory /tmp/testfolder is in the correct state
     Started: 16:00:19.073618
    Duration: 1564.548 ms
     Changes:   

Summary for myserver.local
------------
Succeeded: 2
Failed:    0
------------
Total states run:     2
Total run time:   1.821 s

file.recurse is 6 times slower than file.managed

Expected behavior

file.recurse speed should be comparable to file.managed if the number of files involved are the same

Versions Report
This is the minion salt version:

# salt-minion --versions
Salt Version:
          Salt: 3004.1
 
Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.7.3
     docker-py: 4.1.0
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.10.1
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.8.10 (default, Mar 15 2022, 12:22:08)
  python-gnupg: 0.4.5
        PyYAML: 5.3.1
         PyZMQ: 18.1.1
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.2
 
System Versions:
          dist: ubuntu 20.04 focal
        locale: utf-8
       machine: x86_64
       release: 5.4.0-107-generic
        system: Linux
       version: Ubuntu 20.04 focal

This is the master version:

$ salt-master --versions
Salt Version:
          Salt: 3004.1
 
Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.7.3
     docker-py: Not Installed
         gitdb: 2.0.6
     gitpython: 3.0.7
        Jinja2: 2.11.2
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.8.10 (default, Mar 15 2022, 12:22:08)
  python-gnupg: 0.4.5
        PyYAML: 5.3.1
         PyZMQ: 18.1.1
         smmap: 2.0.5
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.2
 
System Versions:
          dist: ubuntu 20.04 focal
        locale: utf-8
       machine: x86_64
       release: 5.4.0-107-generic
        system: Linux
       version: Ubuntu 20.04 focal
@ManofWax ManofWax added Bug broken, incorrect, or confusing behavior needs-triage labels Apr 28, 2022
@welcome
Copy link

welcome bot commented Apr 28, 2022

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey.
Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar.
If you have additional questions, email us at saltproject@vmware.com. We’re glad you’ve joined our community and look forward to doing awesome things with you!

@whytewolf
Copy link
Contributor

I am unable to recreate this.

root@salt00:/srv/salt/tests# salt salt00 state.apply tests.file_test
salt00:
----------
          ID: test_file_manage
    Function: file.managed
        Name: /tmp/test.txt
      Result: True
     Comment: File /tmp/test.txt is in the correct state
     Started: 22:25:16.901869
    Duration: 33.125 ms
     Changes:
----------
          ID: test_file_recurse
    Function: file.recurse
        Name: /tmp/testdir
      Result: True
     Comment: The directory /tmp/testdir is in the correct state
     Started: 22:25:16.935185
    Duration: 44.001 ms
     Changes:

Summary for salt00
------------
Succeeded: 2
Failed:    0
------------
Total states run:     2
Total run time:  77.126 ms

as you can see in my test they are similar times. and both much faster than yours. which makes me think something else has to be going on here. do you have a large number of files in your salt file server? or a large number of directories?

@ManofWax
Copy link
Author

Yes, /srv/ (the salt directory on salt-master) is very large:

/srv (master*) # rsync --stats --dry-run -ax /srv /tmp        

Number of files: 30,949 (reg: 20,900, dir: 10,042, link: 7)

It contains 31,000 files. How does it matter for file.recurse speed?

@whytewolf
Copy link
Contributor

the filesystem in salt is not exactly robust. it isn't meant to be large. if it isn't being served by salt it shouldn't be in the fileserver. a large fileserver can cause the master to slowdown as it gets requests. but also file.recurse scans the list of directories looking for the target directory. this normally is pretty fast. however with a large number of directories it can slow it down.

@sevdog
Copy link

sevdog commented Dec 23, 2022

Is this the cause of such issue?

salt/salt/states/file.py

Lines 4477 to 4485 in 31aee29

# Check source path relative to fileserver root, make sure it is a
# directory
srcpath, senv = salt.utils.url.parse(source)
if senv is None:
senv = __env__
master_dirs = __salt__["cp.list_master_dirs"](saltenv=senv)
if srcpath not in master_dirs and not any(
x for x in master_dirs if x.startswith(srcpath + "/")
):

@whytewolf
Copy link
Contributor

that is only half of it. the other half being the transfer of the data needed. and the slowdown on the master as it indexes the filesystem. with the extra calls needed for file.recurse it will still be slower for file.recurse than file.managed.

@ManofWax
Copy link
Author

Is there another way to implement it? Maybe store file content inside pillars?

@whytewolf
Copy link
Contributor

pillars would be worse. pillars are not meant for anything other than secrets.

the recommendation is to stop abusing it. if you need large swaths of files put somewhere use rsync state or compress it and use the archive state.

@sevdog
Copy link

sevdog commented Dec 23, 2022

How about avoid a doubple-loop and use a single any statement here:

-    if srcpath not in master_dirs and not any(
-        x for x in master_dirs if x.startswith(srcpath + "/")
+    if not any(
+        x for x in master_dirs if x.startswith(srcpath + "/") or x == srcpath
    ):

This could at least save some time when the list is big.

@whytewolf
Copy link
Contributor

can also save on the size of the loop by adding a prefix to the cp.list_master_dirs but in the end that will only help marginly. a slow master will still slowdown as it builds the file index to send to the state. which is where most of the slowdown happens. and since file.managed doesn't need to know about the files in the file server but file.recurse does file.recurse will ALWAYS be slower than file.managed.

@damon-atkins
Copy link
Contributor

Last time I look at this, getting meta data from the master about files is not every efficient. It does cache a little bit.

@whytewolf whytewolf added this to the Sulfur v3006.2 milestone Jun 30, 2023
@whytewolf whytewolf linked a pull request Aug 15, 2023 that will close this issue
3 tasks
@Ch3LL
Copy link
Contributor

Ch3LL commented Aug 16, 2023

Closed by #64977

@Ch3LL Ch3LL closed this as completed Aug 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior Performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants