Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file_ignore_regex / file_ignore_glob not working properly #49009

Closed
msciciel opened this issue Aug 8, 2018 · 4 comments

Comments

Projects
None yet
3 participants
@msciciel
Copy link
Contributor

commented Aug 8, 2018

Description of Issue

Filtering which is done here https://github.com/saltstack/salt/blob/develop/salt/fileserver/__init__.py#L195 modifies dirnames which is not used for anything later.

My workaround is to modify generate_mtime_map function:

def generate_mtime_map(opts, path_map):
    '''
    Generate a dict of filename -> mtime
    '''
    file_map = {}
    for saltenv, path_list in six.iteritems(path_map):
        for path in path_list:
            log.error('Scanning path {0}'.format(path))
            for directory, dirnames, filenames in os.walk(path):
                for item in filenames:
                    try:
                        file_path = os.path.join(directory, item)
                        # Don't walk any directories that match file_ignore_regex or glob
                        if is_file_ignored(opts, file_path):
                            continue
                        file_map[file_path] = os.path.getmtime(file_path)
                    except (OSError, IOError):
                        # skip dangling symlinks
                        log.info(
                            'Failed to get mtime on {0}, '
                            'dangling symlink ?'.format(file_path))
                        continue
    return file_map

I was debugging it because I thought that it would increase performance in our deployment because we have file roots containing cloned git repositories and a lot of files in .git subdirectories but I cannot see significant improvement.

Setup

Add to master configuration:

file_ignore_regex:
  - '/\.svn($|/)'
  - '/\.git($|/)'

Steps to Reproduce Issue

Add log_level: debug to master configuration and restart it. There is no information about skipped files but because of that line https://github.com/saltstack/salt/blob/develop/salt/fileserver/__init__.py#L276 there should be. After adding below fix they starts to appear.

                        if is_file_ignored(opts, file_path):
                            continue

Versions Report

Salt Version:
           Salt: 2017.7.5

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.5 (default, Apr 11 2018, 07:36:10)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.5.1804 Core
         locale: ANSI_X3.4-1968
        machine: x86_64
        release: 3.10.0-862.3.2.el7.x86_64
         system: Linux
        version: CentOS Linux 7.5.1804 Core
@Ch3LL

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2018

ping @terminalmage any insight to @msciciel 's suggestions here?

@Ch3LL Ch3LL added this to the Blocked milestone Aug 8, 2018

@msciciel

This comment has been minimized.

Copy link
Contributor Author

commented Aug 9, 2018

In my opinion significant performance could be achieved by replacing os.walk with something which will skip reading directory recursively if matched with file_ignore_regex.

@terminalmage

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2018

This is a good catch, I've opened #49308 for this.

If you'd like to attempt a more efficient run at this function, we'd appreciate the pull request.

@msciciel

This comment has been minimized.

Copy link
Contributor Author

commented Aug 27, 2018

I'll try to fix this function.

@rallytime rallytime closed this in 2badd7f Aug 31, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.