Masterless gitfs performance #30100

Closed
armooo opened this issue Dec 31, 2015 · 9 comments

armooo commented Dec 31, 2015

When using masterless salt with the gitfs fileserver, I observed that performance scaled with the number of SLS files multiplied by the number of git repos configured in my minion config. With 12 state files, 13 pillar files, and 9 git repos, a no-op highstate run was taking ~3 minutes to complete.

I was able to track this down to the time spent updating the git repos multiple times in a single highstate run. A salt master updates its fileserver periodically, by default once a minute. When running masterless, on the other hand, every construction of an FSChan causes the git repos to be updated.

I was able to test this by changing get_file_client to cache the FSChan instance, in a hacked-up way that completely ignores the value of opts. After this, a no-op highstate run finished in ~30 seconds, most of which is my own fault for one slow virtualenv.managed state, and the debug logs showed that the repos were only updated once.
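
For illustration, a minimal sketch of the kind of hack described above; the helper name and the module-level cache are mine, not the actual change made to get_file_client:

```python
# A hacked-up sketch, not an actual patch: memoize the FSChan at module
# level so the gitfs update runs once per salt-call instead of once per
# rendered SLS file. Like the hack described above, it deliberately
# ignores the value of opts when deciding whether to reuse the cache.
import salt.fileserver

_FS_CHAN = []  # hypothetical one-element, process-wide cache

def get_cached_fs_chan(opts):
    if not _FS_CHAN:
        # FSChan construction is what triggers the fileserver update
        _FS_CHAN.append(salt.fileserver.FSChan(opts))
    return _FS_CHAN[0]
```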

I don't have the context to propose the correct solution but would be happy to help with a patch with some guidance.

Salt Version:
           Salt: 2015.8.3

Dependency Versions:
         Jinja2: 2.7.2
       M2Crypto: Not Installed
           Mako: 0.9.1
         PyYAML: 3.10
          PyZMQ: 14.0.1
         Python: 2.7.6 (default, Jun 22 2015, 17:58:13)
           RAET: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.4
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
          gitdb: 0.5.4
      gitpython: 0.3.2 RC1
          ioflo: Not Installed
        libnacl: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.3.0
   mysql-python: 1.2.3
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
   python-gnupg: Not Installed
          smmap: 0.8.2
        timelib: Not Installed

System Versions:
           dist: Ubuntu 14.04 trusty
        machine: x86_64
        release: 4.2.0-0.bpo.1-amd64
         system: Ubuntu 14.04 trusty
@cachedout (Contributor)

Pretty interesting stuff, @armooo. Thanks for the research here.

I'm going to ping @terminalmage here to see what feedback he might have.

cachedout added the Bug, severity-low, Core, P2, and File Servers labels on Jan 4, 2016
cachedout added this to the Approved milestone on Jan 4, 2016
terminalmage self-assigned this on Jan 4, 2016

mshade commented Jun 2, 2016

Is there a workaround for this? We use a lot of formulas in git for local development in Vagrant, and it gets worse each time we add a formula. For context, we currently have 10 formulas, and a no-op highstate does 390 gitfs fetches.

@tomasfejfar

I made (with my lame bash skills) a workaround script that syncs the repos locally on demand. It sucks, but it's better than nothing.

https://gist.github.com/tomasfejfar/e3607569e6a38424e7cf2d9d56187815
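
The same idea as a rough Python sketch; the repo URL and checkout path below are placeholders of mine, not taken from the gist. The resulting checkouts can then be served through plain file_roots instead of gitfs:

```python
# Rough sketch of the local-sync workaround; repo URLs and paths are
# placeholders. Clone each formula once, then fast-forward it on demand.
import os
import subprocess

REPOS = {
    # formula repo URL -> local checkout served via file_roots
    'https://github.com/saltstack-formulas/nginx-formula.git':
        '/srv/formulas/nginx-formula',
}

for url, dest in REPOS.items():
    if os.path.isdir(os.path.join(dest, '.git')):
        # already cloned: fast-forward to the latest upstream commit
        subprocess.check_call(['git', '-C', dest, 'pull', '--ff-only'])
    else:
        subprocess.check_call(['git', 'clone', url, dest])
```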

@terminalmage (Contributor)

I've been looking into this. We store the fileclient object in a dictionary that should live for the entirety of a Salt run, to prevent re-initialization (and thus a fileserver update) from happening more than once during a masterless salt-call. However, it appears that, for some reason, this is no longer the case.
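
For reference, a minimal sketch of the caching pattern being described, assuming code running inside an execution module where the loader injects __context__; the 'cp.fileclient' key mirrors the one the cp module uses, and the helper name is mine:

```python
# Sketch of caching a fileclient in __context__, the per-run dictionary
# the loader injects into execution modules. Because __context__ lives
# for the entirety of one Salt run, the LocalClient (and the fileserver
# update its creation triggers) should only be set up once.
import salt.fileclient

def _client(opts):
    if 'cp.fileclient' not in __context__:
        __context__['cp.fileclient'] = salt.fileclient.get_file_client(opts)
    return __context__['cp.fileclient']
```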

@terminalmage (Contributor)

I've opened #34048 as a potential fix for this issue. I know it resolves the problem of multiple fileserver updates per masterless run, but there may be a more elegant way to do it.

In the course of my investigation I found that the method of using the __context__ dictionary to store fileclient instances does not work for all the places in the Salt codebase where we instantiate a fileclient, as not all of those places have access to the __context__ dict (unless I have misread, which is possible). I have been able to reproduce the multiple-update behavior going back to the 2014.1 release branch, so this issue has been around for some time. The reason it has gone largely unnoticed to this day is that it only affects the LocalClient fileclient subclass, which is used for masterless minions. The RemoteClient (used in the traditional master/minion setup, where the minion makes fileserver requests to a remote master) is not affected.

terminalmage modified the milestones: C 8, Approved on Jun 15, 2016
@terminalmage (Contributor)

The fix was merged into the 2015.8 release branch, closing this issue. It will take a few days for the change to be merged into the 2016.3 release branch and the develop branch; I'll update this comment thread once the fix has propagated to those branches, for those who would like to test.

@terminalmage (Contributor)

The fix has propagated to the 2016.3 release branch, and should be in develop in the next few days.

@terminalmage (Contributor)

Actually, the fix is now in the develop branch as well, so anyone intrepid enough to test from git can do so from the 2015.8, 2016.3, or develop branches.

@tomasfejfar

I updated to the latest version and the bug is gone. Good job. Cheers!
