OSX: Slow boot time when machine tied to Active Directory #27402

Closed
benkoller opened this issue Sep 25, 2015 · 18 comments
Labels
Bug broken, incorrect, or confusing behavior MacOS pertains to the OS of fruit P4 Priority 4 Packaging Related to packaging of Salt, not Salt's support for package management. Platform Relates to OS, containers, platform-based utilities like FS, system based apps severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around stale
Milestone

Comments

@benkoller

Hi,

I ran into an issue that's been bugging me for a while. In a nutshell, if I tie a Mac to Active Directory and try to launch the salt-minion via launchd at boot / login, I have to accept a 3-minute delay.

So far I tested a multitude of plists, starting with the one from pkg/darwin and working my way onwards from there, all with the same effect. Even when switching to a LaunchAgent I see the delay, however not during boot but during login.

The same Mac with the same plist, but not bound to Active Directory, boots without any delay.

When looking at the system.log I see a 3-minute delay between the start of processing all plists in /Library/LaunchDaemons and the next steps.
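
One way to pin down where such a pause sits is to scan the log for the largest jump between consecutive timestamps. The following is a minimal sketch, assuming the classic syslog prefix ("Mmm dd HH:MM:SS") used by system.log; the helper name `largest_gap` and the sample lines are made up for illustration, and the year has to be supplied because the log lines do not carry one.

```python
from datetime import datetime


def largest_gap(lines, year=2015):
    """Return (seconds, line_before, line_after) for the biggest timestamp jump."""
    best = (0.0, None, None)
    prev_time, prev_line = None, None
    for line in lines:
        try:
            # Classic syslog prefix: "Sep 25 10:01:02 host proc[pid]: message"
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best


# Made-up sample lines, not taken from a real log.
sample = [
    "Sep 25 10:01:02 mac com.apple.xpc.launchd[1]: made-up line A",
    "Sep 25 10:01:03 mac salt-minion[77]: made-up line B",
    "Sep 25 10:04:05 mac loginwindow[90]: made-up line C",
]
gap, before, after = largest_gap(sample)
print(gap, before, after, sep="\n")
```

On a real machine the list of lines would come from reading /var/log/system.log; the two lines returned bracket the stall.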

If you can provide me with a hint / suggestion / guidance where to look next for troubleshooting I'm happy to help.

Cheers!

@jfindlay jfindlay added Bug broken, incorrect, or confusing behavior severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around P3 Priority 3 P4 Priority 4 Packaging Related to packaging of Salt, not Salt's support for package management. Platform Relates to OS, containers, platform-based utilities like FS, system based apps MacOS pertains to the OS of fruit and removed P3 Priority 3 labels Sep 25, 2015
@jfindlay jfindlay added this to the Approved milestone Sep 25, 2015
@mosen

mosen commented May 31, 2016

I have the same issue and it seems to be related to the query of group memberships when the group module runs grp.getgrall or when the user/group module executes dscacheutil -q group. Both of these seem to stall the system while it collects information on Active Directory group memberships. To resolve this locally I've rolled a module that only queries local group membership.
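
The module mentioned above is not shown in the thread. As a rough sketch of the idea (not mosen's actual code), the local directory node can be listed with `dscl . -list /Groups PrimaryGroupID`, which should only consult the local node rather than Active Directory. The function names `parse_dscl_list` and `local_groups` are hypothetical.

```python
import subprocess


def parse_dscl_list(output):
    """Parse `dscl . -list /Groups PrimaryGroupID` output into {name: gid}."""
    groups = {}
    for line in output.splitlines():
        parts = line.split()
        # Expect "groupname  gid"; gids can be negative (e.g. nobody is -2).
        if len(parts) == 2 and parts[1].lstrip("-").isdigit():
            groups[parts[0]] = int(parts[1])
    return groups


def local_groups():
    """List groups from the local node only (macOS; requires the dscl tool)."""
    out = subprocess.check_output(
        ["dscl", ".", "-list", "/Groups", "PrimaryGroupID"], text=True
    )
    return parse_dscl_list(out)
```

Splitting the parser from the subprocess call keeps the parsing testable on machines without `dscl`.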

In the meantime, I will be looking at what specifically causes the group query to take so long.

A possible issue could be grains generated at startup that are not available until the Active Directory plugin can contact the domain controller.

m

@lesphinks

I see that this ticket is "High Severity", but I don't see any advancement.
Does anyone have a clue what is going on?
For us this is a critical issue, it makes Salt unusable on OSX.

@damon-atkins
Contributor

Do you have a large AD?
If so, is it possible to restrict the search to part of the OU tree?

@lesphinks

We're on LDAP here and not that many users right now, around 30.
Also what's weird is that Salt can work for a while and then one day it hangs on launch.
So I know it can work and it's not because of a big domain or anything.
Something buggy is happening regarding these services.

@mosen

mosen commented Sep 17, 2016

> If so, is it possible to restrict the search to part of the OU tree?

Not possible if you rely on Python's default grp.getgrall() implementation, I think.

I haven't done very detailed testing yet. There's no response from saltstack at the moment, so I might have to take another look.

I thought maybe you could delay the launch daemon until after the directory plugin has connected, but that doesn't necessarily fix the issue. You can still cause an OSX client to hang on minion startup if you just start the minion interactively.

I'm going to fork and see how much mileage I get out of swapping all the calls to grp for OS X.

@damon-atkins
Contributor

grp.getgrall() will use API calls, and those API calls will look at configuration files. For example, ldap.conf is the common name for LDAP configuration files, and within it you can configure search paths.
If you're using AD, it should cache some of the data, so if you're off the network it's still visible and can still authenticate.

@mosen

mosen commented Sep 21, 2016

Some more information which might give some more context to the issue:
From the man page GETGRENT(3); this is specific to OS X's implementation of this API.

The getgrent() function searches all available directory services on it's first invocation. It caches the returned entries in a list and returns group entries one at a time.
NOTE that getgrent() may cause a very lengthy search for group records by opendirectoryd and may result in a large number of group records being cached by the calling process. Use of this function is not advised.

This is the function invoked by python's grp module on the Python 2.7.10 shipped with Mac OS X 10.11.6. I'm not aware of any configuration file that can change the behaviour of this API on OSX, but I'm happy to be proven wrong!
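
To see the difference on an affected machine, one could time a single targeted lookup against the full enumeration. This is a small sketch using Python 3 syntax (the thread refers to Python 2.7, where `time.perf_counter` and f-strings are not available); the helper `timed` is a name introduced here, and gid 0 is assumed to exist.

```python
import grp
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Targeted lookup: asks for a single group by gid.
    _, single = timed(grp.getgrgid, 0)
    # Full enumeration: the call the man page above warns about.
    groups, full = timed(grp.getgrall)
    print(f"getgrgid(0): {single:.3f}s")
    print(f"getgrall():  {full:.3f}s for {len(groups)} groups")
```

No timings are given here because they depend entirely on the directory the machine is bound to.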

@mosen

mosen commented Sep 22, 2016

The exact invocation of getgrall that is causing the issue:
https://github.com/saltstack/salt/blob/develop/salt/utils/verify.py#L208

I've made a hotfix branch on my fork for this issue. For the moment I've just used a different verify_env function when salt.utils.is_darwin() returns true. There might be a way to sidestep the issue entirely, but I don't completely understand when the permissive argument is true.

In other news, you should be able to prevent a pinwheel at startup by setting this in the minion config:
verify_env: False
You will, however, have to verify the correctness of the permissions in the salt directories yourself.

@lesphinks

Thank you so much guys for looking into it.

@mosen you're a golden god !
I just tested the "verify_env: False" workaround and it works 👍
However, any Salt state that deals with permissions of any kind still hangs, e.g. file.directory.
So the workaround is good for the actual salt-minion boot process but we do hit the same issue later on.
At least we know where it's coming from now and hopefully we'll have a fix soon :-)

@mosen

mosen commented Sep 23, 2016

@lesphinks oh dear :(
It looks like salt.utils.get_group_list has to be changed just for macOS. Windows has its own module so I guess a darwin module might be the solution.
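
For illustration only, and not the change that was made in Salt: if all that is needed is the set of groups one user belongs to, `os.getgrouplist` (Python 3.3+, so not the Python 2.7 discussed above) asks for that user's membership directly instead of enumerating every group. Whether this avoids the stall on an AD-bound Mac is an untested assumption. The function name `gids_for_user` and its injectable parameters are hypothetical.

```python
import os
import pwd


def gids_for_user(user, getpwnam=pwd.getpwnam, getgrouplist=os.getgrouplist):
    """Return the sorted gids `user` belongs to without calling grp.getgrall()."""
    # The primary gid comes from the passwd entry and seeds the membership query.
    primary_gid = getpwnam(user).pw_gid
    # De-duplicate, since the primary gid may also appear among the results.
    return sorted(set(getgrouplist(user, primary_gid)))
```

The lookup functions are passed in as defaults so the logic can be exercised with fakes on a machine that has no such user.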

@mosen

mosen commented Sep 26, 2016

@lesphinks I tried to replicate the issue with file.directory but I did not see the same behaviour. It might be because grp.getgrgid and grp.getgrnam don't cause the issue, even though their respective manpages DO warn you about using those functions.

@lesphinks

@mosen
Then it must have been another state besides "file.directory" causing the same issue in one of my state files.
It's a pain to troubleshoot with the system hanging for minutes :-)
I'll try to isolate one; logically it has to point towards the file-management-related ones.

@mosen

mosen commented Sep 26, 2016

@lesphinks no problem, I'll try file.managed and recursive and report back.

@damon-atkins
Contributor

Add this to your minion config

log_fmt_console: '[%(levelname)-8s] %(module)s.%(funcName)s %(message)s'
log_fmt_logfile: '%(asctime)s,%(msecs)03.0f [%(name)-17s.%(funcName)s][%(levelname)-8s] %(module)s. %(message)s'
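
These are Python `logging` format strings; presumably the point is that every log line then names the module and function that emitted it, which helps locate the call that hangs. A quick sketch of what the console format produces, using a hand-built record (the logger name, path, function name, and message below are made up):

```python
import logging

CONSOLE_FMT = "[%(levelname)-8s] %(module)s.%(funcName)s %(message)s"


def render(fmt, pathname, func, message, level=logging.DEBUG):
    """Format one hand-built record so the layout can be inspected offline."""
    record = logging.LogRecord(
        name="salt.example",  # hypothetical logger name
        level=level,
        pathname=pathname,    # %(module)s is derived from this file name
        lineno=1,
        msg=message,
        args=None,
        exc_info=None,
        func=func,
    )
    return logging.Formatter(fmt).format(record)


print(render(CONSOLE_FMT, "/tmp/verify.py", "verify_env", "made-up message"))
# [DEBUG   ] verify.verify_env made-up message
```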

@bbrendon

bbrendon commented Apr 20, 2018

I tried salt on a few macs connected to AD and it is very disruptive to using the computer. I'm guessing this is my issue? Any workarounds?

@evonz-mx

Based on advice from @mosen above, we use this snippet to avoid the AD pinwheel problem on our MacOS minions.

Disable salt verification which causes pinwheel at bootup if joined to AD:
  file.managed:
    - name: /etc/salt/minion.d/noverify.conf
    - contents: "verify_env: False"

@stale

stale bot commented Aug 3, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Aug 3, 2019
@stale stale bot closed this as completed Aug 10, 2019
@Bacon-Unlimited

Still happening. Please reopen.
macOS 10.14-12
salt-minion version 3002.8


7 participants