High Build times on Big Sur #510
Comments
Do you have any other system extensions loaded? What does |
This is the output:
|
Can you remove |
Thanks Russell! It looks like Santa is triggering a bug with the Carbon Black system extension. Our theory is that google.santad.plist periodically checks whether the Santa system extension is alive and, if it's not, loads it. On Big Sur, however, this somehow kills com.carbonblack.es-loader.es-extension over and over again. Do you have suggestions on how to run the plist only when the Santa system extension is down? |
I'm not sure about that theory. The reason I asked you to test that is that we suspect the bug is related to the caching mechanism provided by the EndpointSecurity framework: either that caching mechanism is broken, some system extensions refuse to allow anything to be cached, or some extensions are clearing the cache very aggressively. The result is that Santa has to perform work for every single execution, even for unmodified binaries it has previously seen; during a build this is unusable, as you've discovered. Tom and I are going to work on bypassing the ES caching system and using our own, as we did pre-sysx, which will avoid these issues. Until then, I'm afraid your options are: a) put up with the bad performance |
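To illustrate the idea of a self-managed cache described above, here is a minimal sketch in Python. It is not Santa's implementation (Santa is not written in Python, and the thread does not describe its cache design). The `DecisionCache` class, the `file_key` helper, and the choice of (device, inode, mtime, size) as the cache key are all assumptions made for illustration.

```python
import os
from typing import Callable, Dict, Tuple

# Hypothetical cache key: identifies a file and changes when the file is modified.
FileKey = Tuple[int, int, int, int]  # (device, inode, mtime_ns, size)


def file_key(path: str) -> FileKey:
    """Build a key from stat() data so a modified binary gets a fresh decision."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_mtime_ns, st.st_size)


class DecisionCache:
    """Remembers allow/deny decisions so unchanged binaries are evaluated once."""

    def __init__(self, evaluate: Callable[[str], bool]) -> None:
        self._evaluate = evaluate  # the expensive check (hashing, rule lookup, ...)
        self._decisions: Dict[FileKey, bool] = {}
        self.hits = 0
        self.misses = 0

    def decide(self, path: str) -> bool:
        key = file_key(path)
        if key in self._decisions:
            self.hits += 1
            return self._decisions[key]
        # Cache miss: do the expensive work once and remember the result.
        self.misses += 1
        decision = self._evaluate(path)
        self._decisions[key] = decision
        return decision
```

With a cache like this, a build that executes the same compiler binary thousands of times pays for the expensive evaluation once; without any working cache, every execution is a miss, which matches the symptom reported in this issue.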
@russellhancox thanks for the suggestions. And I'd like to confirm a few more things to make sure that I fully understand your points:
Therefore I'd like to confirm whether the periodic invocation of /Applications/Santa.app/Contents/Library/SystemExtensions/com.google.santa.daemon.systemextension/Contents/MacOS/com.google.santa.daemon by launchd is intentional and, if not, whether we could make it conditional. For example, could we launch it only when Santa's system extension is not running? |
That plist is installed and loaded for the case where Santa is running as a kext. If Santa is configured to run as a sysx (as it is by default on 10.15+), then when it is started from launchd via that plist it deletes the plist and triggers a reload as a sysx. If santad is running as a system extension, as it is in this case, that plist file should not exist. |
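One way to check a machine for a leftover plist of this kind is sketched below. The `find_plists` helper is hypothetical, and searching `/Library/LaunchDaemons` for names containing "santa" is an assumption: the thread does not say where the plist was deployed or list the files Santa is expected to install.

```python
import os
from typing import List


def find_plists(directory: str, fragment: str) -> List[str]:
    """Return sorted paths of .plist files in `directory` whose name contains `fragment`."""
    try:
        names = os.listdir(directory)
    except FileNotFoundError:
        # A missing directory simply means there is nothing to report.
        return []
    return sorted(
        os.path.join(directory, name)
        for name in names
        if name.endswith(".plist") and fragment.lower() in name.lower()
    )


if __name__ == "__main__":
    # Assumed location; adjust to wherever your deployment tooling installs plists.
    for path in find_plists("/Library/LaunchDaemons", "santa"):
        print(path)
```

Any file this prints on a machine where santad runs as a system extension is worth checking against your deployment tooling, since per the comment above the plist should not exist in that configuration.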
@russellhancox: thanks, I double-checked and the plist file does indeed appear to be deployed by our infra rather than by Santa itself. Is there documentation on which launchd daemons Santa should install? It would help us verify whether there are more misconfigurations. |
Hi @russellhancox, were you able to make any progress on this issue? |
Unfortunately not. We used to, but the sysx migration has changed this a few times and we haven't yet documented what the expected end state is. At the system level the only "manually managed" daemon is
We have a test client with a self-managed caching layer messily integrated, which seems to work well but we haven't yet had a chance to test this with another system extension loaded to see if it actually fixes the problem |
I have this same exact issue running AMP and Santa (1.15 and 1.17 tested) on Big Sur. High CPU usage by the Santa System Extension and very very very slow builds. @russellhancox Any way I can get my hands on a copy of that test client with the self-managed caching layer? 😉 |
Unfortunately due to the signing/entitlement requirements it's not possible for us to distribute development builds (they're signed with dev profiles that are linked to specific devices) and we can't produce production builds without reviewed & submitted code. I have just sent out a PR that includes this feature and makes it optional, so we should be able to get a build out that includes it early next week. |
That's wonderful news, thanks for the update! |
Sorry, closed this a little prematurely. The v2021.1 release includes this cache, you'll need to enable it in your config profile by setting |
v2021.1 with the new cache option enabled is confirmed to fix the issue when running alongside AMP. 👍 |
Not to necropost, but did you guys ( @russellhancox , @tburgin ) follow up with Apple on the caching issues with Endpoint Security? My team is providing feedback for a few items in Big Sur and we were wondering if this has been highlighted. |
We didn't file anything about this. We're unsure whether there is a bug or, if there is, whether it's in the ES framework itself or in other ES clients. And as we don't run two ES clients side by side and haven't seen this ourselves, it's hard to gather the logs that would be necessary to file something. I'm happy to provide any details I can if you decide to file one, fwiw. |
Sure, I'd appreciate any pointers you may have on how to collect any more debug data for Apple/vendors for this particular issue. We can reproduce it in Big Sur with Santa 1.13. We run at least 2 other sysexts which should be looking at executions as well. I don't believe we've done extensive testing on isolating this particular bug down to a combination of sysexts so we can try that too. |
Apple will want From our position all I can say is that we had multiple reports of Santa showing very high (several hundred percent) CPU usage when run alongside other EndpointSecurity agents; with some logs from Console.app we determined that Santa was repeating work for every execution and guessed that the issue must be related to the ES cache. At first we thought it was this other client misbehaving, but we've heard of this with at least 3 different products (Carbon Black, CrowdStrike and Cisco AMP), so it's either a bug in ES, widespread confusion about ES's caching, or some interesting undocumented behavior. We could probably confirm which by writing a test ES client of our own so that we could run 2 things we control side by side, but a lack of time has prevented us from doing that, especially as we have a fix in place that we're happy with. |
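For anyone gathering evidence of the repeated work described above, a rough sketch of the counting step follows. It assumes you have already extracted the executed binary paths from your own logs into a list; the thread does not give Santa's log format, so parsing is deliberately left out, and `repeated_evaluations` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, Iterable


def repeated_evaluations(paths: Iterable[str]) -> Dict[str, int]:
    """Return the paths that were evaluated more than once, with their counts."""
    counts = Counter(paths)
    return {path: n for path, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical input: one entry per evaluation, taken from your own logs.
    sample = ["/usr/bin/clang", "/usr/bin/clang", "/bin/ls"]
    print(repeated_evaluations(sample))
```

If an unmodified binary shows up with a count close to the number of times it was executed during a build, that is consistent with decisions not being cached.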
Santa 1.15 on Big Sur is causing high local build times.
What logs would be useful to debug this?