
add temp CI job to test syspolicy impact #127

Merged
merged 1 commit into NorfairKing:master on Jul 9, 2020

Conversation

abathur
Contributor

@abathur abathur commented Jul 9, 2020

Starting in Catalina, macOS runs a syspolicyd "assessment" that hits the network for each binary/script executable. It does cache these results, but Nix tends to introduce many "new" executables per build. (You can read more about this at NixOS/nix#3789).

This PR adds a temporary, redundant macOS job with these assessments disabled. I'm hoping you can adopt it for a few weeks to help me collect more data on how this affects real projects.

(I'm also hoping this runs clean. I've been waiting for these to run through before I PR, but I guess the CI config is only set to run on specific branches and on pull requests, so they aren't automatically running beforehand. 🤞)

@NorfairKing
Owner

@abathur (Cool name btw) I'm fine with leaving this in until it starts failing. What kind of data are you looking for now?

@NorfairKing NorfairKing merged commit 94082a2 into NorfairKing:master Jul 9, 2020
@abathur
Contributor Author

abathur commented Jul 9, 2020

Sure--I wouldn't want you to keep it if it started mucking with signals/flows you depend on.

Data is fairly simple.

I've had this 2nd job on one of my own projects for ~2 weeks. Initially I was hoping that using a single CI service across many projects would produce easy-to-compare numbers. Unfortunately (but not unexpectedly 😁) the CI durations are very noisy relative to runs on my local system; I'm not sure how much of this is just shared hardware, or if there's also heterogeneous hardware involved.

I'm hoping to handle that with a sufficiently large sample, so I built a little script that uses the GH API for workflows. It lets me mark the workflows+jobs+steps that are significant for a given repo (generally I just mark the Nix install, plus any obvious steps that'll trigger a Nix build), then collects the durations of those stages (for jobs with syspolicyd assessments on, and again with them off) and saves them to a CSV.
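For a sense of shape, the collection step might look roughly like this. This is a hedged sketch, not the actual gist: `OWNER`, `REPO`, and the contents of `SIGNIFICANT_STEPS` are illustrative assumptions, though the API endpoint and the `started_at`/`completed_at` step fields are real.

```python
# Rough sketch: pull per-step timings for a workflow run from the GitHub
# Actions API and keep only the steps marked as significant.
import json
import urllib.request
from datetime import datetime

OWNER, REPO = "abathur", "resholved"          # example repo from this thread
SIGNIFICANT_STEPS = {
    "Run cachix/install-nix-action@v10",
    "Run nix-build ci.nix",
}

def iso_to_unix(ts: str) -> float:
    # GitHub mixes "...Z" and explicit-offset timestamps; normalize the "Z"
    # suffix so datetime.fromisoformat (Python 3.7+) accepts both forms.
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).timestamp()

def fetch_jobs(run_id: int) -> list:
    # "List jobs for a workflow run" endpoint of the Actions REST API.
    url = (f"https://api.github.com/repos/{OWNER}/{REPO}"
           f"/actions/runs/{run_id}/jobs")
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["jobs"]

def significant_step_durations(jobs: list) -> list:
    # Yields (job name, step name, duration in seconds) tuples, ready to
    # append to a CSV.
    rows = []
    for job in jobs:
        for step in job.get("steps", []):
            if step["name"] in SIGNIFICANT_STEPS and step.get("completed_at"):
                duration = (iso_to_unix(step["completed_at"])
                            - iso_to_unix(step["started_at"]))
                rows.append((job["name"], step["name"], duration))
    return rows
```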

I hope I'll be able to get away with using the mean or median of these (the script currently just reports averages). Part of the reason I also collect the nix-install step is so that I've got a set of data points that should be fairly consistent across all of the collected projects, in case additional steps are needed.

If you're curious and want to collect the data yourself, I do have the script up in a gist. I certainly won't object if you spot any problems 😄

@NorfairKing
Owner

NorfairKing commented Jul 9, 2020

@abathur So what is it you're timing? The installing of Nix? The running of the executable? Because this CI run doesn't run any of the built executables (it does run tests, but those are separate executables), and the caching means that you can never trust any timings...

@abathur
Contributor Author

abathur commented Jul 9, 2020

> @abathur So what is it you're timing? The installing of Nix? The running of the executable? Because this CI run doesn't run any of the built executables (it does run tests, but those are separate executables), and the caching means that you can never trust any timings...

I'll record the number of seconds taken by the step that installs nix, and the step that invokes nix-build. The latter is a proxy for trying to ballpark what (if any) fraction of observed step durations is assessment overhead. The former is to have data points that should compare cross-project in case they're helpful later for spotting outliers and such.

My main focus is on how this overhead affects everything that happens during any invocation that triggers a nix-build. I don't have any per-project expectations; it's good if some uses of nix-build appear to have little or no overhead, since they'll help sharpen my understanding of what's roughly intrinsic to running a Nix build and what varies project-to-project or stack-to-stack.

Syspolicyd appears to get a chance to assess any script or executable that runs (unsure of precise scope; I think this is triggered down in a syscall). If it is a "new" one, it'll make a networked request with TLS and round-trip overhead to an Apple server and subsequent requests will re-use this connection until it hits an activity timeout. It caches these, but it seems (to me) like it may be doing this by inode. (If I write the same script contents to a 2nd file and run it, it'll still hit the network. If I edit the contents of a previously-run file, it doesn't.) Some nix build behavior (like working out of tree and generating wrappers) means many of the things that run may be "new".
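The "same contents, different inode" observation above is easy to poke at: copying a script yields identical bytes under a brand-new inode, which is consistent with the copy triggering a fresh assessment if the cache really is inode-keyed. A quick sketch (filenames are illustrative):

```python
# Create a script, then copy it: the copy has identical bytes but a new
# inode, so an inode-keyed cache would treat it as "new" and hit the network.
import os
import shutil

with open("a.sh", "w") as f:
    f.write("#!/bin/sh\necho hi\n")

shutil.copy("a.sh", "b.sh")   # identical contents, brand-new inode

same_bytes = open("a.sh", "rb").read() == open("b.sh", "rb").read()
distinct_inodes = os.stat("a.sh").st_ino != os.stat("b.sh").st_ino
print(same_bytes, distinct_inodes)   # expect True True on typical filesystems
```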

Edit: I am, of course, not at all sure how cachix will affect this. I figured it made sense to go ahead and collect data for a few weeks and see if it's unintelligible or not. I do have a few cachix-using projects included, so I'm optimistic that the impact of cachix itself will be obvious across the projects.

@NorfairKing
Owner

@abathur I think what you're doing makes sense, but only if you compare the timing to the timing of the 'normal' nix-build (that also uses cachix).
Please let me know how it goes!

@abathur
Contributor Author

abathur commented Jul 9, 2020

Yes, precisely the intent. Maybe an example of what my tracking script outputs after a run communicates it better. (I was going to do this for smos, but the GitHub API is returning an unexpected datetime format with timezone offsets that breaks my conversion to Unix timestamps; I'll need to fix that before I can collect yours.)

```
Reporting for abathur/resholved (avg duration per step, broken down by job):

Step: Run cachix/install-nix-action@v10
       29.1818s  -  macos_perf_test
       41.9091s  -  tests (macos-latest)

Step: Run nix-build ci.nix
      259.5455s  -  macos_perf_test
      307.2727s  -  tests (macos-latest)
```

It's just computing these from the CSV at the end of each run, so once I have a better sense of what the "right" metrics are, I'll rewrite that part or write a new analysis script.
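That end-of-run averaging could be as simple as the sketch below; the CSV column names (`job`, `step`, `duration_seconds`) are my guesses at a plausible schema, not the gist's actual layout.

```python
# Group collected rows by (step, job) and report the mean duration for each,
# which is roughly the shape of the report shown above.
import csv
from collections import defaultdict
from statistics import mean

def average_step_durations(csv_path):
    """Return {(step, job): mean duration} from a timings CSV.

    Assumed columns: job, step, duration_seconds (illustrative only).
    """
    buckets = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            buckets[(row["step"], row["job"])].append(
                float(row["duration_seconds"]))
    return {key: mean(values) for key, values in buckets.items()}
```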
