
add temp CI job to test syspolicy impact #127

Merged
merged 1 commit into NorfairKing:master on Jul 9, 2020

Conversation

abathur
Contributor

@abathur abathur commented Jul 9, 2020

Starting in Catalina, macOS runs a syspolicyd "assessment" that hits the network for each binary/script executable. It does cache these results, but Nix tends to introduce many "new" executables per build. (You can read more about this at NixOS/nix#3789).

This PR adds a temporary, redundant macOS job with these assessments disabled. I'm hoping you can adopt it for a few weeks to help me collect more data on how this affects real projects.

(I'm also hoping this runs clean. I've been waiting for these to run through before I PR, but I guess the CI config is only set to run on specific branches and on pull requests, so they aren't automatically running beforehand. 🤞)

@NorfairKing
Owner

@abathur (Cool name btw) I'm fine with leaving this in until it starts failing. What kind of data are you looking for now?

@NorfairKing NorfairKing merged commit 94082a2 into NorfairKing:master Jul 9, 2020
@abathur
Contributor Author

abathur commented Jul 9, 2020

Sure--I wouldn't want you to keep it if it started mucking with signals/flows you depend on.

Data is fairly simple.

I've had this 2nd job on one of my own projects for ~2 weeks. Initially I was hoping that using a single CI service across many projects would produce easy-to-compare numbers. Unfortunately (but not unexpectedly 😁) the CI durations are very noisy relative to runs on my local system; I'm not sure how much of this is just shared hardware, or if there's also heterogeneous hardware involved.

I'm hoping to handle that with a sufficiently large sample, so I built a little script that uses the GH API for workflows. It lets me mark the workflows+jobs+steps that are significant for a given repo (generally I just mark the Nix install, plus any obvious steps that'll trigger a Nix build), then collects the durations of those stages (for jobs with syspolicyd assessments on, and again with them off) and saves them to a CSV.
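For a sense of shape, the collection step might look roughly like this. This is a hedged sketch, not the actual gist: `OWNER`, `REPO`, and the contents of `SIGNIFICANT_STEPS` are illustrative assumptions, though the API endpoint and the `started_at`/`completed_at` step fields are real.

```python
# Rough sketch: pull per-step timings for a workflow run from the GitHub
# Actions API and keep only the steps marked as significant.
import json
import urllib.request
from datetime import datetime

OWNER, REPO = "abathur", "resholved"          # example repo from this thread
SIGNIFICANT_STEPS = {
    "Run cachix/install-nix-action@v10",
    "Run nix-build ci.nix",
}

def iso_to_unix(ts: str) -> float:
    # GitHub mixes "...Z" and explicit-offset timestamps; normalize the "Z"
    # suffix so datetime.fromisoformat (Python 3.7+) accepts both forms.
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).timestamp()

def fetch_jobs(run_id: int) -> list:
    # "List jobs for a workflow run" endpoint of the Actions REST API.
    url = (f"https://api.github.com/repos/{OWNER}/{REPO}"
           f"/actions/runs/{run_id}/jobs")
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["jobs"]

def significant_step_durations(jobs: list) -> list:
    # Yields (job name, step name, duration in seconds) tuples, ready to
    # append to a CSV.
    rows = []
    for job in jobs:
        for step in job.get("steps", []):
            if step["name"] in SIGNIFICANT_STEPS and step.get("completed_at"):
                duration = (iso_to_unix(step["completed_at"])
                            - iso_to_unix(step["started_at"]))
                rows.append((job["name"], step["name"], duration))
    return rows
```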

I hope I'll be able to get away with using the mean or median of these (the script currently just reports averages). Part of the reason I also collect the nix-install step is so that I've got a set of data points that should be fairly consistent across all of the collected projects, in case additional steps are needed.

If you're curious and want to collect the data yourself, I do have the script up in a gist. I certainly won't object if you spot any problems 😄

@NorfairKing
Owner

NorfairKing commented Jul 9, 2020

@abathur So what is it you're timing? The installing of Nix? The running of the executable? Because this CI run doesn't run any of the built executables (it does run tests, but those are separate executables), and the caching means that you can never trust any timings...

@abathur
Contributor Author

abathur commented Jul 9, 2020

> @abathur So what is it you're timing? The installing of Nix? The running of the executable? Because this CI run doesn't run any of the built executables (it does run tests, but those are separate executables), and the caching means that you can never trust any timings...

I'll record the number of seconds taken by the step that installs nix, and the step that invokes nix-build. The latter is a proxy for trying to ballpark what (if any) fraction of observed step durations is assessment overhead. The former is to have data points that should compare cross-project in case they're helpful later for spotting outliers and such.

My main focus is on how this overhead affects everything that happens during any invocation that triggers a nix-build. I don't have any per-project expectations; it's good if some uses of nix-build appear to have little or no overhead, since they'll help sharpen my understanding of what's roughly intrinsic to running a Nix build and what varies project-to-project or stack-to-stack.

Syspolicyd appears to get a chance to assess any script or executable that runs (unsure of precise scope; I think this is triggered down in a syscall). If it is a "new" one, it'll make a networked request with TLS and round-trip overhead to an Apple server and subsequent requests will re-use this connection until it hits an activity timeout. It caches these, but it seems (to me) like it may be doing this by inode. (If I write the same script contents to a 2nd file and run it, it'll still hit the network. If I edit the contents of a previously-run file, it doesn't.) Some nix build behavior (like working out of tree and generating wrappers) means many of the things that run may be "new".
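The "same contents, different inode" observation above is easy to poke at: copying a script yields identical bytes under a brand-new inode, which is consistent with the copy triggering a fresh assessment if the cache really is inode-keyed. A quick sketch (filenames are illustrative):

```python
# Create a script, then copy it: the copy has identical bytes but a new
# inode, so an inode-keyed cache would treat it as "new" and hit the network.
import os
import shutil

with open("a.sh", "w") as f:
    f.write("#!/bin/sh\necho hi\n")

shutil.copy("a.sh", "b.sh")   # identical contents, brand-new inode

same_bytes = open("a.sh", "rb").read() == open("b.sh", "rb").read()
distinct_inodes = os.stat("a.sh").st_ino != os.stat("b.sh").st_ino
print(same_bytes, distinct_inodes)   # expect True True on typical filesystems
```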

Edit: I am, of course, not at all sure how cachix will affect this. I figured it made sense to go ahead and collect data for a few weeks and see if it's unintelligible or not. I do have a few cachix-using projects included, so I'm optimistic that the impact of cachix itself will be obvious across the projects.

@NorfairKing
Owner

@abathur I think what you're doing makes sense, but only if you compare the timing to the timing of the 'normal' nix-build (that also uses cachix).
Please let me know how it goes!

@abathur
Contributor Author

abathur commented Jul 9, 2020

Yes, precisely the intent. Maybe an example of what my tracking script outputs after a run communicates it better. (I was going to do this for smos, but the GitHub API is returning an unexpected datetime format with timezone offsets that breaks my conversion to Unix timestamps; I'll need to fix that before I can collect yours.)

```
Reporting for abathur/resholved (avg duration per step, broken down by job):

Step: Run cachix/install-nix-action@v10
       29.1818s  -  macos_perf_test
       41.9091s  -  tests (macos-latest)

Step: Run nix-build ci.nix
      259.5455s  -  macos_perf_test
      307.2727s  -  tests (macos-latest)
```

It's just computing these from the CSV at the end of each run, so once I have a better sense of what the "right" metrics are, I'll rewrite that part or write a new analysis script.
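That end-of-run averaging could be as simple as the sketch below; the CSV column names (`job`, `step`, `duration_seconds`) are my guesses at a plausible schema, not the gist's actual layout.

```python
# Group collected rows by (step, job) and report the mean duration for each,
# which is roughly the shape of the report shown above.
import csv
from collections import defaultdict
from statistics import mean

def average_step_durations(csv_path):
    """Return {(step, job): mean duration} from a timings CSV.

    Assumed columns: job, step, duration_seconds (illustrative only).
    """
    buckets = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            buckets[(row["step"], row["job"])].append(
                float(row["duration_seconds"]))
    return {key: mean(values) for key, values in buckets.items()}
```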
