Update CI for upcoming macOS changes #7821

Closed
ehuss opened this issue Jan 22, 2020 · 21 comments · Fixed by #7906
Labels
A-testing-cargo-itself Area: cargo's tests P-high Priority: High

Comments

@ehuss
Contributor

ehuss commented Jan 22, 2020

Azure will be removing support for macOS 10.13 in March (https://devblogs.microsoft.com/devops/azure-pipelines-hosted-pools-updates/). We will need to update our CI configuration to use the new image. This will also remove support for cross-compiling tests to i686-apple-darwin. I would like to retain some kind of cross-compile testing on macOS, and Alex suggested using wasm as the alternate target (possibly minus the cross-compile cargo run tests).

@ehuss ehuss added the A-testing-cargo-itself Area: cargo's tests label Jan 22, 2020
@ehuss ehuss added the P-high Priority: High label Feb 2, 2020
@ehuss
Contributor Author

ehuss commented Feb 9, 2020

I have a branch which implements this, but I am running into problems. See rust-lang/rust#68863 (comment) for details.

EDIT: Added logs here:

Logs
taskgated   11  libsystem_pthread.dylib             0x00007fff6522fe65 _pthread_start + 148
taskgated   12  libsystem_pthread.dylib             0x00007fff6522b83b thread_start + 15
taskgated   no system signature for unsigned /Users/eric/Proj/rust/cargo/target/debug/cargo[30241]
taskgated   close(3) err: 0
taskgated   end request
taskgated   begin request: 3331, 27001
taskgated   UNIX error exception: 2
opendirectoryd  PID: 30240, Client: 'clang', exited with 0 session(s), 0 node(s) and 0 active request(s)
opendirectoryd  Trigger - cancelled
opendirectoryd  Finalizing pidinfo (30240) object - 0x7fccecb44d00
taskgated   0   Security                            0x00007fff3a1fea43 Security::CommonError::LogBacktrace() + 107
taskgated   1   Security                            0x00007fff3a1fed5d Security::UnixError::UnixError(int, bool) + 263
taskgated   2   Security                            0x00007fff3a1fedb8 Security::UnixError::throwMe(int) + 36
taskgated   3   Security                            0x00007fff3a101c56 Security::CodeSigning::KernelCode::identifyGuest(Security::CodeSigning::SecCode*, __CFData const**) + 1006
taskgated   4   Security                            0x00007fff3a0df0fa Security::CodeSigning::SecCode::identify() + 58
taskgated   5   Security                            0x00007fff3a0df514 Security::CodeSigning::SecCode::autoLocateGuest(__CFDictionary const*, unsigned int) + 122
taskgated   6   Security                            0x00007fff3a0e55a6 SecCodeCopyGuestWithAttributes + 78
taskgated   7   taskgated                           0x000000010119f3b7 taskgated + 13239
taskgated   8   taskgated                           0x000000010119facb taskgated + 15051
taskgated   9   taskgated                           0x000000010119fe67 taskgated + 15975
taskgated   10  taskgated                           0x000000010119fed3 taskgated + 16083
taskgated   11  taskgated                           0x00000001011a20ea taskgated + 24810
taskgated   12  taskgated                           0x00000001011a1617 taskgated + 22039
opendirectoryd  PID: 30005, Client: 'cargo', exited with 0 session(s), 0 node(s) and 0 active request(s)
taskgated   13  taskgated                           0x00000001011a0ce5 taskgated + 19685
syspolicyd  GK process assessment:  <-- (, )
opendirectoryd  Trigger - cancelled
syspolicyd  Unable (errno: 2) to read file at  for process path:  library path: (null)
taskgated   14  libsystem_pthread.dylib             0x00007fff6522fe65 _pthread_start + 148
opendirectoryd  Finalizing pidinfo (30005) object - 0x7fccead43440
syspolicyd  Dropping com.apple.syspolicy.Gatekeeper.Errors as it isn't used in any transform (not in the config or budgeted?)
kernel  build_userspace_exit_reason: illegal flags passed from userspace (some masked off) 0x141, ns: 9, code 0x8
syspolicyd  Terminating process due to Gatekeeper rejection: 30005, 
taskgated   15  libsystem_pthread.dylib             0x00007fff6522b83b thread_start + 15
kernel  Waking up reference: 66615
taskgated   no signature for pid=30005 (cannot make code: UNIX[No such file or directory])
kernel  Thread waiting on reference 66615 woke up
taskgated   end request
kernel  Sleep interrupted, signal 0x100
kernel  Security policy would not allow process: 30005, /Users/eric/Proj/rust/cargo/target/cit/t1430/foo/target/debug/deps/foo

Most of these messages are unique to when it decides to kill a process.

@alexcrichton
Member

@ehuss it may be worthwhile opening an issue for that on https://github.com/actions/virtual-environments since that's probably an image thing that may need fixing?

@ehuss
Contributor Author

ehuss commented Feb 10, 2020

Nah, I can repro locally. I'm pretty sure it's a bug in Catalina (I wasn't able to repro in older macOS versions). Catalina had a bunch of new security changes, so it's not too surprising that a weird use case like Cargo's test suite causes a problem. I might file a radar, but in my experience it usually takes at least a few years to get a response (and since this is an obscure case, unlikely to get any response).

I'm currently leaning towards changing Cargo's testsuite to check for processes that exit with SIGKILL on macOS and retry them. I'm not sure how difficult that will be (it will probably need to scan the output for text like "exited with signal: 9").

My concern is that it would also happen with rustc's test suite, since it also rapidly creates new executables and runs them. But it seemed fine on my end, so maybe there's something unique or different in how Cargo is interacting. Or maybe Cargo's tests are more complex than rustc's, since they often have more complex linking requirements and do several rapid rebuilds of the same project (like overwriting an executable several times might confuse Gatekeeper).
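
A minimal sketch of the retry idea described above, written as a shell helper rather than as a change to Cargo's testsuite. The function name run_with_retry and the attempt limit are hypothetical. It relies on the shell reporting a child killed by signal 9 as exit status 137 (128 + 9), and it does not scan output for "exited with signal: 9".

```shell
#!/bin/sh
# Hypothetical helper: re-run a command when it dies from SIGKILL.
# A child killed by signal 9 is reported by the shell as status 137 (128 + 9).
# Usage: run_with_retry MAX_ATTEMPTS command [args...]
run_with_retry() {
    rwr_max=$1
    shift
    rwr_attempt=1
    while :; do
        rwr_status=0
        "$@" || rwr_status=$?
        # Stop on success, on any failure other than SIGKILL,
        # or once the attempt budget is used up.
        if [ "$rwr_status" -ne 137 ] || [ "$rwr_attempt" -ge "$rwr_max" ]; then
            return "$rwr_status"
        fi
        echo "attempt $rwr_attempt was killed by SIGKILL, retrying" >&2
        rwr_attempt=$((rwr_attempt + 1))
    done
}
```

For example, run_with_retry 3 ./target/debug/foo would run the binary up to three times, but only while it keeps being killed. Any other exit status is returned immediately, so genuine test failures are not masked.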

@alexcrichton
Member

Oh jeez that's a bummer :(

I figured there was some way to disable this, but it doesn't look like that's possible, at least from the UI... I agree that waiting for a fix from Apple probably isn't gonna happen, but I'm surprised others haven't run into this, since testing C/C++ projects (maybe even LLVM) or other big-ish build systems in general seems like it would hit the same issues.

Googling around for "disable gatekeeper", it looks like there are some hits though. I've got Catalina myself and I can try to play with this soon and see if those can fix the issue. My thinking is that if we could disable whatever security things are there, then we could make the request to GitHub Actions to do that on their image.

@ehuss
Contributor Author

ehuss commented Feb 11, 2020

I tried disabling Gatekeeper with spctl --master-disable, but that didn't help. I also tried with the network turned off (thinking it might be related to the Notary service), but that didn't help either.

One thing did fix the problem, and that is disabling System Integrity Protection. That requires running csrutil disable from the Recovery OS. That seems to completely disable the syspolicyd process. I'm not sure if that's something we can ask Azure to do, since it is a manual thing that affects the entire system (and might lower their defenses). I'd also prefer not to do that on my development machines.

I could maybe run the tests in a loop and collect data about which tests seem to fail. So far it seems pretty random, but maybe there is a pattern to which ones are affected?

@ehuss
Contributor Author

ehuss commented Feb 11, 2020

I ran the tests in a loop for a couple hours and counted which tests failed:

4 build_script::optional_build_script_dep
3 run::run_multiple_packages
3 run::release_works
3 run::fail_no_extra_verbose
2 run::specify_name
2 run::run_workspace
2 features::all_features_virtual_ws
2 build_script::rename_with_link_search_path
2 build_script::assume_build_script_when_build_rs_present
1 standard_lib::simple_bin_std
1 run::specify_default_run
1 run::simple
1 run::run_with_library_paths
1 run::run_dylib_dep
1 run::run_bin_different_name
1 run::library_paths_sorted_alphabetically
1 run::explicit_bin_with_args
1 run::exit_code
1 run::dashes_are_forwarded
1 path::path_dep_build_cmd
1 features::default_feature_pulled_in
1 build_script::optional_build_dep_and_required_normal_dep
1 build_script::code_generation
1 build::run_proper_binary_main_rs
1 build::run_proper_binary
1 build::crate_library_path_env_var

The only commonality I see is that these all run a process after it is built.

@Mark-Simulacrum
Member

@ehuss Do you have a script for generating that? It might be useful to try and have others run tests locally as well, I guess, and see if we get the ~same set.

@ehuss
Contributor Author

ehuss commented Feb 11, 2020

I just ran while true; do cargo test --test testsuite >> log; done, and then stopped it after a few hours. I then grepped for "FAILED$" in the log file and made a unique count. I did that in my editor, but grep FAILED$ log | sort | uniq -c | sort -n should be the same.
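
The counting step above can be sketched as a small shell function. The name count_failures is hypothetical, and the sketch assumes the log contains libtest-style result lines such as "test run::simple ... FAILED". Unlike the pipeline quoted above, it extracts only the test name and sorts the most frequent failure first.

```shell
#!/bin/sh
# Hypothetical helper: tally failing tests in a log of repeated test runs.
# Assumes result lines of the form "test <name> ... FAILED".
count_failures() {
    grep 'FAILED$' "$1" |
        awk '{ print $2 }' |    # keep only the test name
        sort |
        uniq -c |               # count how often each name failed
        sort -rn |              # most frequent first
        awk '{ print $1, $2 }'  # normalize uniq's padding to "count name"
}
```

Running count_failures log after the loop has been stopped prints one "count name" line per failing test.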

@alexcrichton
Member

Hm, well, not really knowing much about why these tests are getting killed makes it sort of hard to figure out how to work around this. We may be able to get by with a 10ms sleep before spawning processes or something like that?

The nondeterministic nature of this in particular is hard to wrap my head around. Given that I'm not sure how we can handle this because it's hard to pinpoint a cause. Clearly we can run just-generated binaries, just not in some cases...

@Mark-Simulacrum
Member

In the reproductions I've managed to come across locally, "syspolicyd Unable (errno: 2) to read file at for process path: library path: (null)" is a commonality -- that suggests to me that some file is perhaps not being fully created yet? Maybe we need to (loosely) fsync before running a binary?

Unfortunately it looks like locally I get the following, and turning these privates into non-private seems hard (the random blogs online suggest running reverse-engineered code, which seems... error prone, and I don't want to break something :)

syspolicyd: Unable (errno: 2) to read file at <private> for process path: <private> library path: (null)
syspolicyd: Unable (errno: 2) to read file at <private> for process path: <private> library path: (null)

Some rough googling for the error here leads me to flutter/flutter#38325 via https://github.com/christopherfujino/catalina-crasher-demo/, which has the interesting sounding:

The actual cause is calling the Flutter tool re-entrantly, which deletes the current Dart SDK (which the flutter upgrade process is using), after which any forked sub-processes would be killed (code -9 from the VersionCheckError message) by the macOS anti-malware software

That sounds at least plausibly like something that might happen to us here? I don't think we're deleting anything that is forking a subprocess off, but maybe there's a similar error condition.

@ehuss
Contributor Author

ehuss commented Feb 12, 2020

Yea, from what I've read (blog) it is impossible to reveal those <private> fields in 10.15.3, that's very frustrating. I'm almost tempted to figure out how to reinstall 10.15.2 to view them.

I went strolling through Apple's opensource deposits to see if any of this code is available, but they have only released the kernel for 10.15, and I can't find syspolicyd in 10.14 (I found copies from much older releases (10.7?), but that is not helpful). I found the code for build_userspace_exit_reason, but it's not too interesting (pretty much saying, it won't create a crash report).

I created an isolated reproduction that does not use cargo:

scargo.zip

It seems like calling hard_link is required to trigger the issue. If I switch that to a copy, it doesn't trigger. Maybe we'll just need to switch to copies on macos? Maybe either of you can fiddle with that repro and see if there's some other way to work around it?
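
A rough shell sketch of the "copy on macOS, hard link elsewhere" idea floated above. This is not Cargo's code. The helper name link_or_copy is hypothetical, and the platform check via uname is an assumption about how such a switch might be made.

```shell
#!/bin/sh
# Hypothetical helper: hard link on most systems, but copy on macOS,
# where the hard-linked binary is the one that gets killed in the repro.
# Usage: link_or_copy SRC DST
link_or_copy() {
    loc_src=$1
    loc_dst=$2
    rm -f "$loc_dst"
    if [ "$(uname -s)" = "Darwin" ]; then
        cp -p "$loc_src" "$loc_dst"
    else
        # Fall back to a copy if linking fails (e.g. across filesystems).
        ln "$loc_src" "$loc_dst" 2>/dev/null || cp -p "$loc_src" "$loc_dst"
    fi
}
```

The trade-off is disk space and copy time on macOS in exchange for never executing a hard-linked file.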

@Mark-Simulacrum
Member

Heh, I didn't come across that particular blog! By chance, I think I'm still on .2, so I'll try that out.

I will also try to fiddle with the reproduction you've noted. Losing hard linking is actually probably not too bad, especially on Macs, where most people have SSDs I'd guess (or at least fast disks).

@aidanhs
Member

aidanhs commented Feb 12, 2020

I was trying to help someone on the users forum who had an issue with something that seems suspiciously similar - https://users.rust-lang.org/t/github-actions-randomly-kill-a-test-program/37255/

@certik

certik commented Feb 12, 2020

I am the one @aidanhs helped at the Forum with a similar (if not the same) bug. I posted a workaround there, but I do not know what the actual bug is.

@Mark-Simulacrum
Member

With privacy disabled, here's the log. Initial guess is that we're trying to look at the old file to verify the new file? (This is with the scargo reproduction provided by @ehuss). I have not looked in detail at the scargo contents, and probably do not have time to dig in much more though.

2020-02-12 16:42:26.232263-0500 0x8e2a02   Default     0x0                  192    0    <Security`Security::MacOSError::MacOSError(int)> syspolicyd: (Security) [com.apple.securityd:security_exception] MacOS error: 3
2020-02-12 16:42:26.232346-0500 0x8e2a02   Default     0x0                  192    0    <Security`Security::CodeSigning::Requirement::Interpreter::eval(int)> syspolicyd: (Security) [com.apple.securityd:SecError] Error checking with notarization daemon: 3
2020-02-12 16:42:26.239340-0500 0x8ece1a   Default     0x0                  192    0    <Security`Security::MacOSError::MacOSError(int)> syspolicyd: (Security) [com.apple.securityd:security_exception] MacOS error: -67062
2020-02-12 16:42:26.239542-0500 0x8ece1a   Default     0x0                  192    0    <Security`Security::MacOSError::MacOSError(int)> syspolicyd: (Security) [com.apple.securityd:security_exception] MacOS error: -67062
2020-02-12 16:42:26.239665-0500 0x8ece1a   Default     0x0                  192    0    <Security`Security::CodeSigning::PolicyEngine::temporarySigning(__SecCode const*, unsigned int, __CFURL const*, unsigned long long)> syspolicyd: (Security) [com.apple.securityd:gk] temporarySigning type=1 matchFlags=0x0 path=/Users/mark/Edit/cargo/scargo/cit/t3/out/foo-10c7edd825e22438
2020-02-12 16:42:26.240279-0500 0x8ece1a   Default     0x0                  192    0    <Security`Security::MacOSError::MacOSError(int)> syspolicyd: (Security) [com.apple.securityd:security_exception] MacOS error: -67062
2020-02-12 16:42:26.241946-0500 0x8ece1a   Info        0x0                  192    0    <syspolicyd> syspolicyd: GK Xprotect results: /Users/mark/Edit/cargo/scargo/cit/t3/out/foo-10c7edd825e22438, {
    XProtectMalwareType = 0;
}
2020-02-12 16:42:26.241995-0500 0x8ece1a   Info        0x0                  192    0    <XprotectFramework`__32-[XProtectAnalysis initWithURL:]_block_invoke.38> syspolicyd: (XprotectFramework) [com.apple.xprotect:xprotect] Invalidated
2020-02-12 16:42:26.242019-0500 0x8e2a02   Info        0x0                  192    0    <syspolicyd> syspolicyd: GK scan complete: /Users/mark/Edit/cargo/scargo/cit/t3/out/foo-10c7edd825e22438, 7, 0
2020-02-12 16:42:26.242212-0500 0x8e2a02   Info        0x0                  192    0    <CoreAnalytics`AnalyticsSendEventLazy> syspolicyd: (CoreAnalytics) [com.apple.CoreAnalytics:client] Dropping com.apple.syspolicy.Gatekeeper.Scan as it isn't used in any transform (not in the config or budgeted?)
2020-02-12 16:42:26.242257-0500 0x8e2a02   Info        0x0                  192    0    <syspolicyd> syspolicyd: scan finished, waking up any waiters: file:///Users/mark/Edit/cargo/scargo/cit/t3/out/foo-10c7edd825e22438
2020-02-12 16:42:26.242314-0500 0x8e2a02   Info        0x0                  192    0    <syspolicyd> syspolicyd: GK evaluateScanResult: 2, /Users/mark/Edit/cargo/scargo/cit/t3/out/foo-10c7edd825e22438, 0, 0, 1, 0, 7, 0
2020-02-12 16:42:26.242318-0500 0x8e2a02   Info        0x0                  192    0    <syspolicyd> syspolicyd: Updating flags: /Users/mark/Edit/cargo/scargo/cit/t3/out/foo-10c7edd825e22438, 512
2020-02-12 16:42:26.249546-0500 0x8e2a00   Default     0x0                  500    0    <Security`Security::MacOSError::MacOSError(int)> taskgated: (Security) [com.apple.securityd:security_exception] MacOS error: -67062
2020-02-12 16:42:26.255189-0500 0x8c837e   Default     0x0                  500    0    <Security`Security::MacOSError::MacOSError(int)> taskgated: (Security) [com.apple.securityd:security_exception] MacOS error: -67062
2020-02-12 16:42:26.257743-0500 0xe93      Default     0x0                  500    0    <Security`Security::UnixError::UnixError(int, bool)> taskgated: (Security) [com.apple.securityd:security_exception] UNIX error exception: 2
2020-02-12 16:42:26.257913-0500 0xe93      Default     0x0                  500    0    <taskgated> taskgated: no signature for pid=38121 (cannot make code: UNIX[No such file or directory])
2020-02-12 16:42:26.258090-0500 0x8ece1a   Info        0x0                  192    0    <syspolicyd> syspolicyd: GK process assessment: /Users/mark/Edit/cargo/scargo/cit/t6/foo-10c7edd825e22438 <-- (/Users/mark/Edit/cargo/scargo/target/debug/scargo, /Users/mark/Edit/cargo/scargo/cit/t6/out/foo-10c7edd825e22438)
2020-02-12 16:42:26.258140-0500 0x8ece1a   Error       0x0                  192    0    <syspolicyd> syspolicyd: Unable (errno: 2) to read file at /Users/mark/Edit/cargo/scargo/cit/t6/foo-10c7edd825e22438 for process path: /Users/mark/Edit/cargo/scargo/cit/t6/foo-10c7edd825e22438 library path: (null)
2020-02-12 16:42:26.258154-0500 0x8ece1a   Info        0x0                  192    0    <CoreAnalytics`AnalyticsSendEventLazy> syspolicyd: (CoreAnalytics) [com.apple.CoreAnalytics:client] Dropping com.apple.syspolicy.Gatekeeper.Errors as it isn't used in any transform (not in the config or budgeted?)
2020-02-12 16:42:26.258165-0500 0x8ece1a   Error       0x0                  192    0    <syspolicyd> syspolicyd: Terminating process due to Gatekeeper rejection: 38121, /Users/mark/Edit/cargo/scargo/cit/t6/foo-10c7edd825e22438
2020-02-12 16:42:26.258182-0500 0x8ece1a   Default     0x0                  0      0    kernel: build_userspace_exit_reason: illegal flags passed from userspace (some masked off) 0x141, ns: 9, code 0x8
2020-02-12 16:42:26.258214-0500 0x8ee07f   Default     0x0                  0      0    <AppleSystemPolicy`ASPEvaluationManager::waitOnEvaluation(syspolicyd_evaluation*)> kernel: (AppleSystemPolicy) Sleep interrupted, signal 0x100
2020-02-12 16:42:26.258232-0500 0x8ee07f   Default     0x0                  0      0    <AppleSystemPolicy`AppleSystemPolicy::procNotifyExecComplete(proc*)> kernel: (AppleSystemPolicy) Security policy would not allow process: 38121, /Users/mark/Edit/cargo/scargo/cit/t6/foo-10c7edd825e22438
2020-02-12 16:42:26.266110-0500 0x8c837d   Default     0x0                  500    0    <Security`Security::MacOSError::MacOSError(int)> taskgated: (Security) [com.apple.securityd:security_exception] MacOS error: -67062
2020-02-12 16:42:26.266385-0500 0x8ece1a   Info        0x0                  192    0    <syspolicyd> syspolicyd: GK process assessment: /Users/mark/Edit/cargo/scargo/cit/t2/out/foo-10c7edd825e22438 <-- (/Users/mark/Edit/cargo/scargo/target/debug/scargo, /Users/mark/Edit/cargo/scargo/cit/t2/out/foo-10c7edd825e22438)
2020-02-12 16:42:26.266526-0500 0x8ece1a   Info        0x0                  192    0    <syspolicyd> syspolicyd: Gatekeeper assessment rooted at: /Users/mark/Edit/cargo/scargo/cit/t2/out/foo-10c7edd825e22438
2020-02-12 16:42:26.267369-0500 0x8ece1a   Activity    0x87afdd             192    0    syspolicyd: (TCC) TCCAccessRequest() IPC
2020-02-12 16:42:26.267636-0500 0x8ece1a   Info        0x87afdd             192    0    <TCC`tccd_send_message> syspolicyd: (TCC) [com.apple.TCC:access] Sending 0/7 synchronous to com.apple.tccd.system: request (0x7fea22e38320): <dictionary: 0x7fea22e38320> { count = 6, transaction: 0, voucher = 0x0, contents =
        "require_purpose" => <null: 0x7fff903b0010>: null-object
        "service" => <string: 0x7fea22e4fdc0> { length = 24, contents = "kTCCServiceDeveloperTool" }
        "function" => <string: 0x7fea22e484f0> { length = 16, contents = "TCCAccessRequest" }
        "preflight" => <bool: 0x7fff903af930>: true
        "target_token" => <data: 0x7fea22e48d60>: { length = 32 bytes, contents = 0xf5010000f501000014000000f501000014000000ea940000... }
        "background_session" => <bool: 0x7fff903af950>: false

@ehuss
Contributor Author

ehuss commented Feb 14, 2020

Thanks for posting the unredacted logs. I don't see anything too surprising.

Just wanted to update, I've been experimenting with different things over the past 2 days, but haven't found any great workarounds. I'd like to avoid copies since they can use a lot of disk space. I also suspect for normal usage Gatekeeper is unlikely to affect anyone. Maybe we could only copy in debug mode for cargo's test suite? I was also trying to think of more extreme options (like APFS clones), but nothing practical has come up.

I created a repro just using shell scripts (just to rule out any particulars of Rust):

repro.sh

#!/bin/bash
# Gatekeeper crash reproduction.

set -e

echo "int main() {}" > foo.c
cc -o foo foo.c

N=8

for ((i=0;i<$N;i++))
do
    ./runner.sh $i &
    pids[$i]=$!
done

cleanup() {
    echo "Cleanup after exit..."
    kill -TERM "${pids[@]}"
    exit 1
}

# wait -n isn't available in this old version of bash.
trap "cleanup" CHLD

wait

runner.sh

#!/bin/sh

set -e

root=t$1
for i in {1..1000}
do
    echo $i
    rm -rf $root
    mkdir -p $root/out
    cp foo $root/out/foo2
    ln $root/out/foo2 $root/foo
    $root/foo
done

What's crazy is that it is very particular to the exact commands here. The following variations don't exhibit the crash:

  • Link same filename, different directories (out/foo and foo)
  • Link files in same directory (foo2 and foo)
  • Copy files instead of link

(Assuming I'm not being misled by timing variations.)

Some things that I've tried that don't help:

  • fsync files and directories (I was thinking that maybe the directory metadata wasn't updated)
  • short sleeps before running the process

@alexcrichton
Member

As a random shot in the dark, what if instead of hard-linking we instead did something like:

  • We previously hard link A (somewhere deep in deps) to B (somewhere in the "top level")
  • Instead we rename A to B
  • Then we hard link B to A

Since B would then be the "original copy" does it then fix the issue?

I'm also running those scripts locally to try to reproduce but nothing yet. How quickly does it reproduce for you?
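
The rename-then-link sequence proposed above can be sketched as follows. The helper name rename_then_link is hypothetical, with A as the freshly built file deep in deps and B as the top-level path. It is meant for experimenting with the repro scripts only: later in the thread this trick still produced failures in Cargo's full testsuite.

```shell
#!/bin/sh
# Hypothetical helper for the proposal above.
# Usage: rename_then_link A B
#   A = freshly built file (somewhere deep in deps)
#   B = top-level path that gets executed
rename_then_link() {
    rm -f "$2"
    mv "$1" "$2"    # B is now the "original copy"
    ln "$2" "$1"    # A is recreated as a hard link pointing at B
}
```

Afterwards both paths refer to the same inode, as with a plain hard link. Only the order in which the two names were created differs.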

@Mark-Simulacrum
Member

I could reproduce with cargo, running just the run tests, in around 15, maybe 20 minutes, though I only tried a couple times. I think I managed a reproduction in around 5-10 minutes with @ehuss's rust script. Both of these are on .2, though, and I've since updated to .3 and have not tried to reproduce since then.

I wonder if, due to the migration to APFS (which I presume is quite widely used, and I believe is CoW), we might be observing some side effect of that, and whether it would be beneficial to write a few bytes (vs. messing with what gets hard linked). Obviously we could make these the same bytes, I guess, though maybe that doesn't defeat the CoW nature of the filesystem...

If it is CoW, we should also check whether just making a copy already shares disk space, and as such is fast enough that we don't need to bother hardlinking at all.

@ehuss
Contributor Author

ehuss commented Feb 18, 2020

How quickly does it reproduce for you?

Usually takes anywhere from a few seconds to 5 minutes. I run my tests for 10-20 minutes just to make sure.

I suspect it is very sensitive to timing. The system I'm running tests on is kinda old (~6 years). You can maybe tweak the parallelism to match the number of cpus (I hard-coded it to 8). You can also maybe disable some CPUs to slow your machine down (Instruments > Settings > CPU).

Interestingly, I tried your rename trick and it doesn't seem to repro with that. How strange!

I tested with HFS, and I'm unable to repro on that, so it does seem to be related to APFS! I also verified that the 10.15 image on Azure switched to APFS (which doesn't seem to be documented anywhere 😠).

@Mark-Simulacrum regarding the CoW stuff, I'm not too familiar with APFS. However, my understanding is that CoW only works if the program uses special options. That is, the copyfile(3) function with the COPYFILE_CLONE flag. Is that correct, or is there something else regarding CoW? I considered trying that, but I think it is only available in very recent versions of macOS, and I don't know if it is worth monkeying around with that.
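
For quick experiments with cloning from a shell, without calling copyfile(3) directly, something like the sketch below might do. It swaps in the cp command as a stand-in: macOS cp accepts -c to request a clonefile(2)-based copy, and GNU cp accepts --reflink=auto. The helper name clone_or_copy is hypothetical, and whether a clone avoids the Gatekeeper kill is untested here.

```shell
#!/bin/sh
# Hypothetical helper: request a copy-on-write clone where the platform's
# cp supports it, and fall back to an ordinary copy otherwise.
# Usage: clone_or_copy SRC DST
clone_or_copy() {
    if [ "$(uname -s)" = "Darwin" ]; then
        # macOS cp: -c asks for a clonefile(2)-based copy.
        cp -c "$1" "$2" 2>/dev/null || cp "$1" "$2"
    else
        # GNU cp: --reflink=auto clones when the filesystem supports it.
        cp --reflink=auto "$1" "$2" 2>/dev/null || cp "$1" "$2"
    fi
}
```

On a filesystem without clone support this is just a normal copy, so it is safe to try. The disk-space benefit only appears where cloning actually succeeds.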

@alexcrichton
Member

Oh I think I was actually having some kills locally, I just didn't see them because the script didn't stop at them (or I messed something up in how I ran the script)...

Interestingly, I tried your rename trick and it doesn't seem to repro with that. How strange!

Do you think this is a viable way forward maybe? We could try to optimize this to not move files around if the hard links are already set up (I think we already stat everything anyway).

@ehuss
Contributor Author

ehuss commented Feb 19, 2020

Do you think this is a viable way forward maybe?

I ran some tests with Cargo's full testsuite using the rename trick, but I still got failures. ☹️

@bors bors closed this as completed in 369991b Feb 20, 2020
ehuss pushed a commit to ehuss/cargo that referenced this issue Mar 16, 2020
Switch azure to macOS 10.15.

Switches CI to the macOS 10.15 image.  Since 32-bit support is no longer available, this changes how cross-compile testing works.  I decided to use `x86_64-apple-ios` as a cross target, since it can easily build/link on macOS.  `cargo run` won't work without a simulator, so some of the tests are restructured to check if `cargo run` is allowed.  If you do have a simulator, it should Just Work.  CI doesn't seem to be configured with a simulator installed, and I didn't bother to look if that would be possible (the simulators tend to be several gigabytes in size).

An alternative approach would be to use wasm as a cross target, which is also fairly easy to support.  But wasm is a sufficiently different target that it can cause some issues in some tests, and is a bit harder to run as an executable.

This also adds some more help text on how to configure cross-compile tests.

Rustup is now installed on macOS by default, so no need to install it.  Unfortunately self-updates are not allowed, but hopefully that won't be an issue.

Closes rust-lang#7821
matklad added a commit to matklad/xshell that referenced this issue Mar 13, 2022