Fix nonconcurrent tests #6900

jethrogb · 2019-05-02T22:02:01Z

The cargo testsuite relies on a clean test “root” for every test (i.e. #[test]-annotated function). It relied on the test crate's behavior to spawn a new thread for each test, which isn't done when tests aren't run concurrently, breaking the test suite. In this PR, I'm using backtraces to figure out which test is being run, which is much more robust. I also cleaned up the root initialization logic so that it no longer recursively calls the init function.

Fixes #6746

rust-highfive · 2019-05-02T22:02:06Z

r? @ehuss

(rust_highfive has picked a reviewer for you, use r? to override)

jethrogb · 2019-05-08T01:29:51Z

Ping @ehuss

ehuss · 2019-05-09T00:25:22Z

Thanks for the PR! I don't think we can use backtraces to detect the current test, though. For example, on mac it runs about 5 times slower than normal.

Would you be willing to try a solution that involves creating a new proc-macro attribute? I think this should work: Create a new #[cargo_test] attribute which replaces the existing #[test] attribute. It should add a #[test] attribute, and also add a call to an initialization function at the top of the test. This initialization function can then use thread-local data to determine if it is a new test or not, and store the appropriate thread-local TASK_ID.

jethrogb · 2019-05-09T02:25:22Z

Done. I've put all the attribute changes in a separate commit for the reviewer's benefit.

ehuss · 2019-05-10T02:25:45Z

Would it be possible to keep the attribute short, like #[cargo_test]? What I was thinking is that the attribute would inject a call to an initialization function at the top of the test, and then remove the call to init from paths::root. This should ensure the test is only initialized once, and only for that test. The init function would need to check a thread-local value to determine if it is a new test, or if it is a test reusing a thread (in which case a new ID would be issued). Also, keeping the project named cargo_test would make its intent clearer if/when adding other macros.
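A rough sketch of the setup-function idea described here, for illustration only. The names `init_test`, `root_name`, `NEXT_ID`, and `TEST_ID` are hypothetical and are not taken from the PR; the sketch assumes the attribute would inject the `init_test()` call at the top of every test body, and it simply issues a new ID on every call so that a reused thread never keeps a stale one.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Global counter: every test gets a fresh ID, whichever thread runs it.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    // ID of the test currently running on this thread, if any.
    static TEST_ID: Cell<Option<usize>> = Cell::new(None);
}

/// Hypothetical setup hook; the attribute would call this first in each test.
/// It always issues a new ID, so a test that reuses a thread gets its own root.
fn init_test() -> usize {
    let id = NEXT_ID.fetch_add(1, Ordering::SeqCst);
    TEST_ID.with(|slot| slot.set(Some(id)));
    id
}

/// Hypothetical stand-in for `paths::root`: it only reads the ID and never
/// performs initialization itself.
fn root_name() -> String {
    let id = TEST_ID
        .with(|slot| slot.get())
        .expect("init_test was not called on this thread");
    format!("t{}", id)
}

fn main() {
    // Two "tests" run back to back on one thread, as with RUST_TEST_THREADS=1.
    init_test();
    let first = root_name();
    init_test();
    let second = root_name();
    assert_ne!(first, second);

    // A "test" on its own thread also gets a distinct root.
    let third = thread::spawn(|| {
        init_test();
        root_name()
    })
    .join()
    .unwrap();
    assert_ne!(second, third);

    println!("{} {} {}", first, second, third);
}
```

Because the ID is assigned in the setup call rather than lazily inside the root lookup, the result does not depend on whether the test harness spawns a new thread per test.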

jethrogb · 2019-05-10T02:58:42Z

> Would it be possible to keep the attribute short, like #[cargo_test]?

I wanted to keep the proc_macro hygienic. If you don't think that's useful, that can be done.

> What I was thinking is that the attribute would inject a call to an initialization function at the top of the test, and then remove the call to init from paths::root. This should ensure the test is only initialized once, and only for that test. The init function would need to check a thread-local value to determine if it is a new test, or if it is a test reusing a thread (in which case a new ID would be issued).

This is possible but it's a pretty significant rewrite of what I've done so far for IMO very little gain. Are you worried about the perf of the Mutex?

bors · 2019-05-10T15:04:24Z

☔ The latest upstream changes (presumably #6883) made this pull request unmergeable. Please resolve the merge conflicts.

ehuss · 2019-05-10T15:46:50Z

> I wanted to keep the proc_macro hygienic. If you don't think that's useful, that can be done.

I don't know what this means here. I'm not sure how injecting a call to an init function should affect hygiene.

> Are you worried about the perf of the Mutex?

Well, this as-is doesn't work. I'm not sure, but it looks to have some race conditions. I would lean towards a simple solution, more analogous to traditional test suite "setup" functions, that gets run before the test starts.

jethrogb · 2019-05-10T15:59:51Z

> I don't know what this means here. I'm not sure how injecting a call to an init function should affect hygiene.

The init function is not defined by the proc macro, so if you don't pass in the path to the init function, the macro is just making up the path out of thin air, which may or may not be imported at the place the function is defined.
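To illustrate the point about the init path, here is a minimal sketch that uses a declarative macro as a stand-in for the proc-macro attribute. Everything in it (`with_init`, `support::init`, `sample_test`) is hypothetical and not part of the PR. The macro receives the path to the init function as an argument, so it never has to assume a name that may not be in scope where the test is defined.

```rust
mod support {
    use std::sync::atomic::{AtomicUsize, Ordering};

    static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

    /// Hypothetical per-test setup function.
    pub fn init() {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    }

    /// Number of times `init` has run so far.
    pub fn init_calls() -> usize {
        INIT_CALLS.load(Ordering::SeqCst)
    }
}

// Hypothetical stand-in for the attribute: the caller passes the init path,
// and the macro injects a call to it at the top of the function body.
macro_rules! with_init {
    ($init:path, fn $name:ident() $body:block) => {
        fn $name() {
            $init();
            $body
        }
    };
}

with_init!(support::init, fn sample_test() {
    assert_eq!(1 + 1, 2);
});

fn main() {
    let before = support::init_calls();
    sample_test();
    // The injected call ran exactly once for this invocation.
    assert_eq!(support::init_calls(), before + 1);
}
```

The alternative discussed in the thread, a short attribute with no arguments, would instead hard-code a path inside the macro, which only works if that path resolves at every use site.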

> Well, this as-is doesn't work. I'm not sure, but it looks to have some race conditions.

I think it works fine, and there's no opportunity for a race condition because mostly everything is thread-local. The tests are currently failing because I think I changed some #[test] attributes that were inside the test cases.

jethrogb · 2019-06-03T20:30:59Z

@ehuss I answered your questions.

ehuss · 2019-06-05T17:00:11Z

I would still prefer a simpler attribute as explained above.

Also, I don't think we can switch Cargo to a workspace because it is inside another workspace in rust-lang.

EDIT: We'll also need to figure out some way to publish this new package; we can't have a path-only dependency. My preference would be to relax this restriction, but it'll be some work.

jethrogb · 2019-06-05T19:13:14Z

Pushed changes. If the current code is acceptable I'd prefer a swift review since this bitrots pretty quickly.

sed -i 's/^#\[test\]/#[cargo_test]/' $(rg -l '^#\[test\]')

Manual fixes:
* proc_macro::proc_macro_doctest

ehuss · 2019-06-07T20:31:52Z

Thanks!
I rebased and included some updates to make it publishable.
@bors r+

bors · 2019-06-07T20:31:53Z

📌 Commit a8c22ca has been approved by ehuss

Fix nonconcurrent tests

The cargo testsuite relies on a clean test “root” for every test (i.e. `#[test]`-annotated function). It relied on the `test` crate's behavior to spawn a new thread for each test, which isn't done when tests aren't run concurrently, breaking the test suite. In this PR, I'm using backtraces to figure out which test is being run, which is much more robust. I also cleaned up the root initialization logic so that it no longer recursively calls the `init` function.

Fixes #6746

bors · 2019-06-07T20:32:01Z

⌛ Testing commit a8c22ca with merge d5723eb...

bors · 2019-06-07T21:03:42Z

☀️ Test successful - checks-travis, status-appveyor
Approved by: ehuss
Pushing d5723eb to master...

Eh2406 · 2019-06-11T21:24:57Z

I am seeing

thread 'publish_lockfile::note_resolve_changes' panicked at 'path.metadata() failed with The system cannot find the file specified. (os error 2)', tests\testsuite\support\paths.rs:197:25

and then all other tests are failing with

thread 'registry::resolve_and_backtracking' panicked at 'Once instance has previously been poisoned', src\libstd\sync\once.rs:362:21

jethrogb · 2019-06-11T21:47:36Z

Can you run it again with RUST_BACKTRACE=1?

Eh2406 · 2019-06-12T15:12:42Z

Looks like deleting my cit folder fixed the problem. Sorry for the noise.

Update cargo

19 commits in 545f354259be4e9745ea00a524c0e4c51df01aa6..807429e1b6da4e2ec52488ef2f59e77068c31e1f
2019-05-23 17:45:30 +0000 to 2019-06-11 14:06:10 +0000
- Stabilize publish-lockfile. (rust-lang/cargo#7026)
- change package cache lock message (rust-lang/cargo#7029)
- Fix documenting an example. (rust-lang/cargo#7023)
- Fix nonconcurrent tests (rust-lang/cargo#6900)
- Update git2 crates for libgit2 0.28 (rust-lang/cargo#7018)
- fix bunch of clippy warnings (rust-lang/cargo#7019)
- Ignore remap-path-prefix in metadata hash. (rust-lang/cargo#6966)
- Don't synthesize feature diretives for non-optional deps (rust-lang/cargo#7010)
- Handle pipelined tests of libraries (rust-lang/cargo#7008)
- Import the cargo-vendor subcommand into Cargo (rust-lang/cargo#6869)
- Remove unnecessary outlives bounds (rust-lang/cargo#7000)
- Catch filename output collisions in rustdoc. (rust-lang/cargo#6998)
- the testing SAT solver was messed up by a refactor (rust-lang/cargo#6995)
- Add some hints to the docs for `cfg()` targets (rust-lang/cargo#6990)
- Test the Resolver against the varisat Library (rust-lang/cargo#6980)
- Update changelog. (rust-lang/cargo#6984)
- Update cache-messages tracking issue. (rust-lang/cargo#6987)
- zsh: Add --all-targets option to cargo-check and cargo-build (rust-lang/cargo#6985)
- Fix typo (rust-lang/cargo#6982)

ehuss · 2019-06-17T21:20:14Z

@Eh2406 I'm having a hard time reproducing the issue you are seeing. I had a few questions:

Does it get stuck permanently? That is, every time you run cargo test it fails until you manually delete cit?
Have you figured out a way to reliably cause it to happen? If so, can you list the exact steps?
Are you just running cargo test, or are you giving other flags?
If you can recreate it, can you modify the do_op function to print which file it is trying to delete?
Is there anything unusual about your system? Is it running Defender or any other AV? Are you using an editor or IDE that might be indexing the target directory? Do you have an SSD?

ehuss · 2019-06-17T21:41:59Z

Oh, I reproduced it! I think lowering down to 2 CPUs was the trick.

I'll dig in a little more.

Revert test directory cleaning change.

#6900 changed it so that the entire `cit` directory was cleaned once when tests started. Previously, each `t#` directory was deleted just before each test ran. This restores the old behavior due to problems on Windows.

The problem is that the call to `rm_rf` would fail with various errors ("Not found", "directory not empty", etc.) if you run `cargo test` twice. The first panic would poison the lazy static initializer, causing all subsequent tests to fail.

There are a variety of reasons deleting a file on Windows is difficult. My hypothesis in this case is that services like the indexing service and Defender swoop in and temporarily hold handles to files. This seems to be worse on slower systems, where presumably these services take longer to process all the files created by the test suite. It may also be related to how files are "marked for deletion" but are not immediately deleted.

The solution here is to spread out the deletion over time, giving Windows more of an opportunity to release its handles. This is a poor solution, and should only help reduce the frequency, but not entirely fix it.

I believe that this cannot be solved using `DeleteFileW`. There are more details at rust-lang/rust#29497, which is a long-standing problem that there are no good Rust implementations for recursively deleting a directory.

An example of something that implements a "safe" delete is [Cygwin's unlink implementation](https://github.com/cygwin/cygwin/blob/ad101bcb0f55f0eb1a9f60187f949c3decd855e4/winsup/cygwin/syscalls.cc#L675-L1064). As you can see, it is quite complex. Of course our use case does not need to handle quite as many edge cases, but I think any implementation is going to be nontrivial, and require Windows-specific APIs not available in std.

Note: Even before #6900 I still get a lot of errors on a slow VM (particularly "directory not empty"), with Defender and Indexing off. I'm not sure why. This PR should make it more bearable, though.
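The poisoning behavior mentioned above can be reproduced in isolation with `std::sync::Once`. This is only a sketch: the function name `poisoned_after_failed_init` is made up for illustration, and the `panic!` stands in for a failing `rm_rf` inside a one-time initializer. It does not use the test suite's actual code.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::Once;

/// Returns (first_call_panicked, second_call_panicked, once_completed).
fn poisoned_after_failed_init() -> (bool, bool, bool) {
    let once = Once::new();

    // The first initialization attempt fails; the panic stands in for a
    // cleanup step that could not delete a directory.
    let first = panic::catch_unwind(AssertUnwindSafe(|| {
        once.call_once(|| panic!("simulated cleanup failure"));
    }));

    // A later caller never runs its closure: the Once is now poisoned,
    // so call_once itself panics.
    let second = panic::catch_unwind(AssertUnwindSafe(|| {
        once.call_once(|| {});
    }));

    (first.is_err(), second.is_err(), once.is_completed())
}

fn main() {
    // Silence the default panic output for this demonstration.
    panic::set_hook(Box::new(|_| {}));

    let (first, second, completed) = poisoned_after_failed_init();
    assert!(first && second && !completed);
    println!(
        "first panicked: {}, second panicked: {}, completed: {}",
        first, second, completed
    );
}
```

This is why a single failed cleanup in a shared one-time initializer takes down every later test in the same process, whereas a per-test cleanup limits the failure to the test that hit it.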

rust-highfive assigned ehuss May 2, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 2, 2019

jethrogb force-pushed the nonconcurrent-tests branch 2 times, most recently from 8ec2751 to c11c127 Compare May 2, 2019 22:05

jethrogb mentioned this pull request May 2, 2019

Test suite fails when run with a single core #6746

Closed

jethrogb force-pushed the nonconcurrent-tests branch from c11c127 to 2520bcf Compare May 8, 2019 01:15

jethrogb force-pushed the nonconcurrent-tests branch 2 times, most recently from dbc2d36 to 2a7e916 Compare May 9, 2019 02:24

jethrogb force-pushed the nonconcurrent-tests branch 2 times, most recently from 113ff07 to 6a2fecd Compare June 5, 2019 19:12

Jethro Beekman and others added 6 commits June 7, 2019 12:41

Fix tests when running with RUST_TEST_THREADS=1

04a9a7c

Fix intermittent test failure due to interleaved stdout

e157b6d

Don't rely on a thread local to uniquely create test roots

a598309

Update #[test] attribute on all tests in the testsuite

0e0d968

sed -i 's/^#\[test\]/#[cargo_test]/' $(rg -l '^#\[test\]')

Manual fixes:
* proc_macro::proc_macro_doctest

Update reamp_path_prefix_ignored to cargo_test.

1e74ea8

Update to allow publishing cargo-test-macro.

a8c22ca

ehuss force-pushed the nonconcurrent-tests branch from 6a2fecd to a8c22ca Compare June 7, 2019 20:22

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 7, 2019

bors merged commit a8c22ca into rust-lang:master Jun 7, 2019

bors mentioned this pull request Jun 7, 2019

Handle symlinks to directories #6817

Merged

ehuss mentioned this pull request Jun 12, 2019

Update cargo rust-lang/rust#61771

Merged

Eh2406 mentioned this pull request Jun 13, 2019

Crates.io meeting agenda 2019-06-13 20:00 UTC (Discord, 30 min) rust-lang/crates-io-cargo-teams#43

Closed

3 tasks

ehuss mentioned this pull request Jun 18, 2019

Revert test directory cleaning change. #7042

Merged

Eh2406 mentioned this pull request Jun 26, 2019

dev-dependencies are annoying and almost useless in publish rust-lang/crates-io-cargo-teams#46

Closed

ehuss added this to the 1.37.0 milestone Feb 6, 2022
