Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation of tests with sanitizers #21

Draft
wants to merge 35 commits into
base: master
Choose a base branch
from

Conversation

tgross35
Copy link

@tgross35 tgross35 commented Oct 6, 2023

It was proposed to start running sanitizers as part of occasional CI in this same way we run Miri, this is a first pass at doing that.

Originally my plan was to fork the repo but since there is quite a bit of common behavior, I think it could make sense to keep it together.

@RalfJung
Copy link
Member

RalfJung commented Oct 6, 2023

My concern is that I have basically zero experience with sanitizers, so I'd be rather helpless when something goes wrong with them.

Would it be possible to isolate these tests more from each other? Like, have a miri subfolder and a sanitizer subfolder, and have the rust-src.diff and rust-version file inside those subfolders, so that I can bump and patch the Miri version without affecting the sanitizers?

esac


# run the tests (some also without validation, to exercise those code paths in Miri)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments here still refer to Miri? Also running on other targets will not work so easily with sanitizers.

@RalfJung
Copy link
Member

RalfJung commented Oct 6, 2023

Or alternatively, would it make sense to look into having this as part of rustc CI?

@tgross35
Copy link
Author

tgross35 commented Oct 6, 2023

I wasn’t expecting a review quite so fast 😂 I just started a PR so I could see CI results. So a lot of the comments aren’t updated

Do you mean as a periodic task in rust-lang/rust? I suppose that would be an option too, but I don’t know how that would interact with everything else if these tests fail. I don’t really know how likely failure are, this will be interesting.

If you’d prefer it in a separate repo that’s fine too of course. It just seemed that with a lot of overlap, having one place to make any setup changes is easier than two.

Regarding failures, I honestly don’t know what to expect with this and I don’t know who maintains the implementation. This came out if a suggestion to dogfood the sanitizers feature since it may head to stabilization soon. So I’ll fix the directories, and just consider this borrowing your CI for a bit until someone else can chime in :)

@RalfJung
Copy link
Member

RalfJung commented Oct 6, 2023

Do you mean as a periodic task in rust-lang/rust? I suppose that would be an option too, but I don’t know how that would interact with everything else if these tests fail. I don’t really know how likely failure are, this will be interesting.

Periodic, or even with each PR, not sure how far we want to go.

If you’d prefer it in a separate repo that’s fine too of course. It just seemed that with a lot of overlap, having one place to make any setup changes is easier than two.

How big is the overlap in the end? All of the scripts that run the actual tests are separate, right? The overlap is in the crate setup that lets us invoke cargo test in the first place?

I'd say it depends on how much the parts that are different can be isolated, so that I don't have to worry about the sanitizer part when I need to patch the Miri part. If that can be done, then I'm fine with sharing the parts that can be shared.

Also someone should feel in charge of keeping this working so I can ping them in case of trouble. Would you be that someone?

@RalfJung
Copy link
Member

RalfJung commented Oct 6, 2023

We should probably rename the project if we land this.^^ miri-test-libstd was anyway not a great name since it also tests libcore and more...

@tgross35
Copy link
Author

tgross35 commented Oct 6, 2023

Periodic, or even with each PR, not sure how far we want to go.

You can only compile with one sanitizer at a time and we supposedly support like 7,

If you’d prefer it in a separate repo that’s fine too of course. It just seemed that with a lot of overlap, having one place to make any setup changes is easier than two.

How big is the overlap in the end? All of the scripts that run the actual tests are separate, right? The overlap is in the crate setup that lets us invoke cargo test in the first place?

I'd say it depends on how much the parts that are different can be isolated, so that I don't have to worry about the sanitizer part when I need to patch the Miri part. If that can be done, then I'm fine with sharing the parts that can be shared.

Yeah, that's about it. It really makes no difference whether it is here or elsewhere, it just seemed maybe nice to keep the similar structure together (but it's also not running yet so that could change).

Also someone should feel in charge of keeping this working so I can ping them in case of trouble. Would you be that someone?

I wouldn't mind being that someone, but I'm also not a team member. I'm sure someone there's at least one person on the compiler team that wouldn't mind being a fallback, I'll ask around once I (hopefully) get this working.

We should probably rename the project if we land this.^^ miri-test-libstd was anyway not a great name since it also tests libcore and more...

It it winds up that we have different repos, I'm voting to name this one SANity check :)

It will be pretty interesting to see how the results of this all compare to Miri. I suspect Miri catches a lot more, but it's interesting that some of the sanitizers can see through to the C side as well (have to figure that bit out yet...)

@tgross35
Copy link
Author

tgross35 commented Oct 6, 2023

Finally just told it to mark everything as a pass so we'd see actual results. Initial group of failures looks pretty repetitive:

  1. Some weird uninit thing A B
  2. test_str_truncate_split_codepoint test_try_reserve must have something unusual under the hood, it says we're requesting the max allocation size A B
  3. A data race in memcpy? A B
  4. An allocation error similar to 2 A
  5. SIMD looks like it has a different uninit error A but it's the first create to pass ASAN and leaksan!
  6. Some segfault in stdarch? A
  7. I can't even get cfi to compile because it seems to want conflicting flags

I'm sure a lot of those could be false positives, just need to do a bit of chasing.

Everything passed stacksafe

@saethlin
Copy link
Member

saethlin commented Oct 6, 2023

Hm. My biggest concern at the moment is that the backtraces aren't useful, I'm not confident in debugging from CI without good backtraces. Are we somehow compiling without debuginfo?

@tgross35
Copy link
Author

tgross35 commented Oct 6, 2023

I guess llvm-symbolizer needed to be installed for better output, the first result is now https://github.com/rust-lang/miri-test-libstd/actions/runs/6434877064/job/17474985110?pr=21#step:4:69. Still not that easy to pinpoint but better, if I'm reading that right it seems like maybe it's yelling at something inside the panic handler?

@tgross35
Copy link
Author

tgross35 commented Oct 6, 2023

@saethlin
Copy link
Member

saethlin commented Oct 6, 2023

Yep, in my experience it's pretty common that you need to permit allocator failure due to tests like this.

The MSan backtraces look like this:

  0.000006       #29 0x55e06d4e357c in std::rt::lang_start::hff29f05a9594b9e8 /rustc/e0d7ed1f453fb54578cc96dfea859b0e7be15016/library/std/src/rt.rs:165:17
  0.000007       #30 0x55e06d22d22f in main (/home/runner/work/miri-test-libstd/miri-test-libstd/target/x86_64-unknown-linux-gnu/debug/deps/alloc_run_test-fa2f54cc8400c238+0x119d22f) (BuildId: 80acc57dbc04badae46dbaa41120bfe55e537384)
  0.000006       #31 0x7f0071229d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 229b7dc509053fe4df5e29e8629911f0c3bc66dd)
  0.000006       #32 0x7f0071229e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 229b7dc509053fe4df5e29e8629911f0c3bc66dd)
  0.000006       #33 0x55e06c0cf014 in _start (/home/runner/work/miri-test-libstd/miri-test-libstd/target/x86_64-unknown-linux-gnu/debug/deps/alloc_run_test-fa2f54cc8400c238+0x3f014) (BuildId: 80acc57dbc04badae46dbaa41120bfe55e537384)

The paths suggest we've somehow linked together artifacts from the local build and the precompiled standard library from rustup. MSan false positives are pretty common when you only instrument part of the program, which sure looks like is happening here.

It wouldn't surprise me if straightening out whatever is causing this fixes all the other errors.

@saethlin
Copy link
Member

saethlin commented Oct 6, 2023

I think we're better off setting ASAN_OPTIONS something like

export ASAN_OPTIONS="detect_leaks=0:detect_stack_use_after_return=true:allocator_may_return_null=1:detect_invalid_pointer_pairs=2"

At the very least letting the allocator return null. I honestly don't trust no_sanitize, partly-instrumented programs tend to throw false positives and we'll have to add another patch for every test that needs a very big allocation.

@tgross35
Copy link
Author

tgross35 commented Oct 6, 2023

I updated the flags but didn't add detect_leaks - are there a lot of false positives with that one?

Do you know of a way to fix cross-build linking? I am sort of wondering if maybe it is better to let x.py take a sanitizer option and build from source before running these tests. That would probably mean a separate repo from this one.

We could almost do it as part of rust-lang/rust CI like Ralf suggested, since there's about an hour free time between x86 finishing and a full run. But we would want to reuse the built artifacts but parallelize these sanitizer tests, I don't know of a good way to do that

@saethlin
Copy link
Member

saethlin commented Oct 6, 2023

That's the pile of flags I use to look for UB. Leaks are just annoying, and I've seen tests are supposed to leak. Happy to start with a stricter approach for the standard library tests. (I'm not aware of false positives)

@tgross35
Copy link
Author

tgross35 commented Oct 6, 2023

I guess that if we can't rely on no_sanitize for an intentionally leaky test then it sure seems pretty useless :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants