explore using utest for testing the thumb targets #144

Closed
japaric opened this issue Feb 28, 2017 · 7 comments · Fixed by #155

Comments

@japaric
Member

japaric commented Feb 28, 2017

utest is a custom test crate that supports no_std targets like the thumb targets. We could use it to test the thumb targets but there are a few complications:

  • utest-cortex-m-qemu, the test runner for thumb targets that works with QEMU emulation, depends on this crate. This means that this crate will appear twice in the dependency graph and that could cause problems.

  • even if we get #[test] working for the thumb targets, quickcheck still won't work / compile for those targets. So we'll have to create unit tests for the thumb targets that don't rely on quickcheck. We could port the compiler-builtins tests; those don't depend on quickcheck, and porting them has the added advantage that we could move away from comparing compiler-builtins results to libgcc / libcompiler-rt results.
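
A minimal sketch of what a quickcheck-free unit test could look like: a fixed table of inputs and expected outputs checked in a loop, using nothing outside `core`. The function `udiv32` is a hypothetical stand-in for the intrinsic under test (it is not this crate's implementation), and the cases in `CASES` are hand-picked for illustration only.

```rust
/// Hypothetical stand-in for the intrinsic under test: unsigned 32-bit
/// division by shift-and-subtract. NOT the crate's actual implementation.
fn udiv32(n: u32, d: u32) -> u32 {
    assert!(d != 0, "division by zero");
    let mut quotient = 0u32;
    let mut remainder = 0u32;
    // Bring down one bit of the numerator at a time, most significant first.
    for i in (0..32).rev() {
        remainder = (remainder << 1) | ((n >> i) & 1);
        if remainder >= d {
            remainder -= d;
            quotient |= 1 << i;
        }
    }
    quotient
}

/// Hand-picked `((numerator, denominator), expected)` cases, including extremes.
const CASES: &[((u32, u32), u32)] = &[
    ((0, 1), 0),
    ((7, 2), 3),
    ((1, u32::MAX), 0),
    ((u32::MAX, 1), u32::MAX),
    ((u32::MAX, u32::MAX), 1),
    ((0x8000_0000, 3), 0x2AAA_AAAA),
];

/// Returns the first input pair whose result disagrees with the table, if any.
fn first_failure() -> Option<(u32, u32)> {
    CASES
        .iter()
        .find(|&&((n, d), expected)| udiv32(n, d) != expected)
        .map(|&(input, _)| input)
}

#[test]
fn udiv32_matches_table() {
    assert_eq!(first_failure(), None);
}
```

Because the expected values are stored in the table, a test of this shape needs neither quickcheck nor a reference C library at run time.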

@japaric
Member Author

japaric commented Mar 14, 2017

I have a new test suite in compiler-builtins-tester that works on both std and no_std targets and doesn't require compiling or linking to compiler-rt / gcc_s. I have checked that it mostly (*) passes on x86_64, the arm targets and the thumb targets.

(*) The powi{s,d}f2 tests are disabled because they don't compile on the thumb targets as we are missing some required intrinsics (e.g. aeabi_dmul). And testing on the thumbv6m target is also disabled because of #150.

The test suite works like this:

For each intrinsic, we randomly generate an integration test file (e.g. tests/divdi3.rs) that contains 10,000 (configurable) different test cases (e.g. for addsf3, 1.0 + 1.0 must equal 2.0). Since the test cases are generated on an x86_64 host, the expected results are "ground truth": they were obtained by computing the operation that the intrinsic emulates using hardware instructions.

Then we just run cargo test / cross test / xargo test (utest is supported out of the box) as appropriate.
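
The generation step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in compiler-builtins-tester: the `XorShift64` generator, the function `generate_divdi3_test`, the layout of the emitted file, and the `intrinsic_path` argument (the path under which the generated test would call the intrinsic) are all hypothetical. The expected results come from the host's native `i64` division.

```rust
use std::fmt::Write;

/// Tiny xorshift64 PRNG so the sketch needs no external crates (an assumption;
/// the real generator may use a different source of randomness).
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Builds the source text of an integration test for a signed 64-bit division
/// intrinsic. `intrinsic_path` is a hypothetical path to the function under test.
fn generate_divdi3_test(intrinsic_path: &str, cases: usize, seed: u64) -> String {
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let mut src = String::new();
    src.push_str("// Generated file: do not edit.\n");
    src.push_str("static TEST_CASES: &[((i64, i64), i64)] = &[\n");

    let mut emitted = 0;
    while emitted < cases {
        let a = rng.next_u64() as i64;
        let b = rng.next_u64() as i64;
        // The host computes the expected result; skip division by zero and
        // the overflowing case i64::MIN / -1.
        let expected = match a.checked_div(b) {
            Some(q) => q,
            None => continue,
        };
        writeln!(src, "    (({}, {}), {}),", a, b, expected).unwrap();
        emitted += 1;
    }

    src.push_str("];\n\n#[test]\nfn divdi3() {\n");
    src.push_str("    for &((a, b), expected) in TEST_CASES {\n");
    writeln!(src, "        assert_eq!({}(a, b), expected);", intrinsic_path).unwrap();
    src.push_str("    }\n}\n");
    src
}
```

Writing the returned string to a file such as tests/divdi3.rs before invoking the test command would give a test that carries its own expected values and so does not need compiler-rt or gcc_s at run time.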

While checking that this new test suite works, I found these issues: #145, #148, #150 and #151. Two of them affected std targets and should have been caught by the current test suite but weren't. I think the new test suite found them because it does a better job at picking extreme values for the test cases.
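
As an illustration of what "picking extreme values" could mean in practice, here is a minimal sketch that mixes a fixed list of edge cases into otherwise random operands. The thread does not say how compiler-builtins-tester actually chooses its values, so the `EDGE_CASES` list, the one-in-four ratio, and the names `pick`, `biased_i32` and `XorShift32` are all assumptions.

```rust
/// Operands that uniform sampling would almost never produce (assumed list).
const EDGE_CASES: [i32; 7] = [
    0,
    1,
    -1,
    i32::MIN,
    i32::MAX,
    i32::MIN + 1,
    i32::MAX - 1,
];

/// Tiny xorshift32 PRNG so the sketch needs no external crates.
struct XorShift32(u32);

impl XorShift32 {
    fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}

/// Maps two raw random words to an operand: when `selector` is a multiple of
/// four, `payload` indexes into `EDGE_CASES`; otherwise `payload` is
/// reinterpreted as an arbitrary `i32`.
fn pick(selector: u32, payload: u32) -> i32 {
    if selector % 4 == 0 {
        EDGE_CASES[payload as usize % EDGE_CASES.len()]
    } else {
        payload as i32
    }
}

/// Draws one operand, biased toward the edge cases.
fn biased_i32(rng: &mut XorShift32) -> i32 {
    let selector = rng.next_u32();
    let payload = rng.next_u32();
    pick(selector, payload)
}
```

With uniform sampling alone, an operand like `i32::MIN` has a 1 in 2^32 chance of appearing per draw, so a run of 10,000 cases would almost never exercise it.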

Also, with this new test suite we can test the ARM targets, which the current test suite ignores because some unit tests return results that are correct but don't match the compiler-rt / gcc_s results, which are wrong / not as accurate. (cc @mattico)

I propose we move to this new test system as it has wider (target) coverage and doesn't rely on compiler-rt / gcc_s C code bases.

cc @alexcrichton

@alexcrichton
Member

Sounds like a great idea to me!

@mattico
Contributor

mattico commented Mar 14, 2017

Big 👍 from me. Having to deal with test failures due to gcc_s returning the wrong answer was extremely frustrating.

Are you planning to add the tester to this repo and have it generate test cases on each travis run, or to just use it as a generator and check-in some tests?

@japaric
Member Author

japaric commented Mar 19, 2017

@mattico The former to avoid bloating the repository with giant test files.

compiler-rt does the latter for its test suite; it contains multi-MB test files that test all the possible combinations. We could tweak the test generator to produce exhaustive test suites as well.

@mattico
Contributor

mattico commented Mar 19, 2017

How long would exhaustive tests take to run? Presumably it wouldn't be too bad if compiler-rt does it. We could have bors run the exhaustive tests, but just do small/random tests on PR commits.

@japaric
Member Author

japaric commented Mar 21, 2017

@mattico So actually compiler-rt doesn't have exhaustive tests. Their biggest test, for the i128 division intrinsic, has about 2^16 test cases (the file is 16MB big 😆). I think we'd actually run out of memory for a really exhaustive test suite.
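
A rough back-of-the-envelope sketch of why a truly exhaustive suite is out of reach: a two-operand intrinsic on N-bit operands has 2^(2N) input pairs. The helper names below are hypothetical, and the byte count assumes a raw table of two operands plus one result per case, which is smaller than the equivalent Rust source text.

```rust
/// Number of input pairs for a two-operand intrinsic on `bits`-wide operands,
/// or `None` when the count no longer fits in a `u128`.
fn exhaustive_cases(bits: u32) -> Option<u128> {
    1u128.checked_shl(2 * bits)
}

/// Size in bytes of a raw table holding two operands and one result per case,
/// or `None` on overflow.
fn raw_table_bytes(bits: u32) -> Option<u128> {
    let bytes_per_case = 3 * (bits as u128 / 8);
    exhaustive_cases(bits)?.checked_mul(bytes_per_case)
}
```

Under these assumptions, 8-bit operands already give 2^16 cases, 16-bit operands give 2^32 cases (about 24 GiB of raw data), and 32-bit operands give 2^64 cases. For 64-bit operands the count is 2^128, which overflows even a `u128`, and for the i128 division intrinsic it would be 2^256.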

> How long would exhaustive tests take to run?

From what I have seen, compiling the test files takes longer than running them, even within QEMU. IIRC, compilation time was already in the tens of seconds, but that is for all the test files / all the intrinsics.

@mattico
Contributor

mattico commented Mar 21, 2017

Yeah, I wouldn't expect exhaustive tests for i128 😆. I'm just a bit concerned about tests randomly uncovering an old error in an unrelated PR just because those specific numbers haven't been tested before. It's probably not a huge issue in practice, though.
