Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit testing #24

Open
psalz opened this issue Jan 22, 2019 · 2 comments
Open

Unit testing #24

psalz opened this issue Jan 22, 2019 · 2 comments

Comments

@psalz
Copy link
Member

psalz commented Jan 22, 2019

While trying to compile more complex applications I often find areas of hipSYCL where I’d like to contribute something, be it a missing operator on some class or error handling for some particularly obscure situation (into which I tend to run into quite frequently for some reason). Also I’m getting a lot of (mostly minor) compiler warnings with hipSYCL right now, and I’d like to cut down on those as well. Anyway, I’d like to not break the existing implementation while doing this – so I think it’s time to add unit tests :).

While at some point it would be great to have test coverage for all the different components of hipSYCL (i.e. the source transformations, syclcc, ...), I think for now the most important thing is to add tests for the C++ library itself.

With SYCL being a specification, it certainly could be a good approach to try and add tests by systematically going through the spec. Unfortunately I don’t have the resources to embark on such a journey right now, so I’d suggest to take a more ad-hoc approach for now (i.e., adding tests as new features are added or when bugs are fixed). Once the conformance test suite goes public (#6) things should also become a lot easier in this regard.

A couple of questions of how to approach testing come to mind (I’m sure there’s more to figure out further down the line...):

  • Some functionality will have to be tested on the host, some on a device and some on both. It would be good to have a concise way of specifying this within a test case.
  • Should functions in tests include “__device__” attributes, or should they also exercise the source transformation step?

Do you have any preferences regarding a unit testing framework? I can highly recommend using Catch2, as it's modern, very easy to set up and doesn't get in your way.

Let me know your thoughts on this; I can begin to set this up if you’d like.


PS: Sorry for the issue-avalanche, as I mentioned I had some stuff in my backlog and I figured to just do it all today and be done with it ;-).

@illuhad
Copy link
Collaborator

illuhad commented Jan 23, 2019

yes, we absolutely need this :) I agree that full unit testing of the spec is too much work, and likely unecessary anyway if the conformance test suite will be released. So, let's just add tests for new features/issues as you suggest.

There are several components to testing:

  1. Travis CI. This is probably the most urgent one: CI is broken for non-master branches. At the moment travis CI tests if hipSYCL builds both with AMD, NVIDIA and host backends. This is of course highly useful since it lifts the burden from us to install all kinds of different compute stacks on our development machines, and allows us to see immediately if our changes break something on some platform. Due to the complexity of these compute stacks, this testing is done using docker images from AMD/NVIDIA so that we don't have to mess with installing this stuff manually during the test run. This is done using the dockerfiles in the install directory. The problem is that these Dockerfiles just pull the source code from github and try to compile it. But since they just do a git clone, you always end up with the master branch. So, even if Travis CI tells us that some pull request builds, it has actually only tested the master branch and we won't know if the changes actually compile everywhere until they are merged.
    I think we can solve this by adding a cmake target that just configures these dockerfiles such that a git checkout is performed of the current branch. Then in, Travis CI, we first generate the dockerfiles with cmake and then try to build them.
  2. Testing with SYCL test programs. Here we could start with the existing test programs in the tests directory and just add automatic comparisons of the output to the expected output.
  3. Explicit testing of the pure runtime components (e.g. task graph), although I'm not sure how well they can be tested without SYCL test programs, i.e. step 2.
  4. Testing of the source-to-source transformation. This is a bit more complicated since it's not so easy to test if the output is correct (e.g. __host__ __device__ is the same as __device__ __host__) and there are some things it does that alters semantics and cannot be detected by just trying to compile (e.g. inserting __shared__ attributes). It is probably easiest to test this by trying to compile test programs and seeing if they produce expected results.

Ultimately we are interested in knowing if everything still works after a change, so I think that unless we also explicitly test the source-to-source transformation with a dedicated test suite (which will be more complicated, as mentioned), we should also run the source-to-source transformation to catch issues with the toolchain.

For tests which contain device code, the most straight-forward approach would be:

  1. Test if all test programs compile on all backends, which tests syclcc and the source-to-source transformation. This could also be integrated into Travis CI.
  2. Test if all test programs behave correctly/produce correct results. At least for the CPU backend, we could also test this automatically in Travis CI.

Tests which only affect some internals of the runtime (e.g. task graph) should be compiled with syclcc --hipsycl-bootstrap, which disables the source-to-source transformation and is also used when compiling libhipSYCL.

I'm not familiar with catch2. My original plan was using boost.test since we already have boost as dependency. But I'm very open for catch2 if it has some technical advantages over boost.test or can do things that boost.test cannot.

@psalz
Copy link
Member Author

psalz commented Mar 20, 2019

I'm back with some news on the unit testing front! I tried to evaluate the differences between Catch2 and Boost.Test to see which framework would be better suited for this project. As it turns out they are quite similar regarding their feature set and I didn't encounter anything in either one that I'd miss in the other. Of course the approach to certain things, for example fixtures, is quite different -- but they both support it.

Ultimately I think Catch2 is a tad more convenient and succinct to use (which makes sense, as it is newer and makes heavy use of modern C++ features). Unfortunately however, as it turns out, as of right now Clang 8 doesn't seem to like the combination of Catch2 and hipSYCL at all - as it somehow doesn't compile or call device code correctly. I won't go into any details, suffice to say I've spent a good two days trying to figure this out and the problem is just really weird (something I've gotten used to working in this space...). Since everything seems to work with nvcc my working hypothesis is that this must be a bug in Clang.

In any case, I've started writing some unit tests using Boost.Test and have been making good progress. I'll open a PR shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants