Property-based testing for Napari #2444

Open
Zac-HD opened this issue Mar 21, 2021 · 4 comments
Labels: feature (New feature or request)
Milestone: 0.4

Comments

@Zac-HD (Contributor) commented Mar 21, 2021

Following some email discussions with @GenevieveBuckley, @sofroniewn, @tlambert03, @jni, and @FudgeMunkey, I thought I'd write up our ideas for property-based testing of Napari as a public issue so that anyone can contribute ideas or pull requests. As background, Hypothesis is basically super-powered random testing, and I wrote this short paper on applying it to scientific code.

The best way to start is generally with a PR that adds Hypothesis to your CI (I prefer to make this a standard test dependency, though some projects have specific jobs for property-based tests instead), updates the test guide, and adds a single simple test. Then future PRs can focus on adding more tests and fixing any bugs that this reveals.
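
As a minimal sketch of what that single simple test could look like (the property tested here, "sorting is idempotent", is purely a placeholder for a real napari property):

```python
# Illustrative only: Hypothesis generates the inputs, pytest collects and runs the test.
from hypothesis import given, strategies as st


@given(st.lists(st.floats(allow_nan=False)))
def test_sorting_is_idempotent(xs):
    once = sorted(xs)
    assert sorted(once) == once
```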

For Napari's first property-based tests, I'd use the following as a checklist (and try the ghostwriter):

  • "fuzz tests": feed in weird but technically-possible inputs, and see if anything crashes. In Numpy (for example) this has found issues with unicode strings, zero-dimensional arrays, and comparing arrays-of-structs containing nan. The key is to think carefully about how to generate all possible values, as bugs often rely on the interaction of two or more edge cases. Once you have some library-specific generators, it's easy to reuse them for later steps.
  • Round-trip tests: save your data, load it, assert no changes! Simple, important, and highly effective. Test format conversions in-memory, as well as persistent storage formats (see the first sketch after this list).
  • Equivalence: fantastic when you have a function that should behave identically to a reference implementation, even if only over a subset of inputs. Traditionally great for refactoring, but perhaps you also have Dask- or Napari-specific implementations of functions in scipy or scikit-*, and could run both on small arrays? Presenting identical data as Numpy vs Dask arrays could also be interesting.
  • Look for parametrized tests, and consider changing them to use Hypothesis. This is best when the full product is too slow to run regularly, e.g. array dimensionality * size * dtype * bitwidth * endianness * random contents (for each operand...) and using Hypothesis would allow a less-restricted test. Note that you can also supply some arguments from @pytest.mark.parametrize and others from @hypothesis.given in the same test (see the second sketch after this list)!
  • "Classic" properties like asserting bounds on outputs, idempotence, commutative operations, etc. These mostly apply to "algorithmic" code and are far from the only use of a PBT library, but still very useful when applicable.

[@sofroniewn suggests] focusing on a small subset of the codebase first to make it tractable, rather than running on the whole thing. For example, the code inside napari/layers, testing our individual layer objects like napari.layers.Image, napari.layers.Points, etc. These objects don't have any of our GUI/Qt code in them, and we definitely want them to be 100% rock solid across the range of their input types. (I'd ignore anything in a folder named vendored or experimental for now too.)
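
A hedged sketch of what a layer-focused fuzz test might look like, assuming napari.layers.Points accepts an (N, D) array of finite coordinates (whether it should accept every such array is exactly what the test probes):

```python
# Fuzz-test sketch for a single layer type: build a Points layer from
# arbitrary finite (N, D) coordinates and check construction doesn't crash
# or silently drop points.  Widening this to empty arrays, NaN/inf values,
# or other dtypes is where the interesting edge cases are likely to live.
import numpy as np
from hypothesis import given, settings, strategies as st
from hypothesis.extra.numpy import arrays

from napari.layers import Points


@given(
    coords=arrays(
        dtype=np.float64,
        shape=st.tuples(st.integers(1, 20), st.integers(2, 3)),  # (N, D) with D in {2, 3}
        elements=st.floats(-1e6, 1e6),
    )
)
@settings(deadline=None)  # layer construction may be slow enough to trip the default deadline
def test_points_layer_accepts_finite_coordinates(coords):
    layer = Points(coords)
    assert len(layer.data) == len(coords)
```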

@tlambert03 (Contributor) commented:

Just wanted to say that I'm very excited about this :) Your paper was compelling and I'm wondering how I've made it this far without using property-based testing! (particularly for widget inputs in magicgui... which would have revealed things like https://github.com/napari/magicgui/pull/178 far earlier). Thanks again!

@sofroniewn sofroniewn added this to the 0.4 milestone Mar 21, 2021
@sofroniewn sofroniewn added the feature New feature or request label Mar 21, 2021
@GenevieveBuckley (Contributor) commented:

I'm also excited to see what happens with this project!

@sofroniewn (Contributor) commented:

Thanks for the detailed write-up @Zac-HD, and for the enthusiasm from the community. I know @neuromusic and @goanpeca were interested in this too!

> The best way to start is generally with a PR that adds Hypothesis to your CI (I prefer to make this a standard test dependency, though some projects have specific jobs for property-based tests instead), updates the test guide, and adds a single simple test.

I think this is a reasonable place to start; it will make it very clear what we're taking on by adding Hypothesis support. If we decide to remove Hypothesis in the future for whatever reason, we can use that PR as a reference, and it should be easy to revert.

My understanding from a quick look at the Hypothesis docs is that it will integrate with our current pytest framework, so these tests can run alongside our existing ones for each of the environment/OS configurations in our current CI grid, which would be nice.

If you'd like to start with such a PR, we'd be happy to review.

> Then future PRs can focus on adding more tests and fixing any bugs that this reveals.

Yes, sounds good! I like the categories you propose above. I'd be happy to weigh in with more advice on where to start as we get to that point.

I'm looking forward to the first PR! :-)

@brisvag (Contributor) commented Mar 26, 2021

This sounds super cool! I'm going to play around with Hypothesis elsewhere too :)
I think starting with layers would be great; it would help with at least 4 of the issues I raised (#1801, #1833, #2347, #2348) ;)
