Hypothesis testing #20914
Comments
I think that at the current time it would be most useful to apply hypothesis testing to something like the polys domains rather than Expr.
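As a minimal sketch of what that could look like (the property and test name here are hypothetical, just to illustrate), a test over the ZZ domain only needs hypothesis's built-in integer strategy:

```python
from hypothesis import given, strategies as st
from sympy import ZZ

# A minimal sketch of a property-based test over a polys domain.
# We check a few ring axioms that should hold for any ZZ elements.
@given(st.integers(), st.integers(), st.integers())
def test_ZZ_ring_axioms(a, b, c):
    a, b, c = ZZ(a), ZZ(b), ZZ(c)
    assert a + b == b + a                # commutativity of +
    assert (a * b) * c == a * (b * c)    # associativity of *
    assert a * (b + c) == a * b + a * c  # distributivity
```

Domain elements are simple and fast to construct, which is exactly why this is an easier starting point than Expr.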
Hypothesis can also be useful for finding performance issues. Take something like #20914: a hypothesis test that just tries to generate expressions but does nothing with them would have found the issue, because simply constructing the expressions was slow.
One place I would start would be simply testing that SymPy expressions can be constructed. That would catch issues like #20914. The next simplest thing would be a hypothesis-based version of test_args. The hardest part for both of these is writing a hypothesis strategy to generate SymPy expressions, but the hard work for that was already mostly done in #17190, and we can reuse some of the machinery in test_args as well. Basically, every SymPy class needs a hypothesis strategy that specifies what sorts of inputs it can accept.
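A rough sketch of such a strategy (the generator below is hypothetical and deliberately tiny, only Add/Mul over integer and symbol atoms) plus a construction test:

```python
from hypothesis import given, strategies as st
import sympy

x, y, z = sympy.symbols("x y z")

# Hypothetical minimal strategy: build expressions recursively from
# integer and symbol atoms using Add and Mul.
atoms = st.one_of(
    st.integers(-5, 5).map(sympy.Integer),
    st.sampled_from([x, y, z]),
)

exprs = st.recursive(
    atoms,
    lambda children: st.one_of(
        st.builds(sympy.Add, children, children),
        st.builds(sympy.Mul, children, children),
    ),
    max_leaves=10,
)

@given(exprs)
def test_construction_and_rebuild(e):
    # Merely constructing e exercises the evaluation code (this alone
    # would have caught #20914).  Non-atomic expressions should also
    # satisfy the func-args invariant.
    if e.args:
        assert e.func(*e.args) == e
```

A real strategy would need to cover far more classes, but because strategies compose, it can be grown incrementally.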
I think that would be good. The actual hardest part is fixing the code, though. If you want to find examples breaking the func-args invariant, that's not hard. It's a bunch of work to do anything about it, though!

Here's a good real-world example where a hypothesis test was very simple to write (it took me about a minute), and it found a bug in a PR that would probably have gone unnoticed otherwise: #21259 (comment).

I've added this as a GSoC idea. I think playing around with hypothesis and seeing how far we can get with it would be a great GSoC project. At the very least we should find a decent number of bugs (which we will ideally also fix). I would warn anyone interested in this project, however, that it may be harder than it appears, and I would strongly recommend getting some experience using hypothesis first if you don't already have some.

Introducing hypothesis to sympy via #25428
I'd like to use this issue to discuss the idea of using hypothesis in SymPy.
For those who don't know, hypothesis is a library that lets you do property-based testing. You tell it what the input to your function should look like and assert what properties should always hold, and it tries to find inputs that falsify that. Here's an example to show what a hypothesis test looks like:
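A sketch of such a test (hypothetical; not necessarily the exact example originally posted), checking a factor/expand round trip on random integer polynomials:

```python
from hypothesis import given, settings, strategies as st
from sympy import symbols, expand, factor

x = symbols("x")

# Random integer coefficient lists -> polynomials in x.
coeffs = st.lists(st.integers(-10, 10), min_size=1, max_size=5)

@given(coeffs, coeffs)
@settings(deadline=None)  # factoring time varies a lot between inputs
def test_factor_expand_roundtrip(p, q):
    P = sum(c * x**i for i, c in enumerate(p))
    Q = sum(c * x**i for i, c in enumerate(q))
    prod = expand(P * Q)
    # Factoring and then re-expanding must give back the same polynomial.
    assert expand(factor(prod)) == prod
```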
(note that this test as written will fail because it takes too long for some inputs, but it's just to give the idea of what a property-based test looks like)
Hypothesis is an extremely powerful tool. It is very good at finding examples that fail your tests, things which you would never think to test yourself. However, it's also very picky: as soon as it finds a failing example, it saves it in its local example database and replays it on every subsequent run. So it's only useful to add it to SymPy in a place where either the code currently works, or we are willing to fix any bugs that it finds.
There has been a lot of discussion on this in the past. See #17190 and #20906.
My idea with hypothesis is to start small. Much smaller than what was proposed in #17190 (although that approach can still have some merit as something to run independently to see if anything interesting pops up). For example, the example test I wrote in #20906 passes on that branch; I just ran over 20000 examples. Even that, though, is still too much to start with, IMO, because the "correct" strategy to get it to find interesting examples is nontrivial. Random polynomials do not factor or have interesting roots. So you need to generate things in a way that matches what you are looking for.
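For instance (a hypothetical sketch), one way to bias generation so that every polynomial really does factor is to build each input as a product of random linear factors:

```python
from hypothesis import given, settings, strategies as st
from sympy import Mul, degree, expand, roots, symbols

x = symbols("x")

# Each factor a*x + b has the rational root -b/a, so products of
# these are guaranteed to be "interesting" for root finding.
linear = st.tuples(st.integers(1, 5), st.integers(-5, 5)).map(
    lambda ab: ab[0] * x + ab[1]
)

factorable = st.lists(linear, min_size=2, max_size=4).map(
    lambda fs: expand(Mul(*fs))
)

@given(factorable)
@settings(deadline=None)
def test_roots_finds_all_rational_roots(p):
    # p splits into linear factors over Q, so roots() should account
    # for every root, with multiplicities summing to the degree.
    assert sum(roots(p, x).values()) == degree(p, x)
```

Naive coefficient-based generation would almost never produce a polynomial with any rational root at all, so a test like this would mostly exercise the trivial path.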
I'll have to think a bit on where a good place to start would be. Ideally it would be something that is easy to generate with the builtin hypothesis strategies, something that doesn't blow up in terms of performance on certain inputs, and something where interesting inputs aren't difficult to find from the naive way of generating them.
The hardest part of writing good hypothesis tests is writing good strategies. But fortunately, strategies are reusable, so, e.g., if we created a good strategy for generating interesting expressions, then contributors would not need to worry about that to use it. Writing the test itself is generally straightforward. You just think of as many things as you want to be true of your function and assert them. It can be complicated, but not too hard once you get the hang of it. Actually, simply writing the test forces you to think about what you actually want to be true about your function (it is in some loose sense, a "spec" for your function). So insomuch as writing a property-based test is hard, that's a good thing.
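As a sketch of that reuse (strategy and test names hypothetical), one shared strategy can back several independent property tests, each asserting one piece of the "spec":

```python
from hypothesis import given, strategies as st
from sympy import cos, simplify, sin, symbols

x = symbols("x")

# One shared strategy; both tests below reuse it without knowing how
# the expressions are generated.
trig_exprs = st.builds(
    lambda a, b: a * sin(x) ** 2 + b * cos(x) ** 2,
    st.integers(-3, 3),
    st.integers(-3, 3),
)

@given(trig_exprs)
def test_simplify_preserves_value_at_one(e):
    # "Spec": simplification must not change the numeric value.
    assert abs(float(e.subs(x, 1)) - float(simplify(e).subs(x, 1))) < 1e-9

@given(trig_exprs)
def test_simplify_preserves_value_at_zero(e):
    # Same spec, probed at a second point.
    assert abs(float(e.subs(x, 0)) - float(simplify(e).subs(x, 0))) < 1e-9
```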
Here are some slides for a presentation I gave internally to some colleagues about hypothesis. You can also take a look at the test suite for ndindex, a library that I wrote, if you want to see what hypothesis tests look like in practice (see the docs for a high-level description of how the tests work).