
FsCheck.Xunit.PropertyFailedException when "Arguments exhausted after 99 tests." #245

Closed
panesofglass opened this issue May 12, 2016 · 11 comments

@panesofglass commented May 12, 2016

Is this expected? Based on the docs, I would expect that the test passed.

@ploeh (Member) commented May 12, 2016

Unless you've tweaked the default settings, FsCheck attempts to execute 100 successful test runs. Also by default, it has 1,000 attempts to produce arguments that pass a condition (`==>`). If it runs out of attempts before reaching 100 successful test runs, it throws that exception.
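As a hypothetical illustration (the property and precondition below are invented for this sketch, not taken from the issue), a conditional property whose precondition rejects most generated inputs can hit that limit:

```fsharp
open FsCheck
open FsCheck.Xunit

// The precondition only accepts lists of exactly length 5, so the default
// generator's output is rejected most of the time. FsCheck can then run out
// of its 1,000 generation attempts before reaching 100 successful runs and
// report "Arguments exhausted after N tests."
[<Property>]
let ``reversing twice is the identity (overly narrow precondition)`` (xs: int list) =
    (List.length xs = 5) ==> (List.rev (List.rev xs) = xs)
```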

That's what I'd expect, but perhaps the documentation is out of date. Which documentation are you referring to?

It may also be that I'm misunderstanding this report. If so, please accept my apologies, but in that case, I may need some more details in order to be able to help.

@panesofglass (Author) commented May 14, 2016

It was running out of arguments and failing to hit 100 successful tests some of the time. I didn't read that it was supposed to throw an exception in that case, so I assume I just skimmed too much.

@kurtschelfthout (Member) commented May 14, 2016

I'm not sure this is explicitly described in the docs. Not at a computer right now but re-opening so I remember to check and possibly amend later.

I do think it's the right behaviour; here it's on the edge but I think you'd want to know if a test could only generate a handful of values instead of the expected 100.

@fsoikin commented Feb 13, 2017

I don't quite understand what I'm supposed to do in this situation.
If my restrictions turn out to be too severe, I can of course reduce the number of test cases that I expect FsCheck to generate. But to what value, exactly? Can I expect that number (99 in the issue title) to be consistent? In my experience it fluctuates a little, so if I reduce it to 99 now and my tests start passing, can I be sure that it won't be 98 tomorrow, and the tests start failing again?

In other words: what is the recommendation for dealing with such failures?

@kurtschelfthout (Member) commented Feb 13, 2017

The best way to deal with it is to change your generator so that it produces more of the expected values.

The number is expected to fluctuate because the generated inputs are random. You can tweak some configuration settings via e.g. the Config type or the PropertyAttribute, including the number of test cases to try before FsCheck gives up, or the maximum size of the test run (larger test cases often have less chance of passing whatever precondition you've set up).
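As a sketch, assuming the FsCheck 2.x names for these settings (`MaxTest`, `EndSize`, and the `Config` record; check the documentation for your version), the tweaks might look like:

```fsharp
open FsCheck
open FsCheck.Xunit

// MaxTest lowers the number of required successful runs; EndSize caps the
// size of generated values, which can make a precondition easier to satisfy.
// (The property itself is illustrative, not from this thread.)
[<Property(MaxTest = 50, EndSize = 20)>]
let ``property with a relaxed configuration`` (xs: int list) =
    (xs <> []) ==> (List.rev (List.rev xs) = xs)

// The same settings via the Config record, when running outside xUnit:
let config = { Config.Quick with MaxTest = 50; EndSize = 20 }
let run () =
    Check.One (config, fun (xs: int list) -> List.rev (List.rev xs) = xs)
```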

@fsoikin commented Feb 13, 2017

But... Doesn't this basically mean that "restrictions" are completely useless?

Because inputs are random, the restrictiveness of a restriction can only be characterized statistically, which means there is a tiny chance that any restriction, even the least restrictive one, will lead to an "arguments exhausted" failure. This means that, no matter how I tweak the configuration and restrictions, I will always see my tests fail once in a while.

So, is the only reliable solution to use a generator instead of restrictions?

@kurtschelfthout (Member) commented Feb 13, 2017

In theory. In practice, this is not really a problem.

Yeah, the best solution is to change your generator. That also wastes less time on generating useless examples and shortens test execution time. Basically that's what this failure is trying to alert you to: your choice of generator + precondition is not great.

@ploeh (Member) commented Feb 13, 2017

Conditional properties aren't useless, but I tend to use them in cases where they only filter away a small fraction of generated values. Here's an example: http://blog.ploeh.dk/2016/01/18/make-pre-conditions-explicit-in-property-based-tests

In your case, OTOH, it sounds like 9 out of 10 values are being thrown away. Not only is it inefficient, but it's also, as you've discovered, non-deterministic. I prefer to give my tests as wide a statistical margin as possible.

As @kurtschelfthout has already suggested, it's often better to define a generator that generates only those values in which you're interested, instead of using a default generator, and then throw away values.
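A minimal sketch of the difference, using an invented "even numbers" property rather than anything from this thread:

```fsharp
open FsCheck
open FsCheck.Xunit

// Filtering approach: generate any int, then throw away the odd ones.
// Roughly half of all candidates are rejected before the property runs.
[<Property>]
let ``doubling preserves evenness (filtered)`` (n: int) =
    (n % 2 = 0) ==> (n * 2 % 2 = 0)

// Generator approach: construct only even numbers, so nothing is rejected
// and every generated value counts toward the 100 successful runs.
let evens = Arb.generate<int> |> Gen.map ((*) 2)

[<Property>]
let ``doubling preserves evenness (generated)`` () =
    Prop.forAll (Arb.fromGen evens) (fun n -> n * 2 % 2 = 0)
```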

If you share your requirements, we may be able to help you with that. Additionally, the generator documentation is fairly comprehensive, and even includes a nice collection of examples.

@fsoikin commented Feb 13, 2017

@ploeh:

  1. The statement that your conditions "only filter out a small fraction of values" can only be true statistically. This means that there is still a tiny chance that on a given run your condition will filter out all values. The chance is small, but non-zero (as I already said above, btw). Given a large number of tests, this may add up to a large chance of the overall test run to fail.
  2. This is not my case. I don't see where you're getting the idea that 9 out of 10 values are being thrown away. I never posted my specific numbers.
  3. I am perfectly capable of writing a generator, thank you. I am not arguing that writing a generator is hard, and I do not seek help with it. I am trying to understand the rationale behind having a mechanism that guarantees once-in-a-while failures.

@ploeh (Member) commented Feb 13, 2017

I'd guess that only @kurtschelfthout can speak authoritatively about the rationale, but it makes perfect sense to me.

After all, something like `false ==> myProperty` compiles, but will obviously never run to completion. Having a limit where FsCheck gives up makes perfect sense to me.

(BTW, the 9 out of 10 values was only based on the default configuration of 1000 tries for 100 test cases.)

@fsoikin commented Feb 13, 2017

I already got the rationale from @kurtschelfthout. The rationale was: "Yes, in theory this might happen, but in practice it is so improbable that it doesn't. And if it does, that points to a possibly incorrect combination of generator and restrictions." There was no need for a further response.
