Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FailedHealthCheck with hypothesis.extra.django for django.contrib.auth.models.User model #1112

Closed
MrGreenTea opened this issue Feb 14, 2018 · 16 comments · Fixed by #1117
Closed
Labels
bug something is clearly wrong here docs documentation could *always* be better enhancement it's not broken, but we want it to be better

Comments

@MrGreenTea
Copy link

MrGreenTea commented Feb 14, 2018

I just use create the most simple strategy like this:
USER_STRATEGY = models.models(User)

Then I get a FailedHealthCheck:
hypothesis.errors.FailedHealthCheck: It looks like your strategy is filtering out a lot of data. Health check found 50 filtered examples but only 3 good ones. This will make your tests much slower ...

Any idea what I can do to get User objects for my tests?

@Zac-HD Zac-HD added question not sure it's a bug? questions welcome docs documentation could *always* be better labels Feb 14, 2018
@Zac-HD
Copy link
Member

Zac-HD commented Feb 14, 2018

I would guess you have some complex validation on one or more fields, and the default strategy is giving you data of the right type (eg DateField) that fails validation (eg 'only the first of the month').

You can override the strategy for each field of a model, with the example of customer age here - but I'll admit the docs don't make it obvious! (We'll therefore keep this issue open to improve them.) For example:

USER_STRATEGY = models.models(User, age=st.integers(0, 100))

Does that help?

@MrGreenTea
Copy link
Author

Yeah I suspected that at first too, because we have a lot of ForeignKeys to User. But even in a fresh project the same happens.
My whole test code is this:

from django.contrib.auth.models import User
from django.test import TestCase
from hypothesis import given
from hypothesis.extra.django import models as model_st

USER_STRATEGY = model_st.models(User)


class TestUser(TestCase):
    @given(USER_STRATEGY)
    def test_user(self, user):
        assert user.username

My django version is Django 2.0.2 with Hypothesis 3.45.0

@DRMacIver
Copy link
Member

DRMacIver commented Feb 14, 2018

You're using django.test.TestCase but you need to be using hypothesis.extra.django.TestCase. What's happening is that generated users are colliding with previous generated users because the database state isn't being reset at the start of each generation (though I'm a little surprised that this is how that manifests).

We should probably be better about signalling this problem.

@Zac-HD
Copy link
Member

Zac-HD commented Feb 14, 2018

Ooohh, yep, that would do it. Related to #1071 and the (hopefully someday soon) solution in #1072 - in both cases we need to check the type of the class that our test method is defined on.

@DRMacIver
Copy link
Member

I'm going to close this as it's not actually a bug in Hypothesis, but I've open #1113 for communicating this problem more clearly.

@MrGreenTea
Copy link
Author

That doesn't change it for me.

# Create your tests here.
import hypothesis.extra.django
from django.contrib.auth.models import User
from hypothesis import given
from hypothesis.extra.django import models as model_st

USER_STRATEGY = model_st.models(User)


class TestUser(hypothesis.extra.django.TestCase):
    @given(USER_STRATEGY)
    def test_user(self, user):
        assert user.username

still gives me

hypothesis.errors.FailedHealthCheck: It looks like your strategy is filtering out a lot of data. Health check found 50 filtered examples but only 5 good ones

@DRMacIver DRMacIver reopened this Feb 14, 2018
@DRMacIver
Copy link
Member

Interesting. You're absolutely right. I could have sworn we had tests that this worked with the standard auth User, but apparently we don't! Sorry for jumping to conclusions.

@MrGreenTea
Copy link
Author

Yeah, regressions happen. No worries ;)
Hope you manage to find a solution and thanks for all the help!

@DRMacIver
Copy link
Member

DRMacIver commented Feb 14, 2018

Based on some investigation, I'm pretty sure the problem is that the user model has a very restrictive idea of what characters a username can contain, and we're trying to generate arbitrary unicode text for it.

We should figure out a proper fix, but the following will let you work around it until we do:

import hypothesis.strategies as st
import string

USER_STRATEGY = model_st.models(
    User,
    username=st.text(
        alphabet=string.ascii_letters + string.digits + '@/./+/-/_'))

@Zac-HD Zac-HD added bug something is clearly wrong here and removed question not sure it's a bug? questions welcome labels Feb 14, 2018
@The-Compiler
Copy link
Contributor

nitpick: @DRMacIver, I don't think you meant to include the slashes in the character list there 😉

@Zac-HD
Copy link
Member

Zac-HD commented Feb 14, 2018

(I know for a fact there are people using Hypothesis for Django, but if many of them are using its model introspection then I'm frankly astonished that we've never seen this before)

I thought this might be new in Django 2.0, but no - supported since 1.10 😞

nitpick: @DRMacIver, I don't think you meant to include the slashes in the character list there 😉

That's straight outta the Django docs, but you're right that they shouldn't be in the string literal. Instead, why not use the regex that we're failing to validate against? username=st.from_regex(r'^[\w.@+-]+$')

Which, uh, probably is the proper fix: for all string fields that use a RegexValidator, we should use from_regex instead of text. Note that we will need to be careful about inverted matches, flags, and compiled expressions vs string patterns.

@DRMacIver
Copy link
Member

nitpick: @DRMacIver, I don't think you meant to include the slashes in the character list there 😉

Uh, yeah, I just copied those out of the validator error message without really thinking about it.

@DRMacIver
Copy link
Member

DRMacIver commented Feb 14, 2018

I thought this might be new in Django 2.0, but no - supported since 1.10 😞

I mean I do know that there are people using this stuff! I know there are people using Hypothesis + Django reasonably heavily, but I think the number of people using model introspection must be way lower than I thought if this is the first time we've seen this.

@Zac-HD
Copy link
Member

Zac-HD commented Feb 14, 2018

Notes dump:

  • Django has a bunch of default validators, which have been pretty stable for a long time. We can therefore make the field strategies substantially more powerful by introspecting validators as well as the field type.
  • For some (eg RegexValidator, EmailValidator), we'll choose a different base strategy
  • For most others, we can choose better arguments to the strategy based on the validator - eg MinValueValidator, MaxLengthValidator, etc.
  • For custom (and some builtin) validators, filtering will remain the only way. We should keep it in all cases as a final check for correctness.
  • We should resolve the strategy type and default kwargs for the given field type, then resolve kwargs from the validators (min_value, max_value, min_length, etc), possibly switch strategy type, and reconcile the arguments to give the actual strategy.
  • We should do some sanity-checks here too, eg for empty size ranges or multiple RegexValidators. It's technically possible to compute the intersection of n regular expressions by analysis as finite automata, but that doesn't admit backreferences (ie PCRE) and is IMO out of scope for this. Simpler lookahead-based combination of the patterns is equivalent to just filtering from a Hypothesis generation perspective, so that doesn't help.
  • Documentation should be clearer on how to override the strategy for a particular field, but also suggest careful use of validators once they're supported.

This is a substantial new feature, but it would be really really nice to be able to generate instances of the default user model (!!), and the principle unlocks a massive improvement in specificity of inference and I assume real-world usefulness. If anyone else is using it in the real world 😕

@Zac-HD Zac-HD added the enhancement it's not broken, but we want it to be better label Feb 14, 2018
@Zac-HD
Copy link
Member

Zac-HD commented Feb 14, 2018

I suggest we put together a minimal fix - just improving the docs and handling RegexValidator for CharField and TextField - and merge that, leaving a new issue (linked from said docs) with more detailed notes on the larger body of work for external or funded contributions.

@DRMacIver
Copy link
Member

I suggest we put together a minimal fix - just improving the docs and handling RegexValidator for CharField and TextField - and merge that, leaving a new issue (linked from said docs) with more detailed notes on the larger body of work for external or funded contributions.

👍 with the possible caveat that it would be really nice to support the default User object and I bet we'll find some other problems when we try to do that. But I agree that just supporting the minimal amount to make that work is 100% the right starting point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something is clearly wrong here docs documentation could *always* be better enhancement it's not broken, but we want it to be better
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants