Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constrain inferred Django field strategies based on validators before filtering #1116

Closed
Zac-HD opened this issue Feb 15, 2018 · 7 comments
Closed

Comments

@Zac-HD
Copy link
Member

@Zac-HD Zac-HD commented Feb 15, 2018

Currently, we infer strategies for Django fields almost solely based on the field type, then filter that strategy by calling any attached validators.

Unfortunately, this can be highly inefficient (eg #1112; could not generate default User model). Instead of inferring a strategy from the field type alone, we should also choose arguments to the strategy by inspecting any validators. For some (eg RegexValidator) we might even choose a different strategy.

@Zac-HD
Copy link
Member Author

@Zac-HD Zac-HD commented Feb 28, 2018

Notes dump:

  • Django has a bunch of default validators, which have been pretty stable for a long time. We can therefore make the field strategies substantially more powerful by introspecting validators as well as the field type.
  • For some (eg RegexValidator, EmailValidator), we'll choose a different base strategy
  • For most others, we can choose better arguments to the strategy based on the validator - eg MinValueValidator, MaxLengthValidator, etc.
  • For custom (and some builtin) validators, filtering will remain the only way. We should keep it in all cases as a final check for correctness.
  • We should resolve the strategy type and default kwargs for the given field type, then resolve kwargs from the validators (min_value, max_value, min_length, etc), possibly switch strategy type, and reconcile the arguments to give the actual strategy.
  • Multiple RegexValidators are tricky: The current solution is to take the union of the strategies for each regex, which does works if one generates output that usually passes the others. It's possible to compute the intersection of regular expressions (not PCRE) by treating them as finite automata using greenery, but this is well out of scope.
  • Documentation should suggest use of validators derived from Django's as an aid to Hypothesis as well as general correctness.
  • We'll want to overhaul strategy registration to match this, too - allow strategy objects, but prefer functions for numeric and string types where we can do validator-based constraints.
    • Check that functions can only be registered for string or numeric fields (because we can't usefully inspect others)
    • Check that functions have the argument names we expect (eg min_size, max_value) so we can actually use our inspection results

@Zac-HD
Copy link
Member Author

@Zac-HD Zac-HD commented Mar 6, 2018

Another note to self: after this is done, we could work with Simon to integrate hypothesis-drf and support the Django REST Framework.

Technically possible before, but it would be much easier to support this well if we get a solid field-level inference + validation + registration system going first. At least if we bear that in mind, and thus this note!

@Zac-HD
Copy link
Member Author

@Zac-HD Zac-HD commented Aug 29, 2018

Based on chatting with a few Django people at the PyCon Australia sprints, validators are rarely used for model fields but almost always for forms (see our oldest open issue, #35). We can add them in either order, but this is low-priority before a forms() function exists to use it.

It is almost certainly not worth the code complexity to support a public API for customisable validator inference; we should just support numeric and string validators and stop there.

@Zac-HD
Copy link
Member Author

@Zac-HD Zac-HD commented Jan 17, 2019

#1743 made this quite a bit easier, by cleaning up the from_field API.

  • In _for_text, get the Min and MaxLengthValidator (if any) and if relevant use the bounding values in place of the field-level min_size and max_size arguments. This only applies to fields without a RegexValidator.
  • For numeric types, write a registering function instead of writing static strategies into the lookup table. I would use a helper function _get_bounds_from_field(field, min_=None, max_=None) to factor out all the logic about getting the validator(s) and comparing bounds into a single place.

@TheBeans
Copy link

@TheBeans TheBeans commented Apr 24, 2019

@Zac-HD would love an update on this feature. Deciding on what testing strategy to use with a new django project and would like to try Hypothesis out.

@Zac-HD
Copy link
Member Author

@Zac-HD Zac-HD commented Apr 25, 2019

Definitely don't let this stop you!

This issue is a nice-to-have which would make automatically created strategies more efficient in some cases. We already infer strategies and they're guaranteed to be valid, plus you can supply an explicit strategy for any field you like if the automatic ones aren't quite what you want (e.g. only positive numbers in a field that allows negatives).

@auvipy
Copy link

@auvipy auvipy commented Dec 16, 2019

Hi, I would like to start working on it :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

3 participants