Clarify the purpose of pydantic e.g. not strict validation #578

Closed
samuelcolvin opened this issue Jun 5, 2019 · 6 comments · Fixed by #856

Comments

@samuelcolvin
Owner

commented Jun 5, 2019

ref #284 and #360

Pydantic is not a validation library, it's a parsing library.

It makes no guarantee whatsoever about checking the form of data you give to it; instead it makes a guarantee about the form of data you get out of it.

This sounds like an esoteric difference, but it has real practical consequences, e.g.:

  • if you pass "3" (which is not an int) to an int field, pydantic will convert it to an int
  • if you pass 3.14 (which again is not an int) to an int field, pydantic will convert it to an int (thereby "losing information")
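
To make the distinction concrete, here is a minimal sketch in plain Python. It is not pydantic's implementation: `parse_int` and `strict_int` are hypothetical helpers invented for illustration, and the coercion simply uses the built-in `int()`. The first function guarantees only the form of the output; the second checks the form of the input.

```python
from typing import Any


def parse_int(value: Any) -> int:
    """Parsing (hypothetical helper): coerce the input; only the output type is guaranteed."""
    return int(value)


def strict_int(value: Any) -> int:
    """Strict validation (hypothetical helper): reject anything that is not already an int."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value


print(parse_int("3"))   # 3
print(parse_int(3.14))  # 3 (the fractional part is dropped)

for raw in ("3", 3.14):
    try:
        strict_int(raw)
    except TypeError as exc:
        print(f"rejected {raw!r}: {exc}")
```

Both inputs pass through the parsing version and come out as `3`, while the strict version rejects both because neither is an int to begin with.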

I think this is correct and I'm not interested in changing it, but we should be clear about what pydantic is/does - while I was annoyed by the manner of the question in #284 (sorry, wrong issue, I meant #360) I do understand the motivation for the question.

To fix this we should:

  • add the above statement prominently to the docs
  • rename validate_model to parse_model (this will require the old version to continue to work with a deprecation warning for 2 versions)
  • rename __get_validators__ to __get_parser_functions__ or similar (again will require the old function to continue to work with a deprecation warning)
  • remove use of "validate"/"validation" from the docs
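
For reference, the "old name keeps working with a deprecation warning" pattern proposed above usually looks something like the sketch below. This is an illustration only: the function bodies are toy stand-ins, the signatures are assumptions rather than pydantic's real ones, and `parse_model` is just the name proposed in this issue.

```python
import warnings


def parse_model(data: dict) -> dict:
    """Proposed new name; the body is a toy stand-in for the real work."""
    return dict(data)


def validate_model(data: dict) -> dict:
    """Old name kept as a thin alias that warns, then delegates to the new one."""
    warnings.warn(
        "validate_model is deprecated, use parse_model instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this wrapper
    )
    return parse_model(data)
```

Callers of the old name keep getting the same result, and the warning gives them a couple of releases to switch before the alias is removed.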

This is a big, backwards-incompatible change for no material benefit, but I think it's worth it for clarity.

@samuelcolvin samuelcolvin referenced this issue Jun 5, 2019
3 of 5 tasks complete
@samuelcolvin samuelcolvin added this to the Version 1 milestone Jun 5, 2019
@samuelcolvin samuelcolvin referenced this issue Jun 5, 2019
@gangefors

Contributor

commented Jun 5, 2019

I have no objection to this renaming. I also think it's a better fit for pydantic use-cases.

We use it as an advanced dataclass so that we can ensure that the data we want to save in our DB is consistent with the data model we have specified. We also use it to mangle messages sent on a queue into a data model that we can easily use in our code and store it as json documents that adhere to the specifications of the models we have written.

We were using cerberus but it was way harder to read the model, and we really didn't need to validate the incoming data as long as it was convertible into the data type we wanted in the model. We can even mangle the data in really advanced ways using for example pre-parsing.

tl;dr:
I support this.

@tiangolo

Collaborator

commented Jun 5, 2019

Agreed. Makes sense.

@dmontagu

Collaborator

commented Jul 16, 2019

This was probably implied, but since it wasn't specifically called out in the checkboxes, I wanted to add just in case:

  • the config values referencing validate should probably also be changed to say parse (namely validate_all and validate_always).
@samuelcolvin

Owner Author

commented Sep 18, 2019

I've decided to chill out on this a lot. I'm going to add roughly the above warning to the docs, but avoid all the renaming.

"Validate" seems to be much more commonly used than "parse"; also, one step of parsing is validating, so, like it or not, pydantic does do validation. It's just that its primary purpose isn't validation.

@boydgreenfield


commented Sep 19, 2019

@samuelcolvin As a casual Pydantic repo follower and light user, this issue and some of these nuances caught me by surprise. I suspect some of the (not especially polite) tone in related issues stems from similar misunderstandings.

Just my 2¢, but I think changing some of the documentation language and package tagline would help tremendously here (much more so than breaking API changes). For example, the Google snippet for Pydantic starts "Data validation and settings management...", and the introductory section of the docs has validation/validate 4 times.

Adding a section that highlights "What type of validation does Pydantic perform?" or "How does Pydantic validate input data?" in the early introductory text would probably help a great deal too.

I hope that's a helpful suggestion, and thanks for the great library!

@samuelcolvin

Owner Author

commented Sep 19, 2019

yes, I agree. That's basically what I decided after a long think.

@samuelcolvin samuelcolvin removed this from the Version 1 milestone Oct 1, 2019
@samuelcolvin samuelcolvin referenced this issue Oct 4, 2019
5 of 5 tasks complete
@samuelcolvin samuelcolvin changed the title from 'remove use of "validation" replace with "parse"' to 'Clarify the purpose of pydantic e.g. not strict validation' Oct 10, 2019
5 participants