Clarify the purpose of pydantic e.g. not strict validation #578
Pydantic is not a validation library, it's a parsing library.
It makes no guarantee whatsoever about checking the form of data you give to it; instead it makes a guarantee about the form of data you get out of it.
This sounds like an esoteric difference, but it has real practical consequences, eg.
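A rough sketch of the distinction in plain Python, assuming nothing about pydantic's internals. This is not the example originally posted in this issue, and `parse_int` and `validate_int` are hypothetical names used only for illustration:

```python
def parse_int(value):
    """Parsing: accept any input that can be converted; guarantee an int comes out."""
    return int(value)


def validate_int(value):
    """Strict validation: reject any input that is not already an int."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value


print(parse_int("123"))  # the string is converted to the int 123

try:
    validate_int("123")
except TypeError as exc:
    print(f"rejected: {exc}")
```

The parsing function says nothing about what went in, only that an `int` comes out. The strict function makes a claim about the input itself.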
I think this is correct and I'm not interested in changing it, but we should be clear about what pydantic is and does. While I was annoyed by the manner of the question in #284 (sorry, wrong issue, I meant #360), I do understand the motivation for the question.
To fix this we should:
This is a big, backwards-incompatible change for no material benefit, but I think it's worth it for clarity.
I have no objection to this renaming. I also think it's a better fit for pydantic use-cases.
We use it as an advanced dataclass so that we can ensure that the data we want to save in our DB is consistent with the data model we have specified. We also use it to mangle messages sent on a queue into a data model that we can easily use in our code and store it as json documents that adhere to the specifications of the models we have written.
We were using cerberus but it was way harder to read the model, and we really didn't need to validate the incoming data as long as it was convertible into the data type we wanted in the model. We can even mangle the data in really advanced ways using for example pre-parsing.
I've decided to chill out on this a lot. I'm going to add roughly the above warning to the docs, but avoid all the renaming.
"Validate" seems to be much more commonly used than "parse". Also, one step of parsing is validating, so like it or not, pydantic does do validation. It's just that its primary purpose isn't validation.
@samuelcolvin As a casual Pydantic repo follower and light user, this issue and some of these nuances caught me by surprise. I suspect some of the (not especially polite) tone in related issues stems from similar misunderstandings.
Just my 2¢, but I think changing some of the documentation language and the package tagline would help tremendously here (much more so than breaking API changes). For example, the Google snippet for Pydantic starts "Data validation and settings management...", and the introductory section of the docs uses "validation" or "validate" 4 times.
Adding a section that highlights "What type of validation does Pydantic perform?" or "How does Pydantic validate input data?" in the early introductory text would probably help a great deal too.
I hope that's a helpful suggestion, and thanks for the great library!