Clarify the purpose of pydantic e.g. not strict validation #578

Closed
1 of 4 tasks
samuelcolvin opened this issue Jun 5, 2019 · 10 comments · Fixed by #856

Comments

@samuelcolvin
Owner

@samuelcolvin samuelcolvin commented Jun 5, 2019

ref #284 and #360

Pydantic is not a validation library, it's a parsing library.

It makes no guarantee whatsoever about checking the form of data you give to it; instead it makes a guarantee about the form of data you get out of it.

This sounds like an esoteric difference, but it has real practical consequences, e.g.:

  • if you pass "3" (which is not an int) to an int field, pydantic will convert it to an int
  • if you pass 3.14 (which again is not an int) to an int field, pydantic will convert it to an int (thereby "losing information")
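
To make the distinction concrete, here is a minimal plain-Python sketch. It is not pydantic's implementation; `parse_int` and `validate_int_strict` are hypothetical names used only for illustration. The first function guarantees the type of what comes out, the second checks what goes in.

```python
def parse_int(value):
    """Parsing: accept anything int() can convert; the result is always an int."""
    return int(value)


def validate_int_strict(value):
    """Strict validation: reject anything that is not already an int."""
    if type(value) is not int:  # also rejects bool, which is a subclass of int
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value


print(parse_int("3"))    # 3
print(parse_int(3.14))   # 3, the fractional part is dropped

try:
    validate_int_strict("3")
except TypeError as exc:
    print(f"rejected: {exc}")
```

Under this sketch, the parsing function mirrors the two bullets above, while the strict function would refuse both inputs.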

I think this is correct and I'm not interested in changing it, but we should be clear about what pydantic is/does. While I was annoyed by the manner of the question in #284 (sorry, wrong issue, I meant #360), I do understand the motivation for the question.

To fix this we should:

  • add the above statement prominently to the docs
  • rename validate_model to parse_model (this will require the old version to continue to work with a deprecation warning for 2 versions)
  • rename __get_validators__ to __get_parser_functions__ or similar (again will require the old function to continue to work with a deprecation warning)
  • remove use of "validate"/"validation" from the docs
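
The "old name keeps working with a deprecation warning" steps in the list above could look roughly like the following. This is only a sketch of the general pattern using the standard library `warnings` module. The function bodies and the dict-based `schema` are stand-ins invented for the example, and `parse_model` is the hypothetical new name proposed above, not an existing pydantic function.

```python
import warnings


def parse_model(model, data):
    """Hypothetical new name; this body is a stand-in, not pydantic's logic."""
    return {name: convert(data[name]) for name, convert in model.items()}


def validate_model(model, data):
    """Old name kept as a thin alias that warns and then delegates."""
    warnings.warn(
        "validate_model is deprecated, use parse_model instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return parse_model(model, data)


schema = {"age": int}  # stand-in for a model: field name -> converter

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = validate_model(schema, {"age": "3"})

print(result)
print(caught[0].category.__name__)
```

Existing callers keep working and see a warning, which is what makes it possible to remove the old name a couple of versions later.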

This is a big, backwards incompatible change for no material benefit but I think it's worth it for clarity.

@gangefors
Contributor

@gangefors gangefors commented Jun 5, 2019

I have no objection to this renaming. I also think it's a better fit for pydantic use-cases.

We use it as an advanced dataclass so that we can ensure that the data we want to save in our DB is consistent with the data model we have specified. We also use it to mangle messages sent on a queue into a data model that we can easily use in our code and store it as json documents that adhere to the specifications of the models we have written.

We were using cerberus but it was way harder to read the model, and we really didn't need to validate the incoming data as long as it was convertible into the data type we wanted in the model. We can even mangle the data in really advanced ways using for example pre-parsing.

tl;dr:
I support this.

@tiangolo
Sponsor Collaborator

@tiangolo tiangolo commented Jun 5, 2019

Agreed. Makes sense.

@dmontagu
Collaborator

@dmontagu dmontagu commented Jul 16, 2019

This was probably implied, but since it wasn't specifically called out in the checkboxes, I wanted to add just in case:

  • the config values referencing validate should probably also be changed to say parse (namely validate_all and validate_always).

@samuelcolvin
Owner Author

@samuelcolvin samuelcolvin commented Sep 18, 2019

I've decided to chill out on this a lot. I'm going to add roughly the above warning to the docs, but avoid all the renaming.

"Validate" seems to be much more commonly used than "parse". Also, one step of parsing is validating, so like it or not, pydantic does do validation. It's just that its primary purpose isn't validation.

@boydgreenfield

@boydgreenfield boydgreenfield commented Sep 19, 2019

@samuelcolvin As a casual Pydantic repo follower and light user, this issue and some of these nuances caught me by surprise. I suspect some of the (not especially polite) tone in related issues stems from similar misunderstandings.

Just my 2¢, but I think changing some of the documentation language and package tagline would help tremendously here (much more so than breaking API changes). For example, the Google snippet for Pydantic starts "Data validation and settings management...", and the introductory section of the docs has validation/validate 4 times.

Adding a section that highlights "What type of validation does Pydantic perform?" or "How does Pydantic validate input data?" in the early introductory text would probably help a great deal too.

I hope that's a helpful suggestion, and thanks for the great library!

@samuelcolvin
Owner Author

@samuelcolvin samuelcolvin commented Sep 19, 2019

yes, I agree. That's basically what I decided after a long think.

@samuelcolvin samuelcolvin removed this from the Version 1 milestone Oct 1, 2019
@samuelcolvin samuelcolvin mentioned this issue Oct 4, 2019
5 tasks
@samuelcolvin samuelcolvin changed the title from 'remove use of "validation" replace with "parse"' to 'Clarify the purpose of pydantic e.g. not strict validation' Oct 10, 2019
@brianmaissy
Contributor

@brianmaissy brianmaissy commented May 11, 2020

Recently I read this blog post and it made me think of the line from the pydantic docs:

pydantic is primarily a parsing library, not a validation library

In case anyone finds their way to this thread in an attempt to understand the implications of the difference between parsing and validation, it might be an interesting read.

@PyAntony

@PyAntony PyAntony commented Oct 10, 2020

I have spent 1 hour trying to figure out if I could well use Pydantic for data validation or not. I guess the answer is YES. The following note (from the documentation): "pydantic is primarily a parsing library, not a validation library. Validation..." leads you to an UNNECESSARY rabbit hole that could well be avoided by writing something along these lines (in the same note section):
"Although validation is not the main purpose YOU CAN USE THIS LIBRARY for validation; go to the Validators section..."

It is written in the Validators section: "Custom validation ... can be achieved"! So my point is fair; this should be explicitly mentioned in that Note box (in the Models section).

@jcerjak

@jcerjak jcerjak commented Oct 14, 2020

For me the introduction in the docs was misleading:

Data validation and settings management using python type annotations.
pydantic enforces type hints at runtime, and provides user friendly errors when data is invalid.
Define how data should be in pure, canonical python; validate it with pydantic.

I had the wrong expectations for the library, and I think clarifying this would greatly help in avoiding potential issues. Apart from being surprised by implicit data parsing/conversion, I also made the mistake of not returning values in the validators (which are not only validators, but also parsers/deserializers), which led to some surprising behavior. The latter issue was resolved after reading the docs for validators, but it would have helped to have the mindset from the start that parsing is the main goal of the library, while (strict) validation is also possible.
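
The "forgot to return the value" pitfall described above can be reproduced with a simplified model in plain Python. This is an assumption-laden sketch, not pydantic's actual machinery: `run_validators` and the three functions are hypothetical, and the only property borrowed from the discussion is that a validator's return value becomes the field's new value.

```python
def run_validators(value, validators):
    """Pass the value through each function; each return value replaces the last."""
    for func in validators:
        value = func(value)
    return value


def strip_name(value):
    # Acts as a parser: transforms the input and returns the new value
    return value.strip()


def check_not_empty(value):
    # Acts as a validator, but must still hand the value back
    if not value:
        raise ValueError("name must not be empty")
    return value


def check_not_empty_forgets_return(value):
    if not value:
        raise ValueError("name must not be empty")
    # Bug: no `return value`, so the function implicitly returns None


good = run_validators("  Ada ", [strip_name, check_not_empty])
bad = run_validators("  Ada ", [strip_name, check_not_empty_forgets_return])
print(repr(good))  # 'Ada'
print(repr(bad))   # None
```

In this model the check itself passes in both cases; the value is lost only because the second chain ends in a function that returns nothing.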

@cknoll

@cknoll cknoll commented Oct 18, 2020

For those who, like me, stumble on this discussion and want to determine if and how pydantic can be used for strict validation: after reading through various issues, examples/types_strict.py was what helped me a lot. IMHO it should be mentioned in the models/#data-conversion section.

Suggestion:

This is a deliberate decision of *pydantic*, and in general it's the most useful approach, see 
[here](https://github.com/samuelcolvin/pydantic/issues/578) for a longer discussion of the subject.
Nevertheless strict type checking is supported. See [examples/types_strict.py](docs/examples/types_strict.py).

PhilippeGalvan pushed a commit to PhilippeGalvan/python-binance-profit that referenced this issue Feb 4, 2021
Refactor some expressions
Update validation and enforce strict int validation (see: samuelcolvin/pydantic#578 and https://pydantic-docs.helpmanual.io/usage/models/\#data-conversion)