Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC-104: add note about preserving input types to migration guide #6135

Merged
merged 5 commits into from Jun 15, 2023

Conversation

adriangb
Copy link
Member

@adriangb adriangb commented Jun 14, 2023

Selected Reviewer: @samuelcolvin

@linear
Copy link

linear bot commented Jun 14, 2023

DOC-104 Migration Guide: Generic protocol annotations don't always preserve original type

(Maybe add to PR #6006)

@adriangb
Copy link
Member Author

please review

@cloudflare-pages
Copy link

cloudflare-pages bot commented Jun 14, 2023

Deploying with  Cloudflare Pages  Cloudflare Pages

Latest commit: d5f863e
Status: ✅  Deploy successful!
Preview URL: https://b2f3c1ad.pydantic-docs2.pages.dev
Branch Preview URL: https://doc-104.pydantic-docs2.pages.dev

View logs

Comment on lines 183 to 190
### Input types are not preserved

In Pydantic V1 we made an effort to preserve the input type.
For example, given the annotation `Mapping[str, int]` if you passed in a `collection.Counter()` you'd get a `collection.Counter()` as the value.
In Pydantic V2 we no longer promise we will preserve the input type (although we may in some cases).
Instead, we only promise that the output type will match the type annotations.
In this case, we promise it will be a valid `Mapping`, so it could be a plain Python dict.
If you want the output type to be a specific type, consider annotating it as such.
Copy link
Contributor

@dmontagu dmontagu Jun 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### Input types are not preserved
In Pydantic V1 we made an effort to preserve the input type.
For example, given the annotation `Mapping[str, int]` if you passed in a `collection.Counter()` you'd get a `collection.Counter()` as the value.
In Pydantic V2 we no longer promise we will preserve the input type (although we may in some cases).
Instead, we only promise that the output type will match the type annotations.
In this case, we promise it will be a valid `Mapping`, so it could be a plain Python dict.
If you want the output type to be a specific type, consider annotating it as such.
### Generic container input types are not always preserved
In Pydantic V1 we made great efforts to preserve the types of all field inputs, even when they were proper subtypes of the field annotations.
For example, given the annotation `Mapping[str, int]` if you passed in a `collection.Counter()` you'd get a `collection.Counter()` as the value.
Due to the complexity and performance implications of supporting this while also applying validation to the contained values, in Pydantic V2 we no longer promise that we will always preserve the input type (although we still try to where possible). Instead, we now only promise that the post-validation value for a field will match its annotation.
In the case of `Mapping[str, int]` for example, if you pass in a `collection.Counter()`, the validated value might just be a plain Python dict. (Note that we still promise the post-validation value will be a valid `Mapping`.)
If you want the output type to be a specific type, consider annotating it as such, or add a `mode='wrap'` validator function that ensures the final result of validation will have the type you want.

I think without a bit more detail about the intent, this section may sound a bit disturbing to users, as it seems like something that intuitively you'd expect to be the case, and as someone new to the issue/consideration it's a bit hard to think through all the possible places in your app you might have subclasses passed to annotated fields.

I took what you wrote and just tried to soften it a bit, saying:

  • It only applies to generic containers (I think this is true, at least in intent..) — I think this is important or you might take it to mean that it could apply to model fields, which would be a much more significant breaking change.
  • We still view it as better to preserve the input type where possible, just, not doing that under all circumstances has benefits, and we want to take advantage of those
  • You can implement more carefully type-preserving behavior yourself using custom validators.

Feel free to take/leave any of the suggested changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant to say "we'll try": there are going to be a lot of requests for preserving it for various types, and we end up doing a bunch of special casing, etc. We currently don't preserve anything at all. I get why a user with a Counter might request it but they probably doing realize (1) the performance implications for their specific case (adding a functional validator vs. doing everything in Rust) and (2) the performance implications for all other cases (e.g. every dict validation will be slower because it necessarily has to do an instance check or two)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I'm okay dropping that point, what you say makes sense. I think if we don't say "we'll try" then it's more important to make it clear that the scope of this point is generic containers (but is that actually right?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like that's not actually right:

from uuid import UUID, uuid4

from pydantic import BaseModel


class MyUUID(UUID):
    pass


class Model1(BaseModel):
    x: UUID

assert type(Model1(x=MyUUID(str(uuid4())))) is MyUUID


class MyString(str):
    pass


class Model2(BaseModel):
    x: str

assert type(Model2(x=MyString('foo'))) is MyString

Both of these assertions fail in both V1 and V2.

I'm going to change the wording of this to be even stronger and emphasize that we're not even going to try.

But I do like your point about explaining how you can do that with custom validators if you want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest we at least say "we don't promise to in all cases" (and/or we make things fail for proper subtypes of standard types, the way we do for proper subtypes of generic types) instead of "we don't even try" because we do try for the very-important-case of BaseModel subclasses, that was one of the specific things we got working

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair enough we can add that in a followup PR I guess

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise LGTM, please update.

docs/migration.md Show resolved Hide resolved
docs/migration.md Outdated Show resolved Hide resolved
@adriangb
Copy link
Member Author

please review

Copy link
Contributor

@tpdorsey tpdorsey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggested edits.

### Input types are not preserved

In Pydantic V1 we made great efforts to preserve the types of all field inputs for generic collections when they were proper subtypes of the field annotations.
For example, given the annotation `Mapping[str, int]` if you passed in a `collection.Counter()` you'd get a `collection.Counter()` as the value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
For example, given the annotation `Mapping[str, int]` if you passed in a `collection.Counter()` you'd get a `collection.Counter()` as the value.
For example, given the annotation `Mapping[str, int]`, if you passed in a `collection.Counter()`, you'd get a `collection.Counter()` as the value.

In Pydantic V1 we made great efforts to preserve the types of all field inputs for generic collections when they were proper subtypes of the field annotations.
For example, given the annotation `Mapping[str, int]` if you passed in a `collection.Counter()` you'd get a `collection.Counter()` as the value.

Supporting this behavior in V2 would have negative performance implications for the general case (we'd have to check types every time) and would add a lot of complexity to validation. Further, even in V1 this behavior was inconsistent and partially broken: it did not work for most types (`str`, `UUID`, etc.) and even for generic collections it's impossible to re-build the original input correctly without a lot of special casing (consider `ChainMap`; rebuilding the input is necessary because we need to replace values after validation, e.g. if coercing strings to ints).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Supporting this behavior in V2 would have negative performance implications for the general case (we'd have to check types every time) and would add a lot of complexity to validation. Further, even in V1 this behavior was inconsistent and partially broken: it did not work for most types (`str`, `UUID`, etc.) and even for generic collections it's impossible to re-build the original input correctly without a lot of special casing (consider `ChainMap`; rebuilding the input is necessary because we need to replace values after validation, e.g. if coercing strings to ints).
Supporting this behavior in Pydantic V2 would have negative performance implications for the general case (we'd have to check types every time) and would add a lot of complexity to validation. Further, even in Pydantic V1, this behavior was inconsistent and partially broken: it did not work for most types (`str`, `UUID`, etc.) and even for generic collections it's impossible to re-build the original input correctly without a lot of special casing (consider `ChainMap`; rebuilding the input is necessary because we need to replace values after validation, e.g. if coercing strings to ints).


In Pydantic V2 we no longer attempt to preserve the input type.
Instead, we only promise that the output type will match the type annotations.
Going back to the `Mapping` example, we promise the output will be a valid `Mapping`, in practice it will be a plain `dict`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Going back to the `Mapping` example, we promise the output will be a valid `Mapping`, in practice it will be a plain `dict`.
Going back to the `Mapping` example, we promise the output will be a valid `Mapping`; in practice it will be a plain `dict`.

@adriangb adriangb merged commit 9d93f28 into main Jun 15, 2023
53 checks passed
@adriangb adriangb deleted the DOC-104 branch June 15, 2023 19:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants