Parametrized generics validation is broken/inconsistent #4161

Closed
3 tasks done
brisvag opened this issue Jun 14, 2022 · 2 comments
Labels
bug V1 Bug related to Pydantic V1.X

Comments

brisvag commented Jun 14, 2022

Checks

  • I added a descriptive title to this issue
  • I have searched (google, github) for similar issues and couldn't find anything
  • I have read and followed the docs and still think this is a bug

Bug

Field validation treats Generics differently depending on a series of issubclass checks.

For example, here is the List section:
https://github.com/samuelcolvin/pydantic/blob/8846ec4685e749b93907081450f592060eeb99b1/pydantic/fields.py#L659-L668

and here is the Sequence section:

https://github.com/samuelcolvin/pydantic/blob/8846ec4685e749b93907081450f592060eeb99b1/pydantic/fields.py#L692-L694

As you can see, the two are treated differently (for a reason I couldn't figure out from the docs). Also, in both cases, if the generic is parametrized, the actual type hint is discarded and only the generic type is kept: the Field.type_ attribute (which normally holds the "full" type hint) is replaced with the type of the subfield, while the fact that it's a sequence-like (or list-like) object is stored in the Field.shape attribute. I don't understand why that is the case, when subfields are used effectively elsewhere. As far as I understand, this should simply use a subfield for the item type and store the actual type hint in type_.
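To make the "keep the full hint" idea concrete, here is a minimal standard-library sketch of splitting a parametrized hint into its outer class and its item types without losing either part. It is not pydantic's code: `MySeq` and `split_hint` are hypothetical names introduced only for this illustration.

```python
from typing import Any, List, Sequence, Tuple, TypeVar, get_args, get_origin

T = TypeVar("T")


class MySeq(Sequence[T]):
    """Hypothetical Sequence subclass, used only for this illustration."""

    def __init__(self, items):
        self.items = list(items)

    def __getitem__(self, i):
        return self.items[i]

    def __len__(self):
        return len(self.items)


def split_hint(hint: Any) -> Tuple[Any, Tuple[Any, ...]]:
    """Return (outer type, item types) without discarding either part."""
    origin = get_origin(hint)
    if origin is None:
        # Bare class such as MySeq: there are no parameters to extract.
        return hint, ()
    return origin, get_args(hint)


# The outer class survives next to the item type instead of being replaced by it.
print(split_hint(MySeq[float]))
print(split_hint(List[int]))
```

With both parts available, the item types could feed a subfield while the outer class remains the one that validates and constructs the final value.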

Here's an example (with a simple List and Sequence subclass) to show the consequences of these problems:

from typing import List, Sequence
from pydantic import BaseModel


class S(Sequence):
    def __init__(self, v):
        self.v = v
    def __getitem__(self, i):
        return self.v[i]
    def __len__(self):
        return len(self.v)

    @classmethod
    def __get_validators__(cls):
        yield cls.v
    @classmethod
    def v(cls, v):
        print(f'validating Sequence {v}')
        return cls(v)


class L(List):
    def __init__(self, v):
        self.v = v
    def __getitem__(self, i):
        return self.v[i]
    def __len__(self):
        return len(self.v)

    @classmethod
    def __get_validators__(cls):
        yield cls.v
    @classmethod
    def v(cls, v):
        print(f'validating List {v}')
        return cls(v)


class M(BaseModel):
    class Config:
        arbitrary_types_allowed = True
        validate_all = True

    l1: L = [1]
    l2: L[float] = [2]
    s1: S = [3]
    s2: S[float] = [4]

m = M()

First of all, you will see that one validator (the parametrized Sequence one) is not firing, because the validators were not collected, and because the type was forgotten on Field creation:

validating List [1]
validating List [2]
validating Sequence [3]
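The missing fourth line is consistent with the validators being looked up on something other than the user's class. As a rough sketch of the lookup one might expect, assuming validators are discovered through a `__get_validators__` hook as in the example above, the parametrized hint can be unwrapped to its origin first. `Tagged` and `class_validators` are hypothetical names, and this is not how pydantic implements it.

```python
from typing import Any, Callable, List, Sequence, TypeVar, get_origin

T = TypeVar("T")


class Tagged(Sequence[T]):
    """Hypothetical sequence type exposing a v1-style validator hook."""

    def __init__(self, items):
        self.items = list(items)

    def __getitem__(self, i):
        return self.items[i]

    def __len__(self):
        return len(self.items)

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, value):
        # Pass instances through unchanged, wrap anything else.
        return value if isinstance(value, cls) else cls(value)


def class_validators(hint: Any) -> List[Callable[[Any], Any]]:
    """Collect validators from the hint's outer class, parametrized or not."""
    outer = get_origin(hint) or hint
    hook = getattr(outer, "__get_validators__", None)
    return list(hook()) if hook is not None else []


# Both the bare and the parametrized hint resolve to the same hook.
validate = class_validators(Tagged[float])[0]
print(type(validate([4])).__name__)
```

Under this lookup the bare and the parametrized form yield the same validator, which is the behaviour the example above expects for `s2`.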

You will also see that the types of both the parametrized objects are wrong, despite the fact that L.v returns the correct type (which, I guess, must be overridden by pydantic later), and that the field did indeed forget about the type:

for name, field in m.__fields__.items():
    print(field, getattr(m, name).__class__)

# name='l1' type=L required=False default=[1] <class '__main__.L'>
# name='l2' type=List[float] required=False default=[2] <class 'list'>
# name='s1' type=S required=False default=[3] <class '__main__.S'>
# name='s2' type=Sequence[float] required=False default=[4] <class 'list'>

Why is this the case? Is it a bug or intended behaviour?

And how would I go about fixing this in pydantic, or at least working around it so my Sequence object gets both validated and type-coerced properly?

PS: Passing an object that already has the right type (in this case S) to the constructor fails with a rather cryptic error. I think this is ultimately caused by the function linked below, which should probably check isinstance(v, Sequence).

https://github.com/samuelcolvin/pydantic/blob/8846ec4685e749b93907081450f592060eeb99b1/pydantic/utils.py#L153-L154

m = M(l1=L([1]), l2=L([2]), s1=S([3]), s2=S([4]))
validating List []
validating List []
validating Sequence <__main__.S object at 0x7f3134c258a0>
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 m = M(l1=L([1]), l2=L([2]), s1=S([3]), s2=S([4]))

File ~/git/pydantic/pydantic/main.py:341, in BaseModel.__init__(__pydantic_self__, **data)
    339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
    340 if validation_error:
--> 341     raise validation_error
    342 try:
    343     object_setattr(__pydantic_self__, '__dict__', values)

ValidationError: 1 validation error for M
s2
  value is not a valid sequence (type=type_error.sequence)
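The check suggested in the PS can be sketched with the standard library alone. The allowlist in `is_builtin_sequence_like` is an assumption about the shape of the linked function, not a copy of it, and `is_sequence_like` is a hypothetical alternative that also accepts anything registered as a `collections.abc.Sequence`, excluding strings and bytes.

```python
from collections import deque
from collections.abc import Sequence
from types import GeneratorType


class S(Sequence):
    """Same shape as the S class above, without the validator hook."""

    def __init__(self, v):
        self.v = v

    def __getitem__(self, i):
        return self.v[i]

    def __len__(self):
        return len(self.v)


def is_builtin_sequence_like(v) -> bool:
    # Assumed allowlist of concrete types; custom Sequence subclasses fail it.
    return isinstance(v, (list, tuple, set, frozenset, GeneratorType, deque))


def is_sequence_like(v) -> bool:
    # Hypothetical alternative: also accept any Sequence, but not text or bytes.
    if isinstance(v, (str, bytes, bytearray)):
        return False
    return is_builtin_sequence_like(v) or isinstance(v, Sequence)


print(is_builtin_sequence_like(S([4])), is_sequence_like(S([4])))
```

Excluding `str`, `bytes` and `bytearray` matters because they are sequences too, and treating them as item containers is rarely what a field typed as a sequence of numbers wants.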

References

Here are several past issues that found this problem, but were not addressed or dropped:


Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.9.1
            pydantic compiled: True
                 install path: /usr/lib/python3.10/site-packages/pydantic
               python version: 3.10.5 (main, Jun  6 2022, 18:49:26) [GCC 12.1.0]
                     platform: Linux-5.18.2-arch1-1-x86_64-with-glibc2.35
     optional deps. installed: ['typing-extensions']

mjog commented Feb 11, 2023

Possibly also related: #4121

dmontagu (Contributor) commented

In v2, here's an example of how to make a custom subclass of Sequence that does approximately the right thing (should be similar for List):

from typing import Sequence, TypeVar, Any, Callable, get_args

from pydantic_core import core_schema, ValidationError

from pydantic import BaseModel

T = TypeVar('T')


class MySequence(Sequence[T]):
    def __init__(self, v: Sequence[T]):
        self.v = v

    def __getitem__(self, i):
        return self.v[i]

    def __len__(self):
        return len(self.v)

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: Any, handler: Callable[[Any], core_schema.CoreSchema]
    ) -> core_schema.CoreSchema:
        instance_schema = core_schema.is_instance_schema(cls)

        args = get_args(source)
        if args:
            sequence_t_schema = handler(Sequence[args[0]])
        else:
            sequence_t_schema = handler(Sequence)

        non_instance_schema = core_schema.general_after_validator_function(
            lambda v, i: MySequence(v), sequence_t_schema
        )
        return core_schema.union_schema([instance_schema, non_instance_schema])


class M(BaseModel):
    model_config = dict(validate_default=True)

    s1: MySequence = [3]


m = M()
print(m)
#> s1=<__main__.MySequence object at 0x103da07c0>
print(m.s1.v)
#> [3]


class M(BaseModel):
    s1: MySequence[int]

M(s1=[1])
try:
    M(s1=['a'])
except ValidationError as exc:
    print(exc)
    """
    2 validation errors for M
    s1.is-instance[MySequence]
      Input should be an instance of MySequence [type=is_instance_of, input_value=['a'], input_type=list]
    s1.function-after[<lambda>(), chain[is-instance[Sequence],function-wrap[sequence_validator(), list[int]]]].0
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
    """

I'm sure the error messages can be cleaned up with a bit of elbow grease. If that's a problem, please create a new issue requesting help with that.

(Also, for what it's worth, we do plan to improve the docs for custom type creation including more examples, but it's not there yet. I've created a note in our documentation project tracker to remind us to add this.)
