Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validators in custom subclasses of Mapping are not called #1457

Closed
seansfkelley opened this issue Apr 29, 2020 · 6 comments
Closed

validators in custom subclasses of Mapping are not called #1457

seansfkelley opened this issue Apr 29, 2020 · 6 comments
Assignees
Labels
change Suggested alteration to pydantic, not a new feature nor a bug help wanted Pull Request welcome

Comments

@seansfkelley
Copy link

seansfkelley commented Apr 29, 2020

Bug

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.5.1
            pydantic compiled: True
                 install path: /Users/seankelley/Library/Caches/pypoetry/virtualenvs/cl-pydantic-ygGyMK5A-py3.8/lib/python3.8/site-packages/pydantic
               python version: 3.8.2 (default, Mar 30 2020, 17:42:05)  [Clang 10.0.1 (clang-1001.0.46.4)]
                     platform: macOS-10.14.6-x86_64-i386-64bit
     optional deps. installed: ['typing-extensions']

I'm defining a FrozenOrderedDict (need to support hashable dicts for Reasons), and I want to give it Pydantic support. Here's the implementation at the moment:

from typing import TypeVar, Hashable, OrderedDict, Mapping

K = TypeVar("K", bound=Hashable)
V = TypeVar("V", bound=Hashable)


class FrozenOrderedDict(Mapping[K, V]):
    # snip: skipping __init__ and the rest of the Mapping interface

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def __modify_schema(cls):
        pass

    @classmethod
    def validate(cls, v):
        if isinstance(v, Mapping):
            return FrozenOrderedDict(v)
        else:
            raise TypeError(
                f"cannot construct a FrozenOrderedDict from non-Mapping value of type {type(v).__name__}"
            )

I wrote a couple tests for this.

def test_should_be_coercible_by_pydantic_from_normal_dict(self):
    class M(BaseModel):
        d: FrozenOrderedDict

    m = M(d=dict(foo="bar"))

    # This test passes -- we properly convert the type.
    self.assertIsInstance(m.d, FrozenOrderedDict)
    self.assertEqual(m.d, FrozenOrderedDict(foo="bar"))

def test_should_enforce_key_and_value_types_for_generic_frozenordereddict(self):
    class M(BaseModel):
        d: FrozenOrderedDict[int, int]

    with self.assertRaisesRegex(ValidationError, "not a valid integer"):
        M(d={"foo": 100})

    with self.assertRaisesRegex(ValidationError, "not a valid integer"):
        M(d={100: "foo"})

    m = M(d={100: 200})

    # This test fails here. We do validation, above, but don't coerce the type.
    self.assertIsInstance(m.d, FrozenOrderedDict)
    self.assertEqual(m.d, FrozenOrderedDict({100: 200}))

As far as I can tell, specifying type parameters hits a different code path: Pydantic no longer sees some duck-type that adheres to __get_validators__ and instead sees a generic type alias that descends from Mapping, and does its own validation. To go out on a limb, I'm guessing it's that

https://github.com/samuelcolvin/pydantic/blob/3cd8b1ee2d5aac76528dbe627f40fe1c27bf59f6/pydantic/fields.py#L464

comes before

https://github.com/samuelcolvin/pydantic/blob/3cd8b1ee2d5aac76528dbe627f40fe1c27bf59f6/pydantic/fields.py#L476

or something along those lines.

I tried some workarounds, most notably including not descending from Mapping but instead just Generic and implementing the recursive validation myself. This wasn't too bad on the Pydantic side (good support for inspecting and validating generic type parameters!), but it caused other problems around interop with dict and failures of duck-typing. And it was just weird to have a mapping that wasn't a Mapping.

How can I descend from generic Mapping and support validation that respects type parameters, but still have my own (additional) validators be used? I don't mind implementing the recursive step for key/value validation, if necessary (though of course, it's nice if I don't have to, like now).

I took a long look at #380 but could not see how it applies to my case despite apparent similarity.

@seansfkelley seansfkelley added the bug V1 Bug related to Pydantic V1.X label Apr 29, 2020
@samuelcolvin
Copy link
Member

I think your guess at the problem is probably (without digging in to it in detail now) spot on.

The short answer therefore is probably "you can't".

I'd be open to changing this (e.g. put the hasattr(origin, '__get_validators__') check before the issubclass(origin, Mapping) check) if:

  • the code change wasn't too massive
  • it didn't cause any performance regression
  • it didn't break any existing tests that suggested it would have some kind of backwards incompatibile effect - I imagine it wouldn't

If it does become clear it's backwards incompatible it would need to wait for v2.

@samuelcolvin samuelcolvin added change Suggested alteration to pydantic, not a new feature nor a bug help wanted Pull Request welcome and removed bug V1 Bug related to Pydantic V1.X labels Apr 30, 2020
@seansfkelley
Copy link
Author

Got it, thank you! A quick hack on my installed Pydantic to reorder those branches allows the tests to pass, and I have to implement the recursive checking myself. That means that it's technically a breaking change since mapping types will no longer be validated automatically if you define a __get_validators__ on them, but then again, those validators were never being called, so presumably nobody is relying on that.

Design question: assuming this can be done backwards-compatibly, I'm thinking the best approach would be to check for __get_validators__ before any of the other various generics there (list/set/mapping, etc.), and then exposing the various utility methods like _validate_mapping or _validate_tuple so they can be called something like:

@classmethod
def validate(cls, v):
    if isinstance(v, Mapping):
        v, errors = validate_mapping(v)
        if errors:
            raise ValidationError(errors, v)
        else:
            return FrozenOrderedDict(v)
    else:
        raise TypeError(
            f"cannot construct a FrozenOrderedDict from non-Mapping value of type {type(v).__name__}"
        )

Or do you think that that's too much API surface area, and would rather people follow the pattern for this outlined in the docs?

My quick and dirty validate now looks like:

@classmethod
def validate(cls, v, field: ModelField):
    if isinstance(v, Mapping):
        if field.sub_fields:
            validated_dict = {}
            errors = []

            key_type, value_type = field.sub_fields

            # Based on https://pydantic-docs.helpmanual.io/usage/types/#generic-classes-as-types.
            for k, v in v.items():
                key_result, key_error = key_type.validate(k, {}, loc="__key__")
                if key_error:
                    errors.append(key_error)

                value_result, value_error = value_type.validate(v, {}, loc=str(k))
                if value_error:
                    errors.append(value_error)

                if not errors:
                    validated_dict[key_result] = value_result

            if errors:
                raise ValidationError(errors, cls)
            else:
                return FrozenOrderedDict(validated_dict)
        else:
            return FrozenOrderedDict(v)
    else:
        raise TypeError(
            f"cannot construct a FrozenOrderedDict from non-Mapping value of type {type(v).__name__}"
        )

@samuelcolvin
Copy link
Member

Design question: assuming this can be done backwards-compatibly, I'm thinking the best approach would be to check for get_validators before any of the other various generics there (list/set/mapping, etc.), and then exposing the various utility methods like _validate_mapping or _validate_tuple so they can be called something like:

Humm, difficult.

One option would be to insert validators from the class as either pre or post "whole" validators. This would mean that the standard validation (including recursive checking) would still be called, just your validators would be called before or after that entire process. - This is probably be easier than what you're suggesting.

My main reservation is that I an idea for quite a big rewrite of validators in v2 - I don't want you to do lots of work on this (and perhaps break some edge case usage) then change it all again in v2 which I'm hoping to work on fairly soon.

I haven't yet sat down and written out a description of the new validators, I'll do that in the next few days and put a link here.

@seansfkelley
Copy link
Author

Sounds good, I'll wait for your thoughts here!

@hmvp
Copy link
Contributor

hmvp commented Sep 29, 2020

As a workaround I currently use this:

    @classmethod
    def __get_validators__(cls) -> Generator:  # noqa: B902
        yield from super().__get_validators__()
        ...  # your own extra validators

@adriangb
Copy link
Member

This now works well in V2:

from typing import Any, Dict, Hashable, TypeVar

from pydantic_core import CoreSchema, core_schema
from typing_extensions import get_args, get_origin

from pydantic.annotated import GetCoreSchemaHandler

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class FrozenOrderedDict(Dict[K, V]):
    @classmethod
    def __get_pydantic_core_schema__(cls, source_type: Any, handler: GetCoreSchemaHandler) -> CoreSchema:
        origin = get_origin(source_type)
        if origin is None:
            key_type = value_type = Any
            origin = source_type
        else:
            key_type, value_type = get_args(source_type)
        return core_schema.no_info_after_validator_function(
            FrozenOrderedDict,
            handler.generate_schema(Dict[key_type, value_type])
        )

    def __repr__(self) -> str:
        return f'{self.__class__.__name__}({super().__repr__()})'


from pydantic import BaseModel


class Model(BaseModel):
    x: FrozenOrderedDict[str, int]


print(Model(x={'a': '1'}).x)
#> FrozenOrderedDict({'a': 1})  # note that value was correctly coerced!

I don't think we'll be able to fix this in V1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
change Suggested alteration to pydantic, not a new feature nor a bug help wanted Pull Request welcome
Projects
None yet
Development

No branches or pull requests

5 participants