construct() for recursive models #1168

Closed
astariul opened this issue Jan 16, 2020 · 18 comments

@astariul

astariul commented Jan 16, 2020

Question

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.3
            pydantic compiled: True
                 install path: 
               python version: 3.6.8 (default, Dec 24 2018, 19:24:27)  [GCC 5.4.0 20160609]
                     platform: Linux-4.4.0-131-generic-x86_64-with-Ubuntu-16.04-xenial
     optional deps. installed: []

I want to use the construct() method to build a nested model faster, with trusted data. But it's not working:

from pydantic import BaseModel

class Id(BaseModel):
    id: int

class User(BaseModel):
    id: Id
    name = "test"

x = User(**{"id": {"id":2}, "name": "not_test"})
y = User.construct(**x.dict())

print(x.id.id)
# >>> 2

print(y.id.id)
# >>> Traceback (most recent call last):
# >>> File "<stdin>", line 1, in <module>
# >>> AttributeError: 'dict' object has no attribute 'id'

Is it possible to make it work?

@samuelcolvin
Member

No, I'm afraid not.

For performance reasons construct() doesn't do any analysis or mutation of the objects it receives.

If you want this behaviour you'll need to implement it yourself, either as a standalone function or as a model class method.
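
As a rough illustration of the "standalone function" shape, here is a minimal sketch that uses only the standard library. It is an analogy, not pydantic code: Record is a hypothetical stand-in for a model base class, and build and construct_recursive are made-up helper names. It walks the class annotations and turns nested dicts (and lists of dicts) into instances without validating anything.

```python
from typing import Any, Dict, List, Type, TypeVar, get_args, get_origin, get_type_hints

T = TypeVar("T")


class Record:
    """Hypothetical stand-in for a model base class (NOT pydantic.BaseModel)."""


def build(tp: Any, value: Any) -> Any:
    """Convert trusted data to the annotated type without validating it."""
    if isinstance(tp, type) and issubclass(tp, Record) and isinstance(value, dict):
        return construct_recursive(tp, value)
    args = get_args(tp)
    if get_origin(tp) is list and args and isinstance(value, list):
        # List[X]: build every item as X
        return [build(args[0], item) for item in value]
    return value  # leaf value: stored as-is, no checks


def construct_recursive(cls: Type[T], data: Dict[str, Any]) -> T:
    """Create an instance of cls from trusted data, bypassing __init__."""
    obj = cls.__new__(cls)
    for name, tp in get_type_hints(cls).items():
        if name in data:
            setattr(obj, name, build(tp, data[name]))
    return obj


class Id(Record):
    id: int


class User(Record):
    id: Id
    tags: List[Id]
    name: str


user = construct_recursive(User, {"id": {"id": 2}, "tags": [{"id": 3}], "name": "not_test"})
print(user.id.id)       # 2
print(user.tags[0].id)  # 3
```

The sketch ignores defaults, aliases and other containers (tuples, dicts, Optional); those are the cases a real implementation would still have to handle.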

@transfluxus

transfluxus commented Apr 9, 2020

I am trying to implement something but I'm totally stuck on generic types:

from typing import List

from pydantic import BaseModel


class Skill(BaseModel):
    name: str

class User(BaseModel):
    name: str
    age: int
    skills: List[Skill]

class SuperUser(User):
    origin: str

u1 = User(name="John", age=32, skills=[Skill(name="writing")])
u1
SuperUser.construct(**u1.dict())

How would I find out if a field type is a generic (in this case List)?
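
One possible answer, sketched with the standard library only: typing.get_origin and typing.get_args (Python 3.8+, so not available on the 3.6.8 interpreter from the original report) expose the container and the parameters of an annotation such as List[Skill]. The helper name list_item_type is hypothetical, and applying it to a pydantic field's annotation is an assumption that is not tested here.

```python
from typing import Any, List, Optional, Union, get_args, get_origin


def list_item_type(tp: Any) -> Optional[Any]:
    """Return X if tp is List[X] or Optional[List[X]], else None."""
    origin = get_origin(tp)
    if origin is Union:
        # Optional[List[X]] is Union[List[X], None]: look inside each member
        for arg in get_args(tp):
            found = list_item_type(arg)
            if found is not None:
                return found
        return None
    if origin is list:
        args = get_args(tp)
        return args[0] if args else None
    return None


print(get_origin(List[int]))                # <class 'list'>
print(list_item_type(List[str]))            # <class 'str'>
print(list_item_type(Optional[List[int]]))  # <class 'int'>
print(list_item_type(int))                  # None
```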

@erlendvollset

This feature would be very useful!

@Renthal

Renthal commented Jan 22, 2021

I agree, could we get support for construct() on recursive models?

@PrettyWood
Member

PrettyWood commented Jan 22, 2021

Hi guys,
Here is a suggestion that should not have any impact on production code (hence the big if check before everything).
@samuelcolvin WDYT?
To use it now, you can just create your own method for that:

from pydantic import BaseModel as PydanticBaseModel


class BaseModel(PydanticBaseModel):
    @classmethod
    def construct(cls, _fields_set = None, *, __recursive__ = False, **values):
        if not __recursive__:
            return super().construct(_fields_set, **values)

        m = cls.__new__(cls)

        fields_values = {}
        for name, field in cls.__fields__.items():
            if name in values:
                if issubclass(field.type_, BaseModel):
                    fields_values[name] = field.outer_type_.construct(**values[name], __recursive__=True)
                else:
                    fields_values[name] = values[name]
            elif not field.required:
                fields_values[name] = field.get_default()

        object.__setattr__(m, '__dict__', fields_values)
        if _fields_set is None:
            _fields_set = set(values.keys())
        object.__setattr__(m, '__fields_set__', _fields_set)
        m._init_private_attributes()
        return m


class X(BaseModel):
    a: int

class Id(BaseModel):
    id: int
    x: X

class User(BaseModel):
    id: Id
    name = "test"

x = User(**{"id": {"id":2, "x": {"a": 1}}, "name": "not_test"})
y = User.construct(**x.dict())
z = User.construct(**x.dict(), __recursive__=True)

print(repr(x))
# User(id=Id(id=2, x=X(a=1)), name='not_test')
print(repr(y))
# User(name='not_test', id={'id': 2, 'x': {'a': 1}})
print(repr(z))
# User(id=Id(id=2, x=X(a=1)), name='not_test')

Hope it helps

@Renthal

Renthal commented Jan 22, 2021

Hi @PrettyWood, thank you for the prompt answer! Your code is amazing and does exactly what I was looking for, minus a small detail: aliases no longer work. To test it I used the following code, given your BaseModel:

from pydantic import Field  # BaseModel is the subclass defined in the previous comment

class Id(BaseModel):
    id: int = Field(
        default=None,
        alias='aliased_id',
    )

class User(BaseModel):
    id: Id
    name = "test"

d = {"id": {"aliased_id":2}, "name": "not_test"}
x = User(**d)
y = User.construct(**d, __recursive__=True)  # the method above is named construct, not recursive_construct

print(x.id.id)
# 2

print(y.id.id)
# None

Since the use case is production, where you know for certain that the tests passed, this feature is very useful. Without aliases, however, it breaks the current workflow. Is there any chance to include such checks in your solution?

@Renthal

Renthal commented Jan 22, 2021

OK, I came up with this modification, which works on this small use case, but I am not sure if it's general enough. WDYT?
It also handles lists of BaseModels. Unfortunately, in a preliminary benchmark this solution is twice as slow as going through the validators. It's probably due to my sloppy code, right?

from pydantic import BaseModel as PydanticBaseModel, Field
from typing import List


class BaseModel(PydanticBaseModel):
    @classmethod
    def construct(cls, _fields_set = None, **values):  # or simply override `construct` or add the `__recursive__` kwarg

        m = cls.__new__(cls)
        fields_values = {}

        for name, field in cls.__fields__.items():
            key = ''
            if name in values:
                key = name
            elif field.alias in values:
                key = field.alias

            if key:
                if issubclass(field.type_, BaseModel):
                    if issubclass(field.outer_type_, list):
                        fields_values[name] = [
                            field.type_.construct(**e)
                            for e in values[key]
                        ]
                    else:
                        fields_values[name] = field.outer_type_.construct(**values[key])
                else:
                    if values[key] is None and not field.required:
                        fields_values[name] = field.get_default()
                    else:
                        fields_values[name] = values[key]
            elif not field.required:
                fields_values[name] = field.get_default()

        object.__setattr__(m, '__dict__', fields_values)
        if _fields_set is None:
            _fields_set = set(values.keys())
        object.__setattr__(m, '__fields_set__', _fields_set)
        m._init_private_attributes()
        return m

class Id(BaseModel):
    abc: int

class User(BaseModel):
    id: List[Id] = Field(
        default=None,
        alias='aliased_id',
    )
    name = "test"

d = {"aliased_id": [{"abc": 2}, {"abc": 3}], "name": "not_test"}
x = User(**d)
y = User.construct(**d)

print(x.id)
# [Id(abc=2), Id(abc=3)]

print(y.id)
# [Id(abc=2), Id(abc=3)]

@PrettyWood
Member

By default, __init__() will indeed use the alias whereas construct() won't.
That's not related to the recursive issue, so I won't change my answer, to keep it clear.
But you are right: you just need to look up field.alias instead of name (the field name) in the input values.
So here:

            if field.alias in values:
                if issubclass(field.type_, BaseModel):
                    fields_values[name] = field.outer_type_.construct(**values[field.alias], __recursive__=True)
                else:
                    fields_values[name] = values[field.alias]

@Renthal

Renthal commented Jan 25, 2021

But what about the speed concern? As I understand it, the main advantage of construct() is not paying the price of validation. How can the recursive construct be slower than the recursive validator?

@PrettyWood
Member

I don't understand how it can take longer than plain __init__.
I checked quickly with a basic example and it's still way faster:

from pydantic import BaseModel as PydanticBaseModel


class BaseModel(PydanticBaseModel):
    @classmethod
    def construct(cls, _fields_set = None, *, __recursive__ = False, **values):
        if not __recursive__:
            return super().construct(_fields_set, **values)

        m = cls.__new__(cls)

        fields_values = {}
        for name, field in cls.__fields__.items():
            if name in values:
                if issubclass(field.type_, BaseModel):
                    fields_values[name] = field.outer_type_.construct(**values[name], __recursive__=True)
                else:
                    fields_values[name] = values[name]
            elif not field.required:
                fields_values[name] = field.get_default()

        object.__setattr__(m, '__dict__', fields_values)
        if _fields_set is None:
            _fields_set = set(values.keys())
        object.__setattr__(m, '__fields_set__', _fields_set)
        m._init_private_attributes()
        return m


class X(BaseModel):
    a: int

class Id(BaseModel):
    id: int
    x: X

class User(BaseModel):
    id: Id
    name = "test"

values = {"id": {"id":2, "x": {"a": 1}}, "name": "not_test"}

def f():
    # User(**values)
    User.construct(**values, __recursive__=False)

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("f()", number=1_000_000, setup="from __main__ import f"))
    # __init__: 16.027352597
    # recursive construct: 7.86188907
    # non recursive construct: 2.8425855930000004

@Renthal can you please share your code and test results?

@Renthal

Renthal commented Jan 25, 2021

Unfortunately, with shallow/small models the effect is minimal. I updated the example with a more complex object to showcase the issue at hand.

from pydantic import BaseModel as PydanticBaseModel, Field
from typing import List


class BaseModel(PydanticBaseModel):
    @classmethod
    def construct(cls, _fields_set=None, **values):

        m = cls.__new__(cls)
        fields_values = {}

        for name, field in cls.__fields__.items():
            key = ''
            if name in values:
                key = name
            elif field.alias in values:
                key = field.alias

            if key:
                if issubclass(field.type_, BaseModel):
                    if issubclass(field.outer_type_, list):
                        fields_values[name] = [
                            field.type_.construct(**e)
                            for e in values[key]
                        ]
                    else:
                        fields_values[name] = field.outer_type_.construct(**values[key])
                else:
                    if values[key] is None and not field.required:
                        fields_values[name] = field.get_default()
                    else:
                        fields_values[name] = values[key]
            elif not field.required:
                fields_values[name] = field.get_default()

        object.__setattr__(m, '__dict__', fields_values)
        if _fields_set is None:
            _fields_set = set(values.keys())
        object.__setattr__(m, '__fields_set__', _fields_set)
        m._init_private_attributes()
        return m


class F(BaseModel):
    f: List[str]


class E(BaseModel):
    e: List[str]


class D(BaseModel):
    d_one: List[E]
    d_two: List[F]


class C(BaseModel):
    c: List[D]


class B(BaseModel):
    b: List[int]


class A(BaseModel):
    one: List[B] = Field(
        default=None,
        alias='aliased_one',
    )
    two: List[C]


d = {
    "aliased_one": [{"b": [i]} for i in range(500)],
    "two": [{
        "c": [
            {
                "d_one": [{'e': ['' for _ in range(20)]} for _ in range(50)],
                "d_two": [{'f': ['' for _ in range(20)]} for _ in range(50)]
            } for _ in range(5)]
    } for _ in range(5)],
}


def f(values):
    A(**values)


def g(values):
    A.construct(**values)


if __name__ == '__main__':
    import timeit
    from functools import partial

    print(timeit.timeit(partial(f, values=d), number=100))
    #4.411519628018141
    print(timeit.timeit(partial(g, values=d), number=100))
    #22.629493223503232 <- as per my understanding this should be in the worst case as small as for f()

Note that the BaseModel is not the exact same as yours as mine also needs to handle lists of models. The idea is to have a construct version which can handle models as complex as those handled by the regular init but faster for production/trusted data.

@PrettyWood
Member

PrettyWood commented Jan 25, 2021

Try to use the field.shape like this ;)

from pydantic import BaseModel as PydanticBaseModel, Field
from typing import List


class BaseModel(PydanticBaseModel):
    @classmethod
    def construct(cls, _fields_set=None, **values):

        m = cls.__new__(cls)
        fields_values = {}

        for name, field in cls.__fields__.items():
            key = field.alias  # this is the current behaviour of `__init__` by default
            if key:
                if issubclass(field.type_, BaseModel):
                    if field.shape == 2:  # the field is a `list`. You could check other shapes to handle `tuple`, ...
                        fields_values[name] = [
                            field.type_.construct(**e)
                            for e in values[key]
                        ]
                    else:
                        fields_values[name] = field.outer_type_.construct(**values[key])
                else:
                    if values[key] is None and not field.required:
                        fields_values[name] = field.get_default()
                    else:
                        fields_values[name] = values[key]
            elif not field.required:
                fields_values[name] = field.get_default()

        object.__setattr__(m, '__dict__', fields_values)
        if _fields_set is None:
            _fields_set = set(values.keys())
        object.__setattr__(m, '__fields_set__', _fields_set)
        m._init_private_attributes()
        return m


class F(BaseModel):
    f: List[str]


class E(BaseModel):
    e: List[str]


class D(BaseModel):
    d_one: List[E]
    d_two: List[F]


class C(BaseModel):
    c: List[D]


class B(BaseModel):
    b: List[int]


class A(BaseModel):
    one: List[B] = Field(
        default=None,
        alias='aliased_one',
    )
    two: List[C]


d = {
    "aliased_one": [{"b": [i]} for i in range(500)],
    "two": [{
        "c": [
            {
                "d_one": [{'e': ['' for _ in range(20)]} for _ in range(50)],
                "d_two": [{'f': ['' for _ in range(20)]} for _ in range(50)]
            } for _ in range(5)]
    } for _ in range(5)],
}


def f(values):
    return A(**values)


def g(values):
    return A.construct(**values)


if __name__ == '__main__':
    import timeit
    from functools import partial

    assert A(**d) == A.construct(**d)  # just to be sure!

    print(timeit.timeit(partial(f, values=d), number=100))
    # 11.651661356
    print(timeit.timeit(partial(g, values=d), number=100))
    # 0.6944440649999999

Hope it helps!

@Renthal

Renthal commented Jan 26, 2021

It does, thank you! Finally, since most of the fields in my use case are in fact BaseModels, I found that this solution works better for me. Unfortunately, the final version is only 20-30% faster than the validating one, whereas with the small demo above it seems to be a lot faster. I think this is due to the nature of the model one is building.

class BaseModel(PydanticBaseModel):
    @classmethod
    def construct(cls, _fields_set=None, **values):

        m = cls.__new__(cls)
        fields_values = {}

        for name, field in cls.__fields__.items():
            key = field.alias
            if key in values:  # this check is necessary or Optional fields will crash
                try:
                #if issubclass(field.type_, BaseModel):  # this is cleaner but slower
                    if field.shape == 2:
                        fields_values[name] = [
                            field.type_.construct(**e)
                            for e in values[key]
                        ]
                    else:
                        fields_values[name] = field.outer_type_.construct(**values[key])
                except AttributeError:
                    if values[key] is None and not field.required:
                        fields_values[name] = field.get_default()
                    else:
                        fields_values[name] = values[key]
            elif not field.required:
                fields_values[name] = field.get_default()

        object.__setattr__(m, '__dict__', fields_values)
        if _fields_set is None:
            _fields_set = set(values.keys())
        object.__setattr__(m, '__fields_set__', _fields_set)
        m._init_private_attributes()
        return m
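
A side note on the two checks in the version above. issubclass() raises TypeError when its first argument is not a class, which can happen for typing constructs such as Union[int, str]. The try/except AttributeError variant avoids that, but it also swallows any AttributeError raised inside the nested construct calls. A minimal stdlib sketch of a guarded check follows; Base and Child are hypothetical stand-ins for the model classes, and is_model_type is a made-up helper name.

```python
from typing import Any, List, Union


class Base:
    """Hypothetical stand-in for the model base class."""


class Child(Base):
    pass


def is_model_type(tp: Any, base: type = Base) -> bool:
    """True only when tp is a real class deriving from base."""
    try:
        return isinstance(tp, type) and issubclass(tp, base)
    except TypeError:
        # Some typing constructs are not classes; treat them as "not a model"
        return False


print(is_model_type(Child))            # True
print(is_model_type(str))              # False
print(is_model_type(Union[int, str]))  # False
print(is_model_type(List[Child]))      # False
```

Whether this is faster or slower than the try/except approach for a given model was not measured here.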

@havardthom

havardthom commented Apr 12, 2021

Hey guys, thank you for your suggested solutions, very helpful!

I ran into performance issues when using FastAPI with pydantic because my data was being validated multiple times before being returned to the client. Very similar to the problem stated here tiangolo/fastapi#1359 (comment)

Reading about the construct method in the pydantic docs, I thought it was the obvious solution to my problem. The docs describe construct as "Creating models without validation", but they fail to mention that it does not support nested models (which I believe is a very common use case) and that it behaves as if Config.extra = 'allow' was set, since it adds all passed values. I had to dig through the source code to discover and understand this behaviour.

In my opinion, "Creating models without validation" means creating the model like __init__ or parse_obj does, minus the validation part. So I suggest either fixing the functionality or making the docs much more explicit about what construct actually does and its use cases.

I ended up using the suggested solutions in this issue, but with a few modifications to make it work for me.

class BaseModel(PydanticBaseModel):
    @classmethod
    def construct(cls, _fields_set=None, **values):
        m = cls.__new__(cls)
        fields_values = {}

        config = cls.__config__

        for name, field in cls.__fields__.items():
            key = field.alias
            if key not in values and config.allow_population_by_field_name: # Added this to allow population by field name
                key = name

            if key in values: 
                if values[key] is None and not field.required: # Moved this check since None value can be passed for Optional nested field
                    fields_values[name] = field.get_default()
                else:
                    if issubclass(field.type_, BaseModel):
                        if field.shape == 2:
                            fields_values[name] = [
                                field.type_.construct(**e)
                                for e in values[key]
                            ]
                        else:
                            fields_values[name] = field.outer_type_.construct(**values[key])
                    else:
                        fields_values[name] = values[key]
            elif not field.required:
                fields_values[name] = field.get_default()

        object.__setattr__(m, '__dict__', fields_values)
        if _fields_set is None:
            _fields_set = set(values.keys())
        object.__setattr__(m, '__fields_set__', _fields_set)
        m._init_private_attributes()
        return m

@ynouri

ynouri commented Mar 3, 2022

> no, I'm afraid not.
>
> For performance reasons construct() doesn't do any analysis or mutation of the objects it receives.
>
> If you want this behaviour you'll need to implement it yourself, either as a standalone function or as a model class method.

@samuelcolvin this issue seems to be a common concern across Pydantic / FastAPI implementations operating at scale. Would it make sense to offer both construct options (high-performance non-recursive, slightly less performant recursive) directly in Pydantic?

@ughstudios

When is this going to be added? It's been two years now.

@muety

muety commented Dec 2, 2022

Would love to have recursive model construction!

@Mandera

Mandera commented Jan 11, 2024

Been tracking this for so long, decided to try a different approach myself.
I don't want to have to write e.g. x: Optional[str] = None on every field, I like that they're kept clean and readable such as x: str.
I preferred construct because it allowed missing attributes; the lack of validation for non-missing fields was an unwanted side effect.

The following solution tries to achieve:

  • Recursive construction
  • Allow missing fields
  • Validate non-missing fields

import pydantic


class Model(pydantic.BaseModel):
    """ Allow any attr to be missing """
    
    def __init_subclass__(cls):
        """ Set all defaults to None """
        for key in cls.__annotations__:
            if not hasattr(cls, key):
                setattr(cls, key, None)
            else:
                attr = getattr(cls, key)
                if isinstance(attr, pydantic.fields.FieldInfo) and attr.is_required():
                    attr.default = None

    def __init__(self, **kwargs):
        """ Delete all None attributes (Mostly to clean up repr) """
        pydantic.BaseModel.__init__(self, **kwargs)
        for key in self.model_fields:
            attr = getattr(self, key, None)
            if attr is None:
                delattr(self, key)

Seems to work rather well, thought I'd share.
Use it by instantiating a subclass of Model as usual: SubModel(**data) or SubModel.model_validate(data), which seem to do the same thing.

RajatRajdeep pushed a commit to RajatRajdeep/pydantic that referenced this issue May 14, 2024