Instances share objects after copy or created from unpacking a dict #249

gangefors · 2018-08-25T12:42:51Z

This is related to #154 and I'm using pydantic v0.12.1

Here is a simple example to show what I mean.

>>> class Model(pydantic.BaseModel):
...     my_dict: dict
...     
>>> model1 = Model(**{'my_dict': {'my_list': [1,2,3]}})
>>> model1
<Model my_dict={'my_list': [1, 2, 3]}>
>>> model2 = model1.copy()
>>> model2
<Model my_dict={'my_list': [1, 2, 3]}>
>>> model2.my_dict['my_list'] = []
>>> model2
<Model my_dict={'my_list': []}>
>>> model1
<Model my_dict={'my_list': []}>

Another example when created by unpacking a dictionary.

>>> my_dict = {'my_dict': {'my_list': [1,2,3]}}
>>> model1 = Model(**my_dict)
>>> model1
<Model my_dict={'my_list': [1, 2, 3]}>
>>> model2 = Model(**my_dict)
>>> model2
<Model my_dict={'my_list': [1, 2, 3]}>
>>> model2.my_dict['my_list'] = []
>>> model1
<Model my_dict={'my_list': []}>
>>> model2
<Model my_dict={'my_list': []}>

BaseModel.dict() seems to create copies correctly and can be used instead of .copy() to create a new object as seen with model3 in this example.

>>> model1 = Model(**{'my_dict': {'my_list': [1,2,3]}})
>>> model1
<Model my_dict={'my_list': [1, 2, 3]}>
>>> model2 = model1.copy()
>>> model2
<Model my_dict={'my_list': [1, 2, 3]}>
>>> model3 = Model(**model1.dict())
>>> model3
<Model my_dict={'my_list': [1, 2, 3]}>
>>> model2.my_dict['my_list'] = []
>>> model1
<Model my_dict={'my_list': []}>
>>> model2
<Model my_dict={'my_list': []}>
>>> model3
<Model my_dict={'my_list': [1, 2, 3]}>

I think this is quite serious since you can't know if you are working with a true copy of a model or not.

I see two things that need to be fixed.

.copy() should create a deepcopy that doesn't share any objects with the model it was copied from. User created classes have to support deepcopy operations to work correctly.

BaseModel(**dict) need to make sure that all items in the dict are deepcopied from the unpacked dictionary.

samuelcolvin · 2018-08-25T13:00:18Z

Hi, thanks for reporting.

I agree entirely that .copy() should take a deep copy.

BaseModel(**dict) is a more nuanced case. In 99% of cases people wouldn't care about coping the values and copying the values will be significantly slower. Example take the case (roughtly aiohttp but any other web framework and many other cases would be the case):

async def my_view(request):
    try:
        m = Model(**await request.json())
    except ValidationError as e:
        raise BadRequestError(**e.dict())
    ... do stuff with m

This is the primary use case for pydantic (at least for me) and I want it to remain as performant as possible (both in terms of CPU and memory). I really don't want to copy all the data from the result of .json() here.

We also can't add keyword arguments to Model.__init__() for fear of conflicting with input data. Instead I would suggest we add a class method Model.parse_copy().

For the same kinds of reason I don't really want .dict() to make a copy by default (for example .json() uses .dict() and definitely wouldn't require a copy) but here we could add an argument to deep copy the data

Does that make sense? Would you be willing to submit a PR for some or all of this?

samuelcolvin · 2018-08-25T13:01:47Z

BTW, you can reference other issues with #<issue number> you don't have to create a markdown link to the issue, this also has the advantage of showing the reference on the referenced issue.

gangefors · 2018-08-25T13:10:11Z

Totally agree that preformance is key.

If a true copy is needed at initialization the user can do that without the need for a seperate method. And a user can always do a deepcopy after converting dict(). So let's just focus on the initial problem I had, the copy() method (I noticed the other behaviours while testing).

Do you want to change the way copy() works or should we add a deepcopy() method to seperate the two?
Shallow copies might be enough for many use cases so changing that behaviour might not be the best call.

samuelcolvin · 2018-08-25T13:15:01Z

I think we add a deep=False argument to copy() and are explicit in the changelog and docs about what's happening.

I doubt many people actually use copy() directly.

gangefors · 2018-08-25T13:18:13Z

We use it during unit test where we want to have a base data model to start with and then just change what's needed for the test. That's how I ran into this issue, made a copy of the data, first test emptied a list in a dict and then the next test had nothing to check for despite copying the base data.

So I do think there is good use for it.

I'll look into doing a PR for this. Don't know then I can finish it though, but will try to have something for review in a week or so.

samuelcolvin · 2018-08-25T13:20:06Z

Great, thanks so much. I agree it's useful.

gangefors · 2018-09-10T13:26:28Z

Took a little longer than expected before I had time to do this, but now it's done.

gangefors mentioned this issue Sep 10, 2018

Add support for deep copy #261

Merged

5 tasks

samuelcolvin closed this as completed in c32ce34 Sep 10, 2018

Kilo59 mentioned this issue Dec 29, 2022

[FEATURE] ZEP - save config great-expectations/great_expectations#6665

Closed

4 tasks

alexdrydew pushed a commit to alexdrydew/pydantic that referenced this issue Dec 23, 2023

replace *_items -> *_length, fix pydantic#249 (pydantic#259)

895ddb0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Instances share objects after copy or created from unpacking a dict #249

Instances share objects after copy or created from unpacking a dict #249

gangefors commented Aug 25, 2018 •

edited

samuelcolvin commented Aug 25, 2018

samuelcolvin commented Aug 25, 2018

gangefors commented Aug 25, 2018

samuelcolvin commented Aug 25, 2018

gangefors commented Aug 25, 2018

samuelcolvin commented Aug 25, 2018

gangefors commented Sep 10, 2018

Instances share objects after copy or created from unpacking a dict #249

Instances share objects after copy or created from unpacking a dict #249

Comments

gangefors commented Aug 25, 2018 • edited

samuelcolvin commented Aug 25, 2018

samuelcolvin commented Aug 25, 2018

gangefors commented Aug 25, 2018

samuelcolvin commented Aug 25, 2018

gangefors commented Aug 25, 2018

samuelcolvin commented Aug 25, 2018

gangefors commented Sep 10, 2018

gangefors commented Aug 25, 2018 •

edited