
Memory leak after updating to Pydantic v1.9.0 #3829

Closed · 3 tasks done
MarkusSintonen opened this issue Feb 18, 2022 · 28 comments
Labels: bug V1 (Bug related to Pydantic V1.X), performance

@MarkusSintonen (Contributor)

Checks

  • I added a descriptive title to this issue
  • I have searched (google, github) for similar issues and couldn't find anything
  • I have read and followed the docs and still think this is a bug

Bug

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

$ python -c "import pydantic.utils; print(pydantic.utils.version_info())"
             pydantic version: 1.9.0
            pydantic compiled: True
                 install path: /opt/pydist/.venv/lib/python3.10/site-packages/pydantic
               python version: 3.10.1 (main, Dec 21 2021, 09:01:08) [GCC 10.2.1 20210110]
                     platform: Linux-5.10.93-x86_64-with-glibc2.31
     optional deps. installed: ['dotenv', 'typing-extensions']

As a heads-up, we are seeing a memory leak in production after upgrading to Pydantic 1.9.0. Our application is a large FastAPI-based API service (fastapi 0.73.0, uvicorn 0.17.4). The application uses Pydantic BaseModels, GenericModels and pydantic.dataclasses.dataclass for FastAPI response/request data models, plus Pydantic models for some internal data validators. We deployed a single change to production that bumped Pydantic from 1.8.2 to 1.9.0, after which we started to observe abnormal memory usage. The memory usage kept increasing until the upgrade was reverted back to 1.8.2 (again with no other change). After deploying the revert, the memory usage patterns went back to normal.

Unfortunately I haven't been able to reproduce the issue locally (it takes some effort to simulate prod-like load to get the leak to show up), so I can't yet add further information from isolating it or from e.g. memory analyzers. I also tried to look through the changelog and diff, but it's such a big release that I can't spot anything obvious that might cause this. I have attached below a picture from our production metrics showing the abnormal behaviour under v1.9.0. We had never observed such a memory issue until upgrading to v1.9.0.

Thank you all for the awesome work and the great library! I wish I had more information about the leak for this ticket.

[Screenshot 2022-02-18 08:48: production memory-usage graph]

The graph above shows the different deploys in colors. The orange one is where the Pydantic 1.9.0 update was running until it was reverted due to the memory issue. We couldn't have run it any longer, as it was close to hitting the Kubernetes pod memory limits at that point.
@MarkusSintonen added the bug V1 label Feb 18, 2022
@bachsh commented Mar 28, 2022

Actually, I'm witnessing a different issue when looking at tracemalloc output. In v1.9 I can see the pydantic module building up its memory consumption. In the older version (v1.8.2) I don't see that, but the process itself still builds up memory as before.
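For reference, a minimal sketch of this kind of tracemalloc measurement (the workload comment is a placeholder, not code from this thread):

import tracemalloc

tracemalloc.start(25)  # keep deep tracebacks so pydantic-internal frames show up
before = tracemalloc.take_snapshot()

# ... exercise the service here, e.g. replay a batch of requests ...

after = tracemalloc.take_snapshot()
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)  # the source lines that allocated the most new memory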

@samuelcolvin (Member)

We really need some kind of minimal example that exhibits this behaviour to help track down the bug.

Any help would be much appreciated.

@samuelcolvin added this to the v1.9.1 milestone Apr 1, 2022
@samuelcolvin (Member)

Any progress on a reproducible example?

Very hard to start digging without anything to go on.

@bachsh are you saying that for you, overall memory usage is unchanged?

@MarkusSintonen (Contributor, Author)

> Any progress on a reproducible example?

Sorry, no 😞 I tried to repro it locally but was unable to do it in a reasonable time, and I've been too busy with other stuff. We have now just pinned the version to 1.8.2 😕

@samuelcolvin (Member)

@MarkusSintonen do you use generics with pydantic, and GenericModel in particular?

I'm trying to hunt for the memory leak by running unit tests in a loop and I think I might have found something in generics.

But so far I can't narrow it down to something which is unique to v1.9.

FWIW, here is a gist of the hacky script I'm currently using to hunt for memory leaks.

Only other question: is Python 3.10.1 what you're using in production? I assume you didn't change the Python version at the same time as upgrading Pydantic?
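Since the gist link isn't reproduced here, a rough sketch of the idea under stated assumptions: run the workload in a loop and diff live-object counts per type, so types whose counts grow monotonically stand out (run_test_suite is a hypothetical placeholder for the looped unit tests):

import collections
import gc

def type_counts():
    # Count live objects per type name; steady growth across iterations hints at a leak.
    return collections.Counter(type(o).__name__ for o in gc.get_objects())

previous = type_counts()
for i in range(10):
    run_test_suite()  # hypothetical hook: run the unit tests / workload once
    gc.collect()      # rule out garbage that merely hasn't been collected yet
    current = type_counts()
    grown = {name: current[name] - previous[name]
             for name in current if current[name] > previous[name]}
    print(f"iteration {i}:", sorted(grown.items(), key=lambda kv: -kv[1])[:5])
    previous = current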

@samuelcolvin (Member) commented May 13, 2022

One more question:

Do you create models dynamically? E.g. do you call create_model, create new dataclasses, or use parse_obj_as or similar regularly at runtime?

Update: in particular, are you creating dataclasses on the fly many times?
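For anyone unsure what counts as "dynamic" here, a hedged illustration of the patterns being asked about; every call below creates a new class (or an internal helper model) at runtime, so doing it per request rather than once at import time accumulates objects:

from typing import List

from pydantic import create_model, parse_obj_as
from pydantic.dataclasses import dataclass

# A brand-new model class on every call:
DynamicModel = create_model("DynamicModel", foo=(int, ...))

# A new pydantic dataclass per definition:
@dataclass
class Point:
    x: int
    y: int

# parse_obj_as builds a temporary model internally for non-model types:
numbers = parse_obj_as(List[int], ["1", "2", "3"])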

@samuelcolvin mentioned this issue May 14, 2022
@samuelcolvin (Member)

I've found a potential culprit if you are using generics - #2549. In particular, _assigned_parameters can grow quite large, especially if you're creating many generics.
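A minimal sketch of how such a cache can grow, assuming pydantic v1.9's generics: every distinct parametrization of a GenericModel creates (and internally caches) a new concrete model class, so parametrizing with many dynamically created types grows the caches without bound:

from typing import Generic, TypeVar

from pydantic import create_model
from pydantic.generics import GenericModel

T = TypeVar("T")

class Wrapper(GenericModel, Generic[T]):
    value: T

for i in range(1000):
    Inner = create_model(f"Inner{i}", x=(int, ...))
    Wrapper[Inner]  # each new parametrization adds entries to the generics caches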

@samuelcolvin (Member)

Well, after hours of hunting, the good news is I think I've found a memory leak; the bad news is that it's in Python itself, and I can't see how it's specific to v1.9.

python/cpython#92810

😞

@MarkusSintonen (Contributor, Author)

Wow amazing investigation @samuelcolvin! Thank you 🙏

> do you use generics with pydantic, and GenericModel in particular?

Yes, we use GenericModels, for example with some FastAPI request/response models.

> Only other question: is Python 3.10.1 what you're using in production? I assume you didn't change the Python version at the same time as upgrading Pydantic?

Yes, we use 3.10.1 in production. We didn't change the Python version at the same time as the Pydantic version bump; the only change was Pydantic 1.8.2 -> 1.9.0. The newer Python version had been running for longer without issues.

> Do you create models dynamically? E.g. do you call create_model, create new dataclasses, or use parse_obj_as or similar regularly at runtime?
> Update: in particular, are you creating dataclasses on the fly many times?

No, we don't directly use dynamic model creation. But as we heavily use FastAPI features, FastAPI might be creating dynamic models behind the scenes based on route-handler parameters. We are using BaseModels, GenericModels and also @pydantic.dataclasses.dataclass for FastAPI request/response parameters, and there are usages of create_model in FastAPI where it constructs Pydantic models based on route-handler parameters etc.

@samuelcolvin (Member)

One more question: I assume you're also generating schemas from your models? That's where I found the use of issubclass that is causing problems in CPython.

@tiangolo any input on the FastAPI side of things?

@MarkusSintonen (Contributor, Author) commented May 16, 2022

> One more question: I assume you're also generating schemas from your models? That's where I found the use of issubclass that is causing problems in CPython.

Yes, we are using OpenAPI specs generated by FastAPI (which in turn uses Pydantic schema generation). But schema generation shouldn't really be happening in production; it mainly happens on the development side.

@samuelcolvin (Member)

Interesting, maybe I'm not as close as I thought to finding the problem. 😞

@MarkusSintonen (Contributor, Author) commented May 16, 2022

> Do you create models dynamically? E.g. do you call create_model, create new dataclasses, or use parse_obj_as or similar regularly at runtime?

Can the issue be triggered by using GenericModel like this?

from typing import Generic, TypeVar

from pydantic import BaseModel
from pydantic.generics import GenericModel

T = TypeVar("T", bound=BaseModel)

class Model1(GenericModel, Generic[T]):
    foo: T

class Model2(BaseModel):
    bar: int

klass = Model1[Model2]  # This internally calls create_model in __class_getitem__
klass(foo={"bar": 1})

This happens in some parts of the codebase that are active in production.

@samuelcolvin (Member)

Yes I think so.

Also:

  • it's probably not used by that many people, which might explain why not everyone sees this
  • it definitely changed in v1.9
  • there's an obvious way for memory to build up via _assigned_parameters

I think I'll put a limit on the size of those two cache dicts just in case.
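A sketch of what such a size limit might look like: a dict that evicts its oldest entries past a cap (an illustration of the idea, not necessarily the implementation that shipped):

class LimitedDict(dict):
    def __init__(self, size_limit: int = 1000):
        self.size_limit = size_limit
        super().__init__()

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if len(self) > self.size_limit:
            # dicts preserve insertion order, so the first keys are the oldest
            excess = len(self) - self.size_limit
            for k in list(self.keys())[:excess]:
                del self[k]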

@tiangolo (Member)

On the FastAPI side, when there's a response_model parameter, FastAPI "clones" the Pydantic field and stores that clone. Here's where it would use create_model(), but that happens only once per response_model, and only at startup; I'm pretty sure it wouldn't keep creating more models as time progresses.


Just for completeness: FastAPI "clones" the field in response_model because it uses the field to filter/serialize data. For example, it could have response_model=BaseUser, and that class wouldn't include a password field; but there could be a class DBUser that inherits from it and does contain the password field. A developer could return such an object directly, and FastAPI would filter out the password field by serializing the data with BaseUser.

If FastAPI did the validation with Pydantic directly, then because DBUser inherits from BaseUser, the object would pass Pydantic's validation, password field included. But by having a "clone" (duplicate) of the class in response_model, it is no longer the same parent class, so the object doesn't pass validation directly and is instead correctly parsed with BaseUser.
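A short sketch of the filtering behaviour described above (BaseUser/DBUser are the illustrative names from this comment, not FastAPI internals):

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class BaseUser(BaseModel):
    username: str

class DBUser(BaseUser):
    password: str  # must never appear in a response

@app.get("/me", response_model=BaseUser)
async def me():
    # Returning the richer DBUser object is fine: FastAPI serializes it
    # through its cloned copy of BaseUser, so `password` is filtered out.
    return DBUser(username="alice", password="s3cret")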

@zrothberg
@tiangolo @samuelcolvin @MarkusSintonen I was able to isolate the chunk that is causing this. The reason it is probably so sporadic is that, because of the way comparison happens, it only causes issues when you are comparing misses after creating many objects. The search is fully recursive, so any descendants of BaseModel always get searched on new isinstance/issubclass misses against a parent class. I would honestly suggest just dropping the usage of ABCMeta from BaseModel, as it seems to be more trouble than it's worth. If you don't want to do that, you can replace both the __subclasscheck__ and __instancecheck__ methods and eliminate the entire issue (a rough sketch follows). Frankly, it will likely be dramatically faster as well.
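A rough sketch of that suggestion, assuming a hypothetical metaclass rather than pydantic's actual ModelMetaclass; plain MRO checks avoid ABCMeta's recursive subclass-tree walk and weakref caches, at the cost of dropping support for register()-ed virtual subclasses:

class PlainCheckMeta(type):
    # Hypothetical replacement: resolve checks from the MRO alone,
    # without walking __subclasses__() or populating ABC caches.
    def __instancecheck__(cls, instance):
        return cls in type(instance).__mro__

    def __subclasscheck__(cls, subclass):
        return isinstance(subclass, type) and cls in subclass.__mro__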

@samuelcolvin (Member)

@zrothberg I don't see anything new here; also, I've already limited subclass checks in #4081.

Obviously we can't remove ABCMeta in a minor or patch release.

Lastly this issue occurred between v1.8 and v1.9, both of which use abc.


@MarkusSintonen have you tried v1.9.1? Could you let me know how it performs?

@zrothberg
def chasesubclass(classname):
    """Recursively walk a class's __subclasses__ tree, mirroring the
    search that isinstance/issubclass perform on a miss."""
    name = classname.__name__
    parents = classname.__bases__

    bases = list()
    for subclass in classname.__subclasses__():
        bases.append(chasesubclass(subclass))

    return {name: bases, "parents": parents}

You can use the above to explore your class hierarchy along the paths that isinstance and issubclass explore. With the new generics, each instance registers up the stack, so any miss now goes through twice as many classes.

If you use them with the response_model field in FastAPI, some VERY interesting things start to happen. You end up creating a new model each time, which means every API endpoint creates one new model for each use of response_model. If you do this with a generic, you end up creating an even deeper nested hierarchy. That behavior appears to happen even in 1.8, and it keeps a very alive instance of the class rolling around, so it doesn't matter whether you cache or not; it never goes out of scope.

Using both generics and FastAPI response_models creates, I want to say, about 2 * depth models per endpoint.

[Screenshot 2022-06-25: dump of the generated model hierarchy]

IDK why FastAPI needs to create a new model for every single endpoint; that seems excessive, so I presume it's a bug. It happens with both pydantic 1.8 and 1.9; generics just made the growth exponential instead of linear.

The reason issubclass causes all the trouble is that on every single miss at a single parent level, issubclass then pushes through every single one of those classes and creates a new weakref pointer and callback.
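A small sketch of that mechanism under stated assumptions (plain abc, no pydantic): build a wide subclass tree, then trigger issubclass misses against the root; each miss walks the whole tree and caches weakrefs, a CPython ABCMeta implementation detail:

import abc

class Base(abc.ABC):
    pass

# A wide subclass tree, similar to what per-parametrization generic models create.
subclasses = [type(f"Sub{i}", (Base,), {}) for i in range(1000)]

class Unrelated:
    pass

# Every miss walks the full tree and records the result, with weakrefs,
# in Base's negative cache.
others = [type(f"Other{i}", (Unrelated,), {}) for i in range(1000)]
for cls in others:
    issubclass(cls, Base)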

Below is full code to reproduce. It creates 60+ models for two endpoints and five classes, half of which are nested, so any issubclass call on the parent node searches the entire tree.

from typing import Generic, TypeVar

from fastapi import FastAPI
from pydantic import BaseModel
from pydantic.generics import GenericModel

app = FastAPI()

class flatmodel(BaseModel):
    name: str
    value: int

class testmodel(BaseModel):
    name: str
    value: int

TypeX = TypeVar('TypeX')

class Bigmodel(BaseModel):
    hello: str
    goodbye: testmodel
    wtf: flatmodel

class BaseClass(GenericModel, Generic[TypeX]):
    X: TypeX

class ChildClass(BaseClass[TypeX], Generic[TypeX]):
    # Inherit from Generic[TypeX]
    pass

class Child2Class(ChildClass[TypeX], Generic[TypeX]):
    # Inherit from Generic[TypeX]
    pass

ChildClass[int](X=1)

@app.post("/hello", response_model=Child2Class[Bigmodel])
async def hello(test: ChildClass):
    return test

@app.post("/bye", response_model=Child2Class[Bigmodel])  # was a duplicate "/hello" path
async def bye(test: ChildClass):
    return test

def chasesubclass(classname):
    """Recursively dump the __subclasses__ tree (same helper as above)."""
    name = classname.__name__
    parents = classname.__bases__

    bases = list()
    for subclass in classname.__subclasses__():
        bases.append(chasesubclass(subclass))

    return {name: bases, "parents": parents}

value = chasesubclass(BaseModel)

@zrothberg
Here is a rough-and-dirty proof-of-concept implementation of a class decorator that can be used alongside pydantic (or any other class) to provide abstract-method tracking without requiring ABCMeta. It does require setting one value in the class namespace to track whether a class is abstract; this could be moved out to a single WeakSet that tracks them, so it knows whether it needs to check for the methods. It uses __init_subclass__ and preserves super and class definitions of those methods, so it should be usable just about anywhere.

The one downside to this approach is that you have to pass a keyword argument in the class definition if you want to subclass and stay abstract. Because decorators run after __init_subclass__, you cannot tell whether the class is abstract when it is called without one.

from abc import abstractmethod

old_subclass = "_hidden_init_subclass"
abstract_location = "__isabstractmethod__"

## Helper code for the abstractClass decorator below
##
def _gen_init_sub(abstract_methods):
    def __init_subclass__(cls, **kwargs):
        if kwargs.pop("_suppress_abstract", False):
            # Explicitly marked: stay abstract without filling in the methods.
            setattr(cls, abstract_location, True)
        elif vars(cls).get(abstract_location, False):
            pass
        else:
            for name in abstract_methods:
                value = getattr(cls, name, None)
                if value:
                    if not getattr(value, abstract_location, False):
                        continue
                raise ValueError(
                    f"Class {cls} did not fill in all abstract methods; missing {name}")

        return kwargs
    return __init_subclass__

def _make_init_sub(oldcls, abstract_methods, previous_method):
    partial = _gen_init_sub(abstract_methods)

    if previous_method:
        # Preserve the class's own __init_subclass__ and chain into it.
        def __init_subclass__(cls, **kwargs):
            newkwargs = partial(cls, **kwargs)
            sidecall = getattr(cls, old_subclass)
            sidecall(**newkwargs)

        setattr(oldcls, old_subclass, previous_method)
        setattr(oldcls, "__init_subclass__", classmethod(__init_subclass__))
    else:
        def __init_subclass__(cls, **kwargs):
            newkwargs = partial(cls, **kwargs)
            super(oldcls, cls).__init_subclass__(**newkwargs)

        setattr(oldcls, "__init_subclass__", classmethod(__init_subclass__))

def _gather_abstract(tocheck):
    abstracts = {name
                 for name, value in tocheck.items()
                 if getattr(value, abstract_location, False)}

    return abstracts


def _generate_subclass_init(cls):
    tocheck = vars(cls)
    abstract_methods = _gather_abstract(tocheck)

    if abstract_methods:
        previoussub = tocheck.get("__init_subclass__", None)
        _make_init_sub(cls, abstract_methods, previoussub)
        return
    raise ValueError("No abstract methods were defined")


# Decorator to make a class abstract
def abstractClass(class_to_wrap):
    _generate_subclass_init(class_to_wrap)

    return class_to_wrap


@abstractClass
class NewClass:
    bar: str

    def __init__(self):
        self.bar = "bar"

    @classmethod
    def __init_subclass__(cls, **kwargs):
        # this call is preserved
        super().__init_subclass__(**kwargs)

    @abstractmethod
    def foo(self): ...

@abstractClass
class SecondClass(NewClass, _suppress_abstract=True):
    # _suppress_abstract is required to subclass an
    # abstract class without fulfilling the parent methods
    def foo(self):
        pass

    @abstractmethod
    def baz(self): ...

class ThirdClass(SecondClass):
    # fills out the last abstract method
    def baz(self):
        pass

@GillesVandewiele
Confirming that we had huge memory issues with 1.9.1, and reverting to 1.8.2 solved the issue!

@samuelcolvin (Member)

Is this solved for you by v1.10?

@GillesVandewiele
While the memory usage does seem a bit higher than with version 1.8.2, it has mostly been fixed. Note: I have not tested this in a very thorough and structured manner, but rather by just looking at my system monitor.

Thank you!

@samuelcolvin (Member)

Thanks. If you can track down where the leak is coming from, please create a new issue.

It could be isinstance on abc classes, which does build up memory a lot, but other than that I thought we had fixed all such leaks.

@MarkusSintonen (Contributor, Author)

Hi! We upgraded from Pydantic 1.8.2 to 1.10.2 and no longer observe any memory anomaly like we did with 1.9.0, so I think the main issue has been solved. Thank you a lot! 💯

@samuelcolvin (Member)

Great news, thanks so much for confirming.

@PrettyWood (Member)

Closing the issue then, thanks!

@sjoerdk commented Nov 3, 2022

Confirming the bug does not occur in 1.10. Thanks for all the work! Removing the 1.8.2 pin in my projects.

For my platform, this bug seems to affect pydantic 1.9.1 only. From my tests:

platform: linux, python 3.8.8

pydantic version    Free from memory leak
1.8.2               ✔️
1.9.0               ✔️
1.9.1               ✖️
1.9.2               ✔️
1.10.0              ✔️
1.10.1              ✔️

@samuelcolvin (Member)

Anyone using large numbers of generics: it would be great if you could review (and maybe install and test) #5052. I'd like to release this soon, but I'm mindful of trying not to break anything in this area, which is pretty complex.
