Memory leak after updating to Pydantic v1.9.0 #3829
Actually I'm witnessing a different issue when looking at
Really need some kind of minimal example which exhibits this behaviour to help track down the bug. Any help would be much appreciated.
Any progress on a reproducible example? Very hard to start digging without anything to go on. @bachsh are you saying that for you, overall memory usage is unchanged?
Sorry, no 😞 I tried to repro it locally, but was unable to do it in a reasonable time. I have also been too busy with other stuff. We have now just locked the version into
@MarkusSintonen do you use generics with pydantic? I'm trying to hunt for the memory leak by running unit tests in a loop, and I think I might have found something in generics. But so far I can't narrow it down to something which is unique to v1.9. FWIW, here is a gist of the hacky script I'm currently using to hunt for memory leaks. One other question: is Python 3.10.1 what you're using in production? I assume you didn't change Python version at the same time as upgrading pydantic?
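The gist itself isn't included in the thread, but a minimal sketch of this style of leak hunt can be built on the stdlib's `tracemalloc`: run the workload in a loop and diff snapshots to see which source lines keep allocating. The `hunt_leak` helper and `workload` below are illustrative stand-ins, not the actual script from the gist.

```python
import tracemalloc

def hunt_leak(workload, iterations=5, top=5):
    """Run `workload` repeatedly and report the biggest allocation growth."""
    tracemalloc.start()
    baseline = tracemalloc.take_snapshot()
    for _ in range(iterations):
        workload()
    snapshot = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # StatisticDiff entries come back sorted with the largest growth first
    return snapshot.compare_to(baseline, "lineno")[:top]

leaky = []  # module-level reference keeps objects alive across iterations

def workload():
    leaky.extend(object() for _ in range(1000))

for stat in hunt_leak(workload):
    print(stat)
```

A genuine leak shows up as a line whose `size_diff` keeps growing as `iterations` increases, while transient allocations level off.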
One more question: do you create models dynamically? E.g. call
Update: in particular, are you creating dataclasses on the fly many times?
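For background on why on-the-fly creation matters here: every dynamically created model is a full class object, which stays alive as long as anything (a cache, a registry, a closure) references it. A stdlib-only sketch using `type()` as a stand-in for dynamic model creation (the names here are illustrative, not pydantic's API):

```python
import gc

def make_model(name):
    # stand-in for dynamic model creation: each call builds a new class object
    return type(name, (object,), {"field": None})

before = sum(1 for o in gc.get_objects() if isinstance(o, type))
registry = [make_model(f"Model{i}") for i in range(100)]  # simulated cache
after = sum(1 for o in gc.get_objects() if isinstance(o, type))

# at least 100 new class objects are now held alive via `registry`
print(after - before)
```

If a library caches such classes by key and the keys never repeat, the cache grows without bound, which looks exactly like a slow memory leak in production.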
I've found a potential culprit if you are using generics: #2549, and in particular
Well, after hours of hunting, the good news is I think I've found a memory leak; the bad news is it's in Python itself, and I can't see how it's specific to v1.9. 😞
Wow amazing investigation @samuelcolvin! Thank you 🙏
Yes, we use generics.
Yes, we use 3.10.1 in production. We didn't change the Python version at the same time as the version bump of Pydantic. The only change was Pydantic 1.8.2 -> 1.9.0. The newer Python version had been running longer without issues.
No, we don't directly use dynamic model creation. But as we are heavily using features of FastAPI, it might be FastAPI behind the scenes creating some dynamic models based on route handler parameters. We are using
One more question: I assume you're also generating schema from your models? That's where I found the use of
@tiangolo any input on the FastAPI side of things?
Yes, we are using OpenAPI specs generated by FastAPI (which in turn uses Pydantic schema). But schema generation shouldn't really be happening in production; it mainly happens on the development side.
Interesting, maybe I'm not as close as I thought to finding the problem. 😞
Can the issue be triggered by using
Because this happens in some parts of the codebase active in production
Yes I think so. Also:
I think I'll put a limit on the size of those two cache dicts, just in case.
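For illustration, one simple way to bound a cache dict (a sketch of the idea, not necessarily the implementation that landed in pydantic): rely on dict insertion order and evict the oldest entries once a limit is exceeded.

```python
class LimitedDict(dict):
    """A dict that evicts its oldest entries once it grows past size_limit."""

    def __init__(self, size_limit: int = 1000):
        self.size_limit = size_limit
        super().__init__()

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        while len(self) > self.size_limit:
            # dicts preserve insertion order, so the first key is the oldest
            del self[next(iter(self))]

cache = LimitedDict(size_limit=3)
for i in range(5):
    cache[i] = i * i
print(list(cache))  # [2, 3, 4] -- keys 0 and 1 were evicted
```

The trade-off is occasional recomputation after eviction, in exchange for a hard cap on worst-case memory.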
On the FastAPI side, when there's a
Just for completeness, FastAPI "clones" the field in
Nevertheless, if it does the validation with Pydantic directly, because
@tiangolo @samuelcolvin @MarkusSintonen So I was able to isolate the chunk that is causing it. The reason it is probably so sporadic is that, because of the way the comparison happens, it only causes issues if you are comparing misses after creating many objects. The search is fully recursive, so any descendants of BaseModel always get searched on new misses to isinstance and issubclass of a parent class.

I would honestly suggest just dropping the usage of ABCMeta from BaseModel, as it seems to be more trouble than it's worth. If you don't want to do that, you can replace both the __subclasscheck__ and __instancecheck__ methods and eliminate the entire issue. Frankly, it will likely be dramatically faster as well.
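A sketch of that suggestion (illustrative only, not proposed pydantic code): a metaclass that memoizes `__subclasscheck__` results so repeated misses don't re-walk the class hierarchy. Note that the memo dict itself holds strong references to the checked classes, so in practice it would need the same kind of size limiting discussed earlier in the thread.

```python
class CachedCheckMeta(type):
    """Metaclass that caches subclass checks instead of recursing every time."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._subclass_cache = {}  # per-class memo: candidate class -> bool

    def __subclasscheck__(cls, subclass):
        try:
            return cls._subclass_cache[subclass]
        except KeyError:
            result = super().__subclasscheck__(subclass)
            cls._subclass_cache[subclass] = result
            return result

    def __instancecheck__(cls, instance):
        # instance checks reduce to a (cached) subclass check on the type
        return cls.__subclasscheck__(type(instance))

class Base(metaclass=CachedCheckMeta):
    pass

class Child(Base):
    pass

print(issubclass(Child, Base), isinstance(object(), Base))  # True False
```

One design caveat: a cache like this never sees classes registered later (e.g. via `abc`-style virtual subclass registration), so it trades that flexibility for constant-time repeated checks.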
@zrothberg I don't see anything new here; also, I've already limited subclass checks in #4081. Obviously we can't remove
Lastly, this issue occurred between v1.8 and v1.9, both of which use abc.
@MarkusSintonen have you tried v1.9.1? Could you let me know how it performs?
Here is a rough-and-dirty proof-of-concept implementation of a class decorator that can be used alongside pydantic (or any other class) to provide abstract-method tracking without requiring ABCMeta. This version does require setting one value in the class namespace to track whether a class is abstract; that could be moved out to a single WeakSet that tracks them, so the decorator knows whether it needs to check for the methods. It uses `__init_subclass__` and preserves super and class definitions of those methods, so it should be usable just about anywhere. The one downside to this approach is that it requires passing a keyword argument in the class definition if you want to subclass and stay abstract: because decorators run after `__init_subclass__`, you cannot tell whether the class is abstract when it is called without one.

```python
from abc import abstractmethod

old_subclass = "_hidden_init_subclass"
abstract_location = "__isabstractmethod__"

## Helper code for the abstractClass decorator
def _gen_init_sub(abstract_methods):
    def __init_subclass__(cls, **kwargs):
        if kwargs.pop("_supress_abstract", False):
            # subclass explicitly asked to stay abstract
            setattr(cls, abstract_location, True)
        elif vars(cls).get(abstract_location, False):
            pass
        else:
            for name in abstract_methods:
                value = getattr(cls, name, None)
                # a concrete (non-abstract) override satisfies the requirement
                if value is not None and not getattr(value, abstract_location, False):
                    continue
                raise ValueError(
                    f"Class {cls} did not fill in all abstract methods; missing {name}")
        return kwargs
    return __init_subclass__

def _make_init_sub(oldcls, abstract_methods, previous_method):
    partial = _gen_init_sub(abstract_methods)
    if previous_method:
        def __init_subclass__(cls, **kwargs):
            newkwargs = partial(cls, **kwargs)
            sidecall = getattr(cls, old_subclass)
            sidecall(**newkwargs)
        setattr(oldcls, old_subclass, previous_method)
        setattr(oldcls, "__init_subclass__", classmethod(__init_subclass__))
    else:
        def __init_subclass__(cls, **kwargs):
            newkwargs = partial(cls, **kwargs)
            super(oldcls, cls).__init_subclass__(**newkwargs)
        setattr(oldcls, "__init_subclass__", classmethod(__init_subclass__))

def _gather_abstract(tocheck):
    return {name
            for name, value in tocheck.items()
            if getattr(value, abstract_location, False)}

def _generate_subclass_init(cls):
    tocheck = vars(cls)
    abstract_methods = _gather_abstract(tocheck)
    if abstract_methods:
        previoussub = tocheck.get("__init_subclass__", None)
        _make_init_sub(cls, abstract_methods, previoussub)
        return
    raise ValueError("No abstract methods were defined")

# Decorator to make a class abstract
def abstractClass(class_to_wrap):
    _generate_subclass_init(class_to_wrap)
    return class_to_wrap

@abstractClass
class NewClass:
    bar: str

    def __init__(self):
        self.bar = "bar"

    @classmethod
    def __init_subclass__(cls, **kwargs):
        # this call is preserved
        super().__init_subclass__(**kwargs)

    @abstractmethod
    def foo(self): ...

@abstractClass
class SecondClass(NewClass, _supress_abstract=True):
    # _supress_abstract is required to subclass an
    # abstract class without fulfilling the parent methods
    def foo(self):
        pass

    @abstractmethod
    def baz(self): ...

class ThirdClass(SecondClass):
    # fill out the last abstract method
    def baz(self):
        pass
```
Confirming that we had huge memory issues with |
Is this solved for you by v1.10?
While the memory usage does seem a bit higher than version
Thank you!
Thanks; if you can track down where the leak is coming from, please create a new issue. It could be
Hi! We upgraded to Pydantic 1.10.2 from 1.8.2. We didn't observe any memory anomaly anymore like we did with 1.9.0, so I think the main issue has been solved. Thank you a lot! 💯
Great news, thanks so much for confirming.
Closing the issue then, thanks.
Confirming the bug does not occur in
For my platform, this bug seems to affect pydantic
platform: linux, python 3.8.8
Anyone using large numbers of generics: it would be great if you could review (and maybe install and test) #5052. I'd like to release this soon, but I'm mindful of trying not to break anything in this area, which is pretty complex.
Checks
Bug
Output of `python -c "import pydantic.utils; print(pydantic.utils.version_info())"`:

As a heads-up, we are seeing a memory leak in production after upgrading to Pydantic version 1.9.0. Our application is a large FastAPI based API service (`fastapi 0.73.0`, `uvicorn 0.17.4`). The application uses Pydantic `BaseModel`s, `GenericModel`s and `pydantic.dataclasses.dataclass`es for FastAPI response/request data models. It also uses Pydantic models for some internal data validators.

We deployed a single change to production, which changed the version of Pydantic from 1.8.2 to 1.9.0. After that we started to observe abnormal memory usage. The memory usage kept increasing until the version upgrade was reverted back to the older 1.8.2 version (again with no other change). After deploying the revert, the memory usage patterns went back to normal.

Unfortunately I haven't been able to reproduce the issue locally (it is an effort to simulate the prod-like load to get the leak to show up), so I cannot yet add further information from isolating it, or information from e.g. memory analyzers. I also tried to look into the changelog and diff, but it is such a big release that I cannot spot anything obvious that might cause this. I have attached below a picture from our production metrics showing the abnormal behaviour after using v1.9.0. We have never observed such a memory issue until now when upgrading to v1.9.0.
Thank you all for the awesome work and the great library! I wish I had more information about the leak for the ticket.
Above shows the different deploys in colors. The orange one is where the Pydantic 1.9.0 update was running until it was reverted due to the memory issue. We couldn't have run it any longer, as it was close to hitting the Kubernetes pod memory limits at that point.