-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Performance boost: skip caching parent namespaces in most cases #10113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
My rough benchmark, testing building 7500 ish models: |
CodSpeed Performance ReportMerging #10113 will not alter performanceComparing Summary
|
|
Also open to ideas, maybe we don't do this in this function, but rather build out another function to do this specific check - looks like we're slowing down type adapter stuff a bit, which makes sense bc getting the frame info is slowing down |
|
I'll have time this evening to experiment with variations on this pattern - @dmontagu what do you think about this idea in general. |
|
@MarkusSintonen ping as well 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can get the docs tests passing with this I'm good with it in principle; if we make this change we should also change the name and docstring of that function to reflect the fact that it returns None for top-level module namespaces.
This sounds rather worrying. TypeAdapters are used at extreme levels eg in every FastAPI application 🤔 Is there any other way to do this? Or skipping the parent namespace stuff in TypeAdapters altogether? |
Yep, I'll find an alternative that doesn't slow TA builds. |
Deploying pydantic-docs with
|
| Latest commit: |
2ea648d
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://6eae71c4.pydantic-docs.pages.dev |
| Branch Preview URL: | https://namespace-change.pydantic-docs.pages.dev |
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||
Co-authored-by: David Montague <35119617+dmontagu@users.noreply.github.com>
| # if f_back is None, it's the global module namespace and we don't need to include it here | ||
| if frame.f_back is None: | ||
| # if the class is defined at the top module level, we don't need to add namespace information | ||
| if frame.f_back is None or frame.f_code.co_name == '<module>': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, won't surprise me if we need to use a getattr here because some implementation of python has f_code is None or something, that feels like the kind of thing that might happen. But I don't know if those would have a pydantic-core release that worked with them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to be ok for the moment, but will keep an eye out for this as a potential source of bugs in the future!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inspect.getframeinfo (which was used before in this PR) also assumes frame.f_code.co_name is defined.
|
@MarkusSintonen woohoo, with 5742f82, we're no longer seeing the 10% slowdown :) |
|
On memory usage, for the case mentioned above with 7.5k models, we see 1/3 the memory usage 🚀 with this change
this branch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Great find!
| # if f_back is None, it's the global module namespace and we don't need to include it here | ||
| if frame.f_back is None: | ||
| # if the class is defined at the top module level, we don't need to add namespace information | ||
| if frame.f_back is None or frame.f_code.co_name == '<module>': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Going to merge. Excited to see impact in v2.9! |
|
hey folks! I am trying to understand what potential breaking change occurred here ( We have something like: if TYPE_CHECKING:
from prefect.flows import Flow
class EngineContext(RunContext):
...
flow: Optional["Flow"] = Nonewhich on import ( » python -c "from prefect import flow"
Traceback (most recent call last):
File "/Users/nate/github.com/prefecthq/prefect/.venv/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py", line 863, in _resolve_forward_ref
obj = _typing_extra.eval_type_backport(obj, globalns=self._types_namespace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/nate/github.com/prefecthq/prefect/.venv/lib/python3.12/site-packages/pydantic/_internal/_typing_extra.py", line 287, in eval_type_backport
return _eval_type_backport(value, globalns, localns, type_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/nate/github.com/prefecthq/prefect/.venv/lib/python3.12/site-packages/pydantic/_internal/_typing_extra.py", line 311, in _eval_type_backport
return _eval_type(value, globalns, localns, type_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/nate/github.com/prefecthq/prefect/.venv/lib/python3.12/site-packages/pydantic/_internal/_typing_extra.py", line 340, in _eval_type
return typing._eval_type( # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.12/3.12.5/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 415, in _eval_type
return t._evaluate(globalns, localns, type_params, recursive_guard=recursive_guard)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.12/3.12.5/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 947, in _evaluate
eval(self.__forward_code__, globalns, localns),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<string>", line 1, in <module>
NameError: name 'Flow' is not defined
...
File "<string>", line 1, in <module>
pydantic.errors.PydanticUndefinedAnnotation: name 'Flow' is not defined
For further information visit https://errors.pydantic.dev/2.9/u/undefined-annotationI'm of course suspicious that we were depending on something weird, but I haven't yet found our error and I wasn't immediately seeing how postponed annotations or the other suggestions I found from the usage error docs would help, but could totally be missing something. this is something we're currently doing to avoid circular imports im still very much looking into this, just figured i'd post here in case you could shortcut my investigation happy to open an issue if that seems appropriate EDIT (as I investigate more) unsurprisingly given the diff, if I remove this extension of the condition I also tried replacing |
|
This is what happens in your case, step by step:
if TYPE_CHECKING:
from prefect.flows import Flow
class EngineContext(BaseModel):
flow: Optional["Flow"]
from prefect.flows import Flow
from prefect.context import EngineContext
EngineContext.model_rebuild()
To work around this, I would provide an explicit @sydney-runkle, there are probably other projects relying on this behavior (probably without explicit knowledge). In the case of On the other hand, I feel like including the parent namespace during So I think we can either have a boolean flag in the definition of |
|
Hey @zzstoatzz, Thanks so much for reporting this. I'll look around for a fix on our end for this today - we definitely don't want to introduce this breaking change with an official minor release. The goal of this namespace modification stuff was to improve performance for schema builds + reduce unnecessary memory usage, but we certainly don't want users to have to manually modify types namespaces in cases where they didn't have to before. Will keep you updated regarding a fix! |
|
I've created #10253 to track updates here. |
|
thank you @Viicos and @sydney-runkle! the context is super helpful, and with your I'll keep up with that tracking issue 👍 |
|
@sydney-runkle, I was debugging an issue with the newer version of pydantic and after some investigation using We have a Python script that is dynamically generated using datamodel-code-generator and is executed within a function using # models.py
from __future__ import annotations
from pydantic import BaseModel
class Model1(BaseModel):
pass
class Model2(BaseModel):
model1: Model1
print(Model2.model_fields)# script.py
from pathlib import Path
def f():
source = Path("models.py").read_text()
exec(source)
f()I get a different result when I run $ python models.py
{'model1': FieldInfo(annotation=Model1, required=True)}
$ python script.py
{'model1': FieldInfo(annotation=ForwardRef('Model1'), required=True)}In the second case, I’m seeing a Adding I can open an issue if that would be helpful. Thank you. :) |
|
@skshetry does it work if you remove |
Thank you for your comment. If I remove $ python script.py
Traceback (most recent call last):
File "/Users/user/script.py", line 9, in <module>
f()
File "/Users/user/script.py", line 6, in f
exec(source)
File "<string>", line 8, in <module>
File "<string>", line 9, in Model2
NameError: name 'Model1' is not definedFor comparison, if I directly invoke, it works fine. $ python models.py
{'model1': FieldInfo(annotation=Model1, required=True)} |
|
In your case, using
def f():
exec(source, global())
print(Model2) # `<class '__main__.Model2'>`
f()You may encounter new issues when using More reading: |
We do pass def f():
local_vars = {}
source = Path("models.py").read_text()
exec(source, globals(), local_vars)
# access model from `local_vars`
spec = local_vars["spec"]
f()from __future__ import annotations
from pydantic import BaseModel
class Model1(BaseModel):
pass
class Model2(BaseModel):
model1: Model1
model_fields = Model2.model_fields
print(model_fields)
spec = Model2I guess it's a similar issue here (or, that the scope is not same?) |
|
By explicitly passing locals, you are hitting issues unrelated to Pydantic. Let's say A = 1
class B:
a = AAs per the documentation:
So doing the following: def f():
...
exec(source, globals(), local_vars)Is equivalent to: def f():
...
class SomeClass:
A = 1
class B:
a = AWhich will result in a What prevents you from importing |
We use Also, we don't want to pollute user's workspace, and |
@skshetry have you tried using |
We use pydantic and We probably could have generated a jsonschema or implemented other mechanisms, but that's how it's implemented at the moment. |
I agree with @Viicos' analysis here 👍 |
#10074 gives an in depth description of the troubles of our parent namespace approach.
In this PR, currently:
We skip any parent namespace fetching logic unless a model is defined in a function, where this becomes necessary - we don't need a namespace cache otherwise because we fetch the global ns when necessary during the schema build, ex, via this function.
Docs tests are no longer failing - the issue was addressed in Add new module to
sys.modulesso that namespace ops are clean pytest-examples#35.In future PRs, or maybe this one, I'd like to:
I wouldn't say this justifies closing #10074, but it gets us much closer!