Fix TypeAdapter to respect defer_build #8939
CodSpeed Performance Report: merging #8939 will degrade performance by 12.43%.
please review
With this change, our FastAPI initialization drops from ~40s to ~10s, where core schema generation takes ~3-4s. This requires using
Hmm, I'll take an in-depth look shortly, but my initial thought here was that we weren't planning on adding support for
Thanks! That is interesting. I'm wondering why in the referenced issue it was suggested to use
If there are reservations about TypeAdapter supporting lazy core schema building like BaseModel does, you could add an environment variable similar to the one suggested in #6768 (comment) (which unfortunately didn't work). So e.g.
You can see here how our service startup time keeps going up over time (as additional data models and API models are added). It's now already over a minute, where the core schema building (in TypeAdapters) takes about a minute:
With Pydantic V1 the startup took about 10 seconds, so the issue is getting worse with V2.
@sydney-runkle any chance of allowing TypeAdapter to respect deferred building? Could we allow it via an environment flag as I suggested above? The slowness caused by core schemas is getting out of hand.
Thanks for the ping, and sorry for the delay! I'll bring this up in our standup meeting on Monday and get back to you then!
We'll discuss next week, but if it's helping you a lot and is opt in, I'm 👍.
Thank you! This would indeed help, at least until optimizations/caching are added to the CoreSchema generation. But implementing those probably isn't happening in the very near future.
I now made it an opt-in feature via the usual config object (instead of the magical global flags I originally suggested). See 7deb045. I also documented the current and the opt-in behaviour there.
@MarkusSintonen, we chatted this morning - let's move forward with this 🚀. I know you've implemented support for this as an opt in feature. I think we'll want to add some documentation explaining that this is experimental, and subject to change. Ultimately, a better solution will be to have TA build times improve significantly so that your changes aren't super necessary. I'll review thoroughly this afternoon :)
I'm going to take a closer look at the logic changes in type_adapter.py, but here's some general feedback on the new API :).
Thanks for your great work on this thus far!
I still refactored it a bit into smaller property functions with
Thanks for your hard work on this.
Left some comments ranging from more broad to nitpicky change requests!
I think it makes sense for us to go ahead with the 2.7 beta release given that we still have more work to do on this PR (and there's lots of fixes in that release that we want to go ahead and get out), but I'm happy to continue to work closely with you to get this across the line!
pydantic/_internal/_mock_val_ser.py
Outdated
```python
    def __contains__(self, key: Any) -> bool:
        return self._get_built().__contains__(key)

    def __getitem__(self, key: str) -> Any:
        return self._get_built().__getitem__(key)

    def __len__(self) -> int:
        return self._get_built().__len__()

    def __iter__(self) -> Iterator[str]:
        return self._get_built().__iter__()

    def _get_built(self) -> CoreSchema:
        if self._built_memo is not None:
            return self._built_memo

        if self._attempt_rebuild:
            schema = self._attempt_rebuild()
            if schema is not None:
                self._built_memo = schema
                return schema
        raise PydanticUserError(self._error_message, code=self._code)
```
Are these necessary?
What do you mean by necessary? For the abstract Mapping, __getitem__ / __len__ / __iter__ are required.
I'll remove the __contains__ as it doesn't need overriding.
Done, removed the unneeded __contains__ override
I guess I mean they aren't implemented for MockValSer, right? So why have them here?
CoreSchema is a dict, but e.g. SchemaValidator is an ordinary class.
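To illustrate the point about the abstract Mapping, here is a minimal stdlib-only sketch. `LazyMapping`, its `build` callable and the `build_calls` counter are hypothetical names for this example and are not taken from the PR. It shows that `collections.abc.Mapping` only requires `__getitem__`, `__len__` and `__iter__`, while `__contains__` is supplied by the mixin.

```python
from collections.abc import Mapping
from typing import Any, Callable, Dict, Iterator, Optional


class LazyMapping(Mapping):
    """Hypothetical stand-in for a mock schema: builds the real dict on first use."""

    def __init__(self, build: Callable[[], Dict[str, Any]]) -> None:
        self._build = build
        self._built: Optional[Dict[str, Any]] = None
        self.build_calls = 0  # only here so the example can observe laziness

    def _get_built(self) -> Dict[str, Any]:
        # Build once, then reuse the memoized result.
        if self._built is None:
            self.build_calls += 1
            self._built = self._build()
        return self._built

    # The three abstract methods that Mapping requires:
    def __getitem__(self, key: str) -> Any:
        return self._get_built()[key]

    def __len__(self) -> int:
        return len(self._get_built())

    def __iter__(self) -> Iterator[str]:
        return iter(self._get_built())


schema = LazyMapping(lambda: {"type": "int"})
has_type = "type" in schema  # __contains__ is provided by the Mapping mixin
as_dict = dict(schema)       # keys() is a mixin method as well
```

Because the mixin implements `__contains__` in terms of `__getitem__`, an explicit override only matters if a cheaper membership check is available.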
pydantic/type_adapter.py
Outdated
```python
def _frame_depth(depth: int) -> Callable[[Callable[..., R]], Callable[..., R]]:
    def wrapper(func: Callable[..., R]) -> Callable[..., R]:
        @wraps(func)
        def wrapped(self: TypeAdapter, *args: Any, **kwargs: Any) -> R:
            # depth + 1 for the wrapper function
            with self._with_frame_depth(depth + 1):
                return func(self, *args, **kwargs)

        return wrapped

    return wrapper
```
I think maybe it'd help to have a bit more documentation for this function and the frame depth function attached to the TypeAdapter class - it'd help me in my review as well!
Done, added some comments to this.
FYI, I'm wondering whether there is a better way than the _parent_depth handling. What about requiring the user to define a callback function that would resolve the forward refs from locals/globals? That is, replacing _parent_depth with something like resolve_namespace: Callable[[], dict[str, Any]], where the str is the name of the type. Then it wouldn't be as fragile as the parent depth handling, which could still be there, with the callable being preferred. Such a resolver would probably need to come via Config rather than directly from a constructor arg.
This is of course out of scope for this PR.
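A rough sketch of what such a callback could look like, using only the standard library. Everything here is hypothetical: `DeferredType`, its `resolve` method and the way the namespace is consumed are assumptions made for illustration, and nothing like this exists in pydantic as part of this PR.

```python
from typing import Any, Callable, Dict, List


class DeferredType:
    """Hypothetical holder of a forward reference resolved later via a callback."""

    def __init__(self, ref: str, resolve_namespace: Callable[[], Dict[str, Any]]) -> None:
        self._ref = ref
        self._resolve_namespace = resolve_namespace

    def resolve(self) -> Any:
        # Ask the user-supplied callback for names instead of inspecting stack frames.
        namespace = dict(self._resolve_namespace())
        # Forward references are plain expressions, so evaluate against that namespace.
        return eval(self._ref, namespace)


def build():
    # The callback closes over `Item`, which is only defined after the adapter-like object.
    deferred = DeferredType("List[Item]", lambda: {"List": List, "Item": Item})

    class Item:
        pass

    return deferred, Item


deferred, Item = build()
resolved = deferred.resolve()
```

The callback is only invoked when the type is actually resolved, so it does not matter how many call frames sit between the definition site and the point of resolution.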
I definitely wish we had a better way for users to specify/pass the namespace, but the reason for the parent depth thing was so that it would generally behave correctly without any extra work. At this point, I think it's probably not possible to get rid of _parent_depth unless we find a way to keep all existing code that currently relies on that from breaking. In v3 I think we could make a change to this if we felt it simplified things significantly or otherwise had (even indirect) import-time performance benefits
Yeah, there's no way to get rid of the parent depth handling, but some better (optional) way to resolve the names could be offered via config, since whether the parent depth handling works depends heavily on the context in which the model happens to be used.
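The fragility discussed in this thread can be reproduced with a small stdlib-only sketch. The helpers `parent_namespace`, `lookup` and `passthrough` are hypothetical names for this example and are not pydantic functions. It shows why a wrapper such as the `_frame_depth` decorator has to add one to the depth: every extra call layer shifts which frame a fixed depth points at.

```python
import sys
from functools import wraps
from typing import Any, Callable, Dict


def parent_namespace(depth: int) -> Dict[str, Any]:
    # sys._getframe(0) is this function's own frame, 1 is its caller, and so on.
    return dict(sys._getframe(depth).f_locals)


def lookup(name: str, depth: int = 2) -> Any:
    # depth=2 skips parent_namespace and lookup itself, landing in lookup's caller.
    return parent_namespace(depth).get(name)


def passthrough(func: Callable[..., Any]) -> Callable[..., Any]:
    @wraps(func)
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        return func(*args, **kwargs)  # adds one extra frame to the call stack

    return wrapped


wrapped_lookup = passthrough(lookup)


def user_code():
    Marker = "defined in user_code"
    direct = lookup("Marker")                        # resolves in user_code's frame
    via_wrapper = wrapped_lookup("Marker")           # lands in `wrapped`, finds nothing
    compensated = wrapped_lookup("Marker", depth=3)  # depth + 1 for the wrapper
    return direct, via_wrapper, compensated


result = user_code()
```

The same lookup succeeds or fails depending purely on how many functions sit between the user's code and the frame inspection, which is the context dependence described above.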
This is quite impressive. I have the usual fear that there may be some surprises lurking but it generally seems quite solid. The "changes" to the behavior (such as not erroring when you access the core schema if it needs a rebuild, but only when you try to use it in some way) seem fine to me, so deep in the internals that I'm not very concerned. (Until someone reports a bug this causes... 🙂)
Thank you for the thorough review! Did the new round of changes
Yes, that was also my conclusion. It's so deeply internal that no one should rely on it (until they do). I also feel like it's now more consistent with the rest of the MockValSer behavior with lazy building.
```python
        self._error_message = error_message
        self._code: PydanticErrorCodes = code
        self._attempt_rebuild = attempt_rebuild
        self._built_memo: CoreSchema | None = None
```
I used the _built_memo memoizer here to avoid cases where a user captures the internal mock class and keeps using it. Each use would otherwise go through the model rebuilder again and again, which has the core schema etc. memoized deep inside once it's built. For consistency, should the existing MockValSer also get its own memoizer, so it won't accidentally go through the wrapped class's deep memoizer again and again?
Going to do a 2.7.1 patch release soon (Friday or Monday), then would love to get this merged. Will ping you if we have any additional questions before merging :).
Really appreciate your work on this - the thorough testing and many rounds of iteration are certainly quite appreciated!!
Going ahead and merging this now - I've acknowledged the one schema building benchmark that experienced a bit of a regression. I'd love to see that improve again before we release 2.8, but I'm not particularly worried about the magnitude of the change.
Great work!
We'd more than welcome more PRs from you in the future 🚀! I'm guessing that you have an awesome grasp on lots of the mock validator and serializer logic at this point 😁.
Thanks @sydney-runkle for the thorough review! :)
I noticed that too, and the very small difference is coming from the

Change Summary
Makes
TypeAdapterto respectdefer_buildso it constructs the core schema on first validation when_defer_build_modeis set to includetype_adapter.Related issue number
Partly related to #6768 but this does not fix the root performance issue. But allows
defer_buildto work with FastAPI (which heavily relies onTypeAdapter+Annotatedunder the hood).Checklist
Selected Reviewer: @samuelcolvin
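The behaviour described in the Change Summary can be pictured with a heavily simplified, stdlib-only sketch. `SketchAdapter`, its `builds` counter and the `isinstance`-based validator are assumptions made for illustration. This is not pydantic's TypeAdapter and says nothing about its real internals. It only shows the trade-off: eager construction pays the build cost at startup, deferred construction pays it once on first validation.

```python
from typing import Any, Callable, Optional


class SketchAdapter:
    """Hypothetical, simplified adapter illustrating deferred building."""

    def __init__(self, tp: type, *, defer_build: bool = False) -> None:
        self._tp = tp
        self._validator: Optional[Callable[[Any], Any]] = None
        self.builds = 0  # counts how often the (notionally expensive) build ran
        if not defer_build:
            self._build()

    def _build(self) -> Callable[[Any], Any]:
        # Stand-in for expensive core schema and validator generation.
        self.builds += 1
        tp = self._tp

        def validate(value: Any) -> Any:
            if not isinstance(value, tp):
                raise TypeError(f"expected {tp.__name__}, got {type(value).__name__}")
            return value

        self._validator = validate
        return validate

    def validate_python(self, value: Any) -> Any:
        # Build lazily on first use, then reuse the stored validator.
        validator = self._validator or self._build()
        return validator(value)


eager = SketchAdapter(int)
lazy = SketchAdapter(int, defer_build=True)
builds_at_startup = (eager.builds, lazy.builds)

lazy.validate_python(1)
lazy.validate_python(2)
builds_after_use = (eager.builds, lazy.builds)
```

With many adapters created at import time, as in the FastAPI case described in the conversation, deferring moves the build cost away from startup and onto the first request that touches each adapter.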