10x speedup over V1 - use `TypedDict` instead of `BaseModel` #1

samuelcolvin · 2023-07-02T17:59:57Z

Hi, I saw your LinkedIn post and thought I'd have a look at why the speedup is only 5x.

It looks like the reasons are:

- most of the time is spent in "serialization", namely converting the model to a dict
- the model is quite simple in this case: just str and float types, no nested models etc.

Since in this case you're not actually using the pydantic model, but converting it straight to a dict, I changed the code here to use a TypedDict, and do the excluding of None values during validation. This means there's no need for a formal "serialization" step - the output of validation to dicts is all you need.
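A minimal sketch of the idea described above, assuming pydantic v2 with `typing_extensions` installed. The `Wine` fields, the `_drop_none` validator name, and the sample record are hypothetical and are not taken from the PR's diff; the sketch only illustrates dropping `None` values inside a wrap validator so that the validated dict needs no separate serialization step.

```python
from typing import Any, Callable, Optional

from pydantic import TypeAdapter, model_validator
from typing_extensions import TypedDict  # pydantic asks for this variant on Python < 3.12


class Wine(TypedDict):
    # Hypothetical fields, chosen only for illustration
    id: int
    title: str
    price: Optional[float]
    country: Optional[str]

    @model_validator(mode="wrap")
    @classmethod
    def _drop_none(cls, value: Any, handler: Callable[[Any], dict]) -> dict:
        # Run the normal field validation first, then exclude None values
        validated = handler(value)
        return {k: v for k, v in validated.items() if v is not None}


# TypeAdapter validates plain Python data against the TypedDict definition
wine_adapter = TypeAdapter(Wine)

record = wine_adapter.validate_python(
    {"id": 1, "title": "Example Red", "price": None, "country": "France"}
)
print(record)
```

The result of `validate_python` here is an ordinary `dict`, so it can be handed to whatever consumes dicts without calling a dump method first.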

Hope this helps, let me know if you have any questions.

prrao87 · 2023-07-03T03:22:29Z

Thanks a lot for the PR, this will help me and a lot of others learn some cool features of Pydantic 😇

samuelcolvin · 2023-07-03T09:46:52Z

FYI we also have profile-guided optimisation (PGO) of the pydantic-core compilation (see pydantic/pydantic-core#686) coming soon, which should give another 10-30% speedup of pydantic-core.

prrao87 · 2023-07-07T17:44:39Z

Hi @samuelcolvin, I'm filing a bug report with VS Code's Pylance language server (which depends on the Pyright type checker). The linter seems to think that specifying a custom method such as @field_validator under the TypedDict object is incorrect syntax, and Pyright says that it's invalid Python (Error: TypedDict classes can contain only type annotations) -- however, the code runs and there's clearly no syntax error. Just to clarify, the error is with Pylance/Pyright, and this issue is only present if you use VS Code as your IDE (which I do), hence the bug report on their repo and not Pydantic's.

I've linked the issue above -- it looks like they're saying they're not sure if it's "correct" to inherit from TypedDict directly, and that inheriting from BaseModel is the right way to do it. What are your thoughts? I realize it's still early days and that the Pydantic v2 docs are still catching up, but is there anywhere you've documented how the inheritance from TypedDict works exactly? Any tips you could offer in understanding how Pydantic is able to handle the specification of custom methods under TypedDict objects even though the object was originally intended to just house struct-like fields?

My concern is that there has been an issue filed on the mypy repo, related to allowing custom methods within TypedDict that was closed as out of scope for this object type. I'm unsure if Pyright followed the same approach. So, if this is the case, how has Pydantic reconciled this and how does it pass type checkers if specifying @field_validator alongside TypedDicts?

Update

I just checked PEP 589 and it's clearly mentioned that methods are not allowed within TypedDict:

Methods are not allowed, since the runtime type of a TypedDict object will always be just dict (it is never a subclass of dict).

TypedDict isn’t extensible, and it addresses only a specific use case. TypedDict objects are regular dictionaries at runtime, and TypedDict cannot be used with other dictionary-like or mapping-like classes, including subclasses of dict. There is no way to add methods to TypedDict types. The motivation here is simplicity.

So I suppose the bigger questions are:

- Why does the code here in this repo work? Is this because TypedDict is purely for type hinting and the underlying dict is all that is seen at runtime?
- If the approach used in this PR, i.e., defining a TypedDict subclass, is non-standard and doesn't respect type hinting guidelines, should this approach even be used in a real Pydantic code base?
  - If not, how would one go about minimizing the serialization burden while also using field validators? I don't see any other way to do this other than subclassing BaseModel, which is what's currently shown in the Pydantic docs.
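
On the first question, the runtime behaviour can be checked with the standard library alone. The sketch below uses a hypothetical `Movie` class and `describe` method; it only shows what CPython does at runtime and says nothing about how Pydantic collects its validators.

```python
from typing import TypedDict


class Movie(TypedDict):
    title: str
    year: int

    # Static type checkers reject this method (PEP 589), but the class
    # statement itself executes without error at runtime.
    def describe(self) -> str:
        return "unused at runtime"


movie = Movie(title="Alien", year=1979)

# Instances are plain dicts, not instances of a Movie subclass
print(type(movie) is dict)

# The method cannot be reached from an instance
print(hasattr(movie, "describe"))

# It only lives in the namespace of the class object
print("describe" in vars(Movie))
```

So the restriction in PEP 589 is enforced by type checkers rather than by the interpreter, which is consistent with the code running while Pyright reports an error.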

prrao87 · 2023-07-07T18:53:49Z

I've started a discussion in the Pydantic repo regarding this, hope we can continue the discussion there!
pydantic/pydantic#6517

use TypedDict

628167c

prrao87 merged commit 6ef33c7 into prrao87:main Jul 3, 2023

prrao87 mentioned this pull request Jul 7, 2023

Pydantic v2 inheritance from TypedDict is flagged as error, even though it's valid Python microsoft/pylance-release#4584

Closed
