Plan to support pydantic 2.x versions #187

Closed
sjsinghai opened this issue Nov 22, 2023 · 7 comments · Fixed by #246
Comments

@sjsinghai

When can we expect support for pydantic 2.x versions?

@cmgrote
Collaborator

cmgrote commented Nov 28, 2023

Hi @sjsinghai, we investigated this a couple of months ago but ran into severe performance issues at the time (operations that had been sub-second suddenly took minutes). We'd also like to make the move, but not at the cost of slowing down the SDK by the amount we observed...

Curious: what particular features are you most looking forward to adopting or are reliant on with pydantic v2.x?

Perhaps even if we're still unable to move to pydantic 2.x we can look at some other way of unlocking those needs.

@sjsinghai
Author

sjsinghai commented Nov 28, 2023

That is a bit surprising, given that the main thrust of pydantic 2.x is the speed-up offered by the improved backend.

In fact, it is because of the promised speed-up that we are investigating a move of our stack to 2.x.

Have you considered using the pydantic.v1 namespace that ships with 2.x, e.g. as documented here?
https://docs.pydantic.dev/latest/migration/#continue-using-pydantic-v1-features

This should:

  • unblock others from using 2.x
  • have minimal impact on your existing performance.
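
For readers unfamiliar with the suggestion above, here is a minimal sketch of importing the V1 API from a Pydantic 2.x install. The `Asset` model and its fields are hypothetical and not taken from the SDK, and the fallback import assumes an older 1.x install that lacks the `pydantic.v1` namespace.

```python
# Prefer the V1 API bundled with Pydantic 2.x; fall back to a plain
# Pydantic 1.x install if the pydantic.v1 namespace is not available.
try:
    from pydantic.v1 import BaseModel, Field
except ImportError:
    from pydantic import BaseModel, Field


class Asset(BaseModel):  # hypothetical model, for illustration only
    name: str
    qualified_name: str = Field(alias="qualifiedName")

    class Config:
        # V1-style config: allow both the alias and the field name as input.
        allow_population_by_field_name = True


# V1-style methods (parse_obj, dict) keep working under this import.
asset = Asset.parse_obj({"name": "orders", "qualifiedName": "db/schema/orders"})
print(asset.dict(by_alias=True))
```

Existing V1-style code should only need its import line changed under this approach, which is why the performance impact is expected to be small.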

@cmgrote
Collaborator

cmgrote commented Nov 30, 2023

Here's the thread over on pydantic's repo — seems at least one other adopter has faced similar challenges: pydantic/pydantic#7263

@tdebroc

tdebroc commented Feb 9, 2024

Agreed, we need that as well. I sent an email to Atlan support. The migration is pretty smooth; we did it in our repo.

@cmgrote
Collaborator

cmgrote commented Feb 9, 2024

We're looking at this again — we definitely want to get there, too!

@cmgrote
Collaborator

cmgrote commented Feb 13, 2024

We're making more progress on this now than we had previously, but still some work to do.

However, to give an update on our direction:

  • We will be using the V1 API (or "compatibility layer", as I call it) that Pydantic 2.x exposes as pydantic.v1.
  • In this way we will remove the dependency on the Pydantic 1.x libraries.
  • This should allow folks to use "pure" Pydantic 2.x, if desired, in their own code without any conflicting libraries.
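
To illustrate the last point, here is a small sketch of a V1-API model and a "pure" V2 model living side by side in one process on a single Pydantic 2.x install. `SdkAsset` and `UserConfig` are hypothetical names standing in for SDK-side and user-side classes; neither comes from the actual SDK.

```python
import pydantic          # V2 API, as user code might import it
from pydantic import v1  # V1 API (compatibility layer) from the same install


class SdkAsset(v1.BaseModel):  # hypothetical SDK-side model on the V1 API
    name: str


class UserConfig(pydantic.BaseModel):  # hypothetical user-side model on the V2 API
    tenant: str
    retries: int = 3


# Each model uses the methods of its own API generation.
sdk_obj = SdkAsset.parse_obj({"name": "orders"})
user_obj = UserConfig.model_validate({"tenant": "acme"})
print(sdk_obj.dict(), user_obj.model_dump())
```

The two kinds of model are kept separate here on purpose: as far as I know, Pydantic does not support nesting a V1-API model as a field of a V2 model, so user code would pass data between them as plain dictionaries.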

A bit more context:

We had hoped to move to "pure" Pydantic V2 ourselves, but the performance hit is simply unacceptable. We have proven this by:

  1. Starting with a subset of our (complex) data model (~30 out of 100+ classes).
  2. Modeling out the true inheritance and inter-relationships within that subset.
  3. Using the latest released version of Pydantic 2.

First, we tested importing straight from pydantic (so V2) and creating instances of all the various classes in that subset of the model. We updated our code to call the model_rebuild() method only where it was needed for the inter-relationships to resolve, so that all the classes could be instantiated successfully. (Rationale: running model_rebuild() against every class made the imports alone take significantly longer.)

Result: 35 seconds on average to go through the imports, model rebuilds, and class instantiations. ❌

Next, we took exactly the same code but replaced the import straight from pydantic with an import from pydantic.v1, to use the V1 APIs (compatibility layer). We also replaced the model_rebuild() calls with update_forward_refs() calls, and then re-ran the same code to instantiate all the classes.

Result: 0.8 seconds on average to go through the imports, forward ref updates, and class instantiations. ✅
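
The shape of the two runs described above can be sketched with a toy harness like the one below. It is not the benchmark that produced the numbers in this comment: `Table` and `Column` are hypothetical stand-ins for the real data model, the timing covers only class definition, forward-reference resolution, and instantiation, and two tiny classes will not reproduce the gap reported for ~30 interrelated ones.

```python
import time
from typing import List, Optional


def run_v2(n: int) -> float:
    """Define, rebuild, and instantiate two interrelated models on the V2 API."""
    from pydantic import BaseModel

    start = time.perf_counter()

    class Table(BaseModel):
        name: str
        columns: List["Column"] = []  # forward reference, resolved below

    class Column(BaseModel):
        name: str
        table: Optional["Table"] = None

    # Resolve the forward references now that both classes exist.
    Table.model_rebuild()
    Column.model_rebuild()

    for i in range(n):
        Table(name=f"t{i}", columns=[Column(name="c")])
    return time.perf_counter() - start


def run_v1(n: int) -> float:
    """Same models on the V1 API, using update_forward_refs() instead."""
    from pydantic.v1 import BaseModel

    start = time.perf_counter()

    class Table(BaseModel):
        name: str
        columns: List["Column"] = []

    class Column(BaseModel):
        name: str
        table: Optional["Table"] = None

    # Pass the local classes explicitly, since they are not module globals.
    Table.update_forward_refs(Column=Column)
    Column.update_forward_refs(Table=Table)

    for i in range(n):
        Table(name=f"t{i}", columns=[Column(name="c")])
    return time.perf_counter() - start


print(f"V2 API: {run_v2(1000):.3f}s")
print(f"V1 API: {run_v1(1000):.3f}s")
```

Swapping in a larger, more deeply interrelated set of classes is what would make a comparison like the one reported here meaningful.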

Given this is only a subset of our model, we expect the performance degradation of "pure" V2 to be even worse with the full complexity of our model (likely growing from roughly 44x slower, i.e. 35 seconds vs. 0.8 seconds, to close to 100x slower or more). We will therefore stick with the V1 API for the foreseeable future, though we do intend to update to v2 of the library itself.

@rajivramanathan

Yes, that sounds like the right approach, without having to swallow v2 in one shot. It will help us a lot in the short term, @cmgrote.

@cmgrote cmgrote added the feature New feature or request label Feb 21, 2024
@cmgrote cmgrote linked a pull request Feb 21, 2024 that will close this issue