Consider releasing Pydantic V2 under a different python package name. #6523

Closed
1 task done
jolynch opened this issue Jul 7, 2023 · 11 comments

@jolynch

jolynch commented Jul 7, 2023

Initial Checks

  • I confirm that I'm using Pydantic V2 installed directly from the main branch, or equivalent

Description

As projects across our org (Netflix) have been pulling the v2 release, they are breaking in pretty significant ways. Would the project maintainers consider renaming the python package to pydantic2 (and probably pydantic-core as well) so users can install both at once and start migrating safely and incrementally? It appears you are already doing branch-based development, so the patch to release under a new package name should be straightforward, although there are a few non-relative imports that would have to be fixed. Understood that purging existing 2.x artifacts might be tough; doing this before the 2.x release would have been easier. A potential way to work around that would be to start releasing 1.x as pydantic1 (emulating what Elasticsearch does, where the non-versioned package name is always the latest major, but libraries can pin to their major version's package to cohabitate).

In particular, I'm a maintainer of a library which uses pydantic, and it's somewhat difficult to write and test code that works with both v1 and v2 consumers of the library. If they were different package names it would just work. I've seen this approach from FastAPI to try to work around this and stay compatible, but that just seems exceptionally unfortunate.

The technique of using a separate package name for major API changes is used successfully across the python ecosystem: examples are the boto -> boto3 transition and the Elasticsearch clients, which have done it since 5.0.

Example Code

No response

Python, Pydantic & OS Version

$ .tox/py/bin/python -c "import pydantic.version; print(pydantic.version.version_info())"

             pydantic version: 2.0.2
        pydantic-core version: 2.1.2 release build profile
                 install path: <venv path>
               python version: 3.8.10 (default, May 26 2023, 14:05:08)  [GCC 9.4.0]
                     platform: Linux-5.15.0-76-generic-x86_64-with-glibc2.29
     optional deps. installed: ['typing-extensions']

Selected Assignee: @samuelcolvin

@jolynch jolynch added bug V2 Bug related to Pydantic V2 unconfirmed Bug not yet confirmed as valid/applicable labels Jul 7, 2023
@jolynch
Author

jolynch commented Jul 7, 2023

Also I'm aware of #5402 but I just wanted to surface just how much pain this is causing at least for our organization and provide a forum for others to surface the pain as well. Application owners have been scrambling to fix builds all week. For a library owner, there is no way forward other than building the compatibility shims or coordinating with consuming applications to perform the upgrade.

@samuelcolvin
Member

I'm very sorry about the breaking changes that come with Pydantic V2; unfortunately most are unavoidable consequences of fixing quirks in V1 behaviour, and the rest are results of the comprehensive rewrite. We've tried very hard to limit these, but we only have so many resources - we're not a big company like Netflix 😸 .

We're not going to release a pydantic2 package, and the right place to discuss such things is on the existing issues #5402 and #4649 which already have a lot of detailed discussion.

If you want to use Pydantic V2 and V1 at the same time, we specifically have pydantic.v1 in the V2 codebase so you can use V1 with V2 installed.

I'm very surprised that a large, knowledgeable organisation like Netflix has had problems with V2 breaking things unexpectedly. Do you not pin dependencies and write unit tests?

Even if you are having those problems, releasing a pydantic2 package wouldn't help - running pip install pydantic would still install pydantic >2. It goes without saying that yanking or deleting Pydantic V2 releases in PyPI, conda-forge and the many linux distributions that ship pydantic is an absolute non-starter.


TL;DR:

  • if you're using pydantic as an application, pin to <2 until you have time to upgrade
  • if you're using pydantic as a package, either pin to <2 or implement separate logic to support both V1 and V2 as FastAPI has done
  • in either case you can use from pydantic.v1 import BaseModel, ... to continue to use pydantic V1 while you're upgrading to V2
  • use bump-pydantic to help automate the upgrade to V2, I've heard people have found it very useful
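
The "separate logic" in the second bullet typically starts by detecting which major version is installed. A minimal sketch using only the standard library follows. The helper names major_version and installed_pydantic_major are hypothetical (they are not part of pydantic), and FastAPI's actual implementation may differ.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.0.2'."""
    return int(version.split(".", 1)[0])


def installed_pydantic_major() -> Optional[int]:
    """Major version of the installed pydantic distribution, or None if absent."""
    try:
        return major_version(metadata.version("pydantic"))
    except metadata.PackageNotFoundError:
        return None


# A library can branch on this flag instead of pinning to one major version.
PYDANTIC_V2 = installed_pydantic_major() == 2
```

Code paths that differ between the two APIs can then be guarded with `if PYDANTIC_V2:`, which keeps a single library release installable next to either major version.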

@jolynch
Author

jolynch commented Jul 8, 2023

I'm very sorry about the breaking changes that come with Pydantic V2; unfortunately most are unavoidable consequences of fixing quirks in V1 behaviour, and the rest are results of the comprehensive rewrite. We've tried very hard to limit these

I and others here really love pydantic and I'm sure that the V2 changes are great; it has made its way into many projects here because developers really find value in this package (and fastapi). Let me start by saying I understand how hard open source maintenance is, and I appreciate your efforts. This issue was meant to surface the user pain and make a reasonably small ask that would relieve a large amount of that pain, and probably relieve a large amount of pain for the maintainers too, because you wouldn't have to deal with all the "you broke me" issues.

We're not going to release a pydantic2 package, and the right place to discuss such things is on the existing issues #5402 and #4649 which already have a lot of detailed discussion.

Yeah I found those, and I just wanted to surface that this is a real issue and how accurate @tlambert03 is in their arguments. As a library owner, supporting applications on both v1 and v2 is difficult. The standard way in the python ecosystem to release a backwards incompatible library is to rename the package so the versions can cohabitate. I'd be surprised if other people are not running into this as well.

I'm very surprised that a large, knowledgeable organisation like Netflix has had problems with V2 breaking things unexpectedly. Do you not pin dependencies and write unit tests?

Some applications have unit tests and only their builds broke (which is still work) - some applications don't have thorough test coverage and broke in production (not assigning blame, just sharing experiences).

Most software at large companies doesn't do max version pinning since we want to stay on latest, and usually when large major breaking API changes are introduced the packages are renamed to avoid large amounts of user pain. Netflix is a relatively small Python shop (mostly does Java), but at my previous company Yelp which had 10+ million lines of python, library authors were forbidden from breaking their public API without a package name change due to the hundreds of hours of work that go into upgrading when libraries break compatibility.

Even if you are having those problems, releasing a pydantic2 package wouldn't help - running pip install pydantic would still install pydantic >2. It goes without saying that yanking or deleting Pydantic V2 releases in PyPI, conda-forge and the many linux distributions that ship pydantic is an absolute non-starter.

This is not accurate reasoning. With a separate package name, libraries can depend on the major API they need, and applications can then pick whatever version is best for them. With different names, different parts of an app don't have to agree; instead, major versions can cohabitate. Even if you just released the old packages with a version number suffix (as Elasticsearch has done), that would resolve it, because a library can depend on pydantic1 until you release pydantic2, and then depend on that. Releasing a pydantic2 package in addition would also help folks move forwards, because we can roll libraries forward to pydantic2 while apps stay pinned to pydantic<2. I can

if you're using pydantic as an application, pin to <2 until you have time to upgrade

Yup we've already been going through and pinning applications to <2 (since most folks don't pre-emptively max pin) and are now scheduling a few engineering weeks of time to go upgrade all the broken applications since we would like to stay on latest - I fear though that a lot of folks will just stay pinned down for a long time.

if you're using pydantic as a package, either pin to <2 or implement separate logic to support both V1 and V2 as FastAPI has done

Pinning to <2 breaks any including application that has upgraded to >=2. The only path forward for a library is to do the (frankly complex) hacks that FastAPI has done to cohabitate. The simpler thing from a maintenance perspective would actually be to vendor the package in its entirety and rename the package to pydantic1 ourselves (like you did in v2 with the pydantic.v1 package).

in either case you can use from pydantic.v1 import BaseModel, ... to continue to use pydantic V1 while you're upgrading to V2

No, this doesn't solve the problem for libraries since they don't control the version that is installed by the application. Most applications are just pinning fwiw since they don't want to introduce the changes that can break.

use bump-pydantic to help automate the upgrade to V2, I've heard people have found it very useful

Yes for applications this is useful, thank you to whoever built that. For libraries it is not useful because the library doesn't control the installed version.

@tlambert03
Contributor

sorry to hear you're hurting here...

No, this doesn't solve the problem for libraries since they don't control the version that is installed by the application. Most applications are just pinning fwiw since they don't want to introduce the changes that can break.

doesn't it help a little bit? For our libraries we've been doing this somewhere internally

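try:
    # Pydantic V2 is installed: use the V1 API it ships as pydantic.v1
    from pydantic import v1 as pydantic
except ImportError:
    # No pydantic.v1 submodule: the top-level package is already V1
    import pydantic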

so, if you end up with v2, you're fine, and if you end up with v1, you're also fine

@adriangb
Member

adriangb commented Jul 8, 2023

To try to make something positive of this conversation, I think it would be really helpful if you could share specifics on how things broke, either publicly or privately, if that’s easier. Something like X number of applications hit errors related to this change, Y applications were broken by this other change, etc. That data would be really useful in identifying ways we can make the upgrade less likely to break things in the first place. Like Samuel said, some changes are unavoidable, but they can be softened by adding shims with deprecation warnings or at least documenting clearer migration paths.

@jolynch
Author

jolynch commented Jul 8, 2023

sorry to hear you're hurting here...

No, this doesn't solve the problem for libraries since they don't control the version that is installed by the application. Most applications are just pinning fwiw since they don't want to introduce the changes that can break.

doesn't it help a little bit? For our libraries we've been doing this somewhere internally

try:
    from pydantic import v1 as pydantic
except ImportError:
    import pydantic

so, if you end up with v2, you're fine, and if you end up with v1, you're also fine

It does help by avoiding the <2 version pin in the library, which is obviously a non-starter since that then breaks including apps that have upgraded. It doesn't allow the libraries to upgrade and move before the applications, as the library has to wait for all consuming applications to bump to v2, then they can release a version of the library that deps v2.

@jolynch
Author

jolynch commented Jul 8, 2023

To try to make something positive of this conversation, I think it would be really helpful if you could share specifics on how things broke, either publicly or privately, if that’s easier. Something like X number of applications hit errors related to this change, Y applications were broken by this other change, etc.

I don't think we've been centrally tracking this, but from looking at support requests this week I can estimate that ~10 applications were broken, and from dependency reports it seems ~54 apps/libraries depend on pydantic. Most folks are pulling it in through central libraries which depend on pydantic (transitive deps). We've just been introducing <2 pins into consuming applications preemptively to try to stem the tide, since the central libraries can't pin to <2.

I think most breaking issues are coming from more strict validations on input data, which are (maybe?) good.

For example a cli that automates SSHing across a fleet of machines broke in production with a validation error because a particular target application didn't have a field set:

__pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
pydantic_core._pydantic_core.ValidationError: 1 validation error for AppAttributes
monitorBucketType
  Field required [type=missing, input_value={'name': '<redacted>'], 'tags': ['']}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.0.1/v/missing

I believe the issue there is that the field was Optional but didn't declare a proper default due to a typo, dafault instead of default (monitor_bucket_type: Optional[str] = Field(dafault=None, alias="monitorBucketType")). The test cases always provided the field, so the unit tests didn't catch it, i.e. we didn't have full test coverage. The code receiving that object handled None or non-None values, and the record wraps a remote JSON API, so strict validation is probably not wise as it is.

The issue I personally ran into came while trying to make an unrelated change to the service-capacity-modeling library we use for capacity planning: tests failed because previously 436.5 was being cast to 436 and everything was fine.

service_capacity_modeling/hardware/__init__.py:23: in load_hardware
    return Hardware(**hardware)
E   pydantic_core._pydantic_core.ValidationError: 1 validation error for Hardware
E   instances.`i4i.large`.drive.size_gib
E     Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=436.5, input_type=float]
E       For further information visit https://errors.pydantic.dev/2.1.2/v/int_from_float

This just broke builds; the upstream application had already pinned to <2, so there were no production issues there. While working on these fixes I started asking our central Python team "how am I supposed to support both v1 and v2" and that led here because the answer was basically "we're pinning all apps to <2 until we can schedule the time to upgrade and test each of these apps".
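
One way a library could tolerate both behaviours for a case like the 436.5 value is to normalise the data before it reaches the model. The following is a minimal sketch, assuming truncation (what V1 did implicitly, per the 436.5 -> 436 behaviour described above) really is the desired result. truncate_to_int is a hypothetical helper, not a pydantic function, and the dict is illustrative data rather than the library's real input.

```python
import math
from typing import Any


def truncate_to_int(value: Any) -> Any:
    """Drop the fractional part of finite floats; pass other values through."""
    # NaN and infinity are left alone so that validation can still reject them.
    if isinstance(value, float) and math.isfinite(value):
        return int(value)  # int() truncates toward zero: 436.5 -> 436
    return value


# Illustrative data shaped like the failing field in the traceback above.
drive = {"size_gib": 436.5}
drive["size_gib"] = truncate_to_int(drive["size_gib"])
```

Whether silent truncation is acceptable is a judgment call for each field; correcting the source data so the value is already an integer avoids the question entirely.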

@tlambert03
Contributor

how am I supposed to support both v1 and v2

yeah, this is really the tough thing for libs. I've been creating per-package pydantic_compat modules like this one that allow me to support both v1 and v2. But, it's a lot of work, and a different part of the API for each package (so efforts to centralize my "pydantic-compat" strategy have been hard)
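
As an illustration of what such a per-package compat module might contain, here is a minimal duck-typed sketch. It relies on V2 models exposing model_dump() and model_validate() while V1 models expose dict() and parse_obj(). The wrapper names dump_model and validate_model are hypothetical, and a real compat layer would cover whichever parts of the API the package actually uses.

```python
from typing import Any, Dict, Type, TypeVar

T = TypeVar("T")


def dump_model(model: Any) -> Dict[str, Any]:
    """Serialise a model instance to a dict under either major version."""
    if hasattr(model, "model_dump"):  # Pydantic V2 method name
        return model.model_dump()
    return model.dict()  # Pydantic V1 method name


def validate_model(model_cls: Type[T], data: Dict[str, Any]) -> T:
    """Build a validated model instance from a dict under either major version."""
    if hasattr(model_cls, "model_validate"):  # Pydantic V2 classmethod
        return model_cls.model_validate(data)
    return model_cls.parse_obj(data)  # Pydantic V1 classmethod
```

Library code then calls dump_model(obj) and validate_model(SomeModel, data) everywhere, so the version-specific method names live in one place.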

@reivilibre

I don't mean to complain by any means, but reusing the same package name means it is hard to get distributions to package Pydantic V2 because they need to wait for all packaged software to be ready to jump ship to V2 before they can do that.
This in turn means that applications hoping to be packaged in distributions can't really switch to using Pydantic V2 proper as they can't expect it to get packaged in the near future.

The best we seem to be able to do at the moment is use this as transitional code, mentioned earlier in the thread:

try:
    from pydantic import v1 as pydantic
except ImportError:
    import pydantic

so we don't end up being one of the packages blocking the future :).

I wonder if there are any other, slightly nasty, solutions that would be possible here, e.g. releasing a version of Pydantic that is compatible with old-V1 applications but has the V2 stuff available in pydantic.v2, or a repackage of pydantic as pydantic2? I can't think of anything bulletproof though.

@samuelcolvin
Member

I acknowledge this problem; we (the pydantic team) have had it too.

If there's a clever way around this, we're happy to consider it, but I fear this is just how it is.

@joshorr

joshorr commented Sep 19, 2023

As a tip/something-helpful, I've found using poetry (https://python-poetry.org) can help for cases like this, as it uses semantic versioning to figure out which library versions to pull.

It can read the requirements of each library/dependency recursively and figure out exactly what libraries to install for the local project. If there is a conflict, where two versions of the same dependency would need to be installed, it will indicate it up-front by erroring out while updating the project dependencies to new versions.

It's standard practice with poetry to pin below the next major version number. When you do a poetry add pydantic, it will add this to the project's dependency list:

pydantic = "^2"

This tells poetry that it can use any version of pydantic as long as it's at least 2.0.0 but below 3.0.0. Using semantic versioning practices like this with all your dependencies can really help prevent your projects from upgrading to a version of a dependency that might have a breaking change.
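
The caret rule can be made concrete with a small sketch. This is not Poetry's implementation: it handles only plain numeric versions and constraints whose major version is at least 1 (Poetry treats 0.x constraints differently), and parse_version and caret_allows are hypothetical names.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(version: str) -> Version:
    """Parse 'X', 'X.Y' or 'X.Y.Z' into a 3-tuple, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def caret_allows(constraint: str, version: str) -> bool:
    """True if `version` satisfies '^X[.Y[.Z]]', i.e. lower <= version < (X+1).0.0."""
    lower = parse_version(constraint.lstrip("^"))
    if lower[0] == 0:
        raise ValueError("this sketch only handles major versions >= 1")
    upper = (lower[0] + 1, 0, 0)  # exclusive upper bound: the next major release
    return lower <= parse_version(version) < upper
```

For example, caret_allows("^2", "2.11.3") is true, while both "1.10.11" and "3.0.0" fall outside the range.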

Every so often I ask poetry if there is any major version that is available that I have not upgraded to yet via:

poetry show --outdated

Which then shows me exactly what I have not updated to yet (including major versions that could have breaking changes).

I would not necessarily know that pydantic had a new major version if it changed its package name each time. So I appreciate them using the same package name, so I can easily discover when they have new major versions.

If I do find something is outdated by a major-version number change, I can then change the dependency requirement of my local project to use the major version I want to upgrade to. I then test the project as needed and release it after everything is working. If there is a conflict between one of my other dependencies and this new major version, poetry would error out up-front and tell me.
