Migrate to Pydantic V2 #5304

Closed
anakin87 opened this issue Jul 10, 2023 · 9 comments
Labels
1.x · P3 (Low priority, leave it in the backlog) · wontfix (This will not be worked on)

Comments

@anakin87
Member

Pydantic V2 was recently released.

The new version was not compatible with Haystack:
here you can see the errors that led to pinning pydantic<2 and releasing Haystack 1.18.1.

We should migrate to V2 for several reasons, including:

  • Performance (Pydantic V2 is 5-50x faster than Pydantic V1)
  • Safety & maintainability

Let's see how painful it is 😃 and what changes the migration requires...

@anakin87 self-assigned this Jul 10, 2023
@anakin87 added the 2.x (Related to Haystack v2.0) label Jul 11, 2023
@anakin87
Member Author

The migration turned out to be harder and to take longer than I expected.
After talking with @ZanSara, we decided to migrate to Pydantic V2 as part of Haystack v2.

Learnings

  • ❗ In V2, Pydantic dataclasses generate their own constructor and don't support a custom __init__ method.
    (Typing error using hierarchical custom __init__ pydantic/pydantic#5298)
    Currently, we use Pydantic dataclasses with a custom __init__ in Document and Label.
  • Migrating the data model (schema.py) was nevertheless feasible.
  • The JSON schema generation and validation proved hard to migrate, as we use several Pydantic abstractions that have been removed or renamed, along with several Python internals.
  • The official bump-pydantic tool had almost no effect on our codebase.
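
One common way around the custom __init__ limitation is to keep the generated constructor and move the extra logic into __post_init__. The sketch below is only an illustration under assumptions: SketchDocument and its fields are hypothetical, simplified stand-ins rather than Haystack's actual Document, the SHA-256 id is a made-up default, and standard-library dataclasses are used so the snippet runs without Pydantic installed. Pydantic dataclasses also support __post_init__, but whether this pattern covers everything Document and Label need is untested here.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchDocument:
    # Hypothetical, simplified stand-in; not Haystack's actual Document.
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)
    id: Optional[str] = None

    def __post_init__(self) -> None:
        # Logic that would otherwise sit in a custom __init__ moves here,
        # so the generated constructor is left untouched.
        if self.id is None:
            # Assumed default for illustration: derive the id from the content.
            self.id = hashlib.sha256(self.content.encode("utf-8")).hexdigest()


doc = SketchDocument(content="hello")
print(doc.id)
```

An explicitly passed id is kept as-is; only a missing id is derived from the content.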

malte-aws added a commit to malte-aws/semantic-search-aws-docs that referenced this issue Jul 25, 2023
@masci added the P2 (Medium priority, add to the next sprint if no P1 available) label Jul 31, 2023
@mjspeck
Contributor

mjspeck commented Sep 28, 2023

When is Haystack V2 scheduled for?

@anakin87
Member Author

Hey @mjspeck, thanks for your interest...

We do not yet have an official date for Haystack 2.0,
but we are sharing all the progress in this discussion and also on Discord.
Feel free to participate!

@mjspeck
Contributor

mjspeck commented Sep 28, 2023

I appreciate that, but since 2.0 seems far away, I'm hoping that migration to pydantic 2.0 could happen before it. I know you said that bump-pydantic didn't help, but did you try updating pydantic and using the v1 submodule in your exploration? You should be able to stay with the v1 API by just adding v1 to all your imports (e.g. from pydantic.v1 import BaseModel). That would allow users to add haystack to environments that require v2 (which is the case for a project I'm working on), while requiring minimal code changes on your end.

@anakin87
Member Author

anakin87 commented Sep 28, 2023

When I tried, the from pydantic.v1 import option was apparently not available.

It would be nice to test it, but I don't know when that will be feasible for us.

If you want to give it a try and open a PR if it works, you are welcome.
If it works, the best option is probably to have a conditional import, so as not to break things for those still using Pydantic 1.
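
A minimal sketch of such a conditional import, assuming the V1 API lives under pydantic.v1 when Pydantic V2 is installed and at the top level on older Pydantic 1 releases. The helper name load_pydantic_v1_api is hypothetical and this is not Haystack code; it returns None when Pydantic is not installed so the snippet runs anywhere.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_pydantic_v1_api() -> Optional[ModuleType]:
    """Return a module exposing the Pydantic V1 API, or None if Pydantic is absent."""
    # Try the V1 compatibility namespace first, then fall back to the
    # top-level package, which is the V1 API itself on older Pydantic 1 installs.
    for module_name in ("pydantic.v1", "pydantic"):
        try:
            return importlib.import_module(module_name)
        except ImportError:
            continue
    return None


pydantic_v1 = load_pydantic_v1_api()
print("V1 API available:", pydantic_v1 is not None)
```

Models would then subclass pydantic_v1.BaseModel and keep using the V1 API regardless of which major version is installed.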

Related discussion: fastapi/fastapi#9966

@anakin87
Member Author

@silvanocerza please share your opinion on this...

@silvanocerza
Contributor

I believe this is not a top priority as of now. We're focusing on releasing Haystack 2.x by the end of the year and we don't have enough resources to update 1.x to Pydantic v2.

It's not an easy task, as it will require updating tons of libraries, and that can come with a new set of bugs and problems that we'd have to fix.

I would very much prefer that we stick with Pydantic v1 to avoid unnecessary trouble now.

@masci removed the 2.x (Related to Haystack v2.0) label Nov 27, 2023
@masci added the 1.x and P3 (Low priority, leave it in the backlog) labels and removed the P2 (Medium priority, add to the next sprint if no P1 available) label Dec 18, 2023
@masci added the wontfix (This will not be worked on) label Feb 23, 2024
@masci closed this as completed Feb 23, 2024
@dokterbob

dokterbob commented Sep 3, 2024

@silvanocerza What's the ETA for addressing this? We (Chainlit) are hitting dependency resolution conflicts while trying to patch security issues, and we might have to consider dropping or reducing haystack support because our other deps require v2.

Specifically, modern langchain requires pydantic 2. As there are security issues, we would like to resolve them but cannot due to haystack's <2 requirement.

@filippetrovic

I'm very interested in this as well. For now, we had to give up using another dependency.

6 participants