Feature Request: 3rd party non-JSON serialization/deserialization #79
Comments
Thanks for the question. I think this is just calling `validate_python`; no changes should be required to pydantic-core to allow this. I want to add support for line numbers in errors, but that requires a new Rust JSON parser, so it won't be added until v2.1 or later.
Closing, but feel free to ask if you have more questions.
To be clear, I would love this to be possible, but I don't want to have to add the capability to parse more formats to pydantic-core, so the only way this would be possible would be to achieve runtime linking of pydantic-core and the third-party libraries that perform the parsing. This is the only way I can think of that it might work. (Note 1: there's probably a better way to do this; I'm not an expert at this stuff.) With that out of the way, here's a very rough idea:
With this approach, while we go "via Python", we never have to do the hard work to convert the format inside pydantic-core.
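The "go via Python" approach can be sketched with only the standard library. This is a hypothetical illustration, not pydantic-core's actual API: INI via `configparser` stands in for any non-JSON format, and `validate` stands in for pydantic-core's `validate_python` call.

```python
import configparser

def ini_to_dict(raw):
    """Parse INI text into plain Python dicts (the third-party step)."""
    cp = configparser.ConfigParser()
    cp.read_string(raw)
    return {section: dict(cp[section]) for section in cp.sections()}

def validate(data):
    """Stand-in for pydantic-core's validate_python: real validation
    of already-parsed Python objects would happen here."""
    assert isinstance(data, dict)
    return data

def validate_ini(raw):
    """A third-party entry point a-la validate_json: parse the format
    in Python land, then delegate to the existing Python-object path."""
    return validate(ini_to_dict(raw))

print(validate_ini("[server]\nhost = localhost\nport = 8080\n"))
# {'server': {'host': 'localhost', 'port': '8080'}}
```

The key point is that pydantic-core never learns about the format; it only ever sees Python objects.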
I'm coming here from that discussion and I want to share my ideas on how runtime plugging should work. My idea is to do it somewhat like DLLs: you would have your compiled pydantic-core plug-ins in one pre-agreed location (or registered somehow), and during deserialization you would just say which serializer to use. (The format alone is not enough, since more than one serializer/deserializer can be available for a single format, like simd-json.) I'm also not an expert in this, but I think there should be the following requirements.
Why I'm suggesting decoupling...
If this is too hard, I would just suggest doing it at compile time instead, but that may be too complicated...
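The "registered somehow" part of this idea can be shown in pure Python, without any DLL machinery. This is an illustrative sketch, not a real pydantic-core API: deserializers are registered under an explicit name rather than just a format, since several parsers can exist for one format (e.g. simd-json vs. the stdlib `json` module).

```python
import json

# Registry of deserializers, keyed by an explicit parser name.
_PARSERS = {}

def register_parser(name, loads):
    """Register a callable with a json.loads-like signature."""
    _PARSERS[name] = loads

def parse(name, raw):
    """Look up a parser by name and run it on the raw text."""
    if name not in _PARSERS:
        raise LookupError(f"no parser registered as {name!r}")
    return _PARSERS[name](raw)

register_parser("json-stdlib", json.loads)
print(parse("json-stdlib", '{"x": 1}'))  # {'x': 1}
```

A real implementation could populate such a registry via Python entry points at import time, which avoids the cross-platform shared-library problems discussed below.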
This sounds very hard to do in a reliable cross-platform way, given the problems we already experience (at the scale of pydantic or watchfiles) with very subtle OS differences and wheels. I'm very unwilling to enter into this mess. You're effectively proposing another way to link libraries that side-steps both crates and Python imports; are there any other examples of Python libraries that use DLLs/shared libraries to access functionality in other packages without using the Python runtime? (Perhaps worth remembering that I'll probably be maintaining pydantic for the next decade at least; one "clever" idea proposed now which relies on shaky cross-platform behaviour could cost me hundreds of hours of support over the years, hence my caution.) The real question is how much faster this would be than my proposal above. To proceed with the conversation, we need to benchmark that. Really, someone needs to build both solutions (and any third solutions proposed) and see how they perform. @PrettyWood @tiangolo do you have any feedback on any of this?
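A minimal harness for the benchmark being requested might look like the following. Everything here is a stand-in: `validate` only mimics the shape of a schema-validation step, and the numbers will vary by machine; a real comparison would swap in the actual parse-then-`validate_python` path against any shared-library alternative.

```python
import json
import timeit

# Synthetic payload: 1000 small records, serialized once up front.
PAYLOAD = json.dumps([{"name": f"user{i}", "age": i} for i in range(1000)])

def validate(items):
    # Stand-in for real validation: coerce fields the way a schema would.
    return [{"name": str(d["name"]), "age": int(d["age"])} for d in items]

def via_python():
    # Path A: parse in one library, then validate the Python objects.
    return validate(json.loads(PAYLOAD))

if __name__ == "__main__":
    n = 200
    secs = timeit.timeit(via_python, number=n)
    print(f"{n} runs: {secs * 1e3 / n:.3f} ms/run")
```

Running the same harness over each candidate path gives the per-run cost of the extra "via Python" hop, which is the number this discussion hinges on.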
Well, loading crates may be fine as well... but as I said, I'm not an expert.
Crates would need to be a compile-time dependency, so distributed wheels couldn't be used.
Ah... yeah... I forgot about that, because I would just force you to compile when you install the library. However, if there is little to no performance impact with @samuelcolvin's solution, I would also be fine with it. That said, there are people "needing" simd-json, and in extreme cases performance may degrade.
If you care about "extreme performance", don't use Python, build the whole thing in Rust, Go or C.
Sorry for missing this discussion two weeks ago... I need to check out and play with the current (v0.3.1) version of pydantic-core before I can really give an informed opinion, but from a cursory glance it seems feasible. Regarding a Rust-side implementation, I think it all sounds too messy for a Python-facing library. "Config parsing" use cases don't require cutting-edge performance anyway: you generally parse a single YAML file at the beginning of a script (vs. many JSON API requests per second). And YAML files aren't usually passed between performance-critical applications, since parsing YAML is slower anyway. There are similar considerations with TOML. I guess the most JSON-like thing would be XML derivatives, but I don't have much experience there, and I haven't encountered anyone using Pydantic for XML yet 😉
I agree. The only other thing you might need is line numbers; that's one of the main drivers (for me) of #10. We need to think about how to make this possible without adding complexity or damaging performance.
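As an illustration of why line numbers need parser support: Python's stdlib `json` already records `lineno`/`colno` on `JSONDecodeError`, which is exactly the kind of location data a new Rust JSON parser could attach to validation errors. This only demonstrates the stdlib behaviour, not anything pydantic-core does today.

```python
import json

# A payload with a syntax error on its second line.
bad = '{"a": 1,\n "b": }'

try:
    json.loads(bad)
except json.JSONDecodeError as err:
    # The decoder tracks where parsing failed in the source text.
    print(f"line {err.lineno}, column {err.colno}: {err.msg}")
```

For validation (rather than syntax) errors, the parser would additionally have to record a location for every value it produces, which is the complexity-vs-performance trade-off mentioned above.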
Hi, author of pydantic-yaml here. I have no idea about anything Rust-related, unfortunately, but hopefully this feature request will make sense in Python land.
I'm going off this slide in this presentation by @samuelcolvin, specifically:
Here's a relevant discussion about "3rd party" deserialization from v1: pydantic/pydantic#3025
It would be great if `pydantic-core` were built in a way where non-JSON formats could be added "on top" rather than necessarily being built into the core. I understand performance is a big question in this rewrite, so ideally these would be high-level interfaces that can be hacked in Python (or implemented in Rust/etc. for better performance).

From the examples available already, it's possible that such a feature could be quite simple on the `pydantic-core` side: the 3rd party would create their own function a-la `validate_json`, possibly just calling `validate_python`. However, care would be needed on how format-specific details are sent between `pydantic` and the implementation. In V1 this is done with the `Config` class and special `json_encoder`/`decoder` attributes, which have been a pain to re-implement for YAML properly (without way too much hackery).

Ideally for V2, this would be something more easily addable and configurable. The alternative would be to just implement TOML, YAML etc. directly in the binary (and I wouldn't have to keep supporting my project, ha!)
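The "format-specific details travel with the config" pattern described here can be sketched with stand-ins. `Model` and `Config` below are minimal hypothetical classes, not real pydantic ones; they only mirror the shape of V1's config-driven loader hook.

```python
import json

class Config:
    # The loader is a config attribute, so a third-party package can
    # swap in yaml.safe_load, tomllib.loads, etc. for other formats.
    loads = staticmethod(json.loads)

class Model:
    config = Config

    @classmethod
    def parse_raw(cls, raw):
        # The configured loader decides the wire format; the rest of
        # the validation pipeline only ever sees Python objects.
        return cls.config.loads(raw)

print(Model.parse_raw('{"debug": true}'))  # {'debug': True}
```

The pain point described in the issue is that in V1 a format plugin has to reach into this config machinery and re-implement parts of it, rather than plugging a loader into a stable, documented hook.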
Thanks again for Pydantic!