Parsing JSON directly #10

Closed
samuelcolvin opened this issue Apr 12, 2022 · 5 comments · Fixed by #15

Comments

@samuelcolvin
Member

It would be amazing if we could parse and validate JSON directly, instead of first creating Python objects and then validating them.

The basic idea would be to create traits covering all the conversions used here, then implement those traits for both serde types and pyo3 types.

Validators would then be written against those traits instead of against pyo3 types throughout.
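To make the trait idea concrete, here is a minimal std-only sketch. Everything in it is hypothetical: `JsonValue` and `PyLike` are stand-ins for a serde-parsed value and a pyo3 object, and the `Input` trait, `ValError`, and `validate_int_in_range` are illustrative names, not pydantic-core's actual API.

```rust
/// Hypothetical stand-in for a value parsed from JSON (e.g. via serde).
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue {
    Null,
    Int(i64),
    Str(String),
}

/// Hypothetical stand-in for a Python object reached through pyo3.
#[derive(Debug, Clone, PartialEq)]
pub enum PyLike {
    None,
    Int(i64),
    Str(String),
}

/// Illustrative error type; not pydantic-core's real error enum.
#[derive(Debug, PartialEq)]
pub enum ValError {
    IntType,
    StrType,
    OutOfRange,
}

/// The conversions a validator needs, independent of where the data came from.
pub trait Input {
    fn strict_int(&self) -> Result<i64, ValError>;
    fn strict_str(&self) -> Result<&str, ValError>;
}

impl Input for JsonValue {
    fn strict_int(&self) -> Result<i64, ValError> {
        match self {
            JsonValue::Int(i) => Ok(*i),
            _ => Err(ValError::IntType),
        }
    }

    fn strict_str(&self) -> Result<&str, ValError> {
        match self {
            JsonValue::Str(s) => Ok(s.as_str()),
            _ => Err(ValError::StrType),
        }
    }
}

impl Input for PyLike {
    fn strict_int(&self) -> Result<i64, ValError> {
        match self {
            PyLike::Int(i) => Ok(*i),
            _ => Err(ValError::IntType),
        }
    }

    fn strict_str(&self) -> Result<&str, ValError> {
        match self {
            PyLike::Str(s) => Ok(s.as_str()),
            _ => Err(ValError::StrType),
        }
    }
}

/// A validator written once against the trait; it never names a concrete input type.
pub fn validate_int_in_range<I: Input>(input: &I, min: i64, max: i64) -> Result<i64, ValError> {
    let value = input.strict_int()?;
    if value < min || value > max {
        return Err(ValError::OutOfRange);
    }
    Ok(value)
}
```

With this shape, `validate_int_in_range(&JsonValue::Int(5), 0, 10)` and `validate_int_in_range(&PyLike::Int(5), 0, 10)` go through the same validator code, which is the property the proposal is after.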

If we did this, it would also open the door to using pydantic-core without Python 👀, e.g. in an entirely theoretical "Tydantic" TypeScript package.

@griels

griels commented May 11, 2022

I am using Pydantic quite heavily in tandem with our own CBOR parsing code. It would be great if we could plug in an arbitrary serde codec (such as ciborium) rather than just serde-json. But I think using serde-json would be a great start.

@griels

griels commented May 11, 2022

See also the work in orjson which does a partial decode/encode of pydantic structures (but misses many of the fancier features).

As a total Rust noob, I made a POC fork of orjson with CBOR support: https://github.com/griels/orjson/tree/cbor-support-mk2 so it's quite simple to work in different codecs.

@samuelcolvin
Member Author

The options are:

  • Parse to Python, then pass the result to validate_python, which is fine but slow.
  • Add more formats to pydantic-core. This is fairly easy, but it would increase the binary size and add a maintenance burden, so a format would need to be fairly widely used to be added; only YAML and TOML could be considered, I think.
  • We might be able to find some way to pass a JsonInput from Rust to Python, then back from Python to Rust, in a safe way, and thereby allow a library which pydantic-core doesn't know about to create the JsonValue which pydantic-core consumes, thereby getting most of the performance gain.

But I don't know if that last option is possible. It would probably be "unsafe": e.g. if you passed a pointer to something other than a JsonValue, you'd get a segfault. It's probably something to think about in the future.
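To illustrate the failure mode in the last option, here is a small std-only sketch. It assumes everything lives in a single Rust binary; `OpaqueHandle` and `consume` are hypothetical names, and `JsonValue` is a toy stand-in. The handle hides its contents (as a value round-tripped through Python would), and the consumer uses a checked downcast so that a wrong type produces an error instead of reading arbitrary memory.

```rust
use std::any::Any;

/// Toy stand-in for the parsed value a validator would consume.
#[derive(Debug, PartialEq)]
pub enum JsonValue {
    Null,
    Int(i64),
    Str(String),
}

/// Hypothetical opaque handle: the holder cannot see what is inside.
pub struct OpaqueHandle(Box<dyn Any>);

impl OpaqueHandle {
    pub fn new<T: Any>(value: T) -> Self {
        OpaqueHandle(Box::new(value))
    }
}

/// Checked recovery of the value: a mismatched type is reported as an error
/// rather than being reinterpreted, which is what a raw pointer cast would do.
pub fn consume(handle: &OpaqueHandle) -> Result<&JsonValue, String> {
    handle
        .0
        .downcast_ref::<JsonValue>()
        .ok_or_else(|| "handle does not contain a JsonValue".to_string())
}
```

This check only works because both sides are compiled together. Across separately compiled extension modules, Rust guarantees neither a stable type identity nor a stable layout, so the same check cannot be relied on there, which is the difficulty described above.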

@samuelcolvin
Member Author

If you have a well-developed model for "plugins", please explain it. I can't see how that would work, but maybe I'm missing something.

@griels

griels commented May 11, 2022

Thanks, I'll have to think about it. It seems there isn't an obvious ABI that Rust code can offer for 'plugins', short of wrapping everything in a C FFI (maybe something like the JsonInput you mention), so I guess serde codecs might otherwise have to be injected at the source code level. Perhaps someone has already tackled this. Apologies for necroposting, by the way! 'Plugging' ciborium straight into Pydantic has been something I've wanted to do for some time, so I got excited.
