Sharing pyclasses between multiple Rust packages #1444

Open
davidhewitt opened this issue Feb 24, 2021 · 17 comments

@davidhewitt
Member

I've seen several questions about re-using #[pyclass] types between multiple Rust packages, for example on Gitter, Stack Overflow and most recently in #1418.

This issue is intended to be a central place to discuss the difficulties involved and to brainstorm solutions / API designs which we can add to PyO3.

Below is my write-up of the issue. Note that I haven't tried any of this myself; this is just where I think the difficulties are. I could be totally wrong - please use this thread to share evidence, ask questions, and test ideas.


Let's call package A the "original" package, which defines a #[pyclass] MyClass. Package B is a child crate which writes use a::MyClass in its Rust code to re-use the pyclass.

The key issue is that #[pyclass] stores the pyclass type object in static storage. This means that (if Rust's usual rlib linkage is used) packages A and B will have their own copies of the MyClass type object, and Python will think that they're actually different types coming from the two packages.
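
To make the duplication concrete, here is a minimal pure-Rust model with no PyO3 involved. The two modules stand in for two separately linked extension modules, each carrying its own copy of the static. `TypeObject`, `ext_a`, `ext_b` and the helper functions are hypothetical names for illustration, not anything PyO3 defines, and the sketch assumes (as Python does) that two types are the same only when their type objects are the same object in memory.

```rust
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a Python type object.
pub struct TypeObject {
    pub name: &'static str,
    /// Per-copy state: how many instances were created through this copy.
    pub instances: AtomicUsize,
}

/// Stand-in for extension module A, which holds its own static copy.
pub mod ext_a {
    use super::TypeObject;
    use std::sync::atomic::AtomicUsize;

    pub static MY_CLASS: TypeObject = TypeObject {
        name: "MyClass",
        instances: AtomicUsize::new(0),
    };
}

/// Stand-in for extension module B, which ends up with a second copy.
pub mod ext_b {
    use super::TypeObject;
    use std::sync::atomic::AtomicUsize;

    pub static MY_CLASS: TypeObject = TypeObject {
        name: "MyClass",
        instances: AtomicUsize::new(0),
    };
}

/// Record a new instance on one copy and return that copy's running total.
pub fn new_instance(ty: &TypeObject) -> usize {
    ty.instances.fetch_add(1, Ordering::SeqCst) + 1
}

/// Read the instance count stored on one copy.
pub fn instance_count(ty: &TypeObject) -> usize {
    ty.instances.load(Ordering::SeqCst)
}

/// Identity check: the same type means the same address, not the same name.
pub fn is_same_type(a: &TypeObject, b: &TypeObject) -> bool {
    ptr::eq(a, b)
}
```

In this model, calling `new_instance(&ext_a::MY_CLASS)` leaves `ext_b::MY_CLASS` untouched, and `is_same_type` is false across the two modules even though both copies are named "MyClass".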

I think the core of the solution to this is to make B link against package A's C shared library - the same .so file which Python uses to load package A. Then B should be able to re-use the pyclass from A.

This comes with a few steps:

  • A will need to export additional symbols which B can use. This probably includes at a minimum the MyClass static type object, and probably functions for converting PyAny <-> MyClass.
  • B will need to use A as a "C" dependency which it links to using an extern "C" block, where it imports symbols from A.
  • I'm unsure if B will be able to use a::MyClass; at all - it may need to have a duplicate definition which uses the exported symbols internally.

Looking at that, it might mean that we'd eventually have pyclass_export! and pyclass_import! macros in PyO3 to help with this.

The above is all a poorly-worded brain dump of what might be a solution. I'm really not sure - anyone who is trying to do this, please comment, ask questions, provide error logs, and we can figure this out together.
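
The shape of the export/import steps listed above can be modelled in plain Rust. This is only a sketch under assumptions: `a_myclass_type_object` is a hypothetical exported accessor (the `pyclass_export!` / `pyclass_import!` macros mentioned above do not exist), and in a real build the function would also have to be exported unmangled from A's shared library and declared in an extern "C" block in B. Here both sides live in one file so the example stays self-contained.

```rust
use std::ptr;

/// Hypothetical stand-in for a Python type object.
#[repr(C)]
pub struct TypeObject {
    pub type_id: u32,
}

/// "Package A": owns the one and only static type object.
static MY_CLASS_TYPE: TypeObject = TypeObject { type_id: 1 };

/// "Package A": accessor with a C ABI that hands out a pointer to that static.
/// Assumption: a real export would also need an unmangled symbol name.
pub extern "C" fn a_myclass_type_object() -> *const TypeObject {
    &MY_CLASS_TYPE
}

/// "Package B": has no static of its own and always asks A for the pointer.
pub fn b_myclass_type_object() -> *const TypeObject {
    a_myclass_type_object()
}

/// "Package B": accepts an object only if its type is A's type object.
pub fn b_accepts(object_type: *const TypeObject) -> bool {
    ptr::eq(object_type, b_myclass_type_object())
}
```

Because B never defines its own static, `b_accepts(a_myclass_type_object())` holds, which is the property the duplicated-static setup loses.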

@milesgranger
Contributor

milesgranger commented Apr 2, 2021

After getting great help from you in #1535 I've hit this issue. The additional feature I want to add would still make use of multiple objects defined in the parent package. But alas, as you say, Python thinks they aren't the same object and fails to convert the object into an object it actually already is. 😅

@davidhewitt
Member Author

Yeah; unfortunately, unlike #1535, this one's hard and I haven't thought about how it might work beyond the words above. I'm also not planning to design a solution any time soon, as there are a number of big pieces of work in the PyO3 core which I think are higher priority.

If this is something you need sooner rather than later, I suggest having a go at this yourself based on my sketch above and asking questions liberally on this issue for anything which you need help understanding. It should be possible to cobble something together for this without needing to change the current PyO3 crate itself. No guarantees it won't be painful though!

@milesgranger
Contributor

Thanks, it is helpful all the same. I just might take you up on that offer to cobble something together. Albeit, I've started paternity leave so won't have anything up for some time most likely. We'll see how things go. Thanks again. 👍

@davidhewitt
Member Author

Albeit, I've started paternity leave so won't have anything up for some time most likely. We'll see how things go.

No rush & congratulations!

@milesgranger
Contributor

While dabbling with this earlier today, I hit #1193; I was thinking that putting pyclasses into a capsule could be a good avenue for sharing structs/functions between crates. Would that be a reasonable approach once that issue is solved?

@davidhewitt
Member Author

Possibly - why were you thinking of using capsules rather than using Python APIs to import the Python objects directly?

Or were you thinking of putting function pointers to supporting Rust functions in the capsules? I believe numpy might do something like that.

@milesgranger
Contributor

I was indeed taking inspiration from pyo3 numpy and this description about using capsules seemed to reinforce my plan.

... using Python APIs to import the Python objects directly?

Is there any benefit/loss from doing it this way vs capsules? Naively, it seemed like there would be some overhead going through Python to get an object/pyclass vs the capsule workflow.

@davidhewitt
Member Author

Yeah, especially if you're just exporting Rust functions which you don't want to wrap in Python at all, then capsules are very much a suitable tool.
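
As a rough illustration of the capsule idea, the sketch below models a capsule as an opaque pointer to a #[repr(C)] table of function pointers. It is pure Rust and makes no PyO3 or Python calls; `CoreApi`, `export_capsule` and `import_capsule` are hypothetical names, and in a real extension the pointer would be wrapped in a Python capsule object rather than passed around directly.

```rust
use std::os::raw::c_void;

/// Hypothetical table of Rust functions that the parent module shares.
#[repr(C)]
pub struct CoreApi {
    /// Lets an importer check that it was built against a compatible layout.
    pub version: u32,
    pub add: extern "C" fn(i64, i64) -> i64,
    pub negate: extern "C" fn(i64) -> i64,
}

extern "C" fn core_add(a: i64, b: i64) -> i64 {
    a + b
}

extern "C" fn core_negate(a: i64) -> i64 {
    -a
}

/// The single table, owned by the exporting module.
static CORE_API: CoreApi = CoreApi {
    version: 1,
    add: core_add,
    negate: core_negate,
};

/// Exporter side: hand out the table as an opaque pointer.
pub fn export_capsule() -> *const c_void {
    &CORE_API as *const CoreApi as *const c_void
}

/// Importer side: cast the opaque pointer back and check the version.
///
/// Safety: `raw` must be null or a pointer obtained from `export_capsule`.
pub unsafe fn import_capsule(raw: *const c_void) -> Option<&'static CoreApi> {
    let api = unsafe { (raw as *const CoreApi).as_ref() }?;
    if api.version == 1 {
        Some(api)
    } else {
        None
    }
}
```

An importer would then call through the table, for example `(api.add)(2, 3)`, without either function ever being wrapped as a Python callable.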

@gabrik

gabrik commented Nov 3, 2021

Hi everyone,

We have a similar issue. We have a crate A that defines some Rust types (used from a Rust API) and we would like to have those types in Python as well, so we created a crate B that depends on A and wraps the types.
But now we are in trouble, because we have a crate C that depends on both A and B and converts the types from Rust runtime values into Python runtime values.

So far we are able to build C against A and B, but the types from B are not visible from the code in C, as if the dependency were not there.

My take from the previous posts is that this is still not possible; am I correct?


Never mind, I solved this by introducing an indirection between crates A and B: a new crate in which I just define the types but do not generate any Python module. Then C and B depend on this new crate and I'm able to share the definition of the types.

@gabrik

gabrik commented Nov 8, 2021

Another update: I still see some issues when the data comes back from Python to Rust; specifically, extract(py) does not work.
It says that the type I'm receiving from Python is not the same as the one we have in Rust:

TypeError: 'Outputs' object cannot be converted to 'Outputs'

This happens even though the types are defined in crate C, with A generating the Python library and B calling the Python code that uses A.

@davidhewitt
Member Author

Unfortunately that TypeError is expected if you re-use a #[pyclass] in multiple extension modules naively. This is because Rust statically links all code into the final compiled libs. This means the #[pyclass] then creates a static type object in each of the two libs, and the two type objects don't compare equal, which leads to odd-looking errors like 'Outputs' object cannot be converted to 'Outputs'.

I think the solution is still the one in my OP - don't statically link the shared dependency into the final extension-modules and instead import symbols from it as a dynamic library. This way there should only be one static type object ever created (in the dynamic library) which is then loaded everywhere. But as in the OP, this is hard, and neither Cargo nor PyO3 really think about this case yet.
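
For anyone puzzled by the wording of that error, the following is a small pure-Rust model of the check that fails. It is not PyO3's implementation: `TypeObject`, `Object`, `extract` and the two statics are hypothetical stand-ins, and the only assumption carried over is that the conversion compares type objects by identity while the message prints only their names.

```rust
use std::ptr;

/// Hypothetical stand-in for a Python type object.
pub struct TypeObject {
    pub name: &'static str,
    /// Which compiled library this copy was linked into.
    pub library: &'static str,
}

/// Hypothetical stand-in for a Python object, which points at its type.
pub struct Object {
    pub ty: &'static TypeObject,
}

/// The copy of the type object that ended up in the first library.
pub static OUTPUTS_IN_LIB_A: TypeObject = TypeObject {
    name: "Outputs",
    library: "lib_a",
};

/// The copy of the "same" type object that ended up in the second library.
pub static OUTPUTS_IN_LIB_B: TypeObject = TypeObject {
    name: "Outputs",
    library: "lib_b",
};

/// Succeeds only when the object's type is exactly the expected type object.
pub fn extract<'a>(
    obj: &'a Object,
    expected: &'static TypeObject,
) -> Result<&'a Object, String> {
    if ptr::eq(obj.ty, expected) {
        Ok(obj)
    } else {
        // Only the names are printed, so the two sides look identical.
        Err(format!(
            "TypeError: '{}' object cannot be converted to '{}'",
            obj.ty.name, expected.name
        ))
    }
}
```

An object created against `OUTPUTS_IN_LIB_A` extracts fine when the caller expects `OUTPUTS_IN_LIB_A`, and fails with the confusing message when the caller expects `OUTPUTS_IN_LIB_B`. With a single shared type object, only the first case can occur.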

@gabrik

gabrik commented Nov 9, 2021

That was exactly what I was thinking, make crate C a dylib kind of crate and try to use dynamic linking in both the dependent libraries, but as you said this is the hard part.
In any case, thank you for the explanation

@jychen7

jychen7 commented Apr 5, 2022

ah, I think I hit this issue in datafusion-contrib/datafusion-python#45 (comment) as well

@chetmurthy

The upshot of this discussion seems to be that if one has a situation where one defines some Rust structs in module A, which are also exported to Python, and wants to write more Rust code that takes as arguments those structs from Python, then one had better put that Rust code into module A, and not into a module B.

Would that be accurate? I'm not complaining -- just want to verify what the state of play is. I'm new to all this, so .... still learning.

@davidhewitt
Member Author

For now, building everything as one big module A is much much easier than having split modules A and B, yes.

@denehoffman

Hi, just pinging this, I started using PyO3 recently with my Rust project, and so far it's fine, but I'm worried I'll run into this issue when I implement the whole library structure I'm going for. Basically I have a #[pyclass] struct "A" which is used extensively in my core module. I wanted to be able to support other modules which use both my core module and pyo3 to allow people to make their own libraries (let's call one "other") that integrate with mine in python as well as rust (rust integration is a given, of course). If all of the interactions with "A" happen in "core", and I create a method that returns an "A" in "other", then I'm assuming based on the discussion I've read that the "A" in "other" and the "A" in "core" will be different in Python, and thus all my core methods won't work on the "other" A? Am I correct here, and is there a plan for this issue or is it kind of on the backburner for now?

@BaxHugh

BaxHugh commented May 15, 2024

I've also run into this issue.
For some of our workflow, we avoid this for some structs / classes by having a native Python type which we convert to a Rust type via serialisation / deserialisation.
To make this 'easy', my team uses protobuf to define our shared Python/Rust data structs. We use prost / prost-build (or tonic-build) to compile the protobuf to Rust, and protoc to compile it to Python. We then call the protobuf serde methods on both sides.

For reference, here's a minimal repro demo.
pyo3-shared-pyclasses.zip
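
The serialisation workaround described above can be sketched without any dependencies. The hand-rolled byte format below is only a stand-in for protobuf (it is not what prost generates), and `Outputs`, `encode` and `decode` are hypothetical names. The point is that only plain bytes cross the boundary, so neither side needs to agree on a shared type object.

```rust
/// Hypothetical plain-data type that each side defines for itself.
#[derive(Debug, Clone, PartialEq)]
pub struct Outputs {
    pub id: u32,
    pub label: String,
}

/// Encode as: id (4 bytes LE), label length (4 bytes LE), label bytes (UTF-8).
pub fn encode(value: &Outputs) -> Vec<u8> {
    let mut buf = Vec::with_capacity(8 + value.label.len());
    buf.extend_from_slice(&value.id.to_le_bytes());
    buf.extend_from_slice(&(value.label.len() as u32).to_le_bytes());
    buf.extend_from_slice(value.label.as_bytes());
    buf
}

/// Decode the format written by `encode`; returns None on malformed input.
pub fn decode(bytes: &[u8]) -> Option<Outputs> {
    if bytes.len() < 8 {
        return None;
    }
    let id = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let len = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]) as usize;
    let body = bytes.get(8..)?;
    if body.len() != len {
        return None;
    }
    let label = String::from_utf8(body.to_vec()).ok()?;
    Some(Outputs { id, label })
}
```

The trade-off is a copy and a parse on every crossing, in exchange for each module being free to own its own class definitions.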


7 participants