Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Confusing APIs around PyObject #356

Closed
vadimcn opened this issue Feb 15, 2019 · 27 comments · Fixed by #419
Closed

Confusing APIs around PyObject #356

vadimcn opened this issue Feb 15, 2019 · 27 comments · Fixed by #419

Comments

@vadimcn
Copy link

vadimcn commented Feb 15, 2019

I started learning PyO3 a couple of days ago, and for the longest time I was confused by the fact that PyObject represents a "free-floating" reference to a Python object, whereas the test of Py<thing>'s represent concrete Python types that are only accessible with 'py lifetime.

It might be too late now to change all that, but... it would have been much easier to grok PyO3's API structure if PyObject and PyObjectRef meanings were swapped. Furthermore, I would suggest moving PyObject's convenience methods to Py and making PyObjectRef just an alias for Py<PyObject>.

At the very least, all this should have been explained somewhere close to the top of the user guide.

@kngwyu
Copy link
Member

kngwyu commented Feb 18, 2019

Thanks for reporting!
I think you misunderstand Py.
Please see the guide about it.

However, I completely agree with things around PyObject, PyObjectRef and so on are really confusing(in addition, now we have PyRef types!).
So anyway I think we should simplify our APIs.
And it sounds good to make PyObjectRef just an alias.

@kngwyu kngwyu changed the title Inconsistent API Confusing APIs around PyObject Feb 18, 2019
@kngwyu kngwyu changed the title Confusing APIs around PyObject Confusing APIs around PyObject Feb 18, 2019
@kngwyu
Copy link
Member

kngwyu commented Feb 18, 2019

I changed your issue title but please feel free to change it again if you dislike it.

@konstin
Copy link
Member

konstin commented Feb 18, 2019

I agree that PyObject is very confusing. Unfortunately it's difficult to at least rename PyObject to something more reasonable because there's ffi::PyObject (which make the situation even more confusing).

Afaik we could replace PyObject with Py<PyObjectRef> and maybe add alias like PyStored for convenience, but that would be a huge refactoring.

@konstin
Copy link
Member

konstin commented Feb 18, 2019

I think we can avoid much of the confusion without changing the api by explaining the types properly in the guide.

@vadimcn
Copy link
Author

vadimcn commented Feb 19, 2019

Well, the crate is still pre-1.0, but I agree, renaming PyObject now would break everybody's code.
Still, I would suggest to at least consider renaming PyObjectRef to something like PyBaseObject.

Regarding PyRef: the distinction between built-in and user-defined Python objects seems like another wart in crate API. Why not put native objects behind PyRef as well? Also, calling it PyBox would be more intuitive, IMO.

@jothan
Copy link

jothan commented Feb 20, 2019

With the latest 0.6.0-alpha.4, I'm having trouble with storing references to other Python instances from within a pyclass. What is the proper type to use inside structs now ? With the current situation, I'm having trouble upgrading &self or &mut self to Py<Self>. Does the new API account for this ?

Here is the code that broke.

@kngwyu
Copy link
Member

kngwyu commented Feb 21, 2019

@jothan
Thanks for your feedback.
It looks we need to provide some more APIs.
Please wait for a few days...

@jothan
Copy link

jothan commented Feb 22, 2019

Going forward, all types used to reference Python objects should probably be reduced or documented thoroughly.

From memory I can name Py, PyRef, PyRefMut, PyObjectRef and PyObject. This is too much for me. Working with PyO3, I found that I spent most of my time figuring out how to convert object references.

Once this is sorted out, I would like to update the docs with a matrix stating:

  • Each type and its precise purpose.
  • How each type behaves with regards to the GIL and lifetimes.
  • How #[pyclass] structs fit into all this.
  • How to convert a type into all the others when applicable.

@pganssle
Copy link
Member

I can't agree with this more strongly, I'd love to see a re-working of the API to be more clear about this and I'd be happy to chime in.

See for example this StackOverflow question, which is characteristic of my experience with PyO3. From the Rust side, I think a lot of this is clearer, but from the Python side there seem to be many, many ways to return values to Python, all of which seem to work from the Python side, but where I am not clear on the differences.

@pganssle
Copy link
Member

Also, as I noted in this comment, but I guess never created an issue for, the RAII approach for GILGuard is very misleading, because of the way the GIL works.

It's probably worth taking a look at that in a separate issue, but essentially the lifetime of GILGuard is not the lifetime of the GIL, which can and probably is released and re-acquired many times over the course of a given Python operation.

@konstin
Copy link
Member

konstin commented Feb 23, 2019

I agree that the api of pyo3 is overly complex and poorly documented. That being said I'd be very happy about pull request improving the api, the guide and/or the docstrings. E.g. having a matrix explaining all the different types and their usage would be great. The wasm-bindgen book should also give some inspiration.

Something that's related to the wrapper types are the conversions between them. Most of pyo3's current conversions traits ([In]ToPyObject, [In]ToPyPointer, FromPyObject) could be replaced by a generic FromPy<T>/IntoPy<T> pair that works like std's From<T>/Into<T>. I've added IntoPy in an earlier pull request and will add FromPy in #367, so if anyone interested it should be possible to remove one of those traits with medium effort.

@vadimcn
Copy link
Author

vadimcn commented Feb 23, 2019

That being said I'd be very happy about pull request improving the api, the guide and/or the docstrings.

I'm sorry, but conceptual API design is the job of crate owners. If you allow pull requests from random strangers to decide that for you, nothing good will come out of your project.


If you want my opinion, I would suggest deciding between:

  1. Representing "Python object with GIL held" as some wrapper type, say PyGIL<T>, and exposing all object methods of all types (including Python built-ins) on that wrapper.
  2. Somehow ensuring that user-defined types cannot be constructed outside your control, if which case you can make sure no one can obtain &UserDefinedClass without locking the GIL. Perhaps this could be done by injecting a non-user-constructible marker field as a part of #[pyclass] expansion.

Then stick to that approach. Using both at the same time is unpalatable.

@pganssle
Copy link
Member

I'm sorry, but conceptual API design is the job of crate owners. If you allow pull requests from random strangers to decide that for you, nothing good will come out of your project.

Obviously it's the crate owner's job to make decisions about the final outcome, but I think it's a good idea to encourage contributors to make their opinions known, and as specifically as possible, and to work to implement their ideas.

I very frequently will make pull requests that I have no intention of getting merged, just as a proof of concept of what my proposed API would look like in situ.

Personally, if I understood the current API well enough to change it, I'd be happy to make a PR to improve the API.

@vadimcn
Copy link
Author

vadimcn commented Feb 23, 2019

@pganssle: Even if implementation is offloaded to someone else, project owners need to make design decisions first. Just sending a sweeping change PR without prior buy-in from the owners is almost certain to be waste of work. And if they just accept it... well, I personally would have trust issues with such a project.

@kngwyu
Copy link
Member

kngwyu commented Feb 24, 2019

I agree with we need more documents, but let's enjoy refactoring so far 👍
I also agree that design decision is important, but I feel we're still in the try-and-error phase.

@konstin
Copy link
Member

konstin commented Feb 26, 2019

I think that the design of the wrapper types in general is good. They are the minimal set of types to ergonomically represent all cases and they have no soundness holes as far as I can tell. I think the big problem is that we need more documentation and more examples. The wasm-bindgen book is an example for top-notch documentation, especially with it's supported types chapter.

There's lots of functionality that people don't know about because it's not in the guide and it's not in the examples. E.g. the py: Python in #[pyfunction]s as first argument is optional and those function can return everything that's IntoPyObject, not only PyObjects, and in recent versions you don't even need to wrap it in a PyResult. There's already a lot of this in the tests, but of course we can't expect people to read pyo3's tests to learn the api. So I'd of course be very happy about pull requests for the guide or the documentation in general ✨

There's also many areas that needs to be refactored and renamed, especially the aforementioned conversion traits. But I think it would be most effective to document the existing functionality first. Trying to explain something also often helps to understand what's good or necessary and what needs to be changed.


@pganssle: Even if implementation is offloaded to someone else, project owners need to make design decisions first. Just sending a sweeping change PR without prior buy-in from the owners is almost certain to be waste of work. And if they just accept it... well, I personally would have trust issues with such a project.

If you have doubts whether the PR will be accepted, simply open an issue describing the changes you plan first.


If you want my opinion, I would suggest deciding between:

  1. Representing "Python object with GIL held" as some wrapper type, say PyGIL<T>, and exposing all object methods of all types (including Python built-ins) on that wrapper.

  2. Somehow ensuring that user-defined types cannot be constructed outside your control, if which case you can make sure no one can obtain &UserDefinedClass without locking the GIL. Perhaps this could be done by injecting a non-user-constructible marker field as a part of #[pyclass] expansion.

Then stick to that approach. Using both at the same time is unpalatable.

In earlier versions of pyo3, a PyToken existed which should guarantee that the struct was always allocated on the python heap. Unfortunately, this type was unsound (#94) and I'm not sure if it's possible to make a sound of it. The bigger problem with approach 2 is that it forces a divide between pyo3-types and non-pyo3-types, i.e. you could hardly write a library which optionally exposes structs as python types while allowing to use them a normal rust types if you're not integrating with python.

With the Deref's it should be possible to use pyo3 as in approach 1, where PyRef is the PyGIL you mentioned. I think that it's nice to have the simpler &'p Tapi for the native types.

@kngwyu
Copy link
Member

kngwyu commented Feb 27, 2019

I think to replace PyObject with Py<PyObjectRef> is a good idea, but I don't like the name PyObjectRef.
I can understand that it means 'this type should be used like &'gil PyObjectRef', but feel it still confusing.
Does anyone have a good idea of renaming?

@konstin
Copy link
Member

konstin commented Feb 28, 2019

What about PyAny?

@kngwyu
Copy link
Member

kngwyu commented Feb 28, 2019

@konstin
I'm a mypy user so I like it 👍
But I'm not sure other python users are also familiar with the name Any...

@konstin
Copy link
Member

konstin commented Feb 28, 2019

Even though it's not the perfect name (as PyObject is already taken), a python user should still be able to guess what the type does from the name. (Python object of any type -> some subclass of object)

@birkenfeld
Copy link
Member

I have to agree with the original sentiment -- except that now there is yet another unexplained type, PyAny 😄

I'd be happy with a nice list of types, and when to use them, on the main PyO3 doc page, and in the guide, but I can't offer to write it (yet).

@konstin
Copy link
Member

konstin commented Apr 17, 2019

PyObjectRef has been renamed PyAny; All the renaming are documented in the changelog. You might want to look at #374 and #376 which were created from this discussion. I'd be about more documentation PRs in future :)

@birkenfeld
Copy link
Member

And I'm happy to make them, if I reach the a state where I feel comfortable enough with the internals.

@birkenfeld
Copy link
Member

I don't think #419 has fixed this; documentation is still missing. (Notably, the change from #419 wasn't added to the guide, either.)

@kngwyu
Copy link
Member

kngwyu commented Apr 24, 2019

Looks Github automatically closed this 😕

@kngwyu kngwyu reopened this Apr 24, 2019
@vadimcn
Copy link
Author

vadimcn commented Sep 14, 2019

So PyAny is PyObject* in terms of Python C API? Not sure it improves clarity :-/ , one needs to be pretty familiar with mypy to make the connection...

Okay, to make this a more constructive criticism, here's the taxonomy and explanation that I would find non-confusing:

"Python-managed object" - an object allocated on Python heap, whose lifetime is controlled by the built-in reference count.
It is not possible to own a Python-managed object, but one can get a &'gil reference to it while the GIL is held. (There should probably be a trait that corresponds to this concept, say PythonManaged).

There are two types of Python-managed objects:

  • Native: PyObject, PyTuple, PySequence, etc..., which represent built-in Python objects and protocols. All these names should be the same as in C API (i.e. PyObject and not PyAny).
  • User-defined: PyUserObject<Impl> - is a user-defined object allocated on Python heap, where Impl is the Rust struct holding data and method impls. (Internally, PyUserObject<...> is basically a PyObject_HEAD followed by Impl).

PyOwnRef<T:PythonManaged> - an owning reference (in Python sense) to a Python-managed object. PyOwnRef can be owned and does not require GIL to be held, but, of course, does not expose any methods that might mutate Python interpreter state.
PyOwnRef<T> may be converted to &'gil T, using GIL guard object.

@davidhewitt
Copy link
Member

I'm going to close this issue; the types mentioned in the OP have more-or-less entirely been reworked at this point. Perhaps the last hurdle is that PyObject and Py<PyAny> will become equivalent in the soon-to-be-merged #1063.

I'm sure there are things that can still be improved, but let's track that in new issues that are more up-to-date.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants