Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename PyClassShell with PyCell and do mutability checking #770

Merged
merged 29 commits into from Mar 2, 2020

Conversation

kngwyu
Copy link
Member

@kngwyu kngwyu commented Feb 16, 2020

Resolves #342 #749 #738

TODO

  • Fix documents
  • Fix documentation in th guide
  • Add documentation about interior mutability.
  • Now protocol implementations uses try_borrow_unguarded so fix them to use try_borrow
  • Add documents to all new APIs
  • Rethink Py::new
    • I just marked it as soft-duplicated
  • Add tests for inheriting classes with weakref/dict slots

TL; DR

Now we can take multiple mutable references of PyClass.
This PR fixes it by runtime reference counting like RefCell.
We can use PyCell like RefCell. We can get PyRef and PyRefMut from &PyCell, with interior mutability checking.
The implementation is done via overhauling our PyClass and PyClassShell, resulting in renaming PyClassShell with PyCell.

Notation

BorrowFlag it the reference counter that PyCell has. It can also represent mutably borrowed.

Implementation

So, what's the problem of implementing mutability checking for PyClasses?
The answer is inheritation.
Our PyClass can inherit other class by #[pyclass(extends=OtherClass)] syntax.
In such cases, we should share BorrowFlag between child and parent classes.
So the following object layout is required:

pub struct PyCell {
    ob_base: ffi::PyObject,
    borrow_flag: BorrowFlag,
    value1: ParentClass,
    value2: ChildClass,
}

It is somewhat difficult since PyCell is defined recursively using the layout of base object, but I resolved this by using

#[repr(C)]
pub struct PyCellBase<T: PyTypeInfo> {
    ob_base: T::Layout,
    borrow_flag: Cell<BorrowFlag>,
}

as the farthest baseclasses for all PyClasses.
This needs some helper types like derive_utils::PyBaseTypeUtils.

Changed APIs

  • PyTryFrom: Now returns &PyCell<T> instead of &T and &mut T
  • FromPyPointer: Same as above.
  • AsPyRef: Now has an associated type Target. Returns &PyCell<T> from Py<T>.
    (TODO: make the complete list)

New APIs

  • PyRef
  • PyRefMut
  • (hidden) PyDowncastImpl
    (TODO: make the complete list)

@davidhewitt
Copy link
Member

Awesome! There's a lot here and I'll try to give it all a thorough review, though this will take me a few days...

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome stuff, really excited to see this! I've given this a really deep read and asked a lot of questions - hopefully useful comments.

In such cases, we should share BorrowFlag between child and parent classes.

Can you explain in a bit more detail why this has to be the case? I would think that as the Rust objects are separate, it should be safe to have &mut T::Base and &T simultaneously.

src/type_object.rs Outdated Show resolved Hide resolved
src/class/macros.rs Outdated Show resolved Hide resolved
guide/src/class.md Outdated Show resolved Hide resolved
guide/src/class.md Outdated Show resolved Hide resolved
guide/src/class.md Outdated Show resolved Hide resolved
src/derive_utils.rs Outdated Show resolved Hide resolved
src/instance.rs Show resolved Hide resolved
src/instance.rs Show resolved Hide resolved
src/objectprotocol.rs Outdated Show resolved Hide resolved
#[repr(C)]
pub struct PyCellBase<T: PyTypeInfo> {
ob_base: T::Layout,
borrow_flag: Cell<BorrowFlag>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth making an assertion that this doesn't need to be thread-safe (i.e. not AtomicXyz) because the of the GIL?

Copy link
Member Author

@kngwyu kngwyu Feb 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already not Send nor Sync.
What kind of assertion do you have in mind?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if it's not Send or Sync in Rust, the python interpreter will happily share this data between threads. Perhaps just a comment explaining why Cell is enough?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

Co-Authored-By: David Hewitt <1939362+davidhewitt@users.noreply.github.com>
@kngwyu
Copy link
Member Author

kngwyu commented Feb 18, 2020

Thank you for the lots of reviews!
Enjoying addressing them 😃

Can you explain in a bit more detail why this has to be the case? I would think that as the Rust objects are separate, it should be safe to have &mut T::Base and &T simultaneously.

If we allow having PyRefMut<T> and PyRefMut<T::BaseType> safely, get_super APIs should be (roughly) as follows:

impl<T> PyCell<T> {
    pub fn get_super(&self) -> &PyCell<T::BaseType> { ... }
}
impl<'p, T> PyRef<'p, T>
{
    pub fn get_super(&self) ->  &PyCell<T::BaseType> { ... }
}
...

I think this has some downsides:

  • Simply not convenient. I feel PyRefMut<T> -> PyRefMut<T::BaseType> style API is easier to use.
  • If T::BaseType have weakref or dict flag we can't return PyCell<T::BaseType>(just an implementation detail, but can be hard to fix)

@kngwyu kngwyu marked this pull request as ready for review February 18, 2020 08:24
@davidhewitt
Copy link
Member

Ah, that makes sense, thanks. Let's see what the gitter thread prefers. I'm warming to your design. It would be strange to try to .super() an object and then get a .PyCell.

src/types/mod.rs Outdated Show resolved Hide resolved
@davidhewitt
Copy link
Member

An idea: what if we had a type PyBorrowed<'a, T> ?

Then the API for get_super could be something like

impl<T> PyBorrowed<`_, T> {
    fn get_super(&self) -> PyBorrowed<T::BaseLayout>
}

Along with:

I wrote this in a rush, sorry if it doesn't make much sense, or it has other problems!

@kngwyu
Copy link
Member Author

kngwyu commented Feb 22, 2020

An idea: what if we had a type PyBorrowed<'a, T> ?

Sounds good for the future design, but I don't think we should add it now.

@davidhewitt
Copy link
Member

Excited to see all the progress on this. I'm hoping to find time to give this another review in the next day or two. 👍

@birkenfeld
Copy link
Member

I can't contribute at the moment, but just wanted to say that this is very cool. Making the API safe is very much appreciated!

@kngwyu kngwyu added this to the 0.9.0 milestone Feb 27, 2020
@kngwyu
Copy link
Member Author

kngwyu commented Feb 28, 2020

@Alexander-N
Could you please take a glance at class.md and documents for PyCell?
Thanks.

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've managed to give this another read through. Really pleased with how this is shaping up!

Given the similarity between PyCellBase and RefCell, I wonder if we could simplify the code and make PyCell a wrapper around RefCell rather than implementing the unsafe primitives ourselves. I appreciate the way this is laid out it might be hard, but in my opinion if we can do it and reduce the amount of unsafe non-ffi code in pyo3, it's worth it.

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
guide/src/class.md Outdated Show resolved Hide resolved
guide/src/class.md Outdated Show resolved Hide resolved
pyo3-derive-backend/src/module.rs Show resolved Hide resolved
src/instance.rs Show resolved Hide resolved
src/pycell.rs Show resolved Hide resolved
src/pyclass.rs Outdated Show resolved Hide resolved
src/type_object.rs Outdated Show resolved Hide resolved
src/type_object.rs Outdated Show resolved Hide resolved
kngwyu and others added 2 commits March 1, 2020 12:43
Co-Authored-By: David Hewitt <1939362+davidhewitt@users.noreply.github.com>
@kngwyu
Copy link
Member Author

kngwyu commented Mar 1, 2020

@davidhewitt
Thank you for lots of reviews 😃

I wonder if we could simplify the code and make PyCell a wrapper around RefCell

RefCell<T> is a pair (T, BorrowFlag).
However, to enable extends= feature, PyCell can be a triplet (Base, Sub, BorrowFlag) or larger tuple. This nature conceptually makes it difficult to use RefCell here.

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few small tidy-ups still possible; in general this is looking great to me.

RefCell is a pair (T, BorrowFlag).
However, to enable extends= feature, PyCell can be a triplet (Base, Sub, BorrowFlag) or larger tuple. This nature conceptually makes it difficult to use RefCell here.

That makes total sense. Let's keep it as-is.

Copy link
Member

@Alexander-N Alexander-N left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a great step, very excited to see this! At the moment I can't look at it in detail, but I tried to at least give some feedback, hope it helps.

you can use `PyClassShell<T>`.
Or you can use `Py<T>` directly, for *not-GIL-bounded* references.
`PyCell<T: PyClass>` is always allocated in the Python heap, so we don't have the ownership of it.
We can get `&PyCell<T>`, not `PyCell<T>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We can get `&PyCell<T>`, not `PyCell<T>`.
We can only get `&PyCell<T>`, not `PyCell<T>`.

@@ -228,9 +242,12 @@ baseclass of `T`.
But for more deeply nested inheritance, you have to return `PyClassInitializer<T>`
explicitly.

To get a parent class from child, use `PyRef<T>` instead of `&self`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To get a parent class from child, use `PyRef<T>` instead of `&self`,
To get a parent class from a child, use `PyRef<T>` instead of `&self`,

fn method2(self_: &PyClassShell<Self>) -> PyResult<usize> {
self_.get_super().method().map(|x| x * self_.val2)
fn method2(self_: PyRef<Self>) -> PyResult<usize> {
let super_ = self_.as_ref(); // Get &BaseClass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this is surprising, while get_super was very clear.

Copy link
Member Author

@kngwyu kngwyu Mar 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was initially named as_super but changed to use AsRef since wasm-bindgen does so, after some discussions in gitter.
I prefer to as_super but, in the long run, it would be beneficial to use the same API convention with a major library.

# pyo3::py_run!(py, subsub, "assert subsub.method3() == 3000")
```

To access the super class, you can use either of these two ways:
- Use `self_: &PyClassShell<Self>` instead of `self`, and call `get_super()`
- Use `self_: &PyCell<Self>` instead of `self`, and call `get_super()`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit confusing that only into_super was used in the example.

Copy link
Member Author

@kngwyu kngwyu Mar 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, this was left unchanged... thanks

- `ObjectProtocol::get_base`
We recommend `PyClassShell` here, since it makes the context much clearer.
We recommend `PyCell` here, since it makes the context much clearer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can simply remove get_base since I remember there was a problem with this method #381

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -761,16 +780,16 @@ struct GCTracked {} // Fails because it does not implement PyGCProtocol
Iterators can be defined using the
[`PyIterProtocol`](https://docs.rs/pyo3/latest/pyo3/class/iter/trait.PyIterProtocol.html) trait.
It includes two methods `__iter__` and `__next__`:
* `fn __iter__(slf: &mut PyClassShell<Self>) -> PyResult<impl IntoPy<PyObject>>`
* `fn __next__(slf: &mut PyClassShell<Self>) -> PyResult<Option<impl IntoPy<PyObject>>>`
* `fn __iter__(slf: PyRefMut<Self>) -> PyResult<impl IntoPy<PyObject>>`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked that PyRefMut was replaced with PyClassShell since I find PyRef/PyRefMut unintuitive for reasons mentioned in #356

PySelf(syn::TypeReference),
// self_: &PyCell<Self>,
PySelfRef(syn::TypeReference),
// self_: PyRef<Self> or PyRefMut<Self>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgotten comments?

Copy link
Member Author

@kngwyu kngwyu Mar 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are comments for developers.
I rewrote them as doc comments.

@kngwyu
Copy link
Member Author

kngwyu commented Mar 2, 2020

@Alexander-N
Thank you for reviews!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use RefCell to enforce aliasing rules
4 participants