Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opt out of PyCell borrow tracking #1979

Merged
merged 21 commits into from
Apr 23, 2022
Merged

Opt out of PyCell borrow tracking #1979

merged 21 commits into from
Apr 23, 2022

Conversation

mejrs
Copy link
Member

@mejrs mejrs commented Nov 10, 2021

As suggested in #1068

This exposes an api like this:

#[pyclass]
pub struct Foo {
    #[pyo3(get)]
    field: u32,
}

#[pyclass(mutable)]
pub struct Bar {
    #[pyo3(get, set)]
    field: u32,
}
pub unsafe trait MutablePyClass: PyClass {}

pub struct PyCell<T>;

impl<T> PyCell<T>
where
    T: PyClass,
{
    pub fn borrow(&self) -> PyRef<T>;
}

impl<T> PyCell<T>
where
    T: MutablePyClass,
{
    pub fn borrow_mut(&self) -> PyRefMut<T>;
}

This is done by redirecting the borrow implementations through PyClass, which the PyCell implementation calls internally.

The default implementations for these don't check the borrowflags.

If the user declares #[pyclass(mutable)], then the macro:

  • replaces PyClasss default impls with impls that actually track the borrowing
  • implements MutablePyClass, so that the mutable borrow methods become available.

Short of specialization, I think this is the best way to approach this.

An open question is:

Perhaps this opt-out could be #[pyclass(immutable)] ?

Though TBH I think I prefer immutable-by-default, so perhaps we should have #[pyclass(mutable)] to opt-in to PyCell!

This is a breaking change for a lot of pyclasses, which opt-in to immutable would avoid. However I like Rust's immutable-by-default.

Comment on lines 523 to 524
// If the pyclass has extends/unsendable, we must opt back into PyCell checking
// so that the inner Rust object is not inappropriately shared between threads.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether this is sufficient.

I didn't see another obvious way around unsendable other than to re-enable borrow tracking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is unfortunate and would be nice to resolve rather than forcing mutability silently. I think if we start with #[pyclass(immutable)] for now then we can simply start by raising a compile error that extends/unsendable and immutable are incompatible. We could then create issues to track and fix this properly later with some follow-up refactoring before we complete the change to #[pyclass(mutable)].

@mejrs mejrs marked this pull request as draft November 10, 2021 15:16
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks for kicking this one off! Very excited to see this work get started.

I fully agree that #[pyclass(mutable)] is a desirable end state (and that immutable-by-default is better), however I think we need to phase this in gradually so that users can have time to migrate.

If we keep mutable-by-default for now and introduce #[pyclass(immutable)] then we can add this as an additive feature while we're iterating and improving it. Then eventually we can emit deprecation warning and finally swap over to immutable-by-default.

The implementation is smart, however I'm a bit scared of the use of unsafe trait, and it would be nice to avoid extra additions to the public API. See comments below for some ideas I have on these points...

Also, I'd be really curious to find out about performance impact, if any!

@@ -4,6 +4,7 @@
//! Trait and support implementation for implementing number protocol
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh gosh all the changes in these files just remind me again why #[pyproto] hurts so much 😅 . Getting there!

Comment on lines 523 to 524
// If the pyclass has extends/unsendable, we must opt back into PyCell checking
// so that the inner Rust object is not inappropriately shared between threads.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is unfortunate and would be nice to resolve rather than forcing mutability silently. I think if we start with #[pyclass(immutable)] for now then we can simply start by raising a compile error that extends/unsendable and immutable are incompatible. We could then create issues to track and fix this properly later with some follow-up refactoring before we complete the change to #[pyclass(mutable)].

src/pyclass.rs Outdated
///
/// If `T` does not implement [`MutablePyClass`], then implementations should not override the
/// default methods.
pub unsafe trait PyClass:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, it's a bit painful to see the PyClass trait extend in this way, for a couple of reasons:

  • The contract between the relationship of MutablePyClass & PyClass is a bit scary.
  • I'd rather keep these details out of the public API, because I expect we'll need to iterate on PyClass a few times yet and it would be nice if fewer users broke / we didn't have to restrict ourselves to iterating in semver-breaking releases.

I wonder if there's a possible design where we introduce multiple types of BorrowFlag, similar to how we have the ThreadChecker / ThreadCheckerStub?

I'm thinking it might be possible to write an implementation where an "immutable" borrowflag always fails to borrow mutably and is a no-op when borrowing immutably.

The exact BorrowFlag type to use could be added to the class in the PyClassImpl trait then to keep it out of the public API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The contract between the relationship of MutablePyClass & PyClass is a bit scary.

I'd rather keep these details out of the public API, because I expect we'll need to iterate on PyClass a few times yet and it would be nice if fewer users broke / we didn't have to restrict ourselves to iterating in semver-breaking releases.

I mean, how public is PyClass actually? Are people really out there implementing PyClass for their structs? They also need to implement "hidden" traits to do so. I'd like to think PyClass is read only public, so adding things shouldn't be an issue.

That said, I agree this coupling between the two traits is not ideal. In particular PyClass' trait methods don't feel like a natural fit.

But I like the PyCell api that we end up with, in particular that it doesn't add a bunch of trait bounds to it. It could also be implemented cleanly with specialization or some sort of negative trait bounds, but we don't have those.

I wonder if there's a possible design where we introduce multiple types of BorrowFlag, similar to how we have the ThreadChecker / ThreadCheckerStub?

This would involve a runtime check, right?

I'm thinking it might be possible to write an implementation where an "immutable" borrowflag always fails to borrow mutably and is a no-op when borrowing immutably.

Would that be be a runtime error or a compilation error? I really prefer the latter, if possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, how public is PyClass actually? Are people really out there implementing PyClass for their structs? They also need to implement "hidden" traits to do so. I'd like to think PyClass is read only public, so adding things shouldn't be an issue.

That's true, however everything the macros access is "public" in some form. It's not really about users implementing PyClass, but more the existence of these methods - I would rather they were in a place where its obvious that users are not expected to call the methods. I've been making PyClassImpl the bucket of "stuff which we can freely change so don't blame us if you use it and it breaks in a patch release".

This would involve a runtime check, right?

Yes, but if the immutable borrowflag is trivial no-op the call should be optimized away to nothing in release builds.

Would that be be a runtime error or a compilation error? I really prefer the latter, if possible.

Agreed compilation error would be best if we can make it so. I do like the design you've built for Py and PyCell here where it's impossible to even try to borrow mutably if the class is not mutable.

I'll try and prepare a PR for you to add to this branch the idea I'm thinking a bit better...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather they were in a place where its obvious that users are not expected to call the methods. I've been making PyClassImpl the bucket of "stuff which we can freely change so don't blame us if you use it and it breaks in a patch release".

Yes, that looks like a better place for them.

I'm thinking it might be possible to write an implementation where an "immutable" borrowflag always fails to borrow mutably and is a no-op when borrowing immutably.

Now that I've given it a second readover - do you mean that rather than the actual borrowing itself, we proxy the borrowflag checking through a trait? That is probably more appropriate I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly - like we have the PyClassThreadChecker trait for #[unsendable], we could have a PyClassBorrowChecker trait for this use-case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think it's going to be a couple of days yet before I'll have a chance to write the PR, if you think you've got enough to go on feel free to go with it?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can work with that.

@mejrs
Copy link
Member Author

mejrs commented Nov 11, 2021

I fully agree that #[pyclass(mutable)] is a desirable end state (and that immutable-by-default is better), however I think we need to phase this in gradually so that users can have time to migrate.

If we keep mutable-by-default for now and introduce #[pyclass(immutable)] then we can add this as an additive feature while we're iterating and improving it. Then eventually we can emit deprecation warning and finally swap over to immutable-by-default.

That is a good idea.

Also, I'd be really curious to find out about performance impact, if any!

From my (naive) test using the above structs, field access from Python is about 2-3% faster.

This is quite hard to test. In a realistic scenario the performance impact of borrow checking is probably vanishingly small compared to everything else. I'm not good at testing - I'm hoping someone else gives it a try.

@mejrs
Copy link
Member Author

mejrs commented Nov 14, 2021

I just realized we probably also want an api like:

impl<T> PyCell<T>
where
    T: ImmutablePyClass,
{
    pub fn borrow_unguarded(&self) -> &T;
}

So that's yet another new trait...

@mejrs
Copy link
Member Author

mejrs commented Nov 14, 2021

We could also use Deref for this.

@davidhewitt
Copy link
Member

davidhewitt commented Nov 15, 2021

So that's yet another new trait...

We might be able to use a blanket impl of something like impl<T> ImmutablePyClass for T where T: PyClass<BorrowChecker = Immutable>. Maybe similar for MutablePyClass too.

(Where the BorrowChecker trait is the one the abstracts the behaviour above.)

We could also use Deref for this.

I was thinking about making PyCell<Subclass> deref to PyCell<Baseclass>, and a similar set of new deref implementations for native types (e.g. PyBool -> PyInt), so would rather not use deref here until that idea is decided on.

@davidhewitt
Copy link
Member

From my (naive) test using the above structs, field access from Python is about 2-3% faster.

This is quite hard to test. In a realistic scenario the performance impact of borrow checking is probably vanishingly small compared to everything else. I'm not good at testing - I'm hoping someone else gives it a try.

TBH that sounds about right, probably over time we might find things can improve even further. There's also the memory saving where in theory we don't need to make space for the borrow flag on immutable classes.

@mejrs
Copy link
Member Author

mejrs commented Nov 22, 2021

We might be able to use a blanket impl of something like impl ImmutablePyClass for T where T: PyClass<BorrowChecker =
Immutable>. Maybe similar for MutablePyClass too.

This requires making pyclass generic - not sure if that is desirable.

I've changed it around to be opt-in to immutable:

#[pyclass(immutable)]
pub struct Foo {
    #[pyo3(get)]
    field: u32,
}

Todos:

  • Make sure people can't implement immutable + extends/unsendable at the same time
  • Docs
  • More tests
  • A safe method to get &T

@davidhewitt
Copy link
Member

Sorry will try to review this tomorrow!

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, I'm happy with how the user-facing design is shaping up!

For the implementation, I'd like us to try to aim to avoid using the memory for the BorrowFlag in the immutable case, and also it would be nice to avoid having to generate a new impl BorrowImpl for each pyclass in the macro code. I've suggested an alternative implementation which may enable that, and will also then look more similar to the other pieces of PyCellImpl.

On the other hand, I have a sneaking suspicion that exploring solutions for problems like #1089 and #1358 may require us to rework chunks of the PyCell machinery again anyway, so if you'd rather not spend extra time refining this right now I'm okay with that 👍

@@ -0,0 +1,13 @@
use pyo3::prelude::*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a few tests for #[pyfunction] and #[pymethods] to see how they cope with various combinations too.

@@ -39,7 +39,7 @@ impl<T> Copy for PyClassImplCollector<T> {}
///
/// Users are discouraged from implementing this trait manually; it is a PyO3 implementation detail
/// and may be changed at any time.
pub trait PyClassImpl: Sized {
pub trait PyClassImpl: Sized + BorrowImpl {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I was going to propose a slightly different design. Instead of BorrowImpl super type, add the following associated type to PyClassImpl:

type BorrowChecker: PyClassBorrowChecker;

Where PyClassBorrowChecker is the following trait:

trait PyClassBorrowChecker {
    /// Creates a new borrow checker
    fn new() -> Self;
    /// Increments immutable borrow count, if possible
    fn borrow(&self) -> Result<(), PyBorrowError>;
    /// Decrements immutable borrow count
    fn release_borrow(&self) -> ();
    /// Increments mutable borrow count, if possible
    fn borrow_mut(&self) -> Result<(), PyBorrowMutError>;
    /// Decremements mutable borrow count
    fn release_borrow_mut(&self) -> ();
}

Then we have struct PyCellBorrowChecker(Cell<BorrowFlag>) and zero-sized struct ImmutableBorrowChecker { _private: () } which can implement this trait.

PyBaseCell<T> would then change to contain borrow_checker: T::BorrowChecker instead of borrow_flag: Cell<BorrowFlag>.

That should allow us to not make space for the flag in the immutable case, as well as avoid needing to implement BorrowImpl as part of the macros.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That changes the layout of PyCellBase -is that alright? I notice it is #[repr(C)]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the #[repr(C)] is because this is the layout of the type in Python land.

src/pyclass.rs Outdated
Comment on lines 32 to 33
pub unsafe trait MutablePyClass: PyClass {}
pub unsafe trait ImmutablePyClass: PyClass {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the associated type, we can have blanket impls:

unsafe impl<T> MutablePyClass for T where T: PyClass<BorrowChecker = PyCellBorrowChecker> { }
unsafe impl<T> ImmutablePyClass for T where T: PyClass<BorrowChecker = ImmutableBorrowChecker> { }

... as long as these don't affect compiler errors too much.

src/pycell.rs Outdated
Comment on lines 937 to 939
pub trait PyCellLayout<T>: PyLayout<T> {
fn get_borrow_flag(&self) -> BorrowFlag;
fn set_borrow_flag(&self, flag: BorrowFlag);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This trait would probably have to change a bit:

pub trait PyCellLayout<T: PyClass>: PyLayout<T> {
    fn borrow_checker(&self) -> &T::BorrowChecker;
    
    // ... rest unchanged

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that if you do that you end up with a circular trait bound somewhere.

fn type_object_raw(py: pyo3::Python) -> *mut pyo3::ffi::PyTypeObject {
use pyo3::type_object::LazyStaticType;
fn type_object_raw(py: ::pyo3::Python<'_>) -> *mut ::pyo3::ffi::PyTypeObject {
use ::pyo3::type_object::LazyStaticType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that these global hygiene changes in the docs conflict with #2022.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, since it says "roughly", I'd not do any special hygience dance here, neither ::pyo3 nor the new _pyo3 and use.

src/pycell.rs Outdated
Comment on lines 398 to 404
let flag = crate::class::impl_::BorrowImpl::get_borrow_flag()(self);
if flag == BorrowFlag::HAS_MUTABLE_BORROW {
Err(PyBorrowError { _private: () })
} else {
self.set_borrow_flag(flag.increment());
crate::class::impl_::BorrowImpl::increment_borrow_flag()(self, flag);
Ok(PyRef { inner: self })
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this design, the implementation of this then becomes:

self.borrow_checker().borrow().map(|_| PyRef { inner: self })

and similar for the other methods.

@mejrs
Copy link
Member Author

mejrs commented Nov 26, 2021

I've tried to do those things, but I don't understand how to fit everything together.

the part 1 commit works, but is by default mutable. If you want to give it a try that's probably the best place to start.

@davidhewitt
Copy link
Member

Thanks, sorry if my idea has turned out to be ill-thought-out. I'll try to find some time to look into this in detail over weekend or end of Monday at the latest.

@mejrs
Copy link
Member Author

mejrs commented Nov 27, 2021

Thanks, sorry if my idea has turned out to be ill-thought-out. I

I think it can work - it does however infect a lot of other traits which makes the implementation quite tricky.

I'll try to find some time to look into this in detail over weekend or end of Monday at the latest.

Whenever you are ready :)

I just pushed some changes that I think are the correct way to go, but there are some trait bounds that I can't quite figure out at this point.

@davidhewitt davidhewitt mentioned this pull request Dec 27, 2021
10 tasks
@davidhewitt
Copy link
Member

I'm finally having a play with this branch this morning - not sure how quickly I'll be able to finish it off :)

@mejrs mejrs added the CI-no-fail-fast If one job fails, allow the rest to keep testing label Apr 12, 2022
@mejrs
Copy link
Member Author

mejrs commented Apr 12, 2022

Uh...it seems like this is incompatible with Rust 1.48? 😭

@davidhewitt
Copy link
Member

Ugh, yes it looks like the older compiler can't handle the complexity of the trait bounds.

I wonder whether some other cleanup and refactoring might unblock this, though it's not obvious what.

Also, given that the traits which seem to be problematic are related to the fact that only mutable classes can be base types in this model, I wonder if we flesh out the rules on how mutability interacts with inheritance if that'll unblock the older compiler.

@mejrs
Copy link
Member Author

mejrs commented Apr 14, 2022

It works on rust 1.49, which is really frustrating.

Any chance of a msrv bump (or a bumplet) with 0.17? 🥺

@birkenfeld
Copy link
Member

You know what the policy is...

@davidhewitt
Copy link
Member

Yeah as much as "one more version" feels approachable it would drop support for Debian, which I think would upset a lot of people.

@mejrs
Copy link
Member Author

mejrs commented Apr 17, 2022

Oh right, Debian will be stuck on 1.48 for at least a year :(

@davidhewitt
Copy link
Member

davidhewitt commented Apr 19, 2022

I'm hopeful that if we define the semantics of inheritance in combination with this feature then we might be able to refactor traits enough to avoid whatever currently confuses Rust 1.48.

I think the correct approach would be to go along the lines of #[pyclass(dict)], where the dict slot gets placed in the first child in the inheritance hierarchy which includes the annotation. So in this case, the borrow flag would be placed with the first mutable class, and then that borrow flag is shared between that and all children. (All parents would be immutable and not need to care about the flag.)

I'm happy to try to push a commit to implement that strategy and see if it fixes things?

@mejrs
Copy link
Member Author

mejrs commented Apr 19, 2022

I'm happy to try to push a commit to implement that strategy and see if it fixes things?

That would be greatly appreciated! I'm actually not at all familiar with how we implement inheritance 😅

@davidhewitt davidhewitt force-pushed the immutable branch 2 times, most recently from dd24e78 to 4d0af65 Compare April 21, 2022 07:22
@davidhewitt
Copy link
Member

davidhewitt commented Apr 21, 2022

Ok, so I think I've managed to push something which works... let's see if the 1.48 tests pass this time 🤞

The commit itself is a bit messy, I tacked a new set of structs and traits alongside the stuff already implemented here so that I could work in pieces rather than break everything and lose myself in a sea of compile errors.

So I don't forget, these are things which should be cleaned up:

  • I moved a few associated types from PyClass to PyClassImpl (i.e. make them not public). This was necessary to allow some other trait bounds to be just on PyClassImpl. I think this is fine to do, however we should mention in the CHANGELOG.
  • I added a second associated type in PyClassImpl - Mutability and PyClassMutability. It should be possible to just have one.
  • (Internal) documentation for how the new traits work.
  • Documentation for #[pyclass(immutable)] in the guide.
  • Might be nice to benchmark / optimise these new code paths.
  • Now that it's not actually doing borrow checking in the immutable case, I'd like to propose renaming PyCell<T> to PyClassObject<T>.

Given that this is now a big hefty branch which is getting harder to rebase, I suggest that if CI passes we merge this branch now and then you and I can more easily work on the items above in parallel as we work towards 0.17.


I'm actually not at all familiar with how we implement inheritance 😅

Essentially PyCell<T> where T extends U contains a valid "layout" of U as its first member. The PyClassImpl and PyClassBaseType traits working in tandem then handle the logic of how the PyCell behaves. It's hard, and poorly documented - as part of this work I can reorganise it to make it easier for others to follow.

@davidhewitt
Copy link
Member

Oh darn, it now fails later on, but still fails. I'll keep trying...

@davidhewitt
Copy link
Member

Hooray, looks like this works! Rust 1.48 definitely doesn't seem to be as comfortable with the trait bounds, but with some effort it compiles.

Are folks ok if I merge this as-is and open a tracking issue with the list of items above for us to complete by 0.17?

@davidhewitt
Copy link
Member

Actually looking at this again briefly this morning I'll try to push a cleanup commit first.

@mejrs
Copy link
Member Author

mejrs commented Apr 22, 2022

Given that this is now a big hefty branch which is getting harder to rebase, I suggest that if CI passes we merge this branch now and then you and I can more easily work on the items above in parallel as we work towards 0.17.

Yes please 🙏 No more resolving merge conflicts.

@davidhewitt davidhewitt marked this pull request as ready for review April 23, 2022 12:32
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm satisfied with this for now, will proceed to merge! Thanks @mejrs for kicking this off and keeping the branch up-to-date, I'm very excited that we can get this into 0.17!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI-no-fail-fast If one job fails, allow the rest to keep testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants