Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor capabilities to simplify code #3409

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

jwnrt
Copy link
Contributor

@jwnrt jwnrt commented Mar 3, 2023

Pull Request Overview

This pull request replaces capability traits with zero-sized structs.

This allows them to be used as regular types and created with an unsafe fn new(), rather than creating a type and implementing the trait through the create_capability! macro.

I notice that this change was submitted years ago in #1485 but wasn't deemed enough of a simplification. I hope I've shown enough why this is a simplification in this PR.

This PR touches many files, so I've tried to keep commits reviewable by separating the changes.

Capability definition

With this change, capabilities are changed from:

pub unsafe trait ProcessManagementCapability {}

to:

pub struct ProcessManagement;

with a wrapping type Capability which prevents these capabilities being constructed from safe code.

Capability creation

Creating capabilities changes from:

let cap = create_capability!(ProcessManagementCapability);

to:

let cap = unsafe { Capability::<ProcessManagement>::new() };

This makes the use of unsafe explicit, and removes the indirection added by the macro. create_capability! can now be removed.

Capability use

Requiring a capability now changes from:

pub fn function(&self, _capability: &dyn ProcessManagementCapability) {}
pub fn function<C: ProcessManagementCapability>(&self, _c: &C) {}

to:

pub fn function(&self, _c: &Capability<ProcessManagement>) {}

This simplifies function and type definitions:

  • Removes generic parameters from both structs and functions.
  • Removes unnecessary trait object &dyns which have to be optimized away.

Testing Strategy

Semantics shouldn't have changed, only syntax.

Ran these parse checks:

  • cargo check -r
  • cargo check --tests
  • cargo test --doc

TODO or Help Wanted

Should be finished as far as I can tell.

Documentation Updated

  • Updated the relevant files in /docs, or no updates are required.
  • Updated rustdoc strings to relevant types and examples.

Formatting

  • Ran make prepush.

@github-actions github-actions bot added component kernel WG-OpenTitan In the purview of the OpenTitan working group. labels Mar 3, 2023
Copy link
Contributor

@brghena brghena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change.

I think you have indeed shown that code is simplified:

  • Direct unsafe rather than macro
  • Less generics syntax everywhere

Question for maybe @hudson-ayers: did we ever decide publicly (i.e., document somewhere) that we need to have a //SAFETY doc next to every unsafe? (thinking about what #3382 does) One "downside" to this change is the unsafe keyword in more places, so we might want many comments on safety.

@jwnrt jwnrt force-pushed the refactor-capabilities branch 4 times, most recently from 5cdf61e to ad0463a Compare March 3, 2023 19:26
@jwnrt
Copy link
Contributor Author

jwnrt commented Mar 3, 2023

Sorry for all the force pushes, I should have checked cargo check --tests and cargo test --doc as well. CI passes now.

I like this change.

Thanks for having a look!

hudson-ayers
hudson-ayers previously approved these changes Mar 6, 2023
Copy link
Contributor

@hudson-ayers hudson-ayers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked through all of the changes yet but I like this -- anything that reduces the proliferation of generic parameters and &dyn is a win in my book, and this overall helps with verbosity

@jwnrt
Copy link
Contributor Author

jwnrt commented Mar 8, 2023

Squashed fixups

@bradjc
Copy link
Contributor

bradjc commented Mar 8, 2023

Question for maybe @hudson-ayers: did we ever decide publicly (i.e., document somewhere) that we need to have a //SAFETY doc next to every unsafe? (thinking about what #3382 does) One "downside" to this change is the unsafe keyword in more places, so we might want many comments on safety.

I personally don't think it's a concern in board main.rs files.

@bradjc
Copy link
Contributor

bradjc commented Mar 8, 2023

I thought we implemented capabilities as traits so we could implement multiple capability traits on the same object, and pass multiple capabilities at once. I don't think that is possible with this approach?

@hudson-ayers
Copy link
Contributor

Question for maybe @hudson-ayers: did we ever decide publicly (i.e., document somewhere) that we need to have a //SAFETY doc next to every unsafe? (thinking about what #3382 does) One "downside" to this change is the unsafe keyword in more places, so we might want many comments on safety.

I personally don't think it's a concern in board main.rs files.

I agree with Brad

@hudson-ayers
Copy link
Contributor

I thought we implemented capabilities as traits so we could implement multiple capability traits on the same object, and pass multiple capabilities at once. I don't think that is possible with this approach?

This is true, but I don't think we ever ended up using capabilities this way in practice

@hudson-ayers
Copy link
Contributor

General question: if we are not using trait objects anymore, why pass a reference to the capability, instead of just passing the capability struct itself? It is a ZST, and passing a ZST is free whereas passing a reference to a ZST is not necessarily free. Before it made sense that we often used references because trait objects are unsized, but it seems like the size/cycle win here could improve further by just passing structs directly

@jwnrt
Copy link
Contributor Author

jwnrt commented Mar 9, 2023

General question: if we are not using trait objects anymore, why pass a reference to the capability, instead of just passing the capability struct itself?

The only real reason they should stay as references is to convey that the caller can have the capability back and reuse it for something else afterwards. This lets you write signatures like fn(Capability<...>) that take a capability away from the caller by requiring a move. I can't think of a use case off the top of my head.

I've added Copy to all the capabilities and removed the references.

The binary didn't change for OpenTitan, but as you say I don't know if references to ZSTs are necessarily always free.

@hudson-ayers
Copy link
Contributor

General question: if we are not using trait objects anymore, why pass a reference to the capability, instead of just passing the capability struct itself?

The only real reason they should stay as references is to convey that the caller can have the capability back and reuse it for something else afterwards. This lets you write signatures like fn(Capability<...>) that take a capability away from the caller by requiring a move. I can't think of a use case off the top of my head.

I've added Copy to all the capabilities and removed the references.

The binary didn't change for OpenTitan, but as you say I don't know if references to ZSTs are necessarily always free.

I think I wasn't very clear in my original message, but what I had in mind was capabilities not implementing Copy, capsules that hold capabilities holding the owned type (accepting them via moves), and functions that require capabilities taking references to those capabilities. My thought was that this would both save some instances where rust might fail to optimize away the storage for a reference (before this PR lots of capsules would store &'static NetworkCapability or similar, and per rust-lang/rfcs#2040 being rejected I think that does take up space on the stack), and that this would make it somewhat more difficult for capsules to permanently share a capability with another capsule

@jwnrt
Copy link
Contributor Author

jwnrt commented Mar 20, 2023

I think I wasn't very clear in my original message, but what I had in mind was capabilities not implementing Copy, capsules that hold capabilities holding the owned type (accepting them via moves), and functions that require capabilities taking references to those capabilities. My thought was that this would both save some instances where rust might fail to optimize away the storage for a reference (before this PR lots of capsules would store &'static NetworkCapability or similar, and per rust-lang/rfcs#2040 being rejected I think that does take up space on the stack), and that this would make it somewhat more difficult for capsules to permanently share a capability with another capsule

Sorry, I'm still not sure I understand, do you have an example from the code of what should change? I've left NetworkCapability unchanged since it's created in userspace

I've removed my previous changes and pushed a commit with an example of what I think you might mean, let me know if it's wrong

@hudson-ayers
Copy link
Contributor

I think I wasn't very clear in my original message, but what I had in mind was capabilities not implementing Copy, capsules that hold capabilities holding the owned type (accepting them via moves), and functions that require capabilities taking references to those capabilities. My thought was that this would both save some instances where rust might fail to optimize away the storage for a reference (before this PR lots of capsules would store &'static NetworkCapability or similar, and per rust-lang/rfcs#2040 being rejected I think that does take up space on the stack), and that this would make it somewhat more difficult for capsules to permanently share a capability with another capsule

Sorry, I'm still not sure I understand, do you have an example from the code of what should change? I've left NetworkCapability unchanged since it's created in userspace

I've removed my previous changes and pushed a commit with an example of what I think you might mean, let me know if it's wrong

I started looking through the repository for more example of what I am talking about, and realized that basically every place that does the thing I described (actually storing a reference to a capability on a struct) was part of the network capabilities, which I agree are a different case.

I am happy with how this looks now, your latest commit matches what I was looking for. I think this is a very clean, efficient design for capabilities.

hudson-ayers
hudson-ayers previously approved these changes Mar 29, 2023
@jwnrt
Copy link
Contributor Author

jwnrt commented Mar 29, 2023

Thanks again! Rebased, although I notice make prepush fails on master now, so hopefully all is still good.

@phil-levis
Copy link
Contributor

I thought we implemented capabilities as traits so we could implement multiple capability traits on the same object, and pass multiple capabilities at once. I don't think that is possible with this approach?

Yes and yes, although I suppose you could create a type that contains multiple capabilities in it (to reduce the number of arguments).

I'm generally pretty wary of these kinds of changes, as there's a small simplicity benefit but it's changing a core mechanism in the kernel. These are the types of changes that make it very hard for out-of-tree ports and such to stay in sync. It's churn.

@ppannuto
Copy link
Member

We talked about this for a while on the core team call today. The consensus was that this is primarily a nice ergonomic change. The other biggest benefit is reduction of some brittleness in relying on compiler optimizations for efficiency. Neither of these are urgent.

The suggestion is to put this on a defer list. When we prepare a Tock 3.0, this is an opportunity where downstream users will already be expecting a larger array of breaking changes. This can go on a 2->3 migration guide, and is a good opportunity to integrate this change.

@ppannuto ppannuto added blocked Waiting on something, like a different PR or a dependency. Tock-3.0 Issues that should be resolved as part of a Tock 3.0 major release labels Apr 14, 2023
@jwnrt
Copy link
Contributor Author

jwnrt commented Apr 14, 2023

Thanks a lot for the update and for all the time spent going over this change!

@nbdd0121
Copy link
Contributor

Passing ZST references are not free, because the address is significant unless the optimizer is certain that it's not. While passing ZST by value is guaranteed to be no-op in ABI.

One way to prevent having to take capability by reference is to encode the lifetime into Capability struct itself:

struct Capability<'a, T>(PhantomData<&'a T>);

impl<'a, T> Capability<'a, T> {
    fn borrow(&self) -> Capability<'_, T> {
        Capability(PhantomData)
    }
}

(newly minted Capability will have 'static)

This does however make the ergonomics a little bit worse because one'll need to write .borrow() when passing capabilities.

@hudson-ayers
Copy link
Contributor

Passing ZST references are not free, because the address is significant unless the optimizer is certain that it's not. While passing ZST by value is guaranteed to be no-op in ABI.

One way to prevent having to take capability by reference is to encode the lifetime into Capability struct itself:

struct Capability<'a, T>(PhantomData<&'a T>);

impl<'a, T> Capability<'a, T> {
    fn borrow(&self) -> Capability<'_, T> {
        Capability(PhantomData)
    }
}

(newly minted Capability will have 'static)

This does however make the ergonomics a little bit worse because one'll need to write .borrow() when passing capabilities.

@jwnrt 's observation that the binary did not change at all when he tried passing by value instead of by reference makes me comfortable that LLVM is decent at optimizing this for Tock. I realize that this is not guaranteed and could change, but it does not necessarily seem to me like something that is likely to break, and I am wary of changes that have only hypothetical benefit (robustness to compiler changes that may or may not ever exist that break things) and concrete cost (cluttering code that uses capabilities with calls to borrow() everywhere).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked Waiting on something, like a different PR or a dependency. component kernel needs-rebase Tock-3.0 Issues that should be resolved as part of a Tock 3.0 major release WG-OpenTitan In the purview of the OpenTitan working group.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet