Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add explicit atomic (and aligned) operations #104

Closed
wants to merge 6 commits into from

Conversation

alexandruag
Copy link
Collaborator

Related to #102

Hi,

This PR adds four new methods (read_atomic, write_atomic, read_aligned, and write_aligned) to the GuestMemory and GuestMemoryRegion interfaces. Here are some details:

  • The proposal is mainly about adding read_atomic and write_atomic as part of the GuestMemory interface in order to provide a more straightforward (and always available) mechanism for performing atomic accesses with explicit semantics.

  • The Aligned<Addr, T> abstraction appears interesting and useful to represent additional alignment information and/or requirements about an address. It needs some more methods to improve the ergonomics of casting + offsetting, together with examples, but for the time being I am really curious if others also think this is worth pursuing.

  • For now, the new methods are added directly to GuestMemory and GuestMemoryRegion (as opposed to being part of the Bytes interface) because Bytes provides no guarantees about the alignment of the container (whereas both GuestMemory and GuestMemoryRegion are expected to be aligned to page boundaries -- maybe we should mention this explicitly btw). There are a couple of ways to reconcile the interfaces going forward, but I think what matters for now is to have a well-defined and straightforward way of performing atomic accesses based on a generic M: GuestMemory object.

  • I've also ran a couple of synthetic tests, and the improvement for atomic reads/writes seems quite promising (i.e. > 2x). I'll try to do some more after I manage to get a better setup going.

@alexandruag alexandruag self-assigned this Jul 19, 2020
@bonzini
Copy link
Member

bonzini commented Jul 21, 2020

What is the specific reason for removing the automatic implementation of Bytes? That is, which methods are meant to be overridden by different GuestMemory implementation?

Regarding read/write_atomic, I would prefer to have an ordering argument and use atomics instead of volatile accesses, but the overall concept is great. I think we should add an Error for misaligned containers so that the methods can be added directly to Bytes. Can you also document how read/write_atomic compare to VolatileAtomicRef? Is it just different ergonomics just like VolatileRef vs.read/write_obj?

@alexandruag
Copy link
Collaborator Author

Hi Paolo, thanks for the comment! I just want to try out a couple more ideas (including the things you mentioned) and then I'll update the PR and provide a complete answer.

@alexandruag
Copy link
Collaborator Author

Hi again, the PR is updated. I've added an atomic_ref method to Bytes, as well as {read|write}_{atomic|aligned}. To answer the previous questions:

  • I think the automatic Bytes implementation for GuestMemory removes too much flexibility from custom implementations, but it's not really in the scope of this PR to remove it, so I've dropped that commit (it will return in a different PR :D).

  • I've added an Ordering argument to read/write_atomic, and a default implementation that leverages atomic_ref. The presence of all these methods within the main Bytes interface is mainly a quality of life improvement (and the current implementation seems to make things faster). I think having the atomic operations clearly defined and easily accessible makes things safer and easier to understand when writing stuff like the virtio queue/device code, etc.

  • atomic_ref and VolatileMemory::get_atomic_ref are actually not totally generic, in the sense that they require the access to happen within a memory region backed by host memory, because they return an actual reference. If this ever becomes an actual limitation, we can replace the reference with a reference-like abstraction. I know it's kinda awkward to have them both right now, but I think we should wait a bit more to see what other changes happen before we're happy with the interface.

  • It's worth noting that read/write_obj (and similar) methods provide more than just better ergonomics when compared to VolatileRef, because the latter is only usable when the objects do not cross region boundaries (which makes it quite awkward to use; it also suffers from the "must be backed by host memory" restriction). Going forward, it would be nice if the Array and Ref abstractions hide away cross region accesses, and the Bytes methods can use them and only provide better ergonomics.

  • I wanted to update the PR as soon as possible to get more feedback so the Aligned<A, T> usability improvements are still on their way :( If ppl are ok with the PR, I'll add more documentation and the missing tests as well.

Copy link
Member

@bonzini bonzini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks great, only some stylistic choices to discuss (saving the flames for the blanket implementation PR :)).


impl<Addr: Address, T> Aligned<Addr, T> {
/// Attempt to create an `Aligned` value based on `addr`.
pub fn from_addr(addr: Addr) -> result::Result<Self, AlignmentError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these two functions be implementations of std::convert::TryFrom instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from_addr should be a TryFrom implementation, but it conflicts with a strange internal automatic implementation defined here (I think). Replacing cast's current form with something like impl<A, T> TryFrom<Aligned<A,T>> for Aligned<A,T> doesn't work because it also conflicts with some auto implementation defined somewhere :(

/// Wraps a `GuestAddress` that's known to be aligned with respect to `T`.
pub type AlignedGuestAddress<T> = Aligned<GuestAddress, T>;

impl<T> std::convert::TryFrom<GuestAddress> for AlignedGuestAddress<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementing impl<Addr, T> TryFrom<Addr> for Aligned<Addr, T> should make these two more specific implementations unnecessary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I didn't manage to get that to work, like I mentioned during a previous comment :(

@@ -117,11 +119,33 @@ impl Display for Error {
pub struct GuestAddress(pub u64);
impl_address_ops!(GuestAddress, u64);

/// Wraps a `GuestAddress` that's known to be aligned with respect to `T`.
pub type AlignedGuestAddress<T> = Aligned<GuestAddress, T>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these two types useful?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They seemed nicer to use than the fully qualified Aligned<X, Y>, but I don't have any strong feelings here.

pub struct Aligned<Addr, T> {
addr: Addr,
phantom: PhantomData<T>,
}

// Implementing `Clone` and `Copy` manually because deriving them did not work well
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it work if you use a PhantomData<*mut const T> instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a neat suggestion, but unfortunately doesn't seem to work ... I imagine the background logic sees a T that's not necessary Copy as part of the type signature, and it simply gives up at that point. However, your comment reminded me of this interesting provision regarding PhantomData, so I switched to PhantomData<*const T> anyway.

pub struct GuestAddress(pub u64);

impl From<u64> for GuestAddress {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should be implemented automatically by impl_address_ops!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is implemented to satisfy the newly added From<<Self as AddressValue>::V> trait bound for Address, which I found necessary precisely because I couldn't find a way to get the inner value using logic implemented as part of the impl_address_ops! macro :(

For example, the macro previously expected to only be called for newtypes (i.e. GuestAddress(u64)), so it could implement raw_value automatically as self.0. However, when I wanted to use the same macro for regular types (i.e. to implement Address for usize), I couldn't find a way to abstract obtaining the raw value that worked for both situations (and that didn't break various macro hygiene rules), so I've added the above trait bound. If there's a way to implement the conversion as part of the macro, then implementing From/Into would no longer be necessary.

}
}

impl Into<u64> for GuestAddress {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And Into should probably never be implemented, instead you should implement From<GuestAddress> for u64.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

pub struct MemoryRegionAddress(pub u64);

impl From<u64> for MemoryRegionAddress {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@bonzini bonzini self-requested a review July 27, 2020 13:28
Copy link
Member

@bonzini bonzini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I clicked "approve" by mistake.

Copy link
Collaborator Author

@alexandruag alexandruag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in answering. I'll post an extra commit with tentative Aligned conversion improvements soon.


impl<Addr: Address, T> Aligned<Addr, T> {
/// Attempt to create an `Aligned` value based on `addr`.
pub fn from_addr(addr: Addr) -> result::Result<Self, AlignmentError> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from_addr should be a TryFrom implementation, but it conflicts with a strange internal automatic implementation defined here (I think). Replacing cast's current form with something like impl<A, T> TryFrom<Aligned<A,T>> for Aligned<A,T> doesn't work because it also conflicts with some auto implementation defined somewhere :(

@@ -117,11 +119,33 @@ impl Display for Error {
pub struct GuestAddress(pub u64);
impl_address_ops!(GuestAddress, u64);

/// Wraps a `GuestAddress` that's known to be aligned with respect to `T`.
pub type AlignedGuestAddress<T> = Aligned<GuestAddress, T>;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They seemed nicer to use than the fully qualified Aligned<X, Y>, but I don't have any strong feelings here.

/// Wraps a `GuestAddress` that's known to be aligned with respect to `T`.
pub type AlignedGuestAddress<T> = Aligned<GuestAddress, T>;

impl<T> std::convert::TryFrom<GuestAddress> for AlignedGuestAddress<T> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I didn't manage to get that to work, like I mentioned during a previous comment :(

pub struct Aligned<Addr, T> {
addr: Addr,
phantom: PhantomData<T>,
}

// Implementing `Clone` and `Copy` manually because deriving them did not work well
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a neat suggestion, but unfortunately doesn't seem to work ... I imagine the background logic sees a T that's not necessary Copy as part of the type signature, and it simply gives up at that point. However, your comment reminded me of this interesting provision regarding PhantomData, so I switched to PhantomData<*const T> anyway.

pub struct GuestAddress(pub u64);

impl From<u64> for GuestAddress {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is implemented to satisfy the newly added From<<Self as AddressValue>::V> trait bound for Address, which I found necessary precisely because I couldn't find a way to get the inner value using logic implemented as part of the impl_address_ops! macro :(

For example, the macro previously expected to only be called for newtypes (i.e. GuestAddress(u64)), so it could implement raw_value automatically as self.0. However, when I wanted to use the same macro for regular types (i.e. to implement Address for usize), I couldn't find a way to abstract obtaining the raw value that worked for both situations (and that didn't break various macro hygiene rules), so I've added the above trait bound. If there's a way to implement the conversion as part of the macro, then implementing From/Into would no longer be necessary.

}
}

impl Into<u64> for GuestAddress {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@alexandruag
Copy link
Collaborator Author

Hi, I rebased & pushed two changes: using integer arithmetic + cast in GuestRegionMmap::check_ptr instead of ptr::offset (some more details in a comment there), and added an offset<U>(self, value: Addr::V) method to Aligned<Addr, T>, which is both an offset and a (potential) cast.

I also tried to use some trait + bounds constructs to have Rust "know" which alignments are ok to cast into which. Although it works, things get a bit too convoluted so I've kept this simple approach where both cast and offset return a Result, and one way to avoid having too many unwraps (which often can be optimized away because they can't trigger) is to use ? and a From<AlignmentError> implementation for the error type returned by the calling function. I'm going to try and use the atomic methods on top of some of the existing code in rust-vmm/PRs to get a better feel for things from a convenience perspective.

The "integer-atomics" feature can be replaced by using intrinsic `cfg`
annotations (such as detecting different platforms). Also removed the
implementation of `AtomicValued` for `AtomicBool` because we cannot
safely reinterpret locations in guest memory as `bool` (values other
than 0 and 1 may lead to undefined behaviour). Renamed `AtomicValued`
to `AtomicInteger`, which seems like a more accurate description.

Signed-off-by: Alexandru Agache <aagch@amazon.com>
Signed-off-by: Alexandru Agache <aagch@amazon.com>
@alexandruag
Copy link
Collaborator Author

Hi, I've added unit tests as well in a new commit, and removed [RFC] from the PR name. Looking forward to any (re) reviews :D

@alexandruag alexandruag changed the title [RFC] Add explicit atomic (and aligned) operations Add explicit atomic (and aligned) operations Sep 1, 2020
Signed-off-by: Alexandru Agache <aagch@amazon.com>
Signed-off-by: Alexandru Agache <aagch@amazon.com>
assert_eq!(
a.offset::<u64>(u32_offset).unwrap_err(),
AlignmentError::Misaligned
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding another case:

      c = a.offset::<u64>(u32_offset).unwrap();
      assert_eq!(c.addr, b.addr.unchecked_add(u32_offset));

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added!

@@ -193,6 +267,90 @@ pub trait Bytes<A> {
/// Part of the data may have been copied nevertheless.
fn read_slice(&self, buf: &mut [u8], addr: A) -> Result<(), Self::E>;

/// Returns a reference that allows atomic operations for `T` at the specified address.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitating whether the atomic related interfaces should be hosted by a dedicated trait.
Per my understanding, we have three types of memory access modes,

  1. read/write in byte stream mode.
  2. read/write in naturally aligned mode (word/double word/quad word).
  3. read/write/swap/compare_and_exchange in naturally aligned atomic mode with memory ordering.

Bytes::{read/write/read_slice/write_slice/read_from/write_to/read_exact_from/write_all_to} should access memory in byte stream mode.
But Bytes::{read_obj/write_obj} should access memory in naturally aligned mode.
So should we move read_obj/write_obj/atomic_ref/write_atomic/read_atomic out of the Bytes trait?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both the name and some of the doc comments (i.e. the one about Candidates which may implement this trait include:...) are a bit misleading for Bytes (in my mind I've always used something along the lines of RandomAccessMemory). Your overview of access types sounds reasonable, and what's forcing us right now to explicitly turn operations like read_obj into memcpys is cross region access (I'm working on a proposal to address that, and will post something later this week).

To answer your question, I'm not totally sure what the best place for these operations is in the long term; having them in Bytes seems like the best option for now. I think we still need a bit more time to figure out how the stable vm-memory interface looks like, and we should address some of the existing concerns first.

Signed-off-by: Alexandru Agache <aagch@amazon.com>
Signed-off-by: Alexandru Agache <aagch@amazon.com>
@alexandruag
Copy link
Collaborator Author

Closing for now in favor of #114. Thanks for the discussion so far everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants