Support generic pointer framework and generic pointer casts #1183

Open
11 tasks
joshlf opened this issue May 3, 2024 · 0 comments

joshlf commented May 3, 2024

Overview

This design is prototyped in #1169, #1209, #1345, and #1355.

Note that this design will likely require GATs (generic associated types), which aren't stable on our 0.8 MSRV (1.56); GATs were stabilized in Rust 1.65.

The core of the design is a generic trait Pointer<'a, T> with an associated type, type Pointer<'b, U>, which represents the same pointer type applied to a different referent type. E.g., impl<'a, T> Pointer<'a, T> for &'a T { type Pointer<'b, U> = &'b U; }.
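
To make the shape of that associated type concrete, here is a minimal sketch. It is a simplified, safe stand-in rather than the trait prototyped in #1345 (which is `unsafe` and carries invariants); the `?Sized` and lifetime bounds and the `Box` impl are assumptions added for illustration. It needs a toolchain with GATs (Rust 1.65+).

```rust
/// Simplified, hypothetical stand-in for the proposed `Pointer` trait.
/// `Self` is "some pointer to `T`"; the GAT names "the same kind of pointer,
/// but to `U`".
pub trait Pointer<'a, T: ?Sized + 'a> {
    type Pointer<'b, U: ?Sized + 'b>;
}

// Shared references map to shared references.
impl<'a, T: ?Sized + 'a> Pointer<'a, T> for &'a T {
    type Pointer<'b, U: ?Sized + 'b> = &'b U;
}

// Exclusive references map to exclusive references.
impl<'a, T: ?Sized + 'a> Pointer<'a, T> for &'a mut T {
    type Pointer<'b, U: ?Sized + 'b> = &'b mut U;
}

// Owning pointers ignore the lifetime parameter entirely.
impl<'a, T: ?Sized + 'a> Pointer<'a, T> for Box<T> {
    type Pointer<'b, U: ?Sized + 'b> = Box<U>;
}
```

With these impls, `<&[u8] as Pointer<'_, [u8]>>::Pointer<'_, u32>` is just `&u32`, and the `Box<[u8]>` equivalent is `Box<u32>`.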

This permits conversion code to be generic over pointer type. For example, instead of needing FromBytes::ref_from_bytes, mut_from_bytes, and a hypothetical future box_from_bytes (and similar for other container types - see #114), we could write a single FromBytes::from_bytes that would be generic over pointer type, unifying the source and destination pointer types (e.g., &[u8] -> &T, &mut [u8] -> &mut T, etc).
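
A rough sketch of what "one entry point, many pointer types" could look like. To stay in safe code it fixes the destination type to `[u8; 4]` and leans on the standard library's `TryFrom` impls for arrays; the real `from_bytes` would be generic over `T: FromBytes`. The names `BytePointer`, `Quad`, `cast_quad`, and `quad_from_bytes` are hypothetical and not part of zerocopy.

```rust
use std::convert::TryFrom;

/// Hypothetical: a pointer to `[u8]` that can be converted into the same
/// kind of pointer to `[u8; 4]`.
pub trait BytePointer: Sized {
    /// The same pointer kind, applied to `[u8; 4]`.
    type Quad;
    /// Returns `None` if the slice is not exactly 4 bytes long.
    fn cast_quad(self) -> Option<Self::Quad>;
}

impl<'a> BytePointer for &'a [u8] {
    type Quad = &'a [u8; 4];
    fn cast_quad(self) -> Option<Self::Quad> {
        <&[u8; 4]>::try_from(self).ok()
    }
}

impl<'a> BytePointer for &'a mut [u8] {
    type Quad = &'a mut [u8; 4];
    fn cast_quad(self) -> Option<Self::Quad> {
        <&mut [u8; 4]>::try_from(self).ok()
    }
}

impl BytePointer for Box<[u8]> {
    type Quad = Box<[u8; 4]>;
    fn cast_quad(self) -> Option<Self::Quad> {
        Box::<[u8; 4]>::try_from(self).ok()
    }
}

/// Single generic entry point replacing the `ref_`/`mut_`/`box_` variants.
pub fn quad_from_bytes<P: BytePointer>(bytes: P) -> Option<P::Quad> {
    bytes.cast_quad()
}
```

The caller's pointer type picks the return type: passing `&[u8]` yields `Option<&[u8; 4]>`, passing `Box<[u8]>` yields `Option<Box<[u8; 4]>>`.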

Progress

Note that the first steps do not commit us to any public API, and so can be done without concern for semver.

  • Audit #114 for other requirements
  • Should there be Deref/DerefMut bounds in here somewhere?
  • Implement FromPtr/IntoPtr internally (these don't require GAT, and so work on our MSRV); use to deduplicate internal helper code such as ref_from_prefix_suffix and mut_from_prefix_suffix
  • Figure out what requirements we should add regarding the use of Pointer in Ref's API
  • Figure out how to express the difference between Pointer types that can be split (required for FromBytes::from_prefix) and those that cannot (e.g. Box)
  • Figure out how to support FromBytes::read_from_bytes and FromBytes::read_from_prefix
  • Implement Pointer trait internally (either #[doc(hidden)] or pub-in-priv)
    • If this happens during 0.8, since Pointer requires GAT, this will need to be version-gated as prototyped in #1345
  • Confirm that our internal implementation conforms to all requirements listed below
  • Refactor some zerocopy internals to use this machinery
  • Spend time letting the design bake internally while we get used to it and learn about how best to use it, work out kinks, etc
  • Release as part of our public API
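
One possible answer to the "can be split" item above is a separate capability trait that only splittable pointer types implement. This is a sketch under that assumption, not something taken from the prototypes; `SplitBytes`, `split_bytes`, and `take_prefix` are hypothetical names.

```rust
/// Hypothetical capability trait: pointer types to `[u8]` that can be split
/// into two pointers of the same kind.
pub trait SplitBytes: Sized {
    /// Splits at `mid`, returning `None` if `mid` is out of bounds.
    fn split_bytes(self, mid: usize) -> Option<(Self, Self)>;
}

impl<'a> SplitBytes for &'a [u8] {
    fn split_bytes(self, mid: usize) -> Option<(Self, Self)> {
        if mid <= self.len() {
            Some(self.split_at(mid))
        } else {
            None
        }
    }
}

impl<'a> SplitBytes for &'a mut [u8] {
    fn split_bytes(self, mid: usize) -> Option<(Self, Self)> {
        if mid <= self.len() {
            Some(self.split_at_mut(mid))
        } else {
            None
        }
    }
}

// Deliberately no impl for `Box<[u8]>`: one allocation cannot be divided
// into two independently owned boxes.

/// A `from_prefix`-style helper only needs the `SplitBytes` bound.
pub fn take_prefix<P: SplitBytes>(bytes: P, len: usize) -> Option<(P, P)> {
    bytes.split_bytes(len)
}
```

Calling `take_prefix` with a `Box<[u8]>` then fails to compile, which is the distinction the bullet asks for.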

Requirements

  • FromZeros::new_pointer_zeroed<P: Pointer<Self>>() -> P
  • FromBytes::from_bytes<P: Pointer<[u8]>>(bytes: P) -> Result<P::Pointer<Self>, ...>
  • FromBytes::from_prefix<P: Pointer<[u8]>>(bytes: P) -> Result<(P::Pointer<Self>, P), ...>
    • Not valid for pointer types which can't be split such as P = Box<[u8]>
  • FromBytes::read_from_bytes<P: Pointer<[u8]>>(bytes: P) -> Result<Self, ...>
    • Not valid for pointer types which convey ownership such as P = Box<[u8]>; might be able to support P = Arc<[u8]>
  • IntoBytes::as_bytes<P: Pointer<Self>>(self: P) -> P::Pointer<[u8]>
  • transmute_ptr!/try_transmute_ptr!
  • [nice-to-have] Concise P: Pointer bound which is akin to a kind (e.g., representing Box rather than Box<T>)
    • Would allow us to write, e.g., UdpPacket<P: Pointer>
    • Might be doable just using default parameters, e.g. trait Pointer<T = ()> so that P: Pointer is sugar for P: Pointer<()>
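
The default-parameter idea in the last bullet can be sketched as follows. Lifetimes are dropped and only owning pointers are covered to keep the example short; the `bytes` field on `UdpPacket` is an assumption made for illustration.

```rust
use std::rc::Rc;

/// Hypothetical: with `T` defaulting to `()`, the bound `P: Pointer` is sugar
/// for `P: Pointer<()>`.
pub trait Pointer<T: ?Sized = ()> {
    /// The same pointer kind applied to `U`.
    type Pointer<U: ?Sized>;
}

impl<T: ?Sized> Pointer<T> for Box<T> {
    type Pointer<U: ?Sized> = Box<U>;
}

impl<T: ?Sized> Pointer<T> for Rc<T> {
    type Pointer<U: ?Sized> = Rc<U>;
}

/// `P` only selects the pointer kind; the referent type is chosen here.
pub struct UdpPacket<P: Pointer> {
    pub bytes: P::Pointer<[u8]>,
}
```

Callers still have to write `UdpPacket<Box<()>>` rather than `UdpPacket<Box>`, so this only approximates a kind-level parameter.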

Design

As prototyped most recently in #1345, the design has this shape. Note that this design has gaps - for example, it does not distinguish between Pointer types which can be split (e.g. to support FromBytes::from_prefix) and those which can't (such as Box).

// TODO: What's the relationship between `Pointer`, `FromPtr`, and `IntoPtr`? Should any be a sub/super-trait of another?
pub unsafe trait Pointer<'a, T: ?Sized> {
    type Pointer<U: 'a + ?Sized>: Pointer<'a, U, Aliasing = Self::Aliasing, Source = Self::Source>;
}

pub unsafe trait IntoPtr<'a, T> {
    type Aliasing: invariant::Aliasing;
    type Alignment: invariant::Alignment;
    type Validity: invariant::Validity;
    type Source: invariant::Source; // New invariant introduced to support this trait

    /// # Safety
    ///
    /// It is guaranteed to be sound to call `Ptr::new(IntoPtr::into_raw(...))`,
    /// producing a `Ptr<'a, T, (Self::Aliasing, invariant::Aligned,
    /// invariant::Valid, Self::Source)>`. In particular, all of the safety
    /// preconditions of `Ptr::new` are satisfied.
    fn into_raw(s: Self) -> NonNull<T>;

    fn into_ptr(
        ptr: Self,
    ) -> Ptr<'a, T, (Self::Aliasing, invariant::Aligned, invariant::Valid, Self::Source)>
    where
        Self: Sized,
    {
        let ptr = Self::into_raw(ptr);
        // SAFETY: `Self::into_raw` promises to uphold the safety preconditions
        // of `Ptr::new`.
        unsafe { Ptr::new(ptr) }
    }
}

pub unsafe trait FromPtr<'a, T, I: invariant::Invariants> {
    fn from_ptr(ptr: Ptr<'a, T, I>) -> Self;
}

This trait (or one with a similar design) would permit us to abstract over the type of pointer used to reference a particular type. It could support shared and exclusive references, but also smart pointer types such as Box, Rc, Arc, and possibly even Vec.

This has a number of advantages:

  • It allows us to remove much API duplication, where currently we have both shared and exclusive versions of many methods
  • It allows generic methods to naturally support non-reference pointer types (Box et al)
  • It allows us to possibly do the same via our transmute_ref! family of macros, or at least to introduce a new macro that permits "container transmutation" (#114)
  • It may nearly or entirely replace the job currently done by Ref

Ptr invariants

Pointer has associated types for an Aliasing and Source invariant. When converting from a p: P (where P: Pointer) to a Ptr, the resulting Ptr has the invariants (P::Aliasing, Aligned, Valid, P::Source).

The aliasing invariant must be encoded because different pointer types have different aliasing modes - e.g., shared references and Arcs conform to the Shared aliasing invariant, while mutable references and Boxes conform to the Exclusive aliasing invariant.
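
As a small illustration of that mapping, the sketch below attaches an aliasing marker to each pointer type. The `aliasing` module and `HasAliasing` trait are stand-ins invented for this example; in the design above the associated type lives on `IntoPtr` and is bounded by `invariant::Aliasing`.

```rust
use std::sync::Arc;

/// Stand-in marker types for the aliasing invariants (hypothetical).
pub mod aliasing {
    pub struct Shared;
    pub struct Exclusive;
}

/// Hypothetical trait carrying just the aliasing mode of a pointer type.
pub trait HasAliasing {
    type Aliasing;
}

impl<'a, T: ?Sized> HasAliasing for &'a T {
    type Aliasing = aliasing::Shared;
}
impl<T: ?Sized> HasAliasing for Arc<T> {
    type Aliasing = aliasing::Shared;
}
impl<'a, T: ?Sized> HasAliasing for &'a mut T {
    type Aliasing = aliasing::Exclusive;
}
impl<T: ?Sized> HasAliasing for Box<T> {
    type Aliasing = aliasing::Exclusive;
}

/// Accepts any pointer type whose aliasing mode is `Shared`.
pub fn requires_shared<P: HasAliasing<Aliasing = aliasing::Shared>>(_p: &P) -> &'static str {
    "shared"
}

/// Accepts any pointer type whose aliasing mode is `Exclusive`.
pub fn requires_exclusive<P: HasAliasing<Aliasing = aliasing::Exclusive>>(_p: &P) -> &'static str {
    "exclusive"
}
```

Generic code can then bound on the aliasing mode alone, without knowing whether it was handed a reference or a smart pointer.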

This design also introduces the new Source invariant on all Ptrs. The Source encodes the pointer type that was used to generate a Ptr, and is used as a bound to prove that it is sound to convert from a Ptr back into its original Pointer type at some point in the future. It would not be sound, for example, to perform the following conversion: &mut T -> Ptr<T> -> Box<T>.

The way this bound works in practice is that it bounds implementations of the FromPtr trait. For example, here is the implementation of FromPtr for Box from #1345 (edited for brevity):

unsafe impl<'a, T, I> FromPtr<'a, T, I> for Box<T>
where
    I: invariant::Invariants<
        ...
        Source = invariant::Box,
    >,
{ ... }

Note: Previous designs (#1169 and #1209) did not have a separate Source invariant, and instead added new Aliasing invariants for each Pointer type (Aliasing = Box, Aliasing = Arc, etc). This has the disadvantage that all code which operates on a Ptr must understand whether the given Pointer type has shared- or exclusive-aliased semantics. By encoding the Source as a separate invariant, we avoid this problem.
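
To illustrate how a Source marker rules out the unsound &mut T -> Ptr<T> -> Box<T> round trip at compile time, here is a heavily simplified sketch. It is not the zerocopy `Ptr`: lifetimes and the other invariants are omitted, and `FromBox`, `FromMut`, `from_box`, and `from_mut` are hypothetical names.

```rust
pub mod sketch {
    use std::marker::PhantomData;
    use std::ptr::NonNull;

    /// Source markers (hypothetical): record which pointer type produced a `Ptr`.
    pub struct FromBox;
    pub struct FromMut;

    /// Simplified `Ptr` carrying only a source marker `S`.
    pub struct Ptr<T, S> {
        raw: NonNull<T>,
        _source: PhantomData<S>,
    }

    /// `Box<T>` -> `Ptr<T, FromBox>`; ownership moves into the `Ptr`.
    pub fn from_box<T>(b: Box<T>) -> Ptr<T, FromBox> {
        let raw = NonNull::new(Box::into_raw(b)).expect("`Box::into_raw` never returns null");
        Ptr { raw, _source: PhantomData }
    }

    /// `&mut T` -> `Ptr<T, FromMut>` (lifetime dropped for brevity).
    pub fn from_mut<T>(r: &mut T) -> Ptr<T, FromMut> {
        Ptr { raw: NonNull::from(r), _source: PhantomData }
    }

    pub trait FromPtr<T, S> {
        fn from_ptr(ptr: Ptr<T, S>) -> Self;
    }

    // Only a `Ptr` whose source is `FromBox` can be turned back into a `Box`.
    impl<T> FromPtr<T, FromBox> for Box<T> {
        fn from_ptr(ptr: Ptr<T, FromBox>) -> Self {
            // SAFETY: `Ptr`'s fields are private to this module, so a
            // `Ptr<T, FromBox>` can only come from `from_box`, where `raw` was
            // produced by `Box::into_raw`. `Ptr` is neither `Clone` nor `Copy`,
            // so the allocation is reclaimed at most once.
            unsafe { Box::from_raw(ptr.raw.as_ptr()) }
        }
    }
}
```

Passing a `Ptr<T, FromMut>` to `Box::from_ptr` is rejected by the type checker because no `FromPtr<T, FromMut>` impl exists for `Box<T>`.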

Vec

While supporting Vec directly may be hard, we can at least use the fact that the standard library supports conversion between Vec<T> and Box<[T]> to build on top of our Box support.

If we want to support Vec directly, the Vec -> Ptr direction (Pointer) will be harder than the Ptr -> Vec (FromPtr) direction. For Ptr -> Vec, we can use Vec::from_raw_parts and pass the same value for both the length and the capacity. However, going from Vec -> Ptr, we may be given a Vec whose length and capacity are not the same. Ptr does not currently have the ability to model such an allocation. The way that Vec -> Box (Vec::into_boxed_slice) handles this is by shrinking the underlying allocation to fit the length. This may be an expensive operation on some allocators.
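
Both directions can be sketched with standard library calls only. The helper names `vec_into_box` and `box_into_vec` are hypothetical, and `Box<[T]>` stands in for `Ptr` here.

```rust
/// `Vec<T>` -> `Box<[T]>`. Any excess capacity is dropped, which may
/// reallocate.
pub fn vec_into_box<T>(v: Vec<T>) -> Box<[T]> {
    v.into_boxed_slice()
}

/// `Box<[T]>` -> `Vec<T>`, passing the same value for length and capacity.
pub fn box_into_vec<T>(b: Box<[T]>) -> Vec<T> {
    let len = b.len();
    let raw = Box::into_raw(b) as *mut T;
    // SAFETY: `raw` came from a `Box<[T]>`, so it was allocated by the global
    // allocator with the layout of `len` contiguous `T`s, all of them
    // initialized. Ownership is transferred to the `Vec` exactly once.
    unsafe { Vec::from_raw_parts(raw, len, len) }
}
```

The safe equivalent of `box_into_vec` is the slice method `into_vec`; the raw version is shown only because it mirrors the Vec::from_raw_parts route described above.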
