RFC: Pointer metadata & VTable #2580

Open

@SimonSapin
Contributor

SimonSapin commented Oct 27, 2018

Add generic APIs that allow manipulating the metadata of fat pointers:

  • Naming the metadata’s type (as an associated type)
  • Extracting metadata from a pointer
  • Reconstructing a pointer from a data pointer and metadata
  • Representing vtables, the metadata for trait objects, as a type with some limited API

This RFC does not propose a mechanism for defining custom dynamically-sized types, but tries to stay compatible with future proposals that do.
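For readers who want something concrete, the three operations in the list above (naming the metadata type, extracting it, and rebuilding a pointer from parts) can already be approximated on stable Rust for slices, where the metadata is a `usize` length. This is only a minimal sketch: `split_slice` and `rebuild_slice` are hypothetical helper names, not part of the RFC's API.

```rust
use std::slice;

/// Splits a slice reference into its data pointer and its metadata (the length).
fn split_slice<T>(slice: &[T]) -> (*const T, usize) {
    (slice.as_ptr(), slice.len())
}

/// Rebuilds a slice reference from a data pointer and a length.
///
/// # Safety
/// `data` must point to `len` initialized, live elements of type `T`
/// that stay valid and unmodified for the lifetime `'a`.
unsafe fn rebuild_slice<'a, T>(data: *const T, len: usize) -> &'a [T] {
    unsafe { slice::from_raw_parts(data, len) }
}
```

The RFC generalizes this pair of operations to every pointee type, with `()` as the metadata of sized types and a vtable reference for trait objects.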

HTML view

@SimonSapin

Contributor

SimonSapin commented Oct 27, 2018

A previous iteration of this RFC is also visible at #2579. It was based on @Gankro’s proposal https://gist.github.com/Gankro/b053cb4d1cb3bcaec070de89734720f7

@mikeyhew

mikeyhew commented Oct 27, 2018

First of all, thanks for opening this RFC. It's the right way to fix the raw::TraitObject API, and is a big step toward custom DST.

My only criticism is that I think the Vtable type should be generic over the trait object type, as mentioned in the alternatives section. Having different Metadata types for each trait object type would help catch errors at compile time. Also, it seems like &'static would preclude us from implementing trait object types like dyn Trait1 + Trait2 using multiple vtable pointers, and I think it would be nice to avoid making that decision in this RFC.

I like the name Pointee, I think it's an improvement over Referent because it's more clear what it means.

@gnzlbg

Contributor

gnzlbg commented Oct 27, 2018

cc @ubsan

///
/// [dst]: https://doc.rust-lang.org/nomicon/exotic-sizes.html#dynamically-sized-types-dsts
#[lang = "pointee"]
pub trait Pointee {

@oli-obk

oli-obk Oct 27, 2018

Contributor

so... I'm assuming the compiler implements

default impl<T: ?Sized> Pointee for T {
    type Metadata = &'static Vtable;
}
impl<T: Sized> Pointee for T {
    type Metadata = ();
}
impl Pointee for str {
    type Metadata = usize;
}
impl<T: Sized> Pointee for [T] {
    type Metadata = usize;
}

Which means theoretically we could make Vtable generic over T allowing the drop_in_place method to take a raw pointer with the correct pointee type?
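The impls above overlap and would not compile without specialization, but the idea can be tried out on stable Rust with a user-defined trait and a few hand-written impls. In the sketch below, `PointeeSketch`, `metadata_size`, and `pointer_size` are hypothetical names standing in for the compiler-provided machinery; only a handful of types are covered.

```rust
use std::mem;

/// Hypothetical stand-in for the RFC's compiler-implemented `Pointee` trait.
trait PointeeSketch {
    type Metadata: Copy;
}

// Hand-written impls for a few types; the real trait would cover every type.
impl PointeeSketch for u32 {
    type Metadata = ();
}
impl PointeeSketch for str {
    type Metadata = usize;
}
impl<T> PointeeSketch for [T] {
    type Metadata = usize;
}

/// Size in bytes of the metadata part of a pointer to `T`.
fn metadata_size<T: ?Sized + PointeeSketch>() -> usize {
    mem::size_of::<T::Metadata>()
}

/// Size in bytes of a whole raw pointer to `T` (data pointer plus metadata).
fn pointer_size<T: ?Sized>() -> usize {
    mem::size_of::<*const T>()
}
```

On current compilers, `pointer_size::<[u8]>()` is expected to equal the size of a thin pointer plus `metadata_size::<[u8]>()`, which is the decomposition the RFC makes nameable.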

@SimonSapin

SimonSapin Oct 28, 2018

Contributor

These impls would be accurate in current Rust, but what I had in mind instead was that the compiler would automatically generate impls, similar to what it does for the std::marker::Unsize trait. As far as the standard library is concerned these impls would be "magic", not based on specialization.

Regardless, yes, making VTable generic with a type parameter for the trait object type is possible.

(Answer: they can use a different metadata type like `[&'static VTable; N]`.)

`VTable` could be made generic with a type parameter for the trait object type that it describes.
This would avoid forcing that the size, alignment, and destruction pointers

@oli-obk

oli-obk Oct 27, 2018

Contributor

How would that avoid forcing this? Can you elaborate?

@SimonSapin

SimonSapin Oct 28, 2018

Contributor

Without a type parameter, x.size() with x: &'static VTable necessarily executes the same code for any vtable. With a type parameter, x: &'static VTable<dyn Foo> and x: &'static VTable<dyn Bar> are different types and could execute different code. (For example, do table lookup with different offsets.) However, keeping the offset of size the same within all vtables might be desirable regardless of this API.
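To make the non-generic case concrete, here is a hand-rolled vtable sketch. The field layout of `VTableSketch` and the `HasVTable` helper trait are assumptions made for illustration; the real vtable layout is a compiler implementation detail. Because the struct has no type parameter, `size()` compiles to a single function that reads the same offset for every vtable.

```rust
use std::mem;
use std::ptr;

/// Hypothetical stand-in for the RFC's `VTable`; the layout is an assumption.
struct VTableSketch {
    size: usize,
    align: usize,
    drop_in_place: unsafe fn(*mut ()),
}

impl VTableSketch {
    /// One non-generic function: the same code runs for every vtable.
    fn size(&self) -> usize {
        self.size
    }

    fn align(&self) -> usize {
        self.align
    }

    /// # Safety
    /// `data` must point to a live value of the type this vtable describes.
    unsafe fn drop_in_place(&self, data: *mut ()) {
        unsafe { (self.drop_in_place)(data) }
    }
}

/// Type-erased destructor used to fill in the table.
unsafe fn drop_erased<T>(data: *mut ()) {
    unsafe { ptr::drop_in_place(data as *mut T) }
}

/// Helper trait that materializes one static table per concrete type.
trait HasVTable {
    const VTABLE: &'static VTableSketch;
}

impl<T> HasVTable for T {
    const VTABLE: &'static VTableSketch = &VTableSketch {
        size: mem::size_of::<T>(),
        align: mem::align_of::<T>(),
        drop_in_place: drop_erased::<T>,
    };
}

fn vtable_of<T>() -> &'static VTableSketch {
    <T as HasVTable>::VTABLE
}
```

With a type parameter on the table type, `size()` would instead be instantiated per trait object type, which is what would allow (but not require) different lookup code per trait.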

`VTable` could be made generic with a type parameter for the trait object type that it describes.
This would avoid forcing that the size, alignment, and destruction pointers
be in the same location (offset) for every vtable.
But keeping them in the same location is probably desirable anyway to keep code size

@scottmcm

scottmcm Oct 28, 2018

Member

Missing the end of a sentence?

type Metadata;
}
/// Pointers to types implementing this trait alias are

@scottmcm

scottmcm Oct 28, 2018

Member

Missing the end of a sentence?

@SimonSapin

Contributor

SimonSapin commented Oct 28, 2018

@mikeyhew I’m very open to adding a type parameter to VTable, I’ll just wait to get some more feedback on the RFC as-is before making significant changes.

As to supporting super-fat pointers with multiple vtable pointers, as mentioned in the alternatives section I believe this design doesn’t prevent it. Types that don’t exist yet and are added to the language in the future (possibly custom DSTs) can have a different metadata type. For dyn A + B that could be [&'static VTable; 2], for example. This proposal does however force dyn C with trait C: A + B {} to keep only having a single vtable pointer.
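The last point can be observed today: a trait object for a trait with two supertraits is still a two-word pointer (data pointer plus one vtable pointer), and methods of both supertraits are reachable through it. The traits and the `Unit` type below are made up for this sketch.

```rust
use std::mem;

trait A {
    fn a(&self) -> u32;
}
trait B {
    fn b(&self) -> u32;
}
/// A trait combining both supertraits, as in `trait C: A + B {}` above.
trait C: A + B {}

struct Unit;
impl A for Unit {
    fn a(&self) -> u32 {
        1
    }
}
impl B for Unit {
    fn b(&self) -> u32 {
        2
    }
}
impl C for Unit {}

/// Calls methods of both supertraits through a single `dyn C` vtable.
fn call_both(obj: &dyn C) -> u32 {
    obj.a() + obj.b()
}

/// Number of machine words in a `&dyn C`.
fn dyn_c_pointer_words() -> usize {
    mem::size_of::<&dyn C>() / mem::size_of::<usize>()
}
```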

pub unsafe fn drop_in_place(&self, data: *mut ()) { ... }
}
```

@Centril

Centril Oct 28, 2018

Contributor

No drawbacks section...?

@SimonSapin

SimonSapin Oct 28, 2018

Contributor

I came up short trying to think of a reason not to do this at all (as opposed to doing it differently). Suggestions welcome.

and (hopefully) more compatible with future custom DSTs proposals,
this RFC resolves the question of what happens
if trait objects with super-fat pointers with multiple vtable pointers are ever added.
(Answer: they can use a different metadata type like `[&'static VTable; N]`.)

@Centril

Centril Oct 28, 2018

Contributor

Should then [&'static VTable; 1] for dyn SomeTrait be used to make that transition smoother and to fit better with const generics?

@SimonSapin

SimonSapin Oct 28, 2018

Contributor

This would make some sense if we were definitely gonna have super-fat pointers with multiple separate vtable pointers as fat pointer metadata. But if we don’t and end up with a different solution to upcasting, we’ll end up with always-single-item arrays for no reason. This isn’t really the thread to get into that discussion, but my opinion is that super-fat pointers have a significant enough size cost that I’d much prefer a different solution.

Perhaps an alternative for this RFC, more neutral with respect to super-fat pointers vs. not, would be to have type Metadata = VTable<Self>; for trait objects. (See other comments about VTable’s possible type parameter.) With the pointer/reference indirection hidden away in private fields of the VTable type, this design would be compatible with having VTable<dyn A + B> contain two pointers in the future.

@SimonSapin

SimonSapin Oct 28, 2018

Contributor

Also, is there a use case for generic code that accepts any trait object with any number of vtable pointers but not other kinds of DSTs?

and (hopefully) more compatible with future custom DSTs proposals,
this RFC resolves the question of what happens
if trait objects with super-fat pointers with multiple vtable pointers are ever added.
(Answer: they can use a different metadata type like `[&'static VTable; N]`.)

@Centril

Centril Oct 28, 2018

Contributor

Are we doing the proposals in the right order? Shouldn't we focus on dealing with dyn A + B + C, upcasting, and such things first? Also, cc #2035.

@SimonSapin

SimonSapin Oct 28, 2018

Contributor

I believe the design proposed here is compatible enough with various options for multi-traits trait objects and upcasting such that there isn’t a strong dependency, and we don’t need to block this RFC on everything else being settled.


* The name of `Pointee`. [Internals thread #6663][6663] used `Referent`.

* The location of `VTable`. Is another module more appropriate than `std::ptr`?

@Centril

Centril Oct 28, 2018

Contributor

and should it be called Dictionary instead? ("type class dictionary")

@kennytm

kennytm Oct 28, 2018

Member

Big -1 to calling it Dictionary since this typically means a key-value map (example: C#, Swift, Python).

Furthermore, here in Rust the VTable is implemented as an array of function pointers, not a HashMap (unlike e.g. Python where it is really implemented as a dict), so calling it Dictionary obscures the alleged complexity.

@SimonSapin

SimonSapin Oct 28, 2018

Contributor

Even if the implementation happened to use HashMap, I’d prefer VTable since it’s more descriptive of the role of this type. (As opposed to: dictionary of what?) I believe that vtable is a well-enough established term of art.

kennytm and others added some commits Oct 28, 2018

Pointer metadata: add bounds on the associated type
Co-Authored-By: SimonSapin <simon.sapin@exyr.org>
Pointer metadata: typo fix
Co-Authored-By: SimonSapin <simon.sapin@exyr.org>
Pointer metadata: typo fix
Co-Authored-By: SimonSapin <simon.sapin@exyr.org>
Pointer metadata: typo fix
Co-Authored-By: SimonSapin <simon.sapin@exyr.org>
Pointer metadata: typo fix
Co-Authored-By: SimonSapin <simon.sapin@exyr.org>
Pointer metadata: grammar
Co-Authored-By: SimonSapin <simon.sapin@exyr.org>
Pointer metadata: grammar
Co-Authored-By: SimonSapin <simon.sapin@exyr.org>
Pointer metadata: grammar
Co-Authored-By: SimonSapin <simon.sapin@exyr.org>
@SimonSapin

Contributor

SimonSapin commented Oct 28, 2018

Regarding making VTable generic, if it looks like this:

struct VTable<Dyn> { … }

… then it could be used with any type as a parameter. What does VTable<u32> mean? If we want to restrict VTable’s parameter to only types where it makes sense (dyn SomeTrait, dyn SomeTrait + SomeAutoTrait, etc.), we’ll need a dedicated public trait:

struct VTable<Dyn> where Dyn: ?Sized + std::marker::DynTrait { … }

Do we want such a trait?

@SimonSapin

Contributor

SimonSapin commented Oct 28, 2018

Hmm, maybe we could get away with this?

struct VTable<Dyn> where Dyn: ?Sized + Pointee<Metadata=Self> { … }
@Gankro

Contributor

Gankro commented Oct 28, 2018

Is there any way to ensure minimal compiler time is wasted performing monomorphization on useless vtable type params? If not, I would rather the API just be less safe.

@mikeyhew

mikeyhew commented Oct 28, 2018

@Gankro the word "useless" sounds a little strong there... the purpose is to make sure you can't use a vtable from a *const dyn Trait to make a *const dyn SomeOtherTrait, at least not without doing something obviously hacky like transmute.

@Gankro

Contributor

Gankro commented Oct 28, 2018

You're supposing a situation where I somehow am writing code with two vtable types floating around, in which case I have two data pointers, and nothing can stop me from swapping the data pointers, producing the exact same effect.

@withoutboats

Contributor

withoutboats commented Nov 14, 2018

A few questions:

  • I think that the metadata type of trait objects in this RFC is &'static VTable, not VTable. Should VTable be an extern type so that users can't assume anything about its representation?
  • I'm not certain the bounds on Metadata are useful. When would you want to write code generic over pointees with different metadatas that manipulates the metadata?
@kennytm

Member

kennytm commented Nov 14, 2018

The point of the bounds is to provide forward compatibility with Custom DSTs. For sure you can't impl Pointee for X with this RFC alone, and with built-in types the list would be irrelevant. But once we get Custom DST the same list will be needed anyway, so I don't see any harm being more specific upfront.

@withoutboats

Contributor

withoutboats commented Nov 14, 2018

@kennytm why is the list needed? what relies on those guarantees?

@kennytm

Member

kennytm commented Nov 14, 2018

@withoutboats This is needed for Custom DSTs because *T should be equivalent to (*(), T::Metadata) (also note that all from_raw_parts functions in this RFC are safe).

For example, if Metadata does not require Copy, a Custom DST could make Vec<usize> a Metadata and then we can "copy" the Vec via:

// all safe!
let ptr = <*const CustomDst>::from_raw_parts(&(), vec![1,2,3]);
let ptr2 = ptr;
let ptr3 = ptr;
let vec2 = metadata(ptr);
let vec3 = metadata(ptr);

The whole list Sized + Copy + Send + Sync + Ord + Hash + Unpin + 'static is derived from various such impls on pointers and references.

We could restrict the impl<T: ?Sized> Copy for *const T to where T::Metadata: Copy, but this will break existing generic code. We could prevent Custom DSTs from having non-Copy metadata but that's equivalent to adding this list.

Again, the list is irrelevant if no one can impl Pointee, it is mostly for future compatibility with Custom DST.
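The existing guarantee that the `Copy` bound protects can be checked on stable Rust: a raw pointer to any `T: ?Sized` can be duplicated by generic code that knows nothing about the metadata. `duplicate` is a hypothetical helper written only for this illustration.

```rust
/// Compiles for every `T: ?Sized` because `*const T` is unconditionally `Copy`.
/// If metadata could be a non-`Copy` type such as `Vec<usize>`, this function
/// would silently duplicate that metadata as well.
fn duplicate<T: ?Sized>(ptr: *const T) -> (*const T, *const T) {
    (ptr, ptr)
}
```

Any generic code shaped like this is what would break if the `Copy` impl for pointers gained a `where T::Metadata: Copy` clause.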

(A copy of the relevant section from kennytm#2)

When manipulating generic DSTs, we often need to use its metadata type. This can be exposed via associated type if the DST implements some trait.

trait Pointee {
    type Metadata: Sized;
}

That is, every type &Dst will be represented as the tuple (&Something, Dst::Metadata) in memory.

We need to ensure all existing traits implemented for pointers and references (&T, *T, Box<T> etc) are not affected by custom DST. This imposes many restrictions on Metadata:

  • Copy: &T and *T are both Copy.

  • Send: &T is Send as long as T is Sync, without considering T::Metadata.

  • Sync: &T is Sync as long as T is Sync, without considering T::Metadata.

  • Ord: *T is ordered by the pointer value + metadata together. Demonstration:

    let a: &[u8] = &[1, 2, 3];
    let b: &[u8] = &a[..2];
    let a_ptr: *const [u8] = a;
    let b_ptr: *const [u8] = b;
    // the thin-pointer part are equal...
    assert_eq!(a_ptr as *const u8, b_ptr as *const u8);
    // but together with the metadata, they are different.
    assert!(a_ptr > b_ptr);
  • Hash: *T hashes the pointer value + metadata together.

  • Unpin: &T is always Unpin (Copy does not imply Unpin).

  • 'static: *T should outlive T, thus T::Metadata should outlive T. There is no 'self lifetime, nor does it make sense to complicate matters by making Metadata a GAT, thus the 'static bound.

Thus, the final constraint would be:

trait Pointee {
    type Metadata: Sized + Copy + Send + Sync + Ord + Hash + Unpin + 'static;
}

In principle Metadata should be further bound by UnwindSafe and RefUnwindSafe, but these traits are defined in libstd instead of libcore, so they cannot be included in the bound. One may need to explicitly opt-out of RefUnwindSafe according to the metadata type.

impl<T: ?Sized> !RefUnwindSafe for T {}
impl<T: ?Sized> RefUnwindSafe for T where T::Metadata: UnwindSafe {}

Fortunately, RefUnwindSafe is only excluded for UnsafeCell<T>, and UnwindSafe is only excluded for &mut T, both of which are handled by the Copy bound already.

The Metadata type should also be bound by Freeze (i.e. cell-free), but Freeze is a private trait, thus cannot be exposed to public. Again, Copy already eliminated the possibility of having cells.

@SimonSapin

Contributor

SimonSapin commented Nov 15, 2018

When would you want to write code generic over pointees with different metadatas that manipulates the metadata?

It’s rather niche, and likely we’ll end up with a small number of libraries that do that, but for example ThinBox and VecOfDst both could (almost #2580 (comment)) be generalized to work with any DST.

@withoutboats

Contributor

withoutboats commented Nov 15, 2018

@kennytm your comments make sense, thanks! and it is concerning that this represents a forward compat hazard with adding new autotraits like Unpin, something we'll have to keep in mind.

However, I don't think the metadata influences the implementations of Ord and Hash, do they? So why are those necessary? (I'm also surprised to learn that vtables have an ordering.)

EDIT: And I'm not so convinced that Unpin is necessary either (not that this should ever matter in practice). We haven't guaranteed that Pin pins the metadata of a wide pointer at all.

Basically, I think Copy + Send + Sync + 'static captures the real fact that we need to guarantee about metadata: it's POD with no more restrictions than an address has.

@kennytm

Member

kennytm commented Nov 15, 2018

@withoutboats For now, both the Hash and Ord impls for pointers compare their content as if it were opaque bytes (e.g. for Hash the implementation just transmutes to (usize, usize) and hashes the tuple). I don't think transmute works when Custom DST is added.

Since we need to keep these impls without adding new where bounds:

impl<T: ?Sized> Hash for *const T { ... }
impl<T: ?Sized> Ord for *const T { ... }

and we need to keep comparing the metadata to maintain the runtime behavior, we'll need to add the Ord + Hash constraints.
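As a stable-Rust illustration of the two unconditional impls, the helpers below hash and order raw pointers to any `T: ?Sized` without naming the metadata type. `hash_ptr` and `compare_ptr` are hypothetical names for this sketch; for two slice pointers with the same address and different lengths, the ordering is expected to be decided by the length, in line with the earlier `a_ptr > b_ptr` demonstration.

```rust
use std::cmp::Ordering;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes a possibly-fat raw pointer using the existing unconditional impl.
fn hash_ptr<T: ?Sized>(ptr: *const T) -> u64 {
    let mut hasher = DefaultHasher::new();
    ptr.hash(&mut hasher);
    hasher.finish()
}

/// Orders two possibly-fat raw pointers using the existing unconditional impl.
fn compare_ptr<T: ?Sized>(a: *const T, b: *const T) -> Ordering {
    a.cmp(&b)
}
```

Neither function could keep its signature if the pointer impls required `T::Metadata: Hash` or `T::Metadata: Ord` in a `where` clause, which is the reason for putting those bounds on the associated type instead.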


For Unpin — interesting. I believe Pin<&mut CustomDst> not capturing the metadata in the pin is irrelevant. The interesting case is

Pin<&mut &CustomDst>  // this is a _thin_ pointer to a fat pointer

The inner reference is always Unpin currently, meaning we could safely Pin::get_mut to obtain the &mut &CustomDst, and we could mem::swap it with a different &CustomDst. If CustomDst::Metadata: !Unpin, this would have violated the pin guarantee.

Fortunately there's no safe way to convert an arbitrary Pin<&mut T> to a Pin<&mut &CustomDst> (unlike Copy) so the damage is quite limited if Unpin is not on the list.
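The `Pin::get_mut` plus `mem::swap` sequence described above can be written out in safe code today, using `&[u8]` as a stand-in for `&CustomDst` (custom DSTs do not exist, so that substitution is an assumption of this sketch). It compiles precisely because shared references are always `Unpin`.

```rust
use std::mem;
use std::pin::Pin;

/// Replaces the fat reference behind a pin, metadata included, in safe code.
fn swap_behind_pin<'a>(pinned: Pin<&mut &'a [u8]>, replacement: &mut &'a [u8]) {
    // Allowed only because `&'a [u8]: Unpin`.
    let inner: &mut &'a [u8] = Pin::get_mut(pinned);
    mem::swap(inner, replacement);
}
```

After the call, both the data pointer and the length stored behind the pin have changed, which is the behavior a `!Unpin` metadata type would not be able to tolerate.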

@withoutboats

Contributor

withoutboats commented Nov 15, 2018

@kennytm I was thinking of reference types, not pointer types. The behavior of pointer types is interesting, maybe even unfortunate..

@bjorn3

bjorn3 commented Nov 16, 2018

Because of impl<T: ?Sized> Eq for *const T the metadata has to be Eq too.

@kennytm

Member

kennytm commented Nov 16, 2018

Ord already implies PartialEq + PartialOrd + Eq.

@mikeyhew

mikeyhew commented Nov 25, 2018

Would it be OK if we changed the Metadata type for dyn Trait to be this?

struct DynTraitMeta<T: ?Sized> {
    vtable: &'static Vtable,
    // not exactly sure what to put here, but this works for now. It must contain `T`.
    _marker: PhantomData<*const T>,
}

This type could be kept unstable, so that it could be changed in the future (potentially to &'static Vtable directly, if that's really what people want).

I'm uneasy about using &'static Vtable directly, not only because it means the type would be the same for all dyn Trait types (more on that later), but also because using &'static exposes an implementation detail that we don't need to expose — the fact that the metadata is a single, valid pointer to a type that we also need to expose. I think we can get away with keeping the metadata opaque, as long as we provide methods to get the size and alignment.
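A compilable version of that idea might look like the sketch below. `VtableSketch` and its two fields are placeholders (the real table is opaque), and the `Clone`/`Copy` impls are written by hand because `derive` would add unwanted `T: Copy` bounds. One thing worth noting as an observation about this particular sketch: `PhantomData<*const T>` makes the wrapper neither `Send` nor `Sync`, so a real design would need a different marker or explicit impls to satisfy the bounds discussed earlier in the thread.

```rust
use std::fmt::Debug;
use std::marker::PhantomData;
use std::mem;

/// Placeholder for the opaque vtable type; the fields are assumptions.
struct VtableSketch {
    size: usize,
    align: usize,
}

/// Opaque metadata that is a distinct type for each trait object type `T`.
struct DynTraitMeta<T: ?Sized> {
    vtable: &'static VtableSketch,
    _marker: PhantomData<*const T>,
}

// Manual impls: every field is `Copy` regardless of `T`.
impl<T: ?Sized> Clone for DynTraitMeta<T> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<T: ?Sized> Copy for DynTraitMeta<T> {}

impl<T: ?Sized> DynTraitMeta<T> {
    fn size(self) -> usize {
        self.vtable.size
    }
    fn align(self) -> usize {
        self.vtable.align
    }
}

/// A made-up table describing a `u32` value, for demonstration only.
static U32_TABLE: VtableSketch = VtableSketch { size: 4, align: 4 };

fn meta_for_debug() -> DynTraitMeta<dyn Debug> {
    DynTraitMeta { vtable: &U32_TABLE, _marker: PhantomData }
}

/// The wrapper adds no size on top of the single vtable reference.
fn meta_size() -> usize {
    mem::size_of::<DynTraitMeta<dyn Debug>>()
}
```

With this shape, passing a `DynTraitMeta<dyn Debug>` where a `DynTraitMeta<dyn Display>` is expected is a type error, while the representation stays a single pointer.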

@mikeyhew

mikeyhew commented Nov 25, 2018

Here is one more argument for having different metadata types for different dyn Trait types that I hope people will agree with.
If the type <dyn Trait as Pointee>::Metadata is different for each dyn Trait type, we can do both the following things:

  1. guarantee that the metadata for any *const dyn Trait is always valid, and therefore have safe dynamic dispatch even when *const dyn Trait is the type of self
  2. have <*const dyn Trait>::from_raw_parts be safe functions

If dyn Foo and dyn Bar have the same metadata type, then this falls apart, because it is possible to construct a *const dyn Foo with a vtable pointer to an implementation of Bar, which can result in undefined behaviour.

Now someone might say that "from_raw_parts" is an unsafe-sounding API that you would never expect to be safe. But I think that's just because of the name. It comes from [T]::from_raw_parts, which, by the way, could actually be a safe function if it returned *const [T] instead of &[T].
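For slices, the safe, raw-pointer-returning variant mentioned here exists in the standard library as `std::ptr::slice_from_raw_parts`. The sketch below builds a slice pointer from a null data pointer and an arbitrary length without any `unsafe`; only dereferencing the result would require it. `null_slice_ptr` is a hypothetical helper name.

```rust
use std::ptr;

/// Builds a `*const [u8]` from a null data pointer and any length, in safe code.
/// The pointer is not valid to dereference; constructing it is still fine.
fn null_slice_ptr(len: usize) -> *const [u8] {
    ptr::slice_from_raw_parts(ptr::null::<u8>(), len)
}
```

This works for slices because every `usize` is acceptable length metadata on a raw pointer; whether the same can hold for vtable metadata is the question under discussion.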

We have an opportunity to make something that sounds unsafe — messing around with pointer metadata and vtables — actually be a completely safe thing to do. I think that's pretty cool, and it's a great example of the kind of programming Rust lets you do.

And there are no downsides here. The only one that was suggested — potentially extra monomorphization because of an extra type parameter — doesn't hold any water. And even if it did, it seems like something that would come up so rarely that it wouldn't matter, or that could be fixed by future optimizations.

@rkruppe

Member

rkruppe commented Nov 25, 2018

guarantee that the metadata for any *const dyn Trait is always valid

Extending data invariants like this has repercussions beyond just this RFC. For example, it can make unsafe code invalid even if it doesn't interact with the abstractions introduced here, and if we take this step we should do due diligence to check that doesn't affect any unsafe code we want to allow. Not necessarily a problem, but it's extra work that needs to be motivated.

and therefore have safe dynamic dispatch even when *const dyn Trait is the type of self

I don't understand how anything substantial can be made safe this way. The data pointer part of a raw pointer can be invalid, so even if one defines a method with raw pointer receiver type and can call it that way, that method couldn't really do anything with the trait object (if it's safe). While there are some bits and pieces of information one might want to put into a trait that are solely about the type it's implemented on rather than a specific object,

  • there are other more direct ways to support this (e.g., allowing access to associated constants through trait objects)
  • if an actual trait object is available, there's no problem even today, so this only helps if no trait object is available (in which case one could use e.g. a *const dyn Trait with the data pointer being null)
  • in those cases, people would have to write code passing around unusual types like Vtable<dyn Trait> or *const dyn Trait and it's not clear to me how that is any better than existing workarounds that pass around an unusual type like &'static OneOffTableOfImplSpecificData

have <*const dyn Trait>::from_raw_parts be safe functions

We have an opportunity to make something that sounds unsafe — messing around with pointer metadata and vtables — actually be a completely safe thing to do. I think that's pretty cool, and it's a great example of the kind of programming Rust lets you do.

This function is trivial and all the uses I know of are embedded into a context where its use is crucial to other, far more subtle, unsafe code. For example, making this function safe does not help at all when converting the *const dyn Trait constructed with from_raw_parts into a &'a dyn Trait. Can you give an example of useful code that could be written using only this function (and other safe functions)?

If not, I do not see any reason why this would make anything safer. As a general principle, usage of nominally-safe functions deeply embedded in unsafe code is often critical for the soundness of the whole code, so the advantage of making more functionality safe lies in enabling entirely safe code, not in being able to move one important line out of the unsafe block.

And there are no downsides here. The only one that was suggested — potentially extra monomorphization because of an extra type parameter — doesn't hold any water. And even if it did, it seems like something that would come up so rarely that it wouldn't matter, or that could be fixed by future optimizations.

Sorry, just asserting that is not convincing. That type parameters lead to monomorphization today is a fact. That monomorphization generally causes binary size issues is also well established. That only a negligible amount of code would be redundantly monomorphized is not obvious to me -- there are entire families of data structures that could be built on this abstraction, not to mention all the generic utilities that may be used while working on those vtable pointers.

Appeals to future optimizations / "sufficiently smart compilers" also aren't great, as the extent and efficacy of these optimizations is still somewhat unknown (as they aren't implemented and we also don't yet know all the details of the code we'd want to optimize). Not to mention that even if the overhead would eventually become zero, it's still not great to have the overhead for the medium-term future.

@SimonSapin

Contributor

SimonSapin commented Nov 25, 2018

The question of whether metadata value in a raw pointer should always be valid is completely independent of whether the metadata type for dyn Trait of different traits should be the same type.

from_raw_parts cannot be safe either way. Even if the type signature of that function includes the trait of dyn Trait, it does not include the concrete type that implements the trait. The vtable for *const u32 as *const dyn Display is not interchangeable with the vtable for *const String as *const dyn Display.
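The difference between the two vtables is visible from safe code: the same `&dyn Display` type reports a different size and formats differently depending on which concrete type produced it. `describe` is a hypothetical helper for this sketch.

```rust
use std::fmt::Display;
use std::mem;

/// Reads two facts that come from the vtable behind a `&dyn Display`:
/// the formatted output (via the `fmt` entry) and the size of the concrete value.
fn describe(value: &dyn Display) -> (String, usize) {
    (value.to_string(), mem::size_of_val(value))
}
```

Pairing the `String` vtable with a data pointer to a `u32` would make both of these answers wrong for the pointed-to value, even though the trait in the type signature matches.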

@mikeyhew

mikeyhew commented Nov 26, 2018

@SimonSapin

The vtable for *const u32 as *const dyn Display is not interchangeable with the vtable for *const String as *const dyn Display

Actually, it is. (Sort of.)

If a *const dyn Trait has the vtable for <String as Trait> when it actually points to a u32, the vtable is still valid. It is the data pointer that is not valid, and that's the distinction I'm trying to make here.

@mikeyhew

mikeyhew commented Nov 26, 2018

@rkruppe

Extending data invariants like this has repercussions beyond just this RFC. For example, it can make unsafe code invalid even if it doesn't interact with the abstractions introduced here, and if we take this step we should do due diligence to check that doesn't affect any unsafe code we want to allow. Not necessarily a problem, but it's extra work that needs to be motivated.

Fair point. I'm with you on that.

And there are no downsides here. The only one that was suggested — potentially extra monomorphization because of an extra type parameter — doesn't hold any water. And even if it did, it seems like something that would come up so rarely that it wouldn't matter, or that could be fixed by future optimizations.

Sorry, just asserting that is not convincing. That type parameters lead to monomorphization today is a fact. That monomorphization generally causes binary size issues is also well established. That only a negligible amount of code would be redundantly monomorphized is not obvious to me -- there are entire families of data structures that could be built on this abstraction, not to mention all the generic utilities that may be used while working on those vtable pointers.

OK. Well I'm not sure what it will take to convince you, but I'll try.

Like I said earlier in this thread, all those data structures and generic utilities will already be generic on the pointee type. Or they might be specific to one trait object type. This one extra struct having the same type parameter that everything else does won't result in any extra functions being monomorphized in either of those cases, because either the code would be monomorphized anyway (in the case of code that is generic over the pointee type), or it would not be monomorphized even with the type parameter (in the case of code that is specific to one dyn Trait type).

If you're still not convinced, give me an example of code that you would want to write that you think would not be monomorphized if the metadata type didn't have the type parameter, and would be monomorphized if it did.

@rkruppe

Member

rkruppe commented Nov 27, 2018

give me an example of code that you would want to write that you think would not be monomorphized if the metadata type didn't have the type parameter, and would be monomorphized if it did.

Even the example sketched in the RFC, ThinBox, has instances of this. While the ThinBox type itself is generic over the trait object type, it stores a NonNull<WithVtable<()>> and the trait object type is just stashed in a PhantomData. That means some of the code ThinBox uses can be monomorphic (other parts still need to know the trait object type to do unsizing/drop_in_place/etc. though some of that could be made obsolete with further RFCs).

Specifically, let's look at fn data_ptr, which I pick mostly because it's already in the RFC (I expect that a full implementation of ThinBox will have more instances -- to say nothing of other, more complex data structures like a heterogeneous list of trait objects). While data_ptr is polymorphic, that's only due to taking ThinBox<dyn Foo> and being nested in its impl, its body only needs to operate on a *const () based on the size and align from the vtable and so can be extracted into a monomorphic function. That leaves the polymorphic code as tiny as extracting self.ptr: NonNull<WithVtable<()>> (which will be a nop or just a move in the binary, and also a nop in LLVM IR once a certain refactoring finally finishes) and passing it to the monomorphic code.

If the type of the vtable is parametrized by the trait object, one can still extract the arithmetic into monomorphic code, but it's less natural (need to extract size+align and pass them separately) and the polymorphic code contains some more instructions which are then replicated for all trait object types.
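The split described in the last two paragraphs can be sketched roughly as follows. This is not the RFC's `ThinBox` code: `ThinHandle`, its fields, and the helper names are hypothetical, and the layout (a header followed by the value at the next suitably aligned offset) is an assumption of the sketch.

```rust
use std::marker::PhantomData;

/// Rounds `offset` up to the next multiple of `align` (a power of two).
fn round_up(offset: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Monomorphic part: compiled once, whatever the trait object type is.
fn data_ptr_erased(allocation: *const u8, header_size: usize, data_align: usize) -> *const u8 {
    allocation.wrapping_add(round_up(header_size, data_align))
}

/// Hypothetical handle that only remembers the trait object type as a marker.
struct ThinHandle<Dyn: ?Sized> {
    allocation: *const u8,
    header_size: usize,
    data_align: usize,
    _marker: PhantomData<Dyn>,
}

impl<Dyn: ?Sized> ThinHandle<Dyn> {
    /// Polymorphic part: instantiated per `Dyn`, but it only forwards fields.
    fn data_ptr(&self) -> *const u8 {
        data_ptr_erased(self.allocation, self.header_size, self.data_align)
    }
}
```

With a vtable type parametrized by `Dyn`, the size and alignment would have to be read in the per-`Dyn` wrapper before calling the shared helper, which is the extra replicated code the comment refers to.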

@mikeyhew

mikeyhew commented Nov 30, 2018

@rkruppe thanks for replying with an example. I'm super busy right now but I'll try to read through it and get back to you next week

@Ericson2314

Contributor

Ericson2314 commented Dec 26, 2018

Great RFC @SimonSapin. I really like the idea of starting with the topmost supertrait (Pointee) and then working "down" the trait hierarchy towards custom DSTs.
