- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)
Summary
Solving the Servo DOM design requirements (and beyond), while integrating smoothly with the already existing trait mechanism.
This RFC provides:
- Struct Polymorphism
- Trait Polymorphism
- Runtime Type Information (to handle safe down-cast)
- Thin Polymorphic Pointer/References Support
Note: an experimental repository was set up at https://github.com/matthieu-m/rust-poly. It is incomplete because of missing support in the compiler, but it gives an idea of what this RFC aims for. The curious may find the implementation of functions/methods there.
Motivation
Rust currently supports polymorphism through its traits; however, the experience in Servo has raised a number of requirements which are not fulfilled.
A summary of those requirements is given here:
- cheap field access from internal methods;
- cheap dynamic dispatch of methods;
- cheap down-casting;
- thin pointers;
- sharing of fields and methods between definitions;
- safe, i.e., doesn't require a bunch of transmutes or other unsafe code to be usable;
- syntactically lightweight or implicit upcasting;
- calling functions through smartpointers, e.g. fn foo(JSRef, ...);
- static dispatch of methods.
There have already been a number of proposals (see Summary of Efficient Inheritance RFCs).
This RFC is similar in nature to Fat Objects: it tries not to focus on building independent bricks, and instead focuses on maximizing integration with the existing code and on avoiding a split of the Rust landscape into two incompatible run-time polymorphism paradigms, which would hurt re-usability. It is also similar in nature to Trait based inheritance.
Indeed, this RFC designs two disjoint polymorphism paths (one for data, one for interfaces) and emphasizes a clean separation of concerns between payload (struct), behaviour (trait) and usage (Box<Trait> or SomeThinPointer<Trait>).
It manages to do so with (relatively) minimal compiler support, thus opening the door to other library schemes.
Detailed Design
This RFC is rather long, as there is a lot to cover. It is presented in two major sections:
- the Language & Compiler section, introducing the necessary language changes and accompanying compiler changes;
- the Library section, introducing a number of "new" structs, as well as the adaptations that should be performed in the existing Rust core and std modules for better integration.
Note: The lack of support for Custom DST greatly affects the Library section and leads to a number of Unresolved Questions.
Note: As usual, all names are subject to discussion.
Detailed Design [Language & Compiler]
This section of the RFC presents the changes in the language and compiler that will provide the necessary building bricks for developers to implement polymorphism. Those bricks will be used in the Library section afterwards.
In keeping with Rust tradition, this RFC preserves the orthogonality of defining data structures (in struct) and defining interfaces (in trait). By doing so, it maximizes the opportunity to mix and match "object-like" polymorphism and trait polymorphism.
rtti module
A new core::rtti module is introduced to collect all run-time information; the various sections of this RFC will add their items in this module.
As part of the std facade, this module is re-exported as std::rtti.
Cast traits
The following traits rtti::{UpCast,UpCastRef,DownCast,DownCastRef,Cast} are introduced:
pub trait UpCast<Target> {
fn up_cast(self) -> Target;
}
pub trait UpCastRef<Target> {
fn up_cast_ref(&self) -> &Target;
fn up_cast_ref_mut(&mut self) -> &mut Target;
}
pub trait DownCast<Target> {
fn down_cast(self) -> Result<Target, Self>;
unsafe fn unchecked_down_cast(self) -> Target;
}
pub trait DownCastRef<Target> {
fn down_cast_ref(&self) -> Option<&Target>;
fn down_cast_ref_mut(&mut self) -> Option<&mut Target>;
unsafe fn unchecked_down_cast_ref(&self) -> &Target;
unsafe fn unchecked_down_cast_ref_mut(&mut self) -> &mut Target;
}
pub trait Cast<Target> {
fn cast(self) -> Result<Target, Self>;
unsafe fn unchecked_cast(self) -> Target;
}
While the [Down|Up]Cast[|Ref] traits are intended to cast between related traits/structs, the Cast trait is intended for general purpose casts, such as cross-casts.
Alternative: It is unclear whether the *CastRef variants are that useful; this is discussed in the Alternatives section.
Unresolved Question: Could DownCastRef and UpCastRef be supplanted by implementing the regular DownCast/UpCast on references instead?
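As a rough illustration of how these traits are meant to behave, here is a minimal sketch that compiles on today's Rust. The types Base, Derived and AnyNode are hypothetical and exist only for this example, the enum stands in for the run-time type information introduced later, and the `Sized` bound on DownCast is an assumption added so that `Result<Target, Self>` is accepted by the current compiler.

```rust
// Traits modelled on the RFC text (only the safe methods are kept here).
pub trait UpCastRef<Target> {
    fn up_cast_ref(&self) -> &Target;
    fn up_cast_ref_mut(&mut self) -> &mut Target;
}

// `Sized` is an assumption of this sketch: returning `Self` by value needs it.
pub trait DownCast<Target>: Sized {
    fn down_cast(self) -> Result<Target, Self>;
}

// Hypothetical types, for illustration only.
#[derive(Debug, PartialEq)]
pub struct Base {
    pub id: u32,
}

#[derive(Debug, PartialEq)]
pub struct Derived {
    pub base: Base,
    pub name: &'static str,
}

// Up-casting always succeeds: hand out a view of the embedded `Base`.
impl UpCastRef<Base> for Derived {
    fn up_cast_ref(&self) -> &Base {
        &self.base
    }
    fn up_cast_ref_mut(&mut self) -> &mut Base {
        &mut self.base
    }
}

// A closed set of shapes stands in for run-time type information.
#[derive(Debug, PartialEq)]
pub enum AnyNode {
    Plain(Base),
    Rich(Derived),
}

// Down-casting may fail; on failure the original value is handed back.
impl DownCast<Derived> for AnyNode {
    fn down_cast(self) -> Result<Derived, Self> {
        match self {
            AnyNode::Rich(d) => Ok(d),
            other => Err(other),
        }
    }
}
```

Returning the original value in the Err case is what lets a caller chain several down-cast attempts without losing ownership of the input.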
Struct Polymorphism
Struct Polymorphism aims at allowing a partial view of the fields inside a larger struct.
Syntax
Grammar:
struct ChildName: ParentName (+ ParentName)* {
    // other attributes
}
Example:
struct FirstParent;
struct SecondParent {
    a: isize
}
impl SecondParent {
    fn increment(&mut self) { self.a += 1; }
}
struct Child: FirstParent + SecondParent {
    // other attributes
}
fn usage(child: &mut Child) {
    SecondParent::increment(child as &mut SecondParent);
    (child as &mut SecondParent).increment();
    println!("{}", (child as &SecondParent).a);
}
This RFC proposes to reuse the trait notation for extension, and use a similar behaviour for both derivations.
ParentName (+ ParentName)* is called the list of parents of the structure:
- it is an error for any ParentName to resolve to ChildName
- it is an error for the same struct to appear multiple times in the list of parents
- any struct appearing in the list of parents of ChildName is called a direct parent of ChildName
Alternative: the use of extension (instead of attributes) is discussed in the Alternatives section.
Alternative: the ability to extend from multiple bases, which introduces the parent ambiguity problem, is discussed in the Alternatives section.
Semantics
By extending another struct, the derived struct embeds its parents' fields. However, due to encapsulation, it can only access fields that it could access if the parent was an attribute.
In essence, the previous example could be rewritten:
struct Child {
    pub _super_first: FirstParent, // 0 bytes, but mentioned anyway
    pub _super_second: SecondParent,
    // other attributes
}
fn usage(child: &mut Child) {
    SecondParent::increment(&mut child._super_second);
    SecondParent::increment(&mut child._super_second);
    println!("{}", child._super_second.a);
}
This RFC does not introduce any auto-dereferencing; the developer should feel free to implement Deref.
This RFC does not introduce any way to make the extended struct private. Indeed, as it relies on traits to expose the relationship and trait implementations cannot be private, this would not make sense.
Goal Reached (partially): sharing of fields between definitions; the sharing of methods is left as an Unresolved Question.
Goal Reached: syntactically lightweight or implicit upcasting
ExtendStruct, FirstExtendStruct
In order to query, from the language, the relationship between two structs, we introduce the traits intrinsics::{ExtendStruct, FirstExtendStruct} and a short-hand notation. These traits are unsafe, and automatically implemented by the compiler.
unsafe trait ExtendStruct<P> { fn offsets() -> &'static [isize]; }
unsafe trait FirstExtendStruct<P> {}
The compiler automatically implements the traits according to the following rules:
- Child implements all traits for itself, offsets returns &[0].
- Child implements ExtendStruct<()>.
- Child implements ExtendStruct<P> if P appears in the list of extended structs of Child.
- Child implements ExtendStruct<P> if there exists a T such that Child: ExtendStruct<T> and T: ExtendStruct<P>; due to parent ambiguity multiple such T may exist.
- <C as ExtendStruct<P>>::offsets() returns the sorted list of offsets denoting P instances in a C instance. The offsets are expressed in bytes.
- Child implements FirstExtendStruct<P> if Child: ExtendStruct<P> and the result of <C as ExtendStruct<P>>::offsets() is &[0].
The short-hand notation C: P expands to C: ExtendStruct<P>.
Alternative: the short-hand notation is discussed in the Alternatives section.
Alternative: if parent ambiguity is forbidden, fn offsets() -> &'static [isize] can become fn offset() -> isize; this is discussed in the Alternatives section.
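To make the notion of offset concrete, the following sketch measures by hand the byte offset that the compiler would report through offsets(). It assumes the embedding rewrite from the Semantics section and pins the layout with `#[repr(C)]`, which the RFC does not require; the helper functions second_parent_offset and up_cast_by_offset are hypothetical names used only here.

```rust
// Hand-written stand-in for the layout described in the Semantics section.
// `#[repr(C)]` is an assumption of this sketch: it fixes the field order.
#[repr(C)]
pub struct FirstParent;

#[repr(C)]
pub struct SecondParent {
    pub a: isize,
}

#[repr(C)]
pub struct Child {
    pub _super_first: FirstParent, // zero-sized
    pub _super_second: SecondParent,
    pub extra: u8,
}

/// Byte offset of the embedded `SecondParent` inside `child`.
pub fn second_parent_offset(child: &Child) -> isize {
    let base = child as *const Child as isize;
    let field = &child._super_second as *const SecondParent as isize;
    field - base
}

/// Up-cast by applying the byte offset to the address of the child,
/// which is roughly what an offset-based `up_cast_ref` would do.
pub fn up_cast_by_offset(child: &Child) -> &SecondParent {
    let offset = second_parent_offset(child);
    let base = child as *const Child as *const u8;
    // SAFETY: `offset` was measured on this very value, so the adjusted
    // pointer is in bounds, aligned, and points at a live `SecondParent`.
    unsafe { &*(base.offset(offset) as *const SecondParent) }
}
```

In this particular hand-written layout FirstParent is zero-sized, so both parents sit at offset 0.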
Cast
The following default implementations of UpCastRef are provided:
impl<C, P> UpCastRef<P> for C
    where C: FirstExtendStruct<P>
{
    fn up_cast_ref(&self) -> &P;
    fn up_cast_ref_mut(&mut self) -> &mut P;
}
Note: it is not possible to provide an implementation of DownCastRef because doing so requires run-time type information, which a struct does not contain by default.
Note: it is not possible to implement UpCastRef for any (C, P) such that C: ExtendStruct<P> because of the parent ambiguity problem (several P could be reachable); should parent ambiguity be banned, then it would become possible.
Lightweight up-cast
The as keyword is used as a lightweight syntax for up-casting; it desugars as follows:
- with c: &C, c as &P desugars into <C as UpCastRef<P>>::up_cast_ref(c)
- with c: &mut C, c as &mut P desugars into <C as UpCastRef<P>>::up_cast_ref_mut(c)
Goal Reached: syntactically lightweight or implicit upcasting
Alternative: the as short-hand is discussed in the Alternatives section.
StructId and struct_id
The raw::StructId struct and the intrinsics::struct_id function are introduced:
#[derive(Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd)]
pub struct StructId { id: u64 }
pub extern "rust-intrinsic" fn struct_id<S>() -> raw::StructId
where S: marker::Reflect + 'static;
To each 'static struct is associated a unique StructId, which can be obtained via the struct_id intrinsic.
Alternative: the 'static requirement is discussed in the Alternatives section.
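The closest facility available in today's standard library is std::any::TypeId, which likewise yields one identifier per 'static type. The sketch below uses it as a stand-in; wrapping a TypeId instead of a u64 is an assumption of this example and not what the RFC specifies.

```rust
use std::any::TypeId;

// Stand-in for the proposed `raw::StructId`; it wraps `TypeId` rather than
// the `u64` of the RFC, which is an assumption made for this sketch.
#[derive(Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd)]
pub struct StructId {
    id: TypeId,
}

// Stand-in for the proposed `struct_id` intrinsic.
pub fn struct_id<S: 'static>() -> StructId {
    StructId { id: TypeId::of::<S>() }
}

// Hypothetical types used to exercise the function.
pub struct NodeData;
pub struct ElementData;

/// Returns true when both type parameters denote the same struct.
pub fn same_struct<A: 'static, B: 'static>() -> bool {
    struct_id::<A>() == struct_id::<B>()
}
```

Like the proposed intrinsic, TypeId::of is only available for 'static types.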
StructInfo and struct_info
The raw::StructInfo struct and the intrinsics::struct_info function are introduced:
pub extern "rust-intrinsic" fn struct_info<S>() -> &'static raw::StructInfo
where S: marker::Reflect + 'static;
#[repr(C)]
pub struct StructInfo {
size_align: u64, // high 8 bits: log2(align), low 56 bits: size
struct_id: StructId,
v_table_getter: fn (TraitId) -> Option<&'static VTable>,
offsets_getter: fn (StructId) -> &'static [isize],
cloner: Option<fn (*const u8, *mut u8) -> ()>,
dropper: fn (&mut ()) -> (),
}
// Methods can be seen on the rust-poly repository.
This structure provides the necessary information to perform some up-casts and down-casts, as well as manipulating structures without compile-time type information.
For each monomorphized struct, the compiler will emit a StructInfo in read-only memory in the crate where the struct is instantiated and wire up the struct_info intrinsic to return a pointer to it.
- the size and alignment of the struct are stored to (later) allow DST implementation.
- the ID of the struct is used to know the type of the object at run-time.
- the v_table_getter function provides the VTable for all known traits this struct implements (at the point StructInfo is instantiated).
- the offsets_getter function is the type-erased version of ExtendStruct<P>::offsets, for all P from which the current struct extends, and otherwise returns an empty array.
- the cloner function should provide a way to clone a type-erased type; it is optional, as it cannot be provided for types not implementing Clone.
- the dropper function is the type-erased version of drop; its argument is a pointer to the first byte of the struct.
As mentioned, this type should represent a "view" of read-only memory laid out by the compiler.
Note: a rationale for the sufficiency of v_table_getter is provided in the TraitInfo section.
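The packing of size_align described in the comment of the struct (high 8 bits: log2 of the alignment, low 56 bits: size) can be sketched as follows. The function names are hypothetical; the actual accessors live in the rust-poly repository and may differ.

```rust
// Number of low bits reserved for the size, per the comment in `StructInfo`.
const SIZE_BITS: u32 = 56;
const SIZE_MASK: u64 = (1u64 << SIZE_BITS) - 1;

/// Packs a size and a power-of-two alignment into a single `u64`.
pub fn pack_size_align(size: u64, align: u64) -> u64 {
    assert!(align.is_power_of_two(), "alignment must be a power of two");
    assert!(size <= SIZE_MASK, "size does not fit in 56 bits");
    // For a power of two, the number of trailing zeros is its log2.
    let log2_align = u64::from(align.trailing_zeros());
    (log2_align << SIZE_BITS) | size
}

/// Extracts the size (low 56 bits).
pub fn unpack_size(packed: u64) -> u64 {
    packed & SIZE_MASK
}

/// Extracts the alignment (2 raised to the high 8 bits).
pub fn unpack_align(packed: u64) -> u64 {
    1u64 << (packed >> SIZE_BITS)
}
```

Storing the logarithm rather than the alignment itself is what lets the alignment fit in 8 bits.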
Common Ancestor ()
As mentioned in the Semantics section, the unit type (0-length tuple) is a common ancestor to all structs. As such, for any struct C, C: ExtendStruct<()>.
Alternative: using a common ancestor is handy, but forces parent ambiguity resolution in; it is discussed in the Alternatives section.
Trait Polymorphism
Trait Polymorphism aims at allowing a partial view of the methods inside a larger trait.
Syntax and Semantics
The syntax and semantics of traits are kept as is.
trait bound
The trait bound is introduced, making it possible to restrict a given generic type parameter to only traits. Doing so does not give access to any method of the trait; however, it allows the following:
- if T: trait, then S: T is a valid bound, and requires that S implements T
- if T: trait, then let t: &T = mem::transmute(raw::TraitObject { ... }) is a valid (and unsafe) expression
For backward compatibility reasons, the absence of a trait bound cannot mean !trait, but instead means ?trait.
ExtendTrait, FirstExtendTrait
In order to query, from the language, the relationship between two traits, we introduce the traits intrinsics::{ExtendTrait, FirstExtendTrait}. Those traits are unsafe, and automatically implemented by the compiler.
unsafe trait ExtendTrait<B>: trait { fn offset() -> isize; }
unsafe trait FirstExtendTrait<B>: ExtendTrait<B> {}
The compiler automatically implements the traits according to the following rules:
- D implements ExtendTrait<D>, offset returns 0.
- D implements ExtendTrait<B> if D: B, offset returns the necessary in-memory adjustment to point to the right v-table.
- D implements FirstExtendTrait<B> if <D as ExtendTrait<B>>::offset() returns 0.
Note: the layout of v-tables is constrained by the presence of offset, which requires that the compiler and the offset implementation work hand in hand.
Cast
The following default implementations of UpCastRef and DownCastRef are provided:
impl<D, B> UpCastRef<B> for D
where D: B + trait + 'static,
B: trait + 'static,
{
fn up_cast_ref(&self) -> &B;
fn up_cast_ref_mut(&mut self) -> &mut B;
}
impl<D, B> DownCastRef<D> for B
where D: B + trait + 'static,
B: trait + 'static,
{
fn down_cast_ref(&self) -> Option<&D>;
fn down_cast_ref_mut(&mut self) -> Option<&mut D>;
}
Unresolved question: Are DownCastRef and UpCastRef necessary? Could they be supplanted by implementing the regular DownCast/UpCast on references instead?
Lightweight up-cast
The as keyword is used as a lightweight syntax for up-casting; it desugars as follows:
- with c: &C, c as &P desugars into <C as UpCastRef<P>>::up_cast_ref(c)
- with c: &mut C, c as &mut P desugars into <C as UpCastRef<P>>::up_cast_ref_mut(c)
Goal Reached: syntactically lightweight or implicit upcasting
Alternative: the as short-hand is discussed in the Alternatives section.
TraitId and trait_id
The raw::TraitId struct and the intrinsics::trait_id function are introduced:
#[derive(Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd)]
pub struct TraitId { id: u64 }
pub extern "rust-intrinsic" fn trait_id<T>() -> raw::TraitId
where T: marker::Reflect + 'static;
To each 'static trait is associated a unique TraitId, which can be obtained via the trait_id intrinsic.
Alternative: the 'static requirement is discussed in the Alternatives section.
TraitInfo and trait_info
The raw::TraitInfo struct and the intrinsics::trait_info function are introduced:
pub extern "rust-intrinsic" fn trait_info<T>() -> &'static raw::TraitInfo
where T: trait + marker::Reflect + 'static;
#[repr(C)]
pub struct TraitInfo {
trait_id: TraitId,
v_table_getter: fn (StructId) -> Option<&'static VTable>,
}
// Methods can be seen on the rust-poly repository.
This structure provides the necessary information to perform some up-casts and down-casts, as well as manipulating structures without compile-time type information.
For each monomorphized trait, the compiler will emit a TraitInfo in read-only memory in the crate where the trait is instantiated and wire up the trait_info intrinsic to return a pointer to it.
- the ID of the trait is used to know the type of the object at run-time.
- the v_table_getter function provides the VTable for all known structs implementing this trait (at the point TraitInfo is instantiated).
Most languages feature closed-inheritance, such that when the v-table of a class is instantiated the definite list of all implemented interfaces is known. Rust, instead, features open-polymorphism: the set of traits a given struct implements is only known when creating a fully statically linked executable.
Rust does give some guarantees, however: the coherence rules guarantee that for any two monomorphized trait T and struct S, impl T for S may only appear in either the crate of T (or one of its dependencies) or the crate of S (or one of its dependencies). This RFC proposes to piggy-back on the coherence rules:
- StructInfo::v_table_getter: all known traits at the point of instantiation of the monomorphized struct
- TraitInfo::v_table_getter: all known structs at the point of instantiation of the monomorphized trait
Thus, for any given monomorphized trait T and struct S, if S: T, then:
- either (trait_info::<T>().v_table_getter)(struct_id::<S>()).is_some()
- or (struct_info::<S>().v_table_getter)(trait_id::<T>()).is_some()
and should T and S be statically known, both computations should be resolvable at compile-time.
Note: it is unclear yet which should resolve faster than the other.
Goal Reached: cheap down-casting (using v_table_getter), although maybe slightly less cheap than in C++.
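The two-sided lookup can be sketched in today's Rust with plain function pointers. Everything below is a simplified model: the identifiers are bare integers, VTable only carries a label, and the struct S, the traits Upstream, Downstream and Unrelated, and the function find_v_table are hypothetical names chosen for this example.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StructId(pub u64);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TraitId(pub u64);

// Simplified stand-in: a real v-table would hold method pointers.
pub struct VTable {
    pub label: &'static str,
}

pub struct StructInfo {
    pub struct_id: StructId,
    pub v_table_getter: fn(TraitId) -> Option<&'static VTable>,
}

pub struct TraitInfo {
    pub trait_id: TraitId,
    pub v_table_getter: fn(StructId) -> Option<&'static VTable>,
}

/// Asks the struct side first, then falls back to the trait side.
pub fn find_v_table(s: &StructInfo, t: &TraitInfo) -> Option<&'static VTable> {
    (s.v_table_getter)(t.trait_id).or_else(|| (t.v_table_getter)(s.struct_id))
}

// Hypothetical data: `S` was compiled knowing the `Upstream` trait, while
// the `Downstream` trait was compiled later, knowing `S`.
pub static S_AS_UPSTREAM: VTable = VTable { label: "S as Upstream" };
pub static S_AS_DOWNSTREAM: VTable = VTable { label: "S as Downstream" };

fn s_getter(t: TraitId) -> Option<&'static VTable> {
    if t == TraitId(1) { Some(&S_AS_UPSTREAM) } else { None }
}
fn downstream_getter(s: StructId) -> Option<&'static VTable> {
    if s == StructId(10) { Some(&S_AS_DOWNSTREAM) } else { None }
}
fn knows_nothing(_s: StructId) -> Option<&'static VTable> {
    None
}

pub static S_INFO: StructInfo = StructInfo { struct_id: StructId(10), v_table_getter: s_getter };
pub static UPSTREAM: TraitInfo = TraitInfo { trait_id: TraitId(1), v_table_getter: knows_nothing };
pub static DOWNSTREAM: TraitInfo = TraitInfo { trait_id: TraitId(2), v_table_getter: downstream_getter };
pub static UNRELATED: TraitInfo = TraitInfo { trait_id: TraitId(3), v_table_getter: knows_nothing };
```

The order of the two queries is arbitrary here, since the text leaves open which side resolves faster.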
VTable and v_table
The raw::VTable struct and the intrinsics::v_table function are introduced:
pub extern "rust-intrinsic" fn v_table<T, S>() -> &'static rtti::VTable
where T: trait + marker::Reflect + 'static,
S: T + marker::Reflect + 'static;
#[repr(C)]
pub struct VTable {
pub struct_info: &'static StructInfo,
pub trait_info: &'static TraitInfo,
// In memory, follows a list of pointers to methods and other amenities
}
For each monomorphized impl, the compiler will emit a VTable in read-only memory in the crate where the impl is instantiated and wire up the v_table intrinsic to return a pointer to it.
This structure's size depends on the trait implemented.
Goal Reached: cheap field access from internal methods (monomorphized)
Goal Reached: cheap dynamic dispatch of methods (fixed offset from the start of VTable)
Alternative: some functions (dropper, cloner, ...) could be lifted into VTable; this is discussed in the Alternatives section.
Alternative: a typed, sized version of VTable (TypedVTable<T>) could be provided to the user; this is discussed in the Alternatives section.
TraitObject and Casts
The existing raw::TraitObject should be modified so that its vtable member may be safely transmuted to &'static VTable.
Then with &T relying on VTable, &T and &mut T can be cast at run-time:
impl<'a, T, B> UpCast<&'a B> for &'a T
where T: B + trait + marker::Reflect + 'static,
B: trait + marker::Reflect + 'static;
impl<'a, T, D> DownCast<&'a D> for &'a T
where T: trait + marker::Reflect + 'static,
D: T + trait + marker::Reflect + 'static;
impl<'a, T, X> Cast<&'a X> for &'a T
where T: trait + marker::Reflect + 'static,
X: trait + marker::Reflect + 'static;
impl<'a, T, B> UpCast<&'a mut B> for &'a mut T
where T: B + trait + marker::Reflect + 'static,
B: trait + marker::Reflect + 'static;
impl<'a, T, D> DownCast<&'a mut D> for &'a mut T
where T: trait + marker::Reflect + 'static,
D: T + trait + marker::Reflect + 'static;
impl<'a, T, X> Cast<&'a mut X> for &'a mut T
where T: trait + marker::Reflect + 'static,
X: trait + marker::Reflect + 'static;
Detailed Design [Library]
This section of the RFC presents additional library elements that could be added to provide users with an out-of-the-box experience.
Unresolved Question: How to provide a safe Unsized type in the absence of Custom DST?
Alternative: the necessity for those types is discussed in the Alternatives section.
Thin Pointers/References
Thin Pointers require bundling together the pointer to run-time information and the data of a particular instance of a struct. This RFC introduces some types to deal with those.
Class and DynClass (and Dyn)
The rtti::{Class,DynClass,Dyn} types are introduced, their particular implementations are only sketched.
#[repr(...)] // C ?
struct Class<T, S>
where T: trait,
S: T,
{
// Exposure only
dyn: DynClass<T, S>,
data: S,
}
// FIXME: this type should be !Sized, this requires negative bounds
// FIXME: this type would benefit from the availability of support for custom DST
#[repr(...)] // C ?
struct DynClass<T, S>
where T: trait,
{
// Exposure only
v_ptr: &'static VTable,
offset: isize,
}
//
type Dyn<T> = DynClass<T, ()>;
Some details:
- It is intended for DynClass to be a dynamically sized prefix of Class.
- v_ptr is the traditional pointer to the v-table; it is initialized to v_table::<T0, S0>() when Class<T0, S0> is first created. It always points to v-tables of S0 throughout the lifetime of the instance, but jumps from trait to trait following the casts (of DynClass).
- offset represents, at first, the offset of data in Class<T0, S0>; it is necessary to support S0 types for which mem::align_of::<S0>() > mem::align_of::<isize>(). It jumps from struct to struct (always within the original confines of Class<T0, S0>) following the casts (of DynClass).
- data is simply the piece of data.
Invariants:
- for any DynClass<T, S>, self.v_ptr points to a VTable for trait T (and the original struct)
- for any DynClass<T, S>, (&self as *const u8) + self.offset points to an instance of S
As a result of those invariants, building a &T or a &S is cheap.
Goal Reached: thin pointers (Box<DynClass<T, S>>).
Goal Reached: safe, i.e., doesn't require a bunch of transmutes or other unsafe code to be usable.
Note: since offset is necessary for large alignment support, this RFC makes the best of it and handles extending multiple structs at once without additional storage costs.
Note: DynClass cannot be directly constructed; instead, an implementation of convert::From exists to convert from Box<Class<T, S>> to Box<DynClass<T, S>>.
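The header-plus-payload idea behind Class and DynClass can be approximated in today's Rust with raw pointers. The sketch below is not the rust-poly implementation: Header, Thin, Describe and TextNode are hypothetical names, the "v-table" is reduced to two function pointers stored inline, and the payload type is erased behind a single-word pointer to the header.

```rust
use std::ptr;

// Header placed in front of the payload; it plays the role of `DynClass`.
#[repr(C)]
pub struct Header {
    describe: unsafe fn(*const Header) -> String, // one-entry "v-table"
    drop_box: unsafe fn(*mut Header),             // type-erased destructor
    offset: isize, // byte offset from the header to the payload
}

// Header followed by the payload; it plays the role of `Class`.
#[repr(C)]
pub struct Class<S> {
    header: Header,
    data: S,
}

pub trait Describe {
    fn describe(&self) -> String;
}

// Recovers the payload from the header address and the stored offset.
unsafe fn describe_impl<S: Describe>(h: *const Header) -> String {
    unsafe {
        let data = (h as *const u8).offset((*h).offset) as *const S;
        (*data).describe()
    }
}

// Rebuilds the original box so that the payload and allocation are freed.
unsafe fn drop_impl<S>(h: *mut Header) {
    unsafe { drop(Box::from_raw(h as *mut Class<S>)) }
}

/// Owning thin pointer: a single machine word.
pub struct Thin(*mut Header);

impl Thin {
    pub fn new<S: Describe>(data: S) -> Thin {
        let header = Header { describe: describe_impl::<S>, drop_box: drop_impl::<S>, offset: 0 };
        let raw = Box::into_raw(Box::new(Class { header, data }));
        // SAFETY: `raw` comes from a live box and is uniquely owned here.
        unsafe {
            let offset = (ptr::addr_of!((*raw).data) as isize) - (raw as isize);
            (*raw).header.offset = offset;
        }
        Thin(raw as *mut Header)
    }

    pub fn describe(&self) -> String {
        // SAFETY: the pointer stays valid until `drop` runs.
        unsafe { ((*self.0).describe)(self.0) }
    }
}

impl Drop for Thin {
    fn drop(&mut self) {
        // SAFETY: `drop_box` was instantiated for the payload type used in `new`.
        unsafe { ((*self.0).drop_box)(self.0) }
    }
}

// Hypothetical payload used to exercise the sketch.
pub struct TextNode {
    pub content: String,
}

impl Describe for TextNode {
    fn describe(&self) -> String {
        format!("text: {}", self.content)
    }
}
```

Storing the offset in the header, rather than recomputing it from the payload type, is what lets code that only knows the header locate the payload, including payloads whose alignment introduces padding after the header.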
Casts
The various *Cast* traits are implemented for Box<DynClass<T, S>>:
impl<T, S, B, P> UpCast<Box<DynClass<B, P>>> for Box<DynClass<T, S>>
where T: B + trait + marker::Reflect + 'static,
S: P + marker::Reflect + 'static,
B: trait + marker::Reflect + 'static,
P: marker::Reflect + 'static;
impl<T, S, B, P> UpCastRef<DynClass<B, P>> for DynClass<T, S>
where T: FirstExtendTrait<B> + marker::Reflect + 'static,
S: FirstExtendStruct<P> + marker::Reflect + 'static,
B: marker::Reflect + 'static,
P: marker::Reflect + 'static;
impl<T, S, D, C> DownCast<Box<DynClass<D, C>>> for Box<DynClass<T, S>>
where T: trait + marker::Reflect + 'static,
S: marker::Reflect + 'static,
D: T + trait + marker::Reflect + 'static,
C: FirstExtendStruct<S> + marker::Reflect + 'static;
impl<T, S, D, C> DownCastRef<DynClass<D, C>> for DynClass<T, S>
where T: trait + marker::Reflect + 'static,
S: marker::Reflect + 'static,
D: FirstExtendTrait<T> + marker::Reflect + 'static,
C: FirstExtendStruct<S> + marker::Reflect + 'static;
impl<T, S, X, Y> Cast<Box<DynClass<X, Y>>> for Box<DynClass<T, S>>
where T: trait + marker::Reflect + 'static,
S: marker::Reflect + 'static,
X: trait + marker::Reflect + 'static,
Y: marker::Reflect + 'static;
A prototype implementation can be seen on rust-poly; an excerpt is presented here:
// up-cast from trait T to trait B
let v_table: &'static VTable = unsafe {
let raw: *const u8 = mem::transmute(self.v_table);
mem::transmute(raw as isize + <T as TraitExtendTrait<B>>::offset())
};
which illustrates the support of efficient up-casts, as the result of <T as TraitExtendTrait<B>>::offset() can be computed at compile-time (provided offset is inlined).
Note: the extra restrictions imposed on UpCastRef and DownCastRef apply any time the content pointed to is immutable; for example, they would apply to Rc<DynClass<T, S>>. This cannot be worked around by using Rc<RefCell<DynClass<T, S>>> (or any other run-time mutability enabler) as the *Cast traits consume their inputs.
RawClone
Cloning is a very much desirable feature (when available), however:
- Clone is not object-safe, thus requiring T: Clone (where T is a trait) is not possible
- DynClass is a type-erased container, so whether the original S0 could be cloned is unknown, yet Clone::clone is not allowed to fail
This RFC proposes the inclusion of a new trait in core::clone: RawClone.
pub trait RawClone {
// Will write mem::size_of::<Self>() bytes in dst:
// - dst will be overwritten, no destructor will run,
// - dst should be big enough,
// - dst should not overlap with self.
unsafe fn raw_clone(&self, dst: *mut u8);
}
impl<T> RawClone for T
where T: clone::Clone + Sized
{
unsafe fn raw_clone(&self, dst: *mut u8); // an example implementation is available in rust-poly
}
This trait is a low-level object-safe version of Clone, and is implemented by default by any type implementing Clone, so that its inclusion is transparent to the users.
In turn, it allows static detection of whether cloning is possible (or not) in a type-erased scenario:
impl<T, S> DynClass<T, S>
where T: RawClone + trait,
S: T
{
pub fn clone_to_box(&self) -> Box<DynClass<T, S>>;
}
impl<T, S> clone::Clone for Box<DynClass<T, S>>
where T: RawClone + trait,
S: T,
{
fn clone(&self) -> Self { self.clone_to_box() }
} // impl Clone
The clone_to_box method is intended to smooth integration of cloning for other smart pointers, and does all the heavy lifting.
Note: should the current trait T not extend RawClone, the user may use the Cast facility described above to attempt a Cast to Box<DynClass<RawClone, S>>, clone (if the cast succeeds), and cast back to Box<DynClass<T, S>>.
Alternative: A more comprehensive attempt at solving DST challenges is necessary, as this new trait is insufficient to solve all challenges by itself; this is discussed in the Alternatives section.
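RawClone itself needs no compiler support, so it can be tried on today's Rust. The sketch below is one possible implementation of the blanket impl, not necessarily the one in rust-poly; the helper clone_through_erasure is a hypothetical name used to show the call going through a trait object.

```rust
use std::mem::MaybeUninit;
use std::ptr;

pub trait RawClone {
    /// Writes a clone of `self` into `dst` without dropping what `dst` held.
    ///
    /// # Safety
    /// `dst` must be valid for writes of the concrete type behind `self`,
    /// suitably aligned for it, and must not overlap `self`.
    unsafe fn raw_clone(&self, dst: *mut u8);
}

// Blanket implementation: every `Clone` type gets `RawClone` for free.
impl<T: Clone> RawClone for T {
    unsafe fn raw_clone(&self, dst: *mut u8) {
        // `ptr::write` moves the clone into place without reading or
        // dropping the previous content of `dst`.
        unsafe { ptr::write(dst as *mut T, T::clone(self)) }
    }
}

/// Clones `src` while only looking at it through `&dyn RawClone`.
pub fn clone_through_erasure<T: Clone>(src: &T) -> T {
    let erased: &dyn RawClone = src;
    let mut slot = MaybeUninit::<T>::uninit();
    // SAFETY: `slot` has the size and alignment of `T`, the concrete type
    // behind `erased`, and it is a fresh local that cannot overlap `src`.
    unsafe {
        RawClone::raw_clone(erased, slot.as_mut_ptr() as *mut u8);
        slot.assume_init()
    }
}
```

The caller still has to provide correctly sized and aligned storage, which is where the size and alignment recorded in StructInfo would come into play in a fully type-erased setting.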
Fat Pointers/References
Why?
- &T does not provide direct access to fields
- &DynClass<T, S> does not provide access to other fields
DynRef and DynRefMut
The rtti::{DynRef,DynRefMut} types are introduced, their particular implementations are only sketched.
pub struct DynRef<'a, T, S>
where T: marker::Reflect + 'static,
S: marker::Reflect + 'static,
{
// Exposure only
v_ptr: &'static VTable,
offset: isize,
v_data: &'a (),
}
pub struct DynRefMut<'a, T, S>
where T: marker::Reflect + 'static,
S: marker::Reflect + 'static,
{
// Exposure only
v_ptr: &'static VTable,
offset: isize,
v_data: &'a mut (),
}
Those references can be pointed at any existing storage, borrowing it, and allow navigating the extension hierarchy of S safely through the implementation of the *Cast traits.
Conversion from existing references
The rtti::{DynRef,DynRefMut} can be created from existing &T or &mut T:
impl<'a, T> From<&'a T> for DynRef<'a, T, ()>
where T: trait
{
fn from(r: &'a T) -> DynRef<'a, T, ()> {
let trait_object = unsafe { mem::transmute(r) };
DynRef { v_ptr: trait_object.v_ptr, offset: 0, v_data: trait_object.data }
}
}
// Similarly
impl<'a, T> From<&'a mut T> for DynRefMut<'a, T, ()>;
This then allows the user to obtain handles to the struct through the existing casting facilities.
Implementing the DOM according to requirements
Let us now see what a simple DOM would look like given those facilities, as it is the reference example used by the existing RFCs.
type ClassNode = DynClass<Node, NodeData>;
trait Node: clone::RawClone {}
#[derive(Debug)]
struct NodeData {
parent: Option<Box<ClassNode>>,
first_child: Option<Box<ClassNode>>,
}
impl Node for NodeData {}
type ClassText = DynClass<Node, TextNode>;
#[derive(Debug)]
struct TextNode: NodeData {}
impl Node for TextNode {}
type ClassElement = DynClass<Element, ElementData>;
trait Element: Node {
fn do_the_thing(&self);
fn before_set_attr(&mut self, _key: &str, _val: &str) {}
fn after_set_attr(&mut self, _key: &str, _val: &str) {}
}
#[derive(Debug)]
struct ElementData: NodeData {
attrs: HashMap<String, String>,
}
// Note: private access to ElementData::attrs, ensuring invariants;
// also, this method is always statically dispatched and thus inlinable.
impl ElementData {
fn set_attribute(&mut self, key: &str, value: &str) {
self.before_set_attr(key, value);
self.attrs.insert(key.to_string(), value.to_string());
self.after_set_attr(key, value);
}
}
impl Node for ElementData {}
impl Element for ElementData {
fn do_the_thing(&self) { println!("ElementData is in da place!"); }
}
#[derive(Debug)]
struct HTMLImageElement: ElementData {}
impl Node for HTMLImageElement {}
impl Element for HTMLImageElement {
fn do_the_thing(&self) { println!("HTMLImageElement is in da place!"); }
fn before_set_attr(&mut self, key: &str, val: &str) {
if key == "src" {
// remove cached image
}
<ElementData as Element>::before_set_attr(self as &mut ElementData, key, val);
}
}
#[derive(Debug)]
struct HTMLVideoElement: ElementData {
cross_origin: bool,
}
impl Node for HTMLVideoElement {}
impl Element for HTMLVideoElement {
fn do_the_thing(&self) { println!("HTMLVideoElement is in da place!"); }
fn after_set_attr(&mut self, key: &str, value: &str) {
if key == "crossOrigin" {
self.cross_origin = value == "true";
}
<ElementData as Element>::after_set_attr(self as &mut ElementData, key, value);
}
}
fn process_any_element<'a>(element: &'a Element) {
element.do_the_thing();
}
pub fn doit() {
let text_node: Box<ClassNode> = {
let nd = NodeData { parent: None, first_child: None };
Box::new(Class::new(TextNode { NodeData: nd })).into().up_cast()
};
let video_element: Box<ClassElement> = {
let nd = NodeData { parent: None, first_child: Some(text_node) };
let ed = ElementData { NodeData: nd, attrs: HashMap::new() };
let hve = HTMLVideoElement { ElementData: ed, cross_origin: false };
Box::new(Class::new(hve)).into().up_cast()
};
process_any_element((*video_element).as_trait());
let node = (video_element.as_struct() as &NodeData).first_child.as_ref().unwrap();
if let Some(text) = down_cast!((*node) => ref ClassText) {
println!("I got me some text node {:?}", &text);
} else if let Some(element) = down_cast!((*node) => ref ClassElement) {
println!("I got me some element {:?}", &element);
} else {
println!("Oh shoot, nothing I know!");
}
println!("I haz teh clone: {:?}", node.clone());
}
Note: the down_cast! macro is a straightforward macro implemented in rust-poly which eases in-flight casts by expanding to a block in which a type ascription occurs. This RFC does not discuss its inclusion; it can be added easily enough later on.
Goal Check
Starting with a freebie:
Goal Reached: calling functions through smartpointers, e.g. fn foo(JSRef, ...), if said smart pointers implement Deref<Target = T> for the desired trait.
The goals check list:
- cheap field access from internal methods:
  - the code for a given struct or trait is monomorphized.
- cheap dynamic dispatch of methods:
  - the method is at a compile-time known offset in the v-table currently being pointed to, thus the cost is identical to calling a method through a &T where T: trait.
- cheap down-casting:
  - down-casting at an arbitrary point in an open hierarchy is a complicated operation, achieved here with at most 2 function calls (one statically known, with a run-time argument, the other dynamically known, with a compile-time argument); cheap given the functionality.
- thin pointers:
  - Box<DynClass<..>> is a thin pointer, but is unavailable without custom DST support.
- sharing of fields and methods between definitions:
  - fields are shared through struct extension.
  - methods... see "Unresolved Question"
- safe, i.e., doesn't require a bunch of transmutes or other unsafe code to be usable:
  - the library types Class and DynClass can be used safely.
- syntactically lightweight or implicit upcasting:
  - a single method call (when inference works) or the "as" keyword.
- calling functions through smartpointers, e.g. fn foo(JSRef, ...):
  - as long as said pointers implement the appropriate Deref
- static dispatch of methods:
  - as today.
Unresolved Question: What does "sharing methods between definitions" mean, exactly?
Wrapping up
This RFC proposes a polymorphism which integrates smoothly with the existing language today: this is evidenced by the fact that &T gains polymorphism, for example.
By fusing the necessary facilities for up-casting, down-casting and cross-casting with the existing facilities, this RFC avoids duplicating the functionalities and augments the existing language without introducing a shift, thus guaranteeing the interoperability of code which makes use of its library facilities and code which does not.
Yet, despite being lightweight and rusty, it remains quite possible to translate a traditional object hierarchy with a one-to-one mapping, using type aliases to mask the novelty and type inference to avoid ever mentioning it directly.
Drawbacks
- No custom DST yet: in the absence of support for custom DST types, thin-pointers can only exist in the compiler (today); this RFC does not seek to address the issue, and rather than introducing more special cases into the compiler (for a built-in DynClass), the suggestion offered is to revise this proposal after some minimal custom DST support has landed.
- heavier syntax (&T vs &Dyn<T>): it is expected that the need for such bundling be rare, and it is possible to convert to &T immediately (so that only struct code and not fn code be affected), thus this is seen as a minor drawback.
- performance untested: the performance can only be assessed with the proper compiler intrinsics in place; complexity analysis looks good, but it only goes so far.
- opinionated: some building blocks are created, though maybe not as generic as one would wish. It seems unlikely that building blocks catering to any case would emerge, though.
Alternatives
*CastRef
It is unclear whether those variants are really useful, or whether they could be supplanted by implementing the *Cast traits on reference types instead.
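A minimal sketch of that alternative, assuming a simplified cast trait: the UpCast trait below is a hypothetical stand-in written for this example (the RFC's actual trait signatures may differ), and struct extension is emulated by embedding the parent as a field.

```rust
// Hypothetical, simplified cast trait; not the RFC's definition.
trait UpCast<Target> {
    fn up_cast(self) -> Target;
}

struct Node {
    id: u32,
}

// Emulated extension: the parent is embedded as the first field.
struct Element {
    node: Node,
    tag: &'static str,
}

// Implemented directly on the reference types, so that no separate
// *CastRef trait is needed.
impl<'a> UpCast<&'a Node> for &'a Element {
    fn up_cast(self) -> &'a Node {
        &self.node
    }
}

impl<'a> UpCast<&'a mut Node> for &'a mut Element {
    fn up_cast(self) -> &'a mut Node {
        &mut self.node
    }
}

fn bump_parent_id(element: &mut Element) -> u32 {
    // Explicit reborrows keep `element` usable after each cast.
    let node: &mut Node = (&mut *element).up_cast();
    node.id += 1;
    let shared: &Node = (&*element).up_cast();
    shared.id
}
```

In this form the same trait covers both shared and mutable references, at the cost of the caller reborrowing explicitly when the original reference is needed afterwards.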
Extension syntax
The use of the extension syntax for struct immediately introduces a number of specific constructs:
- the extension syntax itself,
- the initialization syntax, which is changed to be able to initialize the extended structs,
- ...
Gereeter (co-author of #11) instead proposes the use of an attribute such as #[parent] (for example) placed on a regular data member.
Despite the ease of introducing such an attribute, there seem to be a number of downsides. Support would seem lackluster compared to the first-class support of trait extension, for example, and it would seem bizarre that even though a sub-typing relationship is introduced as for trait, the syntax is completely different.
It is unclear whether first-class syntax support for this feature is really desired, though, as it could be an edge case.
Short-hand notation :
The short-hand notation for struct extension (or even trait extension) is not necessary; however it greatly simplifies writing bounds, in a way that current Rustaceans have already internalized.
Still, in the presence of the intrinsic traits, it is redundant.
Short-hand notation as
The short-hand notation for up-casting is not necessary; however "as" greatly simplifies writing the up-casts, in a way that current Rustaceans have already internalized.
Still, in the presence of the UpCast* traits, it is redundant.
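For reference, this is the kind of up-cast that "as" already expresses in today's Rust when going from a concrete reference to a trait object. Shape and Square are made-up names for this sketch, which only shows existing language behaviour and not the casts proposed by this RFC.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

fn total_area(shapes: &[&dyn Shape]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn areas() -> f64 {
    let a = Square { side: 2.0 };
    let b = Square { side: 3.0 };
    // Explicit up-cast with `as`.
    let explicit = &a as &dyn Shape;
    // Implicit up-cast (coercion) driven by the type annotation.
    let implicit: &dyn Shape = &b;
    total_area(&[explicit, implicit])
}
```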
Multiple parents
The syntax and compiler intrinsics proposed support a struct having multiple parents, and indeed having the same parent multiple times (through various branches).
It would be possible to restrict struct extension to a single parent. This would obviate the need for ExtendStruct::offsets, and other bits and nits, at the cost of restricting the user.
It would instead be possible to allow struct extension to multiple parents while forbidding parent ambiguity, that is, having P reachable multiple times via different branches of the parents hierarchy. This would simplify the interface of ExtendStruct::offset (which would always return a single offset), and the restriction might not be too onerous on the user.
And finally, full-support for parent ambiguity is also possible, in multiple ways:
- ExtendStruct could be restricted to unambiguous cases; forcing the user to specify the path when casting. In this case, offset can be simplified as above.
- a supplementary UnambiguouslyExtendStruct could be introduced for this purpose.
- or things remain as-is, and a run-time decision need be taken, such as using the first instance (when up-casting), bailing out (Option) or panicking.
Note: the fate of the proposed common ancestor () hangs in the balance here as the ability to cast to () requires support for parent ambiguity. Without a common ancestor, Dyn<T> requires a full-blown implementation rather than being a mere alias.
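To make parent ambiguity concrete, here is a small emulation in today's Rust. Struct extension does not exist, so parents are embedded as fields with #[repr(C)]; Base, Left, Right, Child and the helper functions are hypothetical names, and base_offsets is only a hand-written analogue of what a compiler-provided offsets query might return.

```rust
#[repr(C)]
struct Base {
    id: u32,
}

#[repr(C)]
struct Left {
    base: Base,
    l: u32,
}

#[repr(C)]
struct Right {
    base: Base,
    r: u32,
}

// `Base` is reachable twice: through `left` and through `right`.
#[repr(C)]
struct Child {
    left: Left,
    right: Right,
}

/// Byte offset of `field` inside `whole`, computed from addresses.
fn offset_in<W, F>(whole: &W, field: &F) -> usize {
    (field as *const F as usize) - (whole as *const W as usize)
}

/// Hand-written analogue of an "all offsets of Base in Child" query.
fn base_offsets(child: &Child) -> Vec<usize> {
    vec![
        offset_in(child, &child.left.base),
        offset_in(child, &child.right.base),
    ]
}

/// Policy 1: use the first instance when up-casting.
fn up_cast_first(child: &Child) -> &Base {
    &child.left.base
}

/// Policy 2: bail out (Option) when the parent is ambiguous.
fn up_cast_checked(child: &Child) -> Option<&Base> {
    if base_offsets(child).len() == 1 {
        Some(&child.left.base)
    } else {
        None
    }
}
```

With this layout the two Base instances sit at different offsets, so the checked policy returns None for Child while the first-instance policy silently picks the left branch.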
'static lifetime
Today, in Any, the 'static lifetime is required because Any erases the lifetime.
This seems arbitrarily restrictive, and it seems that:
- it is unnecessary for trait_id and struct_id (as it is for type_id today) as long as one accepts that traits/structs varying only by their lifetime (a pure compile-time artefact) would obtain identical IDs.
- the safety could instead be enforced at library level, Class and DynClass could for example be extended to take a lifetime, and up-casts/down-casts would only be allowed to narrow the lifetime (yes, narrowing whatever the direction of the cast)
It could be worth future-proofing the API (introducing an extraneous lifetime in Class and Dyn* but with a 'static bound for now), to avoid API changes, but may not be worth introducing it from the get-go: there is already much to deal with.
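The restriction under discussion can be seen with the standard Any trait. The sketch below uses only std::any; Owned and the helper functions are names invented for the example. Note that in current stable Rust, TypeId::of also carries a 'static bound.

```rust
use std::any::{Any, TypeId};

struct Owned {
    value: u32,
}

// Safe down-cast from an erased reference; None on a type mismatch.
fn as_owned(value: &dyn Any) -> Option<&Owned> {
    value.downcast_ref::<Owned>()
}

// Type identity check; both parameters must be 'static.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

fn down_cast_demo() -> (Option<u32>, bool) {
    let owned = Owned { value: 7 };
    let hit = as_owned(&owned).map(|o| o.value);
    let miss = as_owned(&"text").is_none();
    (hit, miss)
}

// A type such as `struct Borrowed<'a> { text: &'a str }` with a
// non-'static 'a cannot be coerced to `&dyn Any`, which is the
// restriction this section proposes to revisit.
```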
Layout of VTable
The layout of VTable as proposed only takes into account fast up-casting and fast dynamic method dispatching. The down-casting path is slower, as is a call to dropper or (in the future) cloner.
Other layouts are possible:
- not duplicating the v-table headers in-between the methods list when duplicating the list: this prevents the existence of ExtendTrait::offset and requires run-time look-up of the v-table.
- increasing the v-table header, by moving / duplicating information from StructInfo or TraitInfo, in a typical attempt to speed things up (one less dereference) by using more memory.
The current layout was proposed as a normalized version of what the v-table could look like, whilst still remaining efficient. Benchmarks are necessary to measure the effect of the various options on the performance of different use cases (method call, drop/clone call, casting).
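The trade-off between a small header and an extra dereference can be sketched with a hand-rolled v-table. Everything below is an assumption made for illustration: the field names, the contents of StructInfo and the DynShape wrapper are hypothetical and are not the layout specified by this RFC. For simplicity DynShape is a fat (data, v-table) pair rather than a thin pointer.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical per-struct metadata, reached through the header.
struct StructInfo {
    name: &'static str,
    size: usize,
}

// Small header: only a pointer to the metadata.
struct VTableHeader {
    struct_info: &'static StructInfo,
}

// Header followed by the method list (a single method here).
struct ShapeVTable {
    header: VTableHeader,
    area: unsafe fn(*const ()) -> f64,
}

struct Square {
    side: f64,
}

unsafe fn square_area(this: *const ()) -> f64 {
    // Sound only if `this` really points to a live Square.
    let square = unsafe { &*(this as *const Square) };
    square.side * square.side
}

static SQUARE_INFO: StructInfo = StructInfo {
    name: "Square",
    size: size_of::<Square>(),
};

static SQUARE_AS_SHAPE: ShapeVTable = ShapeVTable {
    header: VTableHeader { struct_info: &SQUARE_INFO },
    area: square_area,
};

struct DynShape<'a> {
    data: *const (),
    vtable: &'static ShapeVTable,
    _marker: PhantomData<&'a ()>,
}

impl<'a> DynShape<'a> {
    fn new(square: &'a Square) -> Self {
        DynShape {
            data: square as *const Square as *const (),
            vtable: &SQUARE_AS_SHAPE,
            _marker: PhantomData,
        }
    }

    // Method call: one load from the v-table, then an indirect call.
    fn area(&self) -> f64 {
        // The constructor pairs `data` with the matching v-table.
        unsafe { (self.vtable.area)(self.data) }
    }

    // Metadata access: one extra dereference through the header.
    fn struct_name(&self) -> &'static str {
        self.vtable.header.struct_info.name
    }

    fn struct_size(&self) -> usize {
        self.vtable.header.struct_info.size
    }
}
```

Moving name or size directly into VTableHeader would remove the extra dereference in struct_name and struct_size while growing every v-table, which is the second alternative listed above.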
Sized, Typed, VTable
A dedicated (compiler-generated) TypedVTable<T> could be generated for each trait, allowing a user to navigate the v-table in library code (even sub-tables and methods).
This RFC proposes NOT to introduce such a type for now, as it can be added in a backward compatible fashion.
Library section
The entire library section could be tabled, or an entirely different version of the library could be proposed, without much affecting the value of the Language & Compiler section.
Notably, even without the library section, one would gain casts on existing &T and &mut T references.
RawClone
As noted, RawClone in itself is insufficient to solve all DST challenges. It does, however, constitute a building block which could be leveraged later on, and is a very small addition that is transparent to existing user code.
It could be tabled for now, only cloning would be affected and it could be added back later (in this form or another).
Comparison to existing RFCs
There are many other RFCs, as already mentioned:
- #9: Fat Objects
- #11: Extending Enums
- #223: Trait Based Inheritance
- #250: Associated Field Inheritance
This RFC emphasizes flexibility and a clean separation of concern between payload (struct), behaviour (trait) and usage (&T or &Dyn<T>). The same struct or trait can freely be shared in situations where thin pointers are desirable and in situations they are not.
This RFC can be seen as a refined version of Fat Objects (#9), proposing a more fully fleshed out implementation and simplifying the implementation of non-virtual methods by simply adding them to the struct rather than creating an extraneous trait. It was also pointed out by gereeter (author of Trait Based Inheritance (#223)) that there was some overlap.
Compared to ... (#11), this RFC does not require distinguishing between enums that can be extended and enums that cannot (mixes payload and usage). This distinction already exists today in C++ (inheriting a class without a virtual destructor) and has proven to be a pain point of the language; it introduces a split in the language ecosystem between those structs that can be extended and those that cannot. On the contrary, this RFC emphasizes that every existing trait and struct can be reused, and no foresight is necessary when designing new ones. It is somewhat less ambitious, as it does not attempt any large scale changes to the language beyond fulfilling the given requirements, but at the same time implements up-casting and down-casting for existing traits.
Compared to associated fields (#250), this RFC does not inject data in traits (mixes payload and behaviour). It neatly sidesteps the issue of splitting the ecosystem into stateful traits and stateless traits, and therefore guarantees that traits can be shared between any library, in any direction.
Compared to the associated fields (#250), this RFC's approach to fields is both cheaper than the indirect fields approach (with its required offset in v-table per field) and less constrained than the #[repr(fixed)] approach (which precludes implementing two fixed traits with contradicting requirements). It also does not require the compiler to try and guarantee the non-aliasing of fields. On the other hand, it is obviously less flexible given its conservative choice (no renaming/re-arrangement).
Compared to the associated fields (#250), this RFC's approach does not require that common fields be public, which is a violation of encapsulation. The struct can define methods with exclusive access to its fields, guaranteeing the invariants of its choice, and because those methods are not polymorphic they can be easily inlined. Still, if desired, its fields can be public. Note that this RFC does require that the Parent/Child relationship be public, as it is exposed via traits.
Compared to the Internal Vtable (#250), this RFC once again avoids enforcing that a struct or trait only be usable in a particular way (mixes payload and usage). This allows using either the struct or trait in other contexts, where this particular representation would be less attractive (it is known that LLVM has issues with devirtualizing calls through internal v-pointers, for example).
Unresolved Questions
- Could DownCastRef and UpCastRef be supplanted by implementing the regular DownCast/UpCast on references instead?
- How to provide a safe ?Sized type in the absence of Custom DST? Is it even possible?
- What does "sharing methods between definitions" mean, exactly?