- Start Date: 2014-06-02
- RFC PR #: (leave this empty)
- Rust Issue #: (leave this empty)
Summary
Allow functions to return unboxed abstract types, written impl Trait for
"some hidden type T such that T: Trait".
These types specify a trait bound, but hide the concrete type implementing the bound. Unlike trait objects, but like generics, the actual type of an unboxed abstract type is statically known and sized, and will thus generate statically-dispatched code.
Optionally, allow unboxed abstract type syntax to be used elsewhere (function arguments, structs) as a shorthand for generics.
Motivation
In today's Rust, you can write a function signature like
fn consume_iter_static<I: Iterator<u8>>(iter: I)
fn consume_iter_dynamic(iter: Box<Iterator<u8>>)In both cases, the function does not depend on the exact type of the argument. The type is held "abstract", and is assumed only to satisfy a trait bound.
- In the
_staticversion using generics, each use of the function is specialized to a concrete, statically-known type, giving static dispatch, inline layout, and other performance wins. - In the
_dynamicversion using trait objects, the concrete argument type is only known at runtime using a vtable.
On the other hand, while you can write
fn produce_iter_dynamic() -> Box<Iterator<u8>>you cannot write something like
fn produce_iter_static() -> Iterator<u8>That is, in today's Rust, abstract return types can only be written using trait objects, which
can be a significant performance penalty. This RFC proposes "unboxed abstract
types" as a way of achieving signatures like produce_iter_static. Like
generics, unboxed abstract types guarantee static dispatch and inline data
layout.
Here are some problems that unboxed abstract types solve or mitigate:
-
Returning unboxed closures. The ongoing work on unboxed closures expresses closures using traits. Sugar for closures generates an anonymous type implementing a closure trait. Without unboxed abstract types, there is no way to use this sugar while returning the resulting closure unboxed, because there is no way to write the name of the generated type.
-
Leaky APIs. Functions can easily leak implementation details in their return type, when the API should really only promise a trait bound. For example, a function returning
Rev<Splits<'a, u8>>is revealing exactly how the iterator is constructed, when the function should only promise that it returns some type implementingIterator<u8>. Using newtypes/structs with private fields helps, but is extra work. Unboxed abstract types make it as easy to promise only a trait bound as it is to return a concrete type. -
Complex types. Use of iterators in particular can lead to huge types:
Chain<Map<'a, (int, u8), u16, Enumerate<Filter<'a, u8, vec::MoveItems<u8>>>>, SkipWhile<'a, u16, Map<'a, &u16, u16, slice::Items<u16>>>>
Even when using newtypes to hide the details, the type still has to be written out, which can be very painful. Unboxed abstract types only require writing the trait bound.
-
Documentation. In today's Rust, reading the documentation for the
Iteratortrait is needlessly difficult. Many of the methods return new iterators, but currently each one returns a different type (Chain,Zip,Map,Filter, etc), and it requires drilling down into each of these types to determine what kind of iterator they produce.
In short, unboxed abstract types make it easy for a function signature to promise nothing more than a trait bound, and do not generally require the function's author to write down the concrete type implementing the bound.
Detailed design
Note: this design borrows from
The basic idea is to allow code like the following:
pub fn produce_iter_static() -> impl Iterator<int> {
range(0, 10).rev().map(|x| x * 2).skip(2)
}where impl Iterator<int> should be understood as "some type T such that T: Iterator<int>. Notice that the function author does not have to write down any
concrete iterator type, nor does the function's signature reveal those details
to its clients. But the type promises that there exists some concrete type.
This code is roughly equivalent to
pub struct Result_produce_iter_static(
iter::Skip<iter::Map<'static,int,int,iter::Rev<iter::Range<int>>>>
);
impl Iterator<int> for Result_produce_iter_static {
fn next(&mut self) -> Option<int> {
match *self {
Result_produce_iter_static(ref mut r) => r.next()
}
}
}
pub fn produce_iter_static() -> Result_produce_iter_static {
Result_produce_iter_static(
range(0, 10).rev().map(|x| x * 2).skip(2)
)
}That is, using an unboxed abstract type is semantically equivalent to introducing an anonymous newtype where:
- the representation is private
- the newtype implements the given trait bound by delegating to the representation
An unboxed abstract return type impl Trait has several implications for the compiler:
-
Typechecking the function: The compiler should check that the inferred actual return type
Tof the function satisfies the trait boundTrait. This is a local check. -
Typechecking the clients: The compiler should treat each use of
impl Traitas a fresh unknown type that satisfies the boundTrait. Thus, function clients cannot depend on the concrete return type, only the bound, which retains local typechecking. -
Code generation: The compiler should record the concrete return type
Tfor the function for the purposes of specialization. Clients that use the return type should generate the same code as if they has used the concrete typeT, despite that typechecking prevents them from relying on anything aboutTexcept the trait boundTrait. (This can be viewed as a form of monomorphization for abstract types, which stand for a single type, in contrast to generics (universal types), which stand for a family of types.)
So, for example, given the function signatures
fn produce_iter_static() -> impl Iterator<int>
fn consume_iter_static<I: Iterator<u8>>(iter: I);the expression
consume_iter_static(produce_iter_static())should typecheck, and should compile using static dispatch. On the other hand,
let iter: iter::Skip<_> = produce_iter_static();should fail to typecheck, since it relies on the concrete return type of
produce_iter_static, which is hidden behind the unboxed abstract return
type.
The subsections below fill in several details necessary for implementing this high-level design.
Generics and lifetime parameters
A function returning an unboxed abstract type might also use generic types or lifetime parameters:
impl<T> SomeCollectionType<T> {
fn iter<'a>(&'a self) -> impl Iterator<&'a T> { ... }
}In general, the concrete return type may be parameterized by any type or lifetime variable in scope. These parameters may also be mentioned in the trait bound for the unboxed abstract type.
Just as is currently done for trait objects, the typechecker must ensure that lifetime parameters are not stripped when using an unboxed abstract type. For example (adapted from @glaebhoerl):
fn evil<T, Iter: Iterator<T>>(iter: Iter) -> Box<Iterator<T>> {
box iter as Box<Iterator<T>>
}generates the error "value may contain references; add 'static bound to
Iter". Since unboxed abstract types are the statically-dispatched equivalent
to returning trait objects, the same logic should apply:
fn evil<T, Iter: Iterator<T>>(iter: Iter) -> Box<impl Iterator<T>> {
box iter as Box<Iterator<T>>
}should not be allowed for the same reason.
Possible add-on: unboxed abstract types in non-return positions
Note: this is an optional feature.
So far, the RFC has assumed that the new use of impl for unboxed abstract types
is only allowed for the return type of a function. In principle, it makes sense
to allow unboxed abstract types anywhere that a type is allowed.
Note: the extensions below are somewhat orthogonal. For example, the
extension to include function arguments does not depend on including struct
types.
Function arguments
Instead of writing
fn extend<I: Iterator<T>>(&mut self, iterator: I)we could write
fn extend(&mut self, iterator: impl Iterator<T>)These two signatures are almost equivalent, but the first has an explicit type
argument (which can be provided using :: syntax) while the second has an
implicit type argument.
Using unboxed abstract types in arguments makes (simple) static and dynamic dispatch syntactically closer:
fn extend_static(&mut self, iterator: impl Iterator<T>)
fn extend_dynamic(&mut self, iterator: &Iterator<T>)It may be especially useful for passing unboxed closures as arguments in a pleasant way:
fn use_fn<T: Fn<u8, ()>>(f: T)versus
fn use_fn(f: impl Fn<u8, ()>)Nested results
Rather than only supporting
pub fn produce_iter_static() -> impl Iterator<int>unboxed abstract types could also be permitted in nested form:
pub fn produce_iter_box() -> Box<impl Iterator<int>>
pub fn produce_iters() -> (impl Iterator<int>, impl Iterator<int>)Each use of impl would mark a distinct abstract type, so produce_iters
could provide different concrete iterator types for the first and second
components of the tuple.
Structs and other compound types
Note: the motivation for this extension is purely consistency —
allowing impl Trait wherever a type is allowed. There is no known concrete
use case, which is perhaps a strong argument against including the
feature. (Personally, I think this is probably a bad idea.)
If we truly want to allow impl Trait everywhere, that would include struct fields as well.
struct Foo {
s: impl Set<u8>;
}would be equivalent to
struct Foo<T: Set<u8>> {
s: T;
}except that the type argument would be treated as implicit, meaning that you would write
fn use_foo(f: &Foo) { ... }and the function use_foo would itself implicitly parameterize over the
Set<u8> implementation. Similarly,
fn make_foo() -> Foo { ... }would implicitly treat the Set<u8> field as an abstract type, which can be
implemented by an arbitrary (but hidden) concrete type.
Possible add-on: type equality and Self
Note: this is an optional feature.
Although the clients of an unboxed abstract types do not know its concrete type, they can rely on there being some consistent concrete type. For example, given a function
fn collect_to_set<T, I: Iterator<T>>(iter: I) -> impl Set<T>the code
let s1 = collect_to_set(iter1);
let s2 = collect_to_set(iter2);
s1.is_subset(s2)should typecheck as long as iter1 and iter2 are iterators over the same
type, since the underlying concrete type implementing Set<T> is guaranteed to
be the same.
More generally, the typechecker could treat each unboxed abstract type as implicitly parameterized over all of the in-scope type and lifetime variables. Two uses of the same existential type with the same parameters should be treated as the same type. In terms of the newtype expansion suggested earlier, this is the equivalent of having the newtype take all of the in-scope type and lifetime parameters.
Staging
This RFC can be accepted and/or implemented in stages, in the following order of priority:
- Unboxed abstract return types, by far the most important use-case
- Unboxed abstract argument types
- Unboxed abstract types in structs/enums/tuples
- Equality/Self for unboxed abstract types
The first two bullets would affect library API stabilization; the others are nice-to-haves.
Drawbacks
The main downside is complexity. It can already be difficult to understand the distinction between generics and trait objects, and this RFC proposes another generics-like mechanism for unboxed abstract return types. This drawback is discussed in more detail in the following section on Alternatives.
Other drawbacks are specific to the optional add-ons. In particular, adding
impl Trait fields to structs adds a somewhat scary form of implicitness,
since code using the struct will be silently monomorphized without any
explicit use of generics.
Alternatives
The proposal here, with all the add-ons, ultimately allows impl Trait to be
used wherever a type can be used. This is partly for consistency and partly as
a lighter weight way to write generics. The downside is that we would be adding
yet another distinction to the type system, and quite a bit of implicit
parameterization/monomorphization.
A very conservative alternative
A more targeted approach would be to just tackle return types, and try to make them look more akin to generics:
fn produce_iter_static() -> _ : Iterator<u8>This more conservative proposal would be easier to implement, and would add less complexity to the language, while still addressing the primary use case for the RFC.
A somewhat conservative alternative
Finally, a midpoint would be to use impl Trait but restrict it to function
signatures, not allowing it in struct fields. That design:
- retains the benefit of syntactically lightweight static dispatch
- keeps the nice property that, by looking at a function signature alone, it is possible to tell exactly where monomorphization will happen.
This design might also make it possible for the implicit type parameters for
impl Trait to be provided using :: syntax.
Unresolved questions
Multiple trait bounds
The design should definitely be generalized to support multiple trait bounds
(impl Trait1 + Trait2), since it is common for a return type to implement
several traits of interest. (Iterator return types are a good example). This
should not be any harder to implement than the basic design, but the syntax may
need some thought.
Naming an abstract return type
The above proposal for type equality allows multiple uses of the same
existential to be treated as equal types. However, there is no way to name
that concrete type, so it cannot be stored in a struct or passed as an
argument.
Note: this is different from using impl Trait in a struct field, which
allows any type implementing Trait. What we cannot do is have a struct
field whose type is "whatever concrete type function foo returns".
Thinking again of the signature
fn collect_to_set<T, I: Iterator<T>>(iter: I) -> impl Set<T>we could allow naming the concrete result type by a path like
collect_to_set::<T, I>::impl. The only way to get a value of this type is by
calling collect_to_set. Thinking in terms of the newtype encoding of unboxed
abstract types, this is just a way to name the concrete newtype used.
This should be considered as part of any design for associated types, which also involve paths to types.