Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for RFC #1909: Unsized Rvalues #48055

Open
aturon opened this issue Feb 7, 2018 · 52 comments

Comments

Projects
None yet
@aturon
Copy link
Member

commented Feb 7, 2018

This is a tracking issue for the RFC "Unsized Rvalues " (rust-lang/rfcs#1909).

Steps:

Related bugs:

  • #61335 -- ICE when combined with async-await

Unresolved questions:

  • How can we mitigate the risk of unintended unsized or large allocas? Note that the problem already exists today with large structs/arrays. A MIR lint against large/variable stack sizes would probably help users avoid these stack overflows. Do we want it in Clippy? rustc?

  • How do we handle truely-unsized DSTs when we get them? They can theoretically be passed to functions, but they can never be put in temporaries.

  • Decide on a concrete syntax for VLAs.

@Aaron1011

This comment has been minimized.

Copy link
Contributor

commented Feb 20, 2018

How do we handle truely-unsized DSTs when we get them?

@aturon: Are you referring to extern type?

@aturon

This comment has been minimized.

Copy link
Member Author

commented Feb 20, 2018

@Aaron1011 that was copied straight from the RFC. But yes, I presume that's what it's referring to.

@ldr709

This comment has been minimized.

Copy link
Contributor

commented Feb 28, 2018

Why would unsized temporaries ever be necessary? The only way it would make sense to pass them as arguments would be by fat pointer, and I cannot think of a situation that would require the memory to be copied/moved. They cannot be assigned or returned from functions under the RFC. Unsized local variables could also be treated as pointers.

In other words, is there any reason why unsized temporary elision shouldn't be always guaranteed?

@F001

This comment has been minimized.

Copy link
Contributor

commented May 11, 2018

Is there any progress on this issue?
I'm trying to implement VLA in the compiler. For the AST and HIR part, I added a new enum member for syntax::ast::ExprKind::Repeat and hir::Expr_::ExprRepeat to save the count expression as below:

enum RepeatSyntax { Dyn, None }
syntax::ast::ExprKind::Repeat(P<Expr>, P<Expr>, RepeatSyntax)

enum RepeatExprCount {
  Const(BodyId),
  Dyn(P<Expr>),
}
hir::Expr_::ExprRepeat(P<Expr>, RepeatExprCount)

But for the MIR part, I have no idea how to construct a correct MIR. Should I update the structure of mir::RValue::Repeat and corresponding trans_rvalue function? What should they look like? What is the expected LLVM-IR?

Thanks in advance if someone would like to write a simple mentoring instruction.

@qnighy

This comment has been minimized.

Copy link
Contributor

commented May 26, 2018

I'm trying to remove the Sized bounds and translate MIRs accordingly.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

commented Jul 14, 2018

An alternative that would solve both of the unresolved questions would be explicit &move references. We could have an explicit alloca! expression that returns &move T, and truly unsized types work with &move T because it is just a pointer.

If I remember correctly, the main reason for this RFC was to get dyn FnOnce() to be callable. Since FnOnce() is not implementable in stable Rust, would it be a backward-compatible change to make FnOnce::call_once take &move Self instead? If that was the case, then we could make &move FnOnce() be callable, as well as Box<FnOnce()> (via DerefMove).

cc @arielb1 (RFC author) @qnighy (currently implementing this RFC in #51131) @eddyb (knows a lot about this stuff)

@eddyb

This comment has been minimized.

Copy link
Member

commented Jul 14, 2018

@mikeyhew There's not really much of a problem with making by-value self work and IMO it's more ergonomic anyway. We might eventually even have DerefMove without &move at all.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

commented Jul 20, 2018

@eddyb

I guess I can see why people think it's more ergonomic: in order to opt into it, you just have to add ?Sized to your function signature, or in the case of trait methods, do nothing. And maybe it will help new users of the language, since &move wouldn't be show up in documentation everywhere.

If we're going to go ahead with this implicit syntax, then there are a few details that would be good to nail down:

  • If this is syntactic sugar for &move references, what does it desugar too? For function arguments, this could be pretty straightforward: the lifetime of the reference would be limited to the function call, and if you want to extend it past that, you'd have to use explicit &move references. So

    fn call_once(f: FnOnce(i32))) -> i32

    desugars too

    fn call_once(f: &move FnOnce(i32)) -> i32

    and you can call the function directly on its argument, so foo(|x| x + 1) desugars to foo(&move (|x| x + 1)).

    And to do something fancier, you'd have to resort to the explicit version:

    fn make_owned_pin<'a, T: 'a + ?Sized>(value: &'a move T) -> PinMove<'a, T> { ... }
    
    struct Thunk<'a> {
        f: &'a move FnOnce()
    }

    Given the above semantics, DerefMove could be expressed using unsized rvalues, as you said:

    EDIT: This is kind of sketchy though. What happens if the implementation is wrong, and doesn't call f?

    // this is the "closure" version of DerefMove. The alternative would be to have an associated type
    // `Cleanup` and return `(Self::Target, Self::Cleanup)`, but that wouldn't work with unsized
    // rvalues because you can't return a DST by value
    fn deref_move<F: FnOnce(Self::Target) -> O, O>(self, f: F) -> O;
    
    // explicit form
    fn deref_move<F: for<'a>FnOnce(&'a move Self::Target) -> O, O>(&'a move self, f: F) -> O;

    I should probably write an RFC for this.

  • When do there need to be implicit allocas? I can't actually think of a case where an implicit alloca would be needed. Any function arguments would last as long as the function does, and wouldn't need to be alloca'd. Maybe something involving stack-allocated dynamic arrays, if they are returned from a block, but I'm pretty sure that's explicitly disallowed by the RFC.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

commented Jul 20, 2018

@eddyb have you seen @alercah's RFC for DerefMove? rust-lang/rfcs#2439

bors added a commit that referenced this issue Aug 19, 2018

Auto merge of #51131 - qnighy:unsized-locals, r=eddyb
Implement Unsized Rvalues

This PR is the first step to implement RFC1909: unsized rvalues (#48055).

## Implemented

- `Sized` is removed for arguments and local bindings. (under `#![feature(unsized_locals)]`)
- Unsized locations are allowed in MIR
- Unsized places and operands are correctly translated at codegen

## Not implemented in this PR

- Additional `Sized` checks:
  - tuple struct constructor (accidentally compiles now)
  - closure arguments at closure generation (accidentally compiles now)
  - upvars (ICEs now)
- Generating vtable for `fn method(self)` (ICEs now)
- VLAs: `[e; n]` where `n` isn't const
- Reduce unnecessary allocations

## Current status

- [x] Fix `__rust_probestack` (rust-lang-nursery/compiler-builtins#244)
  - [x] Get the fix merged
- [x] `#![feature(unsized_locals)]`
  - [x] Give it a tracking issue number
- [x] Lift sized checks in typeck and MIR-borrowck
  - [ ] <del>Forbid `A(unsized-expr)`</del> will be another PR
- [x] Minimum working codegen
- [x] Add more examples and fill in unimplemented codegen paths
- [ ] <del>Loosen object-safety rules (will be another PR)</del>
- [ ] <del>Implement `Box<FnOnce>` (will be another PR)</del>
- [ ] <del>Reduce temporaries (will be another PR)</del>
@qnighy

This comment has been minimized.

Copy link
Contributor

commented Aug 20, 2018

As a next step, I'll be working on trait object safety.

@alexreg

This comment has been minimized.

Copy link
Contributor

commented Aug 22, 2018

@mikeyhew Sadly @alercah just postponed their DerefMove RFC, but I think a separate RFC for &move that complements that (when it does get revived) would be very much desirable. I would be glad to assist with that even, if you're interested.

@mikeyhew

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2018

@alexreg I would definitely appreciate your help, if I end up writing an RFC for &move.

The idea I have so far is to treat unsized rvalues as a sort of sugar for &move references with an implicit lifetime. So if a function argument has type T, it will be either be passed by value (if T is Sized) or as a &'a move T, and the lifetime 'a of the reference will outlive the function call, but we can't assume any more than that. For an unsized local variable, the lifetime would be the variable's scope. If you want something that lives longer than that, e.g. you want to take an unsized value and return it, you'd have to use an explicit &move reference so that the borrow checker can make sure it lives long enough.

@alexreg

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2018

@mikeyhew That sounds like a reasonable approach to me. Has anyone specified the supposed semantics of &move yet, even informally? (Also, I'm not sure if bikeshedding on this has already been done, but we should probably consider calling it &own.)

@eddyb

This comment has been minimized.

Copy link
Member

commented Aug 25, 2018

Not sure if this is the right place to document this, but I found a way to make a subset of unsized returns (technically, all of them, given a T -> Box<T> lang item) work without ABI (LLVM) support:

  • only Rust ABI functions can return unsized types
  • instead of passing a return pointer in the call ABI, we pass a return continuation
    • we can already pass unsized values to functions, so if we could CPS-convert Rust functions (or wanted to), we'd be done (at the cost of a stack that keeps growing)
    • @nikomatsakis came up with something similar (but only for Box) a few years ago
  • however, only the callee (potentially a virtual method) needs to be CPS-like, and only in the ABI, the callers can be restricted and/or rely on dynamic allocation, not get CPS-transformed
  • while Clone becoming object-safe is harder, this is an alright starting point:
// Rust definitions
trait CloneAs<T: ?Sized> {
    fn clone_as(&self) -> T;
}
impl<T:  Trait + Clone> CloneAs<dyn Trait> for T {
    fn clone_as(&self) -> dyn Trait { self.clone() }
}
trait Trait: CloneAs<dyn Trait> {}
// Call ABI signature for `<dyn Trait as CloneAs<dyn Trait>>::clone_as`
fn(
     // opaque pointer passed to `ret` as the first argument
    ret_opaque: *(),
    // called to return the unsized value
    ret: fn(
        // `ret_opaque` from above
        opaque: *(),
        // the `dyn Trait` return value's components
        ptr: *(), vtable: *(),
    ) -> (),
    // `self: &dyn Trait`'s components
    self_ptr: *(), self_vtable: *(),
) -> ()
  • the caller would use the ret_opaque pointer to pass one or more sized values to its stack frame
    • could allow ret return one or two pointer-sized values, but that's an optional optimization
  • we can start by allowing composed calls, of this MIR shape:
y = call f(x); // returns an unsized value
z = call g(y); // takes the unsized value and returns a sized one
// by compiling it into:
f(&mut z, |z, y| { *z = call g(y); }, x)
  • this should work out of the box for {Box,Rc,...}::new(obj.clone_as())
  • while we could extract entire portions of the MIR into these "return continuations", that's not necessary for being able to express most things: worst case, you write a separate function
  • since Box::new works, anything with a global allocator around could fall back to that
    • let y = f(x); would work as well as let y = *Box::new(f(x));
    • its cost might be a bit high, but so would that of a proper "unsized return" ABI
  • we can, at any point, switch to an ABI where e.g. the value is copied onto the caller's stack, effectively "extending it on return", and there shouldn't be any observable differences

cc @rust-lang/compiler

@mikeyhew

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2018

@alexreg

Has anyone specified the supposed semantics of &move yet, even informally?

I don't think it's been formally specified. Informally, &'a move T is a reference that owns its T. It's like

  • an &'a mut T that owns the T instead of mutably borrowing it, and therefore drops the T when dropped, or
  • a Box<T> that is only valid for the lifetime 'a, and doesn't free heap allocated memory when dropped (but still drops the T).

(Also, I'm not sure if bikeshedding on this has already been done, but we should probably consider calling it &own.)

Don't think that bikeshed has been painted yet. I guess &own is better. It requires a new keyword, but afaik it can be a contextual keyword, and it more accurately describes what is going on. Often times you would use it to avoid moving something in memory, so calling it &move T would be confusing, and plus there's the problem of &move ||{}, which looks like &move (||{}) but would have to mean & (move ||{}) for backward compatibility.

@F001

This comment has been minimized.

Copy link
Contributor

commented Nov 8, 2018

Since unsized values can be used in more circumstances, should we add T: ?Sized bound for more types in std? For example:

pub enum Option<T: ?Sized>{}
pub enum Result<T: ?Sized, E> {}
pub struct AssertUnwindSafe<T: ?Sized>(T);
@eddyb

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

@F001 Unsized enums are disallowed today and part of it has to do with the fact that all enum layout optimizations would have to either be disabled or reified into the vtable.

@stjepang

This comment has been minimized.

Copy link
Contributor

commented Nov 8, 2018

While playing with the unsized_locals feature, I ran into a stumbling block: mem::forget() and mem::drop() don't accept T: ?Sized.

What's the plan for allowing ?Sized types in these and related functions?

@eddyb

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

@stjepang sounds like an oversight, feel free to open a PR to relax those bounds.
EDIT: we probably need to go around and see what APIs can benefit from this now.

@Centril

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

@eddyb shouldn't we wait until unsized_locals is stabilized before doing that?

@cramertj

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

@Centril It seems unlikely that we'd see anyone relying on them accepting ?Sized on stable without using the unsized_locals feature, but I suppose it is technically possible to do so.

@eddyb

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

@cramertj I don't see how, moves aren't shouldn't be allowed out of !Sized places, on stable.
(looks like my intuition was wrong, and might have some bugs, see #55785 (comment))

@cramertj

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

@eddyb You can observe that the function can be constructed with those type params, just not apply them:

#![feature(unsized_locals)] // for the fn declaration

fn my_drop<T>(_: T) {}
fn my_unsized_drop<T: ?Sized>(_: T) {}
trait Trait {}

fn main() {
    let _ = my_drop::<()>;
    // let _ = my_drop::<dyn Trait>; // This one won't compile
    let _ = my_unsized_drop::<()>;
    let _ = my_unsized_drop::<dyn Trait>; // This will compile
}
@qnighy

This comment has been minimized.

Copy link
Contributor

commented Nov 9, 2018

As I said in #51131 (comment), I've once attempted to add Sizedness check for function arguments (it would solve #50940 too); that, however, turned out to be difficult. The problem case was something like this:

pub trait Pattern<'a> {
    type Searcher;
}

fn clone<'a, P: Pattern<'a>>(this: &MatchesInternal<'a, P>) -> MatchesInternal<'a, P> where P::Searcher: Clone {
    MatchesInternal(this.0.clone())
}

pub struct MatchesInternal<'a, P: Pattern<'a>>(P::Searcher);

fn main() {}

Here P::Searcher: Clone implies P::Searcher: Sized. As bounds take precedence in trait selection, during type inference, P::Searcher arbitrarily appears in places where Sized is demanded. That (in combination with the additional Sized bounds) broke compilation of this simple code.

Therefore, I'm thinking of adding an independent pass (or adding a check to an existing pass) after typeck to ensure Sizedness (although I have other things to do regarding unsized rvalues).

pietroalbini added a commit to pietroalbini/rust that referenced this issue Nov 25, 2018

Rollup merge of rust-lang#56045 - qnighy:additional-sizedness, r=cram…
…ertj

Check arg/ret sizedness at ExprKind::Path

This PR solves three problems:

- rust-lang#50940: ICE on casting unsized tuple struct constructors
- Unsized tuple struct constructors were callable in presence of `unsized_locals`.
- rust-lang#48055 (comment): we cannot relax `Sized` bounds on stable functions because of fn ptr casting

These are caused by lack of `Sized`ness checks for arguments/retvals at **reference sites of `FnDef` items** (not call sites of the functions). Therefore we can basically add more `Sized` obligations on typeck. However, adding `Sized` obligations arbitrarily breaks type inference; to prevent that I added a new method `require_type_is_sized_deferred` which doesn't interfere usual type inference.
@earthengine

This comment has been minimized.

Copy link

commented Jan 15, 2019

(moved from internals)

I recently make a lot of mistakes by missing impl in function signatures. Almost everytime I was protected by the fact that unsized values cannot appear in function signatures, either argument or return position.

For myself, I would turn on #![deny(bare_trait_objects)] to gain more protection.

I think this could happen to others as well, so I guess unless we prohibited raw Trait appear in function sigtures, stablizing unsized_locals would make things worse as we lost protections. Maybe when it lands it is a good time to also make #![deny(bare_trait_objects)] default?

@alexreg

This comment has been minimized.

Copy link
Contributor

commented Jan 16, 2019

@earthengine I agree. I wish we had implemented this and made #![deny(bare_trait_objects)] default for the 2018 edition, in fact! It's too late now, sadly (until the next edition, at least).

@alercah

This comment has been minimized.

Copy link
Contributor

commented Jan 17, 2019

Could it be made deny-by-default only in those places where it was previously disallowed, so that no currently-invalid code becomes accepted?

@alexreg

This comment has been minimized.

Copy link
Contributor

commented Jan 17, 2019

@alercah That sounds complicated due to the way the linting system currently works... maybe possible though. @arielb1, any thoughts?

@alercah

This comment has been minimized.

Copy link
Contributor

commented Jan 17, 2019

Maybe make it two separate lints?

@alexreg

This comment has been minimized.

Copy link
Contributor

commented Jan 17, 2019

Yes, that's the obvious solution... seems kind of ugly though?

@alercah

This comment has been minimized.

Copy link
Contributor

commented Jan 17, 2019

Could they be pseudonamespaced like clippy lints? #[deny(bare_trait_objects::all)] or #[allow(bare_trait_objects::unsize)]

@alexreg

This comment has been minimized.

Copy link
Contributor

commented Jan 17, 2019

That's an interesting idea. I don't know, but @arielb1 probably does.

@varkor

This comment has been minimized.

Copy link
Member

commented Jan 17, 2019

It's possible to have lint groups such that bare_trait_objects implies bare_trait_objects_as_unsized_rvalues, for instance. That might be a cleaner solution.

bors added a commit that referenced this issue Feb 11, 2019

Auto merge of #55431 - qnighy:boxed-closure-impls, r=cramertj
Unsized rvalues: implement boxed closure impls.

This pull request contains **`boxed_closure_impls`** that provides long-hoped three impls:

```rust
impl<A, F: FnOnce<A> + ?Sized> FnOnce<A> for Box<F> { .. }
impl<A, F: FnMut<A> + ?Sized> FnMut<A> for Box<F> { .. }
impl<A, F: Fn<A> + ?Sized> Fn<A> for Box<F> { .. }
```

This has been blocked by several reasons; see `FnBox` #28796 for details.

Now that #48055 is (partly) implemented, we're ready to introduce the above impls, replacing existing `FnBox` workarounds.

There are two major concerns, however.

## Major concern 1: instability

I think these impls should be introduced as an unstable feature first, mainly because it relies on unsized rvalues. However, `impl` itself can't stand as unstable; all existing unstable features are tied with functions, methods, or types. I tried putting `#[unstable]` on the impls but that didn't work. <del>**I'll mark this PR as [WIP] until it is resolved**</del>. I'll just remove `[WIP]` and let the compiler team decide how to deal with the instability.

## Major concern 2: compatibility with `FnBox`

I'm not really sure, but `FnBox` may be widely used in the nightly world. Although unstable features may regress, it's good if there's a path for gradual migration. It can be done in the following way:

- Make `FnBox` a subtrait of `FnOnce`. This ensures that `dyn FnBox` implements `FnOnce`.
- Make use of specialization to avoid overlap between `FnOnce` impls for `Box<impl FnOnce>` and `Box<dyn FnBox>`.

I believe that this minimizes breakage of crates that use `FnBox`.

## Minor concern: feature name and tracking issue

I currently assign `boxed_closure_impls` as the name and #48055 as the tracking issue. I'll prepare a separate tracking issue when the Major Concern 1 is resolved.

bors added a commit that referenced this issue Feb 12, 2019

Auto merge of #55431 - qnighy:boxed-closure-impls, r=cramertj
Unsized rvalues: implement boxed closure impls.

This pull request contains **`boxed_closure_impls`** that provides long-hoped three impls:

```rust
impl<A, F: FnOnce<A> + ?Sized> FnOnce<A> for Box<F> { .. }
impl<A, F: FnMut<A> + ?Sized> FnMut<A> for Box<F> { .. }
impl<A, F: Fn<A> + ?Sized> Fn<A> for Box<F> { .. }
```

This has been blocked by several reasons; see `FnBox` #28796 for details.

Now that #48055 is (partly) implemented, we're ready to introduce the above impls, replacing existing `FnBox` workarounds.

There are two major concerns, however.

## Major concern 1: instability

I think these impls should be introduced as an unstable feature first, mainly because it relies on unsized rvalues. However, `impl` itself can't stand as unstable; all existing unstable features are tied with functions, methods, or types. I tried putting `#[unstable]` on the impls but that didn't work. <del>**I'll mark this PR as [WIP] until it is resolved**</del>. I'll just remove `[WIP]` and let the compiler team decide how to deal with the instability.

## Major concern 2: compatibility with `FnBox`

I'm not really sure, but `FnBox` may be widely used in the nightly world. Although unstable features may regress, it's good if there's a path for gradual migration. It can be done in the following way:

- Make `FnBox` a subtrait of `FnOnce`. This ensures that `dyn FnBox` implements `FnOnce`.
- Make use of specialization to avoid overlap between `FnOnce` impls for `Box<impl FnOnce>` and `Box<dyn FnBox>`.

I believe that this minimizes breakage of crates that use `FnBox`.

## Minor concern: feature name and tracking issue

I currently assign `boxed_closure_impls` as the name and #48055 as the tracking issue. I'll prepare a separate tracking issue when the Major Concern 1 is resolved.

bors added a commit that referenced this issue Feb 12, 2019

Auto merge of #55431 - qnighy:boxed-closure-impls, r=cramertj
Unsized rvalues: implement boxed closure impls.

This pull request contains **`boxed_closure_impls`** that provides long-hoped three impls:

```rust
impl<A, F: FnOnce<A> + ?Sized> FnOnce<A> for Box<F> { .. }
impl<A, F: FnMut<A> + ?Sized> FnMut<A> for Box<F> { .. }
impl<A, F: Fn<A> + ?Sized> Fn<A> for Box<F> { .. }
```

This has been blocked by several reasons; see `FnBox` #28796 for details.

Now that #48055 is (partly) implemented, we're ready to introduce the above impls, replacing existing `FnBox` workarounds.

There are two major concerns, however.

## Major concern 1: instability

I think these impls should be introduced as an unstable feature first, mainly because it relies on unsized rvalues. However, `impl` itself can't stand as unstable; all existing unstable features are tied with functions, methods, or types. I tried putting `#[unstable]` on the impls but that didn't work. <del>**I'll mark this PR as [WIP] until it is resolved**</del>. I'll just remove `[WIP]` and let the compiler team decide how to deal with the instability.

## Major concern 2: compatibility with `FnBox`

I'm not really sure, but `FnBox` may be widely used in the nightly world. Although unstable features may regress, it's good if there's a path for gradual migration. It can be done in the following way:

- Make `FnBox` a subtrait of `FnOnce`. This ensures that `dyn FnBox` implements `FnOnce`.
- Make use of specialization to avoid overlap between `FnOnce` impls for `Box<impl FnOnce>` and `Box<dyn FnBox>`.

I believe that this minimizes breakage of crates that use `FnBox`.

## Minor concern: feature name and tracking issue

I currently assign `boxed_closure_impls` as the name and #48055 as the tracking issue. I'll prepare a separate tracking issue when the Major Concern 1 is resolved.
@earthengine

This comment has been minimized.

Copy link

commented Feb 14, 2019

Just found that the Into trait have an implicit Sized bound.
It is for sure #![feature(unsized_local)] will enable calling into() on unsized objects. So shall we relax this by adding a T: ?Sized bound? Would there be any compatibility issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.