Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

`#[may_dangle]`, a refined dropck escape hatch (tracking issue for RFC 1327) #34761

Open
pnkfelix opened this Issue Jul 11, 2016 · 24 comments

Comments

Projects
None yet
@pnkfelix
Copy link
Member

pnkfelix commented Jul 11, 2016

Tracking issue for rust-lang/rfcs#1327

  • Allow attributes on lifetime/type formal parameters (generic_param_attrs feature): #34764
  • Add #[may_dangle] attribute for lifetime/type formal parameters on unsafe impl<...> Drop (dropck_eyepatch feature)
  • Replace uses of #[unsafe_destructor_blind_to_params] attribute in libstd with #[may_dangle] #38664
  • Update nomicon to use #[may_dangle] instead of #[unsafe_destructor_blind_to_params] #39196
  • Remove support for #[unsafe_destructor_blind_to_params] attribute (with warning cycle before removal).
    • Deprecated in #38970
    • need to replace remaining references (e.g. in test suite) to #[unsafe_destructor_blind_to_params] with [may_dangle] before we can actually remove the former.

Manishearth added a commit to Manishearth/rust that referenced this issue Sep 30, 2016

Manishearth added a commit to Manishearth/rust that referenced this issue Oct 1, 2016

@pnkfelix pnkfelix self-assigned this Oct 3, 2016

@pnkfelix

This comment has been minimized.

Copy link
Member Author

pnkfelix commented Oct 10, 2016

see also #28498

eddyb added a commit to eddyb/rust that referenced this issue Oct 18, 2016

Rollup merge of rust-lang#37117 - pnkfelix:may-dangle-attr, r=nikomat…
…sakis

`#[may_dangle]` attribute

`#[may_dangle]` attribute

Second step of rust-lang#34761. Last big hurdle before we can work in earnest towards Allocator integration (rust-lang#32838)

Note: I am not clear if this is *also* a syntax-breaking change that needs to be part of a breaking-batch.

eddyb added a commit to eddyb/rust that referenced this issue Oct 19, 2016

Rollup merge of rust-lang#37117 - pnkfelix:may-dangle-attr, r=nikomat…
…sakis

`#[may_dangle]` attribute

`#[may_dangle]` attribute

Second step of rust-lang#34761. Last big hurdle before we can work in earnest towards Allocator integration (rust-lang#32838)

Note: I am not clear if this is *also* a syntax-breaking change that needs to be part of a breaking-batch.

eddyb added a commit to eddyb/rust that referenced this issue Oct 19, 2016

Rollup merge of rust-lang#37117 - pnkfelix:may-dangle-attr, r=nikomat…
…sakis

`#[may_dangle]` attribute

`#[may_dangle]` attribute

Second step of rust-lang#34761. Last big hurdle before we can work in earnest towards Allocator integration (rust-lang#32838)

Note: I am not clear if this is *also* a syntax-breaking change that needs to be part of a breaking-batch.
@pnkfelix

This comment has been minimized.

Copy link
Member Author

pnkfelix commented Oct 26, 2016

Some people (myself included) have noted that may_dangle is not an ideal name. (Some have gone so far as to call it "terrible".)

Perhaps we could return to calling it an eyepatch:

3vvuqta

GuillaumeGomez pushed a commit to GuillaumeGomez/rust that referenced this issue Jan 4, 2017

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Jan 4, 2017

Rollup merge of rust-lang#38664 - apasel422:may-dangle, r=pnkfelix
Replace uses of `#[unsafe_destructor_blind_to_params]` with `#[may_dangle]`

CC rust-lang#34761

r? @pnkfelix

steveklabnik added a commit to steveklabnik/rust that referenced this issue Jan 4, 2017

Rollup merge of rust-lang#38664 - apasel422:may-dangle, r=pnkfelix
Replace uses of `#[unsafe_destructor_blind_to_params]` with `#[may_dangle]`

CC rust-lang#34761

r? @pnkfelix

frewsxcv added a commit to frewsxcv/rust that referenced this issue Jan 9, 2017

Rollup merge of rust-lang#38664 - apasel422:may-dangle, r=pnkfelix
Replace uses of `#[unsafe_destructor_blind_to_params]` with `#[may_dangle]`

CC rust-lang#34761

r? @pnkfelix

frewsxcv added a commit to frewsxcv/rust that referenced this issue Jan 9, 2017

sanxiyn added a commit to sanxiyn/rust that referenced this issue Jan 10, 2017

Rollup merge of rust-lang#38664 - apasel422:may-dangle, r=pnkfelix
Replace uses of `#[unsafe_destructor_blind_to_params]` with `#[may_dangle]`

CC rust-lang#34761

r? @pnkfelix

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Jan 19, 2017

Rollup merge of rust-lang#38970 - apasel422:may-dangle, r=pnkfelix
Deprecate `#[unsafe_destructor_blind_to_params]`

CC rust-lang#34761

r? @pnkfelix

bors added a commit that referenced this issue Jan 30, 2017

Auto merge of #39196 - apasel422:nomicon, r=petrochenkov
Update nomicon to describe `#[may_dangle]`

CC #34761
r? @pnkfelix

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Feb 2, 2017

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Feb 14, 2017

I have to say I find may_dangle clearer than eyepatch. After all the idea (if I understood it correctly) is that for the given parameters, it may be the case that they do not necessarily outlive the current function call.

@asajeffrey

This comment has been minimized.

Copy link

asajeffrey commented Feb 1, 2018

An issue that came up in the context of Josephine is asajeffrey/josephine#52. This is mainly an issue for the interaction between the drop checker and untagged unions (#32836 (comment)) but does have impact on the code guidelines for when may_dangle may be used safely.

The issue is the text "When used on a type, e.g. #[may_dangle] T, the programmer is asserting the only uses of values of that type will be to move or drop them," and whether the programmer is allowed to discard values of type T, that is reclaim the memory used by the value without running T's drop code. In particular, is the programmer allowed to safely mem::forget a value of type T?

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Feb 1, 2018

std::mem::forget itself is not unsafe. I believe that leaking any value (never running destructors) is considered sound in Rust. (This is why for example scoped threads are based on closures rather than "guard" values with destructors.)

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Feb 1, 2018

Yes, he knows, that's not what he's asking.

The part of the equation that's unsafe and whose contract is in question is #[may_dangle].

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Feb 2, 2018

The issue is the text "When used on a type, e.g. #[may_dangle] T, the programmer is asserting the only uses of values of that type will be to move or drop them," and whether the programmer is allowed to discard values of type T, that is reclaim the memory used by the value without running T's drop code. In particular, is the programmer allowed to safely mem::forget a value of type T?

Personally I don't consider mem::forget as using its argument. It does quite explicitly the opposite, actually. ;)

My mental model for #[may_dangle] T is that it provides a way to "opt out" of the usual implicit side-conditions Rust has on all type and lifetime arguments that they be alive in the current function. When you write

fn foo<'a, T>(x: &'a i32, y: T)

then there are some implicit conditions (where clauses) added to the function, namely that 'a and T both outlive the duration of the function call. #[may_dangle] T prevents these where clauses from being added, so if we are in a function that has a #[may_dangle] T type parameter and would try to call a function that has a T type parameter, we'd be unable to satisfy the where clauses of the callee. That's why we are not allowed to call said function.

Right now this is all specialized to be just about drop, but that's mostly because this is the only place the need for such a parameter has come up. If we ignore that, mem::forget could be declared as fn forget<#[may_dangle] T>(x: T) in order to express that it can be called on dangling data.

@nox

This comment has been minimized.

Copy link
Contributor

nox commented Feb 2, 2018

First, why would adding #[may_dangle] on mem::forget allow perma-borrowed values such as values of the Pin type in #32836 (comment) to be forgotten? mem::forget takes ownership of its argument and in Alan's code you can't take ownership of the pinned value. Even if the forget function had #[may_dangle] on its type parameter, it would be pretty unsound to be able to call it while the value is borrowed, don't you think?

Second, what even would be the purpose of adding #[may_dangle] on mem::forget? People believe that what Josephine currently does was a happy accident all that time prior to ManuallyDrop's existence, but what is the purpose of allowing to forget dangling data? What new uses cases does it bring to the table of Rust users, apart from breaking Josephine's use case?

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Feb 2, 2018

First, why would adding #[may_dangle] on mem::forget allow perma-borrowed values such as values of the Pin type in #32836 (comment) to be forgotten?

I didn't say that. All I did is answer the question (quoted in my reply) of whether mem::forget can be called in a drop that uses #[may_dangle] affirmatively. The question did not even mention perma-borrowing. I did not make the connection to the Pin trick nor do I see it. :)

Second, what even would be the purpose of adding #[may_dangle] on mem::forget?

No idea. @asajeffrey quite specifically asked about this above, so I gave my 2 cents.

Or did I entirely misunderstand their question...?

@asajeffrey

This comment has been minimized.

Copy link

asajeffrey commented Feb 2, 2018

Hmm, good point @RalfJung, perhaps my model of may_dangle needs to broaden...

I'd been thinking about it as specific to the drop checker, in that code such as:

fn foo(x: T) {
  // ... do stuff with x that doesn't transfer ownership
}

informally desugars to:

fn foo(mut x: T) {
  // ... do stuff with x that doesn't transfer ownership
  T::drop(&mut x); // if T implements Drop
  mem::forget(x);
}

but that &mut x borrow impacts the borrow-checker, and may_dangle tells the drop-checker to pretend it's not there.

But there may be a way of thinking about it more generally, as loosening the requirement that in &'a T, we have T: 'a. This is what @nikomatsakis was saying in IRC at https://botbot.me/mozilla/rust-lang/2018-01-31/?msg=96353190&page=2

@RalfJung there certainly are ways of reading the text as allowing mem::forget(x), e.g. as you say you could say this is not using x. I guess I'd just like the text to be a bit more explicit about this.

Also, your example at #32836 (comment) is interesting. Slightly messed around with, it's:

struct Transmuted<T> {
    data: Vec<u8>,
    marker: PhantomData<T>,
}
impl<T> Transmuted<T> {
    fn new(x: T) { ... horrible mem::transmute to store the byte representation of x in data }
}
impl<T> Deref for Transmuted<T> {
    type Target = T;
    fn deref(&self) -> &T { ... more horrible transmuting ... }
}
impl<#[may_dangle] T> Drop for Transmuted<T> {
    fn drop(&mut self) { ... what can we do with data here? ... }
}

We're not allowed to dereference data as a T, but it's not obvious to me what we're allowed to do with it as a Vec<u8>.

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Feb 2, 2018

(Quoting from another thread but I think it better fits here)

your code shows an interesting corner case, where the run-time type for data is T, but it's compile-time type is [u8; N]. Which type counts as far as may_dangle is concerned?

If the code says #[may_dangle] T, the rules "that type" refer to T, I'd say. Not sure what else it could be?

there certainly are ways of reading the text as allowing mem::forget(x), e.g. as you say you could say this is not using x. I guess I'd just like the text to be a bit more explicit about this.

I think we'd all like to better understand what it takes to make this a usable, modular and ideally safe concept. :)

Also, your example at #32836 (comment) is interesting. Slightly messed around with, it's:

We're not allowed to dereference data as a T, but it's not obvious to me what we're allowed to do with it as a Vec.

T could contain padding and uninitialized data. So in terms of bytes I think we can do with it about as much as we can do with a mem::uninitialized::[u8; 32]().

Except, of course, we know it was a T at some time! I think this is no different than a struct that actually contains a T field, except that there's no automatic destructor because the compiler has no idea. So I think transmuting to a T should be fine, and then we can do anything on it that respects the #[may_dangle] rules. Whatever that means. In my interpretation, it means we can call any method that has relaxed its implicit bounds so as to not require T to be alive, like mem::forget.

Actually most methods that work for any generic T could be soundly written that way, which lead to the original parametricity-based rule -- after all, if you don't know anything about T, all you can do is move it, and that's fine by #[may_dangle] rules.

(And this is where I realized that I can explain the failure of the parametricity-based argument in terms of my #[may_dangle] interpretation.)

Of course, it turned out this does not work. The counterexample involves a type like

struct S<T> { x: Option<T>, f: fn (T) }

And of course, f does have all the usual implicit where clauses! So calling it from drop that opted out of the where clauses is unsound. We were in a context where the WF assumptions about T are relaxed, and called into a function that expects the WF assumptions to fully hold.

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Feb 2, 2018

I'd been thinking about it as specific to the drop checker, in that code such as:

Thanks a lot for this! I came up with my model mostly because I didn't see any model defined anywhere, so this is very useful.

What I'm currently struggling with is reconciling "but that &mut x borrow impacts the borrow-checker, and may_dangle tells the drop-checker to pretend it's not there" with the fact that #[may_dangle] is an attribute on a type parameter, not on a function argument. If we have

unsafe impl <#[may_dangle] T, A: Allocator> Drop for AllocVec<T, A> {
  fn drop(&mut AllocVec<T, A>) {
    // can rely on A being alive, but not T
  }
}

then how is that reconciled by the borrow checker at the drop call site?

I like your desugaring into drop-wth-borrow + mem::forget as it gives us some idea of why the type of drop is &mut. However, that type is still blatantly wrong -- if it were correct, it'd be safe to desugar things to

drop(&mut x);
drop(&mut x);
mem::forget(x);

which is clearly not the case. drop will stay a fishy function.

What I think I'd like to add to the picture is where exactly the lifetimes end. In a usual situation, we may have something like this:

// Assume some outer lifetime `'a`
let v : Vec<&'a i32> = Vec::new(); // lifetime of this variable: 'v
// ...
// The vector "dies" during the drop call, so we have to end the lifetime *before*
// calling drop.
EndLifetime('v);
drop(&mut v); mem::forget(v);
EndLifetime('a);

Everything is fine here as 'a is alive while Vec<&'a i32>::drop is executed.

However, if we have cyclic references between elements in the Vec (or with your Pin inside a ManuallyDrop), we have a situation more like

let v : Vec<&'v i32> = Vec::new(); // lifetime of this variable: 'v
// It's literally cyclic! The type of this variable refers to its own lifetime.
EndLifetime('v); // As usual end the lifetime of a variable before it is dropped
drop(&mut v); // Uh-oh. 'v is dead and yet we are calling Vec<&'v i32>::drop. *explosion sound*

Because the type of this variable involves its own lifetime, there just is no way to end that lifetime and call the destructor in any order. So Rust rejects this, unless it knows that Vec<&'v i32>::drop is actually safe to call even though &'v i32 is not well-formed (because the lifetime has ended).

This scales to AllocVec: AllocVec<&'v i32, Allocator<'a>>::drop is safe to call even if 'v has ended, but 'alloc must still be a live. So we can safely have a

let v : AllocVec<&'v i32, Allocator<'a>>

but we cannot have a

let v : AllocVec<&'v i32, Allocator<'v>>

So, I guess I just repeated my point about WF checks from the perspective of the drop call site: The actual &mut x borrow for drop is just fine (x itself is not going to get deallocated or borrowed anywhere else!), but the type of &mut x is not entirely well-formed because some of the lifetimes it refers to may have already ended. So before calling drop we have to make sure that the concrete drop implementation we call is actually fine with relaxing the WF constraints to the extend that we can satisfy them.

Please let me know what you think... I'm kind of making this up as I go here; I've had plenty of fuzzy thoughts for a while but this is the first time I'm writing them down.

@asajeffrey

This comment has been minimized.

Copy link

asajeffrey commented Feb 2, 2018

Trying to make my slightly incoherent examples more precise...

The kind of thing I'm worried about is something like:

struct Bar<'a>(*const String, PhantomData<&'a()>);
impl<'a> Bar<'a> {
    fn new(x: &'a String) -> Bar<'a> { Bar(x, PhantomData) }
}
impl<'a> Deref for Bar<'a> {
    Target = String;
    fn deref(&self) -> &String { unsafe { &*self.0 } }
}
unsafe impl<#[may_dangle] 'a> Drop for Bar<'a> {
    fn drop(&mut self) { unsafe { println!("{}", &*self.0) } }
}

AFAICT

a) without the may_dangle this is safe, and
b) it passes the may_dangle rules since it never uses anything whose static type has lifetime 'a. (it derefs self.0 but that has type *const String not &'a String).

But it's unsafe, since it's printing a string that might have been reclaimed.

This is what I'm meaning about which type (e.g. for self.0) are we talking about, it's run-time "real" type &'a String or its compile-time type *const String.

@asajeffrey

This comment has been minimized.

Copy link

asajeffrey commented Feb 2, 2018

OK, the desugaring is a bit inaccurate, instead of:

T::drop(&mut x);
T::drop(&mut x);
mem::forget(x);

it should be something like:

if !x.drop_flag { T::drop(&mut x); x.drop_flag = true; }
if !x.drop_flag { T::drop(&mut x); x.drop_flag = true; }
mem::forget(x);

but I don't think this affects the typing.

Putting in the explicit lifetimes is interesting, as it makes clear the difference between the usual desugaring:

  T<'a>::drop(&mut x);
  EndLifetime('a);
  mem::forget(x);

and the one that #[may_dangle] 'a causes:

  EndLifetime('a);
  T<'static>::drop(&mut x);
  mem::forget(x);

that is a T<'a> value is being used after EndLifetime('a).

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Feb 3, 2018

But it's unsafe, since it's printing a string that might have been reclaimed.

Bar with a may_dangle is unsafe because it prints out of a raw pointer that might be deallocated.

The implementation of Bar might maintain an invariant that the raw pointer is live for 'a - in that case, an implementation without #[may_dangle] would be sound, because 'a is known to be alive.

@asajeffrey

This comment has been minimized.

Copy link

asajeffrey commented Feb 3, 2018

@arielb1 indeed, the issue is one of the definition of the may_dangle safety rules, that you can have code that is correct without may_dangle, satisfies the may_dangle rules, but is incorrect. Oops.

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Feb 3, 2018

it passes the may_dangle rules since it never uses anything whose static type has lifetime 'a. (it derefs self.0 but that has type *const String not &'a String).

I disagree. The wording in the RFC is

When used on a lifetime, e.g. #[may_dangle] 'a, the programmer is asserting that no data behind a reference of lifetime 'a will be accessed by the destructor.

Notice it doesn't say "static". Semantically speaking, your *const String is a reference with lifetime 'a, and you are accessing data behind that reference. You are hiding this from the compiler, but that doesn't make any difference when reasoning about the operations you are safely permitted on this data.

So, your code violates my reading of the #[may_dangle] rules. And rule lawyering aside, if you think in terms of "what are my proof obligations when I want to show this unsafe code to be safe" (and I think this should always be the mindset when justifying unsafe code; counterexample-guided reasoning doesn't scale), one of the things you have to do is justify why you should be permitted to dereference this raw pointer. As @arielb1 said, this is where #[may_dangle] makes the difference because it tells you whether your proof can rely on 'a being alive.


it should be something like:

if !x.drop_flag { T::drop(&mut x); x.drop_flag = true; }
if !x.drop_flag { T::drop(&mut x); x.drop_flag = true; }
mem::forget(x);

I wasn't talking about the drop flag. My comment about the type being a lie refers to the fact that drop destroys the data behind the reference. If you call some "normal" function of type foo(&mut Vec<i32>), you can rely on this function to leave the vector in a valid state. That's part of the contract of &mut. On the other hand, drop(&mut Vec<i32>) blatantly violates that contract because it wrecks havoc on the data structure.

and the one that #[may_dangle] 'a causes:

 EndLifetime('a);
 T<'static>::drop(&mut x);
 mem::forget(x);

This should be T<'a>::drop(&mut x);, right?

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Feb 3, 2018

So one version of the invariant josephine wants is basically this:

If a place is borrowed, and then (of course, after the borrow is over) the memory in that place is overwritten, then either:

  1. The data in the place was alive and unborrowed at the same time.
  2. The destructor is called at the place

That makes josephine sound: after a place is borrowed for &'a Foo<'a>, it can never be simultaneously live and unborrowed from that point on, which means the memory can't be invalidated without the destructor being called.

The invariant also probably holds in "specified pre-1.20 Rust": in safe code, if a place is memory-overwritten, then (up to the VecDeque bug, which we could declare "just a soundness bug") the data at that place was already freed, which requires it to either have been moved out (which requires it to be both alive and unborrowed at the point of the move) or have its destructor called.

OTOH, that invariant is never something we tried to guarantee. In fact, a #[may_dangle] Rc on top of an arena allocator breaks it:

struct Cyclic<A, T> {
    cycle: RefCell<Option<Rc<Cyclic<T, A>, A>>,
    data: T
}

fn foo() {
    let arena = make_arena();
    let rc1 = arena <- Cyclic {
        cycle: Default::default(),
        data: Pin::new(),
    };
    
    // make a cycle    
    *rc1.cycle.borrow_mut() = Some(rc1.clone());
    rc1.data.pin();

    // at end-of-scope, no destructors are run, but the arena (and its containing Rc) are free-ed.
}

@glaebhoerl glaebhoerl referenced this issue Feb 14, 2018

Merged

Dropck Eyepatch RFC. #1327

1 of 2 tasks complete
@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Feb 24, 2018

What is blocking stabilization of generic_param_attrs ? I think that should be stabilized irrespective of #[may_dangle] as it is useful for custom derive.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Feb 26, 2018

@Centril nothing really...

@nox

This comment has been minimized.

Copy link
Contributor

nox commented Mar 8, 2018

What can we do to stabilise generic_param_attrs? I want to use custom derive attributes in type parameters in Stylo, which runs on stable.

@SimonSapin

This comment has been minimized.

Copy link
Contributor

SimonSapin commented Mar 8, 2018

@nox I would guess: make a PR that stabilizes it and ask the lang team to do a FCP.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 8, 2018

@nox

What can we do to stabilise generic_param_attrs? I want to use custom derive attributes in type parameters in Stylo, which runs on stable.

Best thing: create a new issue for just that stabilization. Describe briefly the behavior you want to stabilize (as there is no RFC we need a bit more explanation, but then this is very simple so that seems easy enough). Ideally, show pointers to some tests that include that behavior. I would particularly be interested in tests that show that we properly error on unrecognized attributes (like #[foo]) used in that position (i.e., we want to avoid accepting things we don't expect). cc me or the team and we'll get it going!

To be clear: I think we can stabilize this. =)

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Mar 8, 2018

I would particularly be interested in tests that show that we properly error on unrecognized attributes (like #[foo]) used in that position

ui/feature-gate-custom_attribute2.rs
All the other notable tests can be found in #34764
I'll send the stabilization PR today.

i.e., we want to avoid accepting things we don't expect

Unknown attributes are easy, but this is much harder because known attributes are generally accepted in any weird positions with any syntax without validation, e.g. we already accept a lot of things we don't expect

fn main() {
    #[inline(x = "y")] // OK
    let x = 10;
}

Fixing this is somewhere in my queue though! Maybe I'll even get to it this year.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.