Sound Generic Drop #769

Merged
merged 5 commits into from Feb 10, 2015

Projects

None yet

9 participants

@pnkfelix
Member

Remove #[unsafe_destructor] from the Rust language. Make it safe for developers to implement Drop on type- and lifetime-parameterized structs and enum (i.e. "Generic Drop") by imposing new rules on code where such types occur, to ensure that the drop implementation cannot possibly read or write data via a reference of type &'a Data where 'a could have possibly expired before the drop code runs.

(rendered text/)

Spawned from rust-lang/rust#8861 (which is also the tracking issue)

@arielb1
Contributor
arielb1 commented Jan 29, 2015

I guess that Arena can be fixed by giving it a lifetime parameter and forcing types to be bounded by it.

However, doing this directly would be problematic, as it would prevent types within an arena from referencing other such types. I think that we could give the Drop trait a lifetime parameter, which bounds the lifetime of things used within the dtor (of course, make it at least as long as the Drop-lifetime of the contents of the type), and have Arena::<'a>::alloc<T> require T: 'a.

@pnkfelix
Member

@arielb1 it does not suffice to just use T: 'b or 'a: 'b ; as pointed out in the RFC, these constructs only express "must live at least as long as ...",

But the semantics here require one be able to express 'a strictly-outlives 'b and T strictly-outlives 'b (... or something equivalent, like 'a : parent_extent('b))


(anyway, as you point out, such a requirement would be a real drag, since it would indeed prevent a number of useful patterns from being used there.)

I'm not entirely sure I follow how adding a lifetime parameter to Drop would actually address things, since the Drop trait itself is not exposed at client sites. Or maybe you meant that it would be an associated item ... but we do not yet support associated lifetimes.

@pnkfelix pnkfelix was assigned by nrc Jan 29, 2015
@P1start
Contributor
P1start commented Jan 29, 2015

If we had negative bounds, could we represent ‘'a strictly outlives 'b’ with 'a: 'b, 'b: !'a? I’m not sure how to represent ‘T strictly outlives 'b’, though—T: 'b, 'b: !T doesn’t really make much sense.

@Gankro
Contributor
Gankro commented Jan 29, 2015

It seems to me that Unique<T> expresses exactly the desired ownership relationship needed for Vec.

A wrapper around a raw *mut T that indicates that the possessor of this wrapper owns the referent.

So the definition would become ptr: NonZero<Unique<T>>, as should be the case. As far as I know, the only reason we don't have this is that Unique didn't impl Zeroable or whatever, but there's an accepted PR in for that.

@Gankro
Contributor
Gankro commented Jan 29, 2015

I haven't read through all the alternatives and appendices, but a quick skim didn't reveal to me why we can't just parse:

{
  let foo = Foo::new();
  let bar = Bar::new();
  let baz = Baz::new();
}

as

{
  let foo = Foo::new();
 {
    let bar = Bar::new();
    {
      let baz = Baz::new();
    }
  }
}

implying truly distinct lifetimes, and thus resolving the example problems? Are there any problem cases that aren't just a consequence of considering everything in the same scope to have the same lifetime? Although I'm pretty fuzzy on lifetime unification, so maybe this would Break Everything -- or I'm misunderstanding what the problem actually is.

@Gankro
Contributor
Gankro commented Jan 29, 2015

Ah, I see the appendix suggests exactly this. Than all the other rules introduced are just to support cyclic structures, then?

@arielb1
Contributor
arielb1 commented Jan 30, 2015

@pnkfelix

The point is that the lifetime parameter of the Arena must strictly outlive the arena itself (same as TypedArena). The lifetime-parameter-on-Drop is just to allow types that have cyclic references, as long as they don't use them on their destructor.

@pnkfelix
Member

@arielb1 hmm, you know, I thought I actually tried an experiment like that (that is, putting a lifetime parameter on Arena, as a way to see if I could provide an impoverished Arena<'a> API rather than removing it entirely), but for some reason it did not work out at the time. I will try it again and report back.


Okay, now I understand what you are suggesting with the lifetime parameter on Drop; that idea is something I have actually mused about, but we need something a little bit more general, I think. In particular, in practice one calls other methods from Drop, and so we need something like fn foo<'a: UnusedLifetime>(b: Bar<'a>) { ... }, where that bound would then limit the body of the function so that it cannot ever actually access borrowed data of type &'a _. (We probably would not use that particular name for the bound; I have tossed around other options like fn foo<'a: ?Live>(...), but no clear winners yet.

Anyway, in my opinion, these are potential future extensions to the language that, while potentially very convenient, need not block landing this RFC. (i.e. I think they can be added backwards compatibly post 1.0 ... the only exception would be if we anticipate needing to change existing stable API's in non-backwards compatible ways, but I am not aware of an example of that.)

@pnkfelix
Member

@Gankro to what "other rules" are you referring? Do you mean the other conditions in the Drop-Check rule?

(In which case I would say, yes, for the most part, things like Condition B. were largely added to support cyclic structure that we found used in practice, in particular in rustc itself; adding Condition B. is what enabled me to close PR #21024 without landing it.)

@pnkfelix
Member

@Gankro also, regarding Unique<T>: that is not a zero-sized type. I think we are better off using PhantomData<T> for this purpose, for maximum generality. (Under the hood, I strongly suspect Unique<T> will have a PhantomData<T> itself ... it cannot do so in its current form, because it is a tuple struct, but I think that is an API bug in Unique<T>.)

@pnkfelix pnkfelix Fixed a typo.
Added an "unresolved question" alluding to the potential importance of
better handling of covariance (and thus variance in general).
4b4e5f5
@Gankro
Contributor
Gankro commented Jan 30, 2015

Ok, that sounds reasonable to me. The rules necessary to handle cycles are a bit hairy, but if we have usecases today I suppose we need a solution.

@kennytm
kennytm commented Jan 31, 2015

@P1start T strictly outlives 'a can be written as T: 'b, 'b: 'a, 'a: !'b. However, it is one thing that the concept can be written down, it is another thing the compiler can recognize and make use of it. I think it is a bad idea to use negative bounds to express the "strictly outlives" concept.

It would be better to use a new symbol like 'a :!= 'b, T :!= 'b if it is truly necessary (I don't see in this RFC that the programmer is required to express such bound though).

@pnkfelix
Member
pnkfelix commented Feb 1, 2015

@arielb1 hmm, you know, I thought I actually tried an experiment like that (that is, putting a lifetime parameter on Arena, as a way to see if I could provide an impoverished Arena<'a> API rather than removing it entirely), but for some reason it did not work out at the time. I will try it again and report back.

Follow-up: My previous experiment was flawed because I did not add an appropriate marker for the variance when I made the version of Arena that carried a lifetime parameter. So that version, due to being too flexible in its variance, ended up allowing unsound code to be compiled without error and without uses of unsafe.

It now seems to me like one can indeed make a safe version of Arena that carries the necessary constraints.

(Also, it might make sense to continue providing the current API under the name UnsafeArena, to reflect the fact that it does not handle destructors in a safe manner.)


I will update the RFC accordingly. @arielb1 : thanks for pushing back on this point.

@pythonesque
Contributor

Wow, this is really fantastic. I realize the result is complex, but it seems to address pretty much all of my concerns with the previous variants of the outlives rules. Great work!

@huonw huonw commented on the diff Feb 5, 2015
text/0000-sound-generic-drop.md
+# Detailed design
+
+## The Drop-Check Rule
+[The Drop-Check Rule]: #the-drop-check-rule
+
+The Motivation section alluded to the compiler enforcing a new rule.
+Here is a more formal statement of that rule:
+
+Let `v` be some value (either temporary or named); if the type of `v` owns data of type `D`,
+where `D` has a lifetime- or type-parametric `Drop` implementation and
+either:
+
+ * (A.) the `Drop impl` for `D` instantiates `D` at `'a`
+ directly, i.e. `D<'a>`, or,
+
+ * (B.) the `Drop impl` for `D` has some type parameter with a
@huonw
huonw Feb 5, 2015 Member

This is relying on parametric polymorphism, right? Is that OK in the presence of Any?

@huonw
huonw Feb 5, 2015 Member

Also, many types have an implicit X: Drop bound, e.g. Vec

impl<T> Drop for Vec<T> {
    fn drop(&mut self) {
        // This is (and should always remain) a no-op if the fields are
        // zeroed (when moving out, because of #[unsafe_no_drop_flag]).
        if self.cap != 0 {
            unsafe {
                for x in self.iter() {
                    ptr::read(x);
                }
                dealloc(*self.ptr, self.cap)
            }
        }
    }
}

The ptr::read(x); line is basically a x.drop() call.

@pnkfelix
pnkfelix Feb 5, 2015 Member

yes, I didn't spell this out in the RFC, but in general all type parameters must be assumed to have a destructor attached, which is not quite the same as having an explicit Drop bound, but the point is, the analysis does not treat the implicit attached destructor as cause for Condition B to fire (since that would basically mean that condition B would always fire, and defeat its purpose).

In fact, your example of the implicit x.drop() calls within the <Vec as Drop>::drop implementation is the reason that we need the PhantomData<T>: That is the magic sauce that tells the analysis, during its recursive traversal of the Vec struct, that it needs to analyze the drop behavior of T when it is analyzing Vec<T>.


This approach is indeed relying on a kind of parametricity, but only in that it is using the lack of bounds as proof that the drop implementation will never access borrowed data via a method call (except for potentially via implicit calls to drop on values owned by the type -- but then the analysis catches those during its recursive application to the substructure of the type).

I admit that I did not explicitly consider Any when I was thinking about this. The docs say that:

Every type with no non-'static references implements Any, so Any can be used as a trait object to emulate the effects dynamic typing.

So if a type has references to non-'static data, it does not get the implicit implementation of Any. Can one implement Any explicitly for other types that do have references to non-'static data? If so, then this scheme might have a problem. (But it also seems like maybe we should not be allowing for such types to implement Any?)

@pnkfelix
pnkfelix Feb 5, 2015 Member

Okay so I poked around a little bit more; from the source code, it seem like any type implementing Any is forced to outlive 'static. So one should not be able to hide borrowed data behind the Any trait, and therefore it is okay for the analysis to treat Any like a black box whose destructor is safe to run (at least with respect to not accessing borrowed data).

pnkfelix added some commits Feb 9, 2015
@pnkfelix pnkfelix Revise suggestion for the `Arena` API.
Namely, now officially propose that `Arena` be changed to `Arena<'a>`.

(As an aside, suggest that we *could* also provide the old API with an
`UnsafeArena` type; but that is not part of the proposal proper).
9fd6c72
@pnkfelix pnkfelix Add notes on parametricity and `Any`.
Props to huonw for pointing this out, since it forced me to think
through the cases where the near-parametricity of `T` matters.

Notably, it does not suffice to focus solely on the case where `v`
owns data of type `T`; one must also consider the scenario where the
destructor for `v` constructs a `T` on the fly.
232da78
@pnkfelix pnkfelix fix typo. f78aa94
@nikomatsakis nikomatsakis merged commit f78aa94 into rust-lang:master Feb 10, 2015
@nikomatsakis nikomatsakis added a commit that referenced this pull request Feb 10, 2015
@nikomatsakis nikomatsakis Accept RFC #769 (sound generic drop) 2841310
@nikomatsakis
Contributor

This RFC is accepted. The tracking issue is rust-lang/rust#8861.

@bors bors added a commit to rust-lang/rust that referenced this pull request Feb 11, 2015
@bors bors Auto merge of #21972 - pnkfelix:new-dtor-semantics-6, r=nikomatsakis
This is a resurrection and heavy revision/expansion of a PR that pcwalton did to resolve #8861.

The most relevant, user-visible semantic change is this: #[unsafe_destructor] is gone. Instead, if a type expression for some value has a destructor, then any lifetimes referenced within that type expression must strictly outlive the scope of the value.

See discussion on rust-lang/rfcs#769
b7884f6
@bors bors added a commit to rust-lang/rust that referenced this pull request Feb 11, 2015
@bors bors Auto merge of #21972 - pnkfelix:new-dtor-semantics-6, r=nikomatsakis
This is a resurrection and heavy revision/expansion of a PR that pcwalton did to resolve #8861.

The most relevant, user-visible semantic change is this: #[unsafe_destructor] is gone. Instead, if a type expression for some value has a destructor, then any lifetimes referenced within that type expression must strictly outlive the scope of the value.

See discussion on rust-lang/rfcs#769
fc6123d
@bors bors added a commit to rust-lang/rust that referenced this pull request Feb 11, 2015
@bors bors Auto merge of #21972 - pnkfelix:new-dtor-semantics-6, r=nikomatsakis
This is a resurrection and heavy revision/expansion of a PR that pcwalton did to resolve #8861.

The most relevant, user-visible semantic change is this: #[unsafe_destructor] is gone. Instead, if a type expression for some value has a destructor, then any lifetimes referenced within that type expression must strictly outlive the scope of the value.

See discussion on rust-lang/rfcs#769
e29f420
@pnkfelix
Member

A passing thought

The Drop-Check rule currently implies that traits with no methods are safe to use as bounds. (the current implementation takes a short-cut and is not so general: it only allows the known builtin bounds as bounds, not arbitrary method-less traits.)

So the question: Is there some hack one could use with an associated type to get around that restriction, i.e. something like:

trait Foo<T> {
    fn foo(t: &T) -> Self;
}

trait Bar<T> {
    type B: Foo<T>;
}

struct Quu<X: Bar<X>> { state: X }

impl<X:Bar> Drop for Quu<X> {
    fn drop(&mut self) {
        let b: X::B = Foo::foo(&self.state);
    }
}

The above seems like a case that the drop-check rule would need to cover; the rule as currently written would not force lifetimes in the type substituted for X to strictly outlive the value for Quux<X>, but clearly such lifetimes need to in the above example.

cc @nikomatsakis


(This is probably fixable by generalizing the text from saying "some type parameter with a trait bound T where T is a trait that has at least one method" to something more general, like "where T is a trait that has at least one method, or has a type parameter that could be instantiated with a type having a method that takes T as a an argument. and as I said above, the current implementation is not as general as what this RFC states, so this is not a 1.0 blocker.)

@pnkfelix
Member

(still, probably best in the short term to amend the RFC with something to fix this believed hole, even if its a really conservative solution like "where T is a trait that has at least one item.")

@nikomatsakis
Contributor

(I'd personally be satisfied with the really conservative sol'n for now)

@pnkfelix pnkfelix referenced this pull request in rust-lang/rust Mar 6, 2015
Closed

Order of code affects lifetime resolution #23108

@glaebhoerl
Contributor

Noncopy types

All non-Copy type parameters are (still) assumed to have a
destructor. Thus, one would be correct in noting that even a type
T with no bounds may still have one hidden method attached; namely,
its Drop implementation.

However, the drop implementation for T can only be called when
running the destructor for value v if either:

  1. the type of v owns data of type T, or

A small thing I noticed, I'm not sure if this has any practical significance:

The impl Drop for T can also be invoked through an &mut T, e.g. assignment through it invokes the destructor on the previous value. This possibility doesn't appear to be covered by "the type of v owns data of type T". (Also, &RefCell<T>.)

@pnkfelix
Member
pnkfelix commented May 2, 2015

The impl Drop for T can also be invoked through an &mut T, e.g. assignment through it invokes the destructor on the previous value. This possibility doesn't appear to be covered by "the type of v owns data of type T".

Hmm, yes. I am now trying to construct an example where this yields unsoundness ... it is a little tricky, but ... oh I have some new ideas, will report back later ...

(In any case, it is probably safest to just treat this as another case where "the type of v owns data of type T", although at that point I might reword the term it as "the type of v can destroy data of type T" ...)


&RefCell<T> is trickier, since that is a library type. PhantomData inside of it won't suffice... maybe I need an even stronger marker type, that propagates even across &T ...


Update: A tricky part of trying to expose brokenness here is that any destructor that actually writes to a contained &'a mut T or &'a RefCell<T> within a value v will need to either:

  • be attached to a type that is parametric in 'a (which will force 'a to strictly outlive v, and thus I think will force T also to strictly outlive v), or
  • be hidden behind a generic type with some method that does the write to &'a mut T or &'a RefCell<T> -- but again, I think we will again here see the 'a leak up and end up being forced to strictly outlive v?

That is the basic difficulty I am facing in trying to use the observation to actually construct an example where one could use this to break soundness.

Here is some example code illustrating the kind of issue I am thinking about: http://is.gd/fDfLPX

@pnkfelix
Member
pnkfelix commented May 2, 2015

So my current thought is that I need to amend the "Noncopy types" subsection with an explanation along the lines of the one presented in my previous comment.

(Oh, and in fact, Copy types aren't even allowed through anymore; we had to change that, see: rust-lang/rust#24906 ) ((edit: The first part of this parenthetical did not make any sense, and the second part was not relevant to the conversation.))


It might also be good to note that calls to destructors for values extracted via unsafe pointers (e.g. ptr::read) are not directly tracked) though admittedly that is sort of already implied by the discussion in the "Phantom Data" section.

@glaebhoerl
Contributor

That is the basic difficulty I am facing in trying to use the observation to actually construct an example where one could use this to break soundness.

Yes, I admit I don't understand the rules well enough to tell whether this could actually violate safety. Just thought it was worth noticing.

@pnkfelix
Member
pnkfelix commented May 3, 2015

Yes, I admit I don't understand the rules well enough to tell whether this could actually violate safety. Just thought it was worth noticing.

Hey, I do know the rules, and you still managed to give me quite a scare, so I'd say it is definitely worth noticing/noting. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment