Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

`Arc` should only require `Sync`, not `Send` #20257

Closed
pythonesque opened this Issue Dec 27, 2014 · 21 comments

Comments

Projects
None yet
9 participants
@pythonesque
Copy link
Contributor

pythonesque commented Dec 27, 2014

The original justification for this was from issue #2788. I am pretty confident that this reasoning no longer applies, especially now that we have Sync. The reason this is safe is that Arc<T> will not be Send unless T is, and if you try to pass Arc<T> into spawn() without Send it won't work anyway.

(Making Arc<T> only Send sometimes may not have been doable prior to unsafe traits, so this issue may have depended on #20119 being resolved. But I think the way Arc is defined this isn't true).

@pythonesque

This comment has been minimized.

Copy link
Contributor Author

pythonesque commented Dec 27, 2014

I guess the new way things are defined, Arc should really require Sync for Sync and Send + Sync for Send.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Dec 28, 2014

Now that we have opt-in Send/Sync I think that this should be quite easy to encode (the latter T: Sync => Arc<T>: Sync and T: Send + Sync => Arc<T>: Send).

I also agree that the bounds on construction and other functions can all be removed once these are in place.

@flaper87

This comment has been minimized.

Copy link
Contributor

flaper87 commented Dec 28, 2014

#20119 not only relaxed the Send bound but also the Sync bound. 8818693

I failed at giving a better explanation in the commit message but, one of the main reasons behind this is that it is reasonable to have something ref-counted that never escapes the thread. I know @alexcrichton and @nikomatsakis have discussed this a bit more. As far as I can tell, the reasoning behind this change makes sense to me. Probably we need a new type ?

@pythonesque

This comment has been minimized.

Copy link
Contributor Author

pythonesque commented Dec 28, 2014

Yeah, I saw that a bit after creating this issue. However, we do still require Sync + Send for both Sync and Send, and I think Sync only really requires Sync (but I doubt this distinction will matter to anyone at all in present day Rust).

@quantheory

This comment has been minimized.

Copy link
Contributor

quantheory commented Jan 6, 2015

I doubt this distinction will matter to anyone at all in present day Rust

I should point out that this did cause some confusion to me about a week ago (i.e. within days after this was posted). I'm perfectly willing to admit that I might be weird in this respect, but I think that for pedagogical reasons it is useful to have pretty tight bounds, even if in practice most user types that implement Sync will also implement Send.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Sep 29, 2015

I believe making this change will cause the following data structure to become invalid:

use std::marker::PhantomData;
use std::sync::Arc;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::mem;

struct ArcCell<T> {
     x: AtomicPtr<T>,
     _marker: PhantomData<Arc<T>>,
}

impl<T> ArcCell<T> {
    fn new(x: Arc<T>) -> ArcCell<T> {
        ArcCell { x: AtomicPtr::new(unsafe {mem::transmute(x)}), _marker: PhantomData }
    }

    fn swap(&self, y: Arc<T>, order: Ordering) -> Arc<T> {
        unsafe {
             mem::transmute(self.x.swap(mem::transmute(y), order))
        }
    }
}

Because it allows for destroying Arcs on different threads without Sending the Cell.

let cell = ArcCell::new(Arc::new(1));

thread::scoped(|| { // (or equivalent)
     let a = cell.swap(Arc::new(2), Ordering::SeqCst);
     assert_eq!(*a, 1);
     // no other references, so `a` is destroyed here
})

This could be fixed by having ArcCell have a more restricted implementation of Sync, but it is theoretically a breaking change because the above is working and correct code now.

@rust-lang/libs (et al): are the bounds in existing trait implementations public guarantees for things like this, where they influence correctness?

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Sep 29, 2015

There is of course no way to test for this automatically.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Sep 30, 2015

@huonw This is a pretty interesting form of negative reasoning. In some sense, this is what the "fundamental" attribute is meant to capture -- that it's a breaking change to implement the trait. I'm not sure yet what I think about it in this case.

@bluss

This comment has been minimized.

Copy link
Contributor

bluss commented Sep 30, 2015

It must be Sync for Arc<T> where T: Sync + Send because of @huonw's theorem 😄 that says that Sync + Copy => Send.

Sending a &Arc<T> (requires Arc<T>: Sync), you can .clone() it to get a new Arc<T> on that thread, and thus, the Arc<T> is effectively sent to another thread (and the T can be dropped on that thread), which must only be allowed if T: Send.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Sep 30, 2015

@bluss FWIW, this isn't what Send means in my mind. To me, it specifically requires moving ownership of the original value directly across threads.

@bluss

This comment has been minimized.

Copy link
Contributor

bluss commented Sep 30, 2015

One popular definition of ownership is Owner is who drops the value. and using the clone trick, you can drop the inner T on arbitrary thread.

How is the ownership not moved if there is now an Arc<T> that owns the T on a different thread? You only need to drop all other Arc to make the new thread the only living shared owner of the T.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Sep 30, 2015

@bluss The ownership of T isn't what's at stake here -- the ownership of the particular Arc<T> is. That's what's being marked Send, or not. EDIT: that is, it's not at stake when we ask "is Arc<T> being moved?"

What I'm saying is that sending a reference to an Arc<T> and cloning it is distinct from moving the original Arc<T> in terms of ownership transfer as the borrow checker sees things.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Sep 30, 2015

OK, clarifying a bit more from IRC:

  • I don't believe that the Arc<T> itself is being moved in the clone example.
  • However, the fact that by sending &Arc<T> you're able to run the dtor on the underlying T (or, with try_unwrap, to move the underlying T) means that Arc<T>: Sync requires T: Send + Sync; the T is potentially sent, via API usage on Arc that begins only with a reference.

So I agree with @bluss's overall conclusion that the bounds need to stay as they are.

@bluss

This comment has been minimized.

Copy link
Contributor

bluss commented Sep 30, 2015

From discussion with @huonw:

Assuming Sync for Arc<T> where T: Sync.

If a &Arc<T> is sent from one thread to another (borrowed from the Arc owner's "home" thread to the "remote" thread), we know that the lender must live strictly longer than the whole thread.

  • Any Arc<T> we clone from this will not cause T to drop on the remote thread, because there is an Arc<T> on the home thread that lives strictly longer.
  • Any Arc<T> we clone must stay in the same scoped thread, because it's not Send.

Can this be correct? @aturon It's your scoped threads, so you make the rules 😉

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Sep 30, 2015

@aturon convinced me that that argument doesn't hold up in the face of imagination: it's not that obvious (at least to me) that we can't construct some sort of thread scoping/cross-thread &-transfer that means the clone approach is a counterexample too.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Sep 30, 2015

@aturon @bluss another similar (but more direct) problem would arise if Arc offered "unwrap" APIs similar to Rc

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Sep 30, 2015

@huonw

@aturon convinced me that that argument doesn't hold up in the face of imagination: it's not that obvious (at least to me) that we can't construct some sort of thread scoping/cross-thread &-transfer that means the clone approach is a counterexample too.

Stack overflow! One too many uses of double negatives! Can you rephrase this?

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Oct 1, 2015

Even the unwrap method of sending a T is outlawed by the reasoning in #20257 (comment) (one can't unwrap while there's another existing Arc<T>).

Stack overflow! One too many uses of double negatives! Can you rephrase this?

#20257 (comment) implies that the Clone reasoning isn't problematic with the concurrency APIs we currently have (AFAIK), e.g. with the basic scoped threads, sending a reference to an Arc will mean the thing that reference points to will have to outlive the thread itself:

let x = Arc::new(1);

let guard = thread::scoped(|| {
    let y = x.clone();
});

// illegal, since `x` is still borrowed by `guard`
drop(x);

y's destructor will never touch the 1, since x won't be destroyed until strictly after the thread has finished, so the reference count when will never be small when y is manipulating it.

However, it may be possible to have a fancier API that does allow transferring a reference to a new thread, doing something with it, and then "releasing" any borrows while keeping some data on the new thread (some sort of partial scoped API, maybe).

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Oct 1, 2015

@huonw well for example I think you could stick it in thread-local-data, perhaps.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Oct 1, 2015

Sorry, to be clear, I am assuming a thread::scoped that multiplexes over a smaller number of O/S threads, versus the original API which always started a new thread.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Oct 13, 2015

Closing. While there are some interesting policy questions here (e.g., the exact definition of Send), it's clear that the bounds around Arc need to stay as they are.

@aturon aturon closed this Oct 13, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.