
Introduce atomic closures implemented using flat-combining #141

Closed
nikomatsakis wants to merge 3 commits from the flat-combine branch

nikomatsakis commented Nov 15, 2016

This is a second stab at flat combining. This time, the construct is intended for end-user use. The easiest way to use it is with the atomically! macro, which lets you wrap a closure to indicate that it should execute atomically:

let mut set = HashSet::new();
some_vec
    .par_iter()
    .for_each(atomically!(|x| { set.insert(x); }));

But more generally you can create an atomic closure like:

let x = Atomic::new(|foo| process(foo));

and then call it with x.invoke(data).
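
In the same spirit as the set example above, a minimal usage sketch (the names and element type here are placeholders, not code from the PR):

    // Sketch: an explicit atomic closure shared across a parallel loop;
    // `invoke` hands each item to the wrapped closure for atomic execution.
    let mut total = 0;
    let x = Atomic::new(|n: i32| total += n);
    some_vec.par_iter().for_each(|&n| x.invoke(n));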

The current code is intended for use with Rayon; outside of Rayon it won't really provide a benefit, just acting more or less like a regular lock. It is also not intended to be super long-lived: e.g., we never prune the lists of registries that touch it, which could lead to a (very minor) memory leak if you had a long-lived atomic and created a lot of thread pools that used it independently. But both of those things are anti-patterns, so I'm not worried about that.

Some things remain to be done:

  • needs some docs
  • needs some testing of the panic pathways

cc @aturon -- I'd love a detailed look at the acquire/release orderings. I think they're correct. =)

@nikomatsakis nikomatsakis force-pushed the flat-combine branch from ca4bf73 to 82ad491 Nov 15, 2016

nikomatsakis commented Nov 15, 2016

It may of course make sense to factor this outside of rayon altogether; the main difference would be that we would have to assign each thread that touches it an index, instead of being able to leverage rayon's pre-existing indices. You might also want to revisit the "won't be used by an open-ended set of threads over a long period of time" assumption in that case, which would in general make the management of the active thread list more complicated. (The flat-combining paper talks about ways to manage it.)
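
For illustration, one common way to hand out per-thread indices outside of Rayon is to assign a dense index lazily from a global counter and cache it in TLS (a sketch with hypothetical names, not code from this PR):

    use std::cell::Cell;
    use std::sync::atomic::{AtomicUsize, Ordering};

    // Next index to hand out; threads claim indices on first use.
    static NEXT_INDEX: AtomicUsize = AtomicUsize::new(0);

    thread_local! {
        // usize::MAX marks "not yet assigned".
        static INDEX: Cell<usize> = Cell::new(usize::MAX);
    }

    fn thread_index() -> usize {
        INDEX.with(|idx| {
            if idx.get() == usize::MAX {
                idx.set(NEXT_INDEX.fetch_add(1, Ordering::Relaxed));
            }
            idx.get()
        })
    }

This mirrors Rayon's pre-existing worker indices, but indices are never recycled, which is part of why long-lived use by an open-ended set of threads makes the bookkeeping more complicated.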

src/util.rs Outdated
// The actual `ID` value is irrelevant. We're just using its TLS
// address as a unique thread key, faster than a real thread-id call.
thread_local!{ static ID: bool = false }
ID.with(|id| ThreadId { addr: id as *const bool as usize })

cuviper Nov 15, 2016

  • Is this refactoring relevant to the PR? I don't see any new callers here.
  • You did not keep it #[inline], which means it will probably turn into a cross-crate call from the generic bridge functions (see the sketch after this list).
  • The performance claim I made about this should probably be re-examined if this ever gets used in a true hotspot -- e.g. I'm not sure whether Rust ever uses static TLS, in which case there's probably a call for dynamic TLS anyway. (Your OS may vary, no warranty is expressed or implied, yada yada.)
    • But anyway, it never showed up as a hotspot in my testing, and AFAIK there's no thread id available in the standard library, so this invented id is probably still good enough.
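
For reference, a minimal sketch of the helper with #[inline] kept (the function name is hypothetical; the ThreadId struct follows the snippet above, and this is illustrative rather than the PR's code):

    struct ThreadId { addr: usize }

    #[inline]
    fn current_thread_id() -> ThreadId {
        // The actual `ID` value is irrelevant; its TLS address serves as a
        // cheap per-thread key, avoiding a real thread-id call.
        thread_local! { static ID: bool = false }
        ID.with(|id| ThreadId { addr: id as *const bool as usize })
    }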

nikomatsakis Nov 15, 2016

Yeah, not really relevant -- I originally planned to write a more general-purpose thing that made use of ThreadId. I think I'll just drop this commit.

src/lib.rs Outdated
@@ -27,5 +31,8 @@ pub use api::dump_stats;
pub use api::initialize;
pub use api::join;
pub use api::ThreadPool;
pub use atomic::Atomic;
#[cfg(feature = "unstable")]
pub use atomic::Atomic;

cuviper Nov 15, 2016

Am I seeing double here?

nikomatsakis Nov 15, 2016

Heh, merge fail...

@nikomatsakis nikomatsakis force-pushed the flat-combine branch 3 times, most recently from 744d0fa to b75e8e0 Nov 15, 2016

nikomatsakis commented Nov 16, 2016

Did some measurements on my Linux box with this branch:

lunch-box. cargo-benchcmp map_collect::i_to_i::with_mutex map_collect::i_to_i::with_atomic killme.atomic
 name  map_collect::i_to_i::with_mutex ns/iter  map_collect::i_to_i::with_atomic ns/iter  diff ns/iter   diff %
       135,382,261                              93,304,115                                 -42,078,146  -31.08%
 _vec  85,193,005                               45,689,313                                 -39,503,692  -46.37%
lunch-box. cargo-benchcmp map_collect::i_mod_10_to_i::with_mutex map_collect::i_mod_10_to_i::with_atomic killme.atomic
 name  map_collect::i_mod_10_to_i::with_mutex ns/iter  map_collect::i_mod_10_to_i::with_atomic ns/iter  diff ns/iter   diff %
       73,926,743                                      42,409,191                                        -31,517,552  -42.63%
 _vec  13,173,497                                      8,222,546                                          -4,950,951  -37.58%

@nikomatsakis nikomatsakis force-pushed the flat-combine branch 4 times, most recently from 8158168 to b245f4a Nov 18, 2016

nikomatsakis commented Nov 18, 2016

I encountered an interesting problem trying to port the TSP solver to use this (actually, I hit it when doing filtering too). The current Atomic wants to take an FnMut(A) -> B, but sometimes you would want an FnMut(&A) -> B (i.e., with a higher-ranked lifetime). I think we can make it accept both, but it'll require some juggling, and it might mean that you wind up with both invoke() and invoke_ref() (and hence that the atomically! macro probably needs two variants). Have to ponder.
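
For illustration, the split might look something like this (hypothetical signatures; the mutex here is just a stand-in for the real structure, per the "falls back to mutex" commit below):

    use std::sync::Mutex;

    struct Atomic<F> {
        func: Mutex<F>, // stand-in for the real flat-combining structure
    }

    impl<F> Atomic<F> {
        // By-value: the wrapped closure takes ownership of its argument.
        fn invoke<A, B>(&self, arg: A) -> B
        where
            F: FnMut(A) -> B,
        {
            let mut f = self.func.lock().unwrap();
            (*f)(arg)
        }

        // By-reference: a higher-ranked bound lets the closure accept a
        // borrow of *any* lifetime, so the caller keeps ownership.
        fn invoke_ref<A, B>(&self, arg: &A) -> B
        where
            F: for<'a> FnMut(&'a A) -> B,
        {
            let mut f = self.func.lock().unwrap();
            (*f)(arg)
        }
    }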

nikomatsakis added some commits Nov 14, 2016

add `atomic` module
Implement a flat-combining lock. Falls back to mutex when not using Rayon.

@nikomatsakis nikomatsakis force-pushed the flat-combine branch from b245f4a to affb3dd Dec 30, 2016

nikomatsakis commented Dec 30, 2016

Rebased. This is working now, but while trying to integrate it into the TSP benchmark I found one problem with the API: it assumes you want to own the argument, and this isn't always true. Sometimes you'd prefer a closure of type Fn(&T) rather than Fn(U) -- these look equivalent, but they are not, because the former is known to always borrow its argument, whereas the latter takes ownership of it. Even if U = &'x T, that still requires us to supply references with one known lifetime 'x. I was going to experiment with supporting both patterns, but haven't gotten around to it yet. It'll probably make the API mildly more complex, which is unfortunate.
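
To see why the two are not equivalent, consider (a sketch, not code from this PR):

    // A higher-ranked `for<'a> Fn(&'a T)` closure accepts a reference of
    // *any* lifetime -- even one borrowed from the callee's own stack:
    fn call_with_local<F: for<'a> Fn(&'a String)>(f: F) {
        let local = String::from("hi");
        f(&local); // fine: F works for every lifetime
    }

    // By contrast, `Fn(U)` with `U = &'x T` fixes a single lifetime 'x up
    // front, so the argument must already outlive 'x:
    fn call_with_fixed<'x, F: Fn(&'x String)>(f: F, arg: &'x String) {
        f(arg); // a callee-local borrow would not live long enough
    }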

cuviper commented Mar 18, 2017

Where do you stand on this? It will at least have to be updated for the rayon-core split.

stjepang left a comment

Orderings in src/atomic/mod.rs look good to me.

If in doubt, I'd suggest using Acquire, Release, and AcqRel liberally. Those are not very expensive operations on x86 processors.

head: AtomicPtr<Cell<T>>, // if not null, a unique, transmuted ptr to a Box<T>
}

struct Cell<T> {

stjepang Mar 18, 2017

I'd prefer to name this Node<T> in order to avoid confusion with std::cell::Cell.

loop {
    let head = self.head.load(Ordering::Relaxed);
    (*cell).next = head;
    if self.head.compare_and_swap(head, cell, Ordering::Release) == head {

stjepang Mar 18, 2017

You can make this loop faster under contention, like this:

let mut head = self.head.load(Ordering::Acquire);
loop {
    (*cell).next = head;
    let previous = self.head.compare_and_swap(head, cell, Ordering::AcqRel);
    if previous == head {
        break;
    } else {
        head = previous;
    }
}

Also, your orderings are incorrect - you need Acquire and AcqRel. :)
The way to think about this is the following...

Suppose Bob wants to prepend a new value to the list. He loads the current head with Relaxed ordering (like you did). Then he sets the next field of the new cell and installs a new cell with Release ordering.

Now Alice comes and loads the head with Acquire ordering. She gets value h, which is the head Bob just installed. Because she used Acquire ordering, she will see all writes to memory that happened before h was installed. This is good.

However, will she see writes to memory that happened before (*h).next was installed? Unfortunately, no, and that is because Bob used Relaxed load! Had he used Acquire load, then Alice would see even those writes.

With correct ordering, this is what would happen...
Think of this as a long synchronization chain. Alice's load (Acquire) synchronizes with Bob's store (Release), which happened before he wrote to (*cell).next, and that load (Acquire) synchronized with the previous store (Release) to the head (whoever did it) and so forth...

So the key takeaway is: make sure to chain these operations! Relaxed is often fishy. :)
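
For what it's worth, this push can also be written with compare_exchange, which takes explicit success/failure orderings (making the chain visible) and returns the observed head on failure (a sketch along the lines of the snippet above, not code from this PR):

    use std::ptr;
    use std::sync::atomic::{AtomicPtr, Ordering};

    struct Node<T> {
        value: T,
        next: *mut Node<T>,
    }

    struct Stack<T> {
        head: AtomicPtr<Node<T>>,
    }

    impl<T> Stack<T> {
        fn push(&self, value: T) {
            let node = Box::into_raw(Box::new(Node { value, next: ptr::null_mut() }));
            // Acquire continues the chain from the previous Release store.
            let mut head = self.head.load(Ordering::Acquire);
            loop {
                unsafe { (*node).next = head; }
                // AcqRel on success: Release publishes our write to `next`,
                // Acquire extends the chain; Acquire on failure re-joins it
                // before the retry.
                match self.head.compare_exchange(head, node, Ordering::AcqRel, Ordering::Acquire) {
                    Ok(_) => break,
                    Err(actual) => head = actual,
                }
            }
        }
    }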

nikomatsakis commented Mar 26, 2017

I think I'll probably wind up closing this, at least for now, but I'm going to keep it open until I have time to read @stjepang's comments and give them some thought! @stjepang, I would encourage you to take a look at the orderings in the sleep module and give me your feedback on that as well! There is some subtle logic there and I suspect there are better ways of doing things. (The README for that module contains a lot of details.)

nikomatsakis commented Sep 22, 2017

Gonna close this for now.
