
scope without Send #562

Closed · jonhoo opened this issue Apr 3, 2018 · 21 comments

@jonhoo commented Apr 3, 2018

This ties somewhat into the discussion over in #522 (comment).

Currently, ThreadPool::scope requires the passed closure to be Send because the closure itself is executed on the thread pool. However, if ThreadPool is used as a more generic thread pool (rather than explicitly for data-dependent computation), it is not unreasonable for some existing thread to wish to spin off a number of jobs with access to its stack, and then wait for them all to complete (essentially as a pool of scoped threads). With the Send bound in place, that thread is pretty restricted in what it can use to generate jobs (e.g., anything with Rc is a no-go). It'd be good if there was an alternate version of scope that did not require Send for its closure, and which instead executed the closure on the current thread (but still waited for any spawned jobs to complete).
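For concreteness, here is a minimal sketch of the usage being asked for, written against the `in_place_scope` entry point that PR #844 eventually added (which did not exist at the time of this comment): the outer closure holds an `Rc` and runs on the calling thread, while the spawned jobs, which must still be `Send`, run on the pool.

```rust
use std::rc::Rc;

fn main() {
    let pool = rayon::ThreadPoolBuilder::new().num_threads(4).build().unwrap();

    // `Rc` is !Send, so a closure capturing it could not be sent into the
    // pool, which is what the plain `scope` entry point requires.
    let config = Rc::new(vec![1, 2, 3, 4]);
    let mut results = vec![0; config.len()];

    pool.in_place_scope(|s| {
        // This closure runs on the *current* thread, so holding `Rc` is fine.
        for (i, slot) in results.iter_mut().enumerate() {
            let n = config[i];
            // Each spawned job must still be `Send`; only the outer closure is exempt.
            s.spawn(move |_| *slot = n * n);
        }
    }); // blocks here until every spawned job has completed
    assert_eq!(results, vec![1, 4, 9, 16]);
}
```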

@cuviper (Member) commented Apr 3, 2018

I expect it's possible, but I'm not sure that we should.

One reason that we enter the threadpool for scope is so that its calls to spawn can push directly to a local job queue. Pushing to the global queue requires taking a lock. But maybe that synchronization wouldn't be a big deal for some use cases.

jonhoo added a commit to mit-pdos/noria that referenced this issue Apr 3, 2018
With this change, generators monitor how quickly clients are draining
queued jobs, and stop issuing jobs when they detect that clients have
enough queued work to last for the remaining duration of the experiment.
This is mostly a work-around for rayon-rs/rayon#544.

Note that the load generator now runs *in* the thread pool, so the
`threads` argument should now be set to the total number of cores rather
than #core - #generators. This is due to rayon-rs/rayon#562. It's a
little unfortunate because it means that *all* job distribution requires
stealing (the generator will put all jobs on its local queue).

Note also that (because of the same linked rayon issue) the creation of
`id_rng` is now in a closure. This is so that the argument can be `Send`
so we can get it into the thread pool in the first place.

@alecmocatta

I just bumped into this. I'm using scoped_threadpool as well as rayon, and noticed that the latter ostensibly contains the former's functionality, which would let me unify on one thread pool. Unification would be convenient because I want a thread per core, and ensuring that across two independent thread pools is nontrivial (I wouldn't want cores to sit idle while one pool is full and the other is empty).

@rocallahan (Contributor)

I bumped into this too. https://github.com/reem/rust-scoped-pool is unmaintained so I'm trying to migrate to rayon's ThreadPool and in some places that doesn't work because the scope closure is not Send (and making it Send requires significant API changes in callers).

@ghost commented Aug 18, 2018

@cuviper

Pushing to the global queue requires taking a lock. But maybe that synchronization wouldn't be a big deal for some use cases.

What if we used MsQueue for the global queue and avoided the lock?

Alternatively, I'm considering implementing a new Deque that supports push/pop/steal/steal_half, implements Sync, and doesn't separate workers and stealers. The intended use case for it is shared global deque of tasks, which is not owned by any worker thread.
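To make the proposed interface concrete, here is a rough sketch of the shape being described; the type and method names are hypothetical, and the lock-based body is only a placeholder for what would really be a lock-free structure.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical shared global deque of tasks, not owned by any worker thread.
/// It is `Sync` (usable through `&self` from many threads) and exposes
/// push/pop/steal/steal_half without a worker/stealer split.
pub struct SharedDeque<T> {
    // Placeholder: a real implementation would be lock-free.
    inner: Mutex<VecDeque<T>>,
}

impl<T> SharedDeque<T> {
    pub fn new() -> Self {
        SharedDeque { inner: Mutex::new(VecDeque::new()) }
    }
    pub fn push(&self, task: T) {
        self.inner.lock().unwrap().push_back(task);
    }
    pub fn pop(&self) -> Option<T> {
        self.inner.lock().unwrap().pop_back()
    }
    pub fn steal(&self) -> Option<T> {
        self.inner.lock().unwrap().pop_front()
    }
    /// Steal roughly half of the queued tasks in one call.
    pub fn steal_half(&self) -> Vec<T> {
        let mut queue = self.inner.lock().unwrap();
        let take = queue.len() / 2;
        queue.drain(..take).collect()
    }
}
```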

@cuviper (Member) commented Aug 20, 2018

What if we used MsQueue for the global queue and avoided the lock?

It's worth trying!

@cuviper (Member) commented Dec 14, 2018

FWIW, #615 is changing the global queue to a SegQueue (without a lock), so we could follow up by experimenting here without an initial Send.
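For reference, SegQueue is crossbeam's unbounded lock-free queue; the snippet below only illustrates the queue type itself (assuming the crossbeam-queue 0.3 API, where pop returns an Option), not rayon's internal use of it in #615.

```rust
use crossbeam_queue::SegQueue;

fn main() {
    // Pushing and popping require only a shared reference and no lock.
    let injector: SegQueue<u32> = SegQueue::new();
    injector.push(1);
    injector.push(2);
    assert_eq!(injector.pop(), Some(1)); // FIFO order
    assert_eq!(injector.pop(), Some(2));
    assert_eq!(injector.pop(), None);
}
```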

@abonander

I've run into a need for this in #676.

@DzmitryFil

+1 for this; the Send requirement is very restrictive in some cases. I work primarily with WebAssembly, and none of the web APIs can be used across threads (they are !Send), which severely limits how rayon can be used. A scope without Send would let me do useful work with !Send values on the worker thread (inside the scope closure) while rayon jobs run in that scope. Currently I have to choose: either I make progress on the !Send work on the main thread, or I run a rayon scope/join.

@rocallahan (Contributor) commented Jan 16, 2021

I hacked together a version of this here: https://github.com/rocallahan/rayon/commits/downstream

Apart from eliminating the Send bound on the op closure, it's a bit more efficient because you don't need to send the op across threads when you're not already on a worker thread. Also, the op can safely wait for spawned closures to complete; with regular scope, you can't, because blocking a worker thread on spawned work can deadlock.
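As an illustration of that last point, here is a small sketch (using the `in_place_scope` name that eventually landed in #844) of the outer closure blocking on its spawned jobs; the comment marks why this is only safe when the caller is not itself a pool worker.

```rust
use std::sync::mpsc;

fn main() {
    let pool = rayon::ThreadPoolBuilder::new().num_threads(2).build().unwrap();
    let (tx, rx) = mpsc::channel();

    pool.in_place_scope(|s| {
        for i in 0..4 {
            let tx = tx.clone();
            s.spawn(move |_| tx.send(i * 10).unwrap());
        }
        drop(tx); // close our sender so the iterator below can finish

        // Blocking here is safe only because this closure runs on the calling
        // (non-worker) thread; parking a pool worker like this could deadlock.
        let total: i32 = rx.iter().sum();
        assert_eq!(total, 60);
    });
}
```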

Should I clean it up and submit it?

@cuviper (Member) commented Feb 13, 2021

@rocallahan I would be interested in that, but I hope we don't have to fork Scope for it. Can we refactor enough to let external_scope just be an alternate entry point that still uses the same Scope? Then we can probably do the same for ScopeFifo.

I'm still wary of changing the current scope entry behavior, since at some level that also serves as an alternate install. That is, by entering the thread pool it also sets the implicit pool for join, iterators, etc.
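A small example of the "alternate install" behavior being referred to: once scope has moved execution onto the pool, nested rayon calls implicitly target that pool.

```rust
use rayon::prelude::*;

fn main() {
    let pool = rayon::ThreadPoolBuilder::new().num_threads(2).build().unwrap();
    pool.scope(|_| {
        // Because `scope` entered `pool`, this parallel iterator runs on
        // `pool` rather than on the global thread pool.
        let sum: i32 = (0..100).into_par_iter().sum();
        assert_eq!(sum, 4950);
    });
}
```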

Apart from eliminating the Send bound on the op closure, it's a bit more efficient because you don't need to send the op across threads when you're not already on a worker thread.

Yes, but then everything you spawn has to be sent across threads to the pool-wide injector queue, instead of just pushing to the thread-local worker queue. That tradeoff will probably vary widely by workload.

Also, the op can safely wait for spawned closures to complete; with regular scope, you can't, because blocking a worker thread on spawned work can deadlock.

This is only true if you're sure that you're not already on a worker thread.

@rocallahan (Contributor)

Can we refactor enough to let external_scope just be an alternate entry point that still uses the same Scope?

We can, but then there will be slightly higher overhead for scope users (some conditional branches at least). Is that acceptable?

I'm still wary of changing the current scope entry behavior, since at some level that also serves as an alternate install. That is, by entering the thread pool it also sets the implicit pool for join, iterators, etc.

Yes, we can't break that, we need a new entry point.

Yes, but then everything you spawn has to be sent across threads to the pool-wide injector queue, instead of just pushing to the thread-local worker queue. That tradeoff will probably vary widely by workload.

Right.

This is only true if you're sure that you're not already on a worker thread.

Right. In our case we have a dedicated thread pool and restrict access to it so only specific code can run in it.

@cuviper (Member) commented Feb 13, 2021

Can we refactor enough to let external_scope just be an alternate entry point that still uses the same Scope?

We can, but then there will be slightly higher overhead for scope users (some conditional branches at least). Is that acceptable?

I suspect a few branches won't be noticeable compared to the existing synchronization, but let's see what it looks like.

@rocallahan (Contributor)

The tricky bit here is the latch. Currently ScopeBase contains a CountLatch, which is two atomics (state, job count). In my PoC, I replaced this with a LockLatch (Mutex, CondVar) plus an atomic job count. With a unified Scope, we need to support both. We could just enum our way to victory here, but maybe we can do better.
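To make the two latch flavors concrete, here is a heavily simplified sketch; these are stand-ins for the idea only, not rayon's actual CountLatch/LockLatch internals (in particular, the real CountLatch keeps a separate state word alongside the count, which is the two-atomics question raised below).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Condvar, Mutex};

/// Counting latch, reduced to a single atomic counter: jobs increment it when
/// spawned and decrement it when done; the waiter may proceed at zero.
struct CountLatchSketch {
    counter: AtomicUsize,
}

impl CountLatchSketch {
    fn new() -> Self {
        CountLatchSketch { counter: AtomicUsize::new(1) }
    }
    fn increment(&self) {
        self.counter.fetch_add(1, Ordering::SeqCst);
    }
    /// Returns true if this decrement brought the count to zero.
    fn set(&self) -> bool {
        self.counter.fetch_sub(1, Ordering::SeqCst) == 1
    }
}

/// Blocking latch (mutex + condvar): an external, non-worker thread sleeps
/// until the last job signals completion.
struct LockLatchSketch {
    done: Mutex<bool>,
    cond: Condvar,
}

impl LockLatchSketch {
    fn new() -> Self {
        LockLatchSketch { done: Mutex::new(false), cond: Condvar::new() }
    }
    fn set(&self) {
        *self.done.lock().unwrap() = true;
        self.cond.notify_all();
    }
    fn wait(&self) {
        let mut done = self.done.lock().unwrap();
        while !*done {
            done = self.cond.wait(done).unwrap();
        }
    }
}
```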

One thing I don't understand which may be relevant: Why is CountLatch two atomics? Couldn't it use a single atomic? wait_until_cold would have to be generic over L: Latch but that doesn't look like a problem. (You could argue that the 'state' atomic changes less frequently which might mean less cache traffic as waiters poll it, but it will share the same cache line as the counter in most cases.)

It might be simplest to allow CountLatch to be used as a raw atomic job count for the external scope case, and just add a LockLatch to Scope to be used for the external case. This could be whittled down to just one extra conditional branch when the last job completes in the Scope. This would increase the size of Scope by 32 bytes, is that OK?

(I considered trying to reuse Registry's LOCK_LATCH but I guess that could lead to deadlocks if an external scope op tries to call in_worker.)

@cuviper (Member) commented Feb 13, 2021

You've jogged my memory enough to remember that I had tinkered with this, but not finished. I've now pushed that branch so I can share it here, in case that helps to compare notes and think through some of the issues:
master...cuviper:raw_scope

One possibility your prototype didn't cover was if we call this from one thread pool into another. It's not great to use a LockLatch in that case, blocking the thread in the old pool. In general, we try to let that sort of blocking go into work stealing too, and I tried to allow that in my branch.
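A sketch of the cross-pool case in question, again using the `in_place_scope` name that eventually landed: the closure runs on a worker of `pool_a` while the spawned jobs run on `pool_b`, so a pure LockLatch wait would leave that `pool_a` worker idle.

```rust
fn main() {
    let pool_a = rayon::ThreadPoolBuilder::new().num_threads(2).build().unwrap();
    let pool_b = rayon::ThreadPoolBuilder::new().num_threads(2).build().unwrap();

    pool_a.install(|| {
        // This closure runs on a worker thread of `pool_a`...
        pool_b.in_place_scope(|s| {
            // ...while the jobs spawned here run on `pool_b`.
            for i in 0..4u64 {
                s.spawn(move |_| {
                    let _ = i * i; // stand-in for real work
                });
            }
        });
        // The pool_a worker is blocked above until pool_b finishes the jobs;
        // ideally it would keep work-stealing rather than sleeping.
    });
}
```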

Why is CountLatch two atomics?

I believe this is primarily so we can share the logic in the sleep module, which is fairly tricky.

This would increase the size of Scope by 32 bytes, is that OK?

Size shouldn't be of much concern, because the Scope is created once on the stack and only passed around by reference.

@rocallahan (Contributor)

I believe this is primarily so we can share the logic in the sleep module, which is fairly tricky.

Ok.

One possibility your prototype didn't cover was if we call this from one thread pool into another.

Yes, this had crossed my mind and then I forgot about it.

Stealing your ScopeLatch seems like the way to go.

@rocallahan (Contributor)

Using ScopeLatch means a conditional branch when we increment the latch too, but I guess that's OK.

@rocallahan (Contributor)

Here's what I have:
Pernosco@external-scope
I probably should write some more tests. Should I do that and submit a PR?

@rocallahan (Contributor)

Also, is external_scope a good name here? I didn't put any thought into choosing it.

@cuviper (Member) commented Apr 2, 2021

Naming is hard -- external_scope sounds better than my raw_scope, at least. Let's rekindle this discussion in a PR and we can think about it more from there.

@rocallahan (Contributor)

Submitted PR #844

bors bot added a commit that referenced this issue May 4, 2021
844: Implement `in_place_scope` r=cuviper a=rocallahan

As discussed in #562.

Co-authored-by: Robert O'Callahan <robert@ocallahan.org>

@rocallahan (Contributor)

Fixed by #844 6a01573
