
Fix a deadlock that can occur when using scope() on ComputeTaskPool from within a system #892

Merged

Conversation

@aclysma (Contributor) commented on Nov 19, 2020

This Bevy app will deadlock 100% of the time (example provided by @bonsairobo, with minor adjustments):

use bevy::{prelude::*, tasks::ComputeTaskPool};

fn main() {
    App::build()
        .add_resource(DefaultTaskPoolOptions::with_num_threads(1))
        .add_system(example_system.system())
        .add_plugins(DefaultPlugins)
        .run();
}

fn example_system(pool: Res<ComputeTaskPool>) {
    pool.scope(|s| {
        s.spawn(async move {
            println!("Task hello");
        })
    });
}

The root cause is that in this example the ComputeTaskPool has only one thread. Calling scope() from within the system eventually calls future::block_on(...), which blocks that thread until all the spawned tasks complete. However, those tasks will never run, because the only thread in the ComputeTaskPool is the one that is blocked waiting for them.

This PR fixes the deadlock by having the thread that calls scope() participate in driving the tasks spawned in the scope to completion.

Another way this could be resolved would be to await the scope() call. That would require making scope() an async function, which may complicate capturing local variables, and making systems themselves async. The solution in this PR is much simpler.
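
To make the idea concrete, here is a minimal sketch of the pattern the fix relies on (an illustration, not Bevy's actual code; async_executor and futures_lite stand in for the pool's executor machinery): instead of merely blocking until the scope's tasks finish, the calling thread runs the executor while it waits, so the tasks make progress even when every pool thread is busy or blocked.

use async_executor::Executor;
use futures_lite::future;

fn main() {
    let executor = Executor::new();

    // Stand-ins for the futures spawned inside a scope.
    let tasks: Vec<_> = (0..4)
        .map(|i| executor.spawn(async move { i * 2 }))
        .collect();

    // Deadlock-prone pattern: block on the tasks and rely on some *other*
    // thread to actually run them. With a single-threaded pool whose only
    // thread is the one blocking, nothing ever runs them.
    //
    // Fixed pattern: the calling thread itself runs the executor until the
    // spawned tasks complete, so it participates in doing the work.
    let results: Vec<i32> = future::block_on(executor.run(async {
        let mut out = Vec::new();
        for task in tasks {
            out.push(task.await);
        }
        out
    }));

    println!("{:?}", results);
}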

@aclysma (Contributor, Author) commented on Nov 19, 2020

Also, while I'm thinking about it, this change (aclysma@e1f7619) would ensure that, if there is only a single task, the calling thread executes it immediately. I think this would be better than having another thread pick up the work and execute it with a potentially cold cache while the calling thread sits blocked until it completes (see the sketch below).

By the way (and this is true for the PR as well), the calling thread might not belong to the pool used by scope(), so you could end up with a thread outside the pool doing work on tasks that belong to the pool. (Having scope() be async and awaiting it would avoid this and be more proper behavior, but I think the tradeoffs here are fine in practice.) I can PR this one if you'd like.
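
A minimal sketch of the single-task fast path described above (an illustration only, not the code in the linked commit; drive_scope and BoxedFuture are hypothetical names): when exactly one future was spawned into the scope, the calling thread blocks on it directly instead of handing it off and waiting for a pool thread to pick it up.

use std::future::Future;
use std::pin::Pin;

use futures_lite::future;

// Hypothetical alias for the futures a scope has collected.
type BoxedFuture<T> = Pin<Box<dyn Future<Output = T>>>;

fn drive_scope<T>(mut spawned: Vec<BoxedFuture<T>>) -> Vec<T> {
    if spawned.len() == 1 {
        // Fast path: run the lone task right here on the calling thread,
        // keeping its data in a warm cache and avoiding a cross-thread handoff.
        return vec![future::block_on(spawned.pop().unwrap())];
    }
    // General case (greatly simplified): the calling thread still does the
    // work, here by just polling each future to completion in turn.
    spawned.into_iter().map(|f| future::block_on(f)).collect()
}

fn main() {
    let one: BoxedFuture<i32> = Box::pin(async { 21 * 2 });
    assert_eq!(drive_scope(vec![one]), vec![42]);
}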

@Moxinilian added the C-Bug (An unexpected or incorrect behavior) and A-Core (Common functionality for all bevy apps) labels on Nov 19, 2020
@cart (Member) commented on Nov 21, 2020

Preventing deadlocks is definitely desirable, but this change does appear to come at the cost of CPU usage (at least in dev builds, which is what I tested).

before (reported as a combined 15%): [CPU usage screenshot]

after (reported as a combined 17%): [CPU usage screenshot]

linked commit w/ single-task opt (reported as a combined 17%): [CPU usage screenshot]

CPU usage is much "spikier", which sort of makes sense. I'll do release-mode tests next.

@cart (Member) commented on Nov 21, 2020

Release-mode results (sorry for the cut-off units ... the scale is the same in each image):

before (reported as a combined 1%): [CPU usage screenshot]

after (reported as a combined 1%): [CPU usage screenshot]

linked commit w/ single-task opt (reported as a combined 1%): [CPU usage screenshot]

Looks like it matters a lot less in release mode. The average is still very low, and all impls have "spikes".

@aclysma (Contributor, Author) commented on Nov 21, 2020

The stats are very interesting! It looks like these measurements are based on CPU % (as opposed to throughput). If this change avoids waiting for another thread to step in and pick up the work, I would expect a slight increase in both CPU % and raw throughput.

@cart (Member) commented on Nov 21, 2020

Yeah, it's probably worth measuring frame time, ideally in a non-graphical, compute-heavy example (with decent parallelism potential) to take rendering out of the equation.

(the tests above were performed on the breakout example)
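
For reference, one low-friction way to get frame-time numbers is Bevy's built-in diagnostics; a minimal sketch, assuming the FrameTimeDiagnosticsPlugin and PrintDiagnosticsPlugin available around this time:

use bevy::{
    diagnostic::{FrameTimeDiagnosticsPlugin, PrintDiagnosticsPlugin},
    prelude::*,
};

fn main() {
    App::build()
        .add_plugins(DefaultPlugins)
        // Collects frame time / FPS diagnostics each frame.
        .add_plugin(FrameTimeDiagnosticsPlugin::default())
        // Periodically prints the collected diagnostics to stdout.
        .add_plugin(PrintDiagnosticsPlugin::default())
        .run();
}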

@cart (Member) commented on Nov 26, 2020

Just measured frame time for the breakout example with and without this change, and @aclysma's prediction was correct. Before this change we get ~0.000900 seconds per frame; after this change we get ~0.000660 seconds per frame. That's huge!

I'm inclined to merge this, as (1) it resolves a real bug and (2) it gives us a performance boost. The spikiness of the usage graph is something to think about from a power-consumption perspective, but I'm not going to worry about that: I'm not even sure how we would measure it, and if we really care about power consumption there are a million other things we should resolve first.

I'm also very open to your other change if you're still interested in a PR.

@cart (Member) commented on Nov 26, 2020

I'm putting together a fix for the new Clippy lint now. I'll rebase this PR when it's ready.
