Skip to content

Commit

Permalink
Round up for the batch size to improve par_iter performance (bevyengi…
Browse files Browse the repository at this point in the history
…ne#9814)

# Objective

The default division for a `usize` rounds down which means the batch
sizes were too small when the `max_size` isn't exactly divisible by the
batch count.

## Solution

Changing the division to round up fixes this which can dramatically
improve performance when using `par_iter`.

I created a small example to proof this out and measured some results. I
don't know if it's worth committing this permanently so I left it out of
the PR for now.

```rust
use std::{thread, time::Duration};

use bevy::{
    prelude::*,
    window::{PresentMode, WindowPlugin},
};

fn main() {
    App::new()
        .add_plugins((DefaultPlugins.set(WindowPlugin {
            primary_window: Some(Window {
                present_mode: PresentMode::AutoNoVsync,
                ..default()
            }),
            ..default()
        }),))
        .add_systems(Startup, spawn)
        .add_systems(Update, update_counts)
        .run();
}

#[derive(Component, Default, Debug, Clone, Reflect)]
pub struct Count(u32);

fn spawn(mut commands: Commands) {
    // Worst case
    let tasks = bevy::tasks::available_parallelism() * 5 - 1;
    // Best case
    // let tasks = bevy::tasks::available_parallelism() * 5 + 1;
    for _ in 0..tasks {
        commands.spawn(Count(0));
    }
}

// changing the bounds of the text will cause a recomputation
fn update_counts(mut count_query: Query<&mut Count>) {
    count_query.par_iter_mut().for_each(|mut count| {
        count.0 += 1;
        thread::sleep(Duration::from_millis(10))
    });
}
```

## Results

I ran this four times, with and without the change, with best case
(should favour the old maths) and worst case (should favour the new
maths) task numbers.

### Worst case

Before the change the batches were 9 on each thread, plus the 5
remainder ran on one of the threads in addition. With the change its 10
on each thread apart from one which has 9. The results show a decrease
from ~140ms to ~100ms which matches what you would expect from the maths
(`10 * 10ms` vs `(9 + 4) * 10ms`).

![Screenshot from 2023-09-14
20-24-36](https://github.com/bevyengine/bevy/assets/1353401/82099ee4-83a8-47f4-bb6b-944f1e87a818)

### Best case

Before the change the batches were 10 on each thread, plus the 1
remainder ran on one of the threads in addition. With the change its 11
on each thread apart from one which has 5. The results slightly favour
the new change but are basically identical as the total time is
determined by the worse case which is `11 * 10ms` for both tests.

![Screenshot from 2023-09-14
20-48-51](https://github.com/bevyengine/bevy/assets/1353401/4532211d-ab36-435b-b864-56af3370d90e)
  • Loading branch information
MJohnson459 authored and Ray Redondo committed Jan 9, 2024
1 parent 628661d commit 29ca434
Showing 1 changed file with 5 additions and 2 deletions.
7 changes: 5 additions & 2 deletions crates/bevy_ecs/src/query/par_iter.rs
Expand Up @@ -11,7 +11,7 @@ use super::{QueryItem, QueryState, ReadOnlyWorldQuery, WorldQuery};
///
/// By default, this batch size is automatically determined by dividing
/// the size of the largest matched archetype by the number
/// of threads. This attempts to minimize the overhead of scheduling
/// of threads (rounded up). This attempts to minimize the overhead of scheduling
/// tasks onto multiple threads, but assumes each entity has roughly the
/// same amount of work to be done, which may not hold true in every
/// workload.
Expand Down Expand Up @@ -197,7 +197,10 @@ impl<'w, 's, Q: WorldQuery, F: ReadOnlyWorldQuery> QueryParIter<'w, 's, Q, F> {
.max()
.unwrap_or(0)
};
let batch_size = max_size / (thread_count * self.batching_strategy.batches_per_thread);

let batches = thread_count * self.batching_strategy.batches_per_thread;
// Round up to the nearest batch size.
let batch_size = (max_size + batches - 1) / batches;
batch_size.clamp(
self.batching_strategy.batch_size_limits.start,
self.batching_strategy.batch_size_limits.end,
Expand Down

0 comments on commit 29ca434

Please sign in to comment.