Use instancing for sprites (#9597)
# Objective

- Supersedes #8872
- Improve sprite rendering performance after the regression in #9236 

## Solution

- Use an instance-rate vertex buffer to store per-instance data.
- Store color, UV offset and scale, and a transform per instance.
- Convert Sprite rect, custom_size, anchor, and flip_x/_y to an affine
3x4 matrix and store the transpose of that in the per-instance data.
This is similar to how MeshUniform uses transpose affine matrices.
- Use a special index buffer that has batches of 6 indices referencing 4
vertices. The lower 2 bits indicate the x and y of a quad such that the
corners are:
  ```
  10    11

  00    01
  ```
UVs are implicit but get modified by the UV offset and scale. The remaining
upper bits contain the instance index.
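
The index layout described above can be sketched on the CPU side as follows. This is a minimal illustration, not the code from this commit: the names `QUAD_CORNERS`, `build_index_buffer`, and `decode_index` are hypothetical, and the corner order within each batch of 6 is an assumption (chosen here to give counter-clockwise triangles).

```rust
/// Corner order for the two triangles of one quad.
/// Assumption for illustration: 10 -> 00 -> 01, then 01 -> 11 -> 10 (counter-clockwise).
const QUAD_CORNERS: [u32; 6] = [2, 0, 1, 1, 3, 2];

/// Builds batches of 6 indices per instance. The lower 2 bits hold the
/// quad corner (bit 0 = x, bit 1 = y); the remaining upper bits hold the
/// instance index.
fn build_index_buffer(instance_count: u32) -> Vec<u32> {
    let mut indices = Vec::with_capacity(instance_count as usize * 6);
    for instance in 0..instance_count {
        for corner in QUAD_CORNERS {
            indices.push((instance << 2) | corner);
        }
    }
    indices
}

/// Splits an index back into (x, y, instance), which is what a vertex
/// shader would do with the incoming vertex index.
fn decode_index(index: u32) -> (u32, u32, u32) {
    (index & 1, (index >> 1) & 1, index >> 2)
}
```

With this packing, the x and y bits double as the implicit UV of each corner before the per-instance UV offset and scale are applied.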

## Benchmarks

I will compare versus `main` before #9236 because the results should be
as good as or faster than that. Running `bevymark -- 10000 16` on an M1
Max with `main` at `e8b38925` in yellow, this PR in red:

![Screenshot 2023-08-27 at 18 44 10](https://github.com/bevyengine/bevy/assets/302146/bdc5c929-d547-44bb-b519-20dce676a316)

Looking at the median frame times, that's a 37% reduction from before.

---

## Changelog

- Changed: Improved sprite rendering performance by leveraging an
instance-rate vertex buffer.

---------

Co-authored-by: Giacomo Stevanato <giaco.stevanato@gmail.com>
superdump and SkiFire13 committed Sep 2, 2023
1 parent 40c6b3b commit 4fdea02
Showing 7 changed files with 277 additions and 228 deletions.
21 changes: 1 addition & 20 deletions crates/bevy_pbr/src/render/mesh_functions.wgsl
@@ -4,26 +4,7 @@
 #import bevy_pbr::mesh_bindings mesh
 #import bevy_pbr::mesh_types MESH_FLAGS_SIGN_DETERMINANT_MODEL_3X3_BIT
 #import bevy_render::instance_index get_instance_index
-
-fn affine_to_square(affine: mat3x4<f32>) -> mat4x4<f32> {
-    return transpose(mat4x4<f32>(
-        affine[0],
-        affine[1],
-        affine[2],
-        vec4<f32>(0.0, 0.0, 0.0, 1.0),
-    ));
-}
-
-fn mat2x4_f32_to_mat3x3_unpack(
-    a: mat2x4<f32>,
-    b: f32,
-) -> mat3x3<f32> {
-    return mat3x3<f32>(
-        a[0].xyz,
-        vec3<f32>(a[0].w, a[1].xy),
-        vec3<f32>(a[1].zw, b),
-    );
-}
+#import bevy_render::maths affine_to_square, mat2x4_f32_to_mat3x3_unpack

 fn get_model_matrix(instance_index: u32) -> mat4x4<f32> {
     return affine_to_square(mesh[get_instance_index(instance_index)].model);
2 changes: 1 addition & 1 deletion crates/bevy_pbr/src/render/mesh_types.wgsl
@@ -2,7 +2,7 @@

 struct Mesh {
     // Affine 4x3 matrices transposed to 3x4
-    // Use bevy_pbr::mesh_functions::affine_to_square to unpack
+    // Use bevy_render::maths::affine_to_square to unpack
     model: mat3x4<f32>,
     previous_model: mat3x4<f32>,
     // 3x3 matrix packed in mat2x4 and f32 as:
3 changes: 3 additions & 0 deletions crates/bevy_render/src/lib.rs
@@ -234,6 +234,8 @@ pub struct RenderApp;

 pub const INSTANCE_INDEX_SHADER_HANDLE: HandleUntyped =
     HandleUntyped::weak_from_u64(Shader::TYPE_UUID, 10313207077636615845);
+pub const MATHS_SHADER_HANDLE: HandleUntyped =
+    HandleUntyped::weak_from_u64(Shader::TYPE_UUID, 10665356303104593376);

 impl Plugin for RenderPlugin {
     /// Initializes the renderer, sets up the [`RenderSet`](RenderSet) and creates the rendering sub-app.
@@ -391,6 +393,7 @@ impl Plugin for RenderPlugin {
                 "BASE_INSTANCE_WORKAROUND".into()
             ]
         );
+        load_internal_asset!(app, MATHS_SHADER_HANDLE, "maths.wgsl", Shader::from_wgsl);
         if let Some(future_renderer_resources) =
             app.world.remove_resource::<FutureRendererResources>()
         {
21 changes: 21 additions & 0 deletions crates/bevy_render/src/maths.wgsl
@@ -0,0 +1,21 @@
+#define_import_path bevy_render::maths
+
+fn affine_to_square(affine: mat3x4<f32>) -> mat4x4<f32> {
+    return transpose(mat4x4<f32>(
+        affine[0],
+        affine[1],
+        affine[2],
+        vec4<f32>(0.0, 0.0, 0.0, 1.0),
+    ));
+}
+
+fn mat2x4_f32_to_mat3x3_unpack(
+    a: mat2x4<f32>,
+    b: f32,
+) -> mat3x3<f32> {
+    return mat3x3<f32>(
+        a[0].xyz,
+        vec3<f32>(a[0].w, a[1].xy),
+        vec3<f32>(a[1].zw, b),
+    );
+}
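
For readers less familiar with the packing, the two helpers in `maths.wgsl` can be mirrored on the CPU as a rough Rust sketch. Plain arrays stand in for WGSL matrices (column-major, `m[col][row]`). The packing functions `pack_affine_transposed` and `pack_mat3` are hypothetical counterparts written for illustration and are not part of this commit; only the two unpack functions follow the shader code above.

```rust
/// Column-major 4x4 matrix: m[col][row].
type Mat4 = [[f32; 4]; 4];

/// Hypothetical packing step: store the first three rows of an affine
/// matrix (last row assumed to be 0, 0, 0, 1) as three vec4s.
fn pack_affine_transposed(m: &Mat4) -> [[f32; 4]; 3] {
    let mut rows = [[0.0; 4]; 3];
    for r in 0..3 {
        for c in 0..4 {
            rows[r][c] = m[c][r];
        }
    }
    rows
}

/// Mirrors the WGSL `affine_to_square`: treat the three vec4s plus
/// (0, 0, 0, 1) as columns, then transpose.
fn affine_to_square(affine: &[[f32; 4]; 3]) -> Mat4 {
    let cols = [affine[0], affine[1], affine[2], [0.0, 0.0, 0.0, 1.0]];
    let mut out = [[0.0; 4]; 4];
    for c in 0..4 {
        for r in 0..4 {
            out[c][r] = cols[r][c];
        }
    }
    out
}

/// Hypothetical packing step: nine column-major floats split into
/// two vec4s and one trailing f32.
fn pack_mat3(m: &[[f32; 3]; 3]) -> ([[f32; 4]; 2], f32) {
    let flat: Vec<f32> = m.iter().flatten().copied().collect();
    (
        [
            [flat[0], flat[1], flat[2], flat[3]],
            [flat[4], flat[5], flat[6], flat[7]],
        ],
        flat[8],
    )
}

/// Mirrors the WGSL `mat2x4_f32_to_mat3x3_unpack`.
fn mat2x4_f32_to_mat3x3_unpack(a: &[[f32; 4]; 2], b: f32) -> [[f32; 3]; 3] {
    [
        [a[0][0], a[0][1], a[0][2]],
        [a[0][3], a[1][0], a[1][1]],
        [a[1][2], a[1][3], b],
    ]
}
```

Storing the transpose lets an affine transform occupy three vec4s (48 bytes) rather than four, since the constant last row is reconstructed during unpacking.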
8 changes: 8 additions & 0 deletions crates/bevy_render/src/render_resource/buffer_vec.rs
@@ -144,6 +144,14 @@ impl<T: Pod> BufferVec<T> {
     pub fn clear(&mut self) {
         self.values.clear();
     }
+
+    pub fn values(&self) -> &Vec<T> {
+        &self.values
+    }
+
+    pub fn values_mut(&mut self) -> &mut Vec<T> {
+        &mut self.values
+    }
 }

 impl<T: Pod> Extend<T> for BufferVec<T> {
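
As a rough illustration of what the new accessors allow, the sketch below uses a CPU-only stand-in type. `CpuBufferVec` and `ensure_len` are hypothetical names for this example; the real `BufferVec` also owns a GPU buffer and requires `T: Pod`, and the way this commit actually calls `values_mut` is not shown in the excerpt above.

```rust
/// CPU-only stand-in for illustration; not the real `BufferVec`.
struct CpuBufferVec<T> {
    values: Vec<T>,
}

impl<T> CpuBufferVec<T> {
    fn new() -> Self {
        Self { values: Vec::new() }
    }

    /// Read-only view of the staged values.
    fn values(&self) -> &Vec<T> {
        &self.values
    }

    /// Mutable access to the staged values, so callers can use the
    /// full `Vec` API instead of pushing one element at a time.
    fn values_mut(&mut self) -> &mut Vec<T> {
        &mut self.values
    }
}

/// One plausible use (an assumption): grow the staged data in place
/// only when it is shorter than required, keeping existing contents.
fn ensure_len(buffer: &mut CpuBufferVec<u32>, len: usize) {
    if buffer.values().len() < len {
        buffer.values_mut().resize(len, 0);
    }
}
```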
