Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instanced line rendering for gizmos based on bevy_polyline #8427

Merged
merged 21 commits into from
Jun 13, 2023

Conversation

irate-devil
Copy link
Contributor

@irate-devil irate-devil commented Apr 17, 2023

Objective

Adopt code from bevy_polyline for gizmo line-rendering.
This adds configurable width and perspective rendering for the lines.

Many thanks to @mtsr for the initial work on bevy_polyline. Thanks to @aevyrie for maintaining it, @nicopap for adding the depth_bias feature and the other contributors for squashing bugs and keeping bevy_polyline up-to-date.

Before

Before

After - with line perspective

After

Line perspective is not on by default because with perspective there is no default line width that works for every scene.

After - without line perspective

After - no perspective

Somewhat unexpectedly, the performance is improved with this PR.
At 200,000 lines in many_gizmos I get ~110 FPS on main and ~200 FPS with this PR.
I'm guessing this is a CPU side difference as I would expect the rendering technique to be more expensive on the GPU to some extent, but I am not entirely sure.

irate-devil and others added 2 commits April 17, 2023 20:43
Co-authored-by: Jonas Matser <github@jonasmatser.nl>
Co-authored-by: Aevyrie <aevyrie@gmail.com>
Co-authored-by: Nicola Papale <nico@nicopap.ch>
@alice-i-cecile alice-i-cecile added A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times labels Apr 17, 2023
@alice-i-cecile alice-i-cecile added this to the 0.11 milestone Apr 17, 2023
@aevyrie
Copy link
Member

aevyrie commented Apr 17, 2023

Performance is better on my machine up until 220,000 lines in many_gizmos after which FPS suddenly tanks from ~200 down to ~60. Noting this so I can investigate later.

IIRC, you're going to want to batch lines to get really good performance. bevy_polyline can handle many millions of line segments by batching lines with the same material into the same polyline and separating each polyline in the single gpu struct with NaNs. See: ForesightMiningSoftwareCorporation/bevy_polyline#20.

You could probably aggregate all polylines during extract, and shove them into a single GPU uniform, where each set of vertices and colors are separated by a NaN, if you aren't doing this already. This would result in a single draw call for all polylines.

It's also possible bevy_polyline has a much higher ceiling because it doesn't include color information in instance data.

@irate-devil
Copy link
Contributor Author

I should have clarified that bevy_gizmos is currently exclusively an immediate mode API.
This means that, unlike with bevy_polyline, those 220,000 lines are being sent every single frame to render them, so it works a little differently (i.e. I'm not expecting the same performance characteristics here).

many millions of line segments

I assumed above that this number is referring to static polylines. Is that right?

You could probably aggregate all polylines during extract, and shove them into a single GPU uniform, where each set of vertices and colors are separated by a NaN, if you aren't doing this already. This would result in a single draw call for all polylines.

I learned the NaN trick from @IceSentry iirc, wouldn't be surprised if they got it from you :)

All lines are split across just 2 batches. One for the NaN separated linestrip topology and one for linelist topology (Used only for single segment lines)
Those 220,000 lines are all in the linelist batch.

It's also possible bevy_polyline has a much higher ceiling because it doesn't include color information in instance data.

I wonder how much this might affect the performance of this implementation. I would like to keep the fancy color gradients q:

@aevyrie
Copy link
Member

aevyrie commented Apr 17, 2023

All lines are split across just 2 batches. One for the NaN separated linestrip topology and one for linelist topology (Used only for single segment lines)

Sorry, I didn't see that when I did a brief review of the changes. It's pretty awesome you're already doing this. 🙂

I wonder how much this might affect the performance of this implementation. I would like to keep the fancy color gradients q:

Yeah, it's a nice feature to have. You might have luck by treating it like mesh indices, and only storing colors as a list of indices into the list of colors. My guess is for most cases, this would significantly cut down on the amount of data that needs to be yeeted (yoten?) over to the GPU. You would be reducing memory use for color from 16 bytes ([f32; 4]) per vertex down to 2 bytes (u16). This would allow you to store at most 65k distinct colors across all polylines, which is probably a reasonable limitation (1MB of color data). Though you could have 4 billion colors if you use a u32 as your color index.

Alternatively, you could keep the same approach as now, but use run length encoding to losslessly compress color information for polylines that use a single color. Another upside of that is it would allow you to easily add per-vertex line width and depth bias without blowing up memory usage in most cases.

Anyway, these are obviously some potential optimizations for a future day. This PR is definitely a nice addition.

@irate-devil
Copy link
Contributor Author

No need to apologize :) Thanks for all the pointers!

Another upside of that is it would allow you to easily add per-vertex line width and depth bias without blowing up memory usage in most cases.

It's good that you mention this. I was planning to enable more granular control over these settings in a future PR

@IceSentry
Copy link
Contributor

This PR is pretty big, is there anything that could be done in a follow up PR in order to keep this one a bit smaller?

I learned the NaN trick from @IceSentry iirc, wouldn't be surprised if they got it from you :)

Yep 😉

@irate-devil
Copy link
Contributor Author

irate-devil commented Apr 18, 2023

This PR is pretty big, is there anything that could be done in a follow up PR in order to keep this one a bit smaller?

This is pretty much the minimal change required to do this line rendering technique. I already stripped down the code a quite a bit compared to bevy_polyline. Especially considering this includes both a 3d and a 2d pipeline.

superdump pushed a commit that referenced this pull request Apr 22, 2023
# Objective

Compute the `vertex_count` for indexed meshes as well as non-indexed
meshes.

I will need this in a future PR based on #8427 that adds a gizmo
component that draws the normals of a mesh when attached to an entity
([branch](irate-devil/bevy@instanced-line-rendering...devil-ira:bevy:instanced-line-rendering-normals)).

<details><summary>Example image</summary>
<p>


![image](https://user-images.githubusercontent.com/29694403/233789526-cb5feb47-0aa7-4e69-90a2-e31ec24aadff.png)

</p>
</details> 

## Solution

Move `vertex_count` field from `GpuBufferInfo::NonIndexed` to `GpuMesh`

## Migration Guide

`vertex_count` is now stored directly on `GpuMesh` instead of
`GpuBufferInfo::NonIndexed`.
Copy link
Contributor

@nicopap nicopap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. You say it's a perf upgrade compared to LineList? That's surprising, but good for us, because that would be amazing to have size (even if joints are still an open question). I'd just like to avoid some magic numbers.

crates/bevy_gizmos/src/lib.rs Show resolved Hide resolved
crates/bevy_gizmos/src/lib.rs Outdated Show resolved Hide resolved
Comment on lines +220 to +226
struct LineGizmoUniform {
line_width: f32,
depth_bias: f32,
/// WebGL2 structs must be 16 byte aligned.
#[cfg(feature = "webgl")]
_padding: bevy_math::Vec2,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So only a global line_width and depth_bias is possible? Given you only upload two instances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, yes. I'll look into more granular configuration in a future PR.

crates/bevy_gizmos/src/lib.rs Outdated Show resolved Hide resolved
crates/bevy_gizmos/src/lib.rs Outdated Show resolved Hide resolved
crates/bevy_gizmos/src/lib.rs Outdated Show resolved Hide resolved
crates/bevy_gizmos/src/lib.rs Outdated Show resolved Hide resolved
crates/bevy_gizmos/src/pipeline_3d.rs Outdated Show resolved Hide resolved
irate-devil and others added 2 commits April 23, 2023 21:01
@nicopap nicopap self-requested a review April 23, 2023 19:41
@hate
Copy link
Contributor

hate commented May 31, 2023

This looks great, the performance increase is a significant one

Copy link
Contributor

@IceSentry IceSentry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, there's a couple of things that could be improved in the future, but it's already in a really nice state for 0.11

@nicopap nicopap added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jun 11, 2023
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Jun 12, 2023
@alice-i-cecile
Copy link
Member

Merging for you: y'all have carefully considered this and have more than enough expertise between you. The code and changes look good from my end too.

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 12, 2023
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Jun 12, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 12, 2023
@mockersf mockersf added this pull request to the merge queue Jun 12, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 12, 2023
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Jun 12, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 12, 2023
@nicopap
Copy link
Contributor

nicopap commented Jun 13, 2023

Dang merge failed

@mockersf mockersf added this pull request to the merge queue Jun 13, 2023
Merged via the queue into bevyengine:main with commit 001b3eb Jun 13, 2023
24 checks passed
@irate-devil irate-devil deleted the instanced-line-rendering branch June 14, 2023 13:22
github-merge-queue bot pushed a commit that referenced this pull request Jul 15, 2023
# Objective

Gizmos are intended to draw over everything but for some reason I set
the sort key to `0` during #8427 :v

I didn't catch this mistake because it still draws over sprites with a Z
translation of `0`.

## Solution

Set the sort key to `f32::INFINITY`.
cart pushed a commit that referenced this pull request Aug 10, 2023
# Objective

Gizmos are intended to draw over everything but for some reason I set
the sort key to `0` during #8427 :v

I didn't catch this mistake because it still draws over sprites with a Z
translation of `0`.

## Solution

Set the sort key to `f32::INFINITY`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants