Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.x] Implement LOD range in GeometryInstance #53778

Closed
wants to merge 1 commit into from

Conversation

Calinou
Copy link
Member

@Calinou Calinou commented Oct 13, 2021

Salvaged version of #20489. Thanks @Windfisch for the implementation 馃檪

Similar to the one present in master, this provides a LOD system for GeometryInstance-derived nodes such as MeshInstance and MultiMeshInstance.

This LOD system currently doesn't obey shadow rendering, but it can still provide substantial performance improvements when used well. (To prevent multi-LOD objects from casting shadows over each other, only make one of the LOD levels cast shadows by configuring other meshes' Shadow Casting property to Off.)

On the other hand, this LOD system should be able to handle split screen, unlike the one currently present in master.

Coupled with the occlusion culling systems (portals/rooms or occluder spheres), this makes possible to achieve much greater performance in large detailed scenes.

Testing project: test_lod_3.x.zip

Rationale

As you may have seen in #20489's comments, the PR was closed since a different approach was adopted in master. The closing comment refers to the automatic LOD system, which can't be backported to 3.x as it would require too many backwards-incompatible changes. The automatic LOD system is a good start as it requires no manual setup, but it's not suited to every use case and can't perform HLOD (merging several distant objects into a single draw call).

However, the manual (H)LOD system can still be backported or reimplemented in a limited fashion in 3.x, while still providing substantial performance improvements with some manual setup.

The existing LOD properties can be read by a Godot 4.0 converter to allow LOD to work automatically when porting a project from Godot 3.x to Godot 4.0.

Preview

The testing project has 1,134 boxes with 16,000 triangles each (20脳20脳20 subdivision).

No LOD

No LOD

The video is stuttery due to how I recorded it, but it's much smoother in practice.

no_lod.mp4

With LOD

With LOD

The video is stuttery due to how I recorded it, but it's much smoother in practice.

with_lod.mp4

TODO

@EzraT
Copy link

EzraT commented Jan 8, 2022

Thank you!

instance->lod_begin = p_min;
instance->lod_end = p_max;
instance->lod_begin_hysteresis = p_min_margin;
instance->lod_end_hysteresis = p_max_margin;
}
void VisualServerScene::instance_geometry_set_as_instance_lod(RID p_instance, RID p_as_lod_of_instance) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should be removed, nothing ever calls it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That method is exposed to the scripting API 鈥撀爓ouldn't removing it technically break compatibility?

@akien-mga akien-mga requested a review from JFonS February 15, 2022 13:11
@Calinou Calinou force-pushed the implement-lod-range branch 2 times, most recently from 58dffe9 to eecc9cd Compare May 11, 2022 07:45
// Calculate instance->depth from the camera.
const Vector3 aabb_center = ins->transformed_aabb.position + (ins->transformed_aabb.size * 0.5);
ins->depth = near_plane.distance_to(aabb_center);
ins->depth_layer = CLAMP(int(ins->depth * 16 / z_far), 0, 15);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calculation seems only required here if LOD is active for this instance. This could be set efficiently with e.g. a bool when setting the LOD params for the instance. i.e.:

if (ins->lod_active) ... do lod calcs

In addition the depth calculated here could potentially be reused in many cases later in the line 3305 part, to save doing the same calculation twice. Although that would need some perf testing to check it is a win.

Copy link
Member Author

@Calinou Calinou May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calculation seems only required here if LOD is active for this instance. This could be set efficiently with e.g. a bool when setting the LOD params for the instance. i.e.:

Do we need a dedicated boolean field for this? We can check if LOD is enabled by checking whether the min range or max range is greater than 0 (exclusive). We can also check for hysteresis being greater than 0 (exclusive) if any of those conditions is true, but lazy evaluation can take care of that for us anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't profiled this, but imo a good aim is to make any new code as cheap as possible for instances that aren't using LOD. We essentially should aim not to be doing extra unnecessary calculations for the 99% of times where people aren't using LOD on an instance.

Doing unneeded distance calculations, and checking two float values against CMP_EPSILON is not as cheap as just detecting a bool is set to false which will likely be the cheapest conditional (bear in mind on some mobile floating point math could even be emulated). This double float check is additionally now not required at runtime should the bool be set to true.

There are lots of occasions where just doing things unoptimized are fine (where code clarity is important, and it isn't performance sensitive), but inside the renderer imo we should strive to make it as efficient as possible as there is a fair amount of churn, and we are just wasting cycles and battery on mobile.

Looking at the Instance : RasterizerScene::InstanceBase structure, it looks like RID lod_instance is unused so could be removed. You could for instance add a bool up with the update_materials bools, and you could possibly make them 1 bit too (e.g. bool update_aabb : 1;), so they would likely be no more than 32 bit between them.

In instance_geometry_set_draw_range() you could just do something like this:

instance->using_lod = (instance->lod_begin > CMP_EPSILON) || (instance->lod_end > CMP_EPSILON);

Therefore the float check is done once (rarely) when setting the params, and the cheap bool test can be done at runtime on each frame on each prepare_scene.

@lawnjelly
Copy link
Member

lawnjelly commented May 25, 2022

One thing I'm mostly unsure about is the use of a hashmap to store the hysteresis state of each instance on each camera. Even with the improvements to hashmap in 4.x (which aren't available in 3.x), this seems like a sub-optimal way of storing this for realtime use.

I wonder whether we could simply store a vector of instance IDs per camera that are in a "far" hysteresis state. You could run through this vector at the start of prepare_scene, set a variable on the instances concerned to a "current counter" (static incremented by one each prepare_scene), delete the vector, then when running through the instances you could simply check the stored counter on the instance against the current_counter to determine the hysteresis state (if they are equal, state is "far", if not, is near). And store the latest hysteresis state into the cleared vector for the next time if it is far this time.

The only snag here is that objects that move outside the view frustum would lose their hysteresis state (but maybe that would not matter? not sure).

An alternative approach might be to store a bitfield on each instance of the hysteresis state for each camera (say 32 bit?). Cameras could be assigned a hysteresis ID as they are created, with a limit of 32 current for the hysteresis to work. I don't know if anyone would need more than 32 cameras...

These are just some ideas but personally I would really try and avoid the hashmap for this kind of thing, if possible.

Similar to the one present in `master`, this provides a LOD system
for GeometryInstance-derived nodes such as MeshInstance
and MultiMeshInstance.

This LOD system currently doesn't obey shadow rendering, but it
can still provide substantial performance improvements when used well.

Coupled with the occlusion culling systems (portals/rooms or
occluder spheres), this makes possible to achieve much greater
performance in large detailed scenes.

Co-authored-by: Florian Jung <florian.jung@fau.de>
@lawnjelly
Copy link
Member

lawnjelly commented May 30, 2022

Just doing some more testing. I've profiled with 100,000 boxes (not all at the same time, it was kind of derived from your test project, but this should be indicative), with most of them being LOD culled:

lod_profile

So as expected the bottlenecks look to be

  • prepare_scene
  • BVH culling
  • OAHashMap

Within prepare_scene the LOD calculations look like they are big features. Also it looks like it is still doing a bunch of processing even if keep is set to false. Some of this can probably be avoided.

lod_profile_1
lod_profile_2
lod_profile_3

@lawnjelly
Copy link
Member

lawnjelly commented May 30, 2022

As I was saying on rocket chat, for the extreme situation of an open world, where the user simply wants some far away stuff not to show up, we should probably look at modifying VisibilityEnabler to be able to activate at a distance (rather than just on or off screen). This is because VisibilityEnabler works SceneTree side, and can prevent processing, and could possibly tell the VisualServer to make instances visible / invisible via removing from the BVH, which would also reduce the BVH culling cost (in exchange for a cost of adding / removing them at runtime as the cameras move around - this trade off might be worth it though).

An alternative approach (like in the addon) is to maintain a list of currently LODed instances on the camera, and run through this list round robin fashion, and not necessarily do the whole list every frame. If this was done over say 10 frames, you have a 1/10th of the checks necessary (at a cost of a slight lag to the LOD change). This would be in exchange for a slight complexity of adding LODed instances to the relevant camera lists as cameras are added / destroyed, and instances are added / destroyed.

Having said that, I don't think the above is necessary for a first implementation, provided that we make the LOD check highly efficient in the code as I outlined earlier, by checking e.g. a bool.

See https://github.com/lawnjelly/godot/tree/calinou_lod_range_modified . I'll try and push some commits for possible changes.

@lawnjelly
Copy link
Member

lawnjelly commented May 30, 2022

The changes I added to https://github.com/lawnjelly/godot/tree/calinou_lod_range_modified now seem to approx double the frame rate in the extreme test case (from 60fps to 120fps).

  • bool lod_active
  • eliminate calculations when keep is false
  • change from storing hysteresis state on OAHashMap to bitset per instance

The downside (versus the HashMap) is that the bitset is currently hardcoded to a max of 32 current cameras (above that, hysteresis will not work on added cameras). There is a typedef so it can be changed to e.g. 64, but 32 should be fine imo.

Usability

Although most comments I made earlier are to do with performance, given we have a bit of time before 3.6 we should consider whether there is anything we can do to increase the usability (especially regarding visibility parents, and auto-generating LODs / merging as a menu option). I'm not super familiar with the LOD in 4.x, but hopefully some of the guys that worked on that there can chime in such as @JFonS .

Self shadowing

Thinking ahead one of problems that might be faced is the difference between the LOD used for shadows and rendering which can cause nasty artefacts. This is probably an issue just using a low LOD for the shadow map and a higher LOD for rendering. However it can potentially be reduced if you can use the same LOD for shadow map and rendering (except in the case of fading between LODs). I don't know how 4.x deals with this currently.

However, this PR is doing LOD and storing hysteresis states independently for different cameras, whereas shadow maps are rendered once per light afaik. This means you no longer have the option to just pick the same LOD for the shadow map, because with a split screen game, different screens may be rendering a different LOD from the camera view.

An alternative might be to make the decision of which LOD to use globally (i.e. not per camera). You would probably have to do a step ahead of time to calculate the closest camera and choose the highest LOD (or average or something) determined by the closest camera. This might lead to some weirdness in split screen as far off objects move in and out of the view of a secondary camera, but it could help with the shadow artefact problem.

Anyway this isn't saying we should necessarily use the other approach, as there are problems with both. You could also render different shadow maps per camera, but this would be slow and quite complex to change.

@TokisanGames
Copy link
Contributor

TokisanGames commented May 30, 2022

Does this have all the features of multilod? It's already well tested and stable.
https://github.com/puchik/godot-extras/tree/master/gdnative/multi-lod

Eg, the plugin manages shadow impostors, sends signals on lod changes, and adjusts lights and MMI.

@Calinou
Copy link
Member Author

Calinou commented May 30, 2022

Does this have all the features of multilod? It's already well tested and stable. https://github.com/puchik/godot-extras/tree/master/gdnative/multi-lod

Eg, the plugin manages shadow impostors, sends signals on lod changes, and adjusts lights and MMI.

See the first post in #58512 for light LOD. I started working on a 3.x backport but didn't manage to complete it yet.

Shadow meshes aren't supported in 3.x yet (but they are in master). It's not striaghtforward to backport, but it's technically feasible. This should be tackled in a separate PR though.

MultiMeshInstance might work already as the LOD properties used in this PR are available on any node that inherits from GeometryInstance, not just MeshInstance.

As for signals on LOD changes, I'm not sure if it's worth the added complexity. We try to keep the feature set as consistent as possible between master and 3.x, so adding features that only make sense in 3.x should be avoided as it makes upgrades to Godot 4 more difficult.
You may also consider that VisibilityNotifier is getting a "max distance" property: #61544

@TokisanGames
Copy link
Contributor

Alright, thanks. Signals on lod change are essential for us. It serves as a proximity detector, similar to visibility notifier but for multiple distances. We'll continue to use puchik's multi-lod in 3.x until we upgrade to GD4.

@fire
Copy link
Member

fire commented May 31, 2022

The godot https://github.com/CaptainProton42/GodotHoloPlay uses 35 cameras I think. @CaptainProton42

@lawnjelly
Copy link
Member

The godot https://github.com/CaptainProton42/GodotHoloPlay uses 35 cameras I think. @CaptainProton42

This case looks more suitable for manual LOD though (if indeed it is required). A situation like that will likely benefit more from performing LOD once per object, rather than once per camera.

@EzraT
Copy link

EzraT commented Jun 18, 2022

Will this be merged before 3.5 stable releases?

@Calinou
Copy link
Member Author

Calinou commented Jun 18, 2022

Will this be merged before 3.5 stable releases?

3.5 is nearing release, so I'm afraid not. This will have to wait for 3.6, but nothing prevents you from building your own editor and export template binaries with this PR included or sticking to godot-lod.

@akien-mga akien-mga modified the milestones: 3.5, 3.x Jun 20, 2022
@Calinou Calinou changed the title Implement LOD range in GeometryInstance [3.x] Implement LOD range in GeometryInstance Jun 27, 2023
@EzraT
Copy link

EzraT commented Nov 21, 2023

Just want to throw in my own 2 cents on this.
I know there are other LOD options available for Godot 3, I've used godot-lod before in my Godot projects and it works good, and yes I know I can build my own editor with this PR merged in.

But I do personally think its a very bad look to have the LOD properties available in the inspector while this isn't actually implemented in the engine, especially considering Godot 3 is labeled as a LTS version now.

Considering this is probably not going to be merged before 3.6 is released, why not just decide to hide these properties until this is actually implemented, to avoid confusion?
They don't do anything so there wouldn't be any compatibility breakage happening or anything like that.

This is the type of thing that confuses and potentially turns off new users.

Am I wrong?

https://www.reddit.com/r/godot/comments/180jtdh/does_anyone_know_when_is_this_actually_making_it/

@TheConceptBoy
Copy link

TheConceptBoy commented Nov 22, 2023

But I do personally think its a very bad look to have the LOD properties available in the inspector while this isn't actually implemented in the engine, especially considering Godot 3 is labeled as a LTS version now.

I wanted to pitch this as well. It's in the UI, clearly someone has already deemed it worthy of proposing and a whole lot of other people have already approved it to warrant adding. LOD is kind of a must-have and there's no way that a GDScript implementation is anywhere as performant as a C++ implementation (considering that allegedly, GDscript had a speed boost in Godot 4). Adding either a script to run logic or a whole node just to get LOD in seems bloated. Not to mention that you can't use this LOD node with GLTF animated meshes like characters that have baked in LOD meshes. This is a must have for any engine, especially for those who have existing projects in Godot 3 and a lot of props.

@akien-mga
Copy link
Member

Superseded by #85437.

@akien-mga akien-mga closed this Mar 8, 2024
@AThousandShips AThousandShips removed this from the 3.x milestone Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants