Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BVH templated mask checks and generic NUM_TREES #55640

Merged
merged 1 commit into from Jan 21, 2022

Conversation

lawnjelly
Copy link
Member

@lawnjelly lawnjelly commented Dec 5, 2021

Refactors the BVH to make it more generic and customizable. Instead of hard coding the system of pairable_mask and pairable_type into the BVH, this information is no longer stored internally, and instead the BVH uses callbacks both for determining whether pairs of objects can pair with each other, and for filtering cull / intersection tests.

In addition, instead of hard coding the number of trees, the BVH now supports up to 32 trees, and each object can supply a tree collision mask to determine which trees it can collide against.

This enables the BVH to scale to either the two or 3 trees needed in physics, and the single tree used without pairing in Godot 4 render tree.

Appears to fix #47316 based on closed source testing.

Introduction

This is the second part of a major refactor to the BVH (after the move to expanded bounds in the leafs), and constitutes a departure from the historical model used by the octree in 3.x. Initially when writing the BVH the emphasis was on backward compatibility, even though the old approach had ended up rather contrived trying to accommodate both the render and physics trees.

The system of pairable_type and pairable_mask worked in a generic fashion in the tree and bore little resemblence to the actual types in the physics and rendering.

Instead the general approach in Godot 4 has been to move much of the client specific checks out of the tree and into the client code. Following this model, this PR no longer stores pairable_mask and pairable_type data in the BVH. Instead it uses callbacks, both during pairing, and during cull checks, to allow the client code perform whatever checks are appropriate, using data stored client side.

This means that:

  • Client code is now no longer restricted to logic using a type and mask. It can use whatever logic makes sense for that system, or indeed no checks at all (e.g. render tree in Godot 4).
  • The checks and data are now stored client side. There is thus more potential for slowdown due to e.g. cache misses and lack of inlining. I'll be profiling and tuning this as there is a slight slowdown.

Multiple user definable trees

One feature the BVH introduced was the ability to use 2 trees, one for pairable, and one for non-pairable objects, with the view that non-pairable would not need to be checked against other non-pairable objects when moving. Although this has been the historical design, the historical design has a problem : non-monitorable Areas do not detect collisions with StaticBodies, because both are in a non-pairable tree (#17238). This bug is now no longer closed in this PR (to allow easier comparison with previous version) and will be in a follow up.

It had become clear that a better way of approaching this mechanism was instead of hard coding the BVH to use 2 trees, the number of trees is now supplied as a template parameter. It will thus have no cost when using a single tree, but also allows the client code to utilize up to 32 trees for new more efficient capabilities.

In order to make use of the (up to) 32 trees, objects are now no longer pairable or non-pairable, they now have a tree_id and a tree_collision_mask. Each object belongs to a single tree, and the mask determines which other trees the object will collide against (in a similar manner to collision masks, but this is actually far more efficient).

For now in the PR the BVH created by the client code in the physics and rendering exactly mirror the old behaviour. But there is now the possibility to make use of this extra functionality.

Cons

  • Client code is a little thicker. Clients are now responsible for declaring callbacks for pair_checks and cull_checks. These are not a lot of code, and comes hand in hand with making the system user definable.
  • Any new functionality used will not be available in octree. On the other hand we are currently phasing out octree, and few people are using it.
  • Any new functionality exploited in terms of more custom checks would have to also be fixed to work when Bullet is running. This would probably not be that involved.
  • Testing - As this is quite major surgery, it will require extensive testing for regressions. There is no reason why it should not work similarly to the old system, but even a small typo could lead to a bug, so I would be surprised if there weren't a few small regressions that need fixing.

Discussion

It is actually possible to stay with the existing approach, but given the move to Godot 4 I personally feel that rather than adding bandaids on an old design (which is probably from Godot 3.0 or earlier) it is a great opportunity to refactor and clear out old cruft, and make the system generic and more sensible for the future.

Regarding whether we do actually soon make use of the new capability for multiple trees, this is something we can discuss in a separate proposal. The opportunity will be there, but we do not have to make use of it.

Copy link
Contributor

@pouleyKetchoupp pouleyKetchoupp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! The concept of having multiple trees will allow interesting optimizations in physics.

I haven't checked the BVH code itself yet, but I had a look to the physics client code and left some comments to discuss a few things.

// NOTE This is a pain. With templated mask checks, this is now done ahead of
// time in the BVH, so this check is not needed, and valid_collision_pair can be
// hard coded to true. But bullet and octree may still need this
// check. We may have to do this even if not required in BVH.
bool valid_collision_pair = p_object_A->test_collision_mask(p_object_B);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this check here shouldn't be a problem.

Bullet implementation is independent from this code as it implements a whole different physics server, so we could just move the collision check to the octree implementation here:

void *BroadPhaseOctree::_pair_callback(void *self, OctreeElementID p_A, CollisionObjectSW *p_object_A, int subindex_A, OctreeElementID p_B, CollisionObjectSW *p_object_B, int subindex_B) {
BroadPhaseOctree *bpo = (BroadPhaseOctree *)(self);
if (!bpo->pair_callback) {
return nullptr;
}
return bpo->pair_callback(p_object_A, subindex_A, p_object_B, subindex_B, nullptr, bpo->pair_userdata);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have eliminated the test here and moved the test into the octree. The octree version could do with a second pair of eyes to check is correct.

@@ -33,7 +33,9 @@
#include "core/project_settings.h"

BroadPhase2DSW::ID BroadPhase2DBVH::create(CollisionObject2DSW *p_object, int p_subindex, const Rect2 &p_aabb, bool p_static) {
ID oid = bvh.create(p_object, true, p_aabb, p_subindex, !p_static, 1 << p_object->get_type(), p_static ? 0 : 0xFFFFF); // Pair everything, don't care?
uint32_t tree_id = p_static ? 0 : 1;
uint32_t tree_collision_mask = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this will cause all static objects to pair with other static objects. Isn't it going to cause an overhead in cases involving lots of static objects or areas?

As a first step, I would rather keep the exact same logic as before, even if it means we don't fix the area issue with this PR yet, so we don't have surprises with performance.

And then it will be a bit clearer to implement the extra tree in another PR to fix the area vs. static bodies issue.

I would find that order of operations easier when it comes to comparing performance and finding regressions, but of course it's open to discussion if you disagree :)

So if I understand correctly, this line would be like that in order to keep the same logic:

Suggested change
uint32_t tree_collision_mask = 3;
uint32_t tree_collision_mask = p_static ? 0 : 3;

Copy link
Member Author

@lawnjelly lawnjelly Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it didn't occur to me that there may actually be an overhead in the physics from having e.g. static objects paired with each other. As far as the BVH is concerned, there is no (or very small) overhead as they aren't moving. Is there a physics overhead?

But yes this does lend weight to the idea of having 3 trees, and having areas collide with statics, statics with areas, but no area - area or static - static pairs. We can do this pretty easily now it is all user definable client side.

Yes you are right it would be better to put the interaction changes in a separate PR. 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a physics overhead?

There should be no overhead on the physics side as long as it doesn't actually create new pairs (which should be the case with the cull check).

What I'm wondering is if we're going to have an overhead in the BVH itself if, let's say, you add lots of static bodies all at once in the broadphase. Even if they don't move later, they might cause an initial pairing check with each other.

It seems to me that the closest to the previous behavior in term of performance and functionality would be to:
-Start with using tree masks (but still with just 2 trees, static and non-static)
-Remove the object type thing (it wasn't really used anyway since objects were either paired with everything or nothing)
-Just return true in the cull check

And then in a separate PR, we can add one more tree and fix the area vs. static scenario.

I would be less worried for performance in some corner case scenarios this way.
Does that make sense to you?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is now as suggested above (I've used an enum which isn't really necessary, but might help for readability).

// we are changing the mask to ALWAYS collide
// with other objects.
// In order to revert, uncomment the next line.
//_bvh_mask = is_static() ? 0 : 0xFFFFFFFF;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see you moved the logic for pairable/unpairable here.

I'm not sure I understand why having this layer of bvh_mask and bvh_type in physics objects though. Isn't it more efficient to just use tree_id and tree_mask to replace the previous pairable/unpairable logic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we may be able to do this for the physics (we can just return true here and it will compile out). The render tree is a bit more tricky here as it has more types etc so it may still need this mechanism, I haven't yet looked in detail at which situations we can solve with the multiple trees, as the render tree logic is a bit more complex if I remember right.

As it is, this is pretty much porting the existing behaviour, and not yet taking advantage of the tree masks. We could e.g. have first PR just port the existing behaviour, next PR changes to use the trees for this, third to change the interactions to fix the area / static bug, so we can evaluate each at a time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally bvh_mask and bvh_type can also alternatively be derived from the object type etc on the fly, I haven't profiled these to compare. However if we are able to use the tree masks for this, they wouldn't be necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, makes sense to keep the cull check mechanism if it can be useful for render at least (and maybe in physics in the future if we encounter new cases where it would benefit performance).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the user_cull_check a null check now, so it is just relying on the tree masks.

}

public:
static bool user_cull_check(const T *p_a, const T *p_b) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were filtering objects out using tree mask it seems we could also get rid of the extra callback here. I might be missing something in the way the BVH trees work though so let me know if it's the case :)

Copy link
Member Author

@lawnjelly lawnjelly Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use the tree masks for this, we would just make this return true, and it should compile out. This check is still used in the render trees so we can't remove it altogether.

There's also a dummy version of this in the BVH_tree file which does exactly this, just returns true, for this purpose, and is the default parameter for the BVH template.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now just returns true. I could remove this code and just use the Dummy version specified in BVH_tree, or leave this here as a stub in case we want to use it in future. Either is fine, whichever is clearer.

@lawnjelly lawnjelly force-pushed the bvh5_multi_tree branch 2 times, most recently from 55baa0d to c8778fb Compare December 11, 2021 14:44
@lawnjelly
Copy link
Member Author

Ok have made the changes we talked about, and going with the idea of trying 3 trees in a followup PR, which should be fun.

Could just do with some testing now, as there's a fair potential for regressions in the cases of mask changes etc not being detected.

Also note with the cull tests, there's now the future potential for users to test against individual trees rather than testing everything (the mask is currently just hard coded to FFFFFFFF in the function). This could potentially be quite handy if they e.g. just want to test against statics.

@lawnjelly lawnjelly marked this pull request as ready for review December 12, 2021 15:48
@lawnjelly lawnjelly requested review from a team as code owners December 12, 2021 15:48
@lawnjelly lawnjelly changed the title [WIP] BVH templated mask checks and generic NUM_TREES BVH templated mask checks and generic NUM_TREES Dec 12, 2021
Refactors the BVH to make it more generic and customizable. Instead of hard coding the system of pairable_mask and pairable_type into the BVH, this information is no longer stored internally, and instead the BVH uses callbacks both for determining whether pairs of objects can pair with each other, and for filtering cull / intersection tests.

In addition, instead of hard coding the number of trees, the BVH now supports up to 32 trees, and each object can supply a tree collision mask to determine which trees it can collide against.

This enables the BVH to scale to either the two or 3 trees needed in physics, and the single tree used without pairing in Godot 4 render tree.
Copy link
Member

@akien-mga akien-mga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍

Probably needs a rebase just to make sure it still builds and work after ~1 month of extra changes in 3.x.

@lawnjelly
Copy link
Member Author

Will push rebased. Seemed to compile fine for me. I'll just do a final testing to refresh my memory but if it is all ok should be okay to merge this afternoon.

@lawnjelly
Copy link
Member Author

Yup it looks fine, I've been through the PR source to remind myself and done some final testing. Should be good for merging. 👍

@akien-mga akien-mga merged commit 66672c0 into godotengine:3.x Jan 21, 2022
@akien-mga
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants