Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updateMatrixWorld Optimization using new Object3DMatrixData class #25114

Open
wants to merge 26 commits into
base: dev
Choose a base branch
from

Conversation

diarmidmackenzie
Copy link
Contributor

@diarmidmackenzie diarmidmackenzie commented Dec 11, 2022

Related issue: #25115

Description

This PR shows a set of changes to Object3D that delivers a substantial performance improvement to updateMatrixWorld() under a "realistic" benchmark test proposed under PR #25113.

For pre-existing benchmark tests, it appears to be approximately performance-neutral.

Therefore this PR only becomes relevant if we reach a consensus that the Benchmark test added under PR #25113 (or something similar) is indeed better representative of "real world" performance than the pre-existing Benchmark tests.

If we do agree on that, then I think the approach proposed in this PR is an interesting way forwards, though it will need more work to develop into a PR that's ready to merge.

In summary, this PR attempts to optimize the traversal through the scene graph in updateMatrixWorld() by guaranteeing iteration over a monomorphic set of objects, resulting in superior performance for retrieval of object properties.
(see this article for an overview of why this makes a difference).

It achieves this as follows:

  • It creates a new class Object3DMatrixData, which contains all the properties that needs to be accessed in updateMatrixWorld() - 2 x matrices, some flags, position, quaternion & scale.
  • On creation, Object3D instantiates a new instance of Object3DMatrixData, and also sets up a number of properties, getters & setters to map through to these (so that Object3D's external interface is unchanged).
  • The implementation of updateMatrixWorld is moved onto the Object3DMatrixData class, with the Object3D.updateMatrixWorld() method mapping through to this

This approach should preserve the existing interface for Object3D, but allows updateMatrixWorld() to iterate exclusively through a monomorphic population of Object3DMatrixData objects, which should be substantially more performant.

For the new benchmark proposed in #25113, this PR delivers a 50% performance improvement, from 125 ops/second to 180+ ops/second on my system (Windows + Chrome)

For the existing benchmarks, it seems to be performance neutral (to within the margin of error of these tests).

I don't know why this PR doesn't bring performance on the new benchmark up to the levels seen with the existing benchmarks (250 ops/second) - this suggests that there are other factors at play that I don't fully understand.

Nevertheless, a 50% performance improvement in real-world use cases is a substantial gain. Given that I believe this can be delivered without significant side-effects, and with no externally-visible changes to Object3D, I think it is well worth looking at this further.

Perf not looking great - linked list probably not a good solution as deciding when to stop traversing is too expensive.
Linked list was performing worse, not better.
Saves having to reference data on Object3D during updateMatrix()
@diarmidmackenzie
Copy link
Contributor Author

I have checked existing Unit Tests all run, which suggests I have achieved my goal of 100% maintaining the pre-existing Object3D interface.

Probably more work needed on this PR before merging:

  • Some additional Unit Tests
  • Perhaps update some other code such as updateWorldMatrix() to take advatange of the new option for monomorphic

But I'd rather get some initial feedback on this fix before doing further work on it, and I think the code is ready for an initial review, so marking this as "Ready for Review" now.

@diarmidmackenzie diarmidmackenzie marked this pull request as ready for review December 14, 2022 09:45
@diarmidmackenzie
Copy link
Contributor Author

Looking into e2e test failures:
webgl_geometry_shapes and webgl_geometry_convex do appear broken.

webgl_geometry_convex starts in wrong position, and then switches to correct position on touching the middle mouse wheel
webgl_geometry_shapes stays fixed in wrong position (this example is not sensitive to middle mouse wheel)

Not yet clear what specifically about these examples leads to problems, when other examples seem fine...

@diarmidmackenzie
Copy link
Contributor Author

OK, understand the problem, and it is very much a real one...

Various classes extend updateMatrixWorld() with additional function.

A search for super.updateMatrixWorld() reveals the following, about half in core THREE.js, the other half in examples\jsm.

image

The implementation in this PR is just invoking updateMatrixWorld() on the Object3DMatrixData and assuming that's enough. It's not.

This manifested in te e2e test failures above as the Camera pointing in the wrong direction as a result of matrixWorldInverse not being set up correctly (which is what the Camera extension to updateMatrixWorld() takes care of.

Presumably all the other classes above would exhibit deficiencies of equivalent severity.

Definitely a severe problem with this PR that needs solving.

@diarmidmackenzie
Copy link
Contributor Author

I've made a set of updates to accommodate the need to extend updateMatrixWorld() for certain sub-classes.

This is not very elegant, and I suspect there's a better solution (very happy to receive suggestions), but at least this shows that it is possible to extend behaviour on updateMatrixWorld() on a class-by-class basis while still keeping the main updateMatrixWorld() iteration loop monomorphic, and maintaining good performance.

Benchmarks on latest code look like this:

image

A key point is that I believe that all the sub-classes that extend updateMatrixWorld (e.g. lights, cameras, skinned meshes etc.) are going to have fairly low populations in the scene graph, so it should be OK that the calls to updateMatrixWorldAfter and updateMatrixWorldBefore are polymorphic, as they are only invoked for a small percentage of Object3Ds.

Code left over from having started with a slightly different implementation for extending updateMatrixWorld().  Can be removed now.
@diarmidmackenzie
Copy link
Contributor Author

diarmidmackenzie commented Dec 19, 2022

This comment on a related PR suggests that the API changes here are not going be acceptable:
#25142 (comment)

Specifically this part (which actually relates to the changes under this PR, rather than the extensions made by #25142)

Looking at the changes to the examples, the PR does not fulfill this premise.

Based on this comment, it seems that to save this PR, we'd need to find a way to continue to allow sub-classes of Object3D to override updateMatrixWorld() exactly as they do today.

I don't know of any way that can be done, while retaining the monomorphic iteration that this PR relies on for its performance gains.

commit 1cc4151
Author: diarmidmackenzie <diarmid.mackenzie@gmail.com>
Date:   Mon Dec 19 15:50:07 2022 +0000

    Fix linting

commit 49d73ee
Author: diarmidmackenzie <diarmid.mackenzie@gmail.com>
Date:   Mon Dec 19 15:46:09 2022 +0000

    Add comment explaining why we call these functions twice.

commit 414fece
Author: diarmidmackenzie <diarmid.mackenzie@gmail.com>
Date:   Mon Dec 19 15:44:43 2022 +0000

    Bind to sub-call methods correctly

commit 1119983
Author: diarmidmackenzie <diarmid.mackenzie@gmail.com>
Date:   Mon Dec 19 15:31:36 2022 +0000

    Fix indentation

commit 6630f54
Author: diarmidmackenzie <diarmid.mackenzie@gmail.com>
Date:   Mon Dec 19 15:30:56 2022 +0000

    More fix-up following revert of API changes

commit 8138ffb
Author: diarmidmackenzie <diarmid.mackenzie@gmail.com>
Date:   Mon Dec 19 15:24:12 2022 +0000

    Further fix up in reverting previous changes.

commit 533aa61
Author: diarmidmackenzie <diarmid.mackenzie@gmail.com>
Date:   Mon Dec 19 15:18:11 2022 +0000

    Allow sub-classes to override updateMatrixWorld() without any API change
@diarmidmackenzie
Copy link
Contributor Author

diarmidmackenzie commented Dec 19, 2022

I have figured out a way to (pretty much) maintain the existing API, while switching to monomorphic iteration. - just pushing updated code now.

The one caveat is that when a sub-class of Object3D overrides updateMatrixWorld(), the additional function gets invoked twice.

That shouldn't (and as far as testing shows, doesn't) create a functional issue, since anything done in updateMatrixWorld() should be idempotent (since the author of the class has no idea how many times updateMatrixWorld() could be called).

It's sub-optimal performance-wise, but the net performance impact is still strongly positive.

That said, I do think that the established pattern of sub-classes of Object3D overriding updateMatrixWorld() is actually not a good one.

My view is that updateMatrixWorld() for any class should do what it says - i.e. update the matrixWorld of the class. It seems that it is being used to hook in all kinds of sub-class-specific side-effects, and I suspect there is a cleaner way to implement such side-effects.

IMO it would make for a much cleaner interface to have Object3D offer onBeforeMatrixUpdate and onAfterMatrixUpdate callbacks, on the same model as onBeforeRender and onAfterRender.

One way forward would be:

  • Implement this PR in a way that continues to accommodate this pattern of sub-classes implementing side-effects by overriding updateMatrixWorld()
  • Also implement a PR to offer a new clearner interface for such side-effects: onBeforeMatrixUpdate and onAfterMatrixUpdate
  • Deprecate overriding of updateMatrixWorld() using the standard pattern of a generating a warning for a few releases, and eventually removing this option entirely.

This was causing failing Unit Tests when run on command line.
My error in thinking this check was needed.  Highlighted as an issue by DeepScan.
@diarmidmackenzie
Copy link
Contributor Author

I've created PR #25159, which is a variant of this one, but includes the updated (back-compatible) API as suggested here:
#25114 (comment)

That's not strictly required for this PR, as this PR can work OK with the old approach of overriding updateMatrixWorld() in sub-classes. But it delivers cleaner, more performant code, so could well be worth taking at the same time.

I'd be happy to close this PR and focus on #25159, if it seems like a more complete package.

@mrdoob
Copy link
Owner

mrdoob commented Dec 21, 2022

@diarmidmackenzie

Be aware that our time is pretty limited... I personally I'm unable to read long posts like the ones you're writing.

Please, try to be as concise as you can.

@diarmidmackenzie
Copy link
Contributor Author

Understood. Not planning to do more work on this or #25159 until I receive some feedback from one of the maintainers.

TLDR: these PRs deliver a ~50% performance gain on updateMatrixWorld() by restructuring code in a way the V8 engine can execute more efficiently.

My 1st attempt failed to preserve API back compatibility, but after feedback from @Mugen87, I have addressed that.

I believe there's value there, but of course you need to decide how you spend your time.

@robertlong
Copy link
Contributor

This looks really interesting! One minor suggestion. Maybe call it Transform instead of Object3DMatrixData?

@netpro2k @takahirox you may want to test this one out on Hubs without the current perf hacks and see how it does.

@diarmidmackenzie
Copy link
Contributor Author

Yes, I think Transform is a good name for capturing what this is.

The current naming presents this as an under-the-covers implementation detail that most people using THREE.js shouldn't need to worry about.

If we called it Transform, that starts to look like a more explicit addition of a new conceptual layer to the THREE.js class hierarchy, and encourages users to interact with it directly. That might be a good thing, but it also feels like a much more significant change...

@robertlong
Copy link
Contributor

I'm not so sure the naming should imply that it's more public than it is. There's some precedent with Source which is a mostly internal API.

@mrdoob
Copy link
Owner

mrdoob commented Jan 22, 2023

Before discussing naming it would be great if someone could do a thorough test on performance/memory implications of the approach.

@diarmidmackenzie
Copy link
Contributor Author

Performance is discussed here:
#25113

On existing perf benchmarks, the changes are ~neutral. However the existing benchmarks are flawed as they use a monomorphic set of Object3Ds, which is not realistic.

#25113 proposes some more realistic benchmarks, against which these changes show substantial improvements.

I'm not aware of any pre-existing memory usage benchmarks, but if someone can point me at them, I'd be happy to run them.

@Hoodgail
Copy link
Contributor

Hoodgail commented Oct 2, 2023

I'm not aware of any pre-existing memory usage benchmarks, but if someone can point me at them, I'd be happy to run them.
Devtools...

You can use console.profile() to start a benchmark and console.profileEnd() to stop it. This provides more than just memory benchmarking, and you can visualize even more data.

Explanation:

The original sentence is a bit wordy and could be more concise. It also doesn't explain the benefits of using console.profile() and console.profileEnd(). The rewritten sentence is more concise and informative. It also highlights the benefits of using these methods, such as the ability to benchmark more than just memory and to visualize more data.

Here is an example of how to use console.profile() and console.profileEnd() to benchmark a function:

function myFunction() {
  // Do something
}

console.profile('myFunction');
myFunction();
console.profileEnd('myFunction');

This will start a benchmark for the myFunction() function and stop it after the function has finished executing. You can then view the benchmark results in the browser's developer tools.

Visualization:
Once you have finished recording, you can view the profile once the its been stopped. The profile will show you a timeline of all the events that occurred during the recording. You can then zoom in and out of the timeline to see more detail.

The Performance tab also provides a number of other tools that you can use to analyze the performance of your code, such as the Call Tree and the Flame Chart. These tools can help you to identify bottlenecks in your code and to optimize its performance.

@Hoodgail
Copy link
Contributor

Hoodgail commented Oct 2, 2023

image
Remember to enable this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants