Optimize Viewport's _gui_find_control_at_pos for Container nodes#86361
Open
rsubtil wants to merge 1 commit intogodotengine:masterfrom
Open
Optimize Viewport's _gui_find_control_at_pos for Container nodes#86361rsubtil wants to merge 1 commit intogodotengine:masterfrom
_gui_find_control_at_pos for Container nodes#86361rsubtil wants to merge 1 commit intogodotengine:masterfrom
Conversation
aee087a to
b00b3d0
Compare
Maran23
reviewed
Dec 20, 2023
b00b3d0 to
f25bf72
Compare
MewPurPur
suggested changes
Dec 20, 2023
9c66e64 to
a0d26d3
Compare
MewPurPur
reviewed
Dec 21, 2023
MewPurPur
reviewed
Dec 21, 2023
a0d26d3 to
09a7911
Compare
MewPurPur
approved these changes
Dec 21, 2023
Contributor
MewPurPur
left a comment
There was a problem hiding this comment.
Looks good from a code review standpoint.
rsubtil
added a commit
to retrohub-org/godot
that referenced
this pull request
Feb 22, 2024
Optimize mouse UI picking on BoxContainer, FlowContainer and GridContainer
|
any updates? |
Member
Author
|
@virophagesp not any updates here, but this should still work properly. I deployed this in a custom Godot version for the app I'm developing, and so far I haven't seen nor received any bug reports around it. I do want to explore again the "unsafe" optimization I mentioned though, because the change in behavior I talked about seems to occur with this optimization as well, and if it is a reasonable assumption to make, the speedup should increase quite a bit more. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
(I couldn't find a related issue, but if desired, I can open one describing the problem in our project)
When developing an application for managing the user's gaming library, we've had a few users with thousands of games reporting extremely low performance whenever they interacted with any mouse input, such as picking and scrolling the library. This is easily reproducible by adding 10~20k
Controlnodes to some container, such asFlowContainer. When moving the mouse and/or scrolling around, the drop in FPS is extremely noticeable (while in this simple scenario the FPS are still good, on our application performance drops to ~20FPS and has very noticeable stutters) :Peek.2023-12-20.13-52.mp4
This slowdown was not visible in any of Godot's profilers, suggesting the issue lied somewhere in engine code. After profilling with external tools, the root cause was on how the viewport tests for valid controls under a given mouse position (
Viewport::_gui_find_control_at_pos):godot/scene/main/viewport.cpp
Lines 1695 to 1707 in 3ce73e5
This function iterates over every node's children, thus being O(N), which in our scenario is extremely wasteful due to our usage of
Containers with known positions for children nodes, and since this runs on every frame of input, it quickly becomes a noticeable bottleneck.By taking advantage of the
Container's behavior of automatically repositioning nodes, it was possible to greatly optimize this node lookup process by taking into account the mouse's position, lowering the candidate nodes considerably.Optimization
This introduces the
Container::get_children_at_posfunction for allContainernodes, which should return the candidate list of nodes likely to be at the given position. The default implementation returns all children, which should keep the current behavior for any containers not overriding this function. When the viewport traverses children, it will call this new function in order to optimize the search, and fallback to the existing behavior if no valid children are found. Thus, in theory, this change should not break or change any existing behavior.This PR overrides this function for
FlowContainer(HFlowContainer/VFlowContainer),BoxContainer(HBoxContainer/VBoxContainer) andGridContainer. These seemed the best candidates for this optimization, but if there's more containers that could benefit from this, I can add them as well.Note
In the following diagrams, grey numbers/dotted lines represent invisible nodes.
FlowContainerSince this container can fit an arbitrary number of nodes for each row, information about each row's position and first child index is stored. When searching for children, we find the row that contains the given position through a binary search, and return all the children in that row.
GridContainerSame behavior as
FlowContainer.BoxContainerBeing a one-dimensional container, only visible children's position and index are stored. When searching for children, we search for it's position with a binary search. This container thus only returns one child.
Benchmark
ReproUIPerformance.zip
This test project allows to test the performance of the viewport's node lookup. It should be opened in the editor, and before running, make one of the existing container's visible in order to let it spawn the children.
Invisvariants spawn some children as invisible. Each child is a button indicating it's index, and global (x, y) position.To test this yourself, enable only one of the container nodes, run the project, click on the
Benchmarkbutton, and don't move the mouse for the next 500 frames. The test will warp the mouse pointer to the center of the screen, and simulate both mouse motion and wheel scrolling. When done, it shows the average frame time for the test.Warning
Each enabled container spawns a lot of children nodes (50k), and take significant amounts of RAM to launch (~2GB).
Peek.2023-12-20.14-10.mp4
Here are the results I obtained from my setup (Ryzen 5 5600G, RX 6500 XT, 8GB RAM):
Note
Both vanilla and optimized builds were compiled with:
$ scons use_llvm=yes linker=mold scu_build=yes use_static_cpp=no optimize=speedImportant
Values are updated when requested changes modify speedups significantly. Previous data and comparisons remain available below.
HFlowContainerHFlowContainerInvisVFlowContainerVFlowContainerInvisHBoxContainerHBoxContainerInvisVBoxContainerVBoxContainerInvisGridContainerGridContainerInvisOld values
Original results
HFlowContainerHFlowContainerInvisVFlowContainerVFlowContainerInvisHBoxContainerHBoxContainerInvisVBoxContainerVBoxContainerInvisGridContainerGridContainerInvisRemarks
Control::_window_find_focus_neighbor). Something similar is needed to optimize that scenario as well.Containers, which makes it even faster (for the slowest time atVBoxContainerInvis, time becomes 17.812 and speedup x3.692). This, however, assumes thatContainer's children only have content on their "bounding boxes", which is not a guarantee as nodes could be moved beyond this, and thus becoming uninteractable. While the speedup with the fallback is already considerable nevertheless, it would be interesting to find a way to support these scenarios in order to improve even further.Peek.2023-12-20.15-04.mp4