Optimize Viewport's _gui_find_control_at_pos for Container nodes #86361

Open
rsubtil wants to merge 1 commit into godotengine:master from retrohub-org:feature-improve_viewport_ui_mouse_picking
Member

@rsubtil rsubtil commented Dec 20, 2023

(I couldn't find a related issue, but if desired, I can open one describing the problem in our project)

While developing an application for managing users' gaming libraries, we had a few users with thousands of games report extremely low performance during any mouse interaction, such as picking and scrolling through the library. This is easy to reproduce by adding 10~20k Control nodes to a container such as FlowContainer. When moving the mouse and/or scrolling, the drop in FPS is very noticeable (in this simple scenario the FPS is still acceptable, but in our application performance drops to ~20 FPS with very noticeable stutters):

Peek.2023-12-20.13-52.mp4

This slowdown was not visible in any of Godot's profilers, suggesting the issue lay somewhere in engine code. After profiling with external tools, the root cause turned out to be how the viewport tests for valid controls under a given mouse position (Viewport::_gui_find_control_at_pos):

godot/scene/main/viewport.cpp

Lines 1695 to 1707 in 3ce73e5

if (!c || !c->is_clipping_contents() || c->has_point(matrix.affine_inverse().xform(p_global))) {
	for (int i = p_node->get_child_count() - 1; i >= 0; i--) {
		CanvasItem *ci = Object::cast_to<CanvasItem>(p_node->get_child(i));
		if (!ci || ci->is_set_as_top_level()) {
			continue;
		}
		Control *ret = _gui_find_control_at_pos(ci, p_global, matrix);
		if (ret) {
			return ret;
		}
	}
}

This function iterates over all of a node's children, so each lookup is O(N) in the number of children. That is extremely wasteful in our scenario, where we use Containers that already know the positions of their children, and since the lookup runs on every frame of input, it quickly becomes a noticeable bottleneck.

Because a Container automatically repositions its children, the lookup can take the mouse position into account and considerably reduce the number of candidate nodes it has to test.

Optimization

This introduces the Container::get_children_at_pos function for all Container nodes, which should return the list of candidate nodes likely to be at the given position. The default implementation returns all children, which keeps the current behavior for any container that does not override this function. When the viewport traverses children, it calls this new function to narrow the search, and falls back to the existing behavior if no valid children are found. Thus, in theory, this change should not break or change any existing behavior.
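The candidate-then-fallback flow described above can be sketched in isolation. This is only an illustration under stated assumptions, not the PR's code: `Item`, `Rect`, `FixedRowList` and `find_at` are hypothetical stand-ins, all rectangles are in global coordinates (the engine applies transforms instead), and the exact signature of `Container::get_children_at_pos` is not shown in this description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for illustration only; these are not Godot's types.
struct Point { float x, y; };

struct Rect {
    float x, y, w, h;
    bool has_point(Point p) const {
        return p.x >= x && p.x < x + w && p.y >= y && p.y < y + h;
    }
};

struct Item {
    Rect rect{};
    std::vector<Item *> children;
    virtual ~Item() = default;

    // Default behavior: every child is a candidate.
    virtual std::vector<Item *> get_children_at_pos(Point) const {
        return children;
    }
};

// Hypothetical container with fixed-height rows, one child per row,
// so the candidate can be computed directly from the position.
struct FixedRowList : Item {
    float row_height = 1.0f;

    std::vector<Item *> get_children_at_pos(Point p) const override {
        assert(row_height > 0.0f);
        const float rel = p.y - rect.y;
        if (rel < 0.0f) return {};
        const float row = rel / row_height;
        if (row >= static_cast<float>(children.size())) return {};
        return {children[static_cast<std::size_t>(row)]};
    }
};

// Tries the narrowed candidates first (topmost first), then falls back
// to the full scan if the candidates did not produce a hit.
Item *find_at(Item *node, Point p) {
    const std::vector<Item *> candidates = node->get_children_at_pos(p);
    for (auto it = candidates.rbegin(); it != candidates.rend(); ++it) {
        if (Item *hit = find_at(*it, p)) return hit;
    }
    // Skip the fallback when the candidates already covered every child.
    if (candidates.size() != node->children.size()) {
        for (auto it = node->children.rbegin(); it != node->children.rend(); ++it) {
            if (Item *hit = find_at(*it, p)) return hit;
        }
    }
    return node->rect.has_point(p) ? node : nullptr;
}
```

In this sketch, a child that has been moved outside its row is missed by the fast path but is still found by the fallback scan, which is the property that keeps existing behavior intact.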

This PR overrides this function for FlowContainer (HFlowContainer/VFlowContainer), BoxContainer (HBoxContainer/VBoxContainer) and GridContainer. These seemed the best candidates for this optimization, but if there are more containers that could benefit from it, I can add them as well.

Note

In the following diagrams, grey numbers/dotted lines represent invisible nodes.

FlowContainer

Since this container can fit an arbitrary number of nodes for each row, information about each row's position and first child index is stored. When searching for children, we find the row that contains the given position through a binary search, and return all the children in that row.
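The row lookup can be sketched as follows. This is a minimal illustration assuming a horizontal flow (rows stacked along y); `RowInfo` and `children_in_row_at` are hypothetical names, and the PR's actual data layout may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical per-row record: where the row starts and which child opens it.
struct RowInfo {
    float top;        // y position where the row starts
    int first_child;  // index of the first child placed in this row
};

// Returns the half-open child index range [begin, end) of the row that
// contains y. Rows must be sorted by `top`. A position above the first row
// yields an empty range; a position below the last row maps to the last row,
// so the caller is still expected to hit-test each returned child.
std::pair<int, int> children_in_row_at(const std::vector<RowInfo> &rows,
                                       int child_count, float y) {
    assert(rows.empty() || rows.back().first_child <= child_count);
    // First row that starts strictly after y; the row before it contains y.
    auto next = std::upper_bound(
        rows.begin(), rows.end(), y,
        [](float value, const RowInfo &row) { return value < row.top; });
    if (next == rows.begin()) {
        return {0, 0};
    }
    const int end = (next == rows.end()) ? child_count : next->first_child;
    return {(next - 1)->first_child, end};
}
```

With rows starting at y = 0, 20 and 40 whose first children are 0, 3 and 7, a lookup at y = 25 narrows nine children down to indices 3 through 6.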

image

GridContainer

Same behavior as FlowContainer.

image

BoxContainer

Being a one-dimensional container, only the position and index of each visible child are stored. When searching for children, we look up the given position with a binary search. This container thus returns only one child.
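A one-dimensional lookup of this kind might look like the sketch below. `Slot` and `child_index_at` are hypothetical names, and storing a size next to each position is an assumption made here so that gaps between children (for example, separation) can be rejected.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record for one visible child along the container's axis.
struct Slot {
    float start;      // position where the child begins
    float size;       // extent of the child along the axis
    int child_index;  // index among all children, including invisible ones
};

// Returns the index of the visible child covering `pos`, or -1 if `pos`
// falls before the first slot, after the last one, or in a gap.
// Slots must be sorted by `start`.
int child_index_at(const std::vector<Slot> &slots, float pos) {
    // First slot that starts strictly after pos; the one before may cover it.
    auto next = std::upper_bound(
        slots.begin(), slots.end(), pos,
        [](float value, const Slot &slot) { return value < slot.start; });
    if (next == slots.begin()) {
        return -1;
    }
    const Slot &slot = *(next - 1);
    assert(slot.size >= 0.0f);
    return pos < slot.start + slot.size ? slot.child_index : -1;
}
```

Because invisible children are left out of the list, a slot's position in the vector and its `child_index` can differ, which is why the index is stored explicitly.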

image

Benchmark

ReproUIPerformance.zip

This test project lets you test the performance of the viewport's node lookup. It should be opened in the editor, and before running, make one of the existing containers visible so that it spawns its children. The Invis variants spawn some children as invisible. Each child is a button showing its index and global (x, y) position.

image

To test this yourself, enable only one of the container nodes, run the project, click on the Benchmark button, and don't move the mouse for the next 500 frames. The test will warp the mouse pointer to the center of the screen, and simulate both mouse motion and wheel scrolling. When done, it shows the average frame time for the test.

Warning

Each enabled container spawns a lot of child nodes (50k) and takes a significant amount of RAM to launch (~2GB).

Peek.2023-12-20.14-10.mp4

Here are the results I obtained from my setup (Ryzen 5 5600G, RX 6500 XT, 8GB RAM):

Note

Both vanilla and optimized builds were compiled with:
$ scons use_llvm=yes linker=mold scu_build=yes use_static_cpp=no optimize=speed

Important

Values are updated when requested changes modify speedups significantly. Previous data and comparisons remain available below.

| Node Type | Frame time - Vanilla (ms) | Frame time - Optimized (ms) | Speedup (relative) |
| --- | --- | --- | --- |
| HFlowContainer | 44.296 | 16.540 | x2.678 |
| HFlowContainerInvis | 65.306 | 26.398 | x2.474 |
| VFlowContainer | 44.084 | 16.160 | x2.728 |
| VFlowContainerInvis | 63.886 | 26.120 | x2.446 |
| HBoxContainer | 44.974 | 10.532 | x4.270 |
| HBoxContainerInvis | 68.046 | 19.448 | x3.499 |
| VBoxContainer | 44.542 | 14.746 | x3.021 |
| VBoxContainerInvis | 66.014 | 24.556 | x2.688 |
| GridContainer | 43.754 | 16.608 | x2.635 |
| GridContainerInvis | 65.038 | 26.692 | x2.437 |
Old values

Original results

| Node Type | Frame time - Vanilla (ms) | Frame time - Optimized (ms) | Speedup (relative) | Improvement (%) |
| --- | --- | --- | --- | --- |
| HFlowContainer | 40.690 | 16.256 | x2.503 | + 6.99% |
| HFlowContainerInvis | 61.696 | 27.214 | x2.267 | + 9.13% |
| VFlowContainer | 40.470 | 16.142 | x2.507 | + 8.82% |
| VFlowContainerInvis | 61.688 | 26.044 | x2.369 | + 3.25% |
| HBoxContainer | 42.042 | 11.038 | x3.809 | + 12.10% |
| HBoxContainerInvis | 65.264 | 18.890 | x3.455 | + 1.27% |
| VBoxContainer | 42.838 | 14.172 | x3.023 | - 0.07% |
| VBoxContainerInvis | 65.764 | 26.184 | x2.512 | + 7.01% |
| GridContainer | 41.256 | 16.180 | x2.550 | + 3.33% |
| GridContainerInvis | 62.544 | 26.808 | x2.333 | + 4.46% |

Average improvement: +5.629%
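
The column definitions are not spelled out above. The figures are consistent with the following reading, where the "updated" speedup is the one from the current results table and the "original" speedup is the one from the old table; the subscript names are labels chosen here for illustration.

```latex
\[
\text{speedup} = \frac{t_{\text{vanilla}}}{t_{\text{optimized}}},
\qquad
\text{HFlowContainer (current): } \frac{44.296}{16.540} \approx 2.678
\]
\[
\text{improvement} = \frac{\text{speedup}_{\text{updated}}}{\text{speedup}_{\text{original}}} - 1,
\qquad
\text{HFlowContainer: } \frac{2.678}{2.503} - 1 \approx +6.99\%
\]
```

Under that reading, the average improvement is the arithmetic mean of the ten per-container improvement values.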

Remarks

  • This optimization does not improve the scenario of switching node focus with keyboard/controllers for nodes without specific focus neighbors, which go through a different function (Control::_window_find_focus_neighbor). Something similar is needed to optimize that scenario as well.
  • There is potential to optimize further by not falling back to the slow O(N) search when no valid children are found on Containers, which makes it even faster (for the slowest case, VBoxContainerInvis, the frame time becomes 17.812 ms and the speedup x3.692). This, however, assumes that a Container's children only have content within their "bounding boxes", which is not guaranteed: nodes could be moved beyond them and would then become uninteractable. The speedup with the fallback is already considerable, but it would be interesting to find a way to support these scenarios in order to improve even further.
Peek.2023-12-20.15-04.mp4

@rsubtil rsubtil force-pushed the feature-improve_viewport_ui_mouse_picking branch 2 times, most recently from aee087a to b00b3d0 Compare December 20, 2023 15:26
@rsubtil rsubtil force-pushed the feature-improve_viewport_ui_mouse_picking branch from b00b3d0 to f25bf72 Compare December 20, 2023 19:09
@rsubtil rsubtil force-pushed the feature-improve_viewport_ui_mouse_picking branch 3 times, most recently from 9c66e64 to a0d26d3 Compare December 21, 2023 17:19
@rsubtil rsubtil force-pushed the feature-improve_viewport_ui_mouse_picking branch from a0d26d3 to 09a7911 Compare December 21, 2023 18:42
Contributor

@MewPurPur MewPurPur left a comment

Looks good from a code review standpoint.

rsubtil added a commit to retrohub-org/godot that referenced this pull request Feb 22, 2024
Optimize mouse UI picking on BoxContainer, FlowContainer and GridContainer
@virophagesp
any updates?

@rsubtil
Member Author

rsubtil commented Jun 18, 2024

@virophagesp no updates here, but this should still work properly. I deployed this in a custom Godot version for the app I'm developing, and so far I haven't seen or received any bug reports related to it.

I do want to explore again the "unsafe" optimization I mentioned though, because the change in behavior I talked about seems to occur with this optimization as well, and if it is a reasonable assumption to make, the speedup should increase quite a bit more.

5 participants