Add bulk_get_node function #14225
Please may you include benchmarks, as that is the justification for this feature |
@rubenwardy |
code looks good and CI is green
The implementation looks reasonable. Just a thought: I wonder whether we could achieve more than 1.3x (especially if we take building the input tables into account) if we change the interface a bit:
- Have you considered passing the input positions as three separate lists of x, y and z tables?
- Alternatively, packed even tighter, we could pass input positions as a list of hashed node positions.
- Output could be three lists of nodenames / param1s / param2s, or maybe even content IDs / param1s / param2s like voxelmanip. This could also be bitpacked in one table, though that might make extraction of the values awkward or inefficient.
- We could also consider a faster get_node interface which just pushes the 3 values to the stack rather than building a hash table each time.
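The input layouts suggested in the list above can be sketched in plain Lua. This is only an illustration of the data shapes, not code from the PR: `hash_pos` and `unhash_pos` are hypothetical helpers, and the packing is assumed to match what `minetest.hash_node_position` does (16 bits per axis, offset by 32768).

```lua
-- Hypothetical stand-in for a position hash. The packing is assumed to match
-- minetest.hash_node_position; it is reimplemented here so the sketch runs in
-- a plain Lua interpreter without the engine.
local function hash_pos(x, y, z)
	return (z + 32768) * 65536 * 65536 + (y + 32768) * 65536 + (x + 32768)
end

-- Inverse of hash_pos: recovers the three coordinates from one number.
local function unhash_pos(h)
	local x = h % 65536 - 32768
	local y = math.floor(h / 65536) % 65536 - 32768
	local z = math.floor(h / (65536 * 65536)) % 65536 - 32768
	return x, y, z
end

-- Layout A: one table per position (the shape the PR takes as input).
local pos_list = {
	{x = 1, y = 2, z = 3},
	{x = -4, y = 0, z = 7},
}

-- Layout B: three parallel lists of numbers.
local xs, ys, zs = {}, {}, {}
-- Layout C: a single list of hashed positions.
local hashes = {}

for i, pos in ipairs(pos_list) do
	xs[i], ys[i], zs[i] = pos.x, pos.y, pos.z
	hashes[i] = hash_pos(pos.x, pos.y, pos.z)
end
```

Layouts B and C avoid allocating one table per position on the Lua side, which is the cost the review comment is pointing at; whether that translates into a measurable gain would still need benchmarking.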
@appgurueu |
(sorry for the fuss. deleted my erroneous comment) Some rough benchmarks with LuaJIT on x86_64. Left side is in ms, right side in us.
That's roughly a 19% speed-up over regular |
Okay, I finally got to benchmarking this. I'm afraid the benchmark seems to be flawed. (Furthermore, the current code does not store
The changes I made for benchmarking:
diff --git a/games/devtest/mods/benchmarks/init.lua b/games/devtest/mods/benchmarks/init.lua
index 75d959c3d..cf4e4176e 100644
--- a/games/devtest/mods/benchmarks/init.lua
+++ b/games/devtest/mods/benchmarks/init.lua
@@ -133,15 +133,16 @@ minetest.register_chatcommand("bench_bulk_get_node", {
minetest.chat_send_player(name, "Benchmarking minetest.bulk_get_node...")
local start_time = minetest.get_us_time()
+ minetest.bulk_get_node(pos_list, {name = "mapgen_stone"})
+ local middle_time = minetest.get_us_time()
+ local get_node = minetest.get_node
for i=1,#pos_list do
- minetest.get_node(pos_list[i])
+ get_node(pos_list[i])
end
- local middle_time = minetest.get_us_time()
- minetest.bulk_get_node(pos_list, {name = "mapgen_stone"})
local end_time = minetest.get_us_time()
local msg = string.format("Benchmark results: minetest.get_node loop: %.2f ms; minetest.bulk_get_node: %.2f ms",
- ((middle_time - start_time)) / 1000,
- ((end_time - middle_time)) / 1000
+ ((end_time - middle_time)) / 1000,
+ ((middle_time - start_time)) / 1000
)
return true, msg
end,
I'm not sure that "bulking" by means of a list of node positions as parameters / nodes as return value is the right way to optimize this. This reduces the Lua - C context switch overhead, but that overhead isn't all that large, and we pay for it by having to store the nodes we return in a table. (In the future, a "bulk" API could do further "batching" optimizations, but currently this isn't being done. Hence there is a case to be made for introducing a "bulk" API both for modder convenience, and for prospective future performance improvements which that would enable.)
If we want to optimize getting nodes, there are other approaches worth considering (see #14225 (review)). TL;DR:
Benchmark updated.
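The measurement approach from the diff above (time each variant on its own and cache get_node in a local) can be sketched outside the engine. Everything here is a stand-in: `fake_get_node` and `fake_bulk_get_node` are hypothetical replacements for `minetest.get_node` and `minetest.bulk_get_node`, and `os.clock` replaces `minetest.get_us_time`, so the numbers it prints say nothing about the real functions.

```lua
-- Hypothetical stand-ins so the sketch runs without the engine.
local function fake_get_node(pos)
	return {name = "mapgen_stone", param1 = 0, param2 = 0}
end

local function fake_bulk_get_node(pos_list)
	local nodes = {}
	for i = 1, #pos_list do
		nodes[i] = fake_get_node(pos_list[i])
	end
	return nodes
end

-- Runs fn(...) once and returns the elapsed CPU time in milliseconds.
local function time_ms(fn, ...)
	local t0 = os.clock()
	fn(...)
	return (os.clock() - t0) * 1000
end

-- Per-call variant, with the function cached in a local as in the diff.
local function loop_variant(list)
	local get_node = fake_get_node
	for i = 1, #list do
		get_node(list[i])
	end
end

local pos_list = {}
for i = 1, 10000 do
	pos_list[i] = {x = i, y = 0, z = 0}
end

-- Each variant gets its own start and end timestamp.
local bulk_ms = time_ms(fake_bulk_get_node, pos_list)
local loop_ms = time_ms(loop_variant, pos_list)

print(string.format("get_node loop: %.2f ms; bulk_get_node: %.2f ms",
	loop_ms, bulk_ms))
```

A single run like this is noisy; repeating each measurement several times and swapping the order of the two variants would make any difference easier to trust.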
Storing get_node in a local variable does not seem to have a significant effect on my machine:
FYI, most of the overhead of
So are we doing this or not? |
I think we can add this as a convenience feature (which might lend itself better to future optimization work than |
If we add this, I think we should go a step further: We should explicitly state that there is no performance benefit to avoid modders doing premature optimisation. |
Removed performance hint from doc. |
I don't think this offers a meaningful usability improvement, so without the performance improvement I'd say this is not needed |
@sfence do you still see a reason for this PR or a way to make it faster? See today's IRC log https://irc.minetest.net/minetest-dev/2024-03-03#i_6156853 |
From the API point of view, it could be nice to have a complement function to the But overall, the performance benefits of #14384 are much better. |
Considering this as rejected, then. |
To do
This PR is Ready for Review.
How to test
Run the devgame /unittests command.