
Forceloading is excessive #277

Closed · Hawk777 opened this issue Jul 25, 2016 · 16 comments

Comments

Hawk777 (Contributor) commented Jul 25, 2016

This is a bit of a mix of feature request and discussion topic. I’m not sure whether people will agree.

At present, Mesecons forceloads every mapblock connected along a wire to a signal source whenever the signal changes state. In my opinion, forceloading is a rather large hammer being used against a rather small target. All we care about is tracing through the mapblocks to see where the wire connects (and changing some nodes), yet to accomplish this, we burn the global forceloading budget and activate a lot of mapblocks.

What if, instead, in get_node_force, we called get_node_or_nil first, then, on a nil return, used a voxel manipulator to bring the target mapblock into the cache and simply called get_node_or_nil a second time? This would lead to all those mapblocks being loaded but inactive, which should be much cheaper. As long as signal edges occur more frequently than once every server_unload_unused_data_timeout, everything would live in cache permanently and so the initial get_node_or_nil would succeed and no voxel manipulator would ever be created, but we would not pay the cost of the mapblock being active. For slower signals, the blocks would eventually be unloaded, and subsequent signal edges would reload them via the VM.
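
A minimal sketch of what I have in mind (function name hypothetical, untested):

```lua
-- Try the cheap cached path first; only touch a VoxelManip on a miss.
local function get_node_vm(pos)
	local node = minetest.get_node_or_nil(pos)
	if node then
		return node
	end
	-- Cache miss: reading any area through a VoxelManip brings the
	-- containing mapblock into the server's map cache without
	-- activating it.
	local vm = minetest.get_voxel_manip()
	vm:read_from_map(pos, pos)
	-- The mapblock should now be cached, so the retry should succeed.
	return minetest.get_node_or_nil(pos)
end
```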

Thoughts? If people are agreeable, I would be happy to look at implementing this myself and turning this into a PR.

Jeija (Collaborator) commented Jul 25, 2016

This sounds like an interesting idea! I'm not so sure about the performance of VoxelManip, though, or whether it is suitable for this purpose.

We are using forceloading for several reasons. First of all, it can be more performant to keep mapblocks in memory instead of loading them every time a signal passes by. Most importantly, though, not being able to trace a wire back to all of its sources can cause serious bugs: if we have two onstate receptors connected to a long wire at opposite ends, turning off only one of them shouldn't turn off the wire. If we can't trace back the whole wire quickly, we cannot act accordingly.

It sounds like it should be possible to use VoxelManip for this purpose. Just make sure that this doesn't hurt performance. Especially in singleplayer, it is more important for Mesecons to function correctly and quickly than to save server resources. So, all in all, I would be willing to try out an implementation based on VoxelManip 👍

Hawk777 (Contributor, Author) commented Jul 26, 2016

Do you have any guidelines for how to figure out what the performance impact is? I have a fairly fast computer, and I don’t personally have any large mesecons networks set up in my own worlds. Is there a sample world with a big machine in it somewhere I could download?

sofar (Member) commented Jul 26, 2016

You'll get nodes and param1/param2 from vmanips, but no node metadata. If you can do everything with those bits, it's pure Lua and it'll be very fast, much faster than any get_node() call can ever be.
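
For illustration, a sketch of pulling exactly those bits out of a vmanip in pure Lua (position variable assumed; untested):

```lua
local vm = minetest.get_voxel_manip()
local emin, emax = vm:read_from_map(pos, pos)
local area = VoxelArea:new{MinEdge = emin, MaxEdge = emax}
local data = vm:get_data()        -- flat array of content IDs
local p1   = vm:get_light_data()  -- param1 (light)
local p2   = vm:get_param2_data() -- param2

local i = area:indexp(pos)
local node = {
	name   = minetest.get_name_from_content_id(data[i]),
	param1 = p1[i],
	param2 = p2[i],
}
```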

Jeija (Collaborator) commented Jul 27, 2016

@Hawk777 For testing performance, a very long wire (>300 nodes) with a switch on one end and a large light sign or something at the other end should be enough. Then make sure it doesn't only work when the whole area is loaded, but also works right after restarting the game. Also make sure it correctly handles two switches connected at a large distance: turning off one switch while keeping the other on should keep the light sign lit.

@sofar That sounds really good. Mesecons doesn't use node metadata, so this sounds like it could improve performance in general! I just wasn't aware that VManips could be used in this way.

Hawk777 (Contributor, Author) commented Jul 27, 2016

Actually, my original thought was rather simpler: LuaVoxelManip::l_read_from_map → MMVManip::initialEmerge → ServerMap::emergeBlock → ServerMap::loadBlock(v3s16) → ServerMap::createSector(v2s16) and ServerMap::loadBlock(string, v3s16, MapSector*, bool). The net result of all of this is that simply asking a VM to read an area, even without actually looking at the data from the VM itself, causes that area to be warmed into the cache, after which a normal get_node should succeed where it previously failed, no?

Certainly using a VM for everything may be faster, only this would be rather easier to implement I think.

Jeija (Collaborator) commented Jul 27, 2016

I'm not sure whether this will load the area, but it sounds like it. However, that is somewhat implementation-dependent; we can't be sure that a future Minetest version won't do things differently. So if possible, I'd prefer directly reading from the VoxelManip instead of calling get_node(), both for compatibility with future versions and for performance.

I think it is best to wrap all that logic up in a simple mesecon.get_node_always function in util.lua (or something similar) so that internal.lua is abstracted away from it, like it currently is with mesecon.get_node_force. If you still feel like implementing this but need help, feel free to ask!

Hawk777 (Contributor, Author) commented Jul 27, 2016

It looks like the current implementation of MMVManip still uses the main map cache, so cache hits should be very fast via a VM, just as they are through get_node. Do you think this means we can just use the VM unconditionally, rather than starting with get_node and falling back to a VM if it fails? @sofar says VMs should be faster than get_node, so I guess this would be the way to go? I can certainly take a look at implementing it; no promises on success though :)

Another question came to mind: this could result in signals going much further than before, because VMs are not limited the way forceloading is. Should we have a cap on how far a signal can go before we just stop and say “OK, that’s enough, do the rest on the next tick via the actionqueue”? Otherwise a sufficiently perverse person could build a giant signal wire and force the server to try to load gigabytes of data from disk in a single tick.

Jeija (Collaborator) commented Jul 27, 2016

Yes, that is what I was thinking: don't use get_node at all anymore, only VManips. In most cases there are several mesecon structures inside a single MapBlock, so it is one read_from_map versus multiple get_node calls (that is, if this actually improves performance as suggested). Also, try to keep VManips for blocks in memory somehow until signal propagation has finished, so that we don't need to call read_from_map all the time. If you don't come up with a better implementation, you could even have a global VManip list in a variable in util.lua somewhere and just clear that storage before every signal propagation (in receptor_on/off).
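
Something like this, perhaps (all names hypothetical and untested):

```lua
-- One VoxelManip per mapblock, discarded when a new signal
-- propagation starts.
local vm_cache = {}

-- Called from receptor_on/receptor_off before propagation begins.
function mesecon.vm_begin()
	vm_cache = {}
end

-- Mapblocks are 16x16x16 nodes; derive a stable per-block cache key.
local function block_key(pos)
	return minetest.hash_node_position({
		x = math.floor(pos.x / 16),
		y = math.floor(pos.y / 16),
		z = math.floor(pos.z / 16),
	})
end

function mesecon.vm_get_node(pos)
	local key = block_key(pos)
	local vm = vm_cache[key]
	if not vm then
		vm = minetest.get_voxel_manip()
		vm:read_from_map(pos, pos) -- rounds out to mapblock edges
		vm_cache[key] = vm
	end
	return vm:get_node_at(pos)
end
```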

I wouldn't limit wire lengths just now. If someone wanted to make the server lag, there are probably better ways to achieve that right now (like just having a large farm of luacontrollers with interrupts at maximum frequency and powering large mesecon wire structures or something).

Hawk777 (Contributor, Author) commented Jul 27, 2016

Well, read_from_map actually hits the main map cache, so calling it multiple times on the same mapblock should be fast the second and subsequent times within 29 seconds (the default server_unload_unused_data_timeout).

Also, from inspecting the source code, it’s not clear to me that keeping a VM around from one tick to the next is a good idea. As far as I can tell, after calling read_from_map, the VM will contain a copy of the map data in a fairly raw format. I don’t think there’s anything in there that ensures that copy is kept up to date as other changes to the world happen. So that would mean no caching VMs (or at least their contents have to be discarded on every tick). Unless I’m missing something? If it were just for the purpose of reading the network layout during the propagation of a single signal edge, that might not be a huge problem, but we would also be modifying the world to change the signal level on the wires, and I can see that going very badly when the VM finally writes back.

Jeija (Collaborator) commented Jul 27, 2016

Yes, the VM contents will need to be discarded every tick (or every time a new signal propagation starts); I think we are on the same page there. Not caching at all would mean calling read_from_map every time we used get_node before, which can't possibly be faster, right? I think the tricky part is just to abstract all that "caching" (even if just for one tick) away from the individual functions in internal.lua that all need to access the map in some way, unless you want the code in internal.lua to become really ugly.

Since effectors (like pistons etc. that could actually change the map) are currently only activated one globalstep after the signal has reached them (using the actionqueue), what would need to be done is basically an iterative (!) breadth-first search for conductors that link (by their connection rules) from a given starting point, while at the same time loading the VManips for all the affected blocks.

The on signal is simple: After having found the whole network of conductors that connects to the starting point, you have a list of conductors and a list of effectors that will need to be activated. You can then replace all the conductors and effectors in the VM and write your changes to the map.

The off signal is more difficult: while performing the breadth-first search, check whether any onstate receptors are connected to the conductors, and abort the search as soon as one is found (the wire stays on!). If no such receptors were found, go on to replace all the conductors and deactivate the effectors in the VM, and write the changes to the map.
This is not how turning off a wire currently works: we search the whole wire before sending off the turnoff signal, which actually performs the action. That is an inefficient way to turn off the wire and is known to perform poorly.
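
A skeletal sketch of that search, heavily simplified (mesecon.vm_get_node is the hypothetical cached accessor from above; real linking also has to check that rules match in both directions, and the receptor/effector handling is omitted):

```lua
-- Iterative breadth-first search over linked conductors from a
-- starting position, returning the positions found.
local function find_network(startpos, startrules)
	local frontier = {{pos = startpos, rules = startrules}}
	local visited = {[minetest.hash_node_position(startpos)] = true}
	local conductors = {}
	while #frontier > 0 do
		local cur = table.remove(frontier, 1) -- FIFO: breadth-first
		for _, rule in ipairs(cur.rules) do
			local npos = vector.add(cur.pos, rule)
			local hash = minetest.hash_node_position(npos)
			if not visited[hash] then
				visited[hash] = true
				local node = mesecon.vm_get_node(npos)
				if node and mesecon.is_conductor(node.name) then
					table.insert(conductors, npos)
					table.insert(frontier, {
						pos = npos,
						rules = mesecon.conductor_get_rules(node),
					})
				end
			end
		end
	end
	return conductors
end
```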

sofar (Member) commented Jul 27, 2016

The one problem I can see is that you have limited information about how much data to load into the vmanip: you don't know, from the start of the cable, where the end is or how the cable runs.

If you solve that problem, you could actually have two endpoints properly update hundreds of nodes apart, and only update the cabling in between as blocks become active.

It would of course be possible to store endpoint data in metadata, and cables themselves would have to store all the endpoints as well so updates can be efficient.

Then again, that approach uses nodemeta and really doesn't need vmanips at all, and perhaps I'm overlooking something entirely as well....

Hawk777 (Contributor, Author) commented Jul 29, 2016

@sofar, I think this is touching on a bigger, more general question of how Mesecons handles circuits. On the one end is pretty much what we have today, where the only information about circuit topology is stored in the world map itself, and this ticket is just a question of improving performance while walking through that data.

On the other end, taking your idea and fully generalizing it, we would store some sort of abstract schematic representation of the circuit in an auxiliary data structure. A way to do this would be to allocate what the electronic engineering world would call equipotential nodes, i.e. all the wire reachable by a search starting from a particular point. Roughly equivalent to “one wire”, except that forks, loops, etc. are also included. As wires are placed and dug, this auxiliary data structure is kept up to date. Turning receptors on and off just updates the state of a single equipotential node, which instantly tells you all the actuators connected to it and lets you notify them. Wires on the ground have their equipotential node ID stored as metadata, and they always visually reflect the state of their associated equipotential node. In Minetest terms, you would just swap all the wires currently loaded to their new state, and then each wire would, on load, check whether it should be on or off and swap itself if needed, but you wouldn’t need to touch unloaded map blocks at all.

The first option is a fairly simple optimization, while the second would be a significant rewrite. You would have to deal with a lot of complicated issues, like finding a data structure that allows efficient searching but also reasonable cutting and joining performance (for digging and placing wires), while also taking care of ensuring that the data structure always stays up to date with respect to the world map, even in the face of possible errors and unclean shutdowns (think power failures of the Minetest server machine).

That’s why I thought of either having each VM load just one mapblock, or perhaps having a small radius as a tuning knob which allows creating fewer VMs by having each one load, say, n×m×n mapblocks. Or perhaps n×1×1 in the direction the wire is going when it reaches the boundary, under the assumption that most wires that go a long way do so in a straight line.

Jeija (Collaborator) commented Jul 29, 2016

> You would have to deal with a lot of complicated issues, like finding a data structure that allows efficient searching but also reasonable cutting and joining performance (for digging and placing wires), while also taking care of ensuring that the data structure always stays up to date with respect to the world map, even in the face of possible errors and unclean shutdowns (think power failures of the Minetest server machine).

This is the point where this alternative fails. I have also experimented with rewriting the mesecons core in a similar fashion (last summer). And while all the problems with data structures and with joining and splitting equipotential areas are solvable (and also interesting for programmers), all of this goes wrong as soon as you take into account crashing servers or even just mods that do unexpected things (like worldedit). On top of that, debugging a system like this, with two places storing the same information (the visible map and the equipotential area store), is just a complete nightmare of unreproducible bugs. So, please don't do this to yourself!

I think it makes sense to just always load the current MapBlock into a VManip, since the server also thinks in terms of MapBlocks. So it wouldn't improve performance much to load less than the current MapBlock (assuming that passing the data from Minetest to Lua isn't the bottleneck; that remains to be tested), and I also don't think loading more than the current mapblock is required in many cases, since most circuits are pretty small. If this turns out to perform poorly, we could add some fancy prediction (e.g. load nearby blocks if the node is close to the edge of a mapblock, prefer loading horizontal neighbours over vertical ones, assume straight wires, ...), but I hope this won't be necessary.
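
For concreteness, here is what "the current MapBlock" works out to for a node position (mapblocks are 16×16×16 nodes); read_from_map already rounds its arguments out to mapblock edges, so reading pos to pos loads exactly this block anyway:

```lua
-- The 16x16x16 mapblock containing pos, as VoxelManip bounds.
local function mapblock_bounds(pos)
	local min = {
		x = math.floor(pos.x / 16) * 16,
		y = math.floor(pos.y / 16) * 16,
		z = math.floor(pos.z / 16) * 16,
	}
	return min, {x = min.x + 15, y = min.y + 15, z = min.z + 15}
end
```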

Hawk777 (Contributor, Author) commented Jul 29, 2016

> I think it makes sense to just always load the current MapBlock into a VManip, since the server also thinks in terms of MapBlocks. So it wouldn't improve performance much to load less than the current MapBlock (assuming that passing the data from Minetest to Lua isn't the bottleneck; that remains to be tested),

A VM isn’t capable of loading less than a mapblock anyway.

> and I also don't think loading more than the current mapblock is required in many cases, since most circuits are pretty small. If this turns out to perform poorly, we could add some fancy prediction (e.g. load nearby blocks if the node is close to the edge of a mapblock, prefer loading horizontal neighbours over vertical ones, assume straight wires, ...), but I hope this won't be necessary.

I also agree with this idea; I’ll write up a basic implementation which loads one mapblock per VM and we can add some sort of hinting mechanism later if it becomes necessary.

Hawk777 (Contributor, Author) commented Aug 1, 2016

Having looked at this in more detail, I’m not sure it’s possible to do without breaking backwards compatibility with other Mesecons-connectable mods. The reason for this is that we call mesecon.conductor_get_rules to get the linkage rules for a node. That function accepts a node object as a parameter and looks up the rules entry in the conductors table based on the node’s name. If the rules entry happens to be of type function, then the rules are generated by calling that function and passing the whole node object as a parameter. Doesn’t this mean we need the node object to exist, including all its metadata, in the general case? I suppose we could possibly work around this by using a VM until we hit a node whose conductor type has a function in its rules entry and then using get_node_or_nil on that particular location, but that feels difficult to implement sensibly without building a rampant layering violation.
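
For reference, a simplified reconstruction of that dispatch (the real mesecon.conductor_get_rules handles more cases; this just shows the shape of the problem):

```lua
-- The rules entry may be a plain table, or a function that takes the
-- whole node object.
function mesecon.conductor_get_rules(node)
	local def = minetest.registered_nodes[node.name]
	local conductor = def and def.mesecons and def.mesecons.conductor
	if conductor and conductor.rules then
		if type(conductor.rules) == "function" then
			-- Needs the full node object (possibly with metadata),
			-- which raw VoxelManip data alone cannot supply.
			return conductor.rules(node)
		end
		return conductor.rules
	end
	return mesecon.rules.default
end
```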

This wouldn’t rule out my original idea of using the VM to prime the mapblock cache and then using get_node_or_nil to access the node objects. That should still work.

Hawk777 (Contributor, Author) commented Sep 5, 2016

The associated PR is now merged into master so I think this discussion is probably over.
