[blueprint] support multiple output file formats #1965
Merged
Part of milestone 2 of #1842
Add the ability to output in either "pretty" format (equivalent to the old behavior) or "minimal" format (the new default). Minimal blueprints don't include any annotations or "shadows" of building footprints, but are much faster to read and write. This allows us to optimize for the use case where a player is generating blueprints to play back later without manual editing. More advanced users that want to edit the generated blueprints should continue to use "pretty" format.
I am confident that there was no behavior change for "pretty" blueprints since the quickfort ecosystem integration/regression tests continue to pass without modification. I did make a few modifications to the ecosystem test harness to make it more efficient, but I did not need to change any of the golden regression test files.

This PR involved rewriting the core blueprint data structures and algorithm to allow the plugin to support multiple output formats. The new data structures will also support features planned for future milestones.

My first attempt used sparse `std::map`s and `std::string`s for all data, but that ended up being much slower than the old implementation and, more importantly, ran out of memory and crashed on larger maps.

My second attempt used sparse `std::map`s and `const char *`, which significantly cut down on runtime, but was still too memory hungry. This was actually not as complex as I feared since pointers to string literals in the code can be passed up the stack without having to dynamically copy them into the heap. I did implement a static cache for the few strings that were constructed on the stack so that the pointers would remain valid when returned from functions.

My third attempt used sparse `std::map`s for the higher-level `z` and `y` coordinate structures but a pre-allocated `std::vector` for the `x` coordinate structure (the one that actually holds the `const char *` pointers). This was the key that brought both the runtime and memory utilization down below the original implementation.

Testing on a 16x16 embark for maximum scale (that's 768 by 768 by 198 = 116 million tiles), I get the following numbers:
- old implementation: 18s, 1.1G memory
- new implementation, pretty format: 17s, 0.6G memory
- new implementation, minimal format: 8s, 0.6G memory
The difference in runtime between the new pretty and minimal formats is mostly due to I/O.
I did a further experiment using only `std::vector`s and no maps at all. This brought the runtime down by one second for the minimal format, but memory utilization stayed the same. However, it significantly complicated the code and required a lot of manual indexing and careful memory management. I decided that a one-second savings on a 16x16 embark is just not worth the complexity cost. For the common, non-pathological case, the blueprint area will be much smaller and runtime will be near-instantaneous. I don't need to optimize for more speed. I'm satisfied that the new data structures have half the memory footprint of the old data structures, and I am confident that anything that worked before will continue to work.

The test is a little biased because most of the map was solid wall, which incurs a memory cost for the old implementation but is zero cost for the new implementation (we now only allocate memory if we have something to store). If the entire map were hollowed out, the new implementation would likely be on par with the memory consumption of the old algorithm.
I will be continuing to make significant structural changes to the blueprint plugin in the upcoming weeks and months, so I don't want to spend your time on a review quite yet. Perhaps we can review once we get close to releasing a new version of DFHack, or once meta blueprints are implemented in milestone 5, whichever comes first.