Refactor geometry storage #499
Conversation
Further changes now access attribute sets by index rather than by reference. There's some memory impact from this, but I think with a little bit more work we can probably reduce it.
It looks like PR systemed#499 adopts 36 bits as the max size of an OSM ID. The NodeStore currently uses a full 64 bits for these IDs. This PR changes it to shard the nodes across 16 collections (4 bits) and then store only the last 32 bits in the collection itself. This reduces memory usage for the NodeStore by 25%, without much impact on runtime. The CompactNodeStore is still much better, as it has no overhead and constant-time lookups -- but I'm often lazy and don't use a renumbered PBF file.
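The sharding idea above can be sketched roughly as follows: a 36-bit OSM ID is split into a 4-bit shard selector (the top bits) and a 32-bit remainder, so each shard's map only needs 32-bit keys instead of full 64-bit IDs. This is an illustrative sketch, not tilemaker's actual code; the names `ShardedNodeStore` and `LatpLon` are assumptions.

```cpp
#include <cstdint>
#include <unordered_map>

// Stand-in for a stored node coordinate.
struct LatpLon { int32_t latp, lon; };

class ShardedNodeStore {
	static const int SHARDS = 16; // 2^4 shards, selected by ID bits 32..35
	std::unordered_map<uint32_t, LatpLon> shards[SHARDS];
public:
	void insert(uint64_t osmId, LatpLon ll) {
		// Top 4 bits pick the shard; only the low 32 bits are stored as the key.
		shards[(osmId >> 32) & 0xF][static_cast<uint32_t>(osmId)] = ll;
	}
	LatpLon at(uint64_t osmId) const {
		return shards[(osmId >> 32) & 0xF].at(static_cast<uint32_t>(osmId));
	}
};
```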
* make AttributeStore::get const

  I think AttributeStore lives forever, and AttributeSets are immutable once added to it, so we can avoid the copy.

* use a string pool for AttributeSet keys

  There are relatively few unique key values for attributes, e.g. `kind`, `name`, `admin_level`. The Shortbread schema has only ~50 or so. I imagine OMT is similar, but haven't checked. We generate lots of AttributePairs -- on the order of tens of millions for GB, and std::string has an overhead of 32 bytes. By using a string pool and storing only an offset into it, we can save a few hundred MB of RAM.

* lock-free reads for keys, vector for pairs

  This is the groundwork for implementing two future improvements:

  - hot/cold pairs: there is a bimodal distribution of attribute frequency. `landuse=wood`, `tunnel=0` are often duplicated. `name=Sneed's Seed & Feed` is not. In the future, we'll try to re-use the "hot" pairs to avoid paying the cost of an AttributePair for them.
  - "short vectors": similar to the short string optimization, we should be able to pack up to 6 pairs (3 hot, 3 cold) in the overhead that a vector would otherwise use.

  As it stands, this commit increases memory usage. But we'll claw a lot of it back, and then some.

* Have a "hot" shard for popular pairs

  If a pair looks like it might be re-usable, put it in a special shard so it can be re-used. The special shard is limited to max 64K items, teeing up future work to have a simple vector for AttributeSets with few pairs.

* treat 0 as a sentinel

* de-dupe all AttributePairs

  The stats I was looking at were counting AttributePairs via AttributeSets, which of course presents a misleading image of how many duplicate AttributePairs there are, because by that point they've already been deduped. De-duping doesn't add that much runtime overhead -- and it could probably be improved by someone who knows more C++ concurrency tricks than I do.
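The string-pool idea above can be sketched like this: each unique key string is stored once, and AttributePairs carry a small integer index rather than a 32-byte `std::string`. `KeyPool` and its method names are illustrative, not tilemaker's actual API.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

class KeyPool {
	std::vector<std::string> keys;          // index -> key string
	std::map<std::string, uint16_t> index;  // key string -> index
public:
	// Return the index for a key, interning it on first sight.
	uint16_t keyToIndex(const std::string& key) {
		auto it = index.find(key);
		if (it != index.end()) return it->second;
		uint16_t id = static_cast<uint16_t>(keys.size());
		keys.push_back(key);
		index[key] = id;
		return id;
	}
	const std::string& indexToKey(uint16_t id) const { return keys[id]; }
};
```

An AttributePair then stores a 2-byte index; the key string itself exists exactly once in the pool, which is where the few hundred MB of savings comes from when there are tens of millions of pairs but only ~50 distinct keys.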
* store pointers in pairMaps, optimize debug spew

  `Tile_Value` is a really memory-expensive object. Since we maintain long-lived references to the canonical AttributePair, we can store pointers to save a bit more memory. Now that value->AttributePair mappings are guaranteed to be 1:1, we can do our debug statistics on ints, and translate to pairs only when writing to stdout.

* use boost::container::flat_map over std::map

  Doesn't appreciably affect runtime, saves a bit of memory.

* don't memoize hash function

  Now that there is a 1:1 mapping between values and AttributePairs, it's trivial to compute the hash on demand.

* output_object: avoid Tile_Value temporaries

  Also const-ify a few things.

* defer creating Tile_Value

  Tile_Value is a big union that takes up 96 bytes, but for our purposes, we're happy with a union of string, float and bool -- which can be expressed in 28 bytes. We need a discriminator variable, but due to alignment, that's free. I also considered `boost::variant<bool, float, string>`, but it seemed to take 40 bytes. I worried that not having a pool of Tile_Values would affect PBF writing time, but it seems unaffected.

* adjust headers, remove unneeded rng

* any integer 0 <= n <= 25 is eligible for hot pool

  This is useful for ranks, which run from 1..25.

* Use a small vector optimization for pair indexes

  `vector<uint32_t>` takes 24 bytes just to store its internal pointers. If you actually want to store a `uint32_t` in it, it'll then allocate some memory on the heap, taking a further 32-64 bytes depending on STL and malloc implementations. 56-88 bytes! For a single `uint32_t`! Outrageous.

  Instead, store references to pair indexes in an array of shorts. If the pairs don't fit in the array, upgrade it to a vector. Since we previously arranged for very popular pairs like `amenity=toilets` to have small indexes, our array of shorts is capable of storing between 4 and 8 pairs before we need to upgrade to a vector. Most AttributeSets will not need to use a vector.
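The "short vector" idea above might look roughly like this sketch: up to four 16-bit pair indexes are stored inline, with 0 as the empty-slot sentinel (matching the "treat 0 as a sentinel" commit); a heap vector is only allocated when an index doesn't fit in 16 bits or a fifth pair arrives. `PairIndexes` is a hypothetical name, and this simplified class deliberately omits copy/move handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class PairIndexes {
	uint16_t shortList[4] = {0, 0, 0, 0};      // 0 = empty slot (sentinel)
	std::vector<uint32_t>* longList = nullptr; // allocated only on overflow
public:
	~PairIndexes() { delete longList; }
	void add(uint32_t idx) {
		// Fast path: small index, still in inline mode, free slot available.
		if (longList == nullptr && idx > 0 && idx <= 0xFFFF) {
			for (int i = 0; i < 4; i++)
				if (shortList[i] == 0) { shortList[i] = static_cast<uint16_t>(idx); return; }
		}
		// Slow path: upgrade to a heap vector, migrating inline entries.
		if (longList == nullptr) {
			longList = new std::vector<uint32_t>;
			for (int i = 0; i < 4; i++)
				if (shortList[i] != 0) longList->push_back(shortList[i]);
		}
		longList->push_back(idx);
	}
	size_t size() const {
		if (longList) return longList->size();
		size_t n = 0;
		for (int i = 0; i < 4; i++) if (shortList[i] != 0) n++;
		return n;
	}
};
```

This is why arranging for popular pairs to have small indexes matters: only indexes that fit in 16 bits qualify for the inline slots.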
* simplify AttributeKeyStore

* use camelCase

* re-write to avoid static lifetime

  AttributeKeyStore/AttributePairStore have the same lifetime as AttributeStore, so just make them owned by it. This results in slightly more convoluted code, but avoids having them floating around as globals.

* reduce lock contention

* Improve TileCoordinates hash function

  x ^ y will only use as many bits as max(x, y), but tiles only use the full 32-bit space at z16, so we're leaving a lot of the hash space on the table.

* d'oh, avoid looking up the key name needlessly

* change AttributeXyz(...) to be last-written wins

  Previously, if you set the same key to different values, it was not guaranteed that the last value written would win.

* remove misleading comment

* include deque

* include map

* return vector, not set

  A set seems a bit like overkill - we already know the items are unique, and the consumer is likely just going to iterate over them.

* avoid GNU-specific initializer

  Also avoid hardcoding 12.

* Revert "Improve TileCoordinates hash function"

  This reverts commit 7570737. Oops, I think this change isn't meaningful, and is a result of me misreading the original code. It might still be an improvement to do something like `hash(x << 16) ^ hash(y)`, since the default TileCoordinate is only 16 bits, but that can be considered independently of this PR.

* remove dead code

* avoid copying AttributePairs

  They're long-lived, so pass pointers.

* OutputObjects - greatly reduce need for locks

  I'm slowly remembering how to write concurrent code...

* AttributeKeyStore: use a TLS cache

  This should reduce futex contention significantly. I'll apply the same change for AttributePairStore's shard 0, then measure.

* AttributePairStore: reduce lock contention

* ensure atomics are initialized

  Per https://stackoverflow.com/questions/36320008/whats-the-default-value-for-a-stdatomic, they aren't initialized by default. Somewhat surprised this didn't result in crashes.
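The TLS-cache commit above could be sketched like this: each thread keeps a `thread_local` copy of key-to-index lookups, so the shared map's mutex is only taken on a cache miss (the first time a given thread sees a new key). `CachedKeyStore` is a hypothetical name; note that in this simplified sketch the thread-local cache is shared across all store instances, which is fine only because a real program would have one store.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

class CachedKeyStore {
	std::mutex mutex;
	std::map<std::string, uint16_t> keys;
public:
	uint16_t keyToIndex(const std::string& key) {
		// Fast path: per-thread cache, no lock, no futex contention.
		thread_local std::map<std::string, uint16_t> cache;
		auto hit = cache.find(key);
		if (hit != cache.end()) return hit->second;

		// Slow path: consult (and possibly grow) the shared map under a lock.
		std::lock_guard<std::mutex> lock(mutex);
		uint16_t id;
		auto it = keys.find(key);
		if (it != keys.end()) {
			id = it->second;
		} else {
			id = static_cast<uint16_t>(keys.size());
			keys.emplace(key, id);
		}
		cache[key] = id;
		return id;
	}
};
```

Because there are only ~50 distinct keys, each thread's cache warms up almost immediately and nearly all lookups take the lock-free path.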
* don't store duplicate way geometries

  A common pattern is:

  ```lua
  way:Layer("waterway", false)
  ...
  way:Layer("waterway_names", false)
  ```

  Previously, we'd process the geometry twice, and store a second copy of it in memory. Instead, re-use the previously stored geometry. This saves another ~1GB of memory for the GB extract. It doesn't seem to affect runtime - I think we only re-use linestrings, and linestrings are relatively cheap to do `is_valid` on.

  It seems like with the rest of the work on this branch, the `OutputObjectXyz` classes are very thin -- inspecting `geomType` in order to construct the right one was a bit tedious, so I removed them.
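The geometry re-use described above might work along these lines: if the way currently being processed already stored a geometry (because a second `Layer()` call arrived for the same way), hand back the existing index instead of processing and storing a second copy. `GeometryStore` and its members are illustrative stand-ins, not tilemaker's real types.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

class GeometryStore {
	std::vector<std::string> geometries; // stand-in for real geometry objects
	int64_t lastWayId = -1;              // way currently being processed
	size_t lastIndex = 0;                // index of its stored geometry
public:
	size_t store(int64_t wayId, const std::string& geom) {
		// Second Layer() call for the same way: reuse the stored geometry.
		if (wayId == lastWayId) return lastIndex;
		geometries.push_back(geom);
		lastWayId = wayId;
		lastIndex = geometries.size() - 1;
		return lastIndex;
	}
	size_t size() const { return geometries.size(); }
};
```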
Running on planet.osm from June 2021 (which I had to hand!), with …
The locks in AttributeStore are necessary only during PBF reading, to avoid concurrent mutations corrupting things. Once we're writing the mbtiles, it's safe to read without acquiring the lock. This eliminates ~9% of system time, and ~2-3% of wall clock time. The PR also adds a `finalize()` to AttributeStore, AttributeKeyStore and AttributePairStore. Nothing actually uses this yet - I initially checked the `finalized` variable and threw if an unsafe method was called, but that gave up the speed benefits, so I removed it again. Perhaps in the future, a debug build could leave such checks in to detect programming errors.
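The read/write phase split described above can be sketched as follows: writes during PBF reading take a lock, and once reading ends `finalize()` marks the store immutable, so tile-writing reads skip the lock entirely. This is an illustrative sketch under assumed names, not tilemaker's actual classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

class LockedStore {
	std::mutex mutex;
	std::deque<uint32_t> items; // deque: growth never invalidates references
	bool finalized = false;
public:
	// Writing phase (PBF reading): concurrent writers need the lock.
	size_t add(uint32_t v) {
		std::lock_guard<std::mutex> lock(mutex);
		items.push_back(v);
		return items.size() - 1;
	}
	// Marks the transition to the read-only phase. A debug build could
	// check `finalized` in add() to catch writes after this point.
	void finalize() { finalized = true; }
	// Reading phase (mbtiles writing): safe without the lock, because
	// nothing mutates the store after finalize().
	uint32_t get(size_t i) const { return items[i]; }
};
```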
This reworks tilemaker's internal storage to have a cleaner separation of objects.
This enables more understandable code and less duplication, and perhaps other interesting stuff in the future. 👀
There are a couple of small optimisations in here, but generally there should be little to no performance impact. After this PR, creating an OMT-compatible .mbtiles from great-britain-latest.osm.pbf on my reference system is largely unchanged: ~5m20s execution time with ~14.3GB peak memory usage.