Don't preflight zoom levels for potential as-needed dropping #40
Conversation
In #3 I write all of the tiles to a tempfile (protected by a mutex) as they're generated, and also add an entry (zxy, offset/len) to a vector. It sounds like I would need to slightly revise this by using a std::map for the entries so I can overwrite an entry if the zoom level is rewritten. That means there would be dead, unreferenced data in the tempfile, which would be skipped over when the tempfile is turned into the final archive. Does that approach sound reasonable? One concern is that if entire deep zoom levels are being re-done, it will leave a massive amount of dead data in the tempfile, possibly consuming much more disk space than the final archive.
@bdon Can you truncate your temporary file to the length it was at the end of the last successfully-generated zoom level, since it will always be entire zoom levels that are retried? But if that isn't possible, I think it is OK to leave unreferenced data in a temporary file, even though it might add up.
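The truncation suggestion can be sketched with POSIX `lseek`/`ftruncate`: checkpoint the file length after each fully successful zoom level, and on a retry cut the file back to that checkpoint. Function names here (`checkpoint`, `rewind_to`) are illustrative, not from either project.

```cpp
#include <cstdio>
#include <unistd.h>

// Record the tempfile length after a zoom level fully succeeds.
long long checkpoint(int fd) {
    return lseek(fd, 0, SEEK_END);
}

// On retry, discard everything written after the checkpoint so the
// tempfile never accumulates dead bytes from abandoned attempts.
bool rewind_to(int fd, long long good_len) {
    if (ftruncate(fd, good_len) != 0) return false;
    return lseek(fd, good_len, SEEK_SET) == good_len;
}
```

As the comment below notes, this only works if every writer thread has quiesced before the truncation, since a thread still appending past the checkpoint would have its offsets invalidated.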
Thanks for the detailed explanation!
I opened #43 as a simpler alternative to having PMTiles as a sibling output. Truncating the tempfile sounds tricky with multiple threads involved, so I'd prefer the approach in #43; otherwise I'd prefer to leave the dead data in the tempfile.
Instead, go ahead and write out the zoom level, and if any of the tiles didn't work, clear out the zoom level and do it again. This should make the common case, where the data fits and nothing has to be dropped as-needed, much faster.
Tippecanoe previously would preflight each zoom level to determine whether all of its tiles were going to be small enough, or, if they weren't, what feature-dropping or -coalescing threshold would allow the entire zoom level to succeed, before doing another pass through the zoom level to actually write out the tiles. Now it writes out the tiles on the first pass, and only if it actually needs to change something to make the zoom level work does it erase the zoom level from the mbtiles output and try it again.
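The control flow described above can be sketched as an optimistic retry loop. Everything here is a stand-in for illustration: tile sizes are given as plain byte counts, and each retry step pretends that a more aggressive drop threshold halves every tile. Tippecanoe's real feature-dropping logic is far more involved.

```cpp
#include <vector>
#include <cstddef>

// Hypothetical sketch of the optimistic strategy: write the zoom level
// assuming no dropping is needed; only if some tile exceeds the size
// limit, discard the zoom level and retry with more aggressive dropping.
// Returns the number of retries (drop-threshold steps) that were needed.
int write_zoom_with_retries(const std::vector<size_t> &tile_sizes,
                            size_t max_bytes) {
    int drop_step = 0;  // 0 = no dropping, the common fast path
    for (;;) {
        bool ok = true;
        for (size_t s : tile_sizes) {
            size_t shrunk = s >> drop_step;  // pretend each step halves tiles
            if (shrunk > max_bytes) {
                ok = false;  // this tile would be too big
                break;
            }
            // write_tile(shrunk);  // in the common case, the only pass
        }
        if (ok) return drop_step;
        // erase_zoom();  // clear the partial zoom level and redo it
        drop_step++;
    }
}
```

The old preflight behavior always paid for two passes; this pays for one pass whenever `drop_step` stays 0, which matches the timings below for the no-dropping case.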
Sample time savings (no features being dropped):

```
tippecanoe --no-tile-size-limit --no-feature-limit --drop-densest-as-needed tl_2021_06037_roads.shp.json
before: 56.25s user 3.31s system 239% cpu 24.825 total
after:  31.96s user 2.10s system 196% cpu 17.338 total
```

Sample time savings (features being dropped from some zoom levels):

```
tippecanoe --drop-densest-as-needed tl_2021_06037_roads.shp.json
before: 62.82s user 3.37s system 208% cpu 31.732 total
after:  49.42s user 2.72s system 190% cpu 27.440 total
```

No intended change without as-needed:

```
tippecanoe --no-tile-size-limit --no-feature-limit tl_2021_06037_roads.shp.json
before: 31.35s user 1.95s system 213% cpu 15.587 total
after:  30.83s user 2.06s system 224% cpu 14.622 total
```