Split a pmtiles file #25
Comments
Thought about this for a bit and here's what I think are the benefits/drawbacks:
GitHub Pages seems to be meant for versioned code/docs and some associated assets, so I don't think it's a great fit as a primary target for tile archive hosting, though being free and fast is nice, and you can accomplish the same thing by expanding to directories/archives. Are there other examples out there where we need to split archives to a maximum piece size? 32-bit systems might be one, but I'd rather not consider that in scope.
Agree that since the goal of pmtiles is to combine many files into one, it may not make sense to split them back out again... According to https://github.com/phiresky/sql.js-httpvfs, the benefits they see in splitting a large file that the client makes byte range requests against are:
Also, with something like S3, is it possible to allow only range requests? One concern with hosting a tileset in S3 is that a client could send a request without a Range header and accidentally start downloading the whole thing, which could run up bandwidth costs quickly. A split archive would partially mitigate that concern, but maybe it's not really an issue in practice?
It's not possible on raw S3 to allow only range requests. That concern is somewhat mitigated by having clients implement a rudimentary check as shown on this line: https://github.com/protomaps/PMTiles/blob/master/js/index.src.mjs#L71 In practice, it can be an issue, but it's not unique to PMTiles; the other cloud-optimized formats like COG have the same drawback. The best solution for now is to run a proxy in front of your bucket such as https://github.com/protomaps/go-pmtiles, but of course that's no longer just S3 :)
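For illustration, here is a minimal sketch of what such a client-side check could look like. It is not a copy of the linked line; the function name `range_response_problem` and its parameters are hypothetical, and it only relies on the standard HTTP behavior that a server honoring a Range header answers with 206 Partial Content.

```python
from typing import Mapping, Optional


def range_response_problem(
    status: int,
    headers: Mapping[str, str],
    requested_bytes: int,
) -> Optional[str]:
    """Return a reason to abort the download, or None if the response looks partial.

    Hypothetical helper for illustration only; not part of the PMTiles library.
    """
    # HTTP header names are case-insensitive, so normalize them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    # A server that honors the Range header answers 206 Partial Content.
    # A plain 200 means the header was ignored and the whole file is coming.
    if status == 200:
        return "server ignored the Range header (200 instead of 206)"
    if status != 206:
        return f"unexpected status {status}"

    # Second, defensive check: the body should not exceed what was asked for.
    content_length = lowered.get("content-length")
    if content_length is not None and int(content_length) > requested_bytes:
        return "response body is larger than the requested range"
    return None
```

A client would call this as soon as the status line and headers arrive and cancel the request before reading the body if a reason is returned. This only protects well-behaved clients; it does nothing about someone deliberately fetching the whole file.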
OK thanks, that check helps prevent accidental full downloads, but there's still the issue of intentional full downloads, which could become a problem with a 100 GB full-planet tileset hosted on S3, since each full download would cost the owner $10 in egress fees. I was thinking of using pmtiles for the planetiler demo site (~500 MB mbtiles file on GitHub Pages), but if splitting a pmtiles archive doesn't make sense then I can stick with the current approach of extracting all of the tiles to individual files.
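The arithmetic behind that figure can be sketched as follows. The default rate of $0.10 per GB is an assumption inferred from the numbers in this comment ($10 for 100 GB), not a quoted price, and `egress_cost_usd` is a hypothetical helper name.

```python
def egress_cost_usd(archive_gb: float, downloads: int, usd_per_gb: float = 0.10) -> float:
    """Estimate the egress bill for a number of full downloads of one archive.

    usd_per_gb defaults to the rate implied by the comment above (an assumption);
    substitute your provider's actual egress price.
    """
    if archive_gb < 0 or downloads < 0 or usd_per_gb < 0:
        raise ValueError("sizes, counts and prices must be non-negative")
    return archive_gb * downloads * usd_per_gb


# One full download of a 100 GB archive, then one hundred of them.
single = egress_cost_usd(100, 1)
hundred = egress_cost_usd(100, 100)
print(f"1 download: ${single:.2f}, 100 downloads: ${hundred:.2f}")
```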
Yeah, I agree the intentional linking/leeching is a concern - the basemap downloads I offer at http://protomaps.com/downloads are limited to at most a hundred or so megabytes, and my stopgap solution for larger maps is proxy-based, like above. I'm optimistic that the long-term solve here will be downward market pressure on bandwidth prices in the next few years, for example if/when Cloudflare R2 becomes available.
I'm going to close this issue about archive splitting for now; I think the https://bdon.github.io/planetiler-demo/ (endpoint http://free-tiles.protomaps.com/planetiler/{z}/{x}/{y}.pbf) Open to suggestions on how to organize the URL structures or metadata, or access for hosting regular updates.
Some hosts (like GitHub Pages) have maximum file sizes. Alternatives like https://github.com/phiresky/sql.js-httpvfs provide a way to split the tile archive into pieces that are each smaller than that maximum file size (https://github.com/phiresky/sql.js-httpvfs/blob/master/create_db.sh). Would it be possible for the pmtiles reader and writer to optionally support splitting a pmtiles file?
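To make the request concrete, here is a rough sketch of the general idea: cut the archive into fixed-size pieces, then translate a byte range of the original file into reads against those pieces. This is not an existing PMTiles feature and not the sql.js-httpvfs implementation; all function names below are hypothetical, and the pieces are held in memory only to keep the example short.

```python
from typing import List, Sequence, Tuple


def split_into_pieces(data: bytes, max_piece_size: int) -> List[bytes]:
    """Cut an archive into consecutive pieces of at most max_piece_size bytes."""
    if max_piece_size <= 0:
        raise ValueError("max_piece_size must be positive")
    return [data[i:i + max_piece_size] for i in range(0, len(data), max_piece_size)]


def locate_range(offset: int, length: int, max_piece_size: int) -> List[Tuple[int, int, int]]:
    """Map a byte range of the original archive to per-piece reads.

    Each tuple is (piece_index, start_within_piece, byte_count).
    """
    reads = []
    remaining = length
    while remaining > 0:
        piece_index, start = divmod(offset, max_piece_size)
        # Never read past the end of the current piece.
        take = min(remaining, max_piece_size - start)
        reads.append((piece_index, start, take))
        offset += take
        remaining -= take
    return reads


def read_range(pieces: Sequence[bytes], offset: int, length: int, max_piece_size: int) -> bytes:
    """Reassemble a byte range by concatenating the matching slices of each piece."""
    return b"".join(
        pieces[index][start:start + count]
        for index, start, count in locate_range(offset, length, max_piece_size)
    )
```

In a real reader each tuple from `locate_range` would become one HTTP range request against the corresponding piece file, so a range that straddles a piece boundary costs two requests instead of one.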