Caching #96

Closed
andreschardong opened this issue Jan 18, 2017 · 14 comments

@andreschardong

Is caching PBF tiles in the plans? I'm wondering if I could help. I've never worked with Go before, but I could give it a try.

@ARolek
Member

ARolek commented Jan 18, 2017

It's been a point of discussion many times, though we have not decided how we want to proceed. Our current suggestion is to use a CDN or caching layer in front of tegola. This keeps tegola focused on slicing tiles from data providers.

We have played with the idea of supporting a local disk cache using the z/x/y.pbf naming convention, as well as "seeding tools" (#64) which would precompute all the tiles with the same convention. The tiles could then be persisted to somewhere like S3.
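
To make the naming convention concrete, here is a minimal sketch of a z/x/y.pbf disk cache. It is only an illustration of the idea discussed above, not tegola's implementation; the function names (`tile_path`, `write_tile`, `read_tile`) and the temporary-file rename step are assumptions made for this example.

```python
from pathlib import Path
from typing import Optional


def tile_path(root: Path, z: int, x: int, y: int) -> Path:
    """Return the on-disk location of a tile, e.g. <root>/12/654/1583.pbf."""
    return root / str(z) / str(x) / f"{y}.pbf"


def write_tile(root: Path, z: int, x: int, y: int, data: bytes) -> None:
    """Store encoded tile bytes, creating the z/x directories on demand."""
    path = tile_path(root, z, x, y)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Hypothetical durability step: write to a temporary name, then rename,
    # so a concurrent reader never sees a half-written tile.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_bytes(data)
    tmp.replace(path)


def read_tile(root: Path, z: int, x: int, y: int) -> Optional[bytes]:
    """Return the cached tile bytes, or None on a cache miss."""
    try:
        return tile_path(root, z, x, y).read_bytes()
    except FileNotFoundError:
        return None
```

Because the layout mirrors the request path, the same directory tree could also be served or synced as static files.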

Maybe our discussion will turn into a fun project for you to give Go a try. What would be your desired caching solution?

@andreschardong
Author

andreschardong commented Jan 18, 2017

A local disk cache using the z/x/y.pbf schema would be perfect (at least for my project). We have some huge datasets in PostGIS, and tegola has to do the heavy processing and produce the tiles (pbf files) on each user request. The beauty is that tegola takes care of the simplification and does an awesome job with it. Much better than a simple ST_Simplify in PostGIS, for instance.

@ARolek ARolek added the feature label Jan 19, 2017
@ARolek
Member

ARolek commented Jan 19, 2017

@andreschardong got it. What are your thoughts on clearing out the cache? We go back and forth on this one. A couple options we have considered:

  • Use a TTL, and crawl the cache directory every so often to purge old tiles.
  • Purge on reboot of the instance
  • Purge on reload of a config (no downtime config reloading is something we're working on)
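
The TTL option could look roughly like the sketch below: walk the cache directory and delete tiles whose modification time is older than the TTL. This is an assumption-laden illustration, not a tegola feature; `purge_expired` is a hypothetical name, and treating a TTL of 0 as "never expire" is a choice made for this example.

```python
import os
import time
from pathlib import Path
from typing import Optional


def purge_expired(root: Path, ttl_seconds: float, now: Optional[float] = None) -> int:
    """Delete cached .pbf files older than ttl_seconds; return how many were removed.

    Assumption for this sketch: a TTL of 0 (or less) means "never expire".
    """
    if ttl_seconds <= 0:
        return 0
    now = time.time() if now is None else now
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".pbf"):
                continue
            path = Path(dirpath) / name
            try:
                if now - path.stat().st_mtime > ttl_seconds:
                    path.unlink()
                    removed += 1
            except FileNotFoundError:
                # Another process removed the tile between listing and stat.
                pass
    return removed
```

A crawl like this would be run periodically, for example from a timer or a cron job.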

Also if you're interested in taking a stab at the cache functionality I'm happy to bounce around ideas on the design.

@andreschardong
Author

andreschardong commented Jan 20, 2017

@ARolek, regarding the options:
Use a TTL: that's OK. I would leave the default at 0 (never).
Purge on reboot of the instance: I would not do it.
Purge on reload of a config: not sure if it's a good idea. Perhaps just purge the deleted layers (layers on disk that are no longer in the config file). BTW, great to hear that you're working on the config reload feature.

Some features that could be added (based on GeoServer):

  • allow the user to empty/purge the cache manually (tegola would have to list the layers).
  • implement a disk quota (I would leave that for a second phase; I think it is harder to implement).

@pnorman
Contributor

pnorman commented Jun 1, 2017

I think caching is essential when rendering from a continuously updated data source like OSM.

There are two strategies in production: pre-rendering the entire planet into the cache, or pre-rendering some parts into cache and rendering the rest on-demand.

When pre-rendering everything, you need to be able to seed the cache and then re-render tiles as osm2pgsql or similar tools report them expired.

If pre-rendering some and live-rendering others, the standard way is what mod_tile does with the idea of a "dirty" tile, one which is not flushed from cache, but kept around and not re-rendered until re-requested.
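
As a rough illustration of the "dirty" tile idea, the sketch below keeps expired tiles in the cache, flags them, and only re-renders when they are next requested. It is a simplified in-memory reading of the description above and does not reproduce mod_tile's actual logic; `DirtyTileCache`, the `render` callback, and the fallback to the stale copy when rendering fails are all assumptions of this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

TileKey = Tuple[int, int, int]  # (z, x, y)


@dataclass
class Entry:
    data: bytes
    dirty: bool = False


class DirtyTileCache:
    """In-memory cache where expired tiles are flagged instead of deleted."""

    def __init__(self, render: Callable[[int, int, int], bytes]) -> None:
        self._render = render
        self._entries: Dict[TileKey, Entry] = {}

    def mark_dirty(self, z: int, x: int, y: int) -> None:
        """Flag a cached tile as stale; no rendering happens here."""
        entry = self._entries.get((z, x, y))
        if entry is not None:
            entry.dirty = True

    def get(self, z: int, x: int, y: int) -> bytes:
        """Return a tile, rendering only on a miss or when the tile is dirty."""
        key = (z, x, y)
        entry = self._entries.get(key)
        if entry is None:
            entry = Entry(self._render(z, x, y))
            self._entries[key] = entry
        elif entry.dirty:
            try:
                entry.data = self._render(z, x, y)
                entry.dirty = False
            except Exception:
                # Assumed fallback: keep serving the stale copy and leave it
                # dirty so the next request tries again.
                pass
        return entry.data
```

Marking a tile dirty is cheap, so an expiry feed can flag many tiles without triggering any rendering work until someone actually asks for them.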

It's worth looking at the logic for Mapzen's tilezen, mod_tile/renderd or tirex/renderd, and Kartotherian. t-rex might also be worth a look, but I haven't investigated what they're doing yet.

@bagage

bagage commented Jun 26, 2017

Is there any documentation on how to set up caching with Tegola and Nginx, for instance?
I set up an nginx configuration that works, but I haven't managed to get a "clear cache" functionality working yet. I cannot find an easy way to list pbf files located in a given bounding box… so any hints are welcome :).

@ARolek
Member

ARolek commented Jun 27, 2017

@bagage are you using nginx Content Caching? An alternative suggestion would be to use a CDN in front of tegola to handle the caching.

I'm not quite sure what you mean by:

I cannot find an easy way to list pbf files located in a given bounding box…

Are you trying to find the tiles that intersect with a lat / lng bounding box?

@ARolek ARolek added this to the v0.5.0 milestone Jun 27, 2017
@bagage

bagage commented Jun 27, 2017

@ARolek yes I'm using Content Caching. It does the job well, except when I wish to force refreshing a given zone.

Are you trying to find the tiles that intersect with a lat / lng bounding box?

Yes, I'd like to get the list of .pbf files located within a lat / lng bounding box for all zoom levels, so that I can send a PURGE command on them. So far I only managed to get generic URLs, for instance:

... lots more ...
https://server.org/tegola/maps/layer/18/137681/97833.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97834.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97835.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97836.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97837.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97838.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97839.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97840.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97841.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97842.vector.pbf
https://server.org/tegola/maps/layer/18/137681/97843.vector.pbf
... lots more too ...

But most of these files do not exist or are empty (because nothing is located at that position), and sending too many PURGE requests does not work well.
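
For illustration, a purge loop over such URLs might look like the sketch below. The URL template is copied from the example URLs above; everything else is an assumption: `tile_url` and `purge_tiles` are hypothetical helpers, the server must be configured to accept the PURGE method at all, and treating a 404 response as "tile was not cached" is a guess about how that server answers.

```python
import urllib.error
import urllib.request
from typing import Iterable, Tuple

# Pattern taken from the example URLs in this thread.
URL_TEMPLATE = "https://server.org/tegola/maps/layer/{z}/{x}/{y}.vector.pbf"


def tile_url(z: int, x: int, y: int, template: str = URL_TEMPLATE) -> str:
    """Build the URL of a single tile."""
    return template.format(z=z, x=x, y=y)


def purge_tiles(
    tiles: Iterable[Tuple[int, int, int]],
    template: str = URL_TEMPLATE,
    timeout: float = 5.0,
) -> int:
    """Send one PURGE request per (z, x, y) tile; return how many succeeded."""
    purged = 0
    for z, x, y in tiles:
        request = urllib.request.Request(tile_url(z, x, y, template), method="PURGE")
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                if 200 <= response.status < 300:
                    purged += 1
        except urllib.error.HTTPError as err:
            # Assumption: 404 means the tile was never cached, so skip it.
            if err.code != 404:
                raise
    return purged
```

Requests are sent one at a time here; a large bounding box at a high zoom still means a very large number of requests.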

@ARolek
Member

ARolek commented Jun 27, 2017

@bagage I'm trying to think what's the best way to pull this off. I think you're going to need to do the following steps:

  1. Convert the lower left lat/lng and upper right lat/lng to x/y tile values for each zoom you want to purge.
  2. Calculate the various tiles in between the min and max x/y tile values for each zoom.
  3. Issue a PURGE command for each z/x/y value.

You can see the math for converting lat/lng to x/y given a z value here.
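
The steps above can be sketched as follows, using the standard Web Mercator "slippy map" tile formula. This is an illustrative sketch rather than the code linked in the thread; the function names and the latitude clamp value are choices made for this example.

```python
import math
from typing import Iterable, Iterator, Tuple

MAX_LAT = 85.0511  # approximate latitude limit of the Web Mercator projection


def lnglat_to_tile(lng: float, lat: float, z: int) -> Tuple[int, int]:
    """Convert a lng/lat pair to the x/y index of the tile containing it."""
    n = 2 ** z
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    x = math.floor((lng + 180.0) / 360.0 * n)
    y = math.floor((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lng = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_bbox(
    min_lng: float, min_lat: float, max_lng: float, max_lat: float,
    zooms: Iterable[int],
) -> Iterator[Tuple[int, int, int]]:
    """Yield every (z, x, y) tile that intersects the bounding box."""
    for z in zooms:
        # Tile y grows southwards, so the lower-left corner gives the max y.
        x_min, y_max = lnglat_to_tile(min_lng, min_lat, z)
        x_max, y_min = lnglat_to_tile(max_lng, max_lat, z)
        for x in range(x_min, x_max + 1):
            for y in range(y_min, y_max + 1):
                yield z, x, y
```

Each yielded (z, x, y) can then be turned into a tile URL or cache path and purged.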

This is an interesting request around caching. I'm going to think on it a bit to see if we should support "area purging". I can see the application, basically purge the cache when a subset of data has been modified. I don't know if exposing this via HTTP is a good idea but maybe as a sub command that could be issued, i.e. tegola purge --dir=path/to/cache/ --ll=[lower-left-lat,lng] --up=[upper-right-lat,lng]

@bagage

bagage commented Jun 27, 2017

@ARolek Thanks for the pointers. That's actually how I implemented it, but it results in way too many PURGE requests (around 28K requests for zoom 18 for a "screen size bounding box"). Maybe my code is wrong though, will recheck it.

Not sure what the ideal solution could be here, exposing a purge area via HTTP could be useful imo (even if in my case I will have access to tegola purge directly).

@ARolek
Member

ARolek commented Jun 27, 2017

@bagage yeah something has to be off with your script. What did you write it in? I can try to take a quick look if you want to post it up on a gist.

@bagage

bagage commented Jun 28, 2017

@ARolek never mind, the issue was actually simply due to the bounding box dimensions and zoom levels (both too big at the same time). For the record, my implementation (similar to yours) is here. With a lower zoom value, it works like a charm! Cool!
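
To see why bounding box size and zoom level interact so badly, note that for a fixed bounding box each extra zoom level roughly quadruples the number of tiles. The sketch below counts tiles per zoom without enumerating them; `tile_counts` is a hypothetical helper written for this example and is not the implementation linked above.

```python
import math
from typing import Dict, Iterable, Tuple

MAX_LAT = 85.0511  # approximate latitude limit of the Web Mercator projection


def lnglat_to_tile(lng: float, lat: float, z: int) -> Tuple[int, int]:
    """Convert a lng/lat pair to the x/y index of the tile containing it."""
    n = 2 ** z
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    x = math.floor((lng + 180.0) / 360.0 * n)
    y = math.floor((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_counts(
    min_lng: float, min_lat: float, max_lng: float, max_lat: float,
    zooms: Iterable[int],
) -> Dict[int, int]:
    """Return {zoom: number of tiles intersecting the bounding box}."""
    counts: Dict[int, int] = {}
    for z in zooms:
        x_min, y_max = lnglat_to_tile(min_lng, min_lat, z)  # lower left
        x_max, y_min = lnglat_to_tile(max_lng, max_lat, z)  # upper right
        counts[z] = (x_max - x_min + 1) * (y_max - y_min + 1)
    return counts
```

Checking the counts before issuing any PURGE requests makes it easy to cap the zoom range or split the work when the total is too large.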

@ARolek
Member

ARolek commented Jun 28, 2017

@bagage great solution! When I get working on caching support I will design for bounding box & zoom area purging too. Thank you for the feedback!

@ARolek ARolek self-assigned this Oct 14, 2017
ARolek added a commit that referenced this issue Oct 16, 2017
…strategy. the cacher interface needs to be thought about a bit more. #96
ARolek added a commit that referenced this issue Oct 16, 2017
ARolek added a commit that referenced this issue Oct 16, 2017
ARolek added a commit that referenced this issue Oct 17, 2017
ARolek added a commit that referenced this issue Oct 17, 2017
ARolek added a commit that referenced this issue Oct 18, 2017
…ntation. extra error checking and durability for the file cache. #96
ARolek added a commit that referenced this issue Oct 24, 2017
ARolek added a commit that referenced this issue Oct 24, 2017
ARolek added a commit that referenced this issue Oct 24, 2017
@ARolek
Member

ARolek commented Nov 4, 2017

A file cache has been implemented in the release candidate for v0.4.0. The first implementation is a file cache, but we're positioned to support other caching backends as necessary.

Seeding and purging are coming up next. I'm going to address purging in another issue so we can close this one.

@ARolek ARolek closed this as completed Nov 4, 2017