Skip to content
This repository has been archived by the owner on Aug 25, 2023. It is now read-only.

Discuss API refinements, including caching #73

Closed
5 tasks done
ghost opened this issue Mar 11, 2014 · 57 comments
Closed
5 tasks done

Discuss API refinements, including caching #73

ghost opened this issue Mar 11, 2014 · 57 comments
Labels
Milestone

Comments

@ghost
Copy link

ghost commented Mar 11, 2014

Over at issue #61 we considered a refactored design for asynchronously loading TileJSON, and at pull request #71 we discussed my issue61 branch with a new approach to solving problems around the asynchronous TileJSON stuff.

The conclusion from #71 seems to be that the general approach in the issue61 branch is sound, but it needs to be packaged within an API which feels consistent with Apple's API's in general, and MKMapKit in particular. So this ticket is for discussing refinements to the API. One of the big open questions is what the caching will look like.

My plan for the moment is:

I'm guessing this iteration will reach a good stopping point by approximately March 20th (next Thursday).

@ghost ghost added this to the 0.3.0 milestone Mar 11, 2014
@ghost ghost added the refactor label Mar 11, 2014
@ghost
Copy link
Author

ghost commented Mar 11, 2014

The new branch for working on this is at mapbox:issue73

@ghost
Copy link
Author

ghost commented Mar 11, 2014

Something about upgrading to Xcode 5.1 yesterday seems to have messed up the maps entitlement for the new OS X demo app (the old MBXMapKit 0.1.0 demo app still works though). Probably this is just a local provisioning profile issue on my development box, but it may pop up elsewhere.

@ghost
Copy link
Author

ghost commented Mar 11, 2014

The Xcode provisioning profile thing for the OS X demo seems to be fine now. I did end up committing some changes to the project file as part of getting it working again.

ghost pushed a commit that referenced this issue Mar 12, 2014
ghost pushed a commit that referenced this issue Mar 12, 2014
@ghost
Copy link
Author

ghost commented Mar 13, 2014

Based on my experimenting so far, it doesn't seem to be possible to conveniently determine when the data from an NSURLSession data session task came from cache (vs an HTTP download) just by looking at the response. That's inconvenient, because I want to gather statistics about cache hits and misses in order to determine if, and how well, caching strategies are working.

The one option I can see so far is to implement URLSession:dataTask:willCacheResponse:completionHandler: from the NSURLSessionDataDelegate Protocol and have it add a userInfo dictionary with an entry flagging the response as having been locally cached. The responses do have caching related headers, but those are all about what's happening server-side and at the HTTP level. I want to be able to tell the difference between a freshly downloaded tile which was a CloudFront cache hit and the same tile (with its associated same response!) which might come out of the local disk cache 20 minutes, or 2 days, later.

@ghost
Copy link
Author

ghost commented Mar 13, 2014

The NSURLSessionDataDelegate idea didn't work because, as Apple's documentation explains, "If you create a task using a method that takes a completion handler block, the delegate methods for response and data delivery are not called."

@ghost
Copy link
Author

ghost commented Mar 13, 2014

I'm hitting a world of pain as I try to verify what NSURLCache + NSURLSession are actually doing in response to my requested cache and session configurations. My suspicion is that none of my NSURLSession requests are actually getting served from cache, and instead they are always resulting in new HTTP requests. The thing that concerns me is that, as far as I can tell, all the responses I'm getting are NSURLHTTPResponse, but some of them ought to be NSCachedURLResponse.

In order to get more visibility into, and control over, what happens during network requests, it looks like I might need to stop using data tasks with completion handlers and come up with an alternate approach based on implementing the NSURLSessionDelegate, NSURLSessionTaskDelegate, and NSURLSessionDataDelegate protocols.

ghost pushed a commit that referenced this issue Mar 14, 2014
ghost pushed a commit that referenced this issue Mar 14, 2014
@ghost
Copy link
Author

ghost commented Mar 14, 2014

I figured out a way to verify that tiles and JSON are definitely getting written to cache, at least some of the time, but I still haven't found a way to tell if and when subsequent requests are getting served from cache.

On my iPhone, the current cache configuration creates a Cache.db sqlite database in the app's sandbox at /AppData/Library/Caches/com.mapbox.MBXMapKit-iOS-Demo-v030. If you open that up and do a sqlite> SELECT request_key FROM cfurl_cache_response;, it dumps out a list of all the urls in the cache (lots of pngs for each map, along with metadata and marker json). So, I guess that's progress.

ghost pushed a commit that referenced this issue Mar 14, 2014
@ghost
Copy link
Author

ghost commented Mar 14, 2014

I've been running my phone's traffic through Charles Proxy and experimenting with setting cache policies like this:

NSURLRequest *request;
NSURLSessionDataTask *task;
request = [NSURLRequest requestWithURL:url
           cachePolicy:NSURLRequestReturnCacheDataDontLoad timeoutInterval:60];
task = [_dataSession dataTaskWithRequest:request completionHandler:^(NSData *data, 
        NSURLResponse *response, NSError *error){ /* ... */ }];

My conclusions are:

  1. The caching is definitely working, and lots of images/json are getting returned from cache.
  2. With the default cache policy, some of the cache hits are re-validated, so if you're watching traffic in a proxy, but not paying close attention to the headers and response codes, it might look like the resources are being re-downloaded.
    Revalidation requests for local cache hits include If-None-Match: ... and If-Modified-Since: ... headers, and regular requests for local cache misses don't include those headers. Responses can look like X-Cache: RefreshHit from cloudfront, X-Cache: Hit from cloudfront, and X-Cache: Miss from cloudfront. Presumably other X-Cache responses might also be possible.
  3. My existing notification mechanism for HTTP success, HTTP failure, and network failure is currently counting local cache hits as successful HTTP requests (that's bad). I'm not sure if it's possible to distinguish between a response that comes from a local cache hit and a response that just came over HTTP. I'm wondering if the thing to do might just be abandon my attempts to measure cache stuff from inside the app and instead just rely on an external proxy to make sure the cache is actually working.

ghost pushed a commit that referenced this issue Mar 14, 2014
@incanus
Copy link
Contributor

incanus commented Mar 14, 2014

Responses can look like X-Cache: RefreshHit from cloudfront, X-Cache: Hit from cloudfront, and X-Cache: Miss from cloudfront.

Just in case it's not clear, this has to do with whether tiles are missing and going all the way back to the backend render server, or if they are hitting and cached as rasters in the CDN.

@ghost
Copy link
Author

ghost commented Mar 14, 2014

I'm pretty close to convinced it's not possible to distinguish between an NSURLHTTPResponse that comes from a local cache hit and one that comes fresh off an HTTP connection. A bit ago I wrote,

The thing that concerns me is that, as far as I can tell, all the responses I'm getting are NSURLHTTPResponse, but some of them ought to be NSCachedURLResponse.

and now I'm pretty sure I misunderstood the purpose of NSCachedURLResponse. As far as I can tell, it is only used with
- (NSCachedURLResponse *)cachedResponseForRequest:(NSURLRequest *)request
and
- (void)storeCachedResponse:(NSCachedURLResponse *)cachedResponse forRequest:(NSURLRequest *)request
from NSURLCache.

After some time in the debugger looking at response objects coming back from dataTaskWithRequest:completionHandler:, I'm pretty convinced that there isn't any way to tell when you're getting a cached data/response pair. It is possible to tell when something is getting written to cache, but I can't see any way to tell when data is coming back out of the cache. NSURLCache seems to do its job well based on the caching policy you set, but it apparently doesn't like to share a lot of details about what's going on under the hood. This seems like a potential Radar.

@ghost
Copy link
Author

ghost commented Mar 14, 2014

@incanus re caching headers... yeah, all the server side stuff is fairly straightforward. I've been searching for any hints of about whether my responses are hits or misses from the local cache, and I'm just using this issue to take some notes.

ghost pushed a commit that referenced this issue Mar 14, 2014
…nable a way of counting api hits. However, there seems to be no way of distinguishing between a local cache hit response and a fresh response from the API, so it seems like my notification mechanism can’t actually be completed to serve its intended purpose.
@ghost
Copy link
Author

ghost commented Mar 14, 2014

Note to self: for persistent offline tile storage, Justin's recommended approach is: don't impose any arbitrary upper limit on the number of tiles it will attempt to download, and do provide a delegate callback which gives progress updates on percent of total completion.

Also, I should take a look at RMTileCache and its delegate in the sdk.

@ghost
Copy link
Author

ghost commented Mar 17, 2014

As I'm getting ready to take a swing at background tile downloading for offline maps, I see there's been a fair amount of relevant discussion over in the iOS SDK issues:

@ghost
Copy link
Author

ghost commented Mar 17, 2014

Apple's recommendations about directories for storing app data and how to exclude files from iCloud backups:

ghost pushed a commit that referenced this issue Apr 9, 2014
ghost pushed a commit that referenced this issue Apr 9, 2014
@ghost
Copy link
Author

ghost commented Apr 9, 2014

Status update summarizing my recent commits above:

  1. The MBXOfflineMapDatabase class is theoretically complete and ready, but it needs further testing on the sqlite side. It's usable for initializing an MBXRasterTileOverlay layer, and the sample app is set up to do that.
  2. The plumbing in the sample app to restore saved state from disk when the app is re-launched is theoretically complete. That includes resuming a suspended download job and populating the list of completed offline maps. The resume-a-download-on-re-launch feature has been tested against the timer based fake progress indicator that's currently still built into MBXOfflineMapDownloader, so it ought to work once the real stuff is ready.
  3. When MBXOfflineMapDownloader receives a begin message, it's currently set up to create a real metadata dictionary and a fake array of tiles to be downloaded (fake = from an array literal instead of a calculation). As of last night, it can now write the metadata and list of tiles out to an sqlite database. The database schema is:
CREATE TABLE metadata (name text, value text);
CREATE UNIQUE INDEX name ON metadata (name);
CREATE TABLE resources (url text, status text, data blob);
CREATE UNIQUE INDEX url ON resources (url);

The status column of the resources table is how MBXOfflineMapDownloader will keep track of which urls have been processed and which ones still need to be requested. Building the mechanism to do that is next up.

ghost pushed a commit that referenced this issue Apr 9, 2014
… status info from the resource table will be frequently read and updated, but the blobs only need to be written. This change aims to reduce the volume of changes to the resources table since it will be read frequently.
ghost pushed a commit that referenced this issue Apr 11, 2014
@ghost
Copy link
Author

ghost commented Apr 11, 2014

The next step is to build a pipeline for reading some urls from the db, putting those urls in the download queue, writing the downloaded files back to the db, and keeping the process rolling with additional batches of urls until everything has been downloaded.

Building that pipeline is going to take some thought about how to balance the following constraints and questions in a reasonable way, according to the "good enough" principle:

  1. How many urls should be read and processed in one batch? One is definitely too few, and 100 is quite possibly too many. In any case, the system should be robust enough to process maps with many thousands of tiles.
  2. If we've got the device's radio turned on, we should be making the most of the battery usage by running several downloads in parallel. That means we need to keep at least a small backlog in the NSURLSession's download queue.
  3. It needs to be possible to rapidly and cleanly suspend the processing and to resume it again later.

@incanus
Copy link
Contributor

incanus commented Apr 11, 2014

You may also find use in exploring hosts [a-d].tiles.mapbox.com and -[NSURLSessionConfiguration HTTPMaximumConnectionsPerHost].

@ghost
Copy link
Author

ghost commented Apr 11, 2014

With the maximum connections per host thing, I think that just using NSURLSession will work fine and there aren't really any problematic questions at that level (i.e. how do we download 4/8/16 files at once). It gets more complicated to make sure that NSURLSession consistently has a reasonable number of urls waiting in its queue to be downloaded.

I'm expecting that the NSURLSession will be managing a download queue with a typical backlog of perhaps tens of items waiting to be downloaded in parallel. On the other hand, there might be a backlog in the sqlite offline map database of 1,000 (or 10,000!) items waiting to be downloaded.

Throwing 10,000 urls at an NSURLSession seems like a pretty good way to cause memory problems, and on the other end, letting NSURLSession repeatedly drain the backlog down to nothing will lead to lots of time when less than the full number of possible simultaneous downloads are going at once (i.e. takes longer and uses more power).

I'm thinking the solution will look like two queues, the small one (NSURLSession) will be for the purpose of keeping simultaneous downloads going while the radio is turned on. The larger one (offline map sqlite db plus some logic) will be to make sure NSURLSession has a backlog that's not too large and not too small, for the purpose keeping memory usage and disk activity within reasonable limits.

@ghost
Copy link
Author

ghost commented Apr 11, 2014

Note to self: To get some order of magnitude numbers for how many tiles people might try downloading, try calculating the tiles for these regions, zoom 0-17:

  • SF (about 50 square miles)
  • NYC metro (about 500 square miles)
  • Yosemite Valley (about 10 square miles)

@ghost
Copy link
Author

ghost commented Apr 15, 2014

I just hit a significant milestone. Basically the first draft of my sqlite stuff for offline maps is complete and working. By that I mean I can create a db using an array of urls which need to be downloaded, I can read progress counts out of it, I can read batches of urls (using SELECT... LIMIT) to download, I can write image blobs and status codes back out, I can save the completed db to a unique filename, and I can find and display completed offline map db's on the map. This is progress!

The main things left on the todo list for this ticket are:

  • Calculate urls to be downloaded for a given map region (it's using a hardcoded list for now)
  • In the sample app, add a way to delete offline maps
  • Beat on the demo app, see what breaks, and fix it.

@ghost
Copy link
Author

ghost commented Apr 18, 2014

Status update: rough cut at dynamically generating the tile urls is working, but the marker icon stuff needs work in a couple different ways. When I hardcoded a url for a marker icon that I knew was in the markers.geojson, it didn't show up on the map, and that's a bit puzzling. Also, on the MBXOfflineMapDownloader side, I need to implement something to download and parse the marker geojson at the time that the list of urls is being created.

ghost pushed a commit that referenced this issue Apr 21, 2014
ghost pushed a commit that referenced this issue Apr 21, 2014
@ghost
Copy link
Author

ghost commented Apr 21, 2014

Offline maps work on iOS, markers included!

@incanus
Copy link
Contributor

incanus commented Apr 22, 2014

Handy pre-PR link comparing the issue73 branch against master:

master...issue73

@incanus
Copy link
Contributor

incanus commented Apr 22, 2014

This seems really well thought-out, @wsnook! I can't possibly hope to totally grok 65 commits and 32 changed files with 4,961 additions and 5 deletions just yet, particularly the whole functionality of MBXOfflineMapDownloader and its dual queue system, but here's how I see things proceeding:

  1. We address some more important comments below leading up to / when we voice Tuesday.
  2. We consider some other comments, more as backburner things that we can discuss on GitHub.
  3. We squash these into a new commit or handful of commits and open a pull request for more in-depth review and testing, getting it into master for some testing before release of 0.3.0.

Main comments (voice)

  • Overall, and I mean this from a general "Mapbox how we do it" angle, it would help in future if you could more clearly highlight how to test the new functionality. I'm able to read through all the source myself and figure it out, but it takes a while and having a nice "run this app and press X button" comment would really help before I dive all in, since I've been away from these for some weeks / never used your example app directly anyway.
  • Related to the above, we need to clean up all the v030 stuff and just make the final proposed-merge branch reflect the state that master should be. In a branch, don't be shy about blowing away our current view controller/sample app/whatever. Branches are cheap and non-permanent.
  • If I'm understanding things properly, MBXOfflineMapDatabase instances, though referenced by the MBXOfflineMapDownloader and used as arguments by the delegate methods and -removeOfflineMapDatabase:, are meant to be opaque, "private" items. We should consider making all the methods and most/all of the properties on MBXOfflineMapDatabase private, with a category inside MBXOfflineMapDownloader / any other class that needs to use them directly. That is to say the "public" should never be calling these methods. As a descendent of NSObject, too, you can throw an exception inside of a declared -init to discourage such direct use. Otherwise, believe me, it's gonna happen / generate support inquiries.
  • Why is the delegate passed in -[MBXRasterTileOverlay initWithOfflineMapDatabase:delegate:]? How is this different from instantiating with a mapID? I'm asking both out of implementation curiosity but also as a future potential confused user of the API.

Minor comments (online)

  • Properties like these should be more declarative and explain what they are for. metadata says "data structure" to me, not a BOOL value. Similar for markers, which says "image data or GeoJSON", not a boolean value. Probably something like hasMetadata? This is especially confusing when reading code like
    if(metadata)
    {
    [self asyncLoadMetadata];
    }
    else
    {
    _didFinishLoadingMetadata = YES;
    }
    where it's not really obvious what's happening. Does metadata = YES mean that metadata is not loaded?
  • Thinking ahead to how MBXOfflineMapDatabase will be used, could we maybe take -initWithContentsOfFile: one step further as we do over in https://github.com/mapbox/mapbox-ios-sdk/blob/3a1835bc8871dfba53befa20b41bddfdf1f4ca33/MapView/Map/RMMBTilesSource.h#L53-L63. These examples allow the user to specify a bundle resource or even just the first part of the filename (assuming .mbtiles) and the framework does the rest. Similarly, for offline map databases, these are basically always going to be in the app jail and either in the documents folder or the caches folder. I would recommend some default, but changeable, file extension (.mboffline?) perhaps before your .partial/.complete or even nothing appended to .mboffline in place of .complete, and then something like -initWithDatabaseName:inDocuments:, the latter argument being a BOOL defaulting to YES.
  • Related to the above, we should make iCloud backup configurable (
    // Make sure the offline map directory exists and that it will be excluded from iCloud backups
    ) per our discussions in Exclude db cache from iCloud backup, with option to include DEPRECATED-mapbox-ios-sdk#410. Defaulting as you've done but with the ability to change seems like good practice.
  • I like how the user agent for background downloads is differentiated (
    #if TARGET_OS_IPHONE
    userAgent = [NSString stringWithFormat:@"MBXMapKit (%@/%@) -- offline map", [[UIDevice currentDevice] model], [[UIDevice currentDevice] systemVersion]];
    #else
    userAgent = [NSString stringWithFormat:@"MBXMapKit (OS X/%@) -- offline map", [[NSProcessInfo processInfo] operatingSystemVersionString]];
    #endif
    ) but we should allow customization there, as well as setting it in one place for all uses of the SDK to avoid code duplication. The iOS SDK has a good model for this now in https://github.com/mapbox/mapbox-ios-sdk/blob/3a1835bc8871dfba53befa20b41bddfdf1f4ca33/MapView/Map/RMConfiguration.h#L52. It's just crazy simple there, as it should be.
  • Inside of -completeDatabaseAndInstantiateOfflineMapWithError: the idea of moving a file from a background thread (albeit supposedly a single-concurrency one) is kind of gross to me. Maybe an assert or dispatch_sync() to main queue there? Just something that ensures that the -setMaxConcurrentOperationCount:1 has happened elsewhere rather than implicitly linking these by intent, which will eventually be lost to the sands of time.
  • Why isn't -[MBXOfflineMapDownloader delegate] a property as is custom? Why is this write-only? That's weird for a delegate.

@ghost
Copy link
Author

ghost commented Apr 22, 2014

The Plan from voice chat just now with @incanus... I need to:

  1. Squash all these commits into a new branch and create a new issue/PR for discussing that branch
  2. Track down a glitch where the offline map downloader can hang up when changing states. This is probably a new thing that I introduced in the past couple days with the marker icon code.
  3. Write up some instructions about how to use the sample app to exercise and test the capabilities of the api, perhaps in the readme.
  4. Start looking at what needs to happen for cleaning out cruft and polishing things for up an 0.3.0 release. First item on that list should be replacing the main sample app with the code from the v030 sample app. What to do about OS X support is still unresolved

@ghost ghost mentioned this issue Apr 22, 2014
10 tasks
@ghost
Copy link
Author

ghost commented Apr 22, 2014

I'm closing this to be continued over at #79.

This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

2 participants