Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Http tiles #1887

Merged
merged 27 commits into from
Jul 26, 2019
Merged

Http tiles #1887

merged 27 commits into from
Jul 26, 2019

Conversation

gknisely
Copy link
Member

@gknisely gknisely commented Jul 18, 2019

This PR does 3 things:

  1. it allows caching of tiles to disk that have been fetched via curl, previously the cache was in memory so when the cache was flushed the tiles would need to be fetched via curl again if another access was needed
  2. it adds a method, given a bounding box, to expand or contract the bounding box to fit the edges which have begin nodes within the bounding box
  3. it adds remove_all support to filesystem as its needed for proper testing of caching tiles to disk

both things have unit tests though the second one has its own executable as well, which will of course tank the coverage report.

CMakeLists.txt Outdated Show resolved Hide resolved
@@ -630,5 +645,43 @@ std::unordered_set<GraphId> GraphReader::GetTileSet(const uint8_t level) const {
return tiles;
}

AABB2<PointLL> GraphReader::GetMinimumBoundingBox(const AABB2<PointLL>& bb) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is one of the main areas of new functionality. will give you a right sized bounding box that will look at the input box, find all the nodes in it and expand/contract the bounding box to fit the edges leaving those nodes and their shape


// for setting where to read compressed data from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i moved this into its own function so we can call it from multiple code paths, one such code path is seen above

// and if we cant cache to disk try to use it in-memory
auto tile = GraphTile();
if (gzipped) {
tile.DecompressTile(graphid, tile_data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heres the other spot where this is called

@@ -126,24 +129,51 @@ GraphTile::GraphTile(const GraphId& graphid, char* ptr, size_t size) : header_(n
Initialize(graphid, ptr, size);
}

GraphTile::GraphTile(const std::string& tile_url, const GraphId& graphid, curler_t& curler) {
GraphTile GraphTile::CacheTileURL(const std::string& tile_url,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made this into a static function that returns a graphtile instead of a constructor. its actually probably not necessary to be honest but i thought because of the caching to disk function it was a bit less tricky or more obvious to give it a name. let me know if anyone thinks this is better off as a constructor.

Initialize(graphid, graphtile_->data(), graphtile_->size());
// TODO: optionally write the tile to disk?
// try to cache it on disk so we dont have to keep doing this
if (!cache_location.empty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the caching part is optional though, so if you dont specify a tile_dir in the valhalla config it will just use the in memory cache as it did before with urls

tile.DecompressTile(graphid, tile_data);
} // we dont need to decompress so just take ownership of the data
else {
tile.graphtile_.reset(new std::vector<char>(0, 0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also support non gzipped tiles, there is a new config parameter to tell which one to expect when curling. the reason you want this is so that you can cache a gzipped tile on disk isntead of always unpacking everything and storing uncompressed data on disk.

@@ -303,7 +333,7 @@ void GraphTile::AssociateOneStopIds(const GraphId& graphid) {
}
}

std::string GraphTile::FileSuffix(const GraphId& graphid) {
std::string GraphTile::FileSuffix(const GraphId& graphid, bool gzipped) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when making file names its convenient to have this flag

namespace valhalla {
namespace midgard {

bool Point2::ApproximatelyEqual(const Point2& p) const {
return equal<first_type>(first, p.first, EPSILON) && equal<second_type>(second, p.second, EPSILON);
bool Point2::ApproximatelyEqual(const Point2& p, float e) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for testing its nice to be able to specify the required precision

@@ -0,0 +1,112 @@
#include "baldr/graphreader.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the program that lets you get bounding boxes on the command line. this will be unfortunate for the coverage report

@@ -0,0 +1,224 @@
#include "test.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an ingenous little test if i do say so myself :) it runs an http file server using prime_server to serve up the utrecht tiles we use for lots of other unit tests.

std::fstream input(full_path, std::ios::in | std::ios::binary);
if (input) {
std::string buffer((std::istreambuf_iterator<char>(input)), std::istreambuf_iterator<char>());
if (gz)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the cool thing is, that it does gzip compression of the responses on the fly so that when graphreader comes at it with a .gz request it returns a gzipped payload with proper content encoding. otherwise it returns the raw bytes from disk. this lets us test all the options below

@@ -0,0 +1,97 @@
#include "test.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we test the minimum bounding box functionality by first configuring graphreader to nothing and seeing its an invalid bb and then configure it to utrecht. we test both expansion and contraction.

* @param node GraphId of the node from which the transitions leave
* @return returns an iterable collection of node transitions
*/
midgard::iterable_t<const NodeInfo> GetNodes() const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was just convenient so i added it

@@ -187,16 +189,47 @@ class directory_entry {
filesystem::path path_;
};

class directory_iterator {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to do recursive delete, we need a non-recursive directory iterator, go figure. the recursion happens inside of the delete not at the iterator level

@@ -308,4 +341,19 @@ inline bool remove(const path& p) {
return ::remove(p.c_str()) == 0;
}

inline bool remove_all(const path& p) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heres where we implement rm -r. i think the comments suitably explain what is going on

dnesbitt61
dnesbitt61 previously approved these changes Jul 26, 2019
return nullptr;
}
// LOG_DEBUG("Memory map cache hit " + GraphTile::FileSuffix(base));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you intend to leave in these log messages?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in practice they were too chatty. i left them because they are useful for debugging this specific thing but i thought they were too much to always have on. trace would probably be more appropriate

if (!min_bb.minpt().IsValid())
min_bb = AABB2<PointLL>(node_ll, node_ll);

// Look at the shape of each edge leaving the node
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you really need the shape? can you just get the end node and use it to expand the bounding box? The shape will be held in the tile of the end node. Tough I guess if you want to do a loki search near a part of the shape outside both the begin and end node's tile then you would want to include the tile of the shape point (since that tile will have the bins to index into the end node tiles).

if (!cache_location.empty()) {
auto disk_location = cache_location + filesystem::path::preferred_separator + suffix;
auto dir = filesystem::path(cache_location + filesystem::path::preferred_separator + suffix);
dir.replace_filename("");
if (filesystem::create_directories(dir)) {
std::ofstream file(disk_location, std::ios::out | std::ios::binary | std::ios::ate);
file.write(&tile_data[0], tile_data.size());
return GraphTile(cache_location, graphid);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a bug and a slower way of doing it. it was a bug because we didnt call file.close() before teling the constructor to go read it off the disk. it was slow because we already have the bytes in memory from curl and so we just load them directly below

@kevinkreiser
Copy link
Member

going to update the changelog and merge

@kevinkreiser kevinkreiser merged commit 9f07205 into master Jul 26, 2019
@kevinkreiser kevinkreiser deleted the http_tiles branch September 10, 2019 13:24
@kevinkreiser kevinkreiser restored the http_tiles branch September 10, 2019 13:24
@kevinkreiser kevinkreiser deleted the http_tiles branch September 10, 2019 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants