Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add time dependent mapmatching output #2030

Merged
merged 17 commits into from
Nov 12, 2019
Merged

add time dependent mapmatching output #2030

merged 17 commits into from
Nov 12, 2019

Conversation

kevinkreiser
Copy link
Member

add time tracking to map matching path formation such that the proper expected time information returned when for example you do trace_route. this will allow the time from regular routes and trace routes to match for the same path at the same time of day.

Co-authored-by: kdiluca kristen.diluca@mapbox.com

…ion information is retained when it is a time dependent map match

Co-authored-by: kdiluca <kristen.diluca@mapbox.com>
src/thor/route_matcher.cc Outdated Show resolved Hide resolved
// We support either the epoch timestamp that came with the trace point or
// a local date time which we convert to epoch by finding the first timezone
uint32_t origin_epoch = 0;
if (options.shape().begin()->time() != -1.0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does -1.0 come from?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe he was checking the default value of time which is -1.0.
Suggestion: provide a bool isValidTime(double time) to make it more obvious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its the default when you dont specify a timestamp for your gps coordinate, it signifies no timestamp

@@ -442,6 +442,13 @@ bool is_restricted(const bool type,
return (dow_in_range && dt_in_range);
}

uint32_t second_of_week(uint32_t epoch_time, const NodeInfo* node, double elapsedtime) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when tracking time within FormPath we need to account for changes in timezone which may change the second of the week that we are on (which changes which speed we get in the edge costing). we add a small helper method to allow us to always get the timezone offset seconds of week after offseting from an epoch time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add this comment as a docstring for the function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is in the header we dont usually copy those to the impl

if (options.shape().begin()->time() != -1.0) {
origin_epoch = options.shape().begin()->time();
} else if (options.shape().begin()->has_date_time()) {
for (const auto& s : edge_segments) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we allow also the date time string as an input for the time of the route. the trick is we need a nodeinfo to get the timezone so we can get the epoch time from it. so we just find the first one and use it. this may be somewhat inaccurate if the beginning of the match has discontinuities but it is less likely that you are flip flopping timezones.

@@ -153,39 +173,43 @@ MapMatcher::FormPath(meili::MapMatcher* matcher,
// This means there is a discontinuity. If so, fallback to using costing
use_timestamps = false;
}
last_interp_index = interpolations.front().back().original_index;
if (!interpolations.empty())
last_interp_index = interpolations.front().back().original_index;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kdiluca i dont remember making this change can you speak to it? i can see its to protect against dereferencing something that doesnt exist but it defaults to an index of 0 so maybe its not all the protection we need. have to look below i guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok ive rejiggered thsi to avoid any possible access of interpolations that doesnt make sense

Comment on lines 198 to 206
// Set seconds from beginning of the week
uint32_t second_of_week = kInvalidSecondsOfWeek;
// have to always compute the offset in case the timezone changes along the path
// we could cache the timezone and just add seconds when the timezone doesnt change
// we wont have a valid node on the first pass but then again we also wont have any
// elapsed time so this would be irrelevant anyway
if (origin_epoch != 0 && nodeinfo) {
second_of_week = DateTime::second_of_week(origin_epoch, nodeinfo, elapsed.secs);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the meat of the change. on every new path edge we figure out what second of the week it was (if it was a time dependent route) and use that to do the edge costing below. thsi allows the costing to access the rigth speed for the time of day that route was supposed to have been happening

@@ -1,3 +1,4 @@
#include "baldr/datetime.h"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes in this file are basically the same as in map_matcher in spirit

if (use_timestamps)
elapsed.secs = shape[index].epoch_time() - shape[0].epoch_time();
elapsed.secs = shape.back().epoch_time() - shape[0].epoch_time();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like an unintended change @kdiluca, will revert

// Match failed along this edge, pop the last entry off path_infos and try the next edge
// Match failed along this edge, pop the last entry off path_infos as well as what it
// contributed to the elapsed cost/time and try to keep going on the next edge
elapsed -= cost;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously when we went down a path that didnt match we weren't removing the impact it had on the elapsed time. this should take care of that and was the only reason i broke out the cost into its own object above. another option would ahve been to look at the previous path info if any and do the subtracting from there

@@ -288,6 +330,7 @@ bool RouteMatcher::FormPath(const std::shared_ptr<DynamicCost>* mode_costing,
float de_remaining_length = de->length() * (1 - edge.percent_along());
float de_length = length_comparison(de_remaining_length, true);
EdgeLabel prev_edge_label;
Cost elapsed;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the elapsed cost is in the inner loop because the outer loop is in case a potential edge candidate fails. again we dont want to count that failures elapsed time toward the overall correct match so we reset for each candidate path we try

second_of_week =
DateTime::day_of_week(dt) * kSecondsPerDay + DateTime::seconds_from_midnight(dt);
if (origin_epoch != 0) {
second_of_week = DateTime::second_of_week(origin_epoch, node, elapsedtime);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was already working but now that we are doing it in other places, we refactored it out into a function and now can use it here as well

std::string to_locations(const std::vector<PointLL>& shape,
const std::vector<float>& accuracies,
int frequency = 0,
int start_time = 8 * 60 * 60) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when generating locations for these tests its nice to be able to set a start time that isnt 0

try {
walked_json = actor.trace_attributes(
R"({"costing":"auto","shape_match":"edge_walk","encoded_polyline":")" +
R"({"date_time":{"type":1,"value":"2019-10-31T18:30"},"costing":"auto","shape_match":"edge_walk","encoded_polyline":")" +
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for every test we first do an edge walk match with a date time on it to check that that code path works

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then for the regular map match we are using the to_locations function above to set timestamps on every resampled noisy gps point to trigger timestamp matching in that path way.

// We support either the epoch timestamp that came with the trace point or
// a local date time which we convert to epoch by finding the first timezone
uint32_t origin_epoch = 0;
if (options.shape().begin()->time() != -1.0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe he was checking the default value of time which is -1.0.
Suggestion: provide a bool isValidTime(double time) to make it more obvious.

origin_epoch = options.shape().begin()->time();
} else if (options.shape().begin()->has_date_time()) {
for (const auto& s : edge_segments) {
if (!s.edgeid.Is_Valid() || !matcher->graphreader().GetGraphTile(s.edgeid, tile))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matching->graphreader().GetGraphTile(s.edgeid, tile) == nullptr

if (!s.edgeid.Is_Valid() || !matcher->graphreader().GetGraphTile(s.edgeid, tile))
continue;
directededge = tile->directededge(s.edgeid);
if (matcher->graphreader().GetGraphTile(directededge->endnode(), tile)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check != nullptr

src/thor/map_matcher.cc Outdated Show resolved Hide resolved
// we could cache the timezone and just add seconds when the timezone doesnt change
// we wont have a valid node on the first pass but then again we also wont have any
// elapsed time so this would be irrelevant anyway
if (origin_epoch != 0 && nodeinfo) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nodeinfo != nullptr

@@ -211,8 +211,7 @@ const valhalla::TripLeg* PathTest(GraphReader& reader,
PathLocation::toPBF(path_location.back(), options.mutable_locations()->Add(), reader);
}
std::vector<PathInfo> path;
bool ret =
RouteMatcher::FormPath(mode_costing, mode, reader, trace, false, options.locations(), path);
bool ret = RouteMatcher::FormPath(mode_costing, mode, reader, trace, options, path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need better name for ret

// Match failed along this edge, pop the last entry off path_infos and try the next edge
// Match failed along this edge, pop the last entry off path_infos as well as what it
// contributed to the elapsed cost/time and try to keep going on the next edge
elapsed -= cost;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elapsed can not go negative, maybe need a check negative exception to catch bugs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so above we add cost on, and here we subtract it back off. logically you can say it would only go negative if floating point round off screwed it up and puts it just below 0 HOWEVER we are inside the node expansion function which means we've already previously added the first edge on the path, ie the elapsed cost was already non zero, ie not close to zero when we entered this function. because of that i dont see how its possible for it to ever go below zero (unless someone writes a costing for which an edge has 0 cost, however this would cause many worse problems before making it to this point)

for (uint32_t i = start_trans; i < node_info->transition_count(); ++i, ++trans) {
if (!from_transition && nodeinfo->transition_count() > 0) {
const NodeTransition* trans = tile->transition(nodeinfo->transition_index());
for (uint32_t i = start_trans; i < nodeinfo->transition_count(); ++i, ++trans) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimize away node0>transition_count() call in the for loop

elapsed +=
costing->EdgeCost(end_de, end_edge_tile, second_of_week) * end_edge.percent_along();
// overwrite time with timestamps
if (options.use_timestamps())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add {} for single line if condition

nodeinfo = nullptr;
}
}
}

// Interpolate match results if using timestamps for elapsed time
std::list<std::vector<interpolation_t>> interpolations;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using of std::list is suspicious. Rule of thumb, unless we need to perform list.splice() to do constant time re-ordering and such operations happens a lot. We should always avoid using std::list. Normally, reader would assume the code will often perform splice/ or delete element from middle(when container is huge, otherwise vector always wins in speed).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatives are std::deque/vector depends on if the size is know before hand, how many reallocations will be performed and how often do we delete items in the middle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i agree with this i just tend to not refactor everything if the PR that im working on isnt specifically about that.

@@ -25,7 +25,7 @@ Build Status

| Linux/MacOs | Windows | Code Coverage |
| ----------- | ------- | ------------- |
| [![Circle CI](https://circleci.com/gh/valhalla/valhalla.svg?style=svg)](https://circleci.com/gh/valhalla/valhalla) | [![Build status](https://ci.appveyor.com/api/projects/status/6w7emulgcjweu457/branch/master?svg=true)](https://ci.appveyor.com/project/kevinkreiser/valhalla/branch/master) | [![codecov](https://codecov.io/gh/valhalla/valhalla/branch/master/graph/badge.svg)](https://codecov.io/gh/valhalla/valhalla) |
| [![Circle CI](https://circleci.com/gh/valhalla/valhalla.svg?style=svg)](https://circleci.com/gh/valhalla/valhalla) | [![Build Status](https://dev.azure.com/valhalla1/valhalla/_apis/build/status/valhalla.valhalla?branchName=master)](https://dev.azure.com/valhalla1/valhalla/_build/latest?definitionId=1&branchName=master) | [![codecov](https://codecov.io/gh/valhalla/valhalla/branch/master/graph/badge.svg)](https://codecov.io/gh/valhalla/valhalla) |
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mloskot i switched over to the azure devops pipeline for windows builds, although they seem to take appx as long (due to installing dependencies) they start sooner and have more capacity to run parallel builds so the end result is a build that actually finishes in time for the rest of the platforms' ci's

…lder matches what mapmatching is doing in terms of sequence of operations. also fix issue where predicted flow is disabled for map matching when only using timestamps
Copy link
Contributor

@yuzheyan yuzheyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some unimportant readability comments.

const auto date = date::make_zoned(time_zone, tp);
iso_date_time << date::format("%FT%R%z", date);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't combine two the if(tz_format) together?
tz_format means has_tz_format right?

nodeinfo = tile->node(directededge->endnode());
const auto* tz = DateTime::get_tz_db().from_index(nodeinfo->timezone());
// if its timestamp based need to signal that out to trip leg builder
if (options.shape(0).time() != -1.0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding a is_valid_time(double time) function maybe better for time validation check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, maybe to avoid confusion it should take the api::Location as an argument. the reason being is that this is gnerated code so the only way to actually do this would be with a free function which of course leads to some ambiguity imho

@@ -364,7 +427,7 @@ bool RouteMatcher::FormPath(const std::shared_ptr<DynamicCost>* mode_costing,
// Update the elapsed time based on edge cost
elapsed += mode_costing[static_cast<int>(mode)]->EdgeCost(de, end_node_tile) *
(end.second.first.percent_along() - edge.percent_along());
if (use_timestamps)
if (options.use_timestamps())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for single line if statement, sometimes it has {} sometimes it does not, maybe it is better to enforce it with {}?

@@ -277,7 +319,7 @@ bool RouteMatcher::FormPath(const std::shared_ptr<DynamicCost>* mode_costing,
// Process directed edge and info
const DirectedEdge* de = begin_edge_tile->directededge(graphid);
const GraphTile* end_node_tile = reader.GetGraphTile(de->endnode());
if (begin_edge_tile == nullptr) {
if (end_node_tile == nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the rule of nullptr checking? Maybe it is better to have consistency by enforce compare with nullptr?

// because triplegbuilder relies on it. form path set the first one but
// on the subsequent legs we will need to set them by doing time offsetting
// like is done in route_action.cc thor_worker_t::depart_at

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

roger that

src/worker.cc Outdated
@@ -448,6 +451,23 @@ void parse_locations(const rapidjson::Document& doc,
midgard::logging::Log(node + "_count::" + std::to_string(request_locations->Size()),
" [ANALYTICS] ");
}

// push the date time information down into the locations
if (!had_date_time && options.has_date_time() && locations->size()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!locations->empty()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is pbf generated code, pbf did not generate empty for some older versions so to maintain compatibility we use size

@@ -39,8 +39,7 @@ class RouteMatcher {
const sif::TravelMode& mode,
baldr::GraphReader& reader,
const std::vector<meili::Measurement>& shape,
const bool use_timestamps,
const google::protobuf::RepeatedPtrField<valhalla::Location>& correlated,
valhalla::Options& options,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is somehow against intuitive to pass options as & then modify it. Maybe need another name or way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem is that we need to set time information on the locations. this is related to the TODOs i added in the functions that call these map matching FromPath functions. we could move all of the date setting there but that is a bit hairy because we are supporting timestamps as a date which means we need to get a node to turn it into tz and then turn the epoch into date string. the whole thing could probably be refactored to be easier to understand.

as it stands though this is inline with the request flow throughout the process. essentially as we fulfill the request we fill in bits of info along the way, which all hangs off of the api object, which owns this options object.

@kevinkreiser
Copy link
Member Author

@yuzheyan there seem to be a couple things going on here wrt to style etc. let me explain to you my principles when making a pr:

  1. i try not to change too much prior code unless its directly impacting the functionaly im changing. so if someone had extra braces on a single line if, i let them there, if someone already had == nullptr i let it there. i seek to make as minimal a diff as possible to make it easier to understand what i changed vs what i didnt. i may, as you have done, see places where old code could be more efficient, generally tend to just leave it untouched if it isnt part of what my PR is addressing. at the most i might add a TODO comment mentioning a more efficient change that could be done.
  2. we do not currently have very strict styling rules. we run a linter which requires some formatting. if we want to we can make that more strict. i personally prefer leaving out extra curly braces and checks for nullptr as i find those pedantic but i understand that everyone has their own opinion. except for tab indents those are universally hated :)

@yuzheyan
Copy link
Contributor

yuzheyan commented Nov 8, 2019 via email

@kevinkreiser
Copy link
Member Author

@yuzheyan would you like to take a swag at some changes to the style guide? i agree with all the different contributors that its gotten to a state of eclecticness that might be too far..

would you mind starting a new issue to tackle this concept and make some proposals. It would be great if most of it could be done via clang format however i imagine more semantic things like == nullptr are not possible (though maybe with clang tidy?)

@kevinkreiser
Copy link
Member Author

im still trying to bump the coverage on this.

danpaz
danpaz previously approved these changes Nov 11, 2019
Copy link
Collaborator

@danpaz danpaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some ideas to add to the tests. All else LGTM.

{"lat":52.0796213915245,"lon":5.139262676239015,"accuracy":10,"time":8},
{"lat":52.079637875461195,"lon":5.139581859111787,"accuracy":10,"time":10},
{"lat":52.07964776582031,"lon":5.139828622341157,"accuracy":10,"time":12},
{"lat":52.07985600778458,"lon":5.1404121782302865,"accuracy":10,"time":14}]})"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you added time to all these test cases. Should we keep one or two test cases without timestamps to test the default case?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible also to add an assertion on the returned duration? At least make sure it's non-zero / non-negative?

@kevinkreiser kevinkreiser merged commit 7bcca3f into master Nov 12, 2019
@purew purew deleted the mm_time_tracking branch November 26, 2019 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants