Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Timezone Diff Performance #2316

Merged
merged 12 commits into from Apr 14, 2020
Merged

Improve Timezone Diff Performance #2316

merged 12 commits into from Apr 14, 2020

Conversation

kevinkreiser
Copy link
Member

@kevinkreiser kevinkreiser commented Apr 12, 2020

As noted in: #2278 (comment) our use of timezone diffing is not performant. In routes that cross multiple timezones and are particularly long we spend the vast majority of our CPU just computing timezone differences. In the linked issue above we spend about 14 seconds of a 15 second route calculation just doing timezone differencing..

This PR implements a caching strategy on top of timezone info look up so that as a given route progresses we dont need to construct timezone information to do the differencing. A first proof of concept laid out here: HowardHinnant/date#569.

TODO: integrate the POC into our datetime utility which does the differencing
TODO: performance test

as a test i disabled the distance limitation on timedep_forward and ran a route from Minsk to Aberdeen. with the old code this takes 15 seconds to complete on mondern hardware. with the timezone caching this comes down to 1.5 seconds. of course bidirectional astar is only 300ms. but we are working on getting timedependence in there

Comment on lines 115 to 121
auto duration = std::chrono::duration_cast<std::chrono::seconds>(origin.get_local_time() -
dest.get_local_time());
if (origin.get_info().offset < dest.get_info().offset) {
return std::abs(duration.count());
} else {
return -1 * std::abs(duration.count());
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

under the hood get_local_time also calls get_info. get_info is quite a heavy call. it turns out you can use the info returned from it to do the timezone offset directly so just this small refactor ends up cutting our long time dependent routes response time in half.

@@ -114,6 +114,7 @@
#include <cctype>
#include <cstdlib>
#include <cstring>
#include <cwchar>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

long story here. i know there was a thread about this when i did it originally but instead of taking the implementation inside of the date library directly. we copy it and modify it to not need to use CURL/filesystem to grab timezone database information but instead we package that data as strings with the build (as we do many things). this allows us to ship the library with all of the data it needs to run (other than the actual tiles).

anyway i updated the submodule to the latest version and in doing so had to redo my hacks for loading tzdb from strings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the other thing this does is it ensure every platform has the same idea about what timezones exist where. faster, per-platform implementations exist but they dont agree that ET is index 109 for example which is a problem for us because we bake the timezone index into the tile data. so by making the data static we ensure everyone agrees on timezones.

@@ -31,7 +32,7 @@ struct tz_db_t {
const date::time_zone* from_index(size_t index) const;

protected:
std::vector<std::string> names;
std::unordered_map<std::string, size_t> names;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small refactor to use a hash map to look up the index from the name rather than std::find. probably not needed but if we do this alot somewhere it will be way faster than scanning 300+ entries

@kevinkreiser kevinkreiser changed the title Improve Timezone Diff Performance WIP: Improve Timezone Diff Performance Apr 12, 2020
@kevinkreiser kevinkreiser changed the title WIP: Improve Timezone Diff Performance Improve Timezone Diff Performance Apr 14, 2020
@@ -13,28 +13,59 @@
#include "midgard/logging.h"
#include "midgard/util.h"

namespace {
// use a cache to store already constructed sys_info's since they aren't cheap
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is where the speed up is had. more info about how this works can be found here: HowardHinnant/date#569 (comment)

@@ -232,6 +232,7 @@ thor::PathAlgorithm* thor_worker_t::get_path_algorithm(const std::string& routet
const valhalla::Location& destination) {
// Have to use multimodal for transit based routing
if (routetype == "multimodal" || routetype == "transit") {
valhalla::midgard::logging::Log("algorithm::multimodal ", " [ANALYTICS] ");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seemed crazy to me that we dont log the algorithm that we use. in fact this was the source of much frustration for me. i wanted to show that this implementation speeds up timedep forward but everytime i tried a route from say minsk to london it was the same speed as bidir astar and the reason for that was...

@@ -242,6 +243,7 @@ thor::PathAlgorithm* thor_worker_t::get_path_algorithm(const std::string& routet
PointLL ll1(origin.ll().lng(), origin.ll().lat());
PointLL ll2(destination.ll().lng(), destination.ll().lat());
if (ll1.Distance(ll2) < max_timedep_distance) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line... if the route is more than 500km then it wont do time dependent but instead falls back to bidirastar.

test/datetime.cc Outdated
@@ -583,9 +583,56 @@ TEST(DateTime, TestSecondOfWeek) {
EXPECT_EQ(a, e) << "Wrong second of week";
}

TEST(DateTime, DiffCaching) {
// no cache
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first we check that the changes work even with out cache but checking the time difference between NY and LA to be -3 hours

test/datetime.cc Outdated
// for a really long route that crosses many timezone we end up doing a lot of tz diffing
// the tuple is: number of calls, origin tz index, destination tz index
// 0 is reserved for no timezone so we shift down by one to use them with the tzdb
std::vector<std::tuple<int, int, int>> test_cases = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then we take what is a play by play of all the calls to the timezone diff method for a very long route through africa starting in europe

test/datetime.cc Outdated
} // namespace

int main(int argc, char* argv[]) {
// make this whole thing bail if it doesnt finish fast
alarm(10);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even on a really crazy machine the implementation without cache takes something like 15 seconds. with the cache the implementation is subsecond.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this github comment be put as the code comment? I think it describes the intent of the alarm a lot better than the short code comment.

const date::time_zone* tz,
std::unordered_map<const date::time_zone*, std::vector<date::sys_info>>& cache) {
// check if we have anything for this timezone in the cache
auto tz_it = cache.find(tz);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to use a pointer as the cache key? Is each timezone only instantiated once?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are statically initialized in a singleton

auto tz_it = cache.find(tz);
if (tz_it != cache.cend()) {
// we have something in the cache, see if one of the infos has the particular time point in range
auto st = date::floor<std::chrono::seconds>(tp.get_sys_time());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could st be named something more verbose?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just following the patterns of the library it comes from. Tp=timepoint and st=system time. I think it's pretty clear from the function call get sys time no?

auto st = date::floor<std::chrono::seconds>(tp.get_sys_time());
auto info_it =
std::find_if(tz_it->second.begin(), tz_it->second.end(),
[&st](const date::sys_info& info) { return info.begin <= st && st < info.end; });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the array in the map is usually 1-2 elements long? There wouldn't be any point in sorting it according to begin time which would allow a binary search?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahah i had that as a TODO but you are exactly right. there is basically no benefit. the only time the array isnt a single entry long is if the route crosses over daylight savings time

@@ -526,7 +533,7 @@ native_to_standard_timezone_name(const std::string& native_tz_name,
}

// Parse this XML file:
// http://unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml
// https://raw.githubusercontent.com/unicode-org/cldr/master/common/supplemental/windowsZones.xml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the build fetching this at build-time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. all of the strings needed for all of the target platforms crammed into headers and read directly from memory. thats what my previous comment on this file was talking about. although tbh im not sure that windows has the same indexing scheme as non windows platforms (and if not (i think its not) its a long standing bug that we'll have to tackle in another)

Copy link
Contributor

@purew purew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice speedup! LGTM

@kevinkreiser kevinkreiser merged commit 6822ba0 into master Apr 14, 2020
@purew purew deleted the date_hacking branch April 14, 2020 19:00
yuzheyan added a commit that referenced this pull request Apr 16, 2020
* master:
  bidir a* elapsed time/cost tracking has a bug and its.... partial distance strikes again! (#2323)
  Fixes for Windows (#2314)
  fix local date shifting to be without timezone and add test (#2312)
  Improve Timezone Diff Performance (#2316)
  Missed changelog (#2318)
  Fix break/continue typo in search filtering (#2317)
  Approach, multi-cue, and length updates (#2313)
  Added default tile getter interface (#2309)
  Use flow mask when considering access for live traffic (#2311)
  Fix typos (#2310)
  Add missing declaration for helper function. (#2304)
  Guidance views - use changeset for data_id and MatchGuidanceViewJunctions fix (#2303)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants