Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add use_distance to auto cost to allow choosing between two primary cost components, time or distance #2771

Merged
merged 20 commits into from
Jan 20, 2021

Conversation

kevinkreiser
Copy link
Member

@kevinkreiser kevinkreiser commented Jan 12, 2021

In our previous work we removed auto_shorter and replaced it with a boolean option to give back the shortest route. I propose removing that (from the api) and replacing it with use_distance so that the user can control how much they favor time vs distance as the main factor of their cost units. This allows them to still specify other penalties like those for ferry or highway etc.

So far I only quickly hacked it in for auto cost so i could test it but I'd rather add it to the base class and use it the same everywhere.

EDIT: I'd like to reduce the scope of this pr just to adding use_distance to autocost and letting everything else intact. to properly replace shortest (which exists on all costings) I'll need to move use_distance to the base costing class. which is what i want to do but there isnt a really clean way of doing that at the moment. i propose that first we do a refactor which moves all the factors into the base class and then have the children classes tell the base class what defaults it should use. Its up in the air how they should do that but @genadz had some ideas on it.

… cost in contrast to time, could replace shorter..
@nilsnolde
Copy link
Member

It’s been bothering me a lot too that shortest just completely disregards other factors.. 👍

…hat the linear combination makes more sense. also handle this preference in transition cost
@kevinkreiser kevinkreiser changed the title Replace Shortest with use_distance Replace costing_option shortest with use_distance Jan 14, 2021
@kevinkreiser kevinkreiser changed the title Replace costing_option shortest with use_distance Add use_distance to auto cost to allow choosing between two primary cost components, time or distance Jan 14, 2021
@kevinkreiser
Copy link
Member Author

@nilsnolde any problem if i make another pr for replacing shortest with this in all the costings? id like to get this idea merged for autocost and then do a larger refactor, where we pull all the "factors" into DynamicCost and have the subclasses pass in their defaults for parsing. @genadz is dealing with the same kind of annoyance on #2769 and suggested a work around that i think would be best done separately.

@kevinkreiser kevinkreiser marked this pull request as ready for review January 14, 2021 20:49
@nilsnolde
Copy link
Member

Totally agree as well, more flexibility is always welcome:)

@@ -121,6 +121,7 @@ message CostingOptions {
optional bool ignore_closures = 63;
optional bool shortest = 64;
optional float service_penalty = 65;
optional float use_distance = 66;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new costing option from 0 to 1, how much do you care about minimizing over all distance

@@ -45,6 +45,7 @@ constexpr float kDefaultUseFerry = 0.5f; // Factor between 0 and 1
constexpr float kDefaultUseRailFerry = 0.4f; // Factor between 0 and 1
constexpr float kDefaultUseHighways = 1.0f; // Factor between 0 and 1
constexpr float kDefaultUseTolls = 0.5f; // Factor between 0 and 1
constexpr float kDefaultUseDistance = 0.f; // Factor between 0 and 1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default is to care only about minimizing time (as it always was) which means there are zero RAD differences

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the comment or variable name itself mention that it weights between minimizing time or distance?

Feel free to ignore if it doesn't make sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure!

// is to change length units into time units by multiplying by the reciprocal of a constant speed.
// This means basically changes the units of distance to be more in the same ballpark as the units of
// time and makes the linear combination make more sense.
constexpr float kInvMedianSpeed = 1.f / 16.f; // about 37mph
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please read the comment above 😄

Comment on lines +308 to +309
float distance_factor_; // How much distance factors in overall favorability
float inv_distance_factor_; // How much time factors in overall favorability
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only these two are new the rest is just whitespace

@@ -333,6 +346,10 @@ AutoCost::AutoCost(const CostingOptions& costing_options, uint32_t access_mask)
float use_highways_ = costing_options.use_highways();
highway_factor_ = 1.0f - use_highways_;

// Preference for distance vs time
distance_factor_ = costing_options.use_distance() * kInvMedianSpeed;
inv_distance_factor_ = 1.f - costing_options.use_distance();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the two coefficents of the linear combination

Comment on lines 418 to 445
float factor =
(edge->use() == Use::kFerry)
? ferry_factor_
: (edge->use() == Use::kRailFerry) ? rail_ferry_factor_ : density_factor_[edge->density()];
// base factor is either ferry, rail ferry or density based
float factor = (edge->use() == Use::kFerry) * ferry_factor_ +
(edge->use() == Use::kRailFerry) * rail_ferry_factor_ +
(edge->use() != Use::kFerry && edge->use() != Use::kRailFerry) *
density_factor_[edge->density()];

// TODO: factor hasn't been extensively tested, might alter this in future
// TODO: speed_penality hasn't been extensively tested, might alter this in future
float speed_penalty = (edge_speed > top_speed_) ? (edge_speed - top_speed_) * 0.05f : 0.0f;
factor += highway_factor_ * kHighwayFactor[static_cast<uint32_t>(edge->classification())] +
surface_factor_ * kSurfaceFactor[static_cast<uint32_t>(edge->surface())] + speed_penalty;

if (edge->toll()) {
factor += toll_factor_;
}

if (edge->use() == Use::kAlley) {
factor *= alley_factor_;
}

return Cost(sec * factor, sec);
factor += (highway_factor_ * kHighwayFactor[static_cast<uint32_t>(edge->classification())] +
surface_factor_ * kSurfaceFactor[static_cast<uint32_t>(edge->surface())] +
speed_penalty + edge->toll() * toll_factor_) *
(edge->use() == Use::kAlley ? alley_factor_ : 1.f);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i removed all the branches except 1 in favor of a multiplcation trick

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"trick" == using a boolean as a double? :). I wonder if that generates a compiler warning...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a commonly used convention that simplifies the logic throughout, I can learn to live with promoting bools to doubles (or ints). Just an observation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool gets upcasted to float yes. since float has more precision than bool there is no warning on standard warning levels though -Wall or -Wpedantic may warn about this i havent checked. we do have -Werror enabled for sif in the build so we are fine there for the moment imho

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

locally ive add both -Wall and -Wpedantic and indeed it doesnt complain (though pedantic did find an extra semicolon in some other code 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ktatterso it is commonly used in this codebase, but I tend to agree with you - it makes for terser code, but I don't think that always means simpler logic or improved readability.

I would like to see a benchmark to prove out that removal of branches here is actually significant. I know the theory is to reduce branching -> improved pipelining -> faster code, but I have my doubts that these kinds of branches actually end having a significant impact in many places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler should be able to optimize the simple branches away

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i recently made some performance improvements to isochrones that resulted in something like a 10% gain by removing a cascade of if elses. we got a smaller gain by reducing branching and recursion in bidirectional_astar as well. i just wanted to do an experiment here to see if we could get anything out of it.

the compiler will likely be able to predict a lot of these that occur with relatively low frequency i agree. in c++20 we can even influence that with the unlikely keyword

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@purew I'm not sure the compiler can optimize away any condition unless it is programmatically impossible, e.g., "if ( false )". Maybe you can clarify?

Compiler optimized branch "prediction" can be baked into our binaries if we can employ some sort of "profile guided optimization". I can speak to this a bit further if there is interest.

In the end tho, the switch statement is readable (imo), doesn't rely on compiler optimization or branch prediction of any sort, and provably faster than cascading conditionals (tho as @danpat points out, not always measurably faster).


// base cost before the factor is a linear combination of time vs distance, depending on which one
// the user thinks is more important to them
return Cost((sec * inv_distance_factor_ + edge->length() * distance_factor_) * factor, sec);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is where we use a linear combination for the edges time vs the edges length

}

// Account for the user preferring distance
c.cost *= inv_distance_factor_;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is also a linear combination for the transition cost however we can simplify it because the distance of a transition is always 0 which simplifies away the second half of the formula

* @param idx Index used for name consistency.
* @return Returns the transition cost (cost, elapsed time).
*/
sif::Cost base_transition_cost(const baldr::NodeInfo* node,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method is completely redundant with the method below so it bit the dust

@nilsnolde nilsnolde self-requested a review January 15, 2021 09:06
nilsnolde
nilsnolde previously approved these changes Jan 15, 2021
@kevinkreiser
Copy link
Member Author

perf tested this and there is no discernable difference:

master:

         0%         50%         90%         99%        100% 
0.001742601 0.036118269 0.124286747 0.718150904 1.567052364 

branch:

         0%         50%         90%         99%        100% 
0.002444983 0.036208153 0.123085022 0.718498600 1.613869429

? ferry_factor_
: (edge->use() == Use::kRailFerry) ? rail_ferry_factor_ : density_factor_[edge->density()];
// base factor is either ferry, rail ferry or density based
float factor = (edge->use() == Use::kFerry) * ferry_factor_ +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously the factor was exclusive to either ferry, rail-ferry, or neither. Now it appears that the logic is written to allow for an edge to be both ferry and rail-ferry - in which case their factors are combined.

Based on how the code is written tho, I'm guessing edge->use() can only be one value. (e.g., there aren't any bitmasks involved).

I'm not a huge fan of using the result of the boolean condition as false=0 and true=1 in math below. How would you feel about using a switch statement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, i wanted to comment on this, thanks for the reminder! an edge can only have one use, so while i did use addition here only one of the terms can be non-zero, so its really like saying ferry || rail_ferry || density

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wanted to experiment with what a more drastic removal of branch statements would do to this function as its called very very many times in the main loop of path finding. in my tests the performance wasnt measureably different but i did appreciate the simplification of the code so i left it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! If performance is the goal then I highly recommend a switch statement. A switch would work well here. I think it would also work well in base_transition_cost().

A switch statement compiles to a jump-table, avoiding the cascading conditionals.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think we can use it in base_transition_cost because value for the switch is likely too complex. we could use it here but im not sure a jump table wouldnt still stall the pipeline anyway. what is suspect is that the branch predictor will realize that most of the time the density based factor will apply and it will make no performance impact. ill give it a quick test, if there is no difference ill just use your suggestion!

@kevinkreiser kevinkreiser merged commit 3b717ea into master Jan 20, 2021
@kevinkreiser kevinkreiser deleted the kk_use_distance branch January 5, 2022 03:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants