Refactor/cvrp speedup #146
A similar speedup can be observed with origin/master:
refactor/cvrp-speedup:
How did you calculate the 35% number for my last PR? I'd like to add a similarly calculated number to the changelog for this PR.
Thanks for the implementation, great to see you managed to keep amounts usage unchanged. I'll probably come up with a few questions once I have time to look into the details.
As I recall, this was simply based on overall computing time comparison, summing values in the "computing times" column of the files produced by I'll also run the same comparisons for the current PR with various
Running on all CVRP instances with 8 threads, I'm getting the following total computing times comparisons:
I confirm that all solutions are exactly the same. This is great! 🎉
Apologies for the unorganized work, but I added two commits that improve the computing times for instances with a lot of vehicles/with a bottleneck in
I'm using callgrind with kcachegrind as a UI. My workflow:
For every change that has a big impact like this PR, there are a lot of changes that don't bring a noticeable improvement.
I did not spot it with X-n344-k43 since it does not spend a lot of time in
Thanks for sharing. I've been really focused on the big-O complexity (based on number of jobs/vehicles) while designing the solving approach, but this is a great (and complementary) way to spot improvements. What I find really interesting is that (except for aca17de) the overall algorithmic complexity is untouched, yet the gains are stunning.
CHANGELOG.md
Outdated
@@ -6,6 +6,7 @@

- Update `clang-format` to 6.0 for automatic code formatting (#143)
- Improve performance of the TSP heuristic by ~35% on all TSPLIB instances (#142)
- Improve performance of the CVRP heuristic by ~55% on all CVRPLIB instances (#146)
I think the term "performance" for the solving approach is expected to relate to the solution quality. What about using "speed-up" instead? (Same applies to the entry for your latest PR on the TSP heuristic.)
Also, you might want to replace "heuristic" with "approach" for the CVRP entry, as your fix is useful throughout the whole solving phase (heuristic + local search).
I replaced "heuristic" with "solving", since I don't think that "approach" fits very well.
@@ -348,7 +349,7 @@ void cvrp_local_search::update_amounts(index_t v) {
std::transform(_sol_state.fwd_amounts[v].cbegin(),
_sol_state.fwd_amounts[v].cend(),
_sol_state.bwd_amounts[v].begin(),
[&](const auto& a) {
[&](const auto& a) -> amount_t {
I guess this is necessary so the compiler can resolve the return type, due to the template change for `amount_t`?
No, it's actually a bit worse. With this annotation, the return type of the lambda is `amount_t`, and an implicit conversion at the return statements converts the `amount_diff_t` into an `amount_t` by actually copying the values, so everything is fine.
Without it, the return type is inferred to be `amount_diff_t` and the return statement does not perform any implicit conversion. This is a problem, since `amount_diff_t` contains references to the lhs and rhs subexpressions: undefined behaviour occurs and the calculation is not carried out correctly. Now that I think about it, I understand the exact reason: the `auto total_amount =` line copies into a local amount object. The lifetime of this local object ends after the return, but the `amount_diff_t` still holds a reference to it. I pushed a commit that fixes this issue a bit more elegantly.
This is the part I don't like about this PR: it makes it a bit harder to use `amount_t` correctly.
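To make the lifetime issue described above concrete, here is a minimal sketch of the pattern. The types `amount` and `amount_diff` and the function `backward_amounts` are simplified, hypothetical stand-ins for `amount_t`, `amount_diff_t` and the body of `update_amounts`; they are not the actual project code. The sketch only shows the safe variant, where the explicit `-> amount` return type forces the expression to be materialized before the lambda's local object is destroyed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

using capacity_t = std::int64_t;

struct amount;

// Lazy "lhs - rhs" expression: stores references only, allocates nothing.
struct amount_diff {
  const amount& lhs;
  const amount& rhs;
  capacity_t operator[](std::size_t i) const;
  std::size_t size() const;
};

// Hypothetical stand-in for amount_t: owns its values.
struct amount {
  std::vector<capacity_t> elems;

  amount() = default;
  amount(std::initializer_list<capacity_t> init) : elems(init) {}

  // Implicit conversion: evaluates the expression and copies the values.
  amount(const amount_diff& d) {
    elems.reserve(d.size());
    for (std::size_t i = 0; i < d.size(); ++i) {
      elems.push_back(d[i]);
    }
  }

  capacity_t operator[](std::size_t i) const { return elems[i]; }
  std::size_t size() const { return elems.size(); }
};

inline capacity_t amount_diff::operator[](std::size_t i) const {
  return lhs[i] - rhs[i];
}
inline std::size_t amount_diff::size() const { return lhs.size(); }

inline amount_diff operator-(const amount& lhs, const amount& rhs) {
  assert(lhs.size() == rhs.size());
  return {lhs, rhs};
}

// Computes total - a for every forward amount a.
std::vector<amount> backward_amounts(const std::vector<amount>& fwd) {
  std::vector<amount> bwd(fwd.size());
  if (fwd.empty()) {
    return bwd;
  }
  std::transform(fwd.cbegin(), fwd.cend(), bwd.begin(),
                 // The explicit return type converts the amount_diff to an
                 // owning amount while `total` is still alive. With a
                 // deduced return type the lambda would return an
                 // amount_diff referring to the destroyed local `total`.
                 [&fwd](const amount& a) -> amount {
                   const amount total = fwd.back(); // local copy
                   return total - a;
                 });
  return bwd;
}
```

With forward amounts `{1, 2}`, `{3, 5}`, `{6, 9}`, this yields backward amounts `{5, 7}`, `{3, 4}`, `{0, 0}`. Dropping `-> amount` from the lambda would still compile, which is why this kind of mistake is easy to miss.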
@@ -494,6 +495,20 @@ void cvrp_local_search::try_job_additions(const std::vector<index_t>& routes,
}
}

auto smallest = std::numeric_limits<gain_t>::max();
Well done for using smallest and second smallest, that's a clever way to get rid of the inner loop!
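As an illustration of the idea praised above, the following sketch computes, for every index, the minimum over all other entries in a single pass. The function name `min_excluding_each` and the flat `costs` vector are assumptions made for this example and do not appear in the PR; the real `try_job_additions` operates on per-route data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using gain_t = std::int64_t;

// For each index i, returns the minimum of costs[j] over all j != i.
// A naive version needs an inner loop per index (O(n^2)); tracking the
// smallest and second smallest values makes the whole computation O(n).
std::vector<gain_t> min_excluding_each(const std::vector<gain_t>& costs) {
  auto smallest = std::numeric_limits<gain_t>::max();
  auto second_smallest = std::numeric_limits<gain_t>::max();
  std::size_t smallest_idx = std::numeric_limits<std::size_t>::max();

  for (std::size_t i = 0; i < costs.size(); ++i) {
    if (costs[i] < smallest) {
      // The previous best becomes the runner-up.
      second_smallest = smallest;
      smallest = costs[i];
      smallest_idx = i;
    } else if (costs[i] < second_smallest) {
      second_smallest = costs[i];
    }
  }

  std::vector<gain_t> result(costs.size());
  for (std::size_t i = 0; i < costs.size(); ++i) {
    // Only the index holding the overall minimum has to fall back to the
    // second smallest value; every other index sees the overall minimum.
    result[i] = (i == smallest_idx) ? second_smallest : smallest;
  }
  return result;
}
```

When there is only one entry, no "other" value exists, so that entry gets `std::numeric_limits<gain_t>::max()` in this sketch.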
@@ -494,6 +495,20 @@ void cvrp_local_search::try_job_additions(const std::vector<index_t>& routes,
}
}

auto smallest = std::numeric_limits<gain_t>::max();
auto second_smallest = std::numeric_limits<gain_t>::max();
size_t smallest_idx = std::numeric_limits<gain_t>::max();
Should be `std::size_t` for consistency with other occurrences.
Done
src/structures/vroom/amount.h
Outdated
capacity_t operator[](std::size_t i) const {
return lhs[i] - rhs[i];
};
The formatting script complains about spaces here.
Done
src/structures/vroom/amount.cpp
Outdated
@@ -31,51 +25,3 @@ amount_t& amount_t::operator-=(const amount_t& rhs) {
return *this;
}
The formatting script complains about newlines here.
Done
src/structures/vroom/amount.h
Outdated
using parent::push_back;
std::size_t size() const {
return elems.size();
};

amount_t& operator+=(const amount_t& rhs);
I guess the reason for having all implementations for expression templates in the `.h` file is the same as for my question in the previous PR (allows the compiler to have the details from other compilation units)?
If so, wouldn't that make sense for `amount_t::operator+=` and `amount_t::operator-=` too?
I moved them to the header and deleted the .cpp file, since it is now empty.
Issue
Fixes #144
Implements expression templates for amount to reduce heap allocations.
Tasks
CHANGELOG.md (remove if irrelevant)
Benchmarks
All CVRP instances, 8 threads, exploration level = 1
origin/master:
refactor/cvrp-speedup:
Solutions do not change, everything gets a bit faster.