Clustering heuristic for CVRP #74
Comments
I committed a first draft for a clustering approach in #75. The reason why it's not possible to just use a classical clustering technique like k-means:
Clusters are built around start/end locations, and a "good" clustering should minimize the sum of "job spreading" (for some metric) over the clusters. As a measure of how close points are, we use the length of a spanning tree. The implementation is derived from Prim's algorithm, except that we want to end up with a forest of spanning trees (one per cluster). Thus we make the same kind of greedy choice, but we maintain one heap per cluster. Clusters are built concurrently, picking the best possible addition to the best cluster at each step. Note: there is no guarantee that the final spanning trees are minimal at cluster level. The rationale for this strategy is:
A couple of gifs are probably worth a long explanation.

Here is a "good" case where all jobs are assigned and, after solving a TSP on each cluster, the solution cost is already within 15% of the optimal value:

This second example shows how the greedy assignments can end up with really wrong clusters when only bad choices are left (all jobs assigned, but around 55% above the optimal value)...

Also, there is no guarantee so far that all jobs will be served, even for a problem where it's possible.
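
For readers who want something concrete, the parallel construction described above can be sketched roughly as follows. This is only a minimal illustration and not the code from #75: the function name `parallel_clustering`, its arguments, the Euclidean metric and the single scalar capacity check are all assumptions made for the example.

```python
import heapq
import math


def parallel_clustering(starts, capacities, jobs, demands):
    """Grow one Prim-style spanning tree per vehicle, all at the same time.

    starts:     list of (x, y) vehicle start/end points, one cluster each
    capacities: list of vehicle capacities (hypothetical single dimension)
    jobs:       list of (x, y) job locations
    demands:    list of job demands
    Returns (clusters, unassigned), where clusters[v] lists job indices.
    """
    def dist(a, b):
        # Assumed metric: plain Euclidean distance.
        return math.hypot(a[0] - b[0], a[1] - b[1])

    n_vehicles = len(starts)
    clusters = [[] for _ in range(n_vehicles)]
    load = [0] * n_vehicles
    unassigned = set(range(len(jobs)))

    # One heap per cluster, holding (edge length, job) candidates.
    heaps = [
        [(dist(starts[v], jobs[j]), j) for j in unassigned]
        for v in range(n_vehicles)
    ]
    for h in heaps:
        heapq.heapify(h)

    while unassigned:
        best = None  # (edge length, vehicle, job)
        for v, h in enumerate(heaps):
            # Lazily drop candidates already taken or that no longer fit.
            while h and (h[0][1] not in unassigned
                         or load[v] + demands[h[0][1]] > capacities[v]):
                heapq.heappop(h)
            if h and (best is None or h[0][0] < best[0]):
                best = (h[0][0], v, h[0][1])
        if best is None:
            break  # nothing fits anywhere: remaining jobs stay unassigned
        _, v, j = best
        heapq.heappop(heaps[v])
        clusters[v].append(j)
        load[v] += demands[j]
        unassigned.discard(j)
        # The new job offers new tree edges to every job still unassigned.
        for k in unassigned:
            heapq.heappush(heaps[v], (dist(jobs[j], jobs[k]), k))

    return clusters, sorted(unassigned)
```

Stale heap entries are discarded lazily when they reach the top, which is safe here because a cluster's load only grows. The early `break` is where jobs can end up unassigned even on a feasible problem, in line with the remark above.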
Turns out the parallel construction of the clusters is not always the best choice. It works well, though, in the many situations where clusters are expected to be scattered geographically because vehicle start/end locations are distributed. In the specific case where all vehicles share the same start/end (think single depot), the
The outline of this sequential approach is actually very close to the well-known insertion heuristic proposed by Solomon, except that we work at cluster level, so we don't need to test all possible insertions at route level.
So we end up with three parameters for the clustering scheme:
For now I'm happily firing a bunch of combinations of the above and picking the best result before moving to the TSP solving phase. This probably calls for more tuning at some point but is fine for now, as it's still fast (and can be parallelized). Also, we get a more "all-round" approach, as a single tuning can be very good on some problems but really poor on others (different problem shapes really call for different clustering parameters). I'll be doing a little tidying of the output logs, updating the docs and then merging #75 into
On the benchmarking side, I've looked at results on CVRPLIB. The classes that can easily be tested are A, B, E, F, M, P, X (see conversion script). Overall, 44737 jobs out of 44993 are served (that's 99.4%). We manage to get all jobs done for 159 instances out of 189. This means that only the 30 instances that are the tightest with regard to capacities have unassigned jobs (and usually a small number) after the heuristic. For those 159 instances, it makes sense to compare our overall costs to the best known solutions. Gaps range from 0% to 46.9%, with an average of 15.8% and a median of 15.2%. This looks promising for a fast heuristic approach, and we now want to look at what happens when we run those solutions through a local search phase.
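
To make the figures above easy to check, here is a small sketch of the arithmetic. The gap definition (relative excess over the best known cost) is the usual convention and is assumed here rather than taken from the benchmarking scripts; `served_ratio` and `gap_percent` are hypothetical helper names.

```python
def served_ratio(served, total):
    """Percentage of jobs served over a whole benchmark set."""
    return 100.0 * served / total


def gap_percent(cost, best_known):
    """Assumed gap definition: relative excess over the best known cost."""
    return 100.0 * (cost - best_known) / best_known


# Job counts quoted above.
print(round(served_ratio(44737, 44993), 1))  # 99.4
```

With this definition, a heuristic cost of 1150 against a best known value of 1000 would give a gap of 15%.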
Landed in
I'd like to take a first step toward handling multiple vehicles by adding capacity constraints to force load-balancing between available vehicles. I think we could get interesting results with a cluster-first-route-second approach. Advantages would include:
Things to keep in mind for implementation: