-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Google OR-Tools benchmark #453
Comments
We should set the 'search strategy' OR-Tools uses to GLS when solving our instances, as explained here. That's where we can also set time limits. I'm unsure how to handle multiple depots, because that's where we use a no-improvement criterion that I don't know how to add to OR-Tools. Something to figure out more generally: how do we make OR-Tools terminate with its best-found solution when it needs to? Does it support callbacks where we can terminate the search? |
There's a way to add callbacks when a solution is found https://developers.google.com/optimization/reference/python/constraint_solver/pywrapcp#addatsolutioncallback. But I'm not sure if it also finds "non-improving" solutions. Alternatively, I think we could also just ignore the NoImprovement criterion for MDVRPTW instances when benchmarking Google OR-Tools. I don't think it makes much sense since we don't know what an iteration is in the solver. As long as we keep the fixed time limit then the comparison still makes sense. Another issue with MDVRPTW: the benchmark requires double precision, something that OR-Tools doesn't support. https://or.stackexchange.com/questions/3325/floating-points-in-or-tools We can scale up the values but I recall that there were some issues with that. |
The VRPTW example does not explain how to include service times. I haven't found out yet how to add this. Update: we can incorporate the service times in the travel times. The travel time from |
This comment was marked as duplicate.
This comment was marked as duplicate.
I ran a benchmark using Google OR-Tools for CVRP, VRPTW and PC-VRPTW using our benchmarking instructions and the scripts below. Each instance was only solved once because I could not find a way to set the seed of the solver. Read scriptimport functools
import pathlib
from dataclasses import dataclass
from numbers import Number
from typing import Optional, Union
import numpy as np
import vrplib
_INT_MAX = np.iinfo(np.int32).max
def round_nearest(vals: np.ndarray):
return np.round(vals).astype(int)
def scale_and_truncate_to_decimals(vals: np.ndarray, decimals: int = 0):
return (vals * (10**decimals)).astype(int)
@dataclass
class ORToolsData:
"""
Data class for holding the data that will be used to solve the problem with
OR-Tools.
Note that there are no service times. This is because OR-Tools does not
support service times. The service times are instead added to the travel
times, and the resulting times are used as the time matrix.
Parameters
----------
depot
The depot index.
dimension
The number of locations.
num_vehicles
The number of vehicles.
vehicle_capacities
The capacity of each vehicle.
demands
The demands of each location.
distance_matrix
The distance matrix between locations.
time_matrix
The time matrix between locations. Optional.
time_windows
The time windows for each location. Optional.
prizes
The prizes for each location. Optional.
"""
depot: int
dimension: int
num_vehicles: int
vehicle_capacities: list[int]
distance_matrix: list[list[int]]
demands: list[int]
service_times: list[list[int]]
time_windows: Optional[list[list[int]]] = None
prizes: Optional[list[int]] = None
def read(where: Union[str, pathlib.Path], problem: str) -> ORToolsData:
"""
Reads the VRPLIB file at the given location, and returns a ProblemData
instance.
Parameters
----------
where
File location to read. Assumes the data on the given location is in
VRPLIB format.
problem
The VRP problem type. One of ['cvrp', 'vrptw', 'pcvrptw']
Raises
------
ValueError
When the data file does not provide information on the problem size.
Returns
-------
ProblemData
Data instance constructed from the read data.
"""
if problem == "cvrp":
round_func = round_nearest
elif problem in ["vrptw", "pcvrptw"]: # dimacs
round_func = functools.partial(
scale_and_truncate_to_decimals, decimals=1
)
else:
msg = "Unknown problem. Must be one of ['cvrp', 'vrptw', 'pcvrptw']"
raise ValueError(msg)
instance = vrplib.read_instance(where)
# A priori checks
if "dimension" in instance:
dimension: int = instance["dimension"]
else:
if "demand" not in instance:
raise ValueError("File should either contain dimension or demands")
dimension = len(instance["demand"])
depot: int = int(instance.get("depot", [0])[0])
num_vehicles: int = instance.get("vehicles", dimension - 1)
capacity: int = instance.get("capacity", _INT_MAX)
distances: np.ndarray = round_func(instance["edge_weight"])
if "demand" in instance:
demands: np.ndarray = instance["demand"]
else:
demands = np.zeros(dimension, dtype=int)
if "node_coord" in instance:
round_func(instance["node_coord"])
else:
np.zeros((dimension, 2), dtype=int)
if "service_time" in instance:
if isinstance(instance["service_time"], Number):
# Some instances describe a uniform service time as a single value
# that applies to all clients.
service_times = np.full(dimension, instance["service_time"])
service_times[0] = 0
service_times = round_func(service_times).tolist()
else:
service_times = round_func(instance["service_time"]).tolist()
else:
service_times = [0] * dimension
if "time_window" in instance:
time_windows = round_func(instance["time_window"]).tolist()
else:
time_windows = None
if "prize" in instance:
prizes = round_func(instance["prize"]).tolist()
else:
prizes = None
return ORToolsData(
depot=depot,
dimension=dimension,
demands=demands.tolist(),
num_vehicles=num_vehicles,
vehicle_capacities=[capacity] * num_vehicles,
distance_matrix=distances.tolist(),
service_times=service_times,
time_windows=time_windows,
prizes=prizes,
) Solver scriptfrom ortools.constraint_solver import pywrapcp, routing_enums_pb2
from read import ORToolsData
def solve(data: ORToolsData, max_runtime: float, log: bool = False):
"""
Solves an instance with OR-Tools.
Parameters
----------
data
The instance data.
max_runtime
The maximum runtime in seconds.
log
Whether to log the search.
Returns
-------
tuple[list[list[int]], int]
A tuple containing the routes and the objective value.
"""
# Manager for converting between nodes (location indices) and index
# (internal CP variable indices).
manager = pywrapcp.RoutingIndexManager(
len(data.distance_matrix), data.num_vehicles, data.depot
)
routing = pywrapcp.RoutingModel(manager)
# Distance callback and arc costs. Note that in the VRPLIB instances
# distance = duration, so we use also the distance matrix in a TW-setting.
def distance_cb(index1, index2):
node1 = manager.IndexToNode(index1)
node2 = manager.IndexToNode(index2)
return data.distance_matrix[node1][node2]
distance_cb_idx = routing.RegisterTransitCallback(distance_cb)
routing.SetArcCostEvaluatorOfAllVehicles(distance_cb_idx)
# Demand callback and capacity constraints.
def demand_cb(index):
return data.demands[manager.IndexToNode(index)]
demand_cb_idx = routing.RegisterUnaryTransitCallback(demand_cb)
routing.AddDimensionWithVehicleCapacity(
demand_cb_idx,
0, # null capacity slack
data.vehicle_capacities, # vehicle maximum capacities
True, # start cumul to zero
"Capacity",
)
# Time window constraints.
if data.time_windows is not None:
# The depot latest time window is a reasonable upper bound for the
# waiting time and maximum duration per vehicle.
horizon = data.time_windows[0][1]
def time_cb(index1, index2):
node1 = manager.IndexToNode(index1)
node2 = manager.IndexToNode(index2)
dist = data.distance_matrix[node1][node2]
service = data.service_times[node1]
return dist + service
time_cb_idx = routing.RegisterTransitCallback(time_cb)
routing.AddDimension(
time_cb_idx,
horizon, # allow waiting time up to the full horizon
horizon, # maximum duration per vehicle equal to horizon
False, # Don't force start cumul to zero.
"Time",
)
time_dim = routing.GetDimensionOrDie("Time")
for node, (tw_early, tw_late) in enumerate(data.time_windows):
if node == data.depot: # skip depot
continue
index = manager.NodeToIndex(node)
time_dim.CumulVar(index).SetRange(tw_early, tw_late)
# Add time window constraints for each vehicle start node equal to the
# depot time window.
depot_tw_early = data.time_windows[data.depot][0]
depot_tw_late = data.time_windows[data.depot][1]
for node in range(data.num_vehicles):
start = routing.Start(node)
time_dim.CumulVar(start).SetRange(depot_tw_early, depot_tw_late)
for node in range(data.num_vehicles):
cumul_start = time_dim.CumulVar(routing.Start(node))
routing.AddVariableMinimizedByFinalizer(cumul_start)
cumul_end = time_dim.CumulVar(routing.End(node))
routing.AddVariableMinimizedByFinalizer(cumul_end)
# Prize-collecting: visits are optional, but if clients are not visited
# then the prize is "lost" and incurred as penalty.
if data.prizes is not None:
for node, prize in enumerate(data.prizes):
if node == data.depot:
continue
routing.AddDisjunction([manager.NodeToIndex(node)], prize)
# Setup search parameters.
params = pywrapcp.DefaultRoutingSearchParameters()
pca = routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
params.first_solution_strategy = pca
gls = routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH
params.local_search_metaheuristic = gls
params.time_limit.FromSeconds(int(max_runtime)) # only accepts int
params.log_search = log
solution = routing.SolveWithParameters(params)
routes = solution2routes(data, manager, routing, solution)
objective = solution.ObjectiveValue()
return routes, objective
def solution2routes(data, manager, routing, solution) -> list[list[int]]:
"""
Converts an OR-Tools solution to routes.
"""
routes = []
distance = 0 # for debugging
for vehicle_idx in range(data.num_vehicles):
index = routing.Start(vehicle_idx)
route = []
route_cost = 0
while not routing.IsEnd(index):
node = manager.IndexToNode(index)
route.append(node)
prev_index = index
index = solution.Value(routing.NextVar(index))
route_cost += routing.GetArcCostForVehicle(
prev_index, index, vehicle_idx
)
if clients := route[1:]: # ignore depot
routes.append(clients)
distance += route_cost
return routes Solution verification scriptimport argparse
import functools
from numbers import Number
import numpy as np
import vrplib
def round_nearest(vals: np.ndarray):
return np.round(vals).astype(int)
def scale_and_truncate_to_decimals(vals: np.ndarray, decimals: int = 0):
return (vals * (10**decimals)).astype(int)
def verify(solution_loc, instance_loc, problem: str):
"""
Verifies a solution to a VRPLIB instance.
Parameters
----------
solution_loc
Filesystem location of the VRPLIB solution file.
instance_loc
Filesystem location of the VRPLIB instance file.
problem
VRP problem type. One of ['cvrp', 'vrptw', 'pcvrptw'].
Returns
-------
bool
Whether the solution is feasible.
"""
instance = vrplib.read_instance(instance_loc)
solution = vrplib.read_solution(solution_loc)
if problem == "cvrp":
round_func = round_nearest
elif problem in ["vrptw", "pcvrptw"]: # dimacs
round_func = functools.partial(
scale_and_truncate_to_decimals, decimals=1
)
else:
msg = "Unknown problem. Must be one of ['cvrp', 'vrptw', 'pcvrptw']"
raise ValueError(msg)
depot = instance["depot"][0]
demands = instance["demand"]
capacity = instance["capacity"]
dimension = demands.size
distances = round_func(instance["edge_weight"])
if "time_window" in instance:
time_windows = round_func(instance["time_window"]).tolist()
else:
time_windows = None
if "service_time" in instance:
if isinstance(instance["service_time"], Number):
# Some instances describe a uniform service time as a single value
# that applies to all clients.
service_times = np.full(dimension, instance["service_time"])
service_times[0] = 0
service_times = round_func(service_times)
else:
service_times = round_func(instance["service_time"])
else:
service_times = None
if "prize" in instance:
prizes = round_func(instance["prize"]).tolist()
else:
prizes = None
total = 0
for route in solution["routes"]:
cost, time, load = 0, 0, 0
prev = depot
for loc in [*route, depot]:
dist = distances[prev][loc]
prev = loc
cost += dist
time += dist
load += demands[loc]
if load > capacity:
print("Load exceeds capacity.")
return False
if time_windows is not None:
tw_early, tw_late = time_windows[loc]
if time > tw_late:
print("Time exceeds late time window.")
return False
time = max(time, tw_early)
if service_times is not None:
time += service_times[loc]
total += cost
if prizes is not None:
visited = {client for route in solution["routes"] for client in route}
uncollected = [
prize
for client, prize in enumerate(prizes)
if client not in visited
]
total += sum(uncollected)
if solution["cost"] != total:
print("Computed cost does not match actual cost.")
return False
return True
if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Verify a solution to a VRPLIB instance."
)
parser.add_argument(
"--solution",
type=str,
help="Filesystem location of the solution file.",
)
parser.add_argument(
"--instance",
type=str,
help="Filesystem location of the VRPLIB instance.",
)
parser.add_argument(
"--problem", type=str, choices=["cvrp", "vrptw", "pcvrptw"]
)
args = parser.parse_args()
if verify(args.solution, args.instance, args.problem):
print("Solution is feasible.")
else:
print("Solution is not feasible.") |
Solutions found by OR-Tools in VRPLIB format: |
This comment was marked as resolved.
This comment was marked as resolved.
Some remarks on that code, as i'm a bit sceptical about ortools showing sub-par performance (although i don't know if that's more a linear overhead-like throughput-issue or missing metaheuristic/ls convergence).
|
Hi @sschnug
Using
Should we set this strategy at all? Or have the solver deduce this for us? |
Yes. Thats what i also found (i'm less familiar with the python-wrappers as i'm only using the C++ API). It seems it was introduced at a later stage.
Thats a good question! A practitioner will of course pick the best-performing one (and usually also introduce algorithmic portfolios as there are many construction-heuristics available and the solver being single-core only gives us free-lunch in multi-core environments). But as this is a benchmark of an external project (with a similar scope), i think one cannot ask for heavy parameter-tuning (without support / guidance) and letting the solver pick automatically seems ok (except for the case when there would be a big red block in some documentation pointing to what to choose from giving some instance-description; which i'm not aware of). So not setting anything = auto seems fine. But if one is playing with the benchmarks, a quick explicit |
Alright I'll discuss with @leonlan sometime next week to update our OR-Tools run script, and re-run the experiments. Should hopefully not take too long. Thanks for explaning some of this to us! Stay tuned for the results :) |
Code used to benchmark OR-Tools in #587 (cc @leonlan): from read import ORToolsData
from ortools.constraint_solver import pywrapcp, routing_enums_pb2
def solve(data: ORToolsData, max_runtime: int, log: bool = False):
"""
Solves an instance with OR-Tools.
Parameters
----------
data
The instance data.
max_runtime
The maximum runtime in seconds.
log
Whether to log the search.
Returns
-------
tuple[list[list[int]], int]
A tuple containing the routes and the objective value.
"""
# Manager for converting between nodes (location indices) and index
# (internal CP variable indices).
manager = pywrapcp.RoutingIndexManager(
data.num_locations, data.num_vehicles, data.depot
)
routing = pywrapcp.RoutingModel(manager)
# Set arc costs equal to distances.
distance_transit_idx = routing.RegisterTransitMatrix(data.distance_matrix)
routing.SetArcCostEvaluatorOfAllVehicles(distance_transit_idx)
# Max distance constraint.
routing.AddDimension(
distance_transit_idx,
0, # null distance slack
data.max_distance, # maximum distance per vehicle
True, # start cumul at zero
"Distance",
)
# Vehicle capacity constraint.
routing.AddDimensionWithVehicleCapacity(
routing.RegisterUnaryTransitVector(data.demands),
0, # null capacity slack
data.vehicle_capacities, # vehicle maximum capacities
True, # start cumul to zero
"Demand",
)
# Backhauls: this assumes that VRPB is implemented by forbidding arcs
# that go from backhauls to linehauls.
if data.backhauls is not None:
routing.AddDimensionWithVehicleCapacity(
routing.RegisterUnaryTransitVector(data.backhauls),
0, # null capacity slack
data.vehicle_capacities, # vehicle maximum capacities
True, # start cumul to zero
"Backhaul",
)
# Time window constraints.
if data.time_windows is not None:
depot_tw_early = data.time_windows[data.depot][0]
depot_tw_late = data.time_windows[data.depot][1]
# The depot's late time window is a valid upper bound for the waiting
# time and maximum duration per vehicle.
routing.AddDimension(
routing.RegisterTransitMatrix(data.duration_matrix),
depot_tw_late, # waiting time upper bound
depot_tw_late, # maximum duration per vehicle
False, # don't force start cumul to zero
"Time",
)
time_dim = routing.GetDimensionOrDie("Time")
for node, (tw_early, tw_late) in enumerate(data.time_windows):
if node == data.depot: # skip depot
continue
index = manager.NodeToIndex(node)
time_dim.CumulVar(index).SetRange(tw_early, tw_late)
# Add time window constraints for each vehicle start node.
for node in range(data.num_vehicles):
start = routing.Start(node)
time_dim.CumulVar(start).SetRange(depot_tw_early, depot_tw_late)
for node in range(data.num_vehicles):
cumul_start = time_dim.CumulVar(routing.Start(node))
routing.AddVariableMinimizedByFinalizer(cumul_start)
cumul_end = time_dim.CumulVar(routing.End(node))
routing.AddVariableMinimizedByFinalizer(cumul_end)
# Prize-collecting: visits are optional, but if clients are not visited
# then the prize is "lost" and incurred as penalty.
if data.prizes is not None:
for node, prize in enumerate(data.prizes):
if node == data.depot:
continue
routing.AddDisjunction([manager.NodeToIndex(node)], prize)
# Setup search parameters.
params = pywrapcp.DefaultRoutingSearchParameters()
gls = routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH
params.local_search_metaheuristic = gls
params.time_limit.FromSeconds(int(max_runtime)) # only accepts int
params.log_search = log
solution = routing.SolveWithParameters(params)
routes = solution2routes(data, manager, routing, solution)
objective = solution.ObjectiveValue()
return routes, objective
def solution2routes(data, manager, routing, solution) -> list[list[int]]:
"""
Converts an OR-Tools solution to routes.
"""
routes = []
distance = 0 # for debugging
for vehicle_idx in range(data.num_vehicles):
index = routing.Start(vehicle_idx)
route = []
route_cost = 0
while not routing.IsEnd(index):
node = manager.IndexToNode(index)
route.append(node)
prev_index = index
index = solution.Value(routing.NextVar(index))
route_cost += routing.GetArcCostForVehicle(
prev_index, index, vehicle_idx
)
if clients := route[1:]: # ignore depot
routes.append(clients)
distance += route_cost
return routes No more callbacks, and auto-selection of the construction heuristic. This helps a bit, but not much, as the results in #587 show. |
Hi @sschnug, thanks a lot for your suggestions on Google OR-Tools. It's always a bit difficult to benchmark other solvers since we are not experts of the software ourselves. Our implementation was derived directly from the examples, so suggestions for improvements from someone who uses Google OR-Tools are very welcome. To address your comments:
We have changed our implementation to use RegisterTransitMatrix for distances and durations (the latter which includes service times), and RegisterTransitUnaryVector for the other vector-based callbacks.
We also removed this setting so that the solver will now auto-select the initial solution strategy. The above changes does give some performance improvement: on my local environment with runtimes of 60 seconds on VRPTW instances, the cost decreases by ~3% on average.
Note that even on small runtimes (say 60 seconds), the performance difference between OR-Tools and PyVRP is of the same order of magnitude. |
The routing interface might change in OR-Tools v10.0; see google/or-tools#3992 for more details. Not sure if there will be any performance-related changes. |
Is your feature request related to a problem? Please describe
It is not a direct problem, but we received a question about how PyVRP compares against Google OR-Tools. We don't have any data on that right now and it should be fairly easy to setup.
Describe the solution you'd like
We should evaluate OR-Tools on our benchmark instances.
Describe alternatives you have considered
N/A
Additional context
There are some Google OR-Tools results online for CVRP (see Vidal 2022). There it shows that Google OR-Tools has a 4% gap w.r.t. to the best-known solutions (BKS) at the time. The BKS have not really changed since then. PyVRP currently has a 0.23% gap. So, for CVRP, PyVRP is quite a bit better. It's likely that these results translate to other variants.
The text was updated successfully, but these errors were encountered: