## Import dependencies

In [None]:
import pandas as pd
from IPython.display import HTML
import requests
import json
from scipy.spatial import distance

from helper_function.notebook_helpers import show_vehicle_routes, get_minutes_from_datetime
from helper_function.map_helpers import get_map_by_vehicle

from cuopt_thin_client import CuOptServiceClient

## Read input data from CSV files

Suppose you are working as an Optimization Scientist at a grocery chain with locations throughout New York City. There are 97 stores and 3 distribution centers. Every day, stores place orders for food that must be delivered the next day so that they stay fully stocked. Given input data about the stores' orders, the distribution centers, and the available fleet of vehicles, your job is to calculate a route for each vehicle such that all orders are fulfilled while minimizing the vehicles' travel time and cost. Even for a single vehicle, 100 locations can be visited in about 100! (roughly 9.3 × 10^157) different orders, far too many to enumerate. Fortunately, you have access to the cuOpt solver. All you need to do is read and preprocess the input data, save it to one Python dictionary, and send it to cuOpt, which does the hard computation for you.
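To get a feel for the size of that search space, the factorial can be evaluated with Python's standard library. This is only an illustration of the magnitude, not part of the cuOpt workflow.

```python
import math

# Number of ways to order 100 distinct stops (illustration only).
n_orderings = math.factorial(100)

# Count the decimal digits to get a feel for the magnitude.
n_digits = len(str(n_orderings))
print(f"100! has {n_digits} digits")
```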

Let's walk through these steps. 

For the Last Mile Delivery (LMD) use case, we need 3 datasets with the following features:

- Depots
    - Name
    - Location
    - Start and end time (operation hours)
- Orders
    - Store Name
    - Location
    - Start and end time (store hours)
    - Demand
    - Service time
    - Loyalty Member
    - Delivery Requirement
- Vehicles
    - Name/ID Number
    - Assigned depot
    - Start and end time (vehicle/driver shift hours)
    - Break time
    - Capacity
    - Max time
    
You may have additional features depending on the problem at hand.
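Before preprocessing, it can help to confirm that each table carries the columns the rest of the workflow relies on. The sketch below is a minimal check; the column names are inferred from how this notebook indexes the DataFrames later on, the helper `missing_columns` is hypothetical, and the toy table uses made-up values.

```python
import pandas as pd

# Column names inferred from how this notebook indexes each DataFrame later on.
REQUIRED_COLUMNS = {
    "orders": ["Name", "Longitude", "Latitude", "Demand", "ServiceTime",
               "order_start_time", "order_end_time", "is_frozen", "preferred_customer"],
    "depots": ["Name", "Longitude", "Latitude"],
    "vehicles": ["assigned_depot", "vehicle_capacity", "vehicle_start", "vehicle_end",
                 "break_start", "break_end", "max_time", "vehicle_type"],
}

def missing_columns(df, required):
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]

# Toy depots table (made-up values) that is missing its Latitude column.
toy_depots = pd.DataFrame({"Name": ["Depot A"], "Longitude": [-73.98]})
print(missing_columns(toy_depots, REQUIRED_COLUMNS["depots"]))
```

An empty list means the table has everything this workflow reads from it.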

Location data needs to be in the form of coordinate points (longitude and latitude values). Our data already has coordinate points included. If your own data lacks coordinates, you will need a third-party tool to obtain them.


In this workflow, we are using locations from the following [Kaggle dataset](https://www.kaggle.com/datasets/arianazmoudeh/airbnbopendata). This is a dataset of Airbnbs in New York City. Our problem space has 100 locations total which includes 3 depots and 97 orders. The coordinate points are taken from the dataset and the rest of the features are synthetic data. We have 15 vehicles available.

In [None]:
DATA_PATH = "data/"

orders_df = pd.read_csv(DATA_PATH+"orders_lmd.csv")
depots_df = pd.read_csv(DATA_PATH+"depots_lmd.csv")
vehicles_df = pd.read_csv(DATA_PATH+"vehicles_lmd.csv")

In [None]:
n_depots = len(depots_df.index)
n_orders = len(orders_df.index)
n_vehicles = len(vehicles_df.index)

n_loc_total = n_orders + n_depots

In [None]:
locations_df = (pd.concat([depots_df[["Name","Longitude","Latitude"]], orders_df[["Name","Longitude","Latitude"]]], ignore_index=True)).reset_index()

## Create Cost Matrices

The **cost matrix** models the cost between each pair of locations. It is used by cuOpt to compute the cost of traveling from any location to any other. The cost matrix needs to be a square matrix of dimension equal to the total number of locations, which includes both depots and orders. In this Vehicle Routing Problem, our cost metrics are distance and travel time. These are the costs we want to minimize.
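The squareness requirement is easy to verify before sending anything to the solver. The following is a minimal sketch with a hypothetical helper, `check_cost_matrix`, and a made-up 3x3 matrix; treating negative costs as an error is an assumption made here for illustration.

```python
import numpy as np

def check_cost_matrix(matrix, n_locations):
    """Raise ValueError unless matrix is n_locations x n_locations with no negative entries."""
    arr = np.asarray(matrix, dtype=float)
    if arr.shape != (n_locations, n_locations):
        raise ValueError(f"expected {(n_locations, n_locations)}, got {arr.shape}")
    if (arr < 0).any():
        raise ValueError("costs must be non-negative")
    return True

# Toy matrix (made-up values) for three locations.
toy_matrix = [[0, 5, 9], [5, 0, 4], [9, 4, 0]]
print(check_cost_matrix(toy_matrix, 3))
```

In this notebook the same check could be called with `n_loc_total` as the expected dimension.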



### Cost Matrix - Distance 

For our primary <code style="background:lightgreen;color:black">cost_matrix</code>, we will use travel distance. In practical applications, you can integrate this with a third-party map data provider like Esri or Google Maps to get live traffic data and run dynamic/real-time re-routing using cuOpt.


Let's create a cost matrix using the Google Maps API.

If you want to build the cost matrix on your own, or if you are working with your own data, execute the next few cells. We have already created this cost matrix using the Google Maps API and saved it as a CSV file, so you can alternatively read it from that file (Option 2).


#### Option 1: Create your own cost matrix using the Google Maps API

Before you start using the Google Maps API, you need a project with a billing account and the Distance Matrix API enabled. To learn more, see [Set up in Cloud Console](https://developers.google.com/maps/documentation/distance-matrix/cloud-setup).

You also need your own API key, which you can create [here](https://developers.google.com/maps/documentation/distance-matrix/get-api-key).


In [None]:
import googlemaps

google_api_key = "<your_google_api_key>"
gmaps = googlemaps.Client(key=google_api_key)

time_list = []
distance_list = []
origin_id_list = []
destination_id_list = []


In [None]:
#this might take around 15 minutes to run
for (i1, row1) in locations_df.iterrows():
    LatOrigin = row1['Latitude']
    LongOrigin = row1['Longitude']
    origin = (LatOrigin, LongOrigin)
    origin_id = row1['Name'] 
    for (i2, row2) in  locations_df.iterrows():
        LatDestination = row2['Latitude']
        LongDestination = row2['Longitude']
        destination_id = row2['Name']
        destination = (LatDestination, LongDestination)
        result = gmaps.distance_matrix(origin, destination, mode = 'driving')
        result_distance = result["rows"][0]["elements"][0]["distance"]["value"]
        result_time = result["rows"][0]["elements"][0]["duration"]["value"]

    
        time_list.append(result_time)
        distance_list.append(result_distance)
        origin_id_list.append(origin_id)
        destination_id_list.append(destination_id)

In [None]:
output = pd.DataFrame(distance_list, columns = ['Distance in meter'])
output['duration in seconds'] = time_list
output['origin_id'] = origin_id_list
output['destination_id'] = destination_id_list

In [None]:
cost_matrix_distance = []
for origin in output.origin_id.unique():
    cost_matrix_distance.append(output[output['origin_id'] == origin]['Distance in meter'].values.tolist())
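As an alternative to the loop above, the long-format table can be reshaped into a square matrix with a pandas pivot. This is a sketch on made-up toy data that reuses the column names of `output`; it assumes every location name is unique, and `location_order` is a hypothetical list giving the row and column order you want (in this notebook that would be the order of `locations_df`).

```python
import pandas as pd

# Toy long-format table with the same column names as `output` (made-up values).
toy_output = pd.DataFrame({
    "origin_id": ["B", "B", "A", "A"],
    "destination_id": ["B", "A", "B", "A"],
    "Distance in meter": [0, 120, 150, 0],
})
# Order in which locations should appear as rows and columns.
location_order = ["B", "A"]

wide = toy_output.pivot(index="origin_id", columns="destination_id",
                        values="Distance in meter")
# pivot sorts the labels, so restore the intended location order explicitly.
wide = wide.reindex(index=location_order, columns=location_order)
toy_cost_matrix = wide.values.tolist()
print(toy_cost_matrix)
```

The explicit `reindex` matters because cuOpt refers to locations by position, so rows and columns must follow the same order as the location indices.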

#### Option 2: Read from CSV file

In [None]:
import pandas as pd

df = pd.read_csv('data/cost_matrix_distance.csv', header=None)
cost_matrix_distance = df.astype(int).values.tolist()

### Cost Matrix - Time

Next, let's create the <code style="background:lightgreen;color:black">travel_time_matrix</code>.
We already have travel time data from the Google Maps API (it is in the 'duration in seconds' column of our output dataframe). However, let's take a look at using a different tool for this. We will use OSRM to calculate the travel time in minutes between each pair of locations.

[OSRM](https://project-osrm.org/) is a free and open source routing engine, which we will also use for route mapping and visualization later on.


In [None]:
latitude = locations_df.Latitude.to_numpy()
longitude = locations_df.Longitude.to_numpy()

# OSRM expects "longitude,latitude" pairs separated by semicolons
locations = ";".join("{},{}".format(lon, lat) for lon, lat in zip(longitude, latitude))
r = requests.get("http://router.project-osrm.org/table/v1/car/" + locations)
r.raise_for_status()  # stop early if the request failed
routes = json.loads(r.content)

# OSRM returns duration in seconds. Here we are converting to minutes
for i in routes['durations']:
    i[:] = [x / 60 for x in i]

# Use n_loc_total so that n_orders is not overwritten and df (Option 2 only) is not required
coords_index = { i: (latitude[i], longitude[i]) for i in range(n_loc_total)}
time_matrix_df = pd.DataFrame(routes['durations'])
time_matrix = time_matrix_df.values.tolist()
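The request construction and the unit conversion can also be factored into small functions that are testable without network access. The sketch below uses two hypothetical helpers, `build_osrm_table_url` and `seconds_to_minutes`, with made-up coordinates and durations; it does not call the OSRM server.

```python
def build_osrm_table_url(longitudes, latitudes,
                         base_url="http://router.project-osrm.org/table/v1/car/"):
    """Build a table-service URL; OSRM expects 'lon,lat' pairs separated by ';'."""
    pairs = [f"{lon},{lat}" for lon, lat in zip(longitudes, latitudes)]
    return base_url + ";".join(pairs)

def seconds_to_minutes(durations):
    """Convert a nested list of durations from seconds to minutes."""
    return [[value / 60 for value in row] for row in durations]

# Made-up coordinates and durations for two locations.
url = build_osrm_table_url([-73.99, -73.95], [40.73, 40.78])
toy_minutes = seconds_to_minutes([[0, 600], [540, 0]])
print(url)
print(toy_minutes)
```

Note the coordinate order: longitude comes first, which is the reverse of the `(latitude, longitude)` tuples used for the Google Maps client earlier.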

### Set Fleet Data

Here we take our raw data from the csv file and convert it into data that we can send to the cuOpt solver.

<code style="background:lightgreen;color:black">vehicle_locations</code> is a list of the start and end location of the vehicles. Each vehicle is assigned to a depot from which it departs in the morning and returns to at night. For example, a vehicle that starts and ends in depot 1 which is the location at index 0 would have the vehicle location of [0,0]. 

In [None]:
depot_names_to_indices_dict = {locations_df["Name"].values.tolist()[i]: i for i in range(n_depots)}
vehicle_locations = vehicles_df[["assigned_depot","assigned_depot"]].replace(depot_names_to_indices_dict).values.tolist()
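To make the resulting structure concrete, here is the same name-to-index mapping on a toy fleet. The depot names and assignments are made up; only the `assigned_depot` column name comes from the notebook.

```python
import pandas as pd

# Toy data (made-up names): two depots at location indices 0 and 1.
toy_depot_index = {"Depot A": 0, "Depot B": 1}
toy_vehicles = pd.DataFrame({"assigned_depot": ["Depot B", "Depot A", "Depot B"]})

# Map each depot name to its index, then repeat it as [start, end].
depot_idx = toy_vehicles["assigned_depot"].map(toy_depot_index)
toy_vehicle_locations = [[int(i), int(i)] for i in depot_idx]
print(toy_vehicle_locations)
```

Each inner pair is one vehicle, and both entries are equal because every vehicle returns to the depot it started from.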

<code style="background:lightgreen;color:black">capacities</code> is a list of how much weight each vehicle can carry. Here we have two different types of vehicles: trucks and EV vans. A truck can carry up to 20,000 pounds and an EV van can carry up to 8,000 pounds. This is essential when assigning orders to vehicles because one vehicle can only carry so many orders at once.


In [None]:
capacities = [[int(a) for a in vehicles_df['vehicle_capacity'].tolist()]]

<code style="background:lightgreen;color:black">vehicle_time_windows</code> is a list of the integer representation of the operating time of each vehicle, that is, the shift of each vehicle's driver. We convert the UTC timestamp to epoch time (integer representation in minutes).

In [None]:
vehicle_time_windows = pd.concat((vehicles_df['vehicle_start'].apply(get_minutes_from_datetime).to_frame(), vehicles_df['vehicle_end'].apply(get_minutes_from_datetime).to_frame()), axis=1).values.tolist()
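The implementation of `get_minutes_from_datetime` lives in the helper module and is not shown here. As a rough idea of what such a conversion involves, the sketch below defines a hypothetical stand-in, `minutes_since_epoch`; the timestamp format string is an assumption and may differ from the one used in the CSV files.

```python
from datetime import datetime, timezone

def minutes_since_epoch(timestamp, fmt="%Y-%m-%d %H:%M:%S"):
    """Hypothetical stand-in: parse a UTC timestamp string and
    return whole minutes elapsed since 1970-01-01 00:00:00 UTC."""
    parsed = datetime.strptime(timestamp, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() // 60)

# 90 minutes after the epoch.
print(minutes_since_epoch("1970-01-01 01:30:00"))
```

Whatever conversion is used, it has to be applied consistently to vehicle, break, and task time windows so that all of them share the same time origin and unit.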

<code style="background:lightgreen;color:black">vehicle_break_time_windows</code>
 is a list of the integer representation of the break time window of each vehicle within its operating time. For a driver working an 8-hour shift, this break in the middle of the day represents their lunch break, and the time window is the period in which that lunch break may occur.
A driver can have multiple breaks throughout their day. 

In [None]:
vehicle_break_time_windows = [pd.concat((vehicles_df['break_start'].apply(get_minutes_from_datetime).to_frame(), vehicles_df['break_end'].apply(get_minutes_from_datetime).to_frame()), axis=1).values.tolist()]

<code style="background:lightgreen;color:black">vehicle_break_durations</code> is the length of the break. Here, we set the duration to be 30 minutes for all vehicles.


In [None]:
vehicle_break_durations = [[30] * n_vehicles]
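The extra outer list in both break structures suggests one entry per break. The sketch below extrapolates that layout to two breaks for a toy fleet of two vehicles; all numbers are made up, and the multi-break layout is an assumption drawn from the single-break cells above, so check it against the cuOpt documentation before relying on it.

```python
n_toy_vehicles = 2

# Outer list: one entry per break. Inner list: one [earliest, latest] window per vehicle.
toy_break_time_windows = [
    [[180, 240], [200, 260]],   # first break for vehicle 0 and vehicle 1
    [[420, 480], [440, 500]],   # second break for vehicle 0 and vehicle 1
]
# Outer list: one entry per break. Inner list: one duration (minutes) per vehicle.
toy_break_durations = [[30] * n_toy_vehicles, [15] * n_toy_vehicles]

# Every break entry should cover every vehicle.
shapes_ok = (
    all(len(windows) == n_toy_vehicles for windows in toy_break_time_windows)
    and all(len(durations) == n_toy_vehicles for durations in toy_break_durations)
)
print(shapes_ok)
```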

<code style="background:lightgreen;color:black">vehicle_max_times</code> is a list of the maximum time each vehicle can operate, given in minutes. A driver's time window represents total availability, which may be longer than a standard shift length. Even if a driver says they are available to work from 9am to 9pm, this constraint enforces a maximum length for their shift. A truck driver can drive up to 7 hours, and an EV driver can drive up to 4 hours.

In [None]:
vehicles_max_time = vehicles_df['max_time'].tolist()

### Set Task Data


Here we take our raw data from the csv file and convert it into data that we can send to the cuOpt solver.

<code style="background:lightgreen;color:black">task_locations</code> is the list of stores that have placed an order. This list is simply the index of each location. 

In [None]:
task_locations = locations_df.index.tolist()[n_depots:]

<code style="background:lightgreen;color:black">demand</code> is the list of weight demand for each order. Here, these values are between 40 and 200 pounds. 

In [None]:
demands = [[int(a) for a in orders_df['Demand'].values.tolist()]]
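Because `demands` and `capacities` share the same nested layout (one inner list per dimension, here weight), a quick sanity check is to compare their totals. The helper `total_capacity_covers_demand` below is hypothetical and the numbers are made up. The check is necessary but not sufficient: time windows and other constraints can still make a problem infeasible.

```python
def total_capacity_covers_demand(demands, capacities):
    """Compare summed demand with summed capacity for each demand dimension."""
    return [sum(d) <= sum(c) for d, c in zip(demands, capacities)]

toy_demands = [[40, 200, 120]]       # one dimension (weight), three orders
toy_capacities = [[8000, 20000]]     # one dimension (weight), two vehicles
print(total_capacity_covers_demand(toy_demands, toy_capacities))
```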

<code style="background:lightgreen;color:black">service_times</code> is the list of the length of time for orders to be dropped off once the vehicle reaches the location. Here, these values are between 15 and 30 minutes.

In [None]:
service_times = orders_df['ServiceTime'].values.tolist()

<code style="background:lightgreen;color:black">task_time_windows</code> 
 is the list of integer representation of opening hours for each store. We convert the UTC timestamp to epoch time (integer representation in minutes).

In [None]:
task_time_windows = pd.concat((orders_df['order_start_time'].apply(get_minutes_from_datetime).to_frame(), orders_df['order_end_time'].apply(get_minutes_from_datetime).to_frame()), axis=1).values.tolist()

<code style="background:lightgreen;color:black">vehicle_match_list</code> allows us to restrict some orders to specific vehicles. In this use case, some of the orders are frozen and can only be delivered by trucks, not EV vans. Here we indicate that the frozen orders must be assigned to vehicles that are trucks.

In [None]:
trucks_ids = vehicles_df['vehicle_type'][vehicles_df['vehicle_type']=="Truck"].index.values.tolist()

In [None]:
vehicle_match_list = []
for i in orders_df['is_frozen'][orders_df['is_frozen']==1].index.values.tolist():
    vehicle_match_list.append({"order_id": i, "vehicle_ids": trucks_ids})
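To see what the resulting list looks like, here is the same construction on toy data. The values are made up (including the "EV Van" label); only the column names `vehicle_type` and `is_frozen` and the value "Truck" come from the notebook.

```python
import pandas as pd

# Toy data (made-up values) using the same column names as the notebook.
toy_vehicles = pd.DataFrame({"vehicle_type": ["Truck", "EV Van", "Truck"]})
toy_orders = pd.DataFrame({"is_frozen": [0, 1, 0, 1]})

# Positional IDs of the vehicles that are trucks.
toy_truck_ids = toy_vehicles.index[toy_vehicles["vehicle_type"] == "Truck"].tolist()

# One entry per frozen order, listing the vehicles allowed to serve it.
toy_match_list = [
    {"order_id": order_id, "vehicle_ids": toy_truck_ids}
    for order_id in toy_orders.index[toy_orders["is_frozen"] == 1].tolist()
]
print(toy_match_list)
```

Orders that do not appear in the list remain unrestricted and can be served by any vehicle.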

### Set Solver configuration

Before we send our data to the cuOpt solver, we will add a configuration setting.

<code style="background:lightgreen;color:black">time_limit</code> is the maximum time allotted to find a solution. The user can set a higher time limit to give the solver more time to improve the result.

The cuOpt solver does not interrupt the computation of the initial solution. If the user specifies a time limit shorter than the time needed to find the initial solution, the initial solution is still returned once it has been computed.

In [None]:
# Set the time limit 
time_limit = 5

## Save data in a dictionary

Here, we take all the data we have prepared so far and save it to one dictionary. This includes the cost matrices, task data, fleet data, and solver config. This is all the data that cuOpt needs to solve our LMD problem. 

In [None]:
cuopt_problem_data = {"cost_matrix_data": {"data": {"0": cost_matrix_distance }},
        
        "travel_time_matrix_data": {"data": {"0": time_matrix }},        
        
        "task_data": {"task_locations": task_locations,
                      "demand": demands,
                      "task_time_windows": task_time_windows,
                      "service_times":service_times,
                      "order_vehicle_match": vehicle_match_list,
                     
                     },

        "fleet_data": {"vehicle_locations": vehicle_locations,
                       "capacities": capacities,
                       "vehicle_time_windows": vehicle_time_windows,
                       "vehicle_break_time_windows": vehicle_break_time_windows,
                       "vehicle_break_durations": vehicle_break_durations,
                       "vehicle_max_times": vehicles_max_time,
                      },
        
        "solver_config": { "time_limit": time_limit}
       
       }
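Since every list in the dictionary must line up with either the number of tasks or the number of vehicles, a small consistency check can catch mistakes before the request is sent. This is a sketch with a hypothetical helper, `check_problem_dimensions`, run on a made-up payload; it covers only a few of the keys used above and is not an official validator.

```python
def check_problem_dimensions(data):
    """Return a list of human-readable problems found in a cuOpt-style payload."""
    problems = []
    n_tasks = len(data["task_data"]["task_locations"])
    n_fleet = len(data["fleet_data"]["vehicle_locations"])
    for dim in data["task_data"]["demand"]:
        if len(dim) != n_tasks:
            problems.append("demand length does not match number of tasks")
    for dim in data["fleet_data"]["capacities"]:
        if len(dim) != n_fleet:
            problems.append("capacities length does not match number of vehicles")
    if len(data["task_data"]["task_time_windows"]) != n_tasks:
        problems.append("task_time_windows length does not match number of tasks")
    return problems

# Toy payload (made-up values) with one vehicle but two capacity entries.
toy_payload = {
    "task_data": {"task_locations": [1, 2], "demand": [[40, 60]],
                  "task_time_windows": [[0, 100], [0, 100]]},
    "fleet_data": {"vehicle_locations": [[0, 0]], "capacities": [[8000, 20000]]},
}
print(check_problem_dimensions(toy_payload))
```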


## Create a Service Client Instance

Now that we have prepared all of our data, we can establish a connection to the cuOpt service. 

In the cell below, there is a place to paste a client SAK which you can generate from the API Catalog. 

Here, we create an instance of the cuOpt Service Client to establish a connection. 


In [None]:
# Currently this notebook works with spoofed SAK and FUNCTION ID, but users need to use their own SAK and FUNCTION ID if
# they are going to run this notebook in their local environment

cuopt_client_sak = "<YOUR CLIENT SAK>"

cuopt_service_client = CuOptServiceClient(
    sak=cuopt_client_sak,
    function_id="<FUNCTION_ID_OBTAINED_FROM_NGC>"
    )

## Send data to the cuOpt service and get the routes

When using the cuOpt Managed Service, we send all the data in a single call and wait for the response.

In [None]:
# Solve the problem
solver_response = cuopt_service_client.get_optimized_routes(
    cuopt_problem_data
)

# Process returned data
solver_resp = solver_response["response"]
if "solver_response" in solver_resp:
    solver_resp = solver_resp["solver_response"]
else:
    solver_resp = solver_resp["solver_infeasible_response"]

location_names = [str(x) for x in locations_df.index.tolist()]

if solver_resp["status"] == 0:
    print("Cost for the routing in distance: ", solver_resp["solution_cost"])
    print("Vehicle count to complete routing: ", solver_resp["num_vehicles"])
    show_vehicle_routes(solver_resp, location_names)
else:
    print("NVIDIA cuOpt Failed to find a solution with status : ", solver_resp["status"])
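Beyond printing the routes, the response can be summarized programmatically. The sketch below counts route entries per dispatched vehicle; the helper `stops_per_vehicle` is hypothetical, and the toy response is made up and contains only the `vehicle_data` and `route` fields that this notebook reads elsewhere.

```python
def stops_per_vehicle(solver_resp):
    """Map each dispatched vehicle ID to the number of entries in its route."""
    return {vid: len(info["route"]) for vid, info in solver_resp["vehicle_data"].items()}

# Toy response fragment (made-up values) with only the fields used here.
toy_resp = {"vehicle_data": {"0": {"route": [0, 5, 7, 0]}, "3": {"route": [1, 9, 1]}}}
print(stops_per_vehicle(toy_resp))
```

Vehicles that were not dispatched simply do not appear as keys, which is the same condition the visualization cell below checks.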

## Visualize the routes


In the drop-down menu below, you can select different vehicle IDs to see if they are dispatched. If they are, we display their assigned route on a map.


Generating a route and map uses third-party tools and takes about 30 seconds to run.

In [None]:
from IPython.display import display, Markdown, clear_output
import ipywidgets as widgets
from ipywidgets import interact

w = widgets.Dropdown(
    options = list(vehicles_df.index.values),
    description='Vehicle ID:',
)

def on_change(value):
    if str(value) in list(solver_resp['vehicle_data'].keys()):
        if len(solver_resp["vehicle_data"][str(value)]['route']) == 1:
            l = solver_resp["vehicle_data"][str(value)]['route'][0]
            solver_resp["vehicle_data"][str(value)]['route'] = [l,l]
        curr_route_df = pd.DataFrame(solver_resp["vehicle_data"][str(value)]['route'], columns=["stop_index"])
        curr_route_df = pd.merge(curr_route_df, locations_df, how="left", left_on=["stop_index"], right_on=["index"])
        display(get_map_by_vehicle(curr_route_df, False))        
    else:
        print("This Vehicle is not assigned to any order!")

interact(on_change, value=w)

## Prize Collection

Imagine some of the drivers called out sick at the last minute, and now we have only 5 vehicles in our fleet. However, our orders remain the same. It is impossible for only 5 drivers to fulfill all of these tasks. Still, we want to deliver as many orders as we can while respecting all of the constraints.

With cuOpt, we can prioritize specific orders. This variation of VRP is called <code style="background:lightgreen;color:black">Prize Collection</code>. Each task has an associated prize, and when introducing this constraint, cuOpt will try to maximize the total prize while still minimizing the total cost of the solution.

Let's imagine that we have a preferred members program, and some of the stores are members and some are not. Since we are limited in how many orders we can fulfill today, we want to make sure we deliver orders to the stores that are preferred members, and then deliver as many of the remaining orders as possible.

Let's start by truncating our fleet to 5 vehicles. 

P.S. You can play around with this parameter to see how the number of vehicles affects the solution.

In [None]:
new_n_vehicles = 5

In [None]:
vehicle_locations_truncated = vehicle_locations[:new_n_vehicles]

capacities_truncated = [capacities[0][:new_n_vehicles]]

vehicle_time_windows_truncated = vehicle_time_windows[:new_n_vehicles]

vehicle_break_time_windows_truncated = [vehicle_break_time_windows[0][:new_n_vehicles]]

vehicle_break_durations_truncated = [vehicle_break_durations[0][:new_n_vehicles]]

vehicles_max_time_truncated = vehicles_max_time[:new_n_vehicles]

In [None]:
trucks_ids_truncated = [truck_id for truck_id in trucks_ids if truck_id<new_n_vehicles]

vehicle_match_list_truncated = []
for i in orders_df['is_frozen'][orders_df['is_frozen']==1].index.values.tolist():
    vehicle_match_list_truncated.append({"order_id": i, "vehicle_ids": trucks_ids_truncated})

In our orders dataset, we have a column indicating whether stores are part of the preferred members program. Approximately 1/3 of the stores are part of the program.
Stores that are part of the preferred members program are marked with the value `1`, and stores that are not are `0`.

However, this doesn't directly translate to prize values for cuOpt. With Prize Collection, cuOpt has no incentive to deliver tasks that have an associated prize value of `0`, so we want every store to have a positive prize to keep cuOpt from simply dropping those tasks. We will increase all values by 1, such that stores that are part of the preferred members program are marked with the value `2`, and stores that are not are marked with `1`.

In [None]:
prizes = orders_df['preferred_customer'].values.tolist()
updated_prizes = [x+1 for x in prizes]
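The same shift can be written as a small function with explicit weights, which makes it easier to experiment with how strongly preferred members are favored. The helper `make_prizes` and its parameters `base` and `bonus` are hypothetical names introduced for this sketch, and the flags are made up.

```python
def make_prizes(preferred_flags, base=1, bonus=1):
    """Give every order a positive base prize plus a bonus for preferred members."""
    return [base + bonus * flag for flag in preferred_flags]

toy_preferred = [1, 0, 0, 1]            # 1 = preferred member, 0 = not
print(make_prizes(toy_preferred))       # same +1 shift as in the cell above
print(make_prizes(toy_preferred, bonus=4))
```

Keeping `base` above zero preserves the property discussed above: no order ends up with a prize of `0`.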

Finally, let's update our data for the API call.

Here, we introduce <code style="background:lightgreen;color:black">Objectives</code> in the solver config. To implement Prize Collection, we set the `prize` objective to be greater than 0 while we set the rest of the objectives to 0.

In [None]:
cuopt_problem_data_pc = {"cost_matrix_data": {"data": {"0": cost_matrix_distance }},

        "travel_time_matrix_data": {"data": {"0": time_matrix }},

        "task_data": {"task_locations": task_locations,
                      "demand": demands,
                      "task_time_windows": task_time_windows,
                      "service_times": service_times,
                      "order_vehicle_match": vehicle_match_list_truncated,
                      "prizes": updated_prizes
                     },

        "fleet_data": {"vehicle_locations": vehicle_locations_truncated,
                       "capacities": capacities_truncated,
                       "vehicle_time_windows": vehicle_time_windows_truncated,
                       "vehicle_break_time_windows": vehicle_break_time_windows_truncated,
                       "vehicle_break_durations": vehicle_break_durations_truncated,
                       "vehicle_max_times": vehicles_max_time_truncated,
                      },
        
        "solver_config": { "time_limit": 10,
                          "objectives": {
                              "cost": 0,
                              "travel_time": 0,
                              "variance_route_size":0,
                              "variance_route_service_time": 0,
                              "prize": 10,
                              "vehicle_fixed_cost": 0   
                          }
                         }
       
       }

In [None]:
# Solve the problem
solver_response = cuopt_service_client.get_optimized_routes(
    cuopt_problem_data_pc
)

# Process returned data
solver_resp = solver_response["response"]
if "solver_response" in solver_resp:
    solver_resp = solver_resp["solver_response"]
else:
    solver_resp = solver_resp["solver_infeasible_response"]

# Label stops by store/depot name rather than by location index
location_names = locations_df.Name.tolist()

if solver_resp["status"] == 0:
    print("Cost for the routing in distance: ", solver_resp["solution_cost"])
    print("Vehicle count to complete routing: ", solver_resp["num_vehicles"])
    show_vehicle_routes(solver_resp, location_names)
else:
    print("NVIDIA cuOpt Failed to find a solution with status : ", solver_resp["status"])
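With fewer vehicles, some orders will likely be left out, and it is useful to know which ones. The sketch below derives them from the routes. It assumes that `route` holds location indices and that each task has its own location, as in this notebook; the helper `unserved_task_locations` is hypothetical and the toy response is made up.

```python
def unserved_task_locations(solver_resp, task_locations):
    """Return task locations that appear in no vehicle route, in their original order."""
    visited = set()
    for info in solver_resp["vehicle_data"].values():
        visited.update(info["route"])
    return [loc for loc in task_locations if loc not in visited]

# Toy response fragment (made-up values): locations 5 and 7 are never visited.
toy_resp = {"vehicle_data": {"0": {"route": [0, 3, 4, 0]}, "1": {"route": [1, 6, 1]}}}
toy_task_locations = [3, 4, 5, 6, 7]
print(unserved_task_locations(toy_resp, toy_task_locations))
```

Cross-referencing the result with the `preferred_customer` column would show whether the dropped orders are mostly non-members, which is the intent of the prize values.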

This time around, let's add the preferred members to our mapping data. Stops that are part of the preferred members program will have a blue pin on the map, and those that are not will have a green pin.

In [None]:
from IPython.display import display, Markdown, clear_output
import ipywidgets as widgets
from ipywidgets import interact

w = widgets.Dropdown(
    options = list(vehicles_df.index.values),
    description='Vehicle ID:',
)

def on_change(value):
    if str(value) in list(solver_resp['vehicle_data'].keys()):
        if len(solver_resp["vehicle_data"][str(value)]['route']) == 1:
            l = solver_resp["vehicle_data"][str(value)]['route'][0]
            solver_resp["vehicle_data"][str(value)]['route'] = [l,l]
        # add information about prize collection
        locations_df["preferred_members"] = [0] * n_depots + prizes
        curr_route_df = pd.DataFrame({"stop_index": solver_resp["vehicle_data"][str(value)]['route'],
                                      "stop_type":  solver_resp["vehicle_data"][str(value)]['type']})        
        curr_route_df = pd.merge(curr_route_df, locations_df, how="left", left_on=["stop_index"], right_on=["index"])
            
        
        display(get_map_by_vehicle(curr_route_df, True))        
    else:
        print("This Vehicle is not assigned to any order!!")

interact(on_change, value=w)

## Objective Functions

Prize Collection is one example of a solver objective. Let's look at a few more.

By default, cuOpt tries to optimize on the cost (in this example, travel distance in meters). However, we can fine-tune the solution by changing the objectives in the `solver_config` section of our data.

The objective values in our most recent response (the Prize Collection run above) look like this:

In [None]:
solver_response['response']['solver_response']['objective_values']

Let's give minimizing the cost a heavier weight. We set <code style="background:lightgreen;color:black">cost</code> to 20, an arbitrary value for the weight, while setting the rest of the objective values to `0`. Then we will look at the objective values in the solver response. We will skip printing the routes here, but you can do so by copying the code above.

In [None]:
cuopt_problem_data["solver_config"]["objectives"]= {
                              "cost": 20,
                              "travel_time": 0,
                              "variance_route_size":0,
                              "variance_route_service_time": 0,
                              "prize": 0,
                              "vehicle_fixed_cost": 0  
}

In [None]:
# Solve the problem
solver_response = cuopt_service_client.get_optimized_routes(
    cuopt_problem_data
)

solver_response['response']['solver_response']['objective_values']

Now, let's shift to minimizing travel time, rather than distance. We set <code style="background:lightgreen;color:black">travel_time</code> to a positive value while the rest of the values are set to 0.

In [None]:
cuopt_problem_data["solver_config"]["objectives"]= {
                              "cost": 0,
                              "travel_time": 20,
                              "variance_route_size":0,
                              "variance_route_service_time": 0,
                              "prize": 0,
                              "vehicle_fixed_cost": 0  
}

In [None]:
# Solve the problem
solver_response = cuopt_service_client.get_optimized_routes(
    cuopt_problem_data
)

solver_response['response']['solver_response']['objective_values']

Notice how the travel time is now lower than it was earlier, since this is the value we are primarily optimizing on. Of course, this comes at the expense of distance, so the cost value is higher than it was earlier.

**Note:** this doesn't have to be "either or". We can have positive values for both and instruct cuOpt on how to prioritize one over the other.

**Note:** In this example our cost is distance in meters. When you use cuOpt for your own business use case, this could be another metric. For example, your cost could be gas usage, so you can optimize for both gas usage and time.
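Mixing objectives amounts to giving more than one key a positive weight. The sketch below uses a hypothetical helper, `make_objectives`, that fills every objective key used in this notebook with `0` unless a weight is supplied; the specific weights are arbitrary and only illustrate the idea.

```python
# Objective keys as they appear in the solver_config cells of this notebook.
OBJECTIVE_KEYS = ["cost", "travel_time", "variance_route_size",
                  "variance_route_service_time", "prize", "vehicle_fixed_cost"]

def make_objectives(**weights):
    """Return an objectives dict with every key present, defaulting to 0."""
    unknown = set(weights) - set(OBJECTIVE_KEYS)
    if unknown:
        raise KeyError(f"unknown objective(s): {sorted(unknown)}")
    return {key: weights.get(key, 0) for key in OBJECTIVE_KEYS}

# Weight travel time more heavily than distance without ignoring distance.
mixed = make_objectives(cost=1, travel_time=5)
print(mixed)
```

The result could be assigned to `cuopt_problem_data["solver_config"]["objectives"]` in the same way as the dictionaries above.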

## License

SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

SPDX-License-Identifier: MIT

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.