# Experiment for Ridesharing


In this notebook, we walk through the ridesharing problem with three different environment configurations that are set up in the package.

The following two variables can be used to 1) choose between the two versions of the environment (no travel_time vs. travel_time) and 2) use a grid approximation to optimize the $\alpha$ parameter of the max_weight agent.

In [1]:
has_travel_time = False
algo_tune_on = False

# Step 1: Package Installation
First, we import the necessary packages.

In [2]:
import or_suite
import numpy as np
import itertools as it

import copy

import os
from stable_baselines3.common.env_checker import check_env
from stable_baselines3.common.monitor import Monitor
from stable_baselines3 import PPO
from stable_baselines3.ppo import MlpPolicy
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.evaluation import evaluate_policy
import pandas as pd


import gym
import networkx as nx

# Configuration of $K4$ graph with uniform length of edges
The first configuration consists of a simple $K4$ graph with a uniform distance of 10 for all edges. For this particular setup, the closest_car agent outperforms the max_weight agent. The max_weight agent attempts to maintain a relatively uniform distribution of cars throughout the system, so that any incoming request has a car readily available nearby. However, in a complete graph with uniform edge lengths, the distance from one node to any other is always the same, and any node with a car can be used for dispatch. In this setting, the max_weight agent's use of the weight parameter $\alpha$ and of the number of cars available at each node produces less effective actions than the closest_car agent.
The details of the configuration are specified below.


* The network is a $K4$ graph with uniform distance of 10 for all edges.
* There are 10 available cars in the system.
* The fare parameter $f$ is 3, and the cost parameter $c$ is 1.
* The average velocity $v$ is 3.
* $\gamma$ and $d_{threshold}$ are 1 and 20 respectively.
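
To make the tie structure of this configuration concrete, the sketch below builds the $K4$ distance table in plain Python and applies a closest-car rule to it. The function `closest_car`, its tie-breaking rule, and the example car counts are illustrative assumptions and are not taken from `or_suite`.

```python
import itertools

NUM_NODES = 4
EDGE_LENGTH = 10  # uniform edge length from the configuration above

# In a complete graph every pair of distinct nodes shares an edge,
# so the shortest-path distance is simply the edge length.
dist = [[0] * NUM_NODES for _ in range(NUM_NODES)]
for u, v in itertools.combinations(range(NUM_NODES), 2):
    dist[u][v] = dist[v][u] = EDGE_LENGTH


def closest_car(cars, source):
    """Return the node holding a car that is nearest to `source`.

    Ties are broken by the lowest node index (an arbitrary choice).
    """
    candidates = [n for n in range(NUM_NODES) if cars[n] > 0]
    return min(candidates, key=lambda n: (dist[n][source], n))


cars = [0, 5, 3, 2]          # hypothetical car counts summing to 10
print(closest_car(cars, 0))  # no car at node 0: nodes 1, 2, 3 tie at distance 10
print(closest_car(cars, 2))  # a car is already at node 2: dispatch distance 0
```

Whenever the request source has no car, every other node is equally far away, so the choice among them cannot be improved by looking at distances alone.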

# Step 2: Pick problem parameters for the environment

Here we use the ridesharing environment as outlined in `or_suite/envs/ridesharing/rideshare_graph.py`. In addition, we need to specify the number of episodes for learning and the number of iterations (in order to plot average results with confidence intervals).

In [3]:
CONFIG =  or_suite.envs.env_configs.rideshare_graph_default_config
CONFIG['epLen'] = 100
epLen = CONFIG['epLen']
nEps = 5000
numIters = 25
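
`numIters` sets how many independent repetitions are averaged. As a rough illustration of how a confidence interval could be derived from such repetitions, the sketch below applies a normal approximation to hypothetical per-iteration rewards. The numbers and the helper `mean_with_ci` are made up for illustration; this is not the plotting code used by `or_suite`.

```python
import math
import statistics


def mean_with_ci(samples, z=1.96):
    """Return (mean, half_width) of a normal-approximation 95% interval."""
    mean = statistics.mean(samples)
    # Standard error of the mean, based on the sample standard deviation.
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width


# Hypothetical final rewards from five independent iterations.
rewards = [46.0, 47.5, 45.2, 48.1, 46.7]
mean, half_width = mean_with_ci(rewards)
print(f"{mean:.2f} +/- {half_width:.2f}")
```

More iterations shrink the interval roughly in proportion to the square root of `numIters`, at the price of a proportionally longer run.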

# Step 3: Pick simulation parameters

Next we need to specify parameters for the simulation. This includes setting a seed, the frequency to record the metrics, directory path for saving the data files, a deBug mode which prints the trajectory, etc.

In [4]:
DEFAULT_SETTINGS = {'seed': 1, 
                    'recFreq': 1, 
                    'dirPath': '../data/rideshare/', 
                    'deBug': False, 
                    'nEps': nEps, 
                    'numIters': numIters, 
                    'saveTrajectory': True, 
                    'epLen' : epLen,
                    'render': False,
                    'pickle': False
                    }

starting_state = CONFIG['starting_state']
num_cars = CONFIG['num_cars']
num_nodes = len(starting_state)

if has_travel_time:
    rideshare_env = gym.make('Rideshare-v1', config=CONFIG)
else:
    rideshare_env = gym.make('Rideshare-v0', config=CONFIG)
mon_env = Monitor(rideshare_env)

scaling_list = [0.1, 0.3, 1, 5]
observation_space = rideshare_env.observation_space
action_space = rideshare_env.action_space


# Step 4: Pick list of algorithms

We have several heuristics implemented for each of the environments defined, in addition to a `Random` policy, and some `RL discretization based` algorithms. 

In [5]:
agents = { #'SB PPO': PPO(MlpPolicy, mon_env, gamma=1, verbose=0, n_steps=epLen),
#'Random': or_suite.agents.rl.random.randomAgent(),
#'maxweightfixed' : or_suite.agents.rideshare.max_weight_fixed.maxWeightFixedAgent(CONFIG['epLen'], CONFIG, [1 for _ in range(num_nodes)]),
#'closestcar' : or_suite.agents.rideshare.closest_car.closetCarAgent(CONFIG['epLen'], CONFIG),
#'randomcar' : or_suite.agents.rideshare.random_car.randomCarAgent(CONFIG['epLen'], CONFIG),
'discreteql' : or_suite.agents.rl.discrete_ql.DiscreteQl(action_space, observation_space, epLen, scaling_list[0]),

}

#param_list = [list(p) for p in it.product(np.linspace(0,1,4),repeat = len(starting_state))]

In [6]:
print(action_space)
print(observation_space)

MultiDiscrete([4])
MultiDiscrete([11 11 11 11  4  4])
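
Reading the printed spaces: the action is a single node index in `{0, ..., 3}`, and the observation has six entries. Judging from the bounds, and from the trajectory records printed later in this notebook (where the last two entries of `newState` match the logged request), the first four entries appear to be the car counts per node (0 to 10) and the last two the source and destination of the current request. The sketch below decodes such a vector and counts the size of a naive tabular representation; `decode_state` is a hypothetical helper, not part of `or_suite`.

```python
import math

OBS_BOUNDS = [11, 11, 11, 11, 4, 4]  # from the printed observation space
NUM_ACTIONS = 4                      # from the printed action space
NUM_NODES = 4


def decode_state(state):
    """Split a flat state into (cars per node, request source, request sink)."""
    cars = list(state[:NUM_NODES])
    source, sink = state[NUM_NODES], state[NUM_NODES + 1]
    return cars, source, sink


cars, source, sink = decode_state([1, 2, 3, 4, 0, 2])  # hypothetical state
print(cars, source, sink)

# Upper bound on table entries if every state-action pair had its own cell.
# Many of these states are unreachable because there are only 10 cars in total.
num_states = math.prod(OBS_BOUNDS)
print(num_states, num_states * NUM_ACTIONS)
```

This bound gives a sense of the table a discretization-based agent such as `discreteql` has to work with, even on the smallest configuration.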


# Step 5: Run Simulations

Run the different heuristics in the environment

In [7]:
path_list_line = []
algo_list_line = []
path_list_radar = []
algo_list_radar= []

linspace_alpha = []

for agent in agents:
    print(agent)
    # Each agent writes its data to its own directory.
    data_path = '../data/rideshare_'+str(agent)+'_'+str(num_cars)
    DEFAULT_SETTINGS['dirPath'] = data_path
    # Optional grid search over alpha for the max_weight agent; the agent is
    # then also run by the else branch below.
    if algo_tune_on and agent == 'maxweightfixed':
        or_suite.utils.run_single_algo_tune(rideshare_env, agents[agent], param_list, DEFAULT_SETTINGS)
    if agent == 'SB PPO':
        or_suite.utils.run_single_sb_algo(mon_env, agents[agent], DEFAULT_SETTINGS)
    else:
        or_suite.utils.run_single_algo(rideshare_env, agents[agent], DEFAULT_SETTINGS)

    path_list_line.append(data_path)
    algo_list_line.append(str(agent))
    # SB PPO is excluded from the radar plot.
    if agent != 'SB PPO':
        path_list_radar.append(data_path)
        algo_list_radar.append(str(agent))

discreteql


KeyboardInterrupt: 

# Step 6: Generate Figures

Create a chart to compare the different heuristics. The ridesharing environment offers three additional metrics: the acceptance rate, and the mean and variance of the response time (measured here through the dispatch distance). They are named ACPT, MN, and VAR, respectively.
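
As an illustration of what these three metrics measure, the sketch below computes them from a few trajectory records shaped like the ones this notebook prints (`action`, and `info` with `request` and `acceptance`). It is a simplified stand-in, not the implementation of `or_suite.utils.acceptance_rate`, `mean_dispatch_dist`, or `var_dispatch_dist`; the distance table and the records are hypothetical. The result tables in this notebook show negative values for MN and VAR, which suggests the utilities negate them so that larger is better; the sketch reports the plain quantities.

```python
import statistics

# Hypothetical pairwise lengths for a K4 graph with edge length 10.
lengths = [[0 if i == j else 10 for j in range(4)] for i in range(4)]

# Hypothetical trajectory records (only the fields used here).
trajectory = [
    {"action": [1], "info": {"request": (0, 2), "acceptance": True}},
    {"action": [2], "info": {"request": (2, 3), "acceptance": True}},
    {"action": [3], "info": {"request": (1, 0), "acceptance": False}},
    {"action": [0], "info": {"request": (3, 1), "acceptance": True}},
]

# Fraction of requests that were accepted.
acceptance = sum(step["info"]["acceptance"] for step in trajectory) / len(trajectory)

# Distance from the dispatched car's node to the request source.
# Every step is included, accepted or not (an assumption of this sketch).
dispatch = [
    lengths[step["action"][0]][step["info"]["request"][0]] for step in trajectory
]

mean_dist = statistics.mean(dispatch)
var_dist = statistics.pvariance(dispatch)  # population variance
print(acceptance, mean_dist, var_dist)
```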

In [None]:
fig_path = '../figures/'
fig_name = 'rideshare_'+'_line_plot'+'.pdf'
or_suite.plots.plot_line_plots(path_list_line, algo_list_line, fig_path, fig_name, int(nEps / 40)+1)

# Pairwise lengths between nodes, used by the additional metrics below.
graph = nx.Graph(CONFIG['edges'])
lengths = or_suite.envs.ridesharing.rideshare_graph.RideshareGraphEnvironment.find_lengths(rideshare_env, graph, graph.number_of_nodes())

additional_metric = {
    'ACPT': lambda traj: or_suite.utils.acceptance_rate(traj, lambda x, y: lengths[x, y]),
    'MN': lambda traj: or_suite.utils.mean_dispatch_dist(traj, lambda x, y: lengths[x, y]),
    'VAR': lambda traj: or_suite.utils.var_dispatch_dist(traj, lambda x, y: lengths[x, y]),
}

fig_name = 'rideshare_'+'_'+'_radar_plot'+'.pdf'
or_suite.plots.plot_radar_plots(path_list_radar, algo_list_radar,
                                fig_path, fig_name,
                                additional_metric)

        Algorithm  Reward      Time      Space    ACPT        MN        VAR
0  maxweightfixed   46.32  1.307361 -129685.24  1.0000    [-5.7] -24.510000
1      closestcar   65.52  1.432694 -125712.68  1.0000  [-1.916] -15.488944
2      discreteql   47.20  0.527127 -115622.52  0.7952  [-7.522] -18.639516


# Configuration of two cities example
The second configuration is a simplified model of traffic between two cities. Two nodes act as hubs (the cities), and every other node is connected to one of the hubs by a relatively short edge. In this example, these edges have length 10. The two hubs are connected to each other by a single longer edge, of length 50 in our configuration. Every incoming request goes from one hub or one of its neighbors (excluding the other hub) to the other hub or one of its neighbors.
For this particular setup, the max_weight agent outperforms the closest_car agent. Intuitively, the cars located at a hub should be used sparingly, as they can be dispatched to any of the hub's neighbors at relatively low cost. Therefore, an optimal algorithm should prioritize using the cars located outside the hub. The max_weight agent accomplishes this by putting a heavier weight on the non-hub nodes, so that their cars are used first.
The details of the configuration are specified below.

* The network is a tree with two hubs.
* There are 9 available cars in the system.
* The fare parameter $f$ is 3, and the cost parameter $c$ is 1.
* The average velocity $v$ is 20.
* $\gamma$ and $d_{threshold}$ are 1 and 10 respectively.
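
To see why hub cars are worth conserving, it helps to look at the distances in such a network. The sketch below builds a hypothetical seven-node layout (two hubs, leaves at distance 10, hubs joined by an edge of length 50) and computes all-pairs shortest paths with the Floyd-Warshall algorithm. The node labelling and the number of leaves per city are assumptions for illustration; the actual layout is defined in `rideshare_graph_2cities_config`.

```python
import math

# Hypothetical labelling: nodes 0 and 4 are hubs, the rest are leaves.
edges = [
    (0, 1, 10), (0, 2, 10), (0, 3, 10),  # city A: hub 0 and its leaves
    (4, 5, 10), (4, 6, 10),              # city B: hub 4 and its leaves
    (0, 4, 50),                          # long edge joining the two hubs
]
n = 7

# Floyd-Warshall: start from direct edges, then relax through each node k.
dist = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
for u, v, w in edges:
    dist[u][v] = dist[v][u] = w
for k in range(n):
    for i in range(n):
        for j in range(n):
            if dist[i][k] + dist[k][j] < dist[i][j]:
                dist[i][j] = dist[i][k] + dist[k][j]

print(dist[1][2])  # leaf to leaf within one city, via the hub
print(dist[1][5])  # leaf to leaf across cities
# Worst-case pickup distance inside city A, from the hub and from a leaf.
city_a = [0, 1, 2, 3]
print(max(dist[0][v] for v in city_a), max(dist[1][v] for v in city_a))
```

In this layout a hub car reaches any node of its city within 10, while a leaf car may need 20. This matches the intuition that leaf cars should be dispatched first, so that the more flexible hub cars stay available.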

In [None]:
CONFIG =  or_suite.envs.env_configs.rideshare_graph_2cities_config
epLen = CONFIG['epLen']
nEps = 2
numIters = 25

In [None]:
DEFAULT_SETTINGS = {'seed': 1, 
                    'recFreq': 1, 
                    'dirPath': '../data/rideshare/', 
                    'deBug': False, 
                    'nEps': nEps, 
                    'numIters': numIters, 
                    'saveTrajectory': True, 
                    'epLen' : epLen,
                    'render': False,
                    'pickle': False
                    }

starting_state = CONFIG['starting_state']
num_cars = CONFIG['num_cars']
num_nodes = len(starting_state)
if has_travel_time:
    rideshare_env = gym.make('Rideshare-v1', config=CONFIG)
else:
    rideshare_env = gym.make('Rideshare-v0', config=CONFIG)
mon_env = Monitor(rideshare_env)

In [None]:
agents = { #'SB PPO': PPO(MlpPolicy, mon_env, gamma=1, verbose=0, n_steps=epLen),
#'Random': or_suite.agents.rl.random.randomAgent(),
'maxweightfixed' : or_suite.agents.rideshare.max_weight_fixed.maxWeightFixedAgent(CONFIG['epLen'], CONFIG, [1 for _ in range(num_nodes)]),
'closestcar' : or_suite.agents.rideshare.closest_car.closetCarAgent(CONFIG['epLen'], CONFIG),
'randomcar' : or_suite.agents.rideshare.random_car.randomCarAgent(CONFIG['epLen'], CONFIG)
}

param_list = [list(p) for p in it.product(np.linspace(0,1,4),repeat = len(starting_state))]
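
The grid in `param_list` takes every combination of four evenly spaced $\alpha$ values per node, so its size is $4^{n}$ for $n$ nodes. The sketch below reproduces the construction with the standard library only, for a small hypothetical node count, and prints how quickly the grid grows. The helpers `linspace` and `alpha_grid` are illustrative; the node counts 7 and 63 are read off the state vectors and the configuration descriptions in this notebook.

```python
import itertools


def linspace(start, stop, num):
    """Evenly spaced values including both endpoints (like numpy.linspace)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def alpha_grid(num_nodes, points_per_node=4):
    """All combinations of alpha values in [0, 1], one value per node."""
    values = linspace(0.0, 1.0, points_per_node)
    return [list(p) for p in itertools.product(values, repeat=num_nodes)]


grid = alpha_grid(num_nodes=3)  # small hypothetical example
print(len(grid), grid[0], grid[-1])

# Grid sizes for larger networks, computed without materialising the grid.
for nodes in (4, 7, 63):
    print(nodes, 4 ** nodes)
```

With seven nodes the grid already holds 16384 parameter vectors, each of which would have to be evaluated when tuning is enabled. With 63 nodes enumeration is out of reach, which may be why the corresponding line is commented out in the New York configuration.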

In [None]:
path_list_line = []
algo_list_line = []
path_list_radar = []
algo_list_radar= []

linspace_alpha = []

for agent in agents:
    print(agent)
    # Each agent writes its data to its own directory.
    data_path = '../data/rideshare_'+str(agent)+'_'+str(num_cars)
    DEFAULT_SETTINGS['dirPath'] = data_path
    # Optional grid search over alpha for the max_weight agent; the agent is
    # then also run by the else branch below.
    if algo_tune_on and agent == 'maxweightfixed':
        or_suite.utils.run_single_algo_tune(rideshare_env, agents[agent], param_list, DEFAULT_SETTINGS)
    if agent == 'SB PPO':
        or_suite.utils.run_single_sb_algo(mon_env, agents[agent], DEFAULT_SETTINGS)
    else:
        or_suite.utils.run_single_algo(rideshare_env, agents[agent], DEFAULT_SETTINGS)

    path_list_line.append(data_path)
    algo_list_line.append(str(agent))
    # SB PPO is excluded from the radar plot.
    if agent != 'SB PPO':
        path_list_radar.append(data_path)
        algo_list_radar.append(str(agent))

maxweightfixed
{'iter': 0, 'episode': 0, 'step': 0, 'oldState': array([0, 0, 3, 0, 3, 3, 0, 5, 1]), 'action': [5], 'reward': 140.0, 'newState': array([0, 1, 3, 0, 3, 2, 0, 3, 5]), 'info': {'request': array([3, 5]), 'acceptance': True}}
{'iter': 0, 'episode': 0, 'step': 1, 'oldState': array([0, 1, 3, 0, 3, 2, 0, 3, 5]), 'action': [2], 'reward': 0.0, 'newState': array([0, 1, 3, 0, 3, 2, 0, 4, 1]), 'info': {'request': array([4, 1]), 'acceptance': False}}
{'iter': 0, 'episode': 0, 'step': 2, 'oldState': array([0, 1, 3, 0, 3, 2, 0, 4, 1]), 'action': [4], 'reward': 120.0, 'newState': array([0, 2, 3, 0, 2, 2, 0, 6, 1]), 'info': {'request': array([6, 1]), 'acceptance': True}}
{'iter': 0, 'episode': 0, 'step': 3, 'oldState': array([0, 2, 3, 0, 2, 2, 0, 6, 1]), 'action': [4], 'reward': 130.0, 'newState': array([0, 3, 3, 0, 1, 2, 0, 6, 0]), 'info': {'request': array([6, 0]), 'acceptance': True}}
{'iter': 0, 'episode': 0, 'step': 4, 'oldState': array([0, 3, 3, 0, 1, 2, 0, 6, 0]), 'action': [4], 'r

In [None]:
fig_path = '../figures/'
fig_name = 'rideshare_'+'_line_plot'+'.pdf'
or_suite.plots.plot_line_plots(path_list_line, algo_list_line, fig_path, fig_name, int(nEps / 40)+1)

# Pairwise lengths between nodes, used by the additional metrics below.
graph = nx.Graph(CONFIG['edges'])
lengths = or_suite.envs.ridesharing.rideshare_graph.RideshareGraphEnvironment.find_lengths(rideshare_env, graph, graph.number_of_nodes())

additional_metric = {
    'ACPT': lambda traj: or_suite.utils.acceptance_rate(traj, lambda x, y: lengths[x, y]),
    'MN': lambda traj: or_suite.utils.mean_dispatch_dist(traj, lambda x, y: lengths[x, y]),
    'VAR': lambda traj: or_suite.utils.var_dispatch_dist(traj, lambda x, y: lengths[x, y]),
}

fig_name = 'rideshare_'+'_'+'_radar_plot'+'.pdf'
or_suite.plots.plot_radar_plots(path_list_radar, algo_list_radar,
                                fig_path, fig_name,
                                additional_metric)

        Algorithm  Reward      Time     Space   ACPT       MN       VAR
0  maxweightfixed   558.4  5.167377 -12553.92  0.872  [-5.96]  -49.6784
1      closestcar   552.8  5.304312 -12598.08  0.860   [-5.0]  -53.0000
2       randomcar   268.0  5.208960 -11052.76  0.396  [-37.0] -853.0000


# New York yellow cab data
The last configuration is a simulation based on the New York yellow cab data. The graph is a representation of the Manhattan area of New York, partitioned according to the taxi zones defined by the Taxi and Limousine Commission (TLC) of NYC. The requests are based on the TLC Trip Record Data from January 2021.
The details of the configuration are specified below.


* The network is a graph representation of the Manhattan area, partitioned into the 63 taxi zones defined by the TLC of NYC.
* There are 630 available cars in the system.
* The fare parameter $f$ is 6.385456638089008, which is the average fare collected per mile in the given dataset.
* The cost parameter $c$ is 1.
* The average velocity $v$ is 0.36049478338631713, which is the average velocity for all the services in the given dataset.
* $\gamma$ is 1.
* $d_{threshold}$ is 4.700448825133434, the average distance of trips in the given dataset.

In [None]:
CONFIG =  or_suite.envs.env_configs.rideshare_graph_ny_config
epLen = CONFIG['epLen']
nEps = 2
numIters = 25

In [None]:
DEFAULT_SETTINGS = {'seed': 1, 
                    'recFreq': 1, 
                    'dirPath': '../data/rideshare/', 
                    'deBug': False, 
                    'nEps': nEps, 
                    'numIters': numIters, 
                    'saveTrajectory': True, 
                    'epLen' : epLen,
                    'render': False,
                    'pickle': False
                    }

starting_state = CONFIG['starting_state']
num_cars = CONFIG['num_cars']
num_nodes = len(starting_state)
if has_travel_time:
    rideshare_env = gym.make('Rideshare-v1', config=CONFIG)
else:
    rideshare_env = gym.make('Rideshare-v0', config=CONFIG)
mon_env = Monitor(rideshare_env)

In [None]:
agents = { #'SB PPO': PPO(MlpPolicy, mon_env, gamma=1, verbose=0, n_steps=epLen),
#'Random': or_suite.agents.rl.random.randomAgent(),
'maxweightfixed' : or_suite.agents.rideshare.max_weight_fixed.maxWeightFixedAgent(CONFIG['epLen'], CONFIG, [1 for _ in range(num_nodes)]),
'closestcar' : or_suite.agents.rideshare.closest_car.closetCarAgent(CONFIG['epLen'], CONFIG),
'randomcar' : or_suite.agents.rideshare.random_car.randomCarAgent(CONFIG['epLen'], CONFIG)
}

#param_list = [list(p) for p in it.product(np.linspace(0,1,4),repeat = len(starting_state))]

In [None]:
path_list_line = []
algo_list_line = []
path_list_radar = []
algo_list_radar= []

linspace_alpha = []

for agent in agents:
    print(agent)
    # Each agent writes its data to its own directory.
    data_path = '../data/rideshare_'+str(agent)+'_'+str(num_cars)
    DEFAULT_SETTINGS['dirPath'] = data_path
    # Optional grid search over alpha for the max_weight agent; the agent is
    # then also run by the else branch below.
    if algo_tune_on and agent == 'maxweightfixed':
        or_suite.utils.run_single_algo_tune(rideshare_env, agents[agent], param_list, DEFAULT_SETTINGS)
    if agent == 'SB PPO':
        or_suite.utils.run_single_sb_algo(mon_env, agents[agent], DEFAULT_SETTINGS)
    else:
        or_suite.utils.run_single_algo(rideshare_env, agents[agent], DEFAULT_SETTINGS)

    path_list_line.append(data_path)
    algo_list_line.append(str(agent))
    # SB PPO is excluded from the radar plot.
    if agent != 'SB PPO':
        path_list_radar.append(data_path)
        algo_list_radar.append(str(agent))

maxweightfixed
{'iter': 0, 'episode': 0, 'step': 0, 'oldState': array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 49, 27]), 'action': [49], 'reward': 4.286637700391581, 'newState': array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  9, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 49, 27]), 'info': {'request': (49, 27), 'acceptance': True}}
{'iter': 0, 'episode': 0, 'step': 1, 'oldState': array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10

In [None]:
fig_path = '../figures/'
fig_name = 'rideshare_'+'_line_plot'+'.pdf'
or_suite.plots.plot_line_plots(path_list_line, algo_list_line, fig_path, fig_name, int(nEps / 40)+1)

# Pairwise lengths between nodes, used by the additional metrics below.
graph = nx.Graph(CONFIG['edges'])
lengths = or_suite.envs.ridesharing.rideshare_graph.RideshareGraphEnvironment.find_lengths(rideshare_env, graph, graph.number_of_nodes())

additional_metric = {
    'ACPT': lambda traj: or_suite.utils.acceptance_rate(traj, lambda x, y: lengths[x, y]),
    'MN': lambda traj: or_suite.utils.mean_dispatch_dist(traj, lambda x, y: lengths[x, y]),
    'VAR': lambda traj: or_suite.utils.var_dispatch_dist(traj, lambda x, y: lengths[x, y]),
}

fig_name = 'rideshare_'+'_'+'_radar_plot'+'.pdf'
or_suite.plots.plot_radar_plots(path_list_radar, algo_list_radar,
                                fig_path, fig_name,
                                additional_metric)

        Algorithm     Reward      Time     Space   ACPT                     MN       VAR
0  maxweightfixed  49.031096  3.687168 -15710.96  0.976  [-0.7842311560594619] -0.729572
1      closestcar  54.039070  3.803876 -17532.20  0.996                 [-0.0] -0.000000
2       randomcar  24.888810  3.748566 -13709.56  0.620  [-4.0269822797683785] -5.620506
