# Experiment for Ridesharing


In this notebook, we walk through the ridesharing problem using three different environment configurations that are set up in the package.

The following two variables 1) choose between the two versions of the environment (`has_travel_time`: without or with travel time) and 2) turn on a grid search over the $\alpha$ parameter of the max_weight agent (`algo_tune_on`).

In [2]:
has_travel_time = True
algo_tune_on = False

### Package Installation
First we import the necessary packages

In [3]:
import or_suite
import numpy as np
import itertools as it

import copy

import os
from stable_baselines3.common.env_checker import check_env
from stable_baselines3.common.monitor import Monitor
from stable_baselines3 import PPO
from stable_baselines3.ppo import MlpPolicy
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.evaluation import evaluate_policy
import pandas as pd


import gym
import networkx as nx

### Configuration of $K4$ graph with uniform length of edges
The first configuration is a simple $K4$ graph in which every edge has length 10. In this set-up, the closest_car agent outperforms the max_weight agent. The max_weight agent tries to keep cars spread relatively evenly across the system, so that any incoming request has a car available nearby. On a complete graph with uniform edge lengths, however, every node other than the request's own is equally far away, so any node with a car is an equally good place to dispatch from. The max_weight agent's use of the weight parameter $\alpha$ and the number of cars available at each node therefore yields worse actions than the closest_car agent.
The details of the configuration are specified below.


* The network is a $K4$ graph with uniform distance of 10 for all edges.
* There are 10 available cars in the system.
* The fare parameter $f$ is 3, and the cost parameter $c$ is 1.
* The average velocity $v$ is 3.
* $\gamma$ and $d_{threshold}$ are 1 and 20 respectively.
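
To make the "uniform distance" claim concrete, here is a minimal sketch that builds the $K4$ distance matrix by hand and checks that every pair of distinct nodes is exactly 10 apart. It uses only the standard library; the node labels 0..3 and the variable names are assumptions made for this illustration and are not taken from the package.

```python
from itertools import combinations

# Assumption: nodes are labelled 0..3 and every edge has length 10,
# as in the configuration list above.
NUM_NODES = 4
EDGE_LENGTH = 10
INF = float("inf")

# Start with 0 on the diagonal and "no edge" everywhere else.
dist = [[0 if i == j else INF for j in range(NUM_NODES)] for i in range(NUM_NODES)]
for u, v in combinations(range(NUM_NODES), 2):
    dist[u][v] = dist[v][u] = EDGE_LENGTH

# Floyd-Warshall: relax every pair through every intermediate node.
for k in range(NUM_NODES):
    for i in range(NUM_NODES):
        for j in range(NUM_NODES):
            dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])

# Collect the distinct distances between different nodes.
off_diagonal = {dist[i][j] for i in range(NUM_NODES) for j in range(NUM_NODES) if i != j}
print(off_diagonal)
```

Because the only distances are 0 (a car already at the request's node) and 10 (a car anywhere else), no dispatch location other than the request's own node is ever preferable on distance alone.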

### Environment Parameters

Here we use the ridesharing environment as outlined in `or_suite/envs/ridesharing/rideshare_graph.py`.
* `epLen`: an int, the number of time steps in each episode.
* `nEps`: an int, the number of episodes.
* `numIters`: an int, the number of iterations.

In [4]:
CONFIG =  or_suite.envs.env_configs.rideshare_graph_default_config
CONFIG['epLen'] = 100
epLen = CONFIG['epLen']
nEps = 2
numIters = 25
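
As a quick sanity check on the size of the experiment, the sketch below multiplies the three parameters out. It assumes that each iteration runs `nEps` episodes of `epLen` steps, which matches the nested `iter` / `episode` / `step` counters in the debug output later in this notebook; the variable names `steps_per_iteration` and `total_steps` are introduced here for illustration only.

```python
# Values copied from the cell above.
epLen = 100      # time steps per episode
nEps = 2         # episodes per iteration
numIters = 25    # iterations

# Assumption: every iteration runs nEps episodes of epLen steps each.
steps_per_iteration = nEps * epLen
total_steps = numIters * steps_per_iteration
print(steps_per_iteration, total_steps)
```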

### Experimental parameters

Next we need to specify parameters for the simulation:
* `seed` sets the seed of the random number generator so that runs are reproducible.
* `dirPath`, a string, is the location where the data files are stored.
* `deBug`, a bool, prints information to the command line when set to true.
* `saveTrajectory`, a bool, saves the trajectory information of the simulation when set to true.
* `render`, a bool, renders the algorithm when set to true.
* `pickle`, a bool, saves the information to a pickle file when set to true.

In [5]:
DEFAULT_SETTINGS = {'seed': 1, 
                    'recFreq': 1, 
                    'dirPath': '../data/rideshare/', 
                    'deBug': False, 
                    'nEps': nEps, 
                    'numIters': numIters, 
                    'saveTrajectory': True, 
                    'epLen' : epLen,
                    'render': False,
                    'pickle': False
                    }

starting_state = CONFIG['starting_state']
num_cars = CONFIG['num_cars']
num_nodes = len(starting_state)

if has_travel_time:
  rideshare_env = gym.make('Rideshare-v1', config=CONFIG)
else:
  rideshare_env = gym.make('Rideshare-v0', config=CONFIG)
mon_env = Monitor(rideshare_env)

### Specifying Agent

We specify the agents below to compare their effectiveness. Only the three ridesharing heuristics are run here; `SB PPO` is commented out in the code.
* `SB PPO` is Proximal Policy Optimization. It "clips" each policy update so that the new policy does not move too far from the previous one.
* `maxweightfixed` is the max_weight agent with a fixed weight parameter $\alpha$ (one weight per node, all set to 1 here).
* `closestcar` is an agent that dispatches the car closest to each request in the ridesharing environment.
* `randomcar` is an agent that dispatches a randomly chosen car for each request in the ridesharing environment.
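
The three heuristics can be contrasted with a small stand-alone sketch. None of the functions below are the package's implementations: `closest_car`, `random_car`, and `weighted_car` are hypothetical helpers written for this illustration, the uniform choice in `random_car` is an assumption, and `weighted_car` is a deliberately simplified stand-in for a max-weight rule that scores each node by `alpha[i] * cars[i]` and ignores distance.

```python
import random

def closest_car(cars, dist_to_request):
    """Pick the node holding a car that is nearest to the request."""
    candidates = [i for i, n in enumerate(cars) if n > 0]
    return min(candidates, key=lambda i: dist_to_request[i])

def random_car(cars, rng):
    """Pick uniformly among the nodes that hold at least one car (assumption)."""
    candidates = [i for i, n in enumerate(cars) if n > 0]
    return rng.choice(candidates)

def weighted_car(cars, alpha):
    """Simplified max-weight stand-in: maximize alpha[i] * cars[i]."""
    candidates = [i for i, n in enumerate(cars) if n > 0]
    return max(candidates, key=lambda i: alpha[i] * cars[i])

# K4 example: the request arrives at node 0, which holds a single car.
cars = [1, 4, 3, 2]
dist_to_request = [0, 10, 10, 10]
alpha = [1, 1, 1, 1]

closest_choice = closest_car(cars, dist_to_request)
weighted_choice = weighted_car(cars, alpha)
random_choice = random_car(cars, random.Random(0))
print(closest_choice, weighted_choice, random_choice)
```

In this example the closest-car rule uses the car already at the request's node, while the simplified weighted rule dispatches from the fullest node, 10 units away. That is the kind of trade-off the $K4$ discussion above describes.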

In [6]:
agents = { #'SB PPO': PPO(MlpPolicy, mon_env, gamma=1, verbose=0, n_steps=epLen),
#'Random': or_suite.agents.rl.random.randomAgent(),
'maxweightfixed' : or_suite.agents.rideshare.max_weight_fixed.maxWeightFixedAgent(CONFIG['epLen'], CONFIG, [1 for _ in range(num_nodes)]),
'closestcar' : or_suite.agents.rideshare.closest_car.closetCarAgent(CONFIG['epLen'], CONFIG),
'randomcar' : or_suite.agents.rideshare.random_car.randomCarAgent(CONFIG['epLen'], CONFIG)
}

param_list = [list(p) for p in it.product(np.linspace(0,1,4),repeat = len(starting_state))]
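
The size of this grid is worth spelling out. The sketch below rebuilds the same kind of grid with the standard library only, assuming four nodes as in the $K4$ configuration; `grid_values` and `candidates_for_63_nodes` are names introduced here for illustration.

```python
from itertools import product

# Same values as np.linspace(0, 1, 4): four evenly spaced points in [0, 1].
grid_values = [k / 3 for k in range(4)]

# Assumption: four nodes, as in the K4 configuration.
num_nodes = 4
param_list = [list(p) for p in product(grid_values, repeat=num_nodes)]

# One candidate alpha vector per combination: 4 ** num_nodes in total.
print(len(param_list))
print(param_list[0], param_list[-1])

# The grid grows exponentially with the number of nodes.
candidates_for_63_nodes = 4 ** 63
print(candidates_for_63_nodes)
```

With 63 nodes, as in the New York configuration later in this notebook, the grid would have `4 ** 63` entries, which may be why the `param_list` line is commented out there.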

### Running Algorithm

Run the different heuristics in the environment

In [7]:
path_list_line = []
algo_list_line = []
path_list_radar = []
algo_list_radar= []

linspace_alpha = []

for agent in agents:
    print(agent)
    DEFAULT_SETTINGS['dirPath'] = '../data/rideshare_'+str(agent)+'_'+str(num_cars)
    if algo_tune_on and agent == 'maxweightfixed':
        or_suite.utils.run_single_algo_tune(rideshare_env,agents[agent], param_list, DEFAULT_SETTINGS)
    elif agent == 'SB PPO':
        or_suite.utils.run_single_sb_algo(mon_env, agents[agent], DEFAULT_SETTINGS)
    else:
        or_suite.utils.run_single_algo(rideshare_env, agents[agent], DEFAULT_SETTINGS)

    path_list_line.append('../data/rideshare_'+str(agent)+'_'+str(num_cars))
    algo_list_line.append(str(agent))
    if agent != 'SB PPO':
        path_list_radar.append('../data/rideshare_'+str(agent)+'_'+str(num_cars))
        algo_list_radar.append(str(agent))

maxweightfixed
Writing to file data.csv
closestcar
Writing to file data.csv
randomcar
Writing to file data.csv


### Generate Figures

Create charts to compare the different heuristics. The ridesharing environment offers three more metrics: acceptance rate, and the mean and variance of the dispatch distance (computed by `mean_dispatch_dist` and `var_dispatch_dist`). They are named ACPT, MN, and VAR, respectively.

In [8]:
fig_path = '../figures/'
# File names match the ones loaded by the IFrame cells below
fig_name = 'rideshare_line_plot.pdf'
or_suite.plots.plot_line_plots(path_list_line, algo_list_line, fig_path, fig_name, int(nEps / 40)+1)

# Pairwise distances between nodes, used by the additional metrics
graph = nx.Graph(CONFIG['edges'])
lengths = or_suite.envs.ridesharing.rideshare_graph.RideshareGraphEnvironment.find_lengths(rideshare_env, graph, graph.number_of_nodes())

dist = lambda x, y: lengths[x, y]
additional_metric = {
    'ACPT': lambda traj: or_suite.utils.acceptance_rate(traj, dist),
    'MN': lambda traj: or_suite.utils.mean_dispatch_dist(traj, dist),
    'VAR': lambda traj: or_suite.utils.var_dispatch_dist(traj, dist),
}

fig_name = 'rideshare_radar_plot.pdf'
or_suite.plots.plot_radar_plots(path_list_radar, algo_list_radar, fig_path, fig_name, additional_metric)

        Algorithm  Reward      Time      Space  ACPT     MN        VAR
0  maxweightfixed   923.6  3.940824 -118877.20   1.0 -5.668 -24.553776
1      closestcar  1263.6  3.961825 -118879.16   1.0 -2.252 -17.448496
2       randomcar   769.6  3.674945 -118959.48   1.0 -7.556 -18.466864
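
To show what the ACPT, MN, and VAR columns measure, here is a rough sketch that computes comparable quantities from a hand-made trajectory. It is not the package's code: the `sketch_*` helpers and the four-step trajectory are hypothetical, and the record layout (request source in the second-to-last entry of `oldState`, acceptance flag under `info`) is an assumption inferred from the debug output printed later in this notebook. The sketch averages over all steps and uses the population variance; the package may do either differently.

```python
from statistics import mean, pvariance

def sketch_acceptance_rate(traj):
    """Fraction of steps whose request was accepted."""
    return mean([1.0 if step["info"]["acceptance"] else 0.0 for step in traj])

def sketch_dispatch_distances(traj, dist):
    """Distance from the dispatched node (the action) to the request source."""
    # Assumption: the second-to-last entry of oldState is the request source.
    return [dist(step["action"], step["oldState"][-2]) for step in traj]

# K4 distances: 0 to the same node, 10 to any other node.
k4_dist = lambda x, y: 0 if x == y else 10

# Hypothetical four-step trajectory; each oldState ends with (source, sink).
traj = [
    {"oldState": [3, 3, 2, 2, 0, 2], "action": 0, "info": {"acceptance": True}},
    {"oldState": [2, 3, 2, 2, 1, 3], "action": 2, "info": {"acceptance": False}},
    {"oldState": [2, 3, 2, 2, 3, 0], "action": 1, "info": {"acceptance": True}},
    {"oldState": [2, 2, 2, 2, 2, 1], "action": 2, "info": {"acceptance": True}},
]

distances = sketch_dispatch_distances(traj, k4_dist)
acpt = sketch_acceptance_rate(traj)
mn = mean(distances)
var = pvariance(distances)
print(acpt, mn, var)
```

The table above reports MN and VAR as negative numbers, presumably so that larger values are better on the radar plot; the sketch leaves them positive.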


In [9]:
from IPython.display import IFrame
IFrame("../figures/rideshare_line_plot.pdf", width=600, height=280)

In [10]:
IFrame("../figures/rideshare_radar_plot.pdf", width=600, height=450)

# Configuration of two cities example
The second configuration is a simplified model of traffic between two cities. Two nodes act as hubs (the cities), and every other node is connected to one of the hubs by a relatively short edge, of length 10 in this example. The two hubs are connected to each other by a single longer edge, of length 50 in our configuration. Every incoming request goes from one hub or one of its neighbors (excluding the other hub) to the other hub or one of its neighbors.
For this set-up, the max_weight agent outperforms the closest_car agent. Intuitively, the cars located at a hub should be used sparingly, as they can be dispatched to any of the hub's neighbors at relatively low cost. An optimal algorithm should therefore prioritize using the cars located outside the hubs. The max_weight agent does this by putting heavier weights on the non-hub nodes so that their cars are used first.
The details of the configuration are specified below.

* The network is a tree with two hubs.
* There are 9 available cars in the system.
* The fare parameter $f$ is 3, and the cost parameter $c$ is 1.
* The average velocity $v$ is 20.
* $\gamma$ and $d_{threshold}$ are 1 and 10 respectively.
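
The intuition about hubs can be checked on a small example. The layout below is hypothetical (hubs `A` and `B` with two leaves each; the actual node numbering of `rideshare_graph_2cities_config` is not shown in this notebook), but it uses the edge lengths stated above: 10 between a hub and its leaves and 50 between the hubs. `shortest_from` is a plain Dijkstra helper written for this sketch.

```python
import heapq

# Hypothetical layout for illustration: hubs "A" and "B", two leaves each.
edges = [("A", "a1", 10), ("A", "a2", 10),
         ("B", "b1", 10), ("B", "b2", 10),
         ("A", "B", 50)]

# Undirected adjacency list.
adj = {}
for u, v, w in edges:
    adj.setdefault(u, []).append((v, w))
    adj.setdefault(v, []).append((u, w))

def shortest_from(source):
    """Dijkstra's algorithm: shortest distance from source to every node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

from_hub = shortest_from("A")
from_leaf = shortest_from("a1")
print(from_hub["a1"], from_hub["a2"])    # hub to its own leaves
print(from_leaf["a2"], from_leaf["b1"])  # leaf to sibling leaf, leaf to other city
```

A car parked at a hub is 10 away from each of the hub's leaves, whereas a car parked at a leaf is 20 away from a sibling leaf. This is why hub cars are the more flexible resource in this configuration.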

In [15]:
CONFIG =  or_suite.envs.env_configs.rideshare_graph_2cities_config
epLen = CONFIG['epLen']
nEps = 2
numIters = 25

In [16]:
DEFAULT_SETTINGS = {'seed': 1, 
                    'recFreq': 1, 
                    'dirPath': '../data/rideshare/', 
                    'deBug': False, 
                    'nEps': nEps, 
                    'numIters': numIters, 
                    'saveTrajectory': True, 
                    'epLen' : epLen,
                    'render': False,
                    'pickle': False
                    }

starting_state = CONFIG['starting_state']
num_cars = CONFIG['num_cars']
num_nodes = len(starting_state)
if has_travel_time:
  rideshare_env = gym.make('Rideshare-v1', config=CONFIG)
else:
  rideshare_env = gym.make('Rideshare-v0', config=CONFIG)
mon_env = Monitor(rideshare_env)

In [17]:
agents = { #'SB PPO': PPO(MlpPolicy, mon_env, gamma=1, verbose=0, n_steps=epLen),
#'Random': or_suite.agents.rl.random.randomAgent(),
'maxweightfixed' : or_suite.agents.rideshare.max_weight_fixed.maxWeightFixedAgent(CONFIG['epLen'], CONFIG, [1 for _ in range(num_nodes)]),
'closestcar' : or_suite.agents.rideshare.closest_car.closetCarAgent(CONFIG['epLen'], CONFIG),
'randomcar' : or_suite.agents.rideshare.random_car.randomCarAgent(CONFIG['epLen'], CONFIG)
}

param_list = [list(p) for p in it.product(np.linspace(0,1,4),repeat = len(starting_state))]

In [18]:
path_list_line = []
algo_list_line = []
path_list_radar = []
algo_list_radar= []

linspace_alpha = []

for agent in agents:
    print(agent)
    DEFAULT_SETTINGS['dirPath'] = '../data/rideshare_'+str(agent)+'_'+str(num_cars)
    if algo_tune_on and agent == 'maxweightfixed':
        or_suite.utils.run_single_algo_tune(rideshare_env,agents[agent], param_list, DEFAULT_SETTINGS)
    elif agent == 'SB PPO':
        or_suite.utils.run_single_sb_algo(mon_env, agents[agent], DEFAULT_SETTINGS)
    else:
        or_suite.utils.run_single_algo(rideshare_env, agents[agent], DEFAULT_SETTINGS)

    path_list_line.append('../data/rideshare_'+str(agent)+'_'+str(num_cars))
    algo_list_line.append(str(agent))
    if agent != 'SB PPO':
        path_list_radar.append('../data/rideshare_'+str(agent)+'_'+str(num_cars))
        algo_list_radar.append(str(agent))

maxweightfixed
**************************************************
Running experiment
**************************************************
{'iter': 0, 'episode': 0, 'step': 0, 'oldState': array([0, 0, 3, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 6, 0]), 'action': 4, 'reward': 110.0, 'newState': array([0, 0, 3, 0, 2, 3, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 3, 5]), 'info': {'request': array([3, 5]), 'acceptance': True}}
{'iter': 0, 'episode': 0, 'step': 1, 'oldState': array([0, 0, 3, 0, 2, 3, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 3, 5]), 'action': 2, 'reward': 0.0, 'newState': array([0, 0, 3, 0, 2, 3, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 4, 1]), 'info': {'request': array([4, 1]), 'acceptance': False}}
...

In [19]:
fig_path = '../figures/'
fig_name = 'rideshare_'+'_line_plot'+'.pdf'
or_suite.plots.plot_line_plots(path_list_line, algo_list_line, fig_path, fig_name, int(nEps / 40)+1)

additional_metric = {'ACPT': lambda traj : or_suite.utils.acceptance_rate(traj, lambda x, y : lengths[x,y]),'MN': lambda traj : or_suite.utils.mean_dispatch_dist(traj, lambda x, y : lengths[x,y]),'VAR': lambda traj : or_suite.utils.var_dispatch_dist(traj, lambda x, y : lengths[x,y])}

graph = nx.Graph(CONFIG['edges'])
lengths = or_suite.envs.ridesharing.rideshare_graph.RideshareGraphEnvironment.find_lengths(rideshare_env, graph, graph.number_of_nodes())

fig_name = 'rideshare_'+'_'+'_radar_plot'+'.pdf'
or_suite.plots.plot_radar_plots(path_list_radar, algo_list_radar,
fig_path, fig_name,
additional_metric
)

        Algorithm  Reward      Time     Space   ACPT     MN       VAR
0  maxweightfixed   533.2  5.065019  -9900.36  0.872  -8.52  -38.2096
1      closestcar   524.8  4.822370 -11914.12  0.844  -7.60  -49.4400
2       randomcar   227.2  4.772872 -10631.52  0.348 -37.60 -783.0400


# New York yellow cab data
The last configuration is a simulation based on New York yellow cab data. The graph represents Manhattan, partitioned into the taxi zones defined by the NYC Taxi and Limousine Commission (TLC). The requests are based on TLC Trip Record Data from January 2021.
The details of the configuration are specified below.


* The network is a graph representation of Manhattan, partitioned into the 63 taxi zones defined by the TLC.
* There are 630 available cars in the system.
* The fare parameter $f$ is 6.385456638089008, which is the average fare collected per mile in the given dataset.
* The cost parameter $c$ is 1.
* The average velocity $v$ is 0.36049478338631713, which is the average velocity for all the services in the given dataset.
* $\gamma$ is 1.
* $d_{threshold}$ is 4.700448825133434, the average distance of trips in the given dataset.

In [13]:
CONFIG =  or_suite.envs.env_configs.rideshare_graph_ny_config
epLen = CONFIG['epLen']
nEps = 2
numIters = 25

In [14]:
DEFAULT_SETTINGS = {'seed': 1, 
                    'recFreq': 1, 
                    'dirPath': '../data/rideshare/', 
                    'deBug': False, 
                    'nEps': nEps, 
                    'numIters': numIters, 
                    'saveTrajectory': True, 
                    'epLen' : epLen,
                    'render': False,
                    'pickle': False
                    }

starting_state = CONFIG['starting_state']
num_cars = CONFIG['num_cars']
num_nodes = len(starting_state)
if has_travel_time:
  rideshare_env = gym.make('Rideshare-v1', config=CONFIG)
else:
  rideshare_env = gym.make('Rideshare-v0', config=CONFIG)
mon_env = Monitor(rideshare_env)

In [15]:
agents = { #'SB PPO': PPO(MlpPolicy, mon_env, gamma=1, verbose=0, n_steps=epLen),
#'Random': or_suite.agents.rl.random.randomAgent(),
'maxweightfixed' : or_suite.agents.rideshare.max_weight_fixed.maxWeightFixedAgent(CONFIG['epLen'], CONFIG, [1 for _ in range(num_nodes)]),
'closestcar' : or_suite.agents.rideshare.closest_car.closetCarAgent(CONFIG['epLen'], CONFIG),
'randomcar' : or_suite.agents.rideshare.random_car.randomCarAgent(CONFIG['epLen'], CONFIG)
}

#param_list = [list(p) for p in it.product(np.linspace(0,1,4),repeat = len(starting_state))]

In [16]:
path_list_line = []
algo_list_line = []
path_list_radar = []
algo_list_radar= []

linspace_alpha = []

for agent in agents:
    print(agent)
    DEFAULT_SETTINGS['dirPath'] = '../data/rideshare_'+str(agent)+'_'+str(num_cars)
    if algo_tune_on and agent == 'maxweightfixed':
        or_suite.utils.run_single_algo_tune(rideshare_env,agents[agent], param_list, DEFAULT_SETTINGS)
    elif agent == 'SB PPO':
        or_suite.utils.run_single_sb_algo(mon_env, agents[agent], DEFAULT_SETTINGS)
    else:
        or_suite.utils.run_single_algo(rideshare_env, agents[agent], DEFAULT_SETTINGS)

    path_list_line.append('../data/rideshare_'+str(agent)+'_'+str(num_cars))
    algo_list_line.append(str(agent))
    if agent != 'SB PPO':
        path_list_radar.append('../data/rideshare_'+str(agent)+'_'+str(num_cars))
        algo_list_radar.append(str(agent))

maxweightfixed
**************************************************
Running experiment
**************************************************
{'iter': 0, 'episode': 0, 'step': 0, 'oldState': array([10, 10, 10, ...,  0, 49, 27]), 'action': 49, 'reward': 4.286637700391581, 'newState': array([10, 10, 10, ...,  1, 49, 27]), 'info': {'request': (49, 27), 'acceptance': True}}
{'iter': 0, 'episode': 0, 'step': 1, 'oldState': array([10, 10, 10, ...,  1, 49, 27]), 'action': 34, 'reward': 3.557783588985746, 'newState': array([10, 10, 10, ...,  1, 12, 25]), 'info': {'request': (12, 25), 'acceptance': True}}
{'iter': 0, 'episode': 0, 'step': 2, 'oldState': array([10, 10, 10, ...,  1, 12, 25]), 'action': 27, 'reward': 21.002518546845437, 'newState': array([10, 10, 10, ...,  1, 61, 57]), 'info': {'request': (61, 57), 'acceptance': True}}
...

In [17]:
fig_path = '../figures/'
fig_name = 'rideshare_'+'_line_plot'+'.pdf'
or_suite.plots.plot_line_plots(path_list_line, algo_list_line, fig_path, fig_name, int(nEps / 40)+1)

additional_metric = {'ACPT': lambda traj : or_suite.utils.acceptance_rate(traj, lambda x, y : lengths[x,y]),'MN': lambda traj : or_suite.utils.mean_dispatch_dist(traj, lambda x, y : lengths[x,y]),'VAR': lambda traj : or_suite.utils.var_dispatch_dist(traj, lambda x, y : lengths[x,y])}

graph = nx.Graph(CONFIG['edges'])
lengths = or_suite.envs.ridesharing.rideshare_graph.RideshareGraphEnvironment.find_lengths(rideshare_env, graph, graph.number_of_nodes())

fig_name = 'rideshare_'+'_'+'_radar_plot'+'.pdf'
or_suite.plots.plot_radar_plots(path_list_radar, algo_list_radar,
fig_path, fig_name,
additional_metric
)

        Algorithm     Reward      Time     Space   ACPT        MN       VAR
0  maxweightfixed  50.307349  5.570363 -60102.56  0.988 -0.692456 -0.524250
1      closestcar  54.039070  5.518143 -64975.48  0.996 -0.000000 -0.000000
2       randomcar  25.356206  5.438172 -63006.92  0.624 -4.008835 -5.551915
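
The three result tables can be compared side by side with a few lines of Python. The reward values below are copied from the Reward columns printed in this notebook; `best_by_config` and `gap_vs_closest` are names introduced here for illustration. Note that these runs use `maxweightfixed` with all weights set to 1, so the comparison says nothing about a tuned $\alpha$.

```python
# Reward columns copied from the three result tables in this notebook.
rewards = {
    "K4": {"maxweightfixed": 923.6, "closestcar": 1263.6, "randomcar": 769.6},
    "two cities": {"maxweightfixed": 533.2, "closestcar": 524.8, "randomcar": 227.2},
    "New York": {"maxweightfixed": 50.307349, "closestcar": 54.039070, "randomcar": 25.356206},
}

# Agent with the highest reward in each configuration.
best_by_config = {name: max(table, key=table.get) for name, table in rewards.items()}

# Relative reward of maxweightfixed compared with closestcar.
gap_vs_closest = {
    name: table["maxweightfixed"] / table["closestcar"] - 1
    for name, table in rewards.items()
}

for name in rewards:
    print(f"{name}: best={best_by_config[name]}, "
          f"maxweightfixed vs closestcar: {gap_vs_closest[name]:+.1%}")
```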
