
ESBDQN: E-scooter Smart Balancing DQN

This repo contains the JAX implementation, based on DeepMind's DQN library, of our WOA 2021 paper "Smart Balancing of E-scooter Sharing Systems via Deep Reinforcement Learning", developed by Federico Minutoli, Gianvito Losapio, Viviana Mascardi, and Angelo Ferrando at the University of Genoa.

Please note: This repo uses a custom version of the ODySSEUS simulator developed by the SmartData lab at PoliTO. More information on the changes we have made can be found in the paper.

Usage guide

Follow these instructions to try out our code.

Data

Aggregated data for the municipality of Louisville are freely available for download here under an unspecified license. To run our experiments, we disaggregated those data following the method proposed here.

Disaggregated data for the municipality of Louisville are available for download here on MEGA.

  • Unzip the folder and put its contents within odysseus\city_data_manager\data.

The disaggregated demand model of Louisville is likewise available for download here on MEGA.

  • Unzip the folder and put its contents within odysseus\demand_modelling\demand_models.
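The two archives can also be extracted programmatically. A minimal sketch, assuming the archives were downloaded into the repository root as louisville_data.zip and louisville_demand_models.zip (hypothetical filenames; use the names of the files downloaded from MEGA):

```python
import zipfile
from pathlib import Path

REPO_ROOT = Path(".")  # adjust to the repository root

# Hypothetical archive names mapped to the target folders named above
ARCHIVES = {
    "louisville_data.zip": REPO_ROOT / "odysseus" / "city_data_manager" / "data",
    "louisville_demand_models.zip": REPO_ROOT / "odysseus" / "demand_modelling" / "demand_models",
}

for archive, destination in ARCHIVES.items():
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)  # archive contents land inside the target folder
```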

Data and demand model for the municipality of Austin are available upon request.

Input parameters

Input parameters cover both the simulation parameters of the ODySSEUS environment and the parameters for training/testing the Rainbow agents.

Simulation parameters

Simulation parameters can be found in the folder esbdqn\configs\escooter_mobility, in files named sim_conf_<City>.py that share an identical structure. Each file comprises two Python objects, named General and Multiple_runs; a trimmed sketch of the file layout follows the parameter lists below.

General object

  • city, name of the city, either Louisville or Austin;
  • relocation_workers_working_hours, shift hours for relocation workers;
  • bin_side_length, side length of the square zones each operative area is split into;
  • year, year of the trip requests to consider;
  • month_start, month_end, first and last months of the trip requests to consider;
  • day_start, day_end, first and last days of the trip requests to consider;
  • save_history, whether to save the results CSV after each iteration.

Multiple_runs object

  • n_vehicles, number of vehicles to spawn in the environment;
  • incentive_willingness, acceptance probability for each incentive proposal;
  • alpha, threshold on the battery level below which vehicles are marked as out-of-charge, as a percentage between 0 and 100;
  • battery_swap, toggle for battery swap events in the environment, either True or False;
  • n_workers, number of battery swap workers;
  • battery_swap_capacity, maximum number of vehicles each battery swap worker can process hourly;
  • scooter_relocation, toggle for relocation events in the environment, either True or False;
  • n_relocation_workers, number of relocation workers;
  • relocation_capacity, maximum number of vehicles each relocation worker can move hourly;

All parameters that are unchanged or unused with respect to the original ODySSEUS simulator have been omitted above.
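For orientation, here is a trimmed sketch of what a sim_conf_<City>.py file might look like, restricted to the parameters listed above. The dict representation, the values, and the units are placeholders (the actual files follow the original ODySSEUS conventions, including which entries are lists of values to sweep); consult esbdqn\configs\escooter_mobility for the real definitions.

```python
# Trimmed sketch of esbdqn/configs/escooter_mobility/sim_conf_Louisville.py
# Values are placeholders, not the ones used in the paper.

General = {
    "city": "Louisville",
    "relocation_workers_working_hours": [6, 22],  # shift hours for relocation workers
    "bin_side_length": 200,        # side length of the square zones (units as in ODySSEUS)
    "year": 2019,
    "month_start": 9,
    "month_end": 10,
    "day_start": 1,
    "day_end": 30,
    "save_history": True,          # save the results CSV after each iteration
}

Multiple_runs = {
    "n_vehicles": 400,
    "incentive_willingness": 1.0,  # acceptance probability of each incentive proposal
    "alpha": 20,                   # battery threshold (%) marking vehicles as out-of-charge
    "battery_swap": True,
    "n_workers": 4,
    "battery_swap_capacity": 30,   # vehicles per battery swap worker per hour
    "scooter_relocation": False,
    "n_relocation_workers": 1,
    "relocation_capacity": 30,     # vehicles per relocation worker per hour
}
```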

Agent parameters

Agent parameters are defined in the file esbdqn\train.py. They can also be overridden at runtime when launching esbdqn\train.py via the CLI.

  • learning_rate, learning rate of the Adam optimizer;
  • learn_period, learning period of the Rainbow agents;
  • batch_size, batch size of the networks within the agents;
  • n_steps, how many steps to look in the past when agents take decisions;
  • max_global_grad_norm, global gradient norm clipping of the networks weights;
  • importance_sampling_exponent_begin_value, importance_sampling_exponent_end_value, range of the importance sampling exponent;
  • replay_capacity, capacity of the experience replay buffer. It should amount to about 30 repetitions of any given day (see the sizing sketch after this list);
  • priority_exponent, priority of the timesteps stored in the experience replay buffer;
  • target_network_update_period, period after which the online network is copied into the target network within each agent;
  • num_iterations, number of training iterations;
  • max_steps_per_episode, number of trips per episode;
  • num_eval_frames, total number of validation trips per iteration;
  • num_train_frames, total number of training trips per iteration (should be at least double the validation trips);
  • n_lives, total number of lives per iteration. Defaults to 50.
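The replay-capacity guideline and the train/validation ratio above can be made concrete with a small back-of-the-envelope computation. A minimal sketch, where max_steps_per_episode stands for the trips in one simulated day and the value 1000, as well as the choice of two validation days, are placeholders:

```python
# Back-of-the-envelope sizing of the agent parameters, following the notes above.
max_steps_per_episode = 1_000                 # placeholder: trips per simulated day

# The experience replay buffer should hold about 30 repetitions of a full day of trips.
replay_capacity = 30 * max_steps_per_episode  # -> 30_000 timesteps

# Validation trips per iteration (placeholder: two days' worth of trips).
num_eval_frames = 2 * max_steps_per_episode

# Training trips per iteration should be at least double the validation trips.
num_train_frames = 2 * num_eval_frames

print(replay_capacity, num_eval_frames, num_train_frames)
```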

Experiment parameters

Each run of esbdqn\train.py can be named as a separate experiment with its own checkpoints.

  • exp_name, name of the experiment directory.
  • checkpoint, toggle on whether to store a checkpoint, either True or False.
  • checkpoint_period, period of storage of a new training checkpoint.

Output

Run esbdqn\train.py to train a new ESBDQN model from scratch. To resume training from a checkpoint instead, set the checkpoint toggle to True and ensure that a checkpoint exists within the experiment directory in the form <Experiment_dir>\models\ODySSEUS-<City>.

Results of each run will be stored as CSV files within the automatically generated directory <Experiment_dir>\results.

To reproduce the experiments in the paper:

  • Set incentive_willingness to 0 to obtain all the No incentives data.
  • Set incentive_willingness to 1 and track the columns eval_avg_pct_satisfied_demand and train_avg_pct_satisfied_demand from the CSV files for the Validation and Training data, respectively (a minimal pandas sketch follows).
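A minimal sketch of how the two columns could be pulled out of a results file with pandas; the CSV path is a placeholder, point it at one of the files generated under <Experiment_dir>\results:

```python
import pandas as pd

# Placeholder path; use a CSV produced under <Experiment_dir>\results.
results = pd.read_csv("experiments/my_experiment/results/run_0.csv")

# Percentage of satisfied demand per iteration, for validation and training.
validation_curve = results["eval_avg_pct_satisfied_demand"]
training_curve = results["train_avg_pct_satisfied_demand"]

print(validation_curve.describe())
print(training_curve.describe())
```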

All our experiments have been run on Ubuntu 18.04.
