petroolg/risk-aware-rl

Risk-aware reinforcement learning

The repository contains scripts for safe reinforcement learning in an autonomous driving environment. It includes an MDP implementation of the driving environment, a modifiable reward function specification, and optional empirical model learning.

Policy initialization techniques

The module contains implementations of two initialization algorithms:

  • Behavioral Cloning (BC)
  • Generative Adversarial Imitation Learning (GAIL) [1]
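As a rough illustration of what Behavioral Cloning does, the sketch below fits a tabular policy by counting which action the expert took in each state. It is not the repository's implementation: the function names and the in-memory trajectory layout (lists of `(state, action)` pairs) are assumptions made for the example.

```python
from collections import Counter, defaultdict


def fit_bc_policy(trajectories):
    """Fit a tabular policy by counting expert actions per state.

    `trajectories` is a list of trajectories, each a list of (state, action)
    pairs with hashable states. This layout is hypothetical and is not the
    repository's on-disk format.
    """
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for state, action in trajectory:
            counts[state][action] += 1
    # Maximum-likelihood policy: relative frequency of each action per state.
    return {
        state: {a: n / sum(c.values()) for a, n in c.items()}
        for state, c in counts.items()
    }


def greedy_action(policy, state):
    """Return the most frequent expert action for a state seen in the data."""
    return max(policy[state], key=policy[state].get)


# Toy demonstrations (made up for the example).
demos = [
    [("s0", "left"), ("s1", "right")],
    [("s0", "left"), ("s1", "left")],
    [("s0", "right"), ("s1", "right")],
]
policy = fit_bc_policy(demos)
```

With continuous or high-dimensional states, the counting step would typically be replaced by a supervised classifier or regressor trained on the same pairs.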

Risk-aware reinforcement learning

The module contains implementations of two reinforcement learning algorithms:

  • Q-learning with risk-directed exploration [2]
  • Policy Gradient with variance constraint [3]
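One way to fold a variance constraint into an objective is a quadratic penalty on the amount by which the return variance exceeds a threshold, in the spirit of [3]. The sketch below only evaluates such an objective on sampled returns. The function name, the threshold `b` and the penalty weight `lam` are illustrative assumptions; whether the repository's script uses this exact form is not stated here.

```python
import numpy as np


def variance_penalized_objective(returns, b, lam):
    """Mean return minus a quadratic penalty on variance above threshold b.

    Hypothetical helper for illustration; not taken from the repository.
    """
    returns = np.asarray(returns, dtype=float)
    mean_return = returns.mean()
    variance = returns.var()
    # The penalty is active only when the variance constraint is violated.
    violation = max(0.0, variance - b)
    return mean_return - lam * violation ** 2


# Two sets of returns with the same mean but different spread.
safe = variance_penalized_objective([1.0, 1.0, 1.0, 1.0], b=0.5, lam=10.0)
risky = variance_penalized_objective([0.0, 2.0, 0.0, 2.0], b=0.5, lam=10.0)
```

Both samples have mean return 1, but the second has variance 1, which exceeds `b`, so it scores lower. A risk-neutral objective corresponds to `lam = 0`.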

Get started

The code is compatible with Python 3.6. Install the requirements:

pip install -r requirements.txt

Policy initialization

Policy initialization algorithms need expert demonstrations to run. The repository includes 350 manually sampled trajectories, stored in the folder policy_initialization/trajectories.

To run the BC script from the folder policy_initialization/:
python BC_agent.py --tp trajectories

To run the GAIL script from the folder policy_initialization/:
python GAIL_agent.py --tp trajectories

Both scripts report performance metrics as graphs and/or text.

Risk-aware reinforcement learning

To run Q-learning in model-free mode* from the folder risk_aware_rl/:
python q_learning_agent.py
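For orientation, "model-free" means the agent updates its value estimates directly from sampled transitions, without a learned transition model. A minimal sketch of the standard tabular Q-learning update follows. It shows the textbook rule only: it omits the risk-directed exploration of [2], and the function name, table shape and hyperparameters are assumptions rather than the script's actual code.

```python
import numpy as np


def q_update(q, state, action, reward, next_state,
             alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning update in place and return the table."""
    # Bootstrap from the best next-state value unless the episode has ended.
    target = reward if done else reward + gamma * q[next_state].max()
    q[state, action] += alpha * (target - q[state, action])
    return q


# Toy table with 3 states and 2 actions (sizes chosen for the example).
q = np.zeros((3, 2))
q = q_update(q, state=0, action=1, reward=1.0, next_state=1)
q = q_update(q, state=1, action=0, reward=0.0, next_state=0)
```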

To run Policy Gradient in risk-neutral mode* from the folder risk_aware_rl/:
python policy_gradient_agent.py


Implementations are based on:
[1] Ho, Jonathan and Ermon, Stefano. “Generative Adversarial Imitation Learning”. In: (June 2016). eprint: 1606.03476. url: https://arxiv.org/pdf/1606.03476.
[2] Law, Edith L. M. “Risk-directed Exploration in Reinforcement Learning”. MA thesis. Montreal, Quebec: McGill University, Feb. 2005.
[3] Di Castro, Dotan, Tamar, Aviv, and Mannor, Shie. “Policy Gradients with Variance Related Risk Criteria”. In: (June 2012). eprint: 1206.6404. url: https://arxiv.org/pdf/1206.6404.


* For more modes and parameters, see the script's help:
python <script_name> -h
