Hierarchical Online Planning and Reinforcement Learning on Taxi



This release contains code for two projects:

  • The MAXQ-based hierarchical online planning algorithm: MAXQ-OP
  • The HAMQ-based hierarchical reinforcement learning algorithm: HAMQ-INT

Taxi domain:

[Figure: taxi.png, the Taxi domain]
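The README does not spell out the state space, so the following is only a minimal sketch, assuming the classic Taxi formulation: a 5x5 grid, four landmarks, and a passenger who is either waiting at a landmark or riding in the taxi. The names `TaxiState`, `Encode`, and `Decode` are hypothetical and are not taken from taxi.{h, cpp}.

```cpp
#include <cassert>

// Hypothetical compact Taxi state; NOT the repository's State class.
struct TaxiState {
  int row;          // taxi row, 0..4
  int col;          // taxi column, 0..4
  int passenger;    // 0..3 = waiting at a landmark, 4 = inside the taxi
  int destination;  // 0..3 = destination landmark
};

constexpr int kGridSize = 5;        // assumed 5x5 grid
constexpr int kPassengerSlots = 5;  // four landmarks plus "in taxi"
constexpr int kLandmarks = 4;
constexpr int kNumStates =
    kGridSize * kGridSize * kPassengerSlots * kLandmarks;

// Map a state to a dense index in [0, kNumStates), e.g. for a tabular
// value function.
inline int Encode(const TaxiState& s) {
  assert(s.row >= 0 && s.row < kGridSize);
  assert(s.col >= 0 && s.col < kGridSize);
  assert(s.passenger >= 0 && s.passenger < kPassengerSlots);
  assert(s.destination >= 0 && s.destination < kLandmarks);
  return ((s.row * kGridSize + s.col) * kPassengerSlots + s.passenger) *
             kLandmarks +
         s.destination;
}

// Inverse of Encode.
inline TaxiState Decode(int index) {
  assert(index >= 0 && index < kNumStates);
  TaxiState s;
  s.destination = index % kLandmarks;
  index /= kLandmarks;
  s.passenger = index % kPassengerSlots;
  index /= kPassengerSlots;
  s.col = index % kGridSize;
  index /= kGridSize;
  s.row = index;
  return s;
}
```

Under these assumptions the state space has 500 entries, which is small enough for the tabular V/Q functions listed under Files.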

Overall results:

[Figure: data/reward.png, reward curves]

Averaged over 200 runs.

HAMQ-INT

HAMQ-INT identifies and exploits internal transitions within a hierarchy of abstract machines (HAM), which is represented as a partial program, to make hierarchical reinforcement learning more efficient. Details can be found in:

  • Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions, Aijun Bai and Stuart Russell, Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, Australia, August 19-25, 2017.
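To make the idea of exploiting internal transitions concrete, here is a minimal sketch. It assumes that an internal transition is a deterministic, zero-reward move between two choice points that leaves the environment state unchanged, so the Q-value of the choice can be read from the value of the next choice point instead of being learned separately. The class `InternalTransitionQ` and its methods are hypothetical and do not mirror HierarchicalFSMAgent.{h, cpp}.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical types for illustration; NOT the repository's classes.
using JointState = std::pair<int, int>;        // (environment state id, machine state id)
using ChoiceKey = std::pair<JointState, int>;  // (joint state, choice index)

class InternalTransitionQ {
 public:
  // Declare the choices available at a choice point.
  void SetChoices(const JointState& s, std::vector<int> choices) {
    choices_[s] = std::move(choices);
  }

  // Cache a discovered internal transition: taking `choice` at `s` leads,
  // with no environment step and no reward, to the choice point `next`.
  void RecordInternal(const JointState& s, int choice, const JointState& next) {
    assert(s.first == next.first);  // environment state must be unchanged
    internal_[ChoiceKey{s, choice}] = next;
  }

  // Store a learned Q-value for a choice that is not an internal transition.
  void SetQ(const JointState& s, int choice, double value) {
    q_[ChoiceKey{s, choice}] = value;
  }

  // Q(s, c): short-circuit through a cached internal transition if one
  // exists, otherwise fall back to the stored value (default 0).
  // Assumption: cached internal transitions contain no cycles.
  double Q(const JointState& s, int choice) const {
    auto it = internal_.find(ChoiceKey{s, choice});
    if (it != internal_.end()) return V(it->second);
    auto qit = q_.find(ChoiceKey{s, choice});
    return qit == q_.end() ? 0.0 : qit->second;
  }

  // V(s): best Q-value over the declared choices; 0 if none are declared.
  double V(const JointState& s) const {
    auto it = choices_.find(s);
    if (it == choices_.end() || it->second.empty()) return 0.0;
    double best = Q(s, it->second.front());
    for (int c : it->second) best = std::max(best, Q(s, c));
    return best;
  }

 private:
  std::map<JointState, std::vector<int>> choices_;
  std::map<ChoiceKey, JointState> internal_;
  std::map<ChoiceKey, double> q_;
};
```

In this sketch, any value learned at the downstream choice point is immediately visible at every upstream choice that reaches it through internal transitions, which is the intuition behind the efficiency gain. The algorithm itself is specified in the paper cited above.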

MAXQ-OP

This is the code release of the MAXQ-OP algorithm on the Taxi domain, as described in the accompanying papers.

Files

  • maxqop.{h, cpp}: the MAXQ-OP algorithm
  • HierarchicalFSMAgent.{h, cpp}: the HAMQ-INT algorithm
  • MaxQ0Agent.{h, cpp}: the MAXQ-0 algorithm
  • MaxQQAgent.{h, cpp}: the MAXQ-Q algorithm
  • agent.h: abstract Agent class
  • state.{h, cpp}: abstract State class
  • policy.{h, cpp}: Policy classes
  • taxi.{h, cpp}: the Taxi domain
  • system.{h, cpp}: agent-environment driver code
  • table.h: tabular V/Q functions
  • dot_graph.{h, cpp}: tools to generate graphviz dot files
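As a rough illustration of what a tabular Q function involves, the sketch below implements a generic one-step Q-learning table. It is written under the assumption of integer state and action indices; the class `QTable` and its interface are hypothetical, and it shows plain Q-learning, not the MAXQ-0, MAXQ-Q, or HAMQ-INT updates or the actual contents of table.h.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical tabular Q function; the interface of table.h may differ.
class QTable {
 public:
  QTable(int num_actions, double alpha, double gamma)
      : num_actions_(num_actions), alpha_(alpha), gamma_(gamma) {
    assert(num_actions > 0);
  }

  // Q(state, action); unseen entries default to 0.
  double Get(int state, int action) const {
    assert(action >= 0 && action < num_actions_);
    auto it = rows_.find(state);
    return it == rows_.end() ? 0.0 : it->second[action];
  }

  // max_a Q(state, a); 0 for states that were never updated.
  double MaxQ(int state) const {
    auto it = rows_.find(state);
    if (it == rows_.end()) return 0.0;
    return *std::max_element(it->second.begin(), it->second.end());
  }

  // One-step Q-learning update:
  //   Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)),
  // with the bootstrap term dropped when s' is terminal.
  void Update(int state, int action, double reward, int next_state,
              bool terminal) {
    assert(action >= 0 && action < num_actions_);
    const double target =
        reward + (terminal ? 0.0 : gamma_ * MaxQ(next_state));
    std::vector<double>& row = rows_[state];
    if (row.empty()) row.assign(num_actions_, 0.0);
    row[action] += alpha_ * (target - row[action]);
  }

 private:
  int num_actions_;
  double alpha_;
  double gamma_;
  std::unordered_map<int, std::vector<double>> rows_;
};
```

A driver loop in the spirit of system.{h, cpp} would call `Update` once per environment step with the observed reward and successor state.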

Dependencies

  • libboost-dev
  • libboost-program-options-dev
  • gnuplot

Related Projects