Skip to content

Algorithms

Erwin Walraven edited this page Nov 8, 2018 · 31 revisions

The toolbox provides a collection of algorithms for Markov Decision Processes and Partially Observable Markov Decision Processes with resource constraints. On this page we provide an overview of the high-level architecture of the algorithms, and we provide information about the specific algorithms that have been implemented.

Markov Decision Processes

explain input and output

Column generation

The column generation algorithm is a conditional preallocation method which has been introduced by Yost and Washburn (2000). The algorithm iteratively generates policies and adds the corresponding columns to a linear program. The algorithm derives a probability distribution over deterministic policies using this linear program, which it eventually returns as a solution.

Class: algorithms.mdp.colgen.ColGenFiniteHorizon

Supported constraints: budget, instantaneous

Parameters: tolerance which is used to determine whether the dual prices of the linear program have converged

CMDP linear program

Deterministic preallocation

Dynamic relaxation

Partially Observable Markov Decision Processes

explain input and output

CGCP

CALP

References

  • Yost, K. A., & Washburn, A. R. (2000). The LP/POMDP Marriage: Optimization with Imperfect Information. Naval Research Logistics, 47(8), 607–619.

Clone this wiki locally