Algorithms
The toolbox provides a collection of algorithms for Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) with resource constraints. This page gives an overview of the high-level architecture of the algorithms and describes the specific algorithms that have been implemented.
explain input and output
The column generation algorithm is a conditional preallocation method introduced by Yost and Washburn (2000). The algorithm iteratively generates policies and adds the corresponding columns to a linear program. From the solution of this linear program it derives a probability distribution over deterministic policies, which it returns as the solution.
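To illustrate what a probability distribution over deterministic policies means in practice, the following minimal Python sketch draws one deterministic policy before execution starts. It is not part of the toolbox API: the function `sample_policy`, the dictionary representation of a policy, and the example policies are hypothetical names chosen for illustration only.

```python
import random


def sample_policy(policies, probabilities, rng=random):
    """Draw one deterministic policy according to the given distribution."""
    if len(policies) != len(probabilities):
        raise ValueError("need exactly one probability per policy")
    if any(p < 0 for p in probabilities) or abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    # choices() draws with replacement according to the weights; k=1 gives one draw.
    return rng.choices(policies, weights=probabilities, k=1)[0]


# Hypothetical example: each deterministic policy maps (time step, state) to an action.
policy_a = {(0, "s0"): "work", (1, "s0"): "work"}
policy_b = {(0, "s0"): "idle", (1, "s0"): "idle"}

# A seeded generator makes the draw reproducible.
chosen = sample_policy([policy_a, policy_b], [0.25, 0.75], random.Random(0))
print(chosen)
```

The sampled policy is then followed for the whole horizon. Resource constraints hold in expectation over the draw, not necessarily for each individual policy.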
Class: algorithms.mdp.colgen.ColGenFiniteHorizon
Supported constraints: budget, instantaneous
Parameters: tolerance, used to determine whether the dual prices of the linear program have converged
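This page does not state the exact convergence test used by the toolbox. As a rough sketch of how a tolerance parameter can be applied, the Python snippet below assumes that the dual prices have converged when no price changes by more than the tolerance between two consecutive iterations. The function name `duals_converged` and the example values are hypothetical.

```python
def duals_converged(previous, current, tolerance):
    """Return True if no dual price moved by more than `tolerance`."""
    if len(previous) != len(current):
        raise ValueError("dual price vectors must have the same length")
    return all(abs(p - c) <= tolerance for p, c in zip(previous, current))


# Hypothetical dual prices from two consecutive iterations.
print(duals_converged([1.50, 0.20], [1.50, 0.2000004], tolerance=1e-6))  # True
print(duals_converged([1.50, 0.20], [1.45, 0.20], tolerance=1e-6))       # False
```

A smaller tolerance generally means more iterations before the algorithm stops.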
The linear program for Constrained MDPs defines a stochastic policy which represents a conditional preallocation. The algorithm itself only initializes the linear program and, once the linear program has been solved, derives the stochastic policy from the solution. More information about linear programs for Constrained MDPs can be found in a book by Altman (1999).
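For orientation, a common textbook form of such a linear program for a single finite-horizon MDP with one budget constraint is sketched below. The notation is an assumption made for this sketch and does not come from the toolbox: horizon h, states S, actions A, reward R, resource consumption C, transition function P, initial state distribution b, and resource limit L. The variable x_{t,s,a} is the probability of being in state s and executing action a at time t. The formulation used in the toolbox may differ in its details.

```latex
\begin{align*}
\max_{x \ge 0} \quad
  & \sum_{t=1}^{h} \sum_{s \in S} \sum_{a \in A} R(s,a)\, x_{t,s,a} \\
\text{s.t.} \quad
  & \sum_{a \in A} x_{1,s,a} = b(s)
    && \forall s \in S, \\
  & \sum_{a \in A} x_{t+1,s',a}
    = \sum_{s \in S} \sum_{a \in A} P(s' \mid s,a)\, x_{t,s,a}
    && \forall s' \in S,\; t = 1,\dots,h-1, \\
  & \sum_{t=1}^{h} \sum_{s \in S} \sum_{a \in A} C(s,a)\, x_{t,s,a} \le L .
\end{align*}
```

The last row bounds the expected total consumption over the horizon. A limit that applies to each time step separately would instead give one such row per time step t, without the sum over t.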
Class: algorithms.mdp.constrainedmdp.ConstrainedMDPFiniteHorizon
Supported constraints: budget, instantaneous
Parameters: none
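Deriving a stochastic policy from the solution of such a linear program is usually a normalization step. The Python sketch below assumes that the solution is available as values x[t][s][a], the probability of being in state s and executing action a at time t, and divides by the probability of being in state s at time t. The nested-dictionary layout and the function name `policy_from_occupancy` are hypothetical and do not reflect the data structures of the toolbox.

```python
def policy_from_occupancy(x):
    """Normalize x[t][s][a] over actions to obtain action probabilities.

    x is a nested dict: time step -> state -> action -> occupancy value.
    A state with zero total occupancy is never reached, so any action
    distribution is valid there; this sketch uses a uniform one.
    """
    policy = {}
    for t, states in x.items():
        policy[t] = {}
        for s, actions in states.items():
            total = sum(actions.values())
            if total > 0:
                policy[t][s] = {a: v / total for a, v in actions.items()}
            else:
                policy[t][s] = {a: 1.0 / len(actions) for a in actions}
    return policy


# Hypothetical LP solution for two time steps; values per time step sum to 1.
x = {
    0: {"s0": {"work": 0.25, "idle": 0.75}},
    1: {
        "s0": {"work": 0.1, "idle": 0.4},
        "s1": {"work": 0.5, "idle": 0.0},
        "s2": {"work": 0.0, "idle": 0.0},  # unreachable state
    },
}
policy = policy_from_occupancy(x)
print(policy[1]["s0"])
```

In this example the agent reaches state s0 at time step 1 with probability 0.5 and then chooses "work" with probability 0.1 / 0.5.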
The deterministic preallocation algorithm in the toolbox uses the same linear program as discussed above, except that it introduces additional binary variables to enforce that resource limits are never violated. Similar variables have been used in algorithms by Wu and Durfee (2010) and De Nijs et al. (2018).
Class: algorithms.mdp.constrainedmdp.DPFiniteHorizon
Supported constraints: budget, instantaneous
Parameters: none
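The difference between satisfying a limit in expectation and never violating it can be made concrete with a small Python sketch. It does not show the binary variables or the mixed-integer program of the toolbox. It only compares, for one time step, the expected consumption of a stochastic policy with its worst-case consumption. All names and numbers are hypothetical.

```python
def expected_consumption(x_t, cost):
    """Expected resource use at one time step, given occupancies x_t[s][a]."""
    return sum(x_t[s][a] * cost[(s, a)] for s in x_t for a in x_t[s])


def worst_case_consumption(x_t, cost):
    """Largest resource use among actions taken with positive probability."""
    used = [cost[(s, a)] for s in x_t for a in x_t[s] if x_t[s][a] > 0]
    return max(used, default=0.0)


limit = 1.0
cost = {("s0", "work"): 2.0, ("s0", "idle"): 0.0}

# A stochastic policy that works half of the time.
x_t = {"s0": {"work": 0.5, "idle": 0.5}}

print(expected_consumption(x_t, cost) <= limit)    # True: fine in expectation
print(worst_case_consumption(x_t, cost) <= limit)  # False: "work" exceeds the limit
```

A solution that must never violate the limit has to rule out the action "work" here, which is the kind of yes/no decision that binary variables can encode.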
explain input and output
- Yost, K. A., & Washburn, A. R. (2000). The LP/POMDP Marriage: Optimization with Imperfect Information. Naval Research Logistics, 47(8), 607–619.
- Altman, E. (1999). Constrained Markov Decision Processes. CRC Press.
- Wu, J., & Durfee, E. H. (2010). Resource-Driven Mission-Phasing Techniques for Constrained Agents in Stochastic Environments. Journal of Artificial Intelligence Research, 38, 415–473.
- De Nijs, F., Spaan, M. T. J., & De Weerdt, M. M. (2018). Preallocation and Planning under Stochastic Resource Constraints. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (pp. 4662–4669).
The ConstrainedPlanningToolbox has been developed by the Algorithmics group at Delft University of Technology, The Netherlands. Please visit our website for more information.