Example implementations for the paper "Projections for Approximate Policy Iteration Algorithms"


Example implementations of the key algorithms from the paper Projections for Approximate Policy Iteration Algorithms [1].

The file implements Alg. 2 of the paper. It takes a linear-Gaussian policy as input and projects it onto another policy whose KL divergence w.r.t. a target policy is smaller than a threshold.
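To illustrate what such a projection does, here is a minimal sketch. It is not the paper's Alg. 2: it assumes a state-independent 1-D Gaussian instead of a linear-Gaussian policy, measures KL(projected || target) (the direction is an assumption), and swaps in a plain bisection on a hypothetical interpolation coefficient eta between the target and the input policy. The names kl_gauss, project_kl, and epsilon are illustrative only.

```python
import math


def kl_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between two 1-D Gaussians given means and variances."""
    ratio = var_p / var_q
    return 0.5 * (ratio - 1.0 - math.log(ratio) + (mu_p - mu_q) ** 2 / var_q)


def project_kl(mu, var, mu_t, var_t, epsilon, iters=60):
    """Pull (mu, var) toward the target (mu_t, var_t) until KL <= epsilon.

    Assumption: mean and variance are interpolated linearly with a
    coefficient eta in [0, 1]; eta = 0 is the target, eta = 1 the input.
    Along this path the KL to the target grows monotonically with eta,
    so bisection finds the largest feasible eta.
    """
    if kl_gauss(mu, var, mu_t, var_t) <= epsilon:
        return mu, var  # already feasible, nothing to project
    lo, hi = 0.0, 1.0  # lo is always feasible, hi always infeasible
    for _ in range(iters):
        eta = 0.5 * (lo + hi)
        m = eta * mu + (1.0 - eta) * mu_t
        v = eta * var + (1.0 - eta) * var_t
        if kl_gauss(m, v, mu_t, var_t) <= epsilon:
            lo = eta
        else:
            hi = eta
    return lo * mu + (1.0 - lo) * mu_t, lo * var + (1.0 - lo) * var_t


# Input policy far from the target; project it into the KL ball.
mu_proj, var_proj = project_kl(2.0, 0.5, 0.0, 1.0, epsilon=0.1)
kl_after = kl_gauss(mu_proj, var_proj, 0.0, 1.0)
```

In this sketch a policy that already satisfies the constraint is returned unchanged, and an infeasible one lands on the boundary of the KL ball.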

The file implements a policy with an embedded strict entropy inequality constraint, which ensures that the entropy of the policy never drops below a threshold. The code can easily be extended to enforce a strict entropy equality constraint instead by replacing self.chol = tf.cond(ent < tent, lambda: self.chol * tf.exp((tent - ent) / act_dim), lambda: self.chol) with self.chol = self.chol * tf.exp((tent - ent) / act_dim).
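The rescaling works because the entropy of a Gaussian with Cholesky factor L is 0.5 * act_dim * log(2 * pi * e) plus the sum of the logs of L's diagonal, so multiplying L by exp((tent - ent) / act_dim) shifts the entropy by exactly tent - ent. The following is a minimal sketch of that step in plain Python rather than TensorFlow; gaussian_entropy, project_entropy, and the equality flag are hypothetical names, and the Cholesky factor is assumed to be a lower-triangular list of rows with a positive diagonal.

```python
import math


def gaussian_entropy(chol):
    """Entropy of N(mu, L L^T) from its lower-triangular Cholesky factor L."""
    act_dim = len(chol)
    log_diag_sum = sum(math.log(chol[i][i]) for i in range(act_dim))
    return 0.5 * act_dim * math.log(2.0 * math.pi * math.e) + log_diag_sum


def project_entropy(chol, tent, equality=False):
    """Rescale chol so the entropy is at least tent (or exactly tent).

    Mirrors the tf.cond expression in the text: the factor is scaled only
    when the entropy is below the target, unless equality=True.
    """
    act_dim = len(chol)
    ent = gaussian_entropy(chol)
    if equality or ent < tent:
        scale = math.exp((tent - ent) / act_dim)
        return [[scale * v for v in row] for row in chol]
    return chol


tent = 0.0  # target entropy (illustrative value)

# Low-entropy policy: the inequality projection lifts it to tent.
narrow = [[0.1, 0.0], [0.05, 0.2]]
lifted = project_entropy(narrow, tent)

# High-entropy policy: unchanged under the inequality constraint,
# but pulled down to tent under the equality variant.
wide = [[1.0, 0.0], [0.0, 1.0]]
kept = project_entropy(wide, tent)
pinned = project_entropy(wide, tent, equality=True)
```

Under these assumptions, gaussian_entropy(lifted) and gaussian_entropy(pinned) both equal tent up to floating-point error, while kept is returned as is.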


[1] Akrour, R.; Pajarinen, J.; Neumann, G.; Peters, J. (2019). Projections for Approximate Policy Iteration Algorithms. Proceedings of the International Conference on Machine Learning (ICML).
