Risk-Sensitive Stochastic Optimal Control as Rao-Blackwellized Markovian Score Climbing

Implements a policy optimization technique for risk-sensitive stochastic optimal control via Rao-Blackwellized Markovian score climbing.

Installation

Create and activate a conda environment, replacing NAME with an environment name of your choice:

conda create -n NAME python=3.10
conda activate NAME

Then change into the cloned repository and install the package in editable mode:

pip install -e .

Run a policy learning example on a simple pendulum environment:

python examples/feedback/rb_csmc_pendulum.py
