optExpect

For optimizing expectations

This is an attempt at building an RL framework that takes advantage of Haskell's parallelization capabilities to run a large number of actors in parallel.

Implementations of Vanilla Policy Gradients and Proximal Policy Optimization are in progress, built on the primitives for feed-forward networks and automatic differentiation (AD) provided by concat. Both can be found in the PPO module.
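As a rough illustration of the objective that PPO optimizes, here is a minimal sketch of the clipped surrogate in plain Haskell. It is not the code from the PPO module and does not use concat; the names clippedTerm and ppoObjective, and the batch layout of (new log-prob, old log-prob, advantage) triples, are assumptions made for this example.

```haskell
-- Hypothetical sketch of the PPO clipped surrogate; not the PPO module's code.

-- | Clipped surrogate term for a single sample.
--   ratio = pi_new(a|s) / pi_old(a|s), adv = advantage estimate.
clippedTerm :: Double -> Double -> Double -> Double
clippedTerm eps ratio adv = min (ratio * adv) (clamp ratio * adv)
  where
    -- Restrict the probability ratio to [1 - eps, 1 + eps].
    clamp = max (1 - eps) . min (1 + eps)

-- | Mean clipped surrogate over a batch of
--   (log-prob under new policy, log-prob under old policy, advantage).
--   This quantity is maximized with respect to the new policy's parameters.
ppoObjective :: Double -> [(Double, Double, Double)] -> Double
ppoObjective _ [] = 0
ppoObjective eps batch = sum terms / fromIntegral (length terms)
  where
    terms =
      [ clippedTerm eps (exp (lpNew - lpOld)) adv
      | (lpNew, lpOld, adv) <- batch
      ]
```

With eps = 0.2, a sample whose ratio has grown to 1.5 and whose advantage is positive contributes only 1.2 times the advantage, so there is no incentive to push the policy further from the old one on that sample.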

Parametric policies (Gaussian and categorical) can be found in the Policy module. These use concat's very general formulation of parameters and rely on monad-bayes for sampling and for calculating probabilities.
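To make the two policy families concrete, the following sketch computes the log-probabilities they assign to an action, using only base. It deliberately avoids the monad-bayes and concat APIs, so the function names here (gaussianLogProb, softmax, categoricalLogProb) are illustrative assumptions rather than what the Policy module exports.

```haskell
-- Hypothetical sketch of policy log-probabilities; not the Policy module's code.

-- | Log-density of a univariate Gaussian policy with mean mu and
--   standard deviation sigma (assumed positive), evaluated at action x.
gaussianLogProb :: Double -> Double -> Double -> Double
gaussianLogProb mu sigma x =
  negate (0.5 * z * z) - log sigma - 0.5 * log (2 * pi)
  where
    z = (x - mu) / sigma

-- | Numerically stable softmax over a non-empty list of logits.
softmax :: [Double] -> [Double]
softmax logits = map (/ total) exps
  where
    m     = maximum logits                  -- subtract the max to avoid overflow
    exps  = map (\l -> exp (l - m)) logits
    total = sum exps

-- | Log-probability of the action with index a under a categorical
--   policy parameterized by logits. The index is assumed to be in range.
categoricalLogProb :: [Double] -> Int -> Double
categoricalLogProb logits a = log (softmax logits !! a)
```

These log-probabilities are what a policy-gradient loss differentiates with respect to the policy parameters (here mu, sigma, or the logits).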

Optimization currently uses SGD as concat provides it, but an implementation of Adam is also in progress to speed up optimization. This can be found here
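For reference, a single Adam update can be sketched as follows, with parameters and gradients held in plain lists. This is not the in-progress implementation referred to above; the types AdamConfig and AdamState, the helper names, and the default hyperparameters (the commonly used 1e-3, 0.9, 0.999, 1e-8) are assumptions made for this example.

```haskell
-- Hypothetical sketch of one Adam step; not the repository's implementation.

data AdamConfig = AdamConfig
  { learningRate, beta1, beta2, epsilon :: Double }

data AdamState = AdamState
  { stepCount    :: Int
  , firstMoment  :: [Double]   -- running mean of gradients
  , secondMoment :: [Double]   -- running mean of squared gradients
  }

defaultAdam :: AdamConfig
defaultAdam = AdamConfig 1e-3 0.9 0.999 1e-8

-- | Zero-initialized optimizer state for n parameters.
initAdam :: Int -> AdamState
initAdam n = AdamState 0 (replicate n 0) (replicate n 0)

-- | One descent step: returns updated parameters and optimizer state.
--   To ascend an expected return instead, pass the negated gradient.
adamStep :: AdamConfig -> AdamState -> [Double] -> [Double] -> ([Double], AdamState)
adamStep cfg (AdamState t m v) params grads = (params', AdamState t' m' v')
  where
    t'   = t + 1
    m'   = zipWith (\mi g -> beta1 cfg * mi + (1 - beta1 cfg) * g) m grads
    v'   = zipWith (\vi g -> beta2 cfg * vi + (1 - beta2 cfg) * g * g) v grads
    -- Bias correction for the zero-initialized moment estimates.
    mHat = map (/ (1 - beta1 cfg ^ t')) m'
    vHat = map (/ (1 - beta2 cfg ^ t')) v'
    params' =
      zipWith3
        (\p mh vh -> p - learningRate cfg * mh / (sqrt vh + epsilon cfg))
        params mHat vHat
```

Because of the bias correction, the very first step moves each parameter by roughly the learning rate in the direction opposite to its gradient's sign, regardless of the gradient's magnitude.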

There is also an abstraction for defining and running environments in Env. It uses streamly to generate streams of trajectories and to calculate generalized advantage estimates on them. An example of how to build environments that satisfy the constraints in the type signature of runEpisodes can be found here
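The generalized advantage estimate itself is a short backward recursion over a trajectory. The sketch below computes it on plain lists rather than streamly streams, so it only illustrates the arithmetic and is not the code in Env; the function name gae and its argument order are assumptions made for this example.

```haskell
-- Hypothetical sketch of generalized advantage estimation on lists;
-- the Env module works on streamly streams instead.

-- | gae gamma lambda rewards values lastValue
--   rewards   : r_0 .. r_{T-1}
--   values    : V(s_0) .. V(s_{T-1}), same length as rewards
--   lastValue : V(s_T), use 0 if the episode terminated
--   Returns A_0 .. A_{T-1}, where
--     delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
--     A_t     = delta_t + gamma * lambda * A_{t+1}
gae :: Double -> Double -> [Double] -> [Double] -> Double -> [Double]
gae gamma lambda rewards values lastValue =
  init (scanr step 0 (zip3 rewards values nextValues))
  where
    -- V(s_{t+1}) aligned with each time step t.
    nextValues = drop 1 values ++ [lastValue]
    -- Accumulate from the end of the trajectory backwards.
    step (r, v, v') acc = (r + gamma * v' - v) + gamma * lambda * acc
```

Setting lambda to 0 reduces the estimate to the one-step TD error, while lambda of 1 gives the discounted return minus the value baseline.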

Future Work

On the roadmap is an implementation of Stochastic Computational Graphs, perhaps built on the circuit abstraction provided by concat. Execution on GPUs is also planned, whether by swapping out the neural-network code for hasktorch or haskell/tensorflow, or by figuring out (or, more likely, waiting for) how Conal means to implement compilation to GPUs with concat.

Running

Fairly straightforward if you have stack installed:

stack build
stack run

