The algorithm has recently been moved to https://github.com/jjkke88/RL_toolbox

trpo

Trust region policy optimization (TRPO), based on OpenAI Gym and TensorFlow.

There are three versions of trpo: one for discrete action spaces such as MountainCar, one for discrete action space tasks with images as input such as Atari games, and one for continuous action spaces such as Pendulum.
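As a rough illustration of how the discrete and continuous versions differ, the sketch below samples an action and its log-probability from a categorical distribution (discrete case) and from a diagonal Gaussian (continuous case). It uses only NumPy; the function names `sample_discrete` and `sample_continuous` are hypothetical and are not taken from this repository.

```python
import numpy as np

rng = np.random.default_rng(0)


def sample_discrete(logits):
    """Sample from a categorical policy given unnormalized scores (hypothetical helper)."""
    shifted = logits - logits.max()              # shift for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    action = rng.choice(len(probs), p=probs)
    return action, np.log(probs[action])


def sample_continuous(mean, log_std):
    """Sample from a diagonal Gaussian policy (hypothetical helper)."""
    std = np.exp(log_std)
    action = mean + std * rng.standard_normal(mean.shape)
    # Log-density of a diagonal Gaussian, summed over action dimensions.
    log_prob = -0.5 * np.sum(
        ((action - mean) / std) ** 2 + 2.0 * log_std + np.log(2.0 * np.pi)
    )
    return action, log_prob


# Three discrete actions (as in MountainCar) and one continuous action (as in Pendulum).
a_disc, logp_disc = sample_discrete(np.array([0.1, 0.5, -0.2]))
a_cont, logp_cont = sample_continuous(np.array([0.0]), np.array([-0.5]))
```

In both cases the policy update needs only the log-probability of the sampled action, which is why the two versions can share most of the surrounding code.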

The environment is based on OpenAI Gym.

Parts of the code refer to rllab.

dependencies

tensorflow 0.10
prettytensor
latest openai gym

code structure

baseline: estimation of the baseline function
checkpoint: folder for storing model files; do not delete it, or errors will occur
distribution: distribution base class; it can be used to calculate probabilities under a distribution, for example a Gaussian
logger: contains a Logger class for logging data to a .csv file
agent: agents for discrete and continuous action spaces
log: stores log files
experiment: contains several main files; running a main file starts training or testing
environment.py: the environment
krylov.py: implementations of some math methods: conjugate gradient, calculating the Hessian matrix
parameters.py: config file
utils.py: implementations of some basic functions: getFlat, setFlat, lineaSearch
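The two numerical pieces named above, conjugate gradient and line search, can be sketched in a few lines of NumPy. This is a minimal, self-contained illustration on a toy quadratic, not the code in krylov.py or utils.py; the names `conjugate_gradient`, `backtracking_line_search`, and `matvec` are assumptions made for this example.

```python
import numpy as np


def conjugate_gradient(matvec, b, iters=10, tol=1e-10):
    """Approximately solve A x = b using only products A @ v (hypothetical helper)."""
    x = np.zeros_like(b)
    r = b.copy()              # residual b - A x, with x starting at zero
    p = b.copy()              # current search direction
    rs_old = r @ r
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


def backtracking_line_search(f, x, full_step, max_backtracks=10):
    """Halve the step until f decreases; return x unchanged if no step helps."""
    f_old = f(x)
    for k in range(max_backtracks):
        x_new = x + (0.5 ** k) * full_step
        if f(x_new) < f_old:
            return x_new
    return x


# Toy problem: A stands in for the curvature matrix, g for the gradient.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
g = np.array([1.0, 2.0])
step = conjugate_gradient(lambda v: A @ v, g)


def f(x):
    # Quadratic whose minimizer solves A x = g.
    return 0.5 * x @ A @ x - g @ x


x0 = np.zeros(2)
x1 = backtracking_line_search(f, x0, 3.0 * step)   # deliberately overshoot
```

Conjugate gradient only needs matrix-vector products, so the full matrix never has to be formed or inverted; the line search then shrinks the proposed step until the objective actually improves.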

recent work

implemented multi-threaded trpo; run python main_multi_thread.py to try it
implemented distributed trpo with tensorflow
implemented multi-process trpo

future work

complete the trpo version that takes images as input
