Policy Search with Eligibility Traces

A finite-difference-ish approach to policy gradients. It's like PGET, but it explores in parameter space instead of action space.

Why?

Because why search in action space and then perform gradient descent, which requires an expensive gradient tape/graph, when you can just search in parameter space instead?

(Because it's easier to search in action space than in parameter space. Still, it's a method worth exploring.)
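
To make the parameter-space idea concrete, here is a minimal sketch of a generic finite-difference update. It is not this repository's implementation and it leaves out eligibility traces entirely. `episode_return`, `parameter_space_step`, the quadratic toy objective, and all hyperparameters are hypothetical and chosen only for illustration. The point is that the update only needs returns from perturbed parameters, so no gradient tape/graph is involved.

```python
import random


def episode_return(theta, target):
    """Hypothetical stand-in for the return of one episode run with parameters theta."""
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def parameter_space_step(theta, target, rng, sigma=0.1, lr=0.05, n_pairs=16):
    """One update from paired (+/-) parameter perturbations; no backprop needed."""
    grad = [0.0] * len(theta)
    for _ in range(n_pairs):
        # Explore by perturbing the parameters, not the actions.
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        plus = [t + sigma * e for t, e in zip(theta, eps)]
        minus = [t - sigma * e for t, e in zip(theta, eps)]
        # Central finite difference of the return along direction eps.
        slope = (episode_return(plus, target) - episode_return(minus, target)) / (2.0 * sigma)
        grad = [g + slope * e for g, e in zip(grad, eps)]
    # Ascend the averaged finite-difference estimate of the return gradient.
    return [t + lr * g / n_pairs for t, g in zip(theta, grad)]


rng = random.Random(0)
target = [1.0, -2.0, 0.5]  # assumed optimum of the toy objective
theta = [0.0, 0.0, 0.0]
initial_return = episode_return(theta, target)
for _ in range(200):
    theta = parameter_space_step(theta, target, rng)
final_return = episode_return(theta, target)
```

On this toy objective the return climbs toward zero as `theta` approaches `target`. In a real environment each call to `episode_return` is a full rollout, which is where the cost of searching in parameter space shows up.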
