Policy Search with Eligibility Traces

A finite-difference-ish approach to policy gradients. It's like PGET, but it explores in parameter space instead of action space.
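To make the idea concrete, here is a minimal sketch of what exploring in parameter space with an eligibility trace could look like. It is one reading of the description above, not necessarily the code in this repo: the helper names (`perturb`, `update_trace`, `apply_update`, `train`), the running-average baseline, and the hypothetical `reward_fn` (which stands in for acting in an environment for one step with the noisy parameters) are all assumptions made for illustration.

```python
import random


def perturb(theta, sigma, rng):
    """Sample Gaussian exploration noise; return (noisy parameters, noise)."""
    noise = [rng.gauss(0.0, sigma) for _ in theta]
    return [t + n for t, n in zip(theta, noise)], noise


def update_trace(trace, noise, decay):
    """Decay the old trace and add the newest parameter perturbation."""
    return [decay * e + n for e, n in zip(trace, noise)]


def apply_update(theta, trace, advantage, lr):
    """Move the parameters along the trace, scaled by the advantage."""
    return [t + lr * advantage * e for t, e in zip(theta, trace)]


def train(theta, reward_fn, steps=1000, sigma=0.1, decay=0.9, lr=0.05, seed=0):
    """Hypothetical training loop: only forward evaluations, no gradient tape."""
    rng = random.Random(seed)
    trace = [0.0] * len(theta)
    baseline = 0.0  # running average of the reward (an assumed variance reducer)
    for _ in range(steps):
        noisy_theta, noise = perturb(theta, sigma, rng)
        reward = reward_fn(noisy_theta)  # assumed: one environment step
        trace = update_trace(trace, noise, decay)
        theta = apply_update(theta, trace, reward - baseline, lr)
        baseline += 0.1 * (reward - baseline)
    return theta
```

The trace here accumulates recent parameter perturbations rather than action log-probability gradients, so a reward can credit the noise that preceded it without any backward pass. A call might look like `train([1.0, -1.0], lambda p: -sum(x * x for x in p))`, where the lambda is a toy stand-in for a real environment.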

Why?

Because why search in action space and then perform gradient descent, which requires an expensive gradient tape/graph, when you can just search in parameter space instead?

(Because it's easier to search in action space than in parameter space. It's a method worth exploring regardless.)