
monika58/Mtech-Thesis-Project


M.Tech Thesis: Performance Evaluation of Simultaneous Perturbation Methods for Simulation Optimization and Policy Learning

The objective is to implement simultaneous perturbation (SP) methods for reinforcement learning (RL) problems.

We use two SP methods, simultaneous perturbation stochastic approximation (SPSA) and random directions stochastic approximation (RDSA), together with a neural-network-based function approximator. We evaluate these algorithms on common discrete and continuous control environments and compare their performance with the popular REINFORCE algorithm.
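As a rough illustration of how the two SP gradient estimates can drive a parameter update, here is a minimal NumPy sketch. It is not the code from this repository: the function names (`spsa_gradient`, `rdsa_gradient`, `maximize`), the step sizes, and the quadratic `toy_return` are all assumptions made for the example. In the thesis setting, the objective would instead be the (noisy) return of a neural-network policy.

```python
import numpy as np

def spsa_gradient(f, theta, delta, rng):
    """Two-sided SPSA estimate of the gradient of f at theta."""
    # Rademacher (+1/-1) perturbation applied to all coordinates at once.
    d = rng.choice([-1.0, 1.0], size=theta.shape)
    # Exactly two evaluations of f, regardless of the dimension of theta.
    diff = f(theta + delta * d) - f(theta - delta * d)
    return diff / (2.0 * delta * d)

def rdsa_gradient(f, theta, delta, rng):
    """Two-sided RDSA estimate using a Gaussian random direction."""
    d = rng.standard_normal(theta.shape)
    diff = f(theta + delta * d) - f(theta - delta * d)
    return diff / (2.0 * delta) * d

def maximize(f, theta0, grad_estimator, steps=500, lr=0.1, delta=0.1, seed=0):
    """Plain stochastic gradient ascent with a pluggable SP estimator."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        theta = theta + lr * grad_estimator(f, theta, delta, rng)
    return theta

def toy_return(theta):
    # Hypothetical stand-in for an episodic return; maximum at [1, -2].
    return -np.sum((theta - np.array([1.0, -2.0])) ** 2)

theta_spsa = maximize(toy_return, [0.0, 0.0], spsa_gradient)
theta_rdsa = maximize(toy_return, [0.0, 0.0], rdsa_gradient)
print("SPSA:", theta_spsa)
print("RDSA:", theta_rdsa)
```

Both estimators perturb every parameter simultaneously, so the cost per update stays at two objective evaluations however many parameters the policy has. With a real environment, each evaluation would be one or more rollouts, and constant `lr` and `delta` would typically be replaced by decaying schedules.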


The experimental studies show that SPSA i) is easy to implement, ii) takes less time to train, iii) requires only two function measurements per iteration, and iv) outperforms REINFORCE on the walking robot task.
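The two-measurement property in iii) can be read off the standard two-sided SPSA gradient estimate. The notation here is assumed for illustration and is not taken from the thesis: $J$ is the objective (for example, the expected return), $\theta \in \mathbb{R}^d$ the policy parameters, $\delta > 0$ the perturbation size, and $\Delta$ a random vector with independent $\pm 1$ entries.

```latex
\hat{g}_i(\theta) \;=\; \frac{J(\theta + \delta\Delta) - J(\theta - \delta\Delta)}{2\,\delta\,\Delta_i},
\qquad i = 1, \dots, d
```

The numerator is shared by all $d$ components, so one iteration needs only the two measurements $J(\theta + \delta\Delta)$ and $J(\theta - \delta\Delta)$, whereas a coordinate-wise two-sided finite-difference scheme would need $2d$.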

[Figures: cartpole, acrobot]

About

SPSA based policy learning
