I hope I am not troubling you too much by asking questions.
Could you please help me understand the idea behind the recent changes made to accelerate learning?
By the way, is it converging on Reacher-v1?
Could you also mention the time taken to learn and your system configuration?
Also, take a look at this paper on reward scaling; it could be a cause of divergence, in case it is not converging.
The gradient inverter is actually meant for enforcing bounds on the parameter space. Gradients are scaled down as a parameter approaches its bound and are inverted if the parameter exceeds its value range. My interpretation of the acceleration in learning speed: think of the grad inverter as a threshold on the gradients, which reduces gradient noise. Empirically, I also found an increase in learning speed when I used the grad inverter. For more details, you can refer to the paper link
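The gradient-inverting rule described above can be sketched as follows. This is a minimal NumPy illustration, not the repository's actual implementation: for each parameter, the gradient is scaled by its remaining distance to the bound it is being pushed toward, so the scale shrinks to zero at the bound and becomes negative (inverting the gradient) once the parameter leaves the range. The function name and the gradient-ascent sign convention are assumptions for illustration.

```python
import numpy as np

def invert_gradients(grads, params, p_min, p_max):
    """Scale gradients down as params approach their bounds; invert
    them once params leave [p_min, p_max].

    Assumes a gradient-ascent convention: grad > 0 means the update
    would increase the parameter.
    """
    width = p_max - p_min
    increasing = grads > 0
    scale = np.where(
        increasing,
        (p_max - params) / width,  # -> 0 at upper bound, negative beyond it
        (params - p_min) / width,  # -> 0 at lower bound, negative beyond it
    )
    return grads * scale
```

For example, with bounds [-1, 1], a parameter at 0.5 being pushed upward gets its gradient damped by 0.25, while a parameter already at 1.5 gets its gradient inverted, pulling it back into range.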
Yes, it works for Reacher-v1 now, but it takes a while. You can accelerate it by using a small wrapper for normalized environments and reward scaling, as mentioned in the link
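A reward-scaling wrapper of the kind mentioned above can be as simple as the sketch below. This is a hypothetical minimal version (the class name and the constant scale factor are assumptions); it wraps any environment exposing `reset()`/`step()` and multiplies rewards by a fixed factor before the agent sees them.

```python
class RewardScalingWrapper:
    """Hypothetical minimal wrapper: rescales rewards by a constant
    factor. Works with any env exposing reset() and step()."""

    def __init__(self, env, scale=0.1):
        self.env = env
        self.scale = scale  # assumed constant; tune per environment

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, reward * self.scale, done, info
```

Usage would look like `env = RewardScalingWrapper(gym.make("Reacher-v1"), scale=0.1)`; observation normalization can be layered the same way.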