Hi, I'm trying to use PILCO for path tracking in my graduation thesis, but so far the control results are not ideal.
I think it could be improved with a reward for trajectory following.
Do you know an easy way to do this?
Thanks a lot for the help
Stefan
I think I managed to implement it in the original Matlab version. What you can do is:
Change the linear policy from M = Wm + b to M = Wm + b * r(t), where r(t) is the reference at the current timestep t (make sure t is passed to the policy function). Change the policy gradient dMdp as well: the gradient of M w.r.t. b used to be 1, but is now r(t). I do not believe the gradient w.r.t. the variance changes.
Alternatively, use another parametrization, as long as it depends on r(t).
Pass the current time t to the cost function as well, and use r(t) as the target for the immediate reward instead of a fixed x_target.
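The steps above can be sketched roughly as follows. This is only an illustration in plain Python, not code from the MATLAB version or from any PILCO package: the function names (`reference`, `policy_mean`, `dM_db`, `tracking_cost`), the sine-wave reference, the scalar r(t), and the `width` and `tracked` parameters are all assumptions made for the example. It also evaluates the cost at the state mean only, whereas PILCO takes the expectation over a Gaussian state distribution.

```python
import math


def reference(t):
    # Hypothetical reference trajectory: a scalar target r(t) for timestep t.
    return math.sin(0.1 * t)


def policy_mean(W, b, m, t):
    # Mean control M = W m + b * r(t).
    # W is a list of rows (one per control), m is the state mean,
    # b has one entry per control and is scaled by the scalar r(t).
    r = reference(t)
    return [
        sum(w_ij * m_j for w_ij, m_j in zip(row, m)) + b_i * r
        for row, b_i in zip(W, b)
    ]


def dM_db(b, t):
    # Jacobian of M w.r.t. b. With a constant bias this was the identity;
    # with the time-varying term it becomes r(t) times the identity.
    r = reference(t)
    n = len(b)
    return [[r if i == j else 0.0 for j in range(n)] for i in range(n)]


def tracking_cost(x, t, width=1.0, tracked=0):
    # Saturating cost on the deviation of one state component from r(t),
    # replacing the fixed x_target. Assumption: only x[tracked] is tracked.
    err = x[tracked] - reference(t)
    return 1.0 - math.exp(-0.5 * (err / width) ** 2)
```

For example, `policy_mean([[1.0, 2.0]], [5.0], [3.0, 4.0], t)` adds `5.0 * reference(t)` to the linear term, and `tracking_cost` is zero whenever the tracked state component sits exactly on the reference for that timestep.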