Hi Xiangyu, sorry to bother you. I am trying to implement the exact gradient method (gradient descent for LQR with a gradient oracle), whose gradient formula is proved in Lemma 1. P_K can be computed iteratively, starting from P = Q, but \Sigma_K is hard to compute because it sums x_t x_t^T from t = 0 to infinity. In my code I truncate the infinite horizon at 1000 steps and run 20000 epochs with learning rate 1e-3, state_dim 100, and action_dim 20. However, the result does not seem to converge the way it does on page 38 of the original paper. Since I am new to this and working on the problem alone, I would really appreciate any insights you can give. Thank you!
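For reference, here is a minimal sketch of the computation as I understand it. It assumes the convention u_t = -K x_t, so the closed loop is A - BK, and the gradient 2((R + B^T P_K B)K - B^T P_K A)\Sigma_K. When A - BK is stable, both P_K and \Sigma_K satisfy discrete Lyapunov equations, so the sketch solves those directly instead of truncating the sum. The function names are my own, not from this repository, and the Kronecker-product solve is only practical for small state dimensions; for state_dim = 100 something like `scipy.linalg.solve_discrete_lyapunov` would probably be needed.

```python
import numpy as np


def solve_discrete_lyapunov(M, W):
    """Solve X = M X M^T + W by vectorization (small n only: builds an n^2 x n^2 system)."""
    n = M.shape[0]
    # With row-major flattening, vec(M X M^T) = kron(M, M) vec(X).
    lhs = np.eye(n * n) - np.kron(M, M)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)


def lqr_cost_and_gradient(A, B, Q, R, K, Sigma0):
    """Return C(K) and its gradient for u_t = -K x_t (assumed convention).

    Sigma0 is the covariance E[x_0 x_0^T] of the initial state.
    """
    Acl = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        # The infinite sums diverge when the closed loop is unstable.
        raise ValueError("K is not stabilizing: spectral radius of A - BK >= 1")
    # P_K = Q + K^T R K + Acl^T P_K Acl
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    # Sigma_K = Sigma0 + Acl Sigma_K Acl^T
    Sigma = solve_discrete_lyapunov(Acl, Sigma0)
    E = (R + B.T @ P @ B) @ K - B.T @ P @ A
    grad = 2.0 * E @ Sigma
    cost = np.trace(P @ Sigma0)
    return cost, grad
```

A gradient step would then be `K = K - lr * grad`, starting from a stabilizing K. The stability check matters here: if an update makes A - BK unstable, the truncated sum in my version just grows without any warning.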