`targetQ` should be calculated with `s2_batch` and `a_next`, where `a_next` is predicted from `s2_batch`. Instead, you use `s_batch` to get `a_for_critic` and then compute `targetQ` from that, so I think there is a logic error in your implementation.
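To make the point concrete, here is a minimal NumPy sketch of the target computation I would expect, following the usual DDPG target `r + gamma * Q'(s2, mu'(s2))`. Only `targetQ`, `s2_batch`, `a_next`, `s_batch` and `a_for_critic` come from your code; `compute_target_q`, `r_batch`, `done_batch`, `gamma` and the two toy networks are hypothetical names I made up for illustration, not your actual API.

```python
import numpy as np


def compute_target_q(r_batch, s2_batch, done_batch,
                     target_actor, target_critic, gamma=0.99):
    """Bootstrap target for the critic (hypothetical helper)."""
    # The next action must come from the NEXT states (s2_batch), not s_batch.
    a_next = target_actor(s2_batch)
    # Evaluate the target critic on the (next state, next action) pair.
    q_next = target_critic(s2_batch, a_next)
    # Terminal transitions (done == 1) do not bootstrap.
    return r_batch + gamma * (1.0 - done_batch) * q_next


# Toy stand-ins for the target networks (assumptions, for illustration only).
def toy_target_actor(states):
    return np.tanh(states.sum(axis=1, keepdims=True))


def toy_target_critic(states, actions):
    return states.sum(axis=1, keepdims=True) + actions


r_batch = np.array([[1.0], [0.5]])
s2_batch = np.array([[0.1, 0.2], [0.3, -0.3]])
done_batch = np.array([[0.0], [1.0]])

target_q = compute_target_q(r_batch, s2_batch, done_batch,
                            toy_target_actor, toy_target_critic)
```

The action computed from `s_batch` (your `a_for_critic`) is still the right input for the actor's policy-gradient step; it just should not feed into `targetQ`.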