Required prerequisites
Questions
We utilize distributed.avg_grads to average the gradients of the actor and critic networks in DDPG, but not in SAC's actor network. In fact, because the A3C parallel paradigm does not currently support off-policy algorithms (preliminary checks are present in the omnisafe/algorithms/algo_wrapper.py file, so off-policy algorithms will not start training under the A3C parallel paradigm), gradient averaging is in practice never performed when training off-policy algorithms. A future pull request will remove this redundant code and improve readability, without affecting algorithm performance.
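To make the discussion concrete, here is a minimal sketch of what gradient averaging across parallel workers amounts to semantically: each worker holds its own per-parameter gradients, and an allreduce-style mean replaces them with identical averaged gradients. This is a simplified stand-in, not OmniSafe's actual distributed.avg_grads implementation (which operates on network parameters via MPI-style communication); the function name and list-based representation here are illustrative assumptions.

```python
def avg_grads(worker_grads):
    """Average per-parameter gradients across workers.

    worker_grads: list of gradient lists, one inner list per worker,
    all of the same length (one entry per parameter).
    Returns the element-wise mean, i.e. the gradients every worker
    would hold after an allreduce-mean.
    """
    num_workers = len(worker_grads)
    # zip(*...) groups the i-th gradient from every worker together.
    return [sum(grads) / num_workers for grads in zip(*worker_grads)]


# Two workers, two parameters each: the averaged result is what
# every worker would apply in its optimizer step.
averaged = avg_grads([[1.0, 2.0], [3.0, 4.0]])
```

Since off-policy algorithms never run under the A3C paradigm, this averaging step is effectively a no-op for them, which is why the call can be removed without changing behavior.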
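The "preliminary checks" mentioned above can be pictured as a pre-flight guard that refuses to start off-policy training with more than one parallel worker. The following is a hedged sketch of such a guard; the function name, the set of algorithm names, and the exception type are assumptions for illustration, not the exact code in omnisafe/algorithms/algo_wrapper.py.

```python
# Hypothetical set of off-policy algorithms; the real check in
# omnisafe/algorithms/algo_wrapper.py may enumerate these differently.
OFF_POLICY_ALGOS = {"DDPG", "SAC", "TD3"}


def check_parallel_support(algo: str, parallel: int) -> None:
    """Raise if an off-policy algorithm is configured with multiple
    parallel workers, mimicking the preliminary check described above."""
    if parallel > 1 and algo in OFF_POLICY_ALGOS:
        raise AssertionError(
            f"{algo} is off-policy and does not support the "
            "A3C parallel paradigm (parallel > 1)."
        )


# Single-worker off-policy training passes; multi-worker does not.
check_parallel_support("DDPG", 1)
```

Because this guard rejects multi-worker off-policy runs up front, the gradient-averaging call in the off-policy training path is unreachable dead code.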