Twin Delayed Deep Deterministic (TD3)

+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| **Paper**         | Addressing Function Approximation Error in Actor-Critic Methods :cite:`Fujimoto2018AddressingFA`                                                                                                                                      |
+-------------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
| **Framework(s)**  | .. figure:: ./images/tf.png                                                                                              | .. figure:: ./images/pytorch.png                                                                           |
|                   |    :scale: 20%                                                                                                           |    :scale: 10%                                                                                             |
|                   |    :class: no-scaled-link                                                                                                |    :class: no-scaled-link                                                                                  |
|                   |                                                                                                                          |                                                                                                            |
|                   |    TensorFlow                                                                                                            |    PyTorch                                                                                                 |
+-------------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
| **API Reference** | `garage.tf.algos.TD3 <https://garage.readthedocs.io/en/latest/_autoapi/garage/tf/algos/index.html#garage.tf.algos.TD3>`_ |                                                                                                            |
+-------------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
| **Code**          | `garage/tf/algos/td3.py <https://github.com/rlworkgroup/garage/blob/master/src/garage/tf/algos/td3.py>`_                 |                                                                                                            |
+-------------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
| **Examples**      | :ref:`td3_pendulum_tf`                                                                                                   |                                                                                                            |
+-------------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
| **Benchmarks**    | :ref:`td3_garage_tf`                                                                                                     |                                                                                                            |
+-------------------+--------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+

Twin Delayed Deep Deterministic policy gradient (TD3) is an algorithm motivated by Double Q-learning. It takes the minimum of the values predicted by two critic networks to reduce overestimation of the value function. Garage's implementation follows the paper's approach, which includes clipped Double Q-learning, delayed updates of the target and policy networks, and target policy smoothing.
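To make clipped Double Q-learning and target policy smoothing concrete, here is a minimal NumPy sketch of the critic target as described in the paper. It is not garage's code. The function name `td3_target`, the callables `target_policy`, `target_qf1` and `target_qf2`, and the action bounds are hypothetical names introduced for illustration only.

```python
import numpy as np


def td3_target(reward, done, next_obs, target_policy, target_qf1, target_qf2,
               discount=0.99, sigma=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=None):
    """Compute a TD3-style critic target for a batch of transitions.

    All callables are hypothetical stand-ins for target networks:
    target_policy(obs) -> actions, target_qf(obs, actions) -> Q-values.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: perturb the target action with clipped noise.
    next_action = target_policy(next_obs)
    noise = rng.normal(0.0, sigma, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, action_low, action_high)

    # Clipped Double Q-learning: use the smaller of the two critic estimates.
    q_min = np.minimum(target_qf1(next_obs, next_action),
                       target_qf2(next_obs, next_action))

    # Bellman backup; terminal transitions do not bootstrap.
    return reward + discount * (1.0 - done) * q_min
```

Because the target uses the minimum of the two estimates, an optimistic error in one critic is not propagated unless the other critic shares it.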

Default Parameters

target_update_tau=0.01,
policy_lr=1e-4,
qf_lr=1e-3,
discount=0.99,
exploration_policy_sigma=0.2,
exploration_policy_clip=0.5,
actor_update_period=2,
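As a rough illustration of what two of these defaults control, the sketch below shows a soft (Polyak) target update with `target_update_tau` and a delayed update schedule with `actor_update_period`. It assumes the parameters behave as in the paper. The helper names `soft_update` and `delayed_update_steps` are hypothetical and are not part of garage's API, and the exact step on which garage triggers the delayed update may differ.

```python
import numpy as np

TAU = 0.01               # target_update_tau
ACTOR_UPDATE_PERIOD = 2  # actor_update_period


def soft_update(target, source, tau=TAU):
    """Polyak averaging: move each target array a fraction tau toward its source."""
    return [(1.0 - tau) * t + tau * s for t, s in zip(target, source)]


def delayed_update_steps(num_steps, actor_update_period=ACTOR_UPDATE_PERIOD):
    """Return the 1-based critic steps on which the actor and targets also update."""
    return [step for step in range(1, num_steps + 1)
            if step % actor_update_period == 0]
```

With the defaults, the critics are updated on every step, while the actor and target networks are updated on every second step and each target weight moves 1% of the way toward its source weight.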

Examples

td3_pendulum_tf

.. literalinclude:: ../../examples/tf/td3_pendulum.py

Benchmarks

Benchmarks Results

td3_garage_tf

.. literalinclude:: ../../benchmarks/src/garage_benchmarks/experiments/algos/td3_garage_tf.py

References

.. bibliography:: references.bib
   :style: unsrt
   :filter: docname in docnames

This page was authored by Iris Liu (@irisliucy).
