.. _MATD3 tutorial:

Speaker-Listener with MATD3
====================================

This tutorial shows how to train an :ref:`MATD3<matd3>` agent on the `simple speaker listener <https://pettingzoo.farama.org/environments/mpe/simple_speaker_listener/>`_ multi-particle environment.

.. figure:: mpe_looped.gif
   :height: 400
   :align: center

   Performance of trained MATD3 algorithm on 6 random episodes

What is MATD3?
--------------
:ref:`MATD3<matd3>` (Multi-Agent Twin Delayed Deep Deterministic Policy Gradients) extends the :ref:`MADDPG<maddpg>` (Multi-Agent Deep Deterministic Policy Gradients) algorithm to reduce overestimation bias in multi-agent domains. It does so by using a second set of critic networks and by delaying updates of the policy networks, which can improve performance relative to MADDPG. For further information on MATD3, see the :ref:`documentation<matd3>`.
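
The two mechanisms named above can be illustrated with a minimal sketch for a single transition. It is not AgileRL's implementation: the function names ``td3_target`` and ``should_update_policy``, the ``policy_freq`` parameter, and the numeric values are assumptions chosen for illustration, and real critics are neural networks (centralised over all agents' observations and actions, as in MADDPG) rather than scalar inputs.

.. code-block:: python

   def td3_target(reward, done, next_q1, next_q2, gamma=0.95):
       """Clipped double-Q target (hypothetical helper).

       Bootstraps from the smaller of the two target-critic estimates,
       which counteracts overestimation of the next-state value.
       """
       return reward + gamma * (1.0 - done) * min(next_q1, next_q2)


   def should_update_policy(step, policy_freq=2):
       """Delayed policy update (hypothetical helper).

       Critics are updated every step; actors only every policy_freq steps.
       """
       return step % policy_freq == 0


   # Illustrative values only: the second critic is less optimistic,
   # so its estimate (3.0) is the one used in the target.
   target = td3_target(reward=1.0, done=0.0, next_q1=5.0, next_q2=3.0, gamma=0.5)

   # Steps (counted from 1) at which the actors would be updated.
   actor_update_steps = [s for s in range(1, 7) if should_update_policy(s)]

Taking the minimum of the two estimates biases the target low rather than high, and updating the actors less often gives the critics time to settle between policy changes.
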
Can I use it?
-------------
.. list-table::
   :widths: 20 20 20
   :header-rows: 1

   * -
     - Action
     - Observation
   * - Discrete
     - ✔️
     - ✔️
   * - Continuous
     - ✔️
     - ✔️

Code
-----
Train multiple agents using MATD3
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The following code should run without any issues. The comments are designed to help you understand how to use PettingZoo with AgileRL. If you have any questions, please feel free to ask in the `Discord server <https://discord.com/invite/eB8HyTA2ux>`_.

.. literalinclude:: ../../../tutorials/PettingZoo/agilerl_matd3.py
   :language: python

Watch the trained agents play
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The following code allows you to load your saved MATD3 algorithm from the previous training block, test the algorithm's performance, and then visualise a number of episodes as a gif.

.. literalinclude:: ../../../tutorials/PettingZoo/render_agilerl_matd3.py
   :language: python
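
When testing, PettingZoo's parallel API returns rewards as a dictionary keyed by agent name at every step, so a per-agent episode return is a sum over those dictionaries. The sketch below shows that bookkeeping in plain Python; the helper name ``episode_returns`` and the reward values are made up for illustration and are not taken from the tutorial script.

.. code-block:: python

   def episode_returns(reward_steps):
       """Sum per-step reward dictionaries into one return per agent.

       Hypothetical helper: reward_steps is a list of dicts shaped like the
       rewards returned by a PettingZoo parallel environment's step().
       """
       totals = {}
       for step_rewards in reward_steps:
           for agent, reward in step_rewards.items():
               totals[agent] = totals.get(agent, 0.0) + reward
       return totals


   # Made-up rewards for a two-step episode with the two agents of the
   # speaker-listener environment.
   steps = [
       {"speaker_0": -0.5, "listener_0": -0.5},
       {"speaker_0": -0.25, "listener_0": -0.25},
   ]
   returns = episode_returns(steps)

Averaging these returns over several test episodes gives a more stable picture of performance than any single episode.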