TD3_Separate_Action

A Twin Delayed Deep Deterministic Policy Gradient (TD3) network, written in PyTorch specifically for the HalfCheetahBulletEnv-v0 environment. The implementation is based on the standard version of TD3 from the Udemy course Deep Reinforcement Learning 2.0 by Kirill Eremenko and Hadelin de Ponteves. The network is adapted to the Cheetah agent: it separates the actions of the two legs and calculates the action of each leg knowing the action of the other leg.
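To illustrate the idea of separating the leg actions, here is a minimal sketch of one possible actor layout. It is not the code from td3_separate_action.py: the class name `SeparateActionActor`, the layer sizes, and the choice to compute one leg first and feed its action to the head of the other leg are all assumptions made for illustration. It also assumes the 6-dimensional action splits into 3 joints per leg and that the state has 26 dimensions.

```python
import torch
import torch.nn as nn


class SeparateActionActor(nn.Module):
    """Hypothetical actor with one head per leg (illustrative only).

    The second head receives the first leg's action as an extra input,
    so its output is conditioned on what the other leg is doing.
    """

    def __init__(self, state_dim, leg_action_dim=3, hidden_dim=64, max_action=1.0):
        super().__init__()
        self.max_action = max_action
        # Shared encoder that turns the raw state into features.
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden_dim), nn.ReLU())
        # Head for the first leg: sees only the state features.
        self.first_leg = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, leg_action_dim), nn.Tanh(),
        )
        # Head for the second leg: sees the features plus the first leg's action.
        self.second_leg = nn.Sequential(
            nn.Linear(hidden_dim + leg_action_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, leg_action_dim), nn.Tanh(),
        )

    def forward(self, state):
        features = self.encoder(state)
        first = self.max_action * self.first_leg(features)
        second_input = torch.cat([features, first], dim=1)
        second = self.max_action * self.second_leg(second_input)
        # Concatenate both legs into one action vector for the environment.
        return torch.cat([first, second], dim=1)


# Usage: a batch of 4 states with an assumed 26-dimensional observation.
actor = SeparateActionActor(state_dim=26)
actions = actor(torch.zeros(4, 26))
```

Because both heads end in `Tanh` and are scaled by `max_action`, every action component stays within `[-max_action, max_action]`.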

Dependencies:

The replay buffer data structure was written by P. Emami (https://github.com/pemami4911).
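For readers unfamiliar with replay buffers, the sketch below shows the general data structure using only the Python standard library. It is not the contents of replay_buffer.py; the class name, method names, and the tuple layout are assumptions chosen for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-size buffer of transitions (illustrative only)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, done, next_state):
        self.buffer.append((state, action, reward, done, next_state))

    def __len__(self):
        return len(self.buffer)

    def sample(self, batch_size):
        # Sample without replacement, capped at the number of stored items.
        count = min(batch_size, len(self.buffer))
        batch = self.rng.sample(list(self.buffer), count)
        # Regroup into (states, actions, rewards, dones, next_states).
        return tuple(zip(*batch))


# Usage: store five transitions in a buffer that only keeps the last three.
buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.add(step, 0.0, 1.0, False, step + 1)
states, actions, rewards, dones, next_states = buffer.sample(2)
```

Sampling uniformly from past transitions breaks the correlation between consecutive steps, which is why off-policy methods such as TD3 train from a buffer rather than from the latest transition alone.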

Results:

* After 400'000'000 steps:

* After 500'000'000 steps:

* After 1'000'000'000 steps:

The original TD3 paper: https://arxiv.org/abs/1802.09477
