Introduction

This is an implementation of the research paper Addressing Function Approximation Error in Actor-Critic Methods. This repo contains only the TD3 implementation; the other algorithms mentioned in the paper are not implemented.
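For orientation, here is a minimal sketch of the critic target that TD3 is known for: target policy smoothing combined with clipped double Q-learning. It is not taken from this repo's code. The function name `td3_target`, the callables `target_actor`, `target_q1` and `target_q2`, the scalar action, and the default hyperparameter values are all assumptions made for illustration. Delayed policy updates, the third ingredient of TD3, are left out.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5, max_action=1.0,
               rng=random):
    """Compute the critic target for one transition (scalar action, illustrative only)."""
    # Target policy smoothing: perturb the target action with clipped Gaussian noise.
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, -max_action, max_action)
    # Clipped double Q-learning: bootstrap from the smaller of the two target critics.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # Terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - float(done)) * q_next
```

Taking the minimum of the two critics is what counteracts the overestimation bias discussed in the paper. The noise on the target action keeps the critic from exploiting narrow peaks in the value estimate.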

You can run this code with:

$ python main.py # train an agent on a single environment (use --env_name to change the environment)
$ run.sh # train agents on the environments listed in run.sh

The results of 10 trials (the default) are stored in the result directory. The agent is trained for 500,000 steps and evaluated every 5,000 steps (including the initial policy). Each evaluation consists of 10 episodes, during which action noise is disabled.
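The evaluation protocol above can be sketched roughly as follows. This is an illustration, not the code in main.py. The names `evaluation_steps`, `evaluate` and the `run_episode` callable are hypothetical, and the sketch assumes the final training step is also evaluated.

```python
def evaluation_steps(total_steps=500_000, eval_interval=5_000):
    """Return the training steps at which the policy is evaluated, starting at step 0."""
    # Step 0 corresponds to evaluating the initial (untrained) policy.
    return list(range(0, total_steps + 1, eval_interval))


def evaluate(policy, run_episode, n_episodes=10):
    """Average the episode return over n_episodes noise-free rollouts.

    run_episode is a hypothetical callable that runs one episode with the
    deterministic policy (no action noise) and returns its total reward.
    """
    returns = [run_episode(policy) for _ in range(n_episodes)]
    return sum(returns) / len(returns)
```

Under these assumptions each trial produces 101 evaluation points: one for the initial policy and one for each 5,000-step interval.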

Result

The shaded area represents one standard deviation.
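One way such a shaded band could be computed from the per-trial evaluation curves is sketched below. The function name `aggregate_trials` and the input layout (one list of evaluation scores per trial) are assumptions, as is the use of the population standard deviation. The repo's plotting code may differ.

```python
from statistics import mean, pstdev


def aggregate_trials(curves):
    """Return (means, lower, upper) across trials for each evaluation point.

    curves: list of per-trial evaluation curves, all of the same length.
    """
    # zip(*curves) groups the scores of all trials at the same evaluation point.
    columns = list(zip(*curves))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population std (assumption)
    lower = [m - s for m, s in zip(means, stds)]
    upper = [m + s for m, s in zip(means, stds)]
    return means, lower, upper
```

The `lower` and `upper` lists bound the band that would be shaded around the mean curve.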

b06b01073/Twin-Delayed-DDPG

Folders and files

Latest commit

History

Repository files navigation

Introduction

Result

About

Resources

Stars

Watchers

Forks

Languages