Skip to content

Using Pytorch to implement TD3 and play with bipedal

Notifications You must be signed in to change notification settings

henry32144/TD3-Pytorch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TD3-Pytorch

Introduction

This project is a Pytorch implentation of TD3. The paper can be found here Addressing Function Approximation Error in Actor-Critic Methods by Scott Fujimoto, Herke van Hoof and David Meger.

The implementations for each environment is written in corresponding named jupyter notebook.

Environments

Bipedal Walker

Agent

In this environment, reward is given for moving forward, total 300+ points up to the far end. If the robot falls, it gets -100. Applying motor torque costs a small amount of points, more optimal agent will get better score. State consists of hull angle speed, angular velocity, horizontal speed, vertical speed, position of joints and joints angular speed, legs contact with ground, and 10 lidar rangefinder measurements. There's no coordinates in the state vector.

(description from offical documentation)

The observation space consists of 24 variables. Each action is a vector with four numbers. Every entry in the action vector should be a number between -1 and 1.

Dependencies

In this project, I use the following libraries.

  • Python 3.6
  • Pytorch - 0.4.0
  • gym - 0.11
  • Numpy
  • Pandas
  • Matplotlib
  • Jupyter notebook

You can follow this instruction to install OpenAI gym. Notice that Bipedal belongs to Box2d class, so only perform minimal installation is not enough, Box2d is also needed.

Usage

Go to the project folder, and open it with jupyter notebook.

Reference

The code implemented by the author of the paper is really good, I reference some part from it.

About

Using Pytorch to implement TD3 and play with bipedal

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published