
jaysonph/Pytorch-DDPG

Deep Deterministic Policy Gradient

Deep Deterministic Policy Gradient (DDPG) is a widely used reinforcement learning algorithm from the actor-critic family. It provides more stable training for temporal-difference estimation in non-episodic settings. This repository implements it for a continuous control problem, the OpenAI Gym LunarLanderContinuous-v2 environment.
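In the standard formulation of DDPG, much of that stability comes from two pieces: a bootstrapped temporal-difference target computed from slowly changing target networks, and a soft (Polyak) update that moves the target parameters a small step toward the online ones. The sketch below illustrates both in plain Python, independent of any framework. The function names `td_target` and `soft_update` and the default values of `gamma` and `tau` are illustrative assumptions, not taken from this repository's code.

```python
def td_target(reward, next_q, done, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value for the next state and the
    target actor's action; the bootstrap term is dropped on terminal steps.
    gamma=0.99 is an assumed default, not a value from this repository.
    """
    return reward + gamma * (1.0 - float(done)) * next_q


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: theta_target <- tau * theta + (1 - tau) * theta_target.

    Parameters are given as flat lists of floats for illustration only;
    tau=0.005 is an assumed default, not a value from this repository.
    """
    return [
        tau * w + (1.0 - tau) * w_target
        for w, w_target in zip(online_params, target_params)
    ]


if __name__ == "__main__":
    # Non-terminal step: the target includes the discounted next value.
    print(td_target(reward=1.0, next_q=2.0, done=False, gamma=0.5))
    # Terminal step: the target is the reward alone.
    print(td_target(reward=1.0, next_q=2.0, done=True, gamma=0.5))
    # Target parameters move halfway toward the online parameters.
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
```

In a PyTorch implementation the same update would typically be applied to each pair of parameter tensors from the target and online networks, with a small `tau` so that the targets change slowly.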

Results

An optimal policy is found after a few hundred episodes. The rewards, critic loss, and actor loss are shown below:

Figures: rewards, critic loss (c_loss), actor loss (a_loss)

Visualization

To be added