Deep Deterministic Policy Gradient (DDPG)


Theory

The agent uses the DDPG algorithm to predict continuous actions in a continuous state space. It has two networks: an Actor and a Critic.

https://towardsdatascience.com/reinforcement-learning-w-keras-openai-actor-critic-models-f084612cfd69

https://towardsdatascience.com/hyper-parameters-in-action-part-ii-weight-initializers-35aee1a28404

https://spinningup.openai.com/en/latest/algorithms/ddpg.html

Actor topology

Actor

Critic topology

Critic

Inputs/Outputs

The Actor network has 2 inputs from the game: position and velocity. The output layer is a fully-connected layer with a tanh() activation, producing actions in the range (-1.0, 1.0): force. The hidden layers use the ReLU activation function. A sketch of such a network follows below.
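A minimal sketch of such an Actor in TensorFlow 2 / Keras is shown below; the hidden-layer sizes (400 and 300) are illustrative assumptions, not values taken from this repository.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_actor(state_dim=2, action_dim=1):
    # 2 inputs from the game: position and velocity
    state_input = layers.Input(shape=(state_dim,))
    # hidden ReLU layers (sizes 400/300 are assumptions)
    x = layers.Dense(400, activation="relu")(state_input)
    x = layers.Dense(300, activation="relu")(x)
    # tanh output keeps the action (force) in the range (-1.0, 1.0)
    action_output = layers.Dense(action_dim, activation="tanh")(x)
    return tf.keras.Model(inputs=state_input, outputs=action_output)
```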

The Critic network has 2 inputs from the game (the state) and 1 input from the Actor network (the action). The hidden layers use the ReLU activation function. The main purpose of this network is to estimate the quality of action[t] taken in state[t]. A sketch of such a network follows below.
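A minimal sketch of such a Critic in TensorFlow 2 / Keras; the hidden-layer sizes and the point where the action is merged with the state are assumptions, not values taken from this repository.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_critic(state_dim=2, action_dim=1):
    state_input = layers.Input(shape=(state_dim,))    # 2 state inputs from the game
    action_input = layers.Input(shape=(action_dim,))  # 1 action input from the Actor
    # merge state and action, then pass through hidden ReLU layers (sizes assumed)
    x = layers.Concatenate()([state_input, action_input])
    x = layers.Dense(400, activation="relu")(x)
    x = layers.Dense(300, activation="relu")(x)
    # scalar estimate of Q(state, action)
    q_output = layers.Dense(1)(x)
    return tf.keras.Model(inputs=[state_input, action_input], outputs=q_output)
```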

The Critic network is trained using the Bellman equation:

Q_target = reward + (1-done) * gamma * Q_next_state

Q_target       ->  target Q value for training,
reward         ->  reward from the game for the action taken in the state,
gamma          ->  discount factor,
Q_next_state   ->  quality of the action taken in the next state,
done           ->  1 if the state is terminal, 0 otherwise
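A minimal sketch of this Critic update in TensorFlow 2, assuming target networks for the Actor and Critic and a mean-squared Bellman error loss; the names `target_actor`, `target_critic`, and `critic_optimizer` are placeholders for illustration, not identifiers from this repository.

```python
import tensorflow as tf

def critic_update(critic, target_actor, target_critic, critic_optimizer,
                  state, action, reward, next_state, done, gamma=0.99):
    # Q_target = reward + (1 - done) * gamma * Q_next_state
    next_action = target_actor(next_state)
    q_next_state = target_critic([next_state, next_action])
    q_target = reward + (1.0 - done) * gamma * q_next_state

    with tf.GradientTape() as tape:
        q_value = critic([state, action])
        # mean squared Bellman error between Q_target and the current estimate
        loss = tf.reduce_mean(tf.square(q_target - q_value))
    grads = tape.gradient(loss, critic.trainable_variables)
    critic_optimizer.apply_gradients(zip(grads, critic.trainable_variables))
    return loss
```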

Summary

Critic

Framework: TensorFlow 2.0
Languages: Python 3
Author: Martin Kubovcik