DQN is a model-free, off-policy algorithm that learns control policies directly from high-dimensional sensory inputs, using a deep neural network as the function approximator for the Q-value function
Paper: Playing Atari with Deep Reinforcement Learning
Decision making (act(...))
$\epsilon \leftarrow \epsilon_{_{final}} + (\epsilon_{_{initial}} - \epsilon_{_{final}}) \; e^{-\frac{\text{timestep}}{\epsilon_{_{timesteps}}}}$
$a \leftarrow \begin{cases} a \in_R A & x < \epsilon \\ \underset{a}{\arg\max} \; Q_\phi(s) & x \geq \epsilon \end{cases} \qquad \text{for } x \leftarrow U(0, 1)$
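The decay schedule and ε-greedy rule above can be sketched in plain Python (the function names and default values here are illustrative, not part of skrl's API):

```python
import math
import random

def epsilon(timestep, eps_initial=1.0, eps_final=0.04, eps_timesteps=1000):
    """Exponentially decay epsilon from eps_initial toward eps_final."""
    return eps_final + (eps_initial - eps_final) * math.exp(-timestep / eps_timesteps)

def act(q_values, eps):
    """Epsilon-greedy action selection over a list of Q-values."""
    if random.random() < eps:
        return random.randrange(len(q_values))  # random action: a chosen uniformly from A
    return max(range(len(q_values)), key=lambda a: q_values[a])  # argmax_a Q(s)
```

At timestep 0 the schedule returns ε_initial; as the timestep grows it approaches ε_final, so exploration shrinks smoothly during training.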
Learning algorithm (_update(...))
# sample a batch from memory
[s, a, r, s′, d] ← states, actions, rewards, next_states, dones of size batch_size
# gradient steps
FOR each gradient step up to gradient_steps DO
    # compute target values
    Q′ ← Qϕ_target(s′)
    $Q_{_{target}} \leftarrow \underset{a}{\max} \; Q'$  # the only difference with DDQN
    y ← r + discount_factor ¬d Q_target
    # compute Q-network loss
    Q ← Qϕ(s)[a]
    ${Loss}_{Q_\phi} \leftarrow \frac{1}{N} \sum_{i=1}^N (Q_i - y_i)^2$
    # optimize Q-network
    ∇ϕ Loss_Qϕ
    # update target network
    IF it's time to update target network THEN
        ϕ_target ← polyak ϕ + (1 − polyak) ϕ_target
    # update learning rate
    IF there is a learning_rate_scheduler THEN
        step scheduler_ϕ(optimizer_ϕ)
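The target, loss, and soft-update computations above can be sketched with NumPy (a minimal illustration; the function names and array shapes are assumptions, not skrl's internals, and the real implementation operates on PyTorch tensors):

```python
import numpy as np

def dqn_targets(rewards, next_q_values, dones, discount_factor=0.99):
    """y = r + gamma * (1 - d) * max_a Q'(s', a), per batch element."""
    q_target = next_q_values.max(axis=1)  # max over the action dimension
    return rewards + discount_factor * (1.0 - dones) * q_target

def q_loss(q_values, actions, targets):
    """Mean squared error between Q(s)[a] and the targets y."""
    q = q_values[np.arange(len(actions)), actions]  # gather Q(s, a) for taken actions
    return np.mean((q - targets) ** 2)

def polyak_update(params, target_params, polyak=0.005):
    """Soft (polyak) update of the target-network parameters."""
    return polyak * params + (1.0 - polyak) * target_params
```

Note how the `(1 - d)` factor (the ¬d term in the pseudocode) zeroes the bootstrap term for terminal transitions, so the target reduces to the immediate reward.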
Implementation file: ../../../skrl/agents/torch/dqn/dqn.py
The implementation supports the following Gym / Gymnasium spaces

Gym/Gymnasium spaces | Observation | Action |
---|---|---|
Discrete | ▫ | ◼ |
Box | ◼ | ▫ |
Dict | ◼ | ▫ |
The implementation uses two deterministic function approximators. These function approximators (models) must be collected in a dictionary and passed to the constructor of the class under the argument models
Notation | Concept | Key | Input shape | Output shape | Type |
---|---|---|---|---|---|
Qϕ(s, a) | Q-network | "q_network" | observation | action | Deterministic |
Qϕtarget(s, a) | Target Q-network | "target_q_network" | observation | action | Deterministic |
Support for advanced features is described in the next table
Feature | Support and remarks |
---|---|
Shared model | - |
RNN support | - |
API reference: skrl.agents.torch.dqn.dqn.DQN (__init__)