Reinforcement learning framework

The reinforcement learning framework extends the supervised learning framework to capture an agent interacting with an environment over time. This allows for models whose predictions influence how subsequent data are sampled. It also allows for models that receive delayed feedback, in the form of rewards for actions taken at earlier time steps.

In particular, at each time step $t$:

The agent executes an action and receives an observation and scalar reward.
The environment receives the action from the agent and emits the next observation and the next scalar reward.
The time step increments.
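
The three steps above can be sketched as a simple interaction loop. This is a minimal illustration, not a reference implementation: `CorridorEnv`, `random_agent`, `run_episode`, and the `reset`/`step` method names are hypothetical, chosen only for this example. The toy environment gives a reward only when the agent reaches the last cell, so it also illustrates delayed feedback.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..length-1, reward only at the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start every episode at the left end and return the first observation.
        self.position = 0
        return self.position

    def step(self, action):
        # The environment receives the action (-1 = left, +1 = right),
        # then emits the next observation and the next scalar reward.
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # delayed reward: nonzero only at the goal
        return self.position, reward, done


def random_agent(observation, rng):
    # A placeholder policy that ignores the observation and acts at random.
    return rng.choice([-1, 1])


def run_episode(env, policy, rng, max_steps=100):
    observation = env.reset()
    total_reward = 0.0
    t = 0
    while t < max_steps:
        action = policy(observation, rng)             # agent executes an action
        observation, reward, done = env.step(action)  # agent receives observation and reward
        total_reward += reward
        t += 1                                        # the time step increments
        if done:
            break
    return total_reward, t


if __name__ == "__main__":
    rng = random.Random(0)
    total, steps = run_episode(CorridorEnv(), random_agent, rng)
    print(f"cumulative reward {total} after {steps} steps")
```

Swapping `random_agent` for any other function with the same signature changes the agent without touching the environment or the loop.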

The goal of the agent is to choose actions that maximize the expected cumulative reward over all time steps.
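
One common way to write this goal, assuming a finite horizon $T$ and a policy $\pi$ that selects the agent's actions (the symbols $T$, $\pi$, and $r_t$ are notation introduced here for illustration, with $r_t$ the scalar reward at time step $t$):

$$
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\left[ \sum_{t=0}^{T} r_t \right]
$$

The expectation is taken over the trajectories that arise when the agent follows $\pi$ in the environment.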

Sources

Introduction to Reinforcement Learning, David Silver
Supervised Learning of Behaviors, Lecture 2, Sergey Levine
