Discretization example:
Each location can be identified by the tiles it activates and represented as a bit vector (ones for the activated tiles, zeros elsewhere).
The state value function computation when using this scheme:
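The computation above can be sketched as follows. This is a minimal illustration, not the notebook's implementation: the number of tilings, tiles per tiling, and offsets are assumptions chosen for a 1-D state in [0, 1).

```python
import numpy as np

# Sketch of tile coding for a 1-D state in [0, 1): 3 tilings of
# 4 tiles each, every tiling shifted by a small (hypothetical) offset.
n_tilings, tiles_per_tiling = 3, 4
offsets = np.array([0.0, 0.05, 0.10])  # illustrative offsets per tiling

def encode(s):
    """Binary feature vector: exactly one active tile per tiling."""
    x = np.zeros(n_tilings * tiles_per_tiling)
    for t in range(n_tilings):
        idx = int((s + offsets[t]) * tiles_per_tiling) % tiles_per_tiling
        x[t * tiles_per_tiling + idx] = 1.0
    return x

def v_hat(s, w):
    """Approximate state value: sum of the weights of the active tiles."""
    return encode(s) @ w

w = np.zeros(n_tilings * tiles_per_tiling)
value = v_hat(0.37, w)  # dot product of the bit vector with w
```

Because the feature vector is binary and sparse, the value estimate is just the sum of the few weights whose tiles contain the state.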
This approach doesn't require manually designing the tiles ahead of time.
Example of a division (splitting) criterion: split when we are no longer learning from the data (our value function has stopped changing).
Workshop: Tile_Coding.ipynb
(gym: Acrobot-v1)
Each location on the plane is converted into a binary vector: when index i is 1, the encoded location lies inside circle i. This is a sparse representation of the plane.
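This circle-membership encoding (coarse coding) can be sketched as follows; the circle centers and radius are illustrative assumptions, not values from the workshop.

```python
import numpy as np

# Sketch of coarse coding: encode a 2-D point by which circles contain it.
# Centers and radius are hypothetical choices for illustration.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
radius = 0.8

def encode(p):
    """x[i] = 1 if point p falls inside circle i, else 0 (sparse)."""
    p = np.asarray(p, dtype=float)
    return (np.linalg.norm(centers - p, axis=1) <= radius).astype(float)
```

Overlapping circles let nearby points share active features, so learning about one point generalizes to its neighbors.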
A more continuous mapping of the area into a vector:
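One common way to get such a continuous mapping is Gaussian radial basis features, where hard 0/1 membership is replaced by a smooth falloff with distance to each center. The centers and width below are assumptions for illustration.

```python
import numpy as np

# Gaussian radial basis features: each feature decays smoothly with the
# squared distance to its center. Centers and sigma are illustrative.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sigma = 0.5

def rbf_features(p):
    """Real-valued feature vector in (0, 1], largest for nearby centers."""
    d2 = np.sum((centers - np.asarray(p, dtype=float)) ** 2, axis=1)
    return np.exp(-d2 / (2 * sigma ** 2))
```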
We are interested in obtaining a good approximation of the actual value function (or q-function). This is done by introducing a weight vector w:
This is called linear function approximation.
We obtain
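The derivation referenced here is presumably the standard one; a sketch in LaTeX, assuming linear features $\mathbf{x}(s)$ and a squared-error objective:

```latex
\hat{v}(s, \mathbf{w}) = \mathbf{w}^\top \mathbf{x}(s)
\qquad
J(\mathbf{w}) = \mathbb{E}\big[(v_\pi(s) - \hat{v}(s, \mathbf{w}))^2\big]
```

Taking the gradient of $J$ with respect to $\mathbf{w}$ and stepping against it gives the update

```latex
\Delta \mathbf{w} = \alpha \,\big(v_\pi(s) - \hat{v}(s, \mathbf{w})\big)\,\mathbf{x}(s)
```

where $\alpha$ is the learning rate; in practice the unknown $v_\pi(s)$ is replaced by a sampled target such as a Monte Carlo return.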
This is the rule that we will follow for each sampled state until the error (between the approximate and true state value functions) is sufficiently small.
In order to do this with Q-learning, we need to approximate the action-value function q.
But why stop here? Let's estimate the values of all state-action pairs at once:
Each column of the W matrix emulates a separate linear function.
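This per-column view can be sketched as below; the feature size and number of actions are assumed values for illustration.

```python
import numpy as np

# Sketch: action values via a weight matrix W, assuming a feature vector
# x(s) with n entries and m discrete actions. Column a of W acts as a
# separate linear value function for action a.
n_features, n_actions = 4, 3
W = np.zeros((n_features, n_actions))

def q_hat(x, W):
    """Vector of all action values: q(s, a) = x @ W[:, a] for each a."""
    return x @ W

x = np.array([1.0, 0.0, 1.0, 0.0])         # e.g. a tile-coded state
greedy_action = int(np.argmax(q_hat(x, W)))  # greedy action selection
```

A single matrix product yields every action's value, which is convenient for the argmax in Q-learning's greedy step.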
We can still use a linear combination of these non-linear features and therefore use linear function approximation.
This allows the value function to represent non-linear relations between the input state and the output value.
This greatly increases the representational capacity of our approximation. This is also how neural networks work.
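A minimal sketch of this idea, using an illustrative polynomial feature map (the choice of terms is an assumption, not from the notes): the model stays linear in w, so the same update rules apply, yet it can represent non-linear functions of the raw state.

```python
import numpy as np

# Fixed non-linear feature map followed by a linear combination.
# The model is linear in w but non-linear in the raw state s = (s1, s2).
def phi(s):
    s1, s2 = s
    return np.array([1.0, s1, s2, s1 * s2, s1 ** 2, s2 ** 2])

def v_hat(s, w):
    """Linear in w: the usual gradient updates still apply unchanged."""
    return phi(s) @ w
```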
We can use gradient descent to optimize and estimate w:
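A minimal sketch of such a gradient-descent loop, under stated assumptions: the features, the "true" weights, and the noiseless targets are all synthetic stand-ins (a real agent would use sampled returns or TD targets instead).

```python
import numpy as np

# Sketch of stochastic gradient descent on w. Targets are generated
# from hypothetical "true" weights purely so convergence is visible.
rng = np.random.default_rng(0)

def phi(s):
    return np.array([1.0, s, s ** 2])

true_w = np.array([1.0, -2.0, 0.5])   # hypothetical target weights
w = np.zeros(3)
alpha = 0.1                           # learning rate

for _ in range(5000):
    s = rng.uniform(-1, 1)
    target = phi(s) @ true_w          # stands in for a sampled return
    # Semi-gradient update: step toward reducing the squared error.
    w += alpha * (target - phi(s) @ w) * phi(s)
```

After enough samples, w approaches the weights that generated the targets; with real returns it would instead approach the best linear fit to the value function.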
This sets us up for deep reinforcement learning.