My implementations of popular reinforcement learning methods based on other developers and research papers.

dknathalage/deep-rl

Reinforcement Learning

Reinforcement Learning is a subfield of machine learning focused on how an agent behaves while interacting with an environment. Despite being a fairly young field, RL has produced some of the most important breakthroughs in machine learning. It builds on a very simple concept and expands from there.

Concept Behind Reinforcement Learning

The intuition behind RL is very simple. It can be summarised as follows:

  1. Initialise a software agent (Presumably a neural network)
  2. Make agent interact with the environment (Make an action)
  3. Collect the reward and the new state of the environment (scores of a game and frame of the game after we take the action)
  4. Improve the agent so that it collects better rewards over time

The whole subfield of reinforcement learning builds on the aforementioned concepts, and the rest of the theory develops from that basis.
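The four steps above can be sketched in a few lines of Python. The corridor environment and the random agent below are invented purely for illustration (a real agent would typically be a neural network) and are not part of this repository:

```python
import random

random.seed(0)

# Hypothetical toy environment: a 1-D corridor where reaching position 3
# ends the episode with reward 1. Purely illustrative.
class CorridorEnv:
    def reset(self):
        self.pos = 0
        return self.pos                      # initial state

    def step(self, action):                  # action: 0 = left, 1 = right
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos == 3
        return self.pos, (1.0 if done else 0.0), done

def agent(state):                            # step 1: a stand-in for a network
    return random.choice([0, 1])

env = CorridorEnv()
state = env.reset()
total_reward, done = 0.0, False
while not done:
    action = agent(state)                    # step 2: interact with the environment
    state, reward, done = env.step(action)   # step 3: collect reward + new state
    total_reward += reward                   # step 4 would update the agent here
```

The episode always terminates at position 3, so `total_reward` ends up as 1.0; a learning agent would use the collected rewards in step 4 to improve its action choices.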

Basic concept of RL

Types of Reinforcement Learning

  1. Value Based Methods
  2. Policy Iteration Methods
  3. Model Based Methods

Before we dive down the rabbit hole

Though the concepts behind these approaches to reinforcement learning are quite simple, it is easy to get lost (been there). To build a good understanding of the concepts, getting familiar with the specific terminology is essential.

State (St): information about the environment at time t
Action (At): the action the agent took at time t
Action probability given a state (P(a|s)): the probability that the agent takes action a when the environment is in state s
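To make P(a|s) concrete, here is a small sketch using a softmax (Boltzmann) policy that turns per-action scores into a probability distribution; the state name and score values are made up for the example:

```python
import math

# Illustrative per-action preference scores for one state "s0" (invented).
scores = {"s0": [2.0, 0.5, 0.5]}

def action_probs(state):
    # Softmax: exponentiate each score and normalise so they sum to 1.
    exps = [math.exp(v) for v in scores[state]]
    z = sum(exps)
    return [e / z for e in exps]             # P(a|s) for each action a

probs = action_probs("s0")                   # a valid probability distribution
```

The first action has the highest score, so it receives the highest probability, while the probabilities still sum to 1.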

1. Value Based Methods (Q-value methods)

Value based methods focus on approximating the value of each state and taking actions that lead to high-value trajectories, thereby maximizing the cumulative reward.
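The canonical value based rule is the tabular Q-learning update, Q(s,a) ← Q(s,a) + α·(r + γ·max Q(s',·) − Q(s,a)). A minimal sketch, with the states, reward, and Q-values invented for illustration:

```python
from collections import defaultdict

alpha, gamma = 0.5, 0.9                      # learning rate and discount factor
Q = defaultdict(lambda: [0.0, 0.0])          # Q-values for 2 actions per state

def q_update(s, a, r, s_next):
    # Bootstrap from the best action available in the next state.
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])

# Invented example transition: taking action 0 in "s0" yields reward 1
# and lands in "s1", whose best known action is worth 2.
Q["s1"] = [0.0, 2.0]
q_update("s0", 0, 1.0, "s1")                 # Q["s0"][0] becomes 0.5 * (1 + 0.9 * 2)
```

Repeating this update over many observed transitions makes the Q-table converge toward the values of each state-action pair, after which acting greedily with respect to Q gives the policy.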

2. Policy Iteration Methods

Unlike value based methods, policy methods directly approximate how actions should be chosen in each state, in order to maximize the expected return.
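A tiny REINFORCE-style sketch shows the idea: the policy parameters are nudged directly in the direction that makes rewarded actions more probable, with no value estimate in the loop. The 2-armed bandit setup here is invented for illustration:

```python
import math
import random

random.seed(0)

theta = [0.0, 0.0]                           # one preference parameter per action
lr = 0.1                                     # learning rate
rewards = [0.0, 1.0]                         # arm 1 is the better arm (invented)

def probs():
    # Softmax policy over the two preference parameters.
    exps = [math.exp(t) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

for _ in range(500):
    p = probs()
    a = 0 if random.random() < p[0] else 1   # sample an action from the policy
    r = rewards[a]
    # grad of log pi(a) w.r.t. theta[i] is (1 if i == a else 0) - p[i]
    for i in range(2):
        theta[i] += lr * r * ((1 if i == a else 0) - p[i])
```

After training, the policy puts most of its probability on the rewarding arm: the objective (expected reward) was optimized directly, which is the point the paragraph above makes.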

3. Model Based Methods

Model based methods try to approximate the dynamics of the environment so that the agent can choose an optimal trajectory of actions.
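The simplest version of this idea is to estimate the environment's transition probabilities from observed (state, action, next state) samples, which the agent can then plan against. The observed transitions below are invented for the example:

```python
from collections import defaultdict

# counts[(s, a)][s_next] = how often taking a in s led to s_next.
counts = defaultdict(lambda: defaultdict(int))

def observe(s, a, s_next):
    counts[(s, a)][s_next] += 1

def transition_prob(s, a, s_next):
    # Maximum-likelihood estimate of the environment dynamics P(s'|s, a).
    total = sum(counts[(s, a)].values())
    return counts[(s, a)][s_next] / total

# Four invented transitions observed from state "s0" under action "a0".
for s_next in ["s1", "s1", "s1", "s2"]:
    observe("s0", "a0", s_next)

p = transition_prob("s0", "a0", "s1")        # 3 of 4 samples went to "s1"
```

With the learned model in hand, the agent can simulate candidate action trajectories and pick the one with the best predicted outcome instead of learning from the real environment alone.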

Policy Iteration Methods

Policy iteration methods have received a great deal of attention recently. The reason is that policy gradient methods are relatively easy to implement, and they optimize the actual goal directly rather than predicting the value of the current state and deriving the optimal action from it, as value iteration does.
