My implementations of popular reinforcement learning methods based on other developers and research papers.

dknathalage/deep-rl

Reinforcement Learning

Reinforcement Learning is a subfield of machine learning focused on how an agent behaves while interacting with an environment. Despite being a fairly young field, RL has produced some of the most important breakthroughs in machine learning. It builds on a very simple concept and expands from there.

Concept Behind Reinforcement Learning

The intuition behind RL is very simple. It can be summarised as follows:

  1. Initialise a software agent (Presumably a neural network)
  2. Make agent interact with the environment (Make an action)
  3. Collect the reward and the new state of the environment (scores of a game and frame of the game after we take the action)
  4. Improve the agent so that it collects better rewards over time

The whole subfield of reinforcement learning builds on the aforementioned concepts, and the rest of the theory develops from that basis.
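The four steps above can be sketched in a few lines of Python. The corridor environment and the random agent below are invented purely for illustration (a real agent would typically be a neural network) and are not part of this repository:

```python
import random

random.seed(0)

# Hypothetical toy environment: a 1-D corridor where reaching position 3
# ends the episode with reward 1. Purely illustrative.
class CorridorEnv:
    def reset(self):
        self.pos = 0
        return self.pos                      # initial state

    def step(self, action):                  # action: 0 = left, 1 = right
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos == 3
        return self.pos, (1.0 if done else 0.0), done

def agent(state):                            # step 1: a stand-in for a network
    return random.choice([0, 1])

env = CorridorEnv()
state = env.reset()
total_reward, done = 0.0, False
while not done:
    action = agent(state)                    # step 2: interact with the environment
    state, reward, done = env.step(action)   # step 3: collect reward + new state
    total_reward += reward                   # step 4 would update the agent here
```

The episode always terminates at position 3, so `total_reward` ends up as 1.0; a learning agent would use the collected rewards in step 4 to improve its action choices.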

Basic concept of RL

Types of Reinforcement Learning

  1. Value Based Methods
  2. Policy Iteration Methods
  3. Model Based Methods

Before we dive down the rabbit hole

Though the concepts behind these approaches to reinforcement learning are quite simple, it is easy to get lost (been there). To build a good understanding of the concepts, getting familiar with the specific terminology is essential.

State (St): information about the environment at time t
Action (At): the action the agent took at time t
Action probability given a state (P(a|s)): the probability that the agent takes action a when the environment is in state s
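To make P(a|s) concrete, here is a small sketch using a softmax (Boltzmann) policy that turns per-action scores into a probability distribution; the state name and score values are made up for the example:

```python
import math

# Illustrative per-action preference scores for one state "s0" (invented).
scores = {"s0": [2.0, 0.5, 0.5]}

def action_probs(state):
    # Softmax: exponentiate each score and normalise so they sum to 1.
    exps = [math.exp(v) for v in scores[state]]
    z = sum(exps)
    return [e / z for e in exps]             # P(a|s) for each action a

probs = action_probs("s0")                   # a valid probability distribution
```

The first action has the highest score, so it receives the highest probability, while the probabilities still sum to 1.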

1. Value Based Methods (Q-value methods)

Value based methods focus on approximating the value of each state and taking actions that lead to high-value trajectories, thereby maximizing the cumulative reward.
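The canonical value based rule is the tabular Q-learning update, Q(s,a) ← Q(s,a) + α·(r + γ·max Q(s',·) − Q(s,a)). A minimal sketch, with the states, reward, and Q-values invented for illustration:

```python
from collections import defaultdict

alpha, gamma = 0.5, 0.9                      # learning rate and discount factor
Q = defaultdict(lambda: [0.0, 0.0])          # Q-values for 2 actions per state

def q_update(s, a, r, s_next):
    # Bootstrap from the best action available in the next state.
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])

# Invented example transition: taking action 0 in "s0" yields reward 1
# and lands in "s1", whose best known action is worth 2.
Q["s1"] = [0.0, 2.0]
q_update("s0", 0, 1.0, "s1")                 # Q["s0"][0] becomes 0.5 * (1 + 0.9 * 2)
```

Repeating this update over many observed transitions makes the Q-table converge toward the values of each state-action pair, after which acting greedily with respect to Q gives the policy.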

2. Policy Iteration Methods

Unlike value based methods, policy methods directly approximate how actions should be chosen in each state, in order to maximize the expected return.
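A tiny REINFORCE-style sketch shows the idea: the policy parameters are nudged directly in the direction that makes rewarded actions more probable, with no value estimate in the loop. The 2-armed bandit setup here is invented for illustration:

```python
import math
import random

random.seed(0)

theta = [0.0, 0.0]                           # one preference parameter per action
lr = 0.1                                     # learning rate
rewards = [0.0, 1.0]                         # arm 1 is the better arm (invented)

def probs():
    # Softmax policy over the two preference parameters.
    exps = [math.exp(t) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

for _ in range(500):
    p = probs()
    a = 0 if random.random() < p[0] else 1   # sample an action from the policy
    r = rewards[a]
    # grad of log pi(a) w.r.t. theta[i] is (1 if i == a else 0) - p[i]
    for i in range(2):
        theta[i] += lr * r * ((1 if i == a else 0) - p[i])
```

After training, the policy puts most of its probability on the rewarding arm: the objective (expected reward) was optimized directly, which is the point the paragraph above makes.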

3. Model Based Methods

Model based methods try to approximate the dynamics of the environment so that the agent can choose an optimal trajectory of actions.
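The simplest version of this idea is to estimate the environment's transition probabilities from observed (state, action, next state) samples, which the agent can then plan against. The observed transitions below are invented for the example:

```python
from collections import defaultdict

# counts[(s, a)][s_next] = how often taking a in s led to s_next.
counts = defaultdict(lambda: defaultdict(int))

def observe(s, a, s_next):
    counts[(s, a)][s_next] += 1

def transition_prob(s, a, s_next):
    # Maximum-likelihood estimate of the environment dynamics P(s'|s, a).
    total = sum(counts[(s, a)].values())
    return counts[(s, a)][s_next] / total

# Four invented transitions observed from state "s0" under action "a0".
for s_next in ["s1", "s1", "s1", "s2"]:
    observe("s0", "a0", s_next)

p = transition_prob("s0", "a0", "s1")        # 3 of 4 samples went to "s1"
```

With the learned model in hand, the agent can simulate candidate action trajectories and pick the one with the best predicted outcome instead of learning from the real environment alone.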

Policy Iteration Methods

Policy iteration methods have received a great deal of attention recently. The reason is that policy gradient methods are relatively easy to implement, and they optimize the actual goal directly rather than predicting the value of the current state and deriving the optimal action from it, as value iteration does.
