Understanding Reinforcement Learning

Deep Reinforcement Learning requires a good amount of intuition in both deep learning and reinforcement learning. Even though the theoretical part is not that hard to understand, messy code definitely makes it harder to follow. Let's try to change that :D


Section 1

OpenAI Gym

Since we are working with OpenAI's Gym a lot, let's build a better intuition about it!

OpenAI Gym environments
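
Here is a minimal sketch of the standard Gym interaction loop (assuming the classic pre-0.26 `gym` API, where `reset()` returns just the observation and `step()` returns four values; the environment name is only an example):

```python
import gym

env = gym.make("CartPole-v1")
obs = env.reset()

done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()          # random policy, for illustration
    obs, reward, done, info = env.step(action)  # classic 4-tuple return
    total_reward += reward

print("Episode reward:", total_reward)
env.close()
```

Every Gym environment follows this same reset/step contract, which is what lets the same agent code run across many different tasks.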


Section 2

Policy Gradient methods

In supervised learning, it is simple to create a system that maps inputs X to outputs Y, since there is a dataset containing input and output examples. In reinforcement learning, on the other hand, there is no such dataset of labeled examples. Policy gradients are one way to solve this problem. The whole idea relies on encouraging actions that received good reward and discouraging actions that received bad reward. The general formulation is minimizing the loss -log(p(y | x)) * A. Here A represents the advantage, and for the most vanilla version we can use discounted rewards.
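
As a concrete illustration, here is a minimal REINFORCE sketch in PyTorch (the framework choice and network sizes are assumptions for illustration, not taken from this repo's notebooks). It implements exactly the loss above: -log p(a | s) weighted by normalized discounted returns as the advantage A:

```python
import gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)
gamma = 0.99

def discount(rewards, gamma):
    """Discounted returns: G_t = r_t + gamma * G_{t+1}."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    return torch.tensor(returns)

for episode in range(500):
    obs = env.reset()                            # classic gym API
    log_probs, rewards, done = [], [], False
    while not done:
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, done, _ = env.step(action.item())
        rewards.append(reward)

    returns = discount(rewards, gamma)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Encourage actions with high return, discourage actions with low return.
    loss = -(torch.stack(log_probs) * returns).sum()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Normalizing the returns is a common variance-reduction trick; it does not change which actions are encouraged relative to each other.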

Extra resources:

  1. My blog post
  2. Pong from pixels
  3. Hands-On Machine Learning with Scikit-Learn and TensorFlow Chapter 16
  4. Policy Gradients lecture by Pieter Abbeel
  5. Continuous control with deep reinforcement learning
  6. Better Exploration with Parameter Noise

Section 3

Deep Q Networks

Before we start with DQN, let's talk about the Q function first. Q(s, a) is a function that maps a given s (state) and a (action) pair to the expected total reward until the terminal state. It is basically how much reward we are going to get if we take action a in state s. The reason we combine this idea with a neural network is that it is almost impossible to store Q-values for every state in the environment, so we let the network approximate them instead.
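
The network is trained toward the Bellman target r + gamma * max_a' Q(s', a'). Here is a minimal sketch of that core update in PyTorch (the network shape is arbitrary, and the replay buffer and target network of the full DQN algorithm are omitted for brevity):

```python
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

def dqn_loss(s, a, r, s_next, done):
    """TD error for a batch of transitions:
    target = r + gamma * max_a' Q(s', a'), zero beyond terminal states."""
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)   # Q(s, a) taken
    with torch.no_grad():
        q_next = q_net(s_next).max(dim=1).values           # max_a' Q(s', a')
        target = r + gamma * (1.0 - done) * q_next
    return nn.functional.mse_loss(q_sa, target)
```

In a full implementation, the transitions (s, a, r, s', done) would be sampled from a replay buffer, and the target would come from a separate, slowly updated target network for stability.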