Reinforcement Learning

Reinforcement Learning with Python will help you master reinforcement learning, from the basic algorithms to advanced deep reinforcement learning algorithms. This repo provides easy-to-read code examples, with one file for each algorithm.

Dependencies

Python 3.5
TensorFlow 1.0.0
Keras
numpy
pandas
matplotlib
pillow
scikit-image (skimage)
h5py

Install Requirements

pip install -r requirements.txt

3. Markov Decision Process and Dynamic Programming

3.1. Markov Chain and Markov Process
3.2. Markov Decision Process
3.3. Rewards and Returns
3.4. Episodic and Continuous Tasks
3.5. Policy Function
3.6. State Value Function
3.7. State-Action Value Function (Q Function)
3.8. Bellman Equation and Optimality
3.9. Deriving Bellman Equation for Value and Q functions
3.10. Solving the Bellman Equation
3.11. Dynamic Programming
3.12. Solving Frozen Lake Problem using Value Iteration
3.13. Solving Frozen Lake Problem using Policy Iteration
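
As a rough illustration of the value iteration topic listed above, here is a minimal sketch on a tiny hand-written MDP. It is not the Frozen Lake environment and not taken from the repo's notebooks: the four-state chain, the `build_chain_mdp` helper, the transition probabilities, and the reward of 1 for reaching the last state are all assumptions made up for this example.

```python
# Value iteration on a hypothetical 4-state chain MDP (states 0..3, state 3 is terminal).
# transitions[state][action] is a list of (probability, next_state, reward) tuples.

def build_chain_mdp():
    """Build the toy MDP: 'right' succeeds with probability 0.8, otherwise the agent stays put."""
    transitions = {}
    for s in range(3):
        left = max(s - 1, 0)
        right = s + 1
        reward_right = 1.0 if right == 3 else 0.0  # reward only for reaching the terminal state
        transitions[s] = {
            "left": [(1.0, left, 0.0)],
            "right": [(0.8, right, reward_right), (0.2, s, 0.0)],
        }
    transitions[3] = {}  # terminal state: no actions available
    return transitions


def q_value(transitions, values, state, action, gamma):
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in transitions[state][action])


def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Apply the Bellman optimality backup until the largest change falls below tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # skip terminal states
                continue
            best = max(q_value(transitions, values, s, a, gamma) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Extract the greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(transitions, values, s, a, gamma))
        for s, actions in transitions.items()
        if actions
    }
    return values, policy


values, policy = value_iteration(build_chain_mdp())
print(values)
print(policy)
```

In this toy chain the greedy policy moves right in every non-terminal state, and the state values grow as the agent gets closer to the rewarding transition.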

4. Gaming with Monte Carlo Methods

4.1. Monte Carlo Methods
4.2. Estimating Value of Pi Using Monte Carlo
4.3. Monte Carlo Prediction
4.4. First visit Monte Carlo
4.5. Every visit Monte Carlo
4.6. BlackJack with Monte Carlo
4.7. Monte Carlo Control
4.8. Monte Carlo Exploration Starts
4.9. On Policy Monte Carlo Control
4.10. Off Policy Monte Carlo Control
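
To give a feel for the "Estimating Value of Pi Using Monte Carlo" topic listed above, the following is a minimal sketch using only the standard library. The function name `estimate_pi`, the sample count, and the fixed seed are assumptions chosen for this example, not code from the repo.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi by sampling points uniformly in the unit square.

    The fraction of points that fall inside the quarter circle of radius 1
    approximates pi / 4, so multiplying that fraction by 4 approximates pi.
    """
    rng = random.Random(seed)  # seeded only so that runs are repeatable
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


print(estimate_pi(200_000))
```

The estimate gets closer to pi as the number of samples grows, with the error shrinking roughly in proportion to one over the square root of the sample count.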

5. Temporal Difference Learning

5.1. Temporal Difference Learning
5.2. TD Prediction
5.3. TD Control
5.4. Q Learning
5.5. Solving the Taxi Problem using Q learning
5.6. SARSA
5.7. Solving the Taxi Problem using SARSA
5.8. Difference Between Q learning and SARSA
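
The difference between Q learning and SARSA comes down to the target used in the update: Q learning bootstraps from the best action in the next state (off-policy), while SARSA bootstraps from the action the agent actually takes next (on-policy). The sketch below shows both updates on a hypothetical two-state Q table; the table layout, the helper names, and the numbers are assumptions made for illustration only.

```python
def q_learning_update(q, s, a, r, s_next, alpha, gamma):
    """Off-policy update: the target uses the maximum Q value in the next state."""
    best_next = max(q[s_next].values())
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: the target uses the Q value of the action actually taken next."""
    q[s][a] += alpha * (r + gamma * q[s_next][a_next] - q[s][a])


def make_q():
    """Hypothetical Q table: q[state][action] -> estimated value."""
    return {
        0: {"left": 0.0, "right": 0.0},
        1: {"left": 0.0, "right": 2.0},
    }


# Same transition for both algorithms: in state 0 take 'right', receive reward 1,
# land in state 1, where the behaviour policy happens to pick the exploratory action 'left'.
q_ql, q_sarsa = make_q(), make_q()
q_learning_update(q_ql, 0, "right", 1.0, 1, alpha=0.5, gamma=0.9)
sarsa_update(q_sarsa, 0, "right", 1.0, 1, "left", alpha=0.5, gamma=0.9)

print(q_ql[0]["right"], q_sarsa[0]["right"])
```

With these numbers the Q learning estimate moves to about 1.4, because it assumes the best next action, while the SARSA estimate moves to 0.5, because it accounts for the exploratory action that was actually chosen.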

6. Multi-Armed Bandit Problem

6.1. Multi-armed Bandit Problem
6.2. Epsilon-Greedy Algorithm
6.3. Softmax Exploration Algorithm
6.4. Upper Confidence Bound Algorithm
6.5. Thompson Sampling Algorithm
6.6. Applications of MAB
6.7. Identifying Right Advertisement Banner Using MAB
6.8. Contextual Bandits
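
As one possible illustration of the epsilon-greedy algorithm listed above, here is a minimal sketch on a simulated Bernoulli bandit. The arm probabilities, the function name `run_epsilon_greedy`, and the default parameters are assumptions for this example; the repo's notebooks may structure this differently.

```python
import random


def run_epsilon_greedy(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Play a simulated Bernoulli bandit with an epsilon-greedy strategy.

    true_probs holds the (normally unknown) success probability of each arm.
    Returns the pull counts, the estimated value of each arm, and the total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            arm = max(range(n_arms), key=lambda i: estimates[i])  # exploit: best estimate so far
        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        # Incremental mean: new_estimate = old + (reward - old) / n
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward


counts, estimates, total_reward = run_epsilon_greedy([0.1, 0.3, 0.9])
print(counts, estimates, total_reward)
```

A small epsilon keeps sampling every arm occasionally, so the estimates for all arms keep improving while most pulls go to the arm that currently looks best.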

7. Deep Learning Fundamentals

7.1. Artificial Neurons
7.2. Artificial Neural Network
7.3. Activation Functions
7.4. Deep Dive into ANN
7.5. Gradient Descent
7.6. Neural Networks in TensorFlow
7.7. Recurrent Neural Network
7.8. Backpropagation Through Time
7.9. Long Short Term Memory RNN
7.10. Generating Song Lyrics using LSTM RNN
7.11. Convolutional Neural Networks
7.12. CNN Architecture
7.13. Classifying Fashion Products Using CNN

9. Playing Doom With Deep Recurrent Q Network

9.1. Deep Recurrent Q Network
9.2. Partially Observable MDP
9.3. Architecture of DRQN
9.4. Basic Doom Game
9.5. Build an Agent to Play Doom Game using DRQN
9.6. Deep Attention Recurrent Q Network

10. Asynchronous Advantage Actor Critic Network

10.1. Asynchronous Actor Critic Algorithm
10.2. The three A's
10.3. Architecture of A3C
10.4. Working of A3C
10.5. Drive up the Mountain with A3C
10.6. Visualization in TensorBoard

11. Policy Gradients and Optimization

11.1. Policy Gradient
11.2. Lunar Lander Using Policy Gradient
11.3. Deep Deterministic Policy Gradient
11.4. Swinging up the Pendulum using DDPG
11.5. Trust Region Policy Optimization
11.6. Proximal Policy Optimization

12. Capstone Project: Car Racing using DQN

12.1. Environment Wrapper Functions
12.2. Dueling Network
12.3. Replay Buffer
12.4. Training the Network
12.5. Car Racing

13. Recent Advancements and Next Steps

13.1. Imagination Augmented Agents
13.2. Learning From Human Preference
13.3. Deep Q Learning From Demonstrations
13.4. Hindsight Experience Replay
13.5. Hierarchical Reinforcement Learning
13.6. Inverse Reinforcement Learning

About me

Piyush Pathak

PORTFOLIO

GITHUB

BLOG


Table of Contents

1. Introduction to Reinforcement Learning

2. Getting Started with OpenAI and TensorFlow

3. Markov Decision Process and Dynamic Programming

4. Gaming with Monte Carlo Methods

5. Temporal Difference Learning

6. Multi-Armed Bandit Problem

7. Deep Learning Fundamentals

8. Atari Games With Deep Q Network

9. Playing Doom With Deep Recurrent Q Network

10. Asynchronous Advantage Actor Critic Network

11. Policy Gradients and Optimization

12. Capstone Project: Car Racing using DQN

13. Recent Advancements and Next Steps
