5. Understanding Temporal Difference Learning
5.1. TD Learning
5.2. TD Prediction
5.2.1. TD Prediction Algorithm
5.3. Predicting the Value of States in a Frozen Lake Environment
5.4. TD Control
5.5. On-Policy TD Control - SARSA
5.6. Computing Optimal Policy using SARSA
5.7. Off-Policy TD Control - Q Learning
5.8. Computing the Optimal Policy using Q Learning
5.9. The Difference Between Q Learning and SARSA
5.10. Comparing DP, MC, and TD Methods