05. Understanding Temporal Difference Learning

Sudharsan Ravichandiran

05ead4a · Apr 1, 2021

Name	Last commit message	Last commit date
.ipynb_checkpoints	update	Oct 2, 2020
Images	update	Oct 2, 2020
.DS_Store	update	Apr 1, 2021
5.03. Predicting the Value of States in a Frozen Lake Environment.ipynb	update	Oct 2, 2020
5.06. Computing Optimal Policy using SARSA.ipynb	update	Oct 2, 2020
5.08. Computing the Optimal Policy using Q Learning.ipynb	update	Oct 2, 2020
README.md	update	Oct 2, 2020

README.md

5. Understanding Temporal Difference Learning

5.1. TD Learning
5.2. TD Prediction
- 5.2.1. TD Prediction Algorithm
5.3. Predicting the Value of States in a Frozen Lake Environment
5.4. TD Control
5.5. On-Policy TD Control - SARSA
5.6. Computing Optimal Policy using SARSA
5.7. Off-Policy TD Control - Q Learning
5.8. Computing the Optimal Policy using Q Learning
5.9. The Difference Between Q Learning and SARSA
5.10. Comparing DP, MC, and TD Methods