A list of learning resources for fundamental reinforcement learning.
- Reinforcement Learning: An Introduction. Rich Sutton and Andrew Barto. [PDF]
- Algorithms for Reinforcement Learning. Csaba Szepesvári. [PDF]
- Reinforcement Learning: Theory and Algorithms. Alekh Agarwal, Nan Jiang, Sham Kakade. [PDF]
- Neuro-Dynamic Programming. Dimitri P. Bertsekas and John Tsitsiklis.
- Markov Decision Processes: Discrete Stochastic Dynamic Programming. Martin Puterman.
- CS 598 Statistical Reinforcement Learning. Nan Jiang. [link]
- Approximate Dynamic Programming. Ben Van Roy. [link]
- Mathematical Techniques for Machine Learning. Prakash Panangaden. [link]
- Unifying Task Specification in Reinforcement Learning. Martha White. [PDF]
- Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach. Silviu Pitis. [PDF]
- Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view. Bruno Scherrer. [PDF]
- Optimality of Reinforcement Learning Algorithms with Linear Function Approximation. Ralf Schoknecht. Advances in Neural Information Processing Systems. 2002. [PDF]
- Approximate Policy Iteration Schemes: A Comparison. Bruno Scherrer.
- Error Propagation for Approximate Policy and Value Iteration. Amir-massoud Farahmand, Csaba Szepesvári, and Rémi Munos. Advances in Neural Information Processing Systems.
- Finite-Time Bounds for Fitted Value Iteration. Rémi Munos and Csaba Szepesvári. 2008.
- Performance Bounds in $L_p$-norm for Approximate Value Iteration. Rémi Munos. SIAM J. Control Optim. 2007.
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. András Antos, Csaba Szepesvári, and Rémi Munos. 2006.
- Error Bounds for Approximate Value Iteration. Rémi Munos. 2005.
- Error Bounds for Approximate Policy Iteration. Rémi Munos. 2003.
- Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions. Ronald J. Williams and Leemon C. Baird. 1993.
- A Linearly Relaxed Approximate Linear Program for Markov Decision Processes. [PDF]
- Learning to predict by the methods of temporal differences. Rich Sutton. [PDF]
- TD or not TD: Analyzing the Role of Temporal Differencing in Deep Reinforcement Learning. Artemij Amiranashvili, et al. [PDF]
- Convergence of Stochastic Iterative Dynamic Programming Algorithms. [PDF]
- Q-Learning. Christopher Watkins and Peter Dayan. [PDF]
- Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms. [PDF]
- Reinforcement Learning with Function Approximation Converges to a Region. [PDF]
- Chattering in SARSA. [PDF]
- An Analysis of Temporal-Difference Learning with Function Approximation. John Tsitsiklis and Benjamin Van Roy. [PDF]
- An Analysis of Linear Models, Linear Value-Function Approximation, and Feature Selection for Reinforcement Learning. [PDF]
- Least-Squares Methods in Reinforcement Learning for Control. Michail Lagoudakis, Ronald Parr, and Michael Littman. [PDF]
- Linear Least-Squares Algorithms for Temporal Difference Learning. Steven Bradtke, and Andrew Barto. [PDF]
- Towards Characterizing Divergence in Deep Q-Learning. Joshua Achiam, Ethan Knight, and Pieter Abbeel.
- Diagnosing Bottlenecks in Deep Q-learning Algorithms. Justin Fu, et al.
- Deep Reinforcement Learning and the Deadly Triad. Hado van Hasselt, et al. [PDF]
- A Theoretical Analysis of Deep Q-Learning. Zhuoran Yang, et al. [PDF]
- A Tutorial on Fisher Information. Alexander Ly, et al. [PDF]
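Many of the entries above analyze temporal-difference learning and its fixed point. As a concrete anchor for those readings, here is a minimal sketch of tabular TD(0) prediction on the classic five-state random walk from Sutton and Barto's book; the function name, hyperparameters, and environment encoding are illustrative choices, not taken from any entry above:

```python
import random

def td0_random_walk(episodes=10000, alpha=0.02, seed=0):
    """Tabular TD(0) value prediction on a five-state random walk.

    Non-terminal states are 0..4; stepping past state 4 gives reward 1,
    stepping past state 0 gives reward 0. All other rewards are 0 and
    the discount factor is 1, so the true values are 1/6, 2/6, ..., 5/6.
    """
    rng = random.Random(seed)
    V = [0.0] * 5  # value estimates for states 0..4
    for _ in range(episodes):
        s = 2  # every episode starts in the middle state
        while True:
            s2 = s + (1 if rng.random() < 0.5 else -1)
            if s2 == 5:   # right terminal: reward 1, terminal value 0
                V[s] += alpha * (1.0 - V[s])
                break
            if s2 == -1:  # left terminal: reward 0, terminal value 0
                V[s] += alpha * (0.0 - V[s])
                break
            # TD(0) update with reward 0 and gamma = 1
            V[s] += alpha * (V[s2] - V[s])
            s = s2
    return V
```

With a small constant step size the estimates settle near the true values 1/6 through 5/6; replacing the one-step bootstrap target `V[s2]` with the full return recovers Monte Carlo prediction, which is the contrast studied in the TD papers listed above.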