VincentLiu3/awesome-fundamental-rl

Fundamental Reinforcement Learning (In Progress)

A list of learning resources for fundamental reinforcement learning.

Books

  • Reinforcement Learning: An Introduction. Rich Sutton and Andrew Barto. [PDF]
  • Algorithms for Reinforcement Learning. Csaba Szepesvari. [PDF]
  • Reinforcement Learning: Theory and Algorithms. Alekh Agarwal, Nan Jiang, Sham Kakade. [PDF]
  • Neuro-Dynamic Programming. Dimitri P. Bertsekas and John Tsitsiklis.
  • Markov Decision Processes: Discrete Stochastic Dynamic Programming. Martin Puterman.

Lecture Notes

  • CS 598 Statistical Reinforcement Learning. Nan Jiang. [link]
  • Approximate Dynamic Programming. Ben Van Roy. [link]
  • Mathematical Techniques for Machine Learning. Prakash Panangaden. [link]

Papers

RL formulation

  • Unifying Task Specification in Reinforcement Learning. Martha White. [PDF]
  • Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach. Silviu Pitis. [PDF]

Objectives in RL

  • Scherrer B. Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view. [PDF]
  • Schoknecht R. Optimality of Reinforcement Learning Algorithms with Linear Function Approximation. Advances in Neural Information Processing Systems. 2002. [PDF]

Approximate DP

  • Scherrer B. Approximate Policy Iteration Schemes: A Comparison.
  • Farahmand A, Szepesvári C, Munos R. Error Propagation for Approximate Policy and Value Iteration. Advances in Neural Information Processing Systems.
  • Munos R, Szepesvari C. Finite-Time Bounds for Fitted Value Iteration. 2008.
  • Munos R. Performance Bounds in $L_p$-norm for Approximate Value Iteration. SIAM J Control Optim. 2007.
  • Antos A, Szepesvari C, Munos R. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. 2006.
  • Munos R. Error Bounds for Approximate Value Iteration. 2005.
  • Munos R. Error Bounds for Approximate Policy Iteration. 2003.
  • Williams R, Baird LC. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions. 1993.

Approximate LP

  • A Linearly Relaxed Approximate Linear Program for Markov Decision Processes. [PDF]

Temporal Difference Learning

  • Learning to predict by the methods of temporal differences. Rich Sutton. [PDF]
  • TD or not TD: Analyzing the Role of Temporal Differencing in Deep Reinforcement Learning. Artemij Amiranashvili, et al. [PDF]

Convergence of RL algorithms

  • Convergence of Stochastic Iterative Dynamic Programming Algorithms. [PDF]
  • Q-Learning. Christopher Watkins and Peter Dayan. [PDF]
  • Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms. [PDF]
  • Reinforcement Learning with Function Approximation Converges to a Region. [PDF]
  • Chattering in SARSA. [PDF]

RL with Function Approximation

  • Analysis of temporal-difference learning with function approximation. John Tsitsiklis and Benjamin Van Roy. [PDF]
  • An Analysis of Linear Models, Linear Value-Function Approximation, and Feature Selection for Reinforcement Learning. [PDF]

Least-Squares Methods

  • Least-Squares Methods in Reinforcement Learning for Control. Michail Lagoudakis, Ronald Parr, and Michael Littman. [PDF]
  • Linear Least-Squares Algorithms for Temporal Difference Learning. Steven Bradtke and Andrew Barto. [PDF]

Deep Q-learning

  • Towards Characterizing Divergence in Deep Q-Learning. Joshua Achiam, Ethan Knight, and Pieter Abbeel.
  • Diagnosing Bottlenecks in Deep Q-learning Algorithms. Justin Fu, et al.
  • Deep Reinforcement Learning and the Deadly Triad. Hado van Hasselt, et al. [PDF]
  • A Theoretical Analysis of Deep Q-Learning. Zhuoran Yang, et al. [PDF]

Math for ML

  • A Tutorial on Fisher Information. Alexander Ly, et al. [PDF]
