[2018] Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review
[2021] A Survey of Zero-shot Generalisation in Deep Reinforcement Learning
[2023] Diffusion Models for Reinforcement Learning: A Survey
[2023] A Tutorial Introduction to Reinforcement Learning
[2015] Continuous control with deep reinforcement learning
[2015] Trust Region Policy Optimization
[2016] Asynchronous Methods for Deep Reinforcement Learning
[2016] Sample Efficient Actor-Critic with Experience Replay
[2017] Proximal Policy Optimization Algorithms
[2018] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
[2018] Soft Actor-Critic Algorithms and Applications
[2018] Addressing Function Approximation Error in Actor-Critic Methods
[2018] IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
[2019] Stabilizing Transformers for Reinforcement Learning
[2017] Time Limits in Reinforcement Learning
[2018] Learning Temporal Point Processes via Reinforcement Learning
[2019] Making Deep Q-learning methods robust to time discretization
[2020] Thinking While Moving: Deep Reinforcement Learning with Concurrent Control
[2019] Real-Time Reinforcement Learning
[2021] Maximum Entropy RL (Provably) Solves Some Robust RL Problems
[2022] Contrastive Learning as Goal-Conditioned Reinforcement Learning
[2023] Maximum diffusion reinforcement learning
[2024] Privileged Sensing Scaffolds Reinforcement Learning
[2016] Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning
[2016] Q(λ) with Off-Policy Corrections
[2016] Safe and Efficient Off-Policy Reinforcement Learning
[2020] AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
[2018] Deep Reinforcement Learning and the Deadly Triad
[2022] A Theory of Abstraction in Reinforcement Learning
[2017] Deep Reinforcement Learning that Matters
[2018] The Mirage of Action-Dependent Baselines in Reinforcement Learning
[2018] Deep Reinforcement Learning Doesn't Work Yet
[2020] Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO
[2020] Leverage the Average: an Analysis of KL Regularization in RL
[2021] How to Train Your Robot with Deep Reinforcement Learning; Lessons We've Learned
[2021] What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study
[2022] The Primacy Bias in Deep Reinforcement Learning
[2024] Addressing Signal Delay in Deep Reinforcement Learning