Jul 2019
- Benchmarking Model-Based Reinforcement Learning
- Learning World Graphs to Accelerate Hierarchical Reinforcement Learning
- Perspective Taking in Deep Reinforcement Learning Agents
- On the Weaknesses of Reinforcement Learning for Neural Machine Translation
- Dynamic Face Video Segmentation via Reinforcement Learning
- Striving for Simplicity in Off-policy Deep Reinforcement Learning
- Intrinsic Motivation Driven Intuitive Physics Learning using Deep Reinforcement Learning with Intrinsic Reward Normalization
- A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning
- Attentive Multi-Task Deep Reinforcement Learning
- Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning
- Google Research Football: A Novel Reinforcement Learning Environment
- Deep Reinforcement Learning in Financial Markets
- Dynamic Input for Deep Reinforcement Learning in Autonomous Driving
- Characterizing Attacks on Deep Reinforcement Learning
- Deep Reinforcement Learning for Clinical Decision Support: A Brief Survey
- VRLS: A Unified Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications
- Deep Reinforcement Learning for Autonomous Internet of Things: Model, Applications and Challenges
- Arena: a toolkit for Multi-Agent Reinforcement Learning
- GPU-Accelerated Atari Emulation for Reinforcement Learning
- Photonic architecture for reinforcement learning
Jun 2019
- Towards Empathic Deep Q-Learning
- Ranking Policy Gradient
- Hyp-RL: Hyperparameter Optimization by Reinforcement Learning
- Modern Deep Reinforcement Learning Algorithms
- A Framework for Automatic Question Generation from Text using Deep Reinforcement Learning
- Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks
- Is multiagent deep reinforcement learning the answer or the question? A brief survey
- Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning
- Cooperative Lane Changing via Deep Reinforcement Learning
- A Hierarchical Architecture for Sequential Decision-Making in Autonomous Driving using Deep Reinforcement Learning
- Explaining Reinforcement Learning to Mere Mortals: An Empirical Study
- Language as an Abstraction for Hierarchical Deep Reinforcement Learning
- Autonomous Airline Revenue Management: A Deep Reinforcement Learning Approach to Seat Inventory Control and Overbooking
- A Survey of Reinforcement Learning Informed by Natural Language
- Load Balancing for Ultra-Dense Networks: A Deep Reinforcement Learning Based Approach
- Deep Reinforcement Learning Architecture for Continuous Power Allocation in High Throughput Satellites
- Harnessing Reinforcement Learning for Neural Motion Planning
April-May 2019
- Reinforcement Learning with Probabilistic Guarantees for Autonomous Driving
- An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents
- On the Generalization Gap in Reparameterizable Reinforcement Learning
- Targeted Attacks on Deep Reinforcement Learning Agents through Adversarial Observations
- Inverse Reinforcement Learning in Contextual MDPs
- Teaching on a Budget in Multi-Agent Deep Reinforcement Learning
- Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning
- Generation of Policy-Level Explanations for Reinforcement Learning
- A Control-Model-Based Approach for Reinforcement Learning
- Interactive Teaching Algorithms for Inverse Reinforcement Learning
- Snooping Attacks on Deep Reinforcement Learning
March 2019
- IRLAS: Inverse Reinforcement Learning for Architecture Search
- Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning
- M3RL: Mind-aware Multi-agent Management Reinforcement Learning
- Concurrent Meta Reinforcement Learning
- Horizon: Facebook's Open Source Applied Reinforcement Learning Platform
- Using Natural Language for Reward Shaping in Reinforcement Learning
- Model-Based Reinforcement Learning for Atari
- RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems
- Hacking Google reCAPTCHA v3 using Reinforcement Learning
- Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2
- Deep Reinforcement Learning with Feedback-based Exploration
- Deep Reinforcement Learning for Autonomous Driving
- Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention
- Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction
- Explaining Reinforcement Learning to Mere Mortals: An Empirical Study
- Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems
- On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning
- Autoregressive Policies for Continuous Control Deep Reinforcement Learning
Feb 2019
- Distributional reinforcement learning with linear function approximation
- Novelty Search for Deep Reinforcement Learning Policy Network Weights by Action Sequence Edit Metric Distance
- Tsallis Reinforcement Learning: A Unified Framework for Maximum Entropy Reinforcement Learning
- Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications
- Reinforcement Learning for Optimal Load Distribution Sequencing in Resource-Sharing System
- Learning to Schedule Communication in Multi-agent Reinforcement Learning
- On Reinforcement Learning for Full-length Game of StarCraft
- Implicit Policy for Reinforcement Learning
- A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
- Visual Rationalizations in Deep Reinforcement Learning for Atari Games
- Statistics and Samples in Distributional Reinforcement Learning
- A Comparative Analysis of Expected and Distributional Reinforcement Learning
- Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
- SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
- From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following
- Investigating Generalisation in Continuous Deep Reinforcement Learning
- Model-Free Adaptive Optimal Control of Episodic Fixed-Horizon Manufacturing Processes using Reinforcement Learning
- Crowd-Robot Interaction: Crowd-aware Robot Navigation with Attention-based Deep Reinforcement Learning
- Towards the Next Generation Airline Revenue Management: A Deep Reinforcement Learning Approach to Seat Inventory Control and Overbooking
- Parenting: Safe Reinforcement Learning from Human Input
- Reinforcement Learning Without Backpropagation or a Clock
- Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning
- A new Potential-Based Reward Shaping for Reinforcement Learning Agent
- How to Combine Tree-Search Methods in Reinforcement Learning
- Unsupervised Basis Function Adaptation for Reinforcement Learning
- Communication Topologies Between Learning Agents in Deep Reinforcement Learning
- Logically-Constrained Reinforcement Learning
- Hyperbolic Embeddings for Learning Options in Hierarchical Reinforcement Learning
- ProLoNets: Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning
- A Framework for Automated Cellular Network Tuning with Reinforcement Learning
- Deep Reinforcement Learning for Search, Recommendation, and Online Advertising: A Survey
- The Value Function Polytope in Reinforcement Learning
- Robust Reinforcement Learning in POMDPs with Incomplete and Noisy Observations
- Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic
- Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning
- Verifiably Safe Off-Model Reinforcement Learning
- Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning
- Optimal Tap Setting of Voltage Regulation Transformers Using Batch Reinforcement Learning
- Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning
- Exploration versus exploitation in reinforcement learning: a stochastic control approach
- ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning
- End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning
- WiseMove: A Framework for Safe Deep Reinforcement Learning for Autonomous Driving
- Emergence of Hierarchy via Reinforcement Learning Using a Multiple Timescale Stochastic RNN
Jan 2019
- Federated Reinforcement Learning
- Verifiable Reinforcement Learning via Policy Extraction
- QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks
- Complementary reinforcement learning towards explainable agents
- The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition
- Hierarchical Reinforcement Learning for Multi-agent MOBA Game
- Reinforcement Learning of Markov Decision Processes with Peak Constraints
- Robust Recovery Controller for a Quadrupedal Robot using Deep Reinforcement Learning
- Understanding Multi-Step Deep Reinforcement Learning: A Systematic Study of the DQN Target
- Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation
- Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
- A Short Survey on Probabilistic Reinforcement Learning
- Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos
- Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems
- Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation
- Recurrent Control Nets for Deep Reinforcement Learning
- Amplifying the Imitation Effect for Reinforcement Learning of UCAV's Mission Execution
- Multi-agent Reinforcement Learning Embedded Game for the Optimization of Building Energy Control and Power System Planning
- Representation Learning on Graphs: A Reinforcement Learning Application
- Evolutionarily-Curated Curriculum Learning for Deep Reinforcement Learning Agents
- Exploring applications of deep reinforcement learning for real-world autonomous driving systems
- AlphaSeq: Sequence Discovery with Deep Reinforcement Learning
- Exploration versus exploitation in reinforcement learning: a stochastic control approach
- Multi-Agent Deep Reinforcement Learning for Dynamic Power Allocation in Wireless Networks
- Energy-Efficient Thermal Comfort Control in Smart Buildings via Deep Reinforcement Learning
- Relative Importance Sampling For Off-Policy Actor-Critic in Deep Reinforcement Learning
- AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning
- Improving Coordination in Multi-Agent Deep Reinforcement Learning through Memory-driven Communication
- Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning
- Accelerated Methods for Deep Reinforcement Learning
- Motion Perception in Reinforcement Learning with Dynamic Objects
- A New Tensioning Method using Deep Reinforcement Learning for Surgical Pattern Cutting
- Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications
- Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
- Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
- Deterministic Implementations for Reproducibility in Deep Reinforcement Learning
- Uncertainty-Based Out-of-Distribution Detection in Deep Reinforcement Learning
- Risk-Aware Active Inverse Reinforcement Learning
- A dual mode adaptive basal-bolus advisor based on reinforcement learning
- What Should I Do Now? Marrying Reinforcement Learning and Symbolic Planning
- Deep Reinforcement Learning for Imbalanced Classification
- Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization
- Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning
- Optimal Decision-Making in Mixed-Agent Partially Observable Stochastic Environments via Reinforcement Learning
- Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals
- A Critical Investigation of Deep Reinforcement Learning for Navigation
- Accelerating Goal-Directed Reinforcement Learning by Model Characterization
- Machine Teaching in Hierarchical Genetic Reinforcement Learning: Curriculum Design of Reward Functions for Swarm Shepherding
- Reinforcement Learning Using Quantum Boltzmann Machines
- Communication-Efficient Distributed Reinforcement Learning
- DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation
- Human-Like Autonomous Car-Following Model with Deep Reinforcement Learning
- Adversarial Text Generation Without Reinforcement Learning
- End-to-End Video Captioning with Multitask Reinforcement Learning
2018
- Accelerated Methods for Deep Reinforcement Learning.
arxiv
- A Deep Reinforcement Learning Chatbot (Short Version).
arxiv
- AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search.
arxiv
⭐ - A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress.
arxiv
- Composable Deep Reinforcement Learning for Robotic Manipulation.
arxiv
- Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication.
arxiv
- Deep Reinforcement Fuzzing.
arxiv
- Deep Reinforcement Learning of Cell Movement in the Early Stage of C. elegans Embryogenesis.
arxiv
- Deep Reinforcement Learning For Sequence to Sequence Models.
arxiv
code
- Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods.
arxiv
- Deep Reinforcement Learning in Portfolio Management.
arxiv
code
- Deep Reinforcement Learning using Capsules in Advanced Game Environments.
arxiv
- Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft.
arxiv
- Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes.
arxiv
code
- Diversity is All You Need: Learning Skills without a Reward Function.
arxiv
- Faster Deep Q-learning using Neural Episodic Control.
arxiv
- Feedback-Based Tree Search for Reinforcement Learning.
arxiv
- Feudal Reinforcement Learning for Dialogue Management in Large Domains.
arxiv
- Forward-Backward Reinforcement Learning.
arxiv
- Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies.
arxiv
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures.
arxiv
- Kickstarting Deep Reinforcement Learning.
arxiv
- Learning a Prior over Intent via Meta-Inverse Reinforcement Learning.
arxiv
- Meta Reinforcement Learning with Latent Variable Gaussian Processes.
arxiv
- Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches.
arxiv
- Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations.
arxiv
- Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents.
arxiv
- Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning.
arxiv
- Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review.
arxiv
- Reinforcement Learning from Imperfect Demonstrations.
arxiv
- Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application.
arxiv
- RUDDER: Return Decomposition for Delayed Rewards.
arxiv
code
- Semi-parametric Topological Memory for Navigation.
arxiv
tensorflow
- Shared Autonomy via Deep Reinforcement Learning.
arxiv
- Setting up a Reinforcement Learning Task with a Real-World Robot.
arxiv
- Simple random search provides a competitive approach to reinforcement learning.
arxiv
code
- Unsupervised Meta-Learning for Reinforcement Learning.
arxiv
- Using reinforcement learning to learn how to play text-based games.
arxiv
2017
- A Deep Reinforcement Learning Chatbot.
arxiv
- A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem.
arxiv
code
- A Deep Reinforced Model for Abstractive Summarization.
arxiv
- A Distributional Perspective on Reinforcement Learning.
arxiv
- A Laplacian Framework for Option Discovery in Reinforcement Learning.
arxiv
⭐ - Boosting the Actor with Dual Critic.
arxiv
- Bridging the Gap Between Value and Policy Based Reinforcement Learning.
arxiv
- Car Racing using Reinforcement Learning.
pdf
- Cold-Start Reinforcement Learning with Softmax Policy Gradients.
arxiv
- Curiosity-driven Exploration by Self-supervised Prediction.
arxiv
tensorflow
- Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning.
arxiv
code
- DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning.
arxiv
code
- Deep Reinforcement Learning: An Overview.
arxiv
- Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward.
arxiv
code
- Deep reinforcement learning from human preferences.
arxiv
- Deep Reinforcement Learning that Matters.
arxiv
code
- Device Placement Optimization with Reinforcement Learning.
arxiv
- Distributional Reinforcement Learning with Quantile Regression.
arxiv
- End-to-End Optimization of Task-Oriented Dialogue Model with Deep Reinforcement Learning.
arxiv
- Evolution Strategies as a Scalable Alternative to Reinforcement Learning.
arxiv
- Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning.
arxiv
- Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations.
arxiv
- Learning how to Active Learn: A Deep Reinforcement Learning Approach.
arxiv
tensorflow
- Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning.
arxiv
tensorflow
- MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence.
arxiv
code
⭐ - Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm.
arxiv
- Micro-Objective Learning: Accelerating Deep Reinforcement Learning through the Discovery of Continuous Subgoals.
arxiv
- Neural Architecture Search with Reinforcement Learning.
arxiv
tensorflow
- Neural Map: Structured Memory for Deep Reinforcement Learning.
arxiv
- Observational Learning by Reinforcement Learning.
arxiv
- Overcoming Exploration in Reinforcement Learning with Demonstrations.
arxiv
- Practical Network Blocks Design with Q-Learning.
arxiv
- Rainbow: Combining Improvements in Deep Reinforcement Learning.
arxiv
- Reinforcement Learning for Architecture Search by Network Transformation.
arxiv
code
- Reinforcement Learning via Recurrent Convolutional Neural Networks.
arxiv
code
- Reinforcement Learning with a Corrupted Reward Channel.
arxiv
⭐ - Reinforcement Learning with Deep Energy-Based Policies.
arxiv
code
- Reinforcement Learning with External Knowledge and Two-Stage Q-functions for Predicting Popular Reddit Threads.
arxiv
- Robust Deep Reinforcement Learning with Adversarial Attacks.
arxiv
- Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning.
arxiv
- Shallow Updates for Deep Reinforcement Learning.
arxiv
code
- Stochastic Neural Networks for Hierarchical Reinforcement Learning.
pdf
code
- Tackling Error Propagation through Reinforcement Learning: A Case of Greedy Dependency Parsing.
arxiv
code
- Task-Oriented Query Reformulation with Reinforcement Learning.
arxiv
code
- Teaching a Machine to Read Maps with Deep Reinforcement Learning.
arxiv
code
- TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning.
arxiv
code
- Value Prediction Network.
arxiv
- Variational Deep Q Network.
arxiv
- Virtual-to-real Deep Reinforcement Learning: Continuous Control of Mobile Robots for Mapless Navigation.
arxiv
- Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning.
arxiv
2016
- Asynchronous Methods for Deep Reinforcement Learning. [arxiv] ⭐
- Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto, et al., ICLR. [arxiv]
- A New Softmax Operator for Reinforcement Learning. [url]
- Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML. [arxiv]
- Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR. [arxiv]
- Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR. [arxiv]
- Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv. [url]
- Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML. [arxiv]
- Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML. [arxiv]
- Continuous control with deep reinforcement learning. [arxiv] ⭐
- Deep Successor Reinforcement Learning. [arxiv]
- Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop. [arxiv]
- Deep Exploration via Bootstrapped DQN. [arxiv] ⭐
- Deep Reinforcement Learning for Dialogue Generation. [arxiv] [tensorflow]
- Deep Reinforcement Learning in Parameterized Action Space. [arxiv] ⭐
- Deep Reinforcement Learning with Successor Features for Navigation across Similar Environments. [url]
- Designing Neural Network Architectures using Reinforcement Learning. [arxiv] [code]
- Dialogue manager domain adaptation using Gaussian process reinforcement learning. [arxiv]
- End-to-End Reinforcement Learning of Dialogue Agents for Information Access. [arxiv]
- Generating Text with Deep Reinforcement Learning. [arxiv]
- Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv. [arxiv]
- Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv. [arxiv]
- Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv. [arxiv]
- Hierarchical Object Detection with Deep Reinforcement Learning. [arxiv]
- High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR. [arxiv]
- Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI. [arxiv]
- Interactive Spoken Content Retrieval by Deep Reinforcement Learning. [arxiv]
- Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv. [url]
- Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv. [url]
- Learning to compose words into sentences with reinforcement learning. [url]
- Loss is its own Reward: Self-Supervision for Reinforcement Learning. [arxiv]
- Model-Free Episodic Control. [arxiv]
- Mastering the game of Go with deep neural networks and tree search. [nature] ⭐
- MazeBase: A Sandbox for Learning from Games. [arxiv]
- Neural Architecture Search with Reinforcement Learning. [pdf]
- Neural Combinatorial Optimization with Reinforcement Learning. [arxiv]
- Non-Deterministic Policy Improvement Stabilizes Approximated Reinforcement Learning. [url]
- Online Sequence-to-Sequence Active Learning for Open-Domain Dialogue Generation. arXiv. [arxiv]
- Policy Distillation, A. A. Rusu et al., ICLR. [arxiv]
- Prioritized Experience Replay. [arxiv] ⭐
- Reinforcement Learning Using Quantum Boltzmann Machines. [arxiv]
- Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al. [arxiv]
- Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving. [arxiv]
- Sample-efficient Deep Reinforcement Learning for Dialog Control. [url]
- Self-Correcting Models for Model-Based Reinforcement Learning. [url]
- Unifying Count-Based Exploration and Intrinsic Motivation. [arxiv]
- Value Iteration Networks. [arxiv]
2015
- ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources.
arxiv
- Action-Conditional Video Prediction using Deep Networks in Atari Games.
arxiv
⭐ - Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning.
arxiv
⭐ - [DDPG] Continuous control with deep reinforcement learning.
arxiv
⭐ - [NAF] Continuous Deep Q-Learning with Model-based Acceleration.
arxiv
⭐ - Dueling Network Architectures for Deep Reinforcement Learning.
arxiv
⭐ - Deep Reinforcement Learning with an Action Space Defined by Natural Language.
arxiv
- Deep Reinforcement Learning with Double Q-learning.
arxiv
⭐ - Deep Recurrent Q-Learning for Partially Observable MDPs.
arxiv
⭐ - DeepMPC: Learning Deep Latent Features for Model Predictive Control.
pdf
- Deterministic Policy Gradient Algorithms.
pdf
- End-to-End Training of Deep Visuomotor Policies.
arxiv
⭐ - Giraffe: Using Deep Reinforcement Learning to Play Chess.
arxiv
- Generating Text with Deep Reinforcement Learning.
arxiv
- How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies.
arxiv
- Human-level control through deep reinforcement learning.
nature
⭐ - Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models.
arxiv
⭐ - Learning Simple Algorithms from Examples.
arxiv
- Language Understanding for Text-based Games Using Deep Reinforcement Learning.
pdf
⭐ - Learning Continuous Control Policies by Stochastic Value Gradients.
pdf
⭐ - Multiagent Cooperation and Competition with Deep Reinforcement Learning.
arxiv
- Maximum Entropy Deep Inverse Reinforcement Learning.
arxiv
- Massively Parallel Methods for Deep Reinforcement Learning.
pdf
⭐ - On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models.
arxiv
- Playing Atari with Deep Reinforcement Learning.
arxiv
- Recurrent Reinforcement Learning: A Hybrid Approach.
arxiv
- Strategic Dialogue Management via Deep Reinforcement Learning.
arxiv
- Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control.
arxiv
- Trust Region Policy Optimization.
pdf
⭐ - Universal Value Function Approximators.
pdf
- Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning.
arxiv
2014
- Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning. [url]
2013
- Evolving large-scale neural networks for vision-based reinforcement learning. [idsia] ⭐
- Playing Atari with Deep Reinforcement Learning. [toronto] ⭐
- Reinforcement Learning: A Survey, JAIR, 1996. [arxiv]
- A Tutorial Survey of Reinforcement Learning, Sadhana, 1994. [Paper]
- Reinforcement Learning in Robotics, A Survey, IJRR, 2013. [Paper]
- A Brief Survey of Deep Reinforcement Learning, 2017. [arxiv]
- A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation, 2018. [arxiv]
- Universal Reinforcement Learning Algorithms: Survey and Experiments, 2017. [arxiv]
- Bayesian Reinforcement Learning: A Survey, 2016.
- A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress
- Benchmarking Reinforcement Learning Algorithms on Real-World Robots
- Deep Reinforcement Learning: An Overview
- Steps toward Artificial Intelligence, Proceedings of the IRE, 1961. [Paper] (discusses issues in RL such as the "credit assignment problem")
- An Adaptive Optimal Controller for Discrete-Time Markov Environments, Information and Control, 1977. [Paper] (earliest publication on temporal-difference (TD) learning rule)
Dynamic Programming (DP):
- Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, 1989. [Thesis]
Monte Carlo:
Temporal-Difference:
- Learning to predict by the methods of temporal differences. Machine Learning 3: 9-44, 1988. [Paper]
Q-Learning (Off-policy TD algorithm):
- Learning from Delayed Rewards, Cambridge, 1989. [Thesis]
Sarsa (On-policy TD algorithm):
R-Learning (learning of relative values)
- A Reinforcement Learning Method for Maximizing Undiscounted Rewards, ICML, 1993. [Paper-Google Scholar]
Function Approximation methods (Least-Square Temporal Difference, Least-Square Policy Iteration)
Policy Search / Policy Gradient
- Policy Gradient Methods for Reinforcement Learning with Function Approximation, NIPS, 1999. [Paper]
- Natural Actor-Critic, ECML, 2005. [Paper]
- Policy Search for Motor Primitives in Robotics, NIPS, 2009. [Paper]
- Relative Entropy Policy Search, AAAI, 2010. [Paper]
- Path Integral Policy Improvement with Covariance Matrix Adaptation, ICML, 2012. [Paper]
- Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion, ICRA, 2004. [Paper]
- PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML, 2011. [Paper]
- Learning Dynamic Arm Motions for Postural Recovery, Humanoids, 2011. [Paper]
- Black-Box Data-efficient Policy Search for Robotics, IROS, 2017. [Paper]
Hierarchical RL
Deep Learning + Reinforcement Learning (A sample of recent works on DL+RL)
- Human-level Control through Deep Reinforcement Learning, Nature, 2015. [Paper]
- Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, NIPS, 2014. [Paper]
- End-to-End Training of Deep Visuomotor Policies. ArXiv, 16 Oct 2015. [ArXiv]
- Prioritized Experience Replay, ArXiv, 18 Nov 2015. [ArXiv]
- Hado van Hasselt, Arthur Guez, David Silver, Deep Reinforcement Learning with Double Q-Learning, ArXiv, 22 Sep 2015. [ArXiv]
- Asynchronous Methods for Deep Reinforcement Learning, ArXiv, 4 Feb 2016. [ArXiv]
Traditional Games
- Backgammon - "TD-Gammon" game play using TD(λ) (Tesauro, ACM 1995) [Paper]
- Chess - "KnightCap" program using TD(λ) (Baxter, arXiv 1999) [arXiv]
- Chess - Giraffe: Using deep reinforcement learning to play chess (Lai, arXiv 2015) [arXiv]
Computer Games
- Human-level Control through Deep Reinforcement Learning (Mnih, Nature 2015) [Paper] [Code] [Video]
- Flappy Bird Reinforcement Learning [Video]
- MarI/O - learning to play Mario with evolutionary reinforcement learning using artificial neural networks (Stanley, Evolutionary Computation 2002) [Paper] [Video]
- Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion (Kohl, ICRA 2004) [Paper]
- Robot Motor Skill Coordination with EM-based Reinforcement Learning (Kormushev, IROS 2010) [Paper] [Video]
- Generalized Model Learning for Reinforcement Learning on a Humanoid Robot (Hester, ICRA 2010) [Paper] [Video]
- Autonomous Skill Acquisition on a Mobile Manipulator (Konidaris, AAAI 2011) [Paper] [Video]
- PILCO: A Model-Based and Data-Efficient Approach to Policy Search (Deisenroth, ICML 2011) [Paper]
- Incremental Semantically Grounded Learning from Demonstration (Niekum, RSS 2013) [Paper]
- Efficient Reinforcement Learning for Robots using Informative Simulated Priors (Cutler, ICRA 2015) [Paper] [Video]
- Robots that can adapt like animals (Cully, Nature 2015) [Paper] [Video] [Code]
- Black-Box Data-efficient Policy Search for Robotics (Chatzilygeroudis, IROS 2017) [Paper] [Video] [Code]
- An Application of Reinforcement Learning to Aerobatic Helicopter Flight (Abbeel, NIPS 2006) [Paper] [Video]
- Autonomous helicopter control using Reinforcement Learning Policy Search Methods (Bagnell, ICRA 2001) [Paper]
- Scaling Average-reward Reinforcement Learning for Product Delivery (Proper, AAAI 2004) [Paper]
- Cross Channel Optimized Marketing by Reinforcement Learning (Abe, KDD 2004) [Paper]
- Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (Singh, JAIR 2002) [Paper]
- Reinforcement Learning (RL)
- Simple Reinforcement Learning with Tensorflow Part 0-8
- Deep_reinforcement_learning_Course
- Introduction to Various Reinforcement Learning Algorithms. Part I (Q-Learning, SARSA, DQN, DDPG)
- Machine Learning for Humans, Part 5: Reinforcement Learning
- Deep reinforcement learning: where to start
- Learning Policies For Learning Policies — Meta Reinforcement Learning (RL²) in Tensorflow
- Introduction to Various Reinforcement Learning Algorithms. Part II (TRPO, PPO)
- reinforcementlearning.ai-depot.com