A list of recent papers on reinforcement learning. I am most interested in research that applies reinforcement learning to game AI, so most papers listed here focus on experiments on games. The papers I have read are marked, and I will add notes on them soon.
- Deep Reinforcement Learning: An Overview, Li Yuxi, arXiv, 2017.
- Deep Learning for Video Game Playing, Justesen Niels et al., arXiv, 2017.
- Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2017.
These algorithms usually map the state to action-values and then map the action-values to the optimal action.
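As a rough illustration of the two-step mapping described above (state to action-values, then action-values to an action), here is a minimal sketch in plain Python. The names `greedy_action`, `act`, `toy_q_function`, and the toy table are hypothetical and only for illustration; in the papers below the Q-function is typically a neural network rather than a lookup table.

```python
from typing import Callable, Hashable, Sequence

def greedy_action(q_values: Sequence[float]) -> int:
    """Map action-values to an action: pick the index with the largest value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

def act(q_function: Callable[[Hashable], Sequence[float]], state: Hashable) -> int:
    """Map a state to action-values, then action-values to the greedy action."""
    return greedy_action(q_function(state))

# Hypothetical toy Q-table: two states, two actions (0 = left, 1 = right).
TOY_Q_TABLE = {
    "left_of_goal": [0.1, 0.9],
    "right_of_goal": [0.8, 0.2],
}

def toy_q_function(state: Hashable) -> Sequence[float]:
    """Stand-in for a learned Q-function (assumption: a table, not a network)."""
    return TOY_Q_TABLE[state]

if __name__ == "__main__":
    for s in TOY_Q_TABLE:
        print(s, "->", act(toy_q_function, s))
```

During training, these methods usually mix the greedy choice with random exploration (for example epsilon-greedy); the sketch shows only the greedy mapping.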
- Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.
- Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
- Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
- Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
- Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
- Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
- Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
- Deep Attention Recurrent Q-Network, Ivan Sorokin et al., arXiv, 2015.
- PGQ: Combining policy gradient and Q-learning, Brendan O'Donoghue et al., arXiv, 2017.
Most policy gradient algorithms need to compute a value function during training, so they are combined with value methods. At test time they use only the policy function, so I classify them together as policy methods.
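The split between training and testing mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any specific paper in the list: the policy is a softmax over a plain list of logits, the value estimates are passed in as numbers, and the helper names (`advantage`, `policy_gradient_step`, `act_at_test_time`) are hypothetical.

```python
import math
import random
from typing import List

def softmax(logits: List[float]) -> List[float]:
    """Convert logits to action probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def advantage(reward: float, value_s: float, value_next: float,
              gamma: float = 0.99) -> float:
    """Training only: one-step advantage estimate r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_s

def policy_gradient_step(logits: List[float], action: int, adv: float,
                         lr: float = 0.1) -> List[float]:
    """Training only: one ascent step on adv * log pi(action | s).

    For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - pi_i.
    """
    probs = softmax(logits)
    return [l + lr * adv * ((1.0 if i == action else 0.0) - p)
            for i, (l, p) in enumerate(zip(logits, probs))]

def act_at_test_time(logits: List[float], rng: random.Random) -> int:
    """Test time: sample an action from the policy; no value function is needed."""
    return rng.choices(range(len(logits)), weights=softmax(logits))[0]

if __name__ == "__main__":
    logits = [0.0, 0.0]
    adv = advantage(reward=1.0, value_s=0.5, value_next=0.0)
    logits = policy_gradient_step(logits, action=1, adv=adv)
    print(softmax(logits), act_at_test_time(logits, random.Random(0)))
```

The value estimates enter only through `advantage`, which is why the value function can be dropped once training is done.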
- Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
- Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU, Mohammad Babaeizadeh et al., ICLR, 2017.
- Deterministic Policy Gradient Algorithms, Silver David et al., ICML, 2014.
- Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
- Reinforcement Learning With Unsupervised Auxiliary Tasks, M Jaderberg et al., ICLR, 2017.
- High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
- Sample Efficient Actor-Critic with Experience Replay, Wang Ziyu et al., ICLR, 2017.
TODO: It's too ambiguous to categorize the papers below, so I will classify them again by another measure.
- Multiagent Cooperation and Competition with Deep Reinforcement Learning, Tampuu A et al., arXiv, 2015.
- Opponent Modeling in Deep Reinforcement Learning, He H et al., ICML, 2016.
- Learning to Communicate with Deep Multi-agent Reinforcement Learning, Foerster J et al., NIPS, 2016.
- Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning, Foerster J et al., arXiv, 2017.
- Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, Foerster J N et al., arXiv, 2016.
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, Lowe R et al., arXiv, 2017.
- Learning multiagent communication with backpropagation, Sukhbaatar S and Fergus R, NIPS, 2016.
- Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games, Peng P et al., arXiv, 2017.
Thanks to junhyukoh and LantaoYu. Their collections helped me a lot, and most of the information about the papers above was copied from them.