References

Books

[1] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. (2016). Deep Learning. Cambridge, MA: MIT Press.

[2] Ankur A. Patel. (2019). Hands-On Unsupervised Learning Using Python. O'Reilly Media, Inc.

[3] Takaki Makino et al. (2017). これからの強化学習 [The Future of Reinforcement Learning]. Morikita Publishing.

[4] Takahiro Kubo. (2019). Pythonで学ぶ強化学習 [Reinforcement Learning with Python]. Kodansha.

[5] Koki Saitoh. (2018). ゼロから作る Deep Learning 2 [Deep Learning from Scratch 2]. O'Reilly Japan.

[6] Yuta Tsuboi et al. (2017). 深層学習による自然言語処理 [Deep Learning for Natural Language Processing]. Kodansha.

[7] Tatsuya Harada. (2017). 画像認識 [Image Recognition]. Kodansha.

Papers

Structure

[1] Vincent Dumoulin and Francesco Visin. (2018). A guide to convolution arithmetic for deep learning. Retrieved from: https://arxiv.org/pdf/1603.07285.pdf

[2] Alex Krizhevsky. (2014). One weird trick for parallelizing convolutional neural networks. Retrieved from: https://arxiv.org/pdf/1404.5997.pdf

[3] Karen Simonyan and Andrew Zisserman. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. Retrieved from: https://arxiv.org/pdf/1409.1556.pdf

[4] Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer. (2016). SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. Retrieved from: https://arxiv.org/pdf/1602.07360.pdf

[5] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. (2018). Densely Connected Convolutional Networks. Retrieved from: https://arxiv.org/pdf/1608.06993.pdf

Optimizer

[1] Sebastian Ruder. (2017). An overview of gradient descent optimization algorithms. Retrieved from: https://arxiv.org/pdf/1609.04747.pdf

[2] Diederik P. Kingma and Jimmy Ba. (2017). Adam: A Method for Stochastic Optimization. Retrieved from: https://arxiv.org/pdf/1412.6980.pdf

[3] Matthew D. Zeiler. (2012). ADADELTA: An Adaptive Learning Rate Method. Retrieved from: https://arxiv.org/pdf/1212.5701.pdf

[4] John Duchi, Elad Hazan, and Yoram Singer. (2011). Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. Retrieved from: https://stanford.edu/~jduchi/projects/DuchiHaSi10_colt.pdf

[5] Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, and Jiawei Han. (2019). On the Variance of the Adaptive Learning Rate and Beyond. Retrieved from: https://arxiv.org/pdf/1908.03265.pdf

[6] Timothy Dozat. (2015). Incorporating Nesterov Momentum into Adam. Retrieved from: http://cs229.stanford.edu/proj2015/054_report.pdf

[7] Boris Ginsburg, Patrice Castonguay, Oleksii Hrinchuk, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Li, Huyen Nguyen, and Jonathan M. Cohen. (2019). Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks. Retrieved from: https://arxiv.org/pdf/1905.11286.pdf

RL

[1] Wolfram Schultz. (1998). Predictive Reward Signal of Dopamine Neurons. Retrieved from: https://www.physiology.org/doi/full/10.1152/jn.1998.80.1.1

[2] Kenji Doya. (2007). Reinforcement learning: Computational theory and biological mechanisms. Retrieved from: https://www.tandfonline.com/doi/pdf/10.2976/1.2732246

[3] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. (2015). Human-level control through deep reinforcement learning. Retrieved from: https://www.nature.com/articles/nature14236

[4] Hado van Hasselt, Arthur Guez, and David Silver. (2015). Deep Reinforcement Learning with Double Q-learning. Retrieved from: https://arxiv.org/pdf/1509.06461.pdf

[5] Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, and David Silver. (2015). Massively Parallel Methods for Deep Reinforcement Learning. Retrieved from: https://arxiv.org/pdf/1507.04296.pdf

[6] Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, and Nando de Freitas. (2016). Dueling Network Architectures for Deep Reinforcement Learning. Retrieved from: https://arxiv.org/pdf/1511.06581.pdf

[7] Tom Schaul, John Quan, Ioannis Antonoglou, and David Silver. (2015). Prioritized Experience Replay. Retrieved from: https://arxiv.org/pdf/1511.05952.pdf

[8] Marc G. Bellemare, Will Dabney, and Rémi Munos. (2017). A Distributional Perspective on Reinforcement Learning. Retrieved from: https://arxiv.org/pdf/1707.06887.pdf

[9] Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, and Shane Legg. (2018). Noisy Networks for Exploration. Retrieved from: https://arxiv.org/pdf/1706.10295.pdf

[10] Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, and David Silver. (2017). Rainbow: Combining Improvements in Deep Reinforcement Learning. Retrieved from: https://arxiv.org/pdf/1710.02298.pdf

[11] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. (2015). Continuous control with deep reinforcement learning. Retrieved from: https://arxiv.org/pdf/1509.02971.pdf

[12] Scott Fujimoto, Herke van Hoof, and David Meger. (2018). Addressing Function Approximation Error in Actor-Critic Methods. Retrieved from: https://arxiv.org/pdf/1802.09477.pdf

[13] John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, and Pieter Abbeel. (2015). Trust Region Policy Optimization. Retrieved from: https://arxiv.org/pdf/1502.05477.pdf

GAN

[1] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. (2014). Generative Adversarial Networks. Retrieved from: https://arxiv.org/pdf/1406.2661.pdf

[2] Alec Radford, Luke Metz, and Soumith Chintala. (2016). Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. Retrieved from: https://arxiv.org/pdf/1511.06434.pdf

[3] Mehdi Mirza and Simon Osindero. (2014). Conditional Generative Adversarial Nets. Retrieved from: https://arxiv.org/pdf/1411.1784.pdf

[4] Xudong Mao, Qing Li, Haoran Xie, Raymond Y.K. Lau, Zhen Wang, and Stephen Paul Smolley. (2017). Least Squares Generative Adversarial Networks. Retrieved from: https://arxiv.org/pdf/1611.04076.pdf

[5] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. (2018). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Retrieved from: https://arxiv.org/pdf/1703.10593.pdf

Pose Estimation

[1] Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. (2018). OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. Retrieved from: https://arxiv.org/pdf/1812.08008.pdf

NLP

[1] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. (2013). Efficient estimation of word representations in vector space. Retrieved from: https://arxiv.org/pdf/1301.3781.pdf

[2] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. (2014). GloVe: Global Vectors for Word Representation. Retrieved from: https://nlp.stanford.edu/pubs/glove.pdf

Datasets

[1] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. (1998). Gradient-based learning applied to document recognition. Retrieved from: http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf

[2] Han Xiao, Kashif Rasul, and Roland Vollgraf. (2017). Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. Retrieved from: https://arxiv.org/pdf/1708.07747.pdf

[3] Gregory Cohen, Saeed Afshar, Jonathan Tapson, and André van Schaik. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from: http://arxiv.org/abs/1702.05373

[4] Tarin Clanuwat et al. (2018). Deep Learning for Classical Japanese Literature. Retrieved from: https://arxiv.org/pdf/1812.01718.pdf

[5] Gary B. Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. (2008). Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. Retrieved from: http://vis-www.cs.umass.edu/lfw/lfw.pdf

[6] Alex Krizhevsky. (2009). Learning Multiple Layers of Features from Tiny Images. Retrieved from: https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf

[7] Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, and Piotr Dollár. (2015). Microsoft COCO: Common Objects in Context. Retrieved from: https://arxiv.org/pdf/1405.0312.pdf

Environments

[1] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. (2016). OpenAI Gym. Retrieved from: https://arxiv.org/pdf/1606.01540.pdf

Online resources

[1] PyTorch.org. (2019). PyTorch documentation — PyTorch master documentation. [online] Available at: https://pytorch.org/docs/stable/index.html

[2] SciPy.org. (2019). NumPy documentation - NumPy v1.16 Manual. [online] Available at: https://docs.scipy.org/doc/numpy/

[3] Wikipedia.org. (2019). Automatic differentiation. [online] Available at: https://en.wikipedia.org/wiki/Automatic_differentiation

[4] PyBullet.org (2019). Bullet Real-Time Physics Simulation. [online] Available at: https://pybullet.org/wordpress/