- [Matching Networks for One Shot Learning](notes/oneshot.md) [arXiv]
- [Deep Neural Networks for YouTube Recommendations](notes/youtube_recommendations.md) [[Google](https://research.google.com/pubs/pub45530.html)]
- [Doctor AI: Predicting Clinical Events via Recurrent Neural Networks](notes/docai.md) [[arXiv](http://arxiv.org/abs/1511.05942)]
- [Distributed Representations of Words and Phrases and their Compositionality](notes/word2vec_mikolov.md) [[NIPS](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf)]
- Multi-layer Representation Learning for Medical Concepts [[arXiv](https://arxiv.org/abs/1602.05568)]
- [Convolutional Neural Networks for Sentence Classification](notes/cnn_text.md) [[arXiv](http://arxiv.org/abs/1408.5882)]
- Recurrent Neural Network Regularization [[arXiv](http://arxiv.org/abs/1409.2329)]
- Grammar as a Foreign Language [[arXiv](http://arxiv.org/abs/1412.7449)]
- [Sequence to Sequence Learning with Neural Networks](notes/seq_to_seq_rnn.md) [[arXiv](https://arxiv.org/abs/1409.3215)]
- [Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation](notes/rnn_encode_decode.md) [[arXiv](http://arxiv.org/abs/1406.1078)]
- [Neural Machine Translation by Jointly Learning to Align and Translate](notes/rnn_attention.md) [[arXiv](http://arxiv.org/abs/1409.0473)] - Attention in RNNs
- [On Using Very Large Target Vocabulary for Neural Machine Translation](notes/rnn_softmax.md) [[arXiv](http://arxiv.org/abs/1412.2007)] - Sampled Softmax
- [Pointer Sentinel Mixture Models](notes/pointer_sentinel.md) [arXiv]
- [Context-Dependent Word Representation for Neural Machine Translation](notes/context.md) [arXiv]
- [Learning to Translate in Real-time with Neural Machine Translation](notes/real_time_NMT.md) [arXiv]
- [Fully Character-Level Neural Machine Translation without Explicit Segmentation](notes/fully_char.md) [arXiv]
- [A Neural Conversational Model](notes/conversation.md) [[arXiv](http://arxiv.org/abs/1506.05869)]
- End-To-End Memory Networks [arXiv]
- [Ask Me Anything: Dynamic Memory Networks for Natural Language Processing](notes/ama.md) [arXiv]
- [Dynamic Memory Networks for Visual and Textual Question Answering](notes/visual_qa.md) [arXiv]
- [Dynamic Coattention Networks For Question Answering](notes/coattention.md) [arXiv]
- [Richard Socher on the Future of Deep Learning](notes/future_socher.md) [O'Reilly]
- [A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks] [arXiv]
- Bidirectional Attention Flow for Machine Comprehension [arXiv]
- [Generating Long and Diverse Responses with Neural Conversation Models](notes/diverse.md) [arXiv]
- [Gated-Attention Readers for Text Comprehension](notes/ga.md) [arXiv]
- [FVQA: Fact-based Visual Question Answering](notes/fvqa.md) [arXiv]
- [Query-Reduction Networks for Question Answering](notes/qrn.md) [arXiv]
- [Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks] [arXiv]
- [Deep API Learning] [arXiv]
- Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks [[arXiv](http://arxiv.org/abs/1609.02993)]
- Third-Person Imitation Learning [arXiv]
- WaveNet: A Generative Model for Raw Audio [[arXiv](https://arxiv.org/abs/1609.03499)] [[Tutorial](https://deepmind.com/blog/wavenet-generative-model-raw-audio/)]
- Decoupled Neural Interfaces using Synthetic Gradients [[arXiv](https://arxiv.org/abs/1608.05343)] [[Tutorial](https://deepmind.com/blog/decoupled-neural-networks-using-synthetic-gradients/)]
- [Neural Turing Machines](notes/ntm.md) [[arXiv](http://arxiv.org/abs/1410.5401)]
- [Hybrid Computing using a Neural Network with Dynamic External Memory] [Nature]
- [Generative Adversarial Networks](notes/GAN.md) [[arXiv](https://arxiv.org/abs/1406.2661)]
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [arXiv]
- Generative Adversarial Text to Image Synthesis [[arXiv](https://arxiv.org/abs/1605.05396)]
- Improved Techniques for Training GANs [[arXiv](https://arxiv.org/abs/1606.03498)]
- Learning to Protect Communications with Adversarial Neural Cryptography [arXiv]
- Understanding Deep Learning Requires Rethinking Generalization [[arXiv](https://arxiv.org/abs/1611.03530)]
- [Highway Networks](notes/highway.md) [[arXiv](https://arxiv.org/abs/1505.00387)]
- [Maxout Networks] [arXiv]
- [HyperNetworks](notes/hypernetworks.md) [[arXiv](https://arxiv.org/abs/1609.09106)]
- [Using Fast Weights to Attend to the Recent Past](notes/fast_weights.md) [[arXiv](https://arxiv.org/abs/1610.06258)]
- Learning to learn by gradient descent by gradient descent [arXiv]
- GRAM: Graph-based Attention Model for Healthcare Representation Learning [arXiv]
- [Language Modeling with Gated Convolutional Networks] [arXiv]
- [Value Iteration Networks] [arXiv]
- [Adding Gradient Noise Improves Learning for Very Deep Networks] [arXiv]
- [Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer] [Open Review]