Summaries and notes on Deep Learning research papers


  • The Matrix Calculus You Need For Deep Learning [arXiv]
  • Regularized Evolution for Image Classifier Architecture Search [arXiv]
  • Online Learning: A Comprehensive Survey [arXiv]
  • Visual Interpretability for Deep Learning: a Survey [arXiv]
  • Behavior is Everything – Towards Representing Concepts with Sensorimotor Contingencies [paper] [article] [code]
  • IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures [arXiv] [article] [code]
  • DeepType: Multilingual Entity Linking by Neural Type System Evolution [arXiv] [article] [code]
  • DensePose: Dense Human Pose Estimation In The Wild [arXiv] [article]


  • Nested LSTMs [arXiv]
  • Generating Wikipedia by Summarizing Long Sequences [arXiv]
  • Scalable and accurate deep learning for electronic health records [arXiv]
  • Kernel Feature Selection via Conditional Covariance Minimization [NIPS paper] [article] [code]
  • Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents [arXiv] [article] [code]
  • Fine-tuned Language Models for Text Classification [arXiv] [code] (soon)
  • Deep Learning: An Introduction for Applied Mathematicians [arXiv]
  • Innateness, AlphaZero, and Artificial Intelligence [arXiv]
  • Can Computers Create Art? [arXiv]
  • eCommerceGAN: A Generative Adversarial Network for E-commerce [arXiv]
  • Expected Policy Gradients for Reinforcement Learning [arXiv]
  • DroNet: Learning to Fly by Driving [UZH docs] [article] [code]
  • Symmetric Decomposition of Asymmetric Games [Scientific Reports] [article]
  • Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor [arXiv] [code]
  • SBNet: Sparse Blocks Network for Fast Inference [arXiv] [article] [code]
  • DeepMind Control Suite [arXiv] [code]
  • Deep Learning: A Critical Appraisal [arXiv]


  • Adversarial Patch [arXiv]
  • CNN Is All You Need [arXiv]
  • Learning Robot Objectives from Physical Human Interaction [paper] [article]
  • The NarrativeQA Reading Comprehension Challenge [arXiv] [dataset]
  • Objects that Sound [arXiv]
  • Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions [arXiv] [article] [article2]
  • Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning [arXiv] [article] [code]
  • Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents [arXiv] [article] [code]
  • Superhuman AI for heads-up no-limit poker: Libratus beats top professionals [Science]
  • Mathematics of Deep Learning [arXiv]
  • State-of-the-art Speech Recognition With Sequence-to-Sequence Models [arXiv] [article]
  • Peephole: Predicting Network Performance Before Training [arXiv]
  • Deliberation Network: Pushing the frontiers of neural machine translation [Research at Microsoft] [article]
  • GPU Kernels for Block-Sparse Weights [Research at OpenAI] [article] [code]
  • Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm [arXiv]
  • Deep Learning Scaling is Predictable, Empirically [arXiv] [article]


  • High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs [arXiv] [article] [code]
  • StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation [arXiv] [code]
  • Population Based Training of Neural Networks [arXiv] [article]
  • Distilling a Neural Network Into a Soft Decision Tree [arXiv]
  • Neural Text Generation: A Practical Guide [arXiv]
  • Parallel WaveNet: Fast High-Fidelity Speech Synthesis [DeepMind documents] [article]
  • CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning [arXiv] [article]
  • Non-local Neural Networks [arXiv]
  • Deep Image Prior [paper] [article] [code]
  • Online Deep Learning: Learning Deep Neural Networks on the Fly [arXiv]
  • Learning Explanatory Rules from Noisy Data [arXiv]
  • Improving Palliative Care with Deep Learning [arXiv] [article]
  • VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection [arXiv]
  • Weighted Transformer Network for Machine Translation [arXiv] [article]
  • Non-Autoregressive Neural Machine Translation [arXiv] [article]
  • Block-Sparse Recurrent Neural Networks [arXiv]
  • A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning [arXiv]
  • Neural Discrete Representation Learning [arXiv] [article]
  • Don't Decay the Learning Rate, Increase the Batch Size [arXiv]
  • Hierarchical Representations for Efficient Architecture Search [arXiv]


  • Unsupervised Machine Translation Using Monolingual Corpora Only [arXiv]
  • Dynamic Routing Between Capsules [arXiv]
  • A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs [Science] [article] [code]
  • Understanding Grounded Language Learning Agents [arXiv]
  • Planning, Fast and Slow: A Framework for Adaptive Real-Time Safe Trajectory Planning [arXiv] [article] [code] (soon)
  • Malware Detection by Eating a Whole EXE [arXiv] [article]
  • Progressive Growing of GANs for Improved Quality, Stability, and Variation [Research at Nvidia] [article] [code]
  • Meta Learning Shared Hierarchies [arXiv] [article] [code]
  • Deep Voice 3: 2000-Speaker Neural Text-to-Speech [arXiv] [article]
  • AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions [arXiv] [article] [dataset]
  • Mastering the game of Go without Human Knowledge [Nature] [article]
  • Sim-to-Real Transfer of Robotic Control with Dynamics Randomization [arXiv] [article]
  • Asymmetric Actor Critic for Image-Based Robot Learning [arXiv] [article]
  • A systematic study of the class imbalance problem in convolutional neural networks [arXiv]
  • Generalization in Deep Learning [arXiv]
  • Swish: a Self-Gated Activation Function [arXiv]
  • Emergent Translation in Multi-Agent Communication [arXiv]
  • SLING: A framework for frame semantic parsing [arXiv] [article] [code]
  • Meta-Learning for Wrestling [arXiv] [article] [code]
  • Mixed Precision Training [arXiv] [article] [article2] [code/docs]
  • Generative Adversarial Networks: An Overview [arXiv]
  • Emergent Complexity via Multi-Agent Competition [arXiv] [article] [code]
  • Deep Lattice Networks and Partial Monotonic Functions [Research at Google] [article] [code]
  • The IIT Bombay English-Hindi Parallel Corpus [arXiv] [article]
  • Rainbow: Combining Improvements in Deep Reinforcement Learning [arXiv]
  • Lifelong Learning With Dynamically Expandable Networks [arXiv]
  • Variational Inference & Deep Learning: A New Synthesis (Thesis) [dropbox]
  • Neural Task Programming: Learning to Generalize Across Hierarchical Tasks [arXiv]
  • Neural Color Transfer between Images [arXiv]
  • The hippocampus as a predictive map [bioRxiv] [article]
  • Scalable and accurate deep learning for electronic health records [arXiv]


  • Variational Memory Addressing in Generative Models [arXiv]
  • Overcoming Exploration in Reinforcement Learning with Demonstrations [arXiv]
  • A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement [arXiv] [article] [code]
  • ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases [CVF] [article] [dataset]
  • NIMA: Neural Image Assessment [arXiv] [article]
  • Generating Sentences by Editing Prototypes [arXiv] [code]
  • The Consciousness Prior [arXiv]
  • StarSpace: Embed All The Things! [arXiv] [code]
  • Neural Optimizer Search with Reinforcement Learning [arXiv]
  • Dynamic Evaluation of Neural Sequence Models [arXiv]
  • Neural Machine Translation [arXiv]
  • Matterport3D: Learning from RGB-D Data in Indoor Environments [arXiv] [article] [article2] [code]
  • Deep Reinforcement Learning that Matters [arXiv] [code]
  • The Uncertainty Bellman Equation and Exploration [arXiv]
  • WESPE: Weakly Supervised Photo Enhancer for Digital Cameras [arXiv] [article]
  • Globally Normalized Reader [arXiv] [article] [code]
  • A Brief Introduction to Machine Learning for Engineers [arXiv]
  • Learning with Opponent-Learning Awareness [arXiv] [article]
  • A Deep Reinforcement Learning Chatbot [arXiv]
  • Squeeze-and-Excitation Networks [arXiv]
  • Efficient Methods and Hardware for Deep Learning (Thesis) [Stanford Digital Repository]


  • Design and Analysis of the NIPS 2016 Review Process [arXiv]
  • Fast Automated Analysis of Strong Gravitational Lenses with Convolutional Neural Networks [arXiv] [article]
  • TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow [white paper] [code]
  • Automated Crowdturfing Attacks and Defenses in Online Review Systems [arXiv]
  • Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning [arXiv] [article] [code]
  • Deep Learning for Video Game Playing [arXiv]
  • Deep & Cross Network for Ad Click Predictions [arXiv]
  • Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms [arXiv] [code]
  • Multi-task Self-Supervised Visual Learning [arXiv]
  • Learning a Multi-View Stereo Machine [arXiv] [article] [code] (soon)
  • Twin Networks: Using the Future as a Regularizer [arXiv]
  • A Brief Survey of Deep Reinforcement Learning [arXiv]
  • Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation [arXiv] [code]
  • On the Effectiveness of Visible Watermarks [CVPR] [article]
  • Practical Network Blocks Design with Q-Learning [arXiv]
  • On Ensuring that Intelligent Machines Are Well-Behaved [arXiv]
  • Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control [arXiv] [code]
  • Training Deep AutoEncoders for Collaborative Filtering [arXiv] [code]
  • Learning to Perform a Perched Landing on the Ground Using Deep Reinforcement Learning [nature]
  • Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification [arXiv] [article]
  • Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning [arXiv]
  • Neural Expectation Maximization [arXiv] [code]
  • Google Vizier: A Service for Black-Box Optimization [Research at Google]
  • STARDATA: A StarCraft AI Research Dataset [arXiv] [code]
  • Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm [arXiv] [code] [article]
  • Natural Language Processing with Small Feed-Forward Networks [arXiv]


  • Photographic Image Synthesis with Cascaded Refinement Networks [arXiv] [code]
  • StarCraft II: A New Challenge for Reinforcement Learning [DeepMind Documents] [code] [article]
  • Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards [arXiv]
  • Reinforcement Learning with Deep Energy-Based Policies [arXiv] [article] [code]
  • DARLA: Improving Zero-Shot Transfer in Reinforcement Learning [arXiv]
  • Synthesizing Robust Adversarial Examples [arXiv] [article] [code] (soon)
  • Voice Synthesis for in-the-Wild Speakers via a Phonological Loop [arXiv] [code] [article]
  • Eyemotion: Classifying facial expressions in VR using eye-tracking cameras [arXiv] [article]
  • A Distributional Perspective on Reinforcement Learning [arXiv] [article] [video]
  • On the State of the Art of Evaluation in Neural Language Models [arXiv]
  • Optimizing the Latent Space of Generative Networks [arXiv]
  • Neuroscience-Inspired Artificial Intelligence [Neuron] [article]
  • Learning Transferable Architectures for Scalable Image Recognition [arXiv]
  • Reverse Curriculum Generation for Reinforcement Learning [arXiv]
  • Imagination-Augmented Agents for Deep Reinforcement Learning [arXiv] [article]
  • Learning model-based planning from scratch [arXiv] [article]
  • Proximal Policy Optimization Algorithms [AWSS3] [code]
  • Automatic Recognition of Deceptive Facial Expressions of Emotion [arXiv]
  • Distral: Robust Multitask Reinforcement Learning [arXiv]
  • Creatism: A deep-learning photographer capable of creating professional work [arXiv] [article]
  • SCAN: Learning Abstract Hierarchical Compositional Visual Concepts [arXiv] [article]
  • Revisiting Unreasonable Effectiveness of Data in Deep Learning Era [arXiv] [article]
  • The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously [arXiv]
  • Deep Bilateral Learning for Real-Time Image Enhancement [arXiv] [code] [article]
  • Emergence of Locomotion Behaviours in Rich Environments [arXiv] [article]
  • Learning human behaviors from motion capture by adversarial imitation [arXiv] [article]
  • Robust Imitation of Diverse Behaviors [arXiv] [article]
  • Hindsight Experience Replay [arXiv]
  • Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks [arXiv] [article]
  • End-to-End Learning of Semantic Grasping [arXiv]
  • ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games [arXiv] [code] [article]


  • Noisy Networks for Exploration [arXiv]
  • Do GANs actually learn the distribution? An empirical study [arXiv]
  • Gradient Episodic Memory for Continuum Learning [arXiv]
  • Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog [arXiv] [code]
  • Deep Interest Network for Click-Through Rate Prediction [arXiv]
  • Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study [arXiv] [article]
  • Structure Learning in Motor Control: A Deep Reinforcement Learning Model [arXiv]
  • Programmable Agents [arXiv]
  • Grounded Language Learning in a Simulated 3D World [arXiv]
  • Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics [arXiv]
  • SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability [arXiv] [article] [code]
  • One Model To Learn Them All [arXiv] [code] [article]
  • Hybrid Reward Architecture for Reinforcement Learning [arXiv]
  • Expected Policy Gradients [arXiv]
  • Variational Approaches for Auto-Encoding Generative Adversarial Networks [arXiv]
  • Deal or No Deal? End-to-End Learning for Negotiation Dialogues [S3AWS] [code] [article]
  • Attention Is All You Need [arXiv] [code] [article]
  • Sobolev Training for Neural Networks [arXiv]
  • YellowFin and the Art of Momentum Tuning [arXiv] [code] [article]
  • Forward Thinking: Building and Training Neural Networks One Layer at a Time [arXiv]
  • Depthwise Separable Convolutions for Neural Machine Translation [arXiv] [code]
  • Parameter Space Noise for Exploration [arXiv] [code] [article]
  • Deep Reinforcement Learning from human preferences [arXiv] [article]
  • Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments [arXiv] [code]
  • Self-Normalizing Neural Networks [arXiv] [code]
  • Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour [arXiv]
  • A simple neural network module for relational reasoning [arXiv] [article]
  • Visual Interaction Networks [arXiv] [article]


  • Supervised Learning of Universal Sentence Representations from Natural Language Inference Data [arXiv] [code]
  • pix2code: Generating Code from a Graphical User Interface Screenshot [arXiv] [article] [code]
  • The Cramer Distance as a Solution to Biased Wasserstein Gradients [arXiv]
  • Reinforcement Learning with a Corrupted Reward Channel [arXiv]
  • Dilated Residual Networks [arXiv] [code]
  • Bayesian GAN [arXiv] [code]
  • Gradient Descent Can Take Exponential Time to Escape Saddle Points [arXiv] [article]
  • Thinking Fast and Slow with Deep Learning and Tree Search [arXiv]
  • ParlAI: A Dialog Research Software Platform [arXiv] [code] [article]
  • Semantically Decomposing the Latent Spaces of Generative Adversarial Networks [arXiv] [article]
  • Look, Listen and Learn [arXiv]
  • Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [arXiv] [code]
  • Convolutional Sequence to Sequence Learning [arXiv] [code] [code2] [article]
  • The Kinetics Human Action Video Dataset [arXiv] [article]
  • Safe and Nested Subgame Solving for Imperfect-Information Games [arXiv]
  • Discrete Sequential Prediction of Continuous Actions for Deep RL [arXiv]
  • Metacontrol for Adaptive Imagination-Based Optimization [arXiv]
  • Efficient Parallel Methods for Deep Reinforcement Learning [arXiv]
  • Real-Time Adaptive Image Compression [arXiv]


  • General Video Game AI: Learning from Screen Capture [arXiv]
  • Learning to Skim Text [arXiv]
  • Get To The Point: Summarization with Pointer-Generator Networks [arXiv] [code] [article]
  • Adversarial Neural Machine Translation [arXiv]
  • Deep Q-learning from Demonstrations [arXiv]
  • Learning from Demonstrations for Real World Reinforcement Learning [arXiv]
  • DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks [arXiv] [article] [code]
  • A Neural Representation of Sketch Drawings [arXiv] [code] [article]
  • Automated Curriculum Learning for Neural Networks [arXiv]
  • Hierarchical Surface Prediction for 3D Object Reconstruction [arXiv] [article]
  • Neural Message Passing for Quantum Chemistry [arXiv]
  • Learning to Generate Reviews and Discovering Sentiment [arXiv] [code]
  • Best Practices for Applying Deep Learning to Novel Applications [arXiv]


  • Improved Training of Wasserstein GANs [arXiv]
  • Evolution Strategies as a Scalable Alternative to Reinforcement Learning [arXiv]
  • Controllable Text Generation [arXiv]
  • Neural Episodic Control [arXiv]
  • A Structured Self-attentive Sentence Embedding [arXiv]
  • Multi-step Reinforcement Learning: A Unifying Algorithm [arXiv]
  • Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG [arXiv]
  • FaSTrack: a Modular Framework for Fast and Guaranteed Safe Motion Planning [arXiv] [article] [article2]
  • Massive Exploration of Neural Machine Translation Architectures [arXiv] [code]
  • Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression [arXiv] [article] [code]
  • Minimax Regret Bounds for Reinforcement Learning [arXiv]
  • Sharp Minima Can Generalize For Deep Nets [arXiv]
  • Parallel Multiscale Autoregressive Density Estimation [arXiv]
  • Neural Machine Translation and Sequence-to-sequence Models: A Tutorial [arXiv]
  • Large-Scale Evolution of Image Classifiers [arXiv]
  • FeUdal Networks for Hierarchical Reinforcement Learning [arXiv]
  • Evolving Deep Neural Networks [arXiv]
  • How to Escape Saddle Points Efficiently [arXiv] [article]
  • Opening the Black Box of Deep Neural Networks via Information [arXiv] [video]
  • Understanding Synthetic Gradients and Decoupled Neural Interfaces [arXiv]
  • Learning to Optimize Neural Nets [arXiv] [article]


  • The Shattered Gradients Problem: If resnets are the answer, then what is the question? [arXiv]
  • Neural Map: Structured Memory for Deep Reinforcement Learning [arXiv]
  • Bridging the Gap Between Value and Policy Based Reinforcement Learning [arXiv]
  • Deep Voice: Real-time Neural Text-to-Speech [arXiv]
  • Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning [arXiv]
  • The Game Imitation: Deep Supervised Convolutional Networks for Quick Video Game AI [arXiv]
  • Learning to Parse and Translate Improves Neural Machine Translation [arXiv]
  • All-but-the-Top: Simple and Effective Postprocessing for Word Representations [arXiv]
  • Deep Learning with Dynamic Computation Graphs [arXiv]
  • Skip Connections as Effective Symmetry-Breaking [arXiv]
  • Semi-Supervised QA with Generative Domain-Adaptive Nets [arXiv]


  • Wasserstein GAN [arXiv]
  • Deep Reinforcement Learning: An Overview [arXiv]
  • DyNet: The Dynamic Neural Network Toolkit [arXiv]
  • DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker [arXiv]
  • NIPS 2016 Tutorial: Generative Adversarial Networks [arXiv]


  • A recurrent neural network without Chaos [arXiv]
  • Language Modeling with Gated Convolutional Networks [arXiv]
  • EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis [arXiv] [article]
  • Learning from Simulated and Unsupervised Images through Adversarial Training [arXiv]
  • How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs [arXiv]
  • Improving Neural Language Models with a Continuous Cache [arXiv]
  • DeepMind Lab [arXiv] [code]
  • Deep Learning of Robotic Tasks without a Simulator using Strong and Weak Human Supervision [arXiv]
  • Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning [arXiv]
  • Overcoming catastrophic forgetting in neural networks [arXiv]

2016-11 (ICLR Edition)

Reinforcement Learning:

  • Learning to reinforcement learn [arXiv]

Machine Translation & Dialog



  • Towards Deep Symbolic Reinforcement Learning [arXiv]
  • HyperNetworks [arXiv]
  • Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation [arXiv]
  • Safe and Efficient Off-Policy Reinforcement Learning [arXiv]
  • Playing FPS Games with Deep Reinforcement Learning [arXiv]
  • SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient [arXiv]
  • Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks [arXiv]
  • Energy-based Generative Adversarial Network [arXiv]
  • Stealing Machine Learning Models via Prediction APIs [arXiv]
  • Semi-Supervised Classification with Graph Convolutional Networks [arXiv]
  • WaveNet: A Generative Model For Raw Audio [arXiv]
  • Hierarchical Multiscale Recurrent Neural Networks [arXiv]
  • End-to-End Reinforcement Learning of Dialogue Agents for Information Access [arXiv]
  • Deep Neural Networks for YouTube Recommendations [paper]


  • Semantics derived automatically from language corpora contain human-like biases [arXiv]
  • Why does deep and cheap learning work so well? [arXiv]
  • Machine Comprehension Using Match-LSTM and Answer Pointer [arXiv]
  • Stacked Approximated Regression Machine: A Simple Deep Learning Approach [arXiv]
  • Decoupled Neural Interfaces using Synthetic Gradients [arXiv]
  • WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia [arXiv]
  • Temporal Attention Model for Neural Machine Translation [arXiv]
  • Residual Networks of Residual Networks: Multilevel Residual Networks [arXiv]
  • Learning Online Alignments with Continuous Rewards Policy Gradient [arXiv]




  • Hierarchical Memory Networks [arXiv]
  • Deep API Learning [arXiv]
  • Wide Residual Networks [arXiv]
  • TensorFlow: A system for large-scale machine learning [arXiv]
  • Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention [arXiv]
  • Aspect Level Sentiment Classification with Deep Memory Network [arXiv]
  • FractalNet: Ultra-Deep Neural Networks without Residuals [arXiv]
  • Learning End-to-End Goal-Oriented Dialog [arXiv]
  • One-shot Learning with Memory-Augmented Neural Networks [arXiv]
  • Deep Learning without Poor Local Minima [arXiv]
  • AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge [arXiv]
  • Data Programming: Creating Large Training Sets, Quickly [arXiv]
  • Deeply-Fused Nets [arXiv]
  • Deep Portfolio Theory [arXiv]
  • Unsupervised Learning for Physical Interaction through Video Prediction [arXiv]
  • Movie Description [arXiv]











  • Neural Random-Access Machines [arXiv]
  • Neural Programmer: Inducing Latent Programs with Gradient Descent [arXiv]
  • Neural Programmer-Interpreters [arXiv]
  • Learning Simple Algorithms from Examples [arXiv]
  • Neural GPUs Learn Algorithms [arXiv] [code]
  • On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models [arXiv]


  • ReSeg: A Recurrent Neural Network for Object Segmentation [arXiv]
  • Deconstructing the Ladder Network Architecture [arXiv]
  • Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [arXiv]
  • Multi-Scale Context Aggregation by Dilated Convolutions [arXiv] [code]









  • Correlational Neural Networks [arXiv]




  • Hidden Technical Debt in Machine Learning Systems [NIPS]



  • The Loss Surfaces of Multilayer Networks [arXiv]




  • Convolutional Neural Networks for Sentence Classification [arXiv]





  • A Convolutional Neural Network for Modelling Sentences [arXiv]





  • Visualizing and Understanding Convolutional Networks [arXiv]
  • DeViSE: A Deep Visual-Semantic Embedding Model [pub]
  • Maxout Networks [arXiv]
  • Exploiting Similarities among Languages for Machine Translation [arXiv]
  • Efficient Estimation of Word Representations in Vector Space [arXiv]


  • Natural Language Processing (almost) from Scratch [arXiv]