summaries and notes on neural language learning papers
Latest commit ff3ccaf Dec 3, 2016 @dykang committed on GitHub Update README.md

README.md

Neural language notes

Simple notes on papers about neural language learning from arXiv, ACL, EMNLP, and NAACL, plus some machine/deep learning papers from ICLR, ICML, and NIPS. These notes are inspired by Denny Britz's notes. There are also datasets for neural language research [link].

Conference Papers & Groups

2016-11

  • LEARNING TO COMPOSE WORDS INTO SENTENCES WITH REINFORCEMENT LEARNING [arxiv]
  • NEWSQA: A MACHINE COMPREHENSION DATASET [arxiv]
  • Context-aware Natural Language Generation with Recurrent Neural Networks [arxiv]
  • LEARNING FEATURES OF MUSIC FROM SCRATCH [arxiv data]
  • Grammar Argumented LSTM Neural Networks with Note-Level Encoding for Music Composition [arxiv]
  • PIXELVAE: A LATENT VARIABLE MODEL FOR NATURAL IMAGES [arxiv]
  • VARIATIONAL LOSSY AUTOENCODER [arxiv]
  • Generative Deep Neural Networks for Dialogue: A Short Review [arxiv]
  • Variational Graph Auto-Encoders [arxiv]
  • MODULAR MULTITASK REINFORCEMENT LEARNING WITH POLICY SKETCHES [arxiv]
  • Neural Machine Translation with Reconstruction [arxiv]
  • TOPICRNN: A RECURRENT NEURAL NETWORK WITH LONG-RANGE SEMANTIC DEPENDENCY [arxiv]
  • REFERENCE-AWARE LANGUAGE MODELS [arxiv]
  • A JOINT MANY-TASK MODEL: GROWING A NEURAL NETWORK FOR MULTIPLE NLP TASKS [arxiv]
  • Ordinal Common-sense Inference [arxiv]
  • Dual Learning for Machine Translation [arxiv]
  • Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision [arxiv]
  • What’s in an Explanation? Characterizing Knowledge and Inference Requirements for Elementary Science Exams [coling]

2016-10

  • Cross-Modal Scene Networks [arxiv]
  • IMPROVING SAMPLING FROM GENERATIVE AUTOENCODERS WITH MARKOV CHAINS [arxiv](https://arxiv.org/pdf/1610.09296v2.pdf)
  • Towards a continuous modeling of natural language domains [arxiv]
  • Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification [arxiv]
  • Socratic Learning [arxiv]
  • Professor Forcing: A New Algorithm for Training Recurrent Networks [arxiv]
  • A Paradigm for Situated and Goal-Driven Language Learning [arxiv]
  • A Theme-Rewriting Approach for Generating Algebra Word Problems [arxiv]
  • Lexicons and Minimum Risk Training for Neural Machine Translation: NAIST-CMU at WAT2016 [arxiv]
  • Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation [arxiv]
  • Cross-Sentence Inference for Process Knowledge [emnlp]
  • Learning to Translate in Real-time with Neural Machine Translation [arxiv]
  • Recurrent Neural Network Grammars [arxiv]
  • Connecting Generative Adversarial Networks and Actor-Critic Methods [arxiv]
  • Semantic Parsing with Semi-Supervised Sequential Autoencoders [arxiv]
  • Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding [arxiv]
  • A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs [arxiv]

2016-09

  • SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient [arxiv]
  • ReasoNet: Learning to Stop Reading in Machine Comprehension [arxiv]
  • SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity [arxiv]
  • NEURAL PHOTO EDITING WITH INTROSPECTIVE ADVERSARIAL NETWORKS [arxiv]
  • Language as a Latent Variable: Discrete Generative Models for Sentence Compression [arxiv]
  • Annotating Derivations: A New Evaluation Strategy and Dataset for Algebra Word Problems [arxiv]
  • Unsupervised Neural Hidden Markov Models [arxiv]
  • Creating Causal Embeddings for Question Answering with Minimal Supervision [arxiv]
  • Generating Videos with Scene Dynamics [arxiv]
  • On the Similarities Between Native, Non-native and Translated Texts [arxiv]
  • Energy-based Generative Adversarial Network [arxiv]
  • Knowledge as a Teacher: Knowledge-Guided Structural Attention Networks [arxiv]
  • Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks [arxiv]
  • WAVENET: A GENERATIVE MODEL FOR RAW AUDIO [arxiv]
  • Multimodal Attention for Neural Machine Translation [arxiv]
  • Neural Machine Translation with Supervised Attention [arxiv]
  • Formalizing Neurath's Ship: Approximate Algorithms for Online Causal Learning [arxiv]
  • Factored Neural Machine Translation [arxiv]
  • Discrete Variational Autoencoders [arxiv]
  • Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads [arxiv]
  • lamtram: A toolkit for language and translation modeling using neural networks [github]
  • C++ neural network library [github]
  • End-to-End Reinforcement Learning of Dialogue Agents for Information Access [arxiv]
  • Citation Classification for Behavioral Analysis of a Scientific Field [arxiv]
  • Reward Augmented Maximum Likelihood for Neural Structured Prediction [arxiv]
  • All Fingers are not Equal: Intensity of References in Scientific Articles [emnlp]
  • WAVENET: A GENERATIVE MODEL FOR RAW AUDIO [paper blog]
  • Hierarchical Multiscale Recurrent Neural Networks [arxiv]

2016-08

  • A Context-aware Natural Language Generator for Dialogue Systems [sigdial]
  • Investigation Into The Effectiveness Of Long Short Term Memory Networks For Stock Price Prediction [arxiv]
  • HIERARCHICAL ATTENTION MODEL FOR IMPROVED MACHINE COMPREHENSION OF SPOKEN CONTENT [arxiv]
  • Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks [arxiv code]
  • Progressive Neural Networks [arxiv]
  • Neural Variational Inference for Text Processing [arxiv]
  • Generative Adversarial Text to Image Synthesis [arxiv code]
  • Sequential Neural Models with Stochastic Layers [nips]
  • Deep Learning without Poor Local Minima [nips]
  • Actor-critic versus direct policy search: a comparison based on sample complexity [arxiv]
  • Policy Networks with Two-Stage Training for Dialogue Systems [arxiv]
  • Pointing the Unknown Words [acl]
  • An Incremental Parser for Abstract Meaning Representation [arxiv]
  • Topic Sensitive Neural Headline Generation [arxiv]
  • Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine [interspeech]
  • Face2Face: Real-time Face Capture and Reenactment of RGB Videos [demo]
  • Image-Space Modal Bases for Plausible Manipulation of Objects in Video [demo]
  • Decoupled neural interfaces using synthetic gradients [arxiv blog]
  • Full Resolution Image Compression with Recurrent Neural Networks [arxiv]
  • Who did What: A Large-Scale Person-Centered Cloze Dataset [arxiv data]
  • Pixel Recurrent Neural Networks [arxiv]
  • Mollifying Networks [arxiv]
  • Variational Information Maximizing Exploration [arxiv]
  • Does Multimodality Help Human and Machine for Translation and Image Captioning [arxiv]
  • Learning values across many orders of magnitude [arxiv]
  • Attend, Infer, Repeat: Fast Scene Understanding with Generative Models [arxiv]
  • Architectural Complexity Measures of Recurrent Neural Network [arxiv]
  • Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge [arxiv]
  • Canonical Correlation Inference for Mapping Abstract Scenes to Text [arxiv]
  • Temporal Attention Model for Neural Machine Translation [arxiv]
  • Bi-directional Attention with Agreement for Dependency Parsing [arxiv]
  • Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change [arxiv]
  • Recurrent Highway Networks [arxiv]
  • Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond [arxiv]
  • WIKIREADING: A Novel Large-scale Language Understanding Task over Wikipedia [arxiv]
  • Larger-Context Language Modelling with Recurrent Neural Network [acl16]
  • Learning Online Alignments with Continuous Rewards Policy Gradient [arxiv]
  • Issues in evaluating semantic spaces using word analogies [acl16]
  • Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks [arxiv]
  • Control of Memory, Active Perception, and Action in Minecraft [icml16]
  • Dueling Network Architectures for Deep Reinforcement Learning [arxiv]
  • Human-level control through deep reinforcement learning [nature]
  • Reinforcement Learning in Multi-Party Trading Dialog [arxiv]
  • Large-scale Simple Question Answering with Memory Network [arxiv]
  • On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems [arxiv]
  • A New Method to Visualize Deep Neural Networks [arxiv]
  • Dreaming of names with RBMs [blog]
  • Synthesizing Compound Words for Machine Translation [acl16]
  • Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations [arxiv]
  • Learning to Transduce with Unbounded Memory [arxiv]
  • Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets [nips]
  • LEARNING LONGER MEMORY IN RECURRENT NEURAL NETWORKS [iclr15]
  • Attention-based Multimodal Neural Machine Translation [acl16]
  • A Two-stage Approach for Extending Event Detection to New Types via Neural Networks [acl16]
  • Learning text representation using recurrent convolutional neural network with highway layers [arxiv]
  • Training Very Deep Networks [arxiv]
  • SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity [arxiv]
  • Counter-fitting Word Vectors to Linguistic Constraints [naacl16]
  • NEURAL PROGRAMMER: INDUCING LATENT PROGRAMS WITH GRADIENT DESCENT [iclr16]
  • Supervised Attentions for Neural Machine Translation [arxiv]
  • A Neural Knowledge Language Model [arxiv]
  • Recurrent Models of Visual Attention [cvpr]
  • XGBoost: A Scalable Tree Boosting System [arxiv]
  • A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories [naacl]
  • Deep Learning Trends @ ICLR 2016 [blog]
  • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [arxiv]
  • VISUALIZING AND UNDERSTANDING RECURRENT NETWORKS [iclr]
  • Net2Net: ACCELERATING LEARNING VIA KNOWLEDGE TRANSFER [iclr]
  • A Latent Variable Recurrent Neural Network for Discourse Relation Language Models [arxiv]
  • A Recurrent Latent Variable Model for Sequential Data [arxiv]
  • ORDER-EMBEDDINGS OF IMAGES AND LANGUAGE [iclr]
  • Neural Module Networks [arxiv]
  • Learning to Compose Neural Networks for Question Answering [acl]
  • Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units [arxiv]

2016-07

  • Constructing a Natural Language Inference Dataset using Generative Neural Networks [arxiv]
  • ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD [arxiv]
  • An Actor-Critic Algorithm for Sequence Prediction [arxiv]
  • Enriching Word Vectors with Subword Information [arxiv]
    • each word is represented as a bag of character n-grams in skip-gram
  • Neural Machine Translation with Recurrent Attention Modeling [arxiv]
  • The Role of Discourse Units in Near-Extractive Summarization [arxiv]
  • Bag of Tricks for Efficient Text Classification [arxiv]
  • Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks [arxiv]
  • Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes [arxiv]
  • STransE: a novel embedding model of entities and relationships in knowledge bases [naacl16]
  • Layer Normalization [arxiv]
  • Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering [arxiv]
  • Imitation Learning with Recurrent Neural Networks [arxiv]
  • Neural Name Translation Improves Neural Machine Translation [arxiv]
  • Query-Regression Networks for Machine Comprehension [arxiv]
  • Sort Story: Sorting Jumbled Images and Captions into Stories [arxiv]
  • Separating Answers from Queries for Neural Reading Comprehension [arxiv]
  • Recurrent Highway Networks [arxiv]
  • Charagram: Embedding Words and Sentences via Character n-grams [arxiv]
  • Syntax-based Attention Model for Natural Language Inference [arxiv]
  • Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge [arxiv]
  • Neural Sentence Ordering [arxiv code]
  • Distilling Word Embeddings: An Encoding Approach [arxiv]
  • Target-Side Context for Discriminative Models in Statistical Machine Translation [arxiv]
  • Domain Adaptation for Neural Networks by Parameter Augmentation [arxiv]
  • Towards Abstraction from Extraction: Multiple Timescale Gated Recurrent Unit for Summarization [arxiv]
  • Attention-over-Attention Neural Networks for Reading Comprehension [arxiv]
  • Neural Tree Indexers for Text Understanding [arxiv]
  • Generating Images Part by Part with Composite Generative Adversarial Networks [arxiv]
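
The note on "Enriching Word Vectors with Subword Information" above says each word is represented as a bag of character n-grams. A minimal sketch of that idea follows; it is not the authors' implementation. The function names, the `ngram_vectors` dictionary, and the plain dict lookup (in place of any hashing scheme) are assumptions made for illustration.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word`, using < and > as boundary markers."""
    marked = f"<{word}>"
    grams = [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]
    # Also keep the whole marked word as its own unit (added once).
    if marked not in grams:
        grams.append(marked)
    return grams


def word_vector(word, ngram_vectors, dim):
    """Sum the vectors of all n-grams of `word`; unknown n-grams contribute zero.

    `ngram_vectors` is a hypothetical dict mapping n-gram -> list of floats.
    """
    vec = [0.0] * dim
    for gram in char_ngrams(word):
        for k, x in enumerate(ngram_vectors.get(gram, [0.0] * dim)):
            vec[k] += x
    return vec
```

Because the representation is built from n-grams, a word never seen in training still gets a vector from whatever n-grams it shares with known words.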

2016-06

  • Neural Summarization by Extracting Sentences and Words [arxiv]
  • Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks [arxiv]
  • Sequence-Level Knowledge Distillation [arxiv]
  • Text Understanding with the Attention Sum Reader Network [arxiv]
  • Query-Regression Networks for Machine Comprehension [arxiv]
  • A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation [arxiv]
  • Smart Reply: Automated Response Suggestion for Email [arxiv]
  • Minimum Risk Training for Neural Machine Translation [arxiv]
  • Compression of Neural Machine Translation Models via Pruning [arxiv]
  • Sort Story: Sorting Jumbled Images and Captions into Stories [arxiv]
  • Dialog state tracking, a machine reading approach using a memory-enhanced neural network [arxiv]
  • Predicting the Relative Difficulty of Single Sentences With and Without Surrounding Context [arxiv]
  • Learning Generative ConvNet with Continuous Latent Factors by Alternating Back-Propagation [arxiv]
  • Topic Augmented Neural Response Generation with a Joint Attention Mechanism [arxiv]
  • STransE: a novel embedding model of entities and relationships in knowledge bases [arxiv]
  • Functional Distributional Semantics [arxiv]
  • The LAMBADA dataset: Word prediction requiring a broad discourse context [arxiv]
  • DenseCap: Fully Convolutional Localization Networks for Dense Captioning [link]
  • Visualizing Dynamics: from t-SNE to SEMI-MDPs [arxiv]
  • Algorithmic Composition of Melodies with Deep Recurrent Neural Networks [arxiv]
  • InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [arxiv]
  • Deep Reinforcement Learning for Dialogue Generation [arxiv]
  • Key-Value Memory Networks for Directly Reading Documents [arxiv]
  • The Word Entropy of Natural Languages [arxiv]
  • Semantic Parsing to Probabilistic Programs for Situated Question Answering [arxiv]
  • Critical Behavior from Deep Dynamics: A Hidden Dimension in Natural Language [arxiv]
  • Inferring Logical Forms From Denotations [arxiv]
  • Some notes from the NAACL'16 Deep Learning panel discussion
    • Jacob Eisenstein made an observation, "In NLP, the things we do well on are things where context doesn't matter."
  • Rationalizing Neural Predictions [arxiv]
  • DeepMath - Deep Sequence Models for Premise Selection [arxiv]
  • A Fast Unified Model for Parsing and Sentence Understanding [arxiv]
  • A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues [arxiv]
  • Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation [arxiv]
  • Sequence-to-Sequence Learning as Beam-Search Optimization [arxiv]
  • Tables as Semi-structured Knowledge for Question Answering [arxiv]
  • Ask Me Anything: Dynamic Memory Networks for Natural Language Processing [arxiv]
  • Dynamic Memory Networks for Visual and Textual Question Answering [arxiv]
  • Generation from Abstract Meaning Representation using Tree Transducers [naacl16](http://homes.cs.washington.edu/~nasmith/papers/flanigan+dyer+smith+carbonell.naacl16.pdf)
  • Iterative Alternating Neural Attention for Machine Reading [arxiv]
  • Vector-based Models of Semantic Composition [arxiv]
  • Generating Natural Language Inference Chains [arxiv]
  • Learning to Compose Neural Networks for Question Answering [arxiv]
  • A Latent Variable Recurrent Neural Network for Discourse Relation Language Models [arxiv]
  • Data Recombination for Neural Semantic Parsing [arxiv]
  • Natural Language Generation in Dialogue using Lexicalized and Delexicalized Data [arxiv]
  • Deep Reinforcement Learning with a Combinatorial Action Space for Predicting and Tracking Popular Discussion Threads [arxiv]
  • A Diversity-Promoting Objective Function for Neural Conversation Models [arxiv]
  • Neural Associative Memory for Dual-Sequence Modeling [arxiv]
  • Simple Question Answering by Attentive Convolutional Neural Network [arxiv]
  • Neural Network-Based Abstract Generation for Opinions and Arguments [arxiv]
  • A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task [arxiv]
  • Generating Natural Questions About an Image [arxiv]
  • Continuously Learning Neural Dialogue Management [arxiv]
  • A Persona-Based Neural Conversation Model [arxiv]
  • A Decomposable Attention Model for Natural Language Inference [arxiv]
    • uses attention matrices to decompose the problem into subproblems that can be solved separately
  • Memory-enhanced Decoder for Neural Machine Translation [arxiv]
  • Incorporating Discrete Translation Lexicons into Neural Machine Translation [arxiv]
  • Can neural machine translation do simultaneous translation? [arxiv]
  • Language to Logical Form with Neural Attention [arxiv]
  • Generalizing and Hybridizing Count-based and Neural Language Models [arxiv]
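
The note on "A Decomposable Attention Model for Natural Language Inference" above mentions using attention to split the problem into subproblems that can be solved separately. A rough sketch of the soft-alignment step is below, assuming plain dot-product scores and no learned feed-forward layers, so it is a simplification rather than the paper's model. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def soft_align(a, b):
    """For each vector in `a`, return the attention-weighted average of `b`.

    `a` and `b` are lists of equal-length float vectors (one per token).
    """
    dim = len(b[0])
    aligned = []
    for a_i in a:
        weights = softmax([dot(a_i, b_j) for b_j in b])
        aligned.append(
            [sum(w * b_j[k] for w, b_j in zip(weights, b)) for k in range(dim)]
        )
    return aligned


def subproblems(a, b):
    """Pair each token of `a` with its aligned summary of `b`.

    Each pair can then be compared independently of the others.
    """
    return list(zip(a, soft_align(a, b)))
```

Each `(token, aligned summary)` pair returned by `subproblems` can be scored on its own, and the per-pair results aggregated afterwards, which is what makes the comparison step easy to parallelize.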

2016-05

  • Variational Neural Machine Translation [arxiv]
  • Deep Generative Models with Stick-Breaking Priors [arxiv]
  • A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues [arxiv]
  • One-shot Learning with Memory-Augmented Neural Networks [arxiv]
  • Residual Networks are Exponential Ensembles of Relatively Shallow Networks [arxiv code]
  • Modelling Interaction of Sentence Pair with coupled-LSTMs [arxiv]
  • Functional Hashing for Compressing Neural Networks [arxiv]
  • Combining Recurrent and Convolutional Neural Networks for Relation Classification [arxiv]
  • Learning End-to-End Goal-Oriented Dialog [arxiv]
  • BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings [arxiv]
  • Encode, Review, and Decode: Reviewer Module for Caption Generation [arxiv]
  • Automatic Extraction of Causal Relations from Natural Language Texts: A Comprehensive Survey [arxiv]
  • A Convolutional Attention Network for Extreme Summarization of Source Code [arxiv]
  • Data Recombination for Neural Semantic Parsing
  • Inferring Logical Forms From Denotations
  • How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions
  • Learning to Generate with Memory [arxiv]
  • Attention Correctness in Neural Image Captioning [arxiv]
  • Contextual LSTM (CLSTM) models for Large scale NLP tasks [arxiv]
  • Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations [arxiv]
  • Generative Adversarial Text to Image Synthesis [arxiv]
  • Query-Efficient Imitation Learning for End-to-End Autonomous Driving [arxiv]
  • Hierarchical Memory Networks [arxiv]
  • Recurrent Neural Network for Text Classification with Multi-Task Learning [arxiv]
  • Rationale-Augmented Convolutional Neural Networks for Text Classification [arxiv]
  • Joint Event Extraction via Recurrent Neural Networks [paper]
  • Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model [paper]
  • Natural Language Semantics and Computability [arxiv]
  • Natural Language Inference by Tree-Based Convolution and Heuristic Matching [arxiv]
  • Generating Sentences from a Continuous Space [arxiv]
  • Vocabulary Manipulation for Neural Machine Translation [arxiv]
  • Chained Predictions Using Convolutional Neural Networks [arxiv]
  • Modeling Rich Contexts for Sentiment Classification with LSTM [arxiv]
  • Incorporating Selectional Preferences in Multi-hop Relation Extraction [naacl16]
  • Word Ordering Without Syntax [arxiv]
  • Compositional Sentence Representation from Character within Large Context Text [arxiv]
  • Abstractive Sentence Summarization with Attentive Recurrent Neural Networks [arxiv]
  • Mixed Incremental Cross-Entropy REINFORCE ICLR 2016 [github]

2016-04

  • Towards Conceptual Compression [arxiv]
  • Teaching natural language to computers [arxiv]
  • Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
  • How NOT To Evaluate Your Dialogue System: An Empirical Study of
  • Revisiting Semi-Supervised Learning with Graph Embeddings
  • Neural Summarization by Extracting Sentences and Words
  • Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
  • LSTM-BASED DEEP LEARNING MODELS FOR NONFACTOID
  • Generating Visual Explanations
  • A Compositional Approach to Language Modeling [arxiv]
  • Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems [arxiv]
  • Building Machines That Learn and Think Like People [arxiv]
  • A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories [arxiv]
  • Revisiting Summarization Evaluation for Scientific Articles [arxiv]
  • Reasoning About Pragmatics with Neural Listeners and Speakers [arxiv]
  • Character-Level Question Answering with Attention [arxiv]
  • Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks [arxiv]
  • Recurrent Neural Network Grammars [arxiv]

2016-03

  • Neural Programmer: Inducing Latent Programs with Gradient Descent [arxiv]
  • Adversarial Autoencoders
  • Listen, Attend and Spell: A Neural Network for Large Vocabulary Conversational Speech Recognition
  • Net2Net: Accelerating Learning via Knowledge Transfer
  • A Neural Conversational Model
  • Neural Language Correction with Character-Based Attention [arxiv]
  • Modeling Relational Information in Question-Answer Pairs with Convolutional Neural Networks [arxiv]
  • Building Machines That Learn and Think Like People [arxiv]
  • LARGER-CONTEXT LANGUAGE MODELLING WITH RECURRENT NEURAL NETWORK [arxiv]
  • A Diversity-Promoting Objective Function for Neural Conversation Models [arxiv]
  • Hierarchical Attention Networks for Document Classification [arxiv]
  • Visual Storytelling [arxiv]
  • Using Sentence-Level LSTM Language Models for Script Inference [arxiv]
  • ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs [arxiv]
  • Character-Level Question Answering with Attention [arxiv]
  • Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond [arxiv]
  • Sentence Compression by Deletion with LSTMs [link]
  • A Simple Way to Initialize Recurrent Networks of Rectified Linear Units [arxiv]
  • DenseCap: Fully Convolutional Localization Networks for Dense Captioning [arxiv]
  • Nonextensive information theoretical machine [arxiv]
  • What we write about when we write about causality: Features of causal statements across large-scale social discourse [arxiv]
  • Question Answering via Integer Programming over Semi-Structured Knowledge [arxiv]
  • Dialog-based Language Learning [arxiv]
  • Bridging LSTM Architecture and the Neural Dynamics during Reading [arxiv]
  • Neural Generative Question Answering [arxiv]
  • Recurrent Memory Networks for Language Modeling [arxiv]
  • Colorful Image Colorization [paper] [code] [note]

TODO

  • votes for papers (e.g., 👍)
  • automatic crawler for citation and search counts (e.g., cite+51, tweets+42, search+523) like this