Paper reading list in natural language processing.

Deep Learning in NLP

  • CNM: "CNM: An Interpretable Complex-valued Network for Matching". NAACL(2019) [PDF] [code]
  • word2vec: "word2vec Parameter Learning Explained". arXiv(2016) [PDF]
  • GloVe: "GloVe: Global Vectors for Word Representation". EMNLP(2014) [PDF] [code]
  • ELMo: "Deep contextualized word representations". NAACL(2018) [PDF] [code]
  • VAE: "An Introduction to Variational Autoencoders". arXiv(2019) [PDF]
  • Transformer: "Attention Is All You Need". NeurIPS(2017) [PDF] [code-official] [code-tf] [code-py]
  • Transformer-XL: "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context". ACL(2019) [PDF] [code]
  • ConvS2S: "Convolutional Sequence to Sequence Learning". ICML(2017) [PDF]
  • Survey on Attention: "An Introductory Survey on Attention Mechanisms in NLP Problems". arXiv(2018) [PDF]
  • Additive Attention: "Neural Machine Translation by Jointly Learning to Align and Translate". ICLR(2015) [PDF]
  • Multiplicative Attention: "Effective Approaches to Attention-based Neural Machine Translation". EMNLP(2015) [PDF]
  • Memory Net: "End-To-End Memory Networks". NeurIPS(2015) [PDF]
  • Pointer Net: "Pointer Networks". NeurIPS(2015) [PDF]
  • Copying Mechanism: "Incorporating Copying Mechanism in Sequence-to-Sequence Learning". ACL(2016) [PDF]
  • Coverage Mechanism: "Modeling Coverage for Neural Machine Translation". ACL(2016) [PDF]
  • GAN: "Generative Adversarial Nets". NeurIPS(2014) [PDF]
  • SeqGAN: "SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient". AAAI(2017) [PDF] [code]
  • MacNet: "MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models". NeurIPS(2018) [PDF]
  • Graph2Seq: "Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks". arXiv(2018) [PDF]
  • Pretrained Seq2Seq: "Unsupervised Pretraining for Sequence to Sequence Learning". EMNLP(2017) [PDF]
  • Multi-task Learning: "An Overview of Multi-Task Learning in Deep Neural Networks". arXiv(2017) [PDF]
  • Gradient Descent: "An Overview of Gradient Descent Optimization Algorithms". arXiv(2016) [PDF]

Pre-trained Language Models

  • PTMs: "Pre-trained Models for Natural Language Processing: A Survey". arXiv(2020) [PDF]
  • Optimus: "OPTIMUS: Organizing Sentences via Pre-trained Modeling of a Latent Space". arXiv(2020) [PDF] [code]
  • ERNIE-GEN: "ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation". IJCAI(2020) [PDF] [code]
  • UniLM: "Unified Language Model Pre-training for Natural Language Understanding and Generation". NeurIPS(2019) [PDF] [code]
  • Poly-encoder: "Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring". ICLR(2020) [PDF]
  • ALBERT: "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations". ICLR(2020) [PDF]
  • TinyBERT: "TinyBERT: Distilling BERT for Natural Language Understanding". arXiv(2019) [PDF] [code]
  • Chinese BERT: "Pre-Training with Whole Word Masking for Chinese BERT". arXiv(2019) [PDF] [code]
  • SpanBERT: "SpanBERT: Improving Pre-training by Representing and Predicting Spans". TACL(2020) [PDF] [code]
  • RoBERTa: "RoBERTa: A Robustly Optimized BERT Pretraining Approach". arXiv(2019) [PDF] [code]
  • ERNIE(Tsinghua): "ERNIE: Enhanced Language Representation with Informative Entities". ACL(2019) [PDF] [code]
  • ERNIE(Baidu): "ERNIE: Enhanced Representation through Knowledge Integration". arXiv(2019) [PDF] [code]
  • XLNet: "XLNet: Generalized Autoregressive Pretraining for Language Understanding". NeurIPS(2019) [PDF] [code]
  • XLM: "Cross-lingual Language Model Pretraining". NeurIPS(2019) [PDF] [code]
  • BERT: "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". NAACL(2019) [PDF] [code]

Dialogue System

PTMs for Dialogue

  • ToD-BERT: "ToD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogues". arXiv(2020) [PDF] [code]
  • DialoGPT: "DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation". ACL(2020) [PDF] [code]
  • PLATO: "PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable". ACL(2020) [PDF] [code]
  • Guyu: "An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation". arXiv(2020) [PDF] [code]

Knowledge-driven Conversation

  • DuRecDial: "Towards Conversational Recommendation over Multi-Type Dialogs". ACL(2020) [PDF] [code]
  • KdConv: "KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation". ACL(2020) [PDF] [data]
  • DuConv: "Proactive Human-Machine Conversation with Explicit Conversation Goals". ACL(2019) [PDF] [code]
  • KBRD: "Towards Knowledge-Based Recommender Dialog System". EMNLP(2019) [PDF] [code]
  • ReDial: "Towards Deep Conversational Recommendations". NeurIPS(2018) [PDF] [data]
  • Dual Fusion: "Smarter Response with Proactive Suggestion: A New Generative Neural Conversation Paradigm". IJCAI(2018) [PDF]

Task-oriented Dialogue

  • DF-Net: "Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog". ACL(2020) [PDF] [code]
  • MALA: "MALA: Cross-Domain Dialogue Generation with Action Learning". AAAI(2020) [PDF]
  • Task-Oriented Dialogue Systems: "Learning to Memorize in Neural Task-Oriented Dialogue Systems". HKUST MPhil Thesis(2019) [PDF]
  • GLMP: "Global-to-local Memory Pointer Networks for Task-Oriented Dialogue". ICLR(2019) [PDF] [code]
  • KB Retriever: "Entity-Consistent End-to-end Task-Oriented Dialogue System with KB Retriever". EMNLP(2019) [PDF] [data]
  • TRADE: "Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems". ACL(2019) [PDF] [code]
  • WMM2Seq: "A Working Memory Model for Task-oriented Dialog Response Generation". ACL(2019) [PDF]
  • Pretrain-Fine-tune: "Training Neural Response Selection for Task-Oriented Dialogue Systems". ACL(2019) [PDF] [data]
  • Multi-level Mem: "Multi-Level Memory for Task Oriented Dialogs". NAACL(2019) [PDF] [code]
  • BossNet: "Disentangling Language and Knowledge in Task-Oriented Dialogs". NAACL(2019) [PDF] [code]
  • SL+RL: "Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems". NAACL(2018) [PDF]
  • MAD: "Memory-augmented Dialogue Management for Task-oriented Dialogue Systems". TOIS(2018) [PDF]
  • TSCP: "Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures". ACL(2018) [PDF] [code]
  • Mem2Seq: "Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems". ACL(2018) [PDF] [code]
  • DSR: "Sequence-to-Sequence Learning for Task-oriented Dialogue with Dialogue State Representation". COLING(2018) [PDF]
  • StateNet: "Towards Universal Dialogue State Tracking". EMNLP(2018) [PDF]
  • Topic-Seg-Label: "A Weakly Supervised Method for Topic Segmentation and Labeling in Goal-oriented Dialogues via Reinforcement Learning". IJCAI(2018) [PDF] [code]
  • AliMe: "AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine". ACL(2017) [PDF]
  • KVR Net: "Key-Value Retrieval Networks for Task-Oriented Dialogue". SIGDIAL(2017) [PDF] [data]

Open-domain Dialogue

  • RefNet: "RefNet: A Reference-aware Network for Background Based Conversation". AAAI(2020) [PDF] [code]
  • GLKS: "Thinking Globally, Acting Locally: Distantly Supervised Global-to-Local Knowledge Selection for Background Based Conversation". AAAI(2020) [PDF] [code]
  • HDSA: "Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention". ACL(2019) [PDF] [code]
  • PostKS: "Learning to Select Knowledge for Response Generation in Dialog Systems". IJCAI(2019) [PDF]
  • Two-Stage-Transformer: "Wizard of Wikipedia: Knowledge-Powered Conversational agents". ICLR(2019) [PDF]
  • CAS: "Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory". NAACL(2019) [PDF] [code]
  • Edit-N-Rerank: "Response Generation by Context-aware Prototype Editing". AAAI(2019) [PDF] [code]
  • HVMN: "Hierarchical Variational Memory Network for Dialogue Generation". WWW(2018) [PDF] [code]
  • XiaoIce: "The Design and Implementation of XiaoIce, an Empathetic Social Chatbot". arXiv(2018) [PDF]
  • D2A: "Dialog-to-Action: Conversational Question Answering Over a Large-Scale Knowledge Base". NeurIPS(2018) [PDF] [code]
  • DAIM: "Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization". NeurIPS(2018) [PDF]
  • MTask: "A Knowledge-Grounded Neural Conversation Model". AAAI(2018) [PDF]
  • GenDS: "Flexible End-to-End Dialogue System for Knowledge Grounded Conversation". arXiv(2017) [PDF]
  • Time-Decay-SLU: "How Time Matters: Learning Time-Decay Attention for Contextual Spoken Language Understanding in Dialogues". NAACL(2018) [PDF] [code]
  • REASON: "Dialog Generation Using Multi-turn Reasoning Neural Networks". NAACL(2018) [PDF]
  • STD/HTD: "Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders". ACL(2018) [PDF] [code]
  • CSF: "Generating Informative Responses with Controlled Sentence Function". ACL(2018) [PDF] [code]
  • NKD: "Knowledge Diffusion for Neural Dialogue Generation". ACL(2018) [PDF] [data]
  • DAWnet: "Chat More: Deepening and Widening the Chatting Topic via A Deep Model". SIGIR(2018) [PDF] [code]
  • ZSDG: "Zero-Shot Dialog Generation with Cross-Domain Latent Actions". SIGDIAL(2018) [PDF] [code]
  • DUA: "Modeling Multi-turn Conversation with Deep Utterance Aggregation". COLING(2018) [PDF] [code]
  • Data-Aug: "Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding". COLING(2018) [PDF] [code]
  • DC-MMI: "Generating More Interesting Responses in Neural Conversation Models with Distributional Constraints". EMNLP(2018) [PDF] [code]
  • cVAE-XGate/CGate: "Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity". EMNLP(2018) [PDF] [code]
  • DAM: "Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network". ACL(2018) [PDF] [code]
  • SMN: "Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots". ACL(2017) [PDF] [code]
  • MMI: "A Diversity-Promoting Objective Function for Neural Conversation Models". NAACL-HLT(2016) [PDF] [code]
  • RL-Dialogue: "Deep Reinforcement Learning for Dialogue Generation". EMNLP(2016) [PDF]
  • TA-Seq2Seq: "Topic Aware Neural Response Generation". AAAI(2017) [PDF] [code]
  • MA: "Mechanism-Aware Neural Machine for Dialogue Response Generation". AAAI(2017) [PDF]
  • HRED: "Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models". AAAI(2016) [PDF] [code]
  • VHRED: "A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues". AAAI(2017) [PDF] [code]
  • CVAE/KgCVAE: "Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders". ACL(2017) [PDF] [code]
  • ERM: "Elastic Responding Machine for Dialog Generation with Dynamically Mechanism Selecting". AAAI(2018) [PDF]
  • Tri-LSTM: "Augmenting End-to-End Dialogue Systems With Commonsense Knowledge". AAAI(2018) [PDF]
  • CCM: "Commonsense Knowledge Aware Conversation Generation with Graph Attention". IJCAI(2018) [PDF] [code]
  • Retrieval+multi-seq2seq: "An Ensemble of Retrieval-Based and Generation-Based Human-Computer Conversation Systems". IJCAI(2018) [PDF]

Personalized Dialogue

  • PAML: "Personalizing Dialogue Agents via Meta-Learning". ACL(2019) [PDF] [code]
  • PCCM: "Assigning Personality/Profile to a Chatting Machine for Coherent Conversation Generation". IJCAI(2018) [PDF] [code]
  • ECM: "Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory". AAAI(2018) [PDF] [code]


  • CrossWOZ: "CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset". TACL(2020) [PDF] [code]
  • MultiWOZ: "MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling". EMNLP(2018) [PDF] [code]
  • Survey of Dialogue: "A Survey on Dialogue Systems: Recent Advances and New Frontiers". SIGKDD Explorations(2017) [PDF]
  • Survey of Dialogue Corpora: "A Survey of Available Corpora For Building Data-Driven Dialogue Systems: The Journal Version". Dialogue & Discourse(2018) [PDF]
  • Table-to-Text Generation (R,C,T): "Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time)". EMNLP(2019) [PDF] [code]
  • LU-DST: "Multi-task Learning for Joint Language Understanding and Dialogue State Tracking". SIGDIAL(2018) [PDF]
  • MTask-M: "Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models". IJCNLP(2018) [PDF]
  • ADVMT: "One “Ruler” for All Languages: Multi-Lingual Dialogue Evaluation with Adversarial Multi-Task Learning". IJCAI(2018) [PDF]

Text Generation

  • Cascaded Generation: "Cascaded Text Generation with Markov Transformers". arXiv(2020) [PDF] [code]
  • Sequence Generation: "A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models". arXiv(2019) [PDF] [code]
  • Sparse-Seq2Seq: "Sparse Sequence-to-Sequence Models". ACL(2019) [PDF] [code]

Knowledge Representation and Reasoning

  • GNTP: "Differentiable Reasoning on Large Knowledge Bases and Natural Language". AAAI(2020) [PDF] [code]
  • NTP: "End-to-End Differentiable Proving". NeurIPS(2017) [PDF] [code]

Text Summarization

  • BERTSum: "Fine-tune BERT for Extractive Summarization". arXiv(2019) [PDF] [code]
  • BERT-Two-Stage: "Pretraining-Based Natural Language Generation for Text Summarization". arXiv(2019) [PDF]
  • QASumm: "Guiding Extractive Summarization with Question-Answering Rewards". NAACL(2019) [PDF] [code]
  • Re^3Sum: "Retrieve, Rerank and Rewrite: Soft Template Based Neural Summarization". ACL(2018) [PDF] [code]
  • NeuSum: "Neural Document Summarization by Jointly Learning to Score and Select Sentences". ACL(2018) [PDF]
  • rnn-ext+abs+RL+rerank: "Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting". ACL(2018) [PDF] [Notes] [code]
  • Seq2Seq+CGU: "Global Encoding for Abstractive Summarization". ACL(2018) [PDF] [code]
  • ML+RL: "A Deep Reinforced Model for Abstractive Summarization". ICLR(2018) [PDF]
  • T-ConvS2S: "Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization". EMNLP(2018) [PDF] [code]
  • RL-Topic-ConvS2S: "A Reinforced Topic-Aware Convolutional Sequence-to-Sequence Model for Abstractive Text Summarization". IJCAI(2018) [PDF]
  • GANsum: "Generative Adversarial Network for Abstractive Text Summarization". AAAI(2018) [PDF]
  • FTSum: "Faithful to the Original: Fact Aware Neural Abstractive Summarization". AAAI(2018) [PDF]
  • PGN: "Get To The Point: Summarization with Pointer-Generator Networks". ACL(2017) [PDF] [code]
  • ABS/ABS+: "A Neural Attention Model for Abstractive Sentence Summarization". EMNLP(2015) [PDF]
  • RAS-Elman/RAS-LSTM: "Abstractive Sentence Summarization with Attentive Recurrent Neural Networks". NAACL(2016) [PDF] [code]
  • words-lvt2k-1sent: "Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond". CoNLL(2016) [PDF]

Topic Modeling

  • LDA: "Latent Dirichlet Allocation". JMLR(2003) [PDF] [code]
  • Parameter Estimation: "Parameter Estimation for Text Analysis". Technical report(2005) [PDF]
  • DTM: "Dynamic Topic Models". ICML(2006) [PDF] [code]
  • cDTM: "Continuous Time Dynamic Topic Models". UAI(2008) [PDF]
  • iDocNADE: "Document Informed Neural Autoregressive Topic Models with Distributional Prior". AAAI(2019) [PDF] [code]
  • NTM: "A Novel Neural Topic Model and Its Supervised Extension". AAAI(2015) [PDF]
  • TWE: "Topical Word Embeddings". AAAI(2015) [PDF]
  • RATM-D: "Recurrent Attentional Topic Model". AAAI(2017) [PDF]
  • RIBS-TM: "Don't Forget the Quantifiable Relationship between Words: Using Recurrent Neural Network for Short Text Topic Discovery". AAAI(2017) [PDF]
  • Topic coherence: "Optimizing Semantic Coherence in Topic Models". EMNLP(2011) [PDF]
  • Topic coherence: "Automatic Evaluation of Topic Coherence". NAACL(2010) [PDF]
  • DADT: "Authorship Attribution with Author-aware Topic Models". ACL(2012) [PDF]
  • Gaussian-LDA: "Gaussian LDA for Topic Models with Word Embeddings". ACL(2015) [PDF] [code]
  • LFTM: "Improving Topic Models with Latent Feature Word Representations". TACL(2015) [PDF] [code]
  • TopicVec: "Generative Topic Embedding: a Continuous Representation of Documents". ACL(2016) [PDF] [code]
  • SLRTM: "Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves". arXiv(2016) [PDF]
  • TopicRNN: "TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency". ICLR(2017) [PDF] [code]
  • NMF boosted: "Stability of topic modeling via matrix factorization". Expert Syst. Appl. (2018) [PDF]
  • Evaluation of Topic Models: "External Evaluation of Topic Models". Australasian Doc. Comp. Symp. (2009) [PDF]
  • Topic2Vec: "Topic2Vec: Learning distributed representations of topics". IALP(2015) [PDF]
  • L-EnsNMF: "L-EnsNMF: Boosted Local Topic Discovery via Ensemble of Nonnegative Matrix Factorization". ICDM(2016) [PDF] [code]
  • DC-NMF: "DC-NMF: nonnegative matrix factorization based on divide-and-conquer for fast clustering and topic modeling". J. Global Optimization (2017) [PDF]
  • cFTM: "The contextual focused topic model". KDD(2012) [PDF]
  • CLM: "Collaboratively Improving Topic Discovery and Word Embeddings by Coordinating Global and Local Contexts". KDD(2017) [PDF] [code]
  • GMTM: "Unsupervised Topic Modeling for Short Texts Using Distributed Representations of Words". NAACL(2015) [PDF]
  • GPU-PDMM: "Enhancing Topic Modeling for Short Texts with Auxiliary Word Embeddings". TOIS(2017) [PDF]
  • BPT: "A Two-Level Topic Model Towards Knowledge Discovery from Citation Networks". TKDE(2014) [PDF]
  • BTM: "A Biterm Topic Model for Short Texts". WWW(2013) [PDF] [code]
  • HGTM: "Using Hashtag Graph-Based Topic Model to Connect Semantically-Related Words Without Co-Occurrence in Microblogs". TKDE(2016) [PDF]
  • COTM: "A Topic Model for Co-occurring Normal Documents and Short Texts". WWW(2018) [PDF]

Machine Translation

  • Multi-pass decoder: "Adaptive Multi-pass Decoder for Neural Machine Translation". EMNLP(2018) [PDF]
  • Deliberation Networks: "Deliberation Networks: Sequence Generation Beyond One-Pass Decoding". NeurIPS(2017) [PDF]
  • KVMem-Attention: "Neural Machine Translation with Key-Value Memory-Augmented Attention". IJCAI(2018) [PDF]
  • Interactive-Attention: "Interactive Attention for Neural Machine Translation". COLING(2016) [PDF]

Question Answering

  • CFC: "Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering". ICLR(2019) [PDF]
  • MTQA: "Multi-Task Learning with Multi-View Attention for Answer Selection and Knowledge Base Question Answering". AAAI(2019) [PDF] [code]
  • CQG-KBQA: "Knowledge Base Question Answering via Encoding of Complex Query Graphs". EMNLP(2018) [PDF] [code]
  • HR-BiLSTM: "Improved Neural Relation Detection for Knowledge Base Question Answering". ACL(2017) [PDF]
  • KBQA-CGK: "An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge". ACL(2017) [PDF]
  • KVMem: "Key-Value Memory Networks for Directly Reading Documents". EMNLP(2016) [PDF]

Reading Comprehension

  • DecompRC: "Multi-hop Reading Comprehension through Question Decomposition and Rescoring". ACL(2019) [PDF] [code]
  • FlowQA: "FlowQA: Grasping Flow in History for Conversational Machine Comprehension". ICLR(2019) [PDF] [code]
  • SDNet: "SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering". arXiv(2018) [PDF] [code]

Image Captioning

  • MLAIC: "A Multi-task Learning Approach for Image Captioning". IJCAI(2018) [PDF] [code]
  • Up-Down Attention: "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering". CVPR(2018) [PDF]
  • SCST: "Self-critical Sequence Training for Image Captioning". CVPR(2017) [PDF]
  • Recurrent-RSA: "Pragmatically Informative Image Captioning with Character-Level Inference". NAACL(2018) [PDF] [code]

