This is a curated list of resources on Knowledge Distillation, Recommendation Systems, and especially Natural Language Processing (NLP).
The goal of this repository is not only to keep these references for myself but also to share them with others.
- Introducing MASS – A pre-training method that outperforms BERT and GPT in sequence to sequence language generation tasks
- A new model and dataset for long-range memory
- Visual Paper Summary: ALBERT (A Lite BERT)
- reformer-pytorch (an implementation of the Reformer in PyTorch)
- Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
- A Primer in BERTology: What we know about how BERT works
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
- Universal Transformer
- Contextualized Non-local Neural Networks for Sequence Learning
- An Efficient Framework for Learning Sentence Representations
- DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
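
DeeBERT (above) speeds up inference by attaching an internal classifier ("off-ramp") to each transformer layer and exiting as soon as a prediction is confident enough. Below is a minimal, illustrative sketch of that entropy-thresholded early-exit idea; the layer sizes, the `threshold` value, and the batch-mean entropy are assumptions for demonstration, not the paper's exact setup.

```python
import torch
import torch.nn as nn

# Toy setup (hypothetical sizes): 4 encoder layers, each with its own
# internal classifier, in the spirit of DeeBERT's off-ramps.
layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True) for _ in range(4)]
)
classifiers = nn.ModuleList([nn.Linear(64, 2) for _ in range(4)])

@torch.no_grad()
def early_exit_predict(x, threshold=0.3):
    # After each layer, classify from the first token's hidden state; if the
    # softmax entropy is low (the model is confident), stop and return early.
    h = x
    for layer, clf in zip(layers, classifiers):
        h = layer(h)
        probs = clf(h[:, 0]).softmax(dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
        if entropy.mean() < threshold:  # batch-mean is a simplification
            return probs
    return probs  # fall through: use the final layer's prediction

probs = early_exit_predict(torch.randn(1, 16, 64))  # (batch=1, seq=16, dim=64)
```
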
- A Generative Model for Joint Natural Language Understanding and Generation
- Towards a Human-like Open-Domain Chatbot
- Self-Supervised Dialogue Learning
- Sequential Attention-based Network for Noetic End-to-End Response Selection
- Neural Text Generation from Rich Semantic Representations
- Pretraining Methods for Dialog Context Representation Learning
- Deep Generative Models with Learnable Knowledge Constraints
- Graph Neural Networks: Models and Applications
- Generating Logical Forms from Graph Representations of Text and Entities
- K-BERT: Enabling Language Representation With Knowledge Graph
- Learning and Reasoning on Graph for Recommendation
- Natural Language Recommendations: A novel research paper search engine developed entirely with embedding and transformer models
- Distilling Transformers into Simple Neural Networks with Unlabeled Transfer Data
- Attentive Student Meets Multi-Task Teacher: Improved Knowledge Distillation for Pretrained Models
- Robust Language Representation Learning via Multi-task Knowledge Distillation
- Understanding Knowledge Distillation in Neural Sequence Generation
- Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
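
A recurring recipe in the distillation papers above is training a small student to match a large teacher's temperature-softened output distribution (Hinton et al., 2015) alongside the gold labels; some of these papers swap in other objectives, such as an MSE on logits. A minimal sketch of the classic soft-target loss, where the temperature `T` and mixing weight `alpha` are illustrative hyperparameters:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions.
    # Scaling by T*T keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

loss = distillation_loss(torch.randn(8, 3), torch.randn(8, 3), torch.randint(0, 3, (8,)))
```
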
- Metric Learning for Dynamic Text Classification
- RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems
- Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings
- Matching Natural Language Sentences with Hierarchical Sentence Factorization
- Instance Cross Entropy for Deep Metric Learning
- Matching Embeddings for Domain Adaptation
- Deep Metric Learning using Similarities from Nonlinear Rank Approximations
- Keyword-Attentive Deep Semantic Matching
- BERTScore: Evaluating Text Generation with BERT
- Sampling Matters! An Empirical Study of Negative Sampling Strategies for Learning of Matching Models in Retrieval-based Dialogue Systems
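
The negative-sampling study above compares strategies for training retrieval-based matching models; one common, simple baseline treats the other responses in a mini-batch as random negatives. A minimal sketch of that in-batch-negatives loss, assuming query/response encoders that output fixed-size embeddings for matched pairs:

```python
import torch
import torch.nn.functional as F

def in_batch_negatives_loss(query_emb, response_emb, temperature=0.05):
    # query_emb, response_emb: (batch, dim) embeddings of matched
    # (query, response) pairs. For each query, every other response in the
    # batch acts as a negative, so the positives lie on the diagonal of
    # the batch-by-batch similarity matrix.
    q = F.normalize(query_emb, dim=-1)
    r = F.normalize(response_emb, dim=-1)
    scores = q @ r.t() / temperature                     # (batch, batch)
    targets = torch.arange(q.size(0), device=q.device)   # diagonal = positives
    return F.cross_entropy(scores, targets)

loss = in_batch_negatives_loss(torch.randn(8, 128), torch.randn(8, 128))
```
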
- pytorch-seq2seq tutorial
- Learn NLP With Me – Information Extraction – Relations – Introduction
- Stanford CS224N
- fast.ai course-nlp
- Daniel Jurafsky and James H. Martin. Speech and Language Processing (3rd Edition). Draft, 2019.
- Hands-On Machine Learning
Contributions are welcome! If you know of good papers, tutorials, or any other resources, please open a pull request. :)