# Paper Seminar

## Dialogue Generation
- Internet-Augmented Dialogue Generation
- Retrieve and Refine: Improved Sequence Generation Models For Dialogue
- Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
- Learning to Copy Coherent Knowledge for Response Generation
- Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation
- Learning a Simple and Effective Model for Multi-turn Response Generation with Auxiliary Tasks
- Sequence to Backward and Forward Sequences: A Content-Introducing Approach to Generative Short-Text Conversation
## Sentence Embedding
- DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings
- Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models
- Dual-View Distilled BERT for Sentence Embedding
- SimCSE: Simple Contrastive Learning of Sentence Embeddings
- SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models
- Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
- Learning Effective and Interpretable Semantic Models using Non-Negative Sparse Embedding
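Several of the papers above (SimCSE, DiffCSE) train embeddings with an InfoNCE-style contrastive objective. Below is a minimal sketch of unsupervised SimCSE's loss in plain PyTorch; `emb_a` and `emb_b` stand in for two dropout-noised encodings of the same batch, and the helper name is illustrative:

```python
import torch
import torch.nn.functional as F

def simcse_unsup_loss(emb_a: torch.Tensor, emb_b: torch.Tensor, temp: float = 0.05):
    """InfoNCE loss as used by unsupervised SimCSE.

    emb_a and emb_b are two encodings of the SAME batch of sentences,
    obtained by running the encoder twice so dropout produces two
    slightly different views (the paper's only "augmentation").
    """
    emb_a = F.normalize(emb_a, dim=-1)
    emb_b = F.normalize(emb_b, dim=-1)
    # Cosine similarity between every pair in the batch, scaled by temperature.
    sim = emb_a @ emb_b.t() / temp                      # (batch, batch)
    # The matching view sits on the diagonal; other rows act as in-batch negatives.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)

# Toy usage with random tensors standing in for two dropout passes:
a, b = torch.randn(8, 768), torch.randn(8, 768)
print(simcse_unsup_loss(a, b))
```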
## Knowledge Distillation
- Distilling the Knowledge in a Neural Network
- Improved Knowledge Distillation via Teacher Assistant
- Knowledge Distillation Meets Self-Supervision
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
- TinyBERT: Distilling BERT for Natural Language Understanding
- MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
- ERNIE-Tiny: A Progressive Distillation Framework for Pretrained Transformer Compression
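All of the distillation papers above build on the soft-target loss from "Distilling the Knowledge in a Neural Network". A minimal sketch, assuming generic student/teacher logits; the `distillation_loss` helper and its default `T`/`alpha` values are illustrative, not taken from any one paper:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style knowledge distillation.

    Combines a soft loss (KL between temperature-softened teacher and
    student distributions, scaled by T^2 to keep gradient magnitudes
    comparable) with the usual hard-label cross-entropy.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage:
s, t = torch.randn(4, 10), torch.randn(4, 10)
y = torch.randint(0, 10, (4,))
print(distillation_loss(s, t, y))
```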
## Video Question Answering
- DramaQA: Character-Centered Video Story Understanding with Hierarchical QA
- Modality Shifting Attention Network for Multi-modal Video Question Answering
- MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Video Question Answering
## Passage Retrieval
- Dense Passage Retrieval for Open-Domain Question Answering
- ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
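DPR trains its dual encoders with in-batch negatives over dot-product scores. A minimal PyTorch sketch, assuming the question and passage embeddings have already been computed by separate encoders:

```python
import torch
import torch.nn.functional as F

def dpr_in_batch_loss(q_emb: torch.Tensor, p_emb: torch.Tensor):
    """DPR-style training step with in-batch negatives.

    q_emb[i] and p_emb[i] are a question and its gold passage; every
    other passage in the batch serves as a negative. Relevance is a
    plain dot product, so the same encoders can later feed an ANN index.
    """
    scores = q_emb @ p_emb.t()                       # (batch, batch)
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

# Toy usage:
q, p = torch.randn(8, 768), torch.randn(8, 768)
print(dpr_in_batch_loss(q, p))
```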
## Pre-trained Language Models
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Unifying Vision-and-Language Tasks via Text Generation
- Understanding the Difficulty of Training Transformers
- Do NLP Models Know Numbers? Probing Numeracy in Embeddings
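As a reference point for the pre-training papers above, BERT's masked-language-model corruption selects roughly 15% of tokens and replaces 80% of those with [MASK], 10% with a random token, and leaves 10% unchanged. A minimal sketch over integer token ids (real implementations also skip special tokens; the `mlm_mask` helper is illustrative):

```python
import torch

def mlm_mask(input_ids, mask_id, vocab_size, mlm_prob=0.15):
    """BERT-style MLM corruption.

    Of the selected positions: 80% become [MASK], 10% a random token,
    10% stay unchanged. Returns corrupted ids and labels, where -100
    marks positions excluded from the loss.
    """
    labels = input_ids.clone()
    selected = torch.rand(input_ids.shape) < mlm_prob
    labels[~selected] = -100                         # ignored by cross-entropy

    input_ids = input_ids.clone()
    replaced = selected & (torch.rand(input_ids.shape) < 0.8)
    input_ids[replaced] = mask_id

    # Half of the remaining 20% get a random token; the rest stay as-is.
    randomized = selected & ~replaced & (torch.rand(input_ids.shape) < 0.5)
    input_ids[randomized] = torch.randint(vocab_size, input_ids.shape)[randomized]
    return input_ids, labels

# Toy usage:
ids = torch.randint(5, 30000, (2, 16))
corrupted, labels = mlm_mask(ids, mask_id=103, vocab_size=30000)
```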
## Continual Learning
- Overcoming catastrophic forgetting in neural networks
- Progressive Neural Networks
- Continual Learning with Deep Generative Replay
- Continual Learning for Natural Language Generation in Task-oriented Dialog Systems
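EWC ("Overcoming catastrophic forgetting in neural networks") anchors parameters that were important for earlier tasks with a quadratic penalty weighted by a diagonal Fisher estimate. A minimal sketch of the regularizer; `fisher` and `old_params` are assumed to be precomputed dicts keyed by parameter name:

```python
import torch

def ewc_penalty(model, fisher, old_params, lam=1000.0):
    """Elastic Weight Consolidation regularizer.

    fisher[name] holds a diagonal Fisher-information estimate for each
    parameter from the previous task; old_params[name] the values learned
    there. The penalty pulls important weights back toward those values.
    """
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return (lam / 2.0) * loss

# Toy usage on a tiny linear model (uniform Fisher for illustration):
model = torch.nn.Linear(4, 2)
old = {n: p.detach().clone() for n, p in model.named_parameters()}
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}
print(ewc_penalty(model, fisher, old))
```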
## Summarization
- BRIO: Bringing Order to Abstractive Summarization
## Meta-Learning
- BOIL: Towards Representation Change for Few-shot Learning
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
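MAML's meta-update adapts a shared initialization with an inner gradient step per task, then backpropagates the post-adaptation query loss through that step. A toy second-order sketch on scalar linear regression; the `maml_step` helper and learning rates are illustrative:

```python
import torch

# Shared initialization being meta-learned: y = w * x + b.
w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

def forward(w, b, x):
    return w * x + b

def maml_step(tasks, inner_lr=0.1, outer_lr=0.01):
    """One MAML meta-update.

    Each task is (x_support, y_support, x_query, y_query). Inner loop:
    one gradient step per task from the shared init (create_graph=True
    keeps the step differentiable). Outer loop: update the init from
    the summed post-adaptation query losses.
    """
    meta_loss = 0.0
    for xs, ys, xq, yq in tasks:
        inner = ((forward(w, b, xs) - ys) ** 2).mean()
        gw, gb = torch.autograd.grad(inner, (w, b), create_graph=True)
        w_adapt, b_adapt = w - inner_lr * gw, b - inner_lr * gb   # adapted params
        meta_loss = meta_loss + ((forward(w_adapt, b_adapt, xq) - yq) ** 2).mean()
    meta_loss.backward()        # second-order grads flow through the inner step
    with torch.no_grad():
        w -= outer_lr * w.grad
        b -= outer_lr * b.grad
        w.grad, b.grad = None, None

# Tasks: fit lines with different slopes.
def make_task(slope):
    xs, xq = torch.randn(10), torch.randn(10)
    return xs, slope * xs, xq, slope * xq

maml_step([make_task(s) for s in (1.0, -2.0, 3.0)])
print(w.item(), b.item())
```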