This is a list of BERT-related papers. Any feedback is welcome.
- A BERT Baseline for the Natural Questions
- End-to-End Open-Domain Question Answering with BERTserini (NAACL2019)
- Latent Retrieval for Weakly Supervised Open Domain Question Answering (ACL2019)
- Learning to Ask Unanswerable Questions for Machine Reading Comprehension (ACL2019)
- Enhancing Pre-Trained Language Representations with Rich Knowledge for Machine Reading Comprehension (ACL2019)
- A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension (ACL2019 WS)
- BERT with History Answer Embedding for Conversational Question Answering (SIGIR2019)
- Beyond English-only Reading Comprehension: Experiments in Zero-Shot Multilingual Transfer for Bulgarian (RANLP2019)
- Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence (NAACL2019)
- BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis (NAACL2019)
- An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese (ACL2019)
- Neural Aspect and Opinion Term Extraction with Mined Rules as Weak Supervision (ACL2019)
- BERT for Joint Intent Classification and Slot Filling
- Multi-lingual Intent Detection and Slot Filling in a Joint BERT-based Model
- BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer (Interspeech2019)
- Assessing BERT’s Syntactic Abilities
- Simple BERT Models for Relation Extraction and Semantic Role Labeling
- Matching the Blanks: Distributional Similarity for Relation Learning (ACL2019)
- A Simple BERT-Based Approach for Lexical Simplification
- Resolving Gendered Ambiguous Pronouns with BERT (ACL2019 WS)
- Anonymized BERT: An Augmentation Approach to the Gendered Pronoun Resolution Challenge (ACL2019 WS)
- Gendered Pronoun Resolution using BERT and an extractive question answering formulation (ACL2019 WS)
- MSnet: A BERT-based Network for Gendered Pronoun Resolution (ACL2019 WS)
- Fill the GAP: Exploiting BERT for Pronoun Resolution (ACL2019 WS)
- BERT Masked Language Modeling for Co-reference Resolution (ACL2019 WS)
- How to Fine-Tune BERT for Text Classification?
- X-BERT: eXtreme Multi-label Text Classification with BERT
- Exploring Unsupervised Pretraining and Sentence Structure Modelling for Winograd Schema Challenge
- A Surprisingly Robust Trick for the Winograd Schema Challenge
- Passage Re-ranking with BERT
- Investigating the Successes and Failures of BERT for Passage Re-Ranking
- Document Expansion by Query Prediction
- CEDR: Contextualized Embeddings for Document Ranking (SIGIR2019)
- Deeper Text Understanding for IR with Contextual Neural Language Modeling (SIGIR2019)
- FAQ Retrieval using Query-Question Similarity and BERT-Based Query-Answer Relevance (SIGIR2019)
- BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model (NAACL2019 WS)
- Pretraining-Based Natural Language Generation for Text Summarization
- MASS: Masked Sequence to Sequence Pre-training for Language Generation (ICML2019)
- Unified Language Model Pre-training for Natural Language Understanding and Generation
- Multi-Task Deep Neural Networks for Natural Language Understanding (ACL2019)
- BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning (ICML2019)
- Unifying Question Answering and Text Classification via Span Extraction
- ERNIE: Enhanced Language Representation with Informative Entities (ACL2019)
- ERNIE: Enhanced Representation through Knowledge Integration
- ERNIE 2.0: A Continual Pre-training Framework for Language Understanding
- SpanBERT: Improving Pre-training by Representing and Predicting Spans
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
- KERMIT: Generative Insertion-Based Modeling for Sequences
- DisSent: Sentence Representation Learning from Explicit Discourse Relations (ACL2019)
- A Structural Probe for Finding Syntax in Word Representations (NAACL2019)
- Linguistic Knowledge and Transferability of Contextual Representations (NAACL2019)
- Probing What Different NLP Tasks Teach Machines about Function Word Comprehension (*SEM2019)
- BERT Rediscovers the Classical NLP Pipeline (ACL2019)
- Probing Neural Network Comprehension of Natural Language Arguments (ACL2019)
- What does BERT learn about the structure of language? (ACL2019)
- Open Sesame: Getting Inside BERT's Linguistic Knowledge (ACL2019 WS)
- Analyzing the Structure of Attention in a Transformer Language Model (ACL2019 WS)
- What Does BERT Look At? An Analysis of BERT's Attention (ACL2019 WS)
- Blackbox meets blackbox: Representational Similarity and Stability Analysis of Neural Language Models and Brains (ACL2019 WS)
- Inducing Syntactic Trees from BERT Representations (ACL2019 WS)
- A Multiscale Visualization of Attention in the Transformer Model (ACL2019 Demo)
- Are Sixteen Heads Really Better than One?
- Multilingual Constituency Parsing with Self-Attention and Pre-Training (ACL2019)
- Cross-lingual Language Model Pretraining
- 75 Languages, 1 Model: Parsing Universal Dependencies Universally
- Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
- How multilingual is Multilingual BERT? (ACL2019)
- BioBERT: a pre-trained biomedical language representation model for biomedical text mining
- Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets (ACL2019 WS)
- ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission
- Publicly Available Clinical BERT Embeddings (NAACL2019 WS)
- SciBERT: Pretrained Contextualized Embeddings for Scientific Text
- PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model
- VideoBERT: A Joint Model for Video and Language Representation Learning
- ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
- Selfie: Self-supervised Pretraining for Image Embedding
- Contrastive Bidirectional Transformer for Temporal Representation Learning
- Cloze-driven Pretraining of Self-attention Networks
- Learning and Evaluating General Linguistic Intelligence
- To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks (ACL2019 WS)
- BERTScore: Evaluating Text Generation with BERT
- Machine Translation Evaluation with BERT Regressor
- Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
- Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
- Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment
- Improving Cuneiform Language Identification with BERT (NAACL2019 WS)