papers_with_arxiv_link.md


Long Papers

  • 2kenize: Tying Subword Sequences for Chinese Script Conversion [arXiv]
  • A Batch Normalized Inference Network Keeps the KL Vanishing Away [arXiv]
  • A Call for More Rigor in Unsupervised Cross-lingual Learning [arXiv]
  • A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks
  • A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking
  • A Corpus for Large-Scale Phonetic Typology [arXiv]
  • A Formal Hierarchy of RNN Architectures [arXiv]
  • A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization
  • A Generative Model for Joint Natural Language Understanding and Generation
  • A Girl Has A Name: Detecting Authorship Obfuscation [arXiv]
  • A Graph Auto-encoder Model of Derivational Morphology
  • A Graph-based Coarse-to-fine Method for Unsupervised Bilingual Lexicon Induction
  • A Joint Model for Document Segmentation and Segment Labeling
  • A Joint Neural Model for Information Extraction with Global Features
  • A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation [arXiv]
  • A Mixture of h − 1 Heads is Better than h Heads [arXiv]
  • A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
  • A Multitask Learning Approach for Diacritic Restoration
  • A Novel Cascade Binary Tagging Framework for Relational Triple Extraction [arXiv]
  • A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation
  • A Prioritization Model for Suicidality Risk Assessment
  • A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks [arXiv]
  • A Reinforced Generation of Adversarial Examples for Neural Machine Translation [arXiv]
  • A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction [arXiv]
  • A Span-based Linearization for Constituent Trees [arXiv]
  • A Study of Non-autoregressive Model for Sequence Generation [arXiv]
  • A Systematic Assessment of Syntactic Generalization in Neural Language Models [arXiv]
  • A Tale of Two Perplexities: Sensitivity of Neural Language Models to Lexical Retrieval Deficits in Dementia of the Alzheimer’s Type [arXiv]
  • A Top-down Neural Architecture towards Text-level Parsing of Discourse Rhetorical Structure [arXiv]
  • A Unified MRC Framework for Named Entity Recognition [arXiv]
  • Adaptive Compression of Word Embeddings
  • Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation
  • AdvAug: Robust Adversarial Augmentation for Neural Machine Translation
  • Adversarial and Domain-Aware BERT for Cross-Domain Sentiment Analysis
  • Adversarial NLI: A New Benchmark for Natural Language Understanding [arXiv]
  • Agreement Prediction of Arguments in Cyber Argumentation for Detecting Stance Polarity and Intensity
  • Aligned Dual Channel Graph Convolutional Network for Visual Question Answering
  • Amalgamation of protein sequence, structure and textual information for improving protein-protein interaction identification
  • AMR Parsing via Graph-Sequence Iterative Inference [arXiv]
  • AMR Parsing with Latent Structural Information
  • An analysis of the utility of explicit negative examples to improve the syntactic abilities of neural language models [arXiv]
  • An Effective Transition-based Model for Discontinuous NER [arXiv]
  • An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results
  • An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering
  • Analysing Lexical Semantic Change with Contextualised Word Representations
  • Analyzing analytical methods: The case of phonology in neural models of spoken language [arXiv]
  • Analyzing Political Parody in Social Media [arXiv]
  • Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition [arXiv]
  • Asking and Answering Questions to Evaluate the Factual Consistency of Summaries [arXiv]
  • Aspect Sentiment Classification with Document-level Sentiment Preference Modeling
  • ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations [arXiv]
  • Attend, Translate and Summarize: An Efficient Method for Neural Cross-Lingual Summarization
  • Attentive Pooling with Learnable Norms for Text Representation
  • Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics [arXiv]
  • Automated Evaluation of Writing – 50 Years and Counting
  • Automatic Detection of Generated Text is Easiest when Humans are Fooled [arXiv]
  • Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study
  • Automatic Poetry Generation from Prosaic Text
  • BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps [arXiv]
  • Balancing Objectives in Counseling Conversations: Advancing Forwards or Looking Backwards [arXiv]
  • Balancing Training for Multilingual Neural Machine Translation [arXiv]
  • BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension [arXiv]
  • Benchmarking Multimodal Regex Synthesis with Complex Structures [arXiv]
  • BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance [arXiv]
  • Beyond Accuracy: Behavioral Testing of NLP Models with CheckList [arXiv]
  • Beyond Possession Existence: Duration and Co-Possession
  • Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation [arXiv]
  • Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences
  • Biomedical Entity Representations with Synonym Marginalization [arXiv]
  • Bipartite Flat-Graph Network for Nested Named Entity Recognition [arXiv]
  • BiRRE: Learning Bidirectional Residual Relation Embeddings for Supervised Hypernymy Detection
  • BLEURT: Learning Robust Metrics for Text Generation [arXiv]
  • Boosting Neural Machine Translation with Similar Translations
  • Bootstrapping Techniques for Polysynthetic Morphological Analysis [arXiv]
  • BPE-Dropout: Simple and Effective Subword Regularization [arXiv]
  • Breaking Through the 80% Glass Ceiling: Raising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information
  • Bridging Anaphora Resolution as Question Answering [arXiv]
  • Bridging the Structural Gap Between Encoding and Decoding for Data-To-Text Generation
  • Building a User-Generated Content North-African Arabizi Treebank: Tackling Hell
  • Calibrating Structured Output Predictors for Natural Language Processing [arXiv]
  • CamemBERT: a Tasty French Language Model [arXiv]
  • Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction
  • Can You Put it All Together: Evaluating Conversational Agents’ Ability to Blend Skills [arXiv]
  • CDL: Curriculum Dual Learning for Emotion-Controllable Response Generation [arXiv]
  • ChartDialogs: Plotting from Natural Language Instructions
  • CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality
  • Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data
  • Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset [arXiv]
  • CluBERT: A Cluster-Based Approach for Learning Sense Distributions in Multiple Languages
  • CluHTM - Semantic Hierarchical Topic Modeling based on CluWords
  • Code and Named Entity Recognition in StackOverflow [arXiv]
  • CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning
  • Compositionality and Generalization In Emergent Languages [arXiv]
  • Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation [arXiv]
  • Connecting Embeddings for Knowledge Graph Entity Typing
  • Contextualized Weak Supervision for Text Classification
  • Continual Relation Learning via Episodic Memory Activation and Reconsolidation
  • Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation
  • CorefQA: Coreference Resolution as Query-based Span Prediction [arXiv]
  • Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation
  • CraftAssist Instruction Parsing: Semantic Parsing for a Voxel-World Assistant [arXiv]
  • Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus [arXiv]
  • Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning
  • Cross-Linguistic Syntactic Evaluation of Word Prediction Models [arXiv]
  • Cross-media Structured Common Space for Multimedia Event Extraction [arXiv]
  • Cross-modal Coherence Modeling for Caption Generation [arXiv]
  • Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage [arXiv]
  • Cross-Modality Relevance for Reasoning on Language and Vision [arXiv]
  • Curriculum Learning for Natural Language Understanding
  • Curriculum Pre-training for End-to-End Speech Translation [arXiv]
  • Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight [arXiv]
  • DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering [arXiv]
  • Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting [arXiv]
  • Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA [arXiv]
  • Dependency Graph Enhanced Dual-transformer Structure for Aspect-based Sentiment Classification
  • DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking [arXiv]
  • Detecting Perceived Emotions in Hurricane Disasters [arXiv]
  • Dialogue Coherence Assessment Without Explicit Dialogue Act Labels
  • Dialogue-Based Relation Extraction [arXiv]
  • Dice Loss for Data-imbalanced NLP Tasks [arXiv]
  • Differentiable Window for Dynamic Local Attention
  • Discourse as a Function of Event: Profiling Discourse Structure in News Articles around the Main Event
  • Discourse-Aware Neural Extractive Text Summarization [arXiv]
  • Discrete Latent Variable Representations for Low-Resource Text Classification
  • Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction [arXiv]
  • Distilling Annotations via Active Imitation Learning
  • Distilling Knowledge Learned in BERT for Text Generation [arXiv]
  • Distinguish Confusing Law Articles for Legal Judgment Prediction [arXiv]
  • Diverse and Informative Dialogue Generation with Context-Specific Commonsense Knowledge Awareness
  • Diversifying Dialogue Generation with Non-Conversational Text [arXiv]
  • Do Neural Language Models Show Preferences for Syntactic Formalisms? [arXiv]
  • Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? [arXiv]
  • Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension [arXiv]
  • Document Translation vs. Query Translation for Cross-Lingual Information Retrieval in the Medical Domain
  • Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding [arXiv]
  • Don’t Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training [arXiv]
  • Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks [arXiv]
  • DoQA - Accessing Domain-Specific FAQs via Conversational QA [arXiv]
  • Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [arXiv]
  • DRTS Parsing with Structure-Aware Encoding and Decoding [arXiv]
  • DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification [arXiv]
  • Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog [arXiv]
  • Dynamic Online Conversation Recommendation
  • Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation [arXiv]
  • ECPE-2D: Emotion-Cause Pair Extraction based on Joint Two-Dimensional Representation, Interaction and Prediction
  • Effective Estimation of Deep Generative Language Models [arXiv]
  • Effective Inter-Clause Modeling for End-to-End Emotion-Cause Pair Extraction
  • Efficient Constituency Parsing by Pointing
  • Efficient Dialogue State Tracking by Selectively Overwriting Memory [arXiv]
  • Efficient Pairwise Annotation of Argument Quality
  • Efficient Second-Order TreeCRF for Neural Dependency Parsing [arXiv]
  • Emergence of Syntax Needs Minimal Supervision [arXiv]
  • Emerging Cross-lingual Structure in Pretrained Language Models [arXiv]
  • Empower Entity Set Expansion via Language Model Probing [arXiv]
  • Empowering Active Learning to Jointly Optimize System and User Demands [arXiv]
  • End-to-End Bias Mitigation by Modelling Biases in Corpora [arXiv]
  • End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using GPT-2
  • End-to-End Neural Word Alignment Outperforms GIZA++ [arXiv]
  • Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension [arXiv]
  • Enhancing Cross-target Stance Detection with Transferable Semantic-Emotion Knowledge
  • ERASER: A Benchmark to Evaluate Rationalized NLP Models [arXiv]
  • ESPRIT: Explaining Solutions to Physical Reasoning Tasks [arXiv]
  • Estimating predictive uncertainty for rumour verification models [arXiv]
  • Estimating the influence of auxiliary tasks for multi-task learning of sequence tagging tasks
  • Evaluating and Enhancing the Robustness of Neural Network-based Dependency Parsing Models with Adversarial Examples
  • Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? [arXiv]
  • Evaluating Explanation Methods for Neural Machine Translation [arXiv]
  • Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder
  • Exact yet Efficient Graph Parsing, Bi-directional Locality and the Constructivist Hypothesis
  • Examining Citations of Natural Language Processing Literature [arXiv]
  • Examining the State-of-the-Art in News Timeline Summarization [arXiv]
  • Exclusive Hierarchical Decoding for Deep Keyphrase Generation [arXiv]
  • Expertise Style Transfer: A New Task Towards Better Communication between Experts and Laymen [arXiv]
  • Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions [arXiv]
  • Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading [arXiv]
  • Explicit Semantic Decomposition for Definition Generation
  • Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach [arXiv]
  • Exploiting the Syntax-Model Consistency for Neural Relation Extraction
  • Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer [arXiv]
  • Exploring Unexplored Generalization Challenges for Cross-Database Semantic Parsing
  • Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches [arXiv]
  • Extractive Summarization as Text Matching [arXiv]
  • Facet-Aware Evaluation for Extractive Summarization [arXiv]
  • Fact-based Text Editing
  • Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning [arXiv]
  • Fast and Accurate Non-Projective Dependency Tree Linearization
  • FastBERT: a Self-distilling BERT with Adaptive Inference Time [arXiv]
  • Feature Projection for Improved Text Classification
  • FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization [arXiv]
  • Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network
  • Finding Universal Grammatical Relations in Multilingual BERT [arXiv]
  • Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences [arXiv]
  • Fine-grained Fact Verification with Kernel Graph Attention Network [arXiv]
  • Fine-grained Interest Matching for Neural News Recommendation
  • Fluent Response Generation for Conversational Question Answering [arXiv]
  • From Arguments to Key Points: Towards Automatic Argument Summarization [arXiv]
  • From English to Code-Switching: Transfer Learning with Strong Morphological Clues [arXiv]
  • From SPMRL to NMRL: What Did We Learn (and Unlearn) in a Decade of Parsing Morphologically-Rich Languages (MRLs)? [arXiv]
  • From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains
  • Frugal Paradigm Completion
  • Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection [arXiv]
  • GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media [arXiv]
  • Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [arXiv]
  • Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations [arXiv]
  • Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus
  • Generalized Entropy Regularization or: There’s Nothing Special about Label Smoothing [arXiv]
  • Generalizing Natural Language Analysis through Span-relation Representations [arXiv]
  • Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation [arXiv]
  • Generating Counter Narratives against Online Hate Speech: Data and Strategies [arXiv]
  • Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [arXiv]
  • Generating Fact Checking Explanations [arXiv]
  • Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection [arXiv]
  • Generating Informative Conversational Response using Recurrent Knowledge-Interaction and Knowledge-Copy
  • Generative Semantic Hashing Enhanced via Boltzmann Machines
  • GLUECoS: An Evaluation Benchmark for Code-Switched NLP [arXiv]
  • GoEmotions: A Dataset of Fine-Grained Emotions [arXiv]
  • Good-Enough Compositional Data Augmentation [arXiv]
  • Graph Neural News Recommendation with Unsupervised Preference Disentanglement
  • Graph-to-Tree Learning for Solving Math Word Problems
  • Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs [arXiv]
  • Grounding Conversations with Improvised Dialogues [arXiv]
  • Guiding Variational Response Generator to Exploit Persona [arXiv]
  • Handling Rare Entities for Neural Sequence Labeling
  • Hard-Coded Gaussian Attention for Neural Machine Translation [arXiv]
  • Harnessing the linguistic signal to predict scalar inferences [arXiv]
  • Harvesting and Refining Question-Answer Pairs for Unsupervised QA [arXiv]
  • HAT: Hardware-Aware Transformers for Efficient Natural Language Processing [arXiv]
  • He said “who’s gonna take care of your children when you are at ACL?”: Reported Sexist Acts are Not Sexist
  • Heterogeneous Graph Neural Networks for Extractive Document Summarization [arXiv]
  • Heterogeneous Graph Transformer for Graph-to-Sequence Learning [arXiv]
  • Hierarchical Entity Typing via Multi-level Learning to Rank [arXiv]
  • Hierarchical Modeling for User Personality Prediction: The Role of Message-Level Attention
  • Hierarchy-Aware Global Model for Hierarchical Text Classification
  • Highway Transformer: Self-Gating Enhanced Self-Attentive Networks [arXiv]
  • Hiring Now: A Skill-Aware Multi-Attention Model for Job Posting Generation
  • History for Visual Dialog: Do we really need it? [arXiv]
  • Hooks in the Headline: Learning to Generate Headlines with Controlled Styles [arXiv]
  • How Accents Confound: Probing for Accent Information in End-to-End Speech Recognition Systems
  • How does BERT’s attention change when you fine-tune? An analysis methodology and a case study in negation scope
  • How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence [arXiv]
  • How Does Selective Mechanism Improve Self-Attention Networks? [arXiv]
  • How to Ask Good Questions? Try to Leverage Paraphrases
  • Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words?
  • Hyperbolic Capsule Networks for Multi-Label Classification
  • HyperCore: Hyperbolic and Co-graph Representation for Automatic ICD Coding
  • Image-Chat: Engaging Grounded Conversations [arXiv]
  • IMoJIE: Iterative Memory-Based Joint Open Information Extraction [arXiv]
  • Improved Natural Language Generation via Loss Truncation [arXiv]
  • Improving Adversarial Text Generation by Modeling the Distant Future [arXiv]
  • Improving Chinese Word Segmentation with Wordhood Memory Networks
  • Improving Disentangled Text Representation Learning with Information-Theoretic Guidance
  • Improving Disfluency Detection by Self-Training a Self-Attentive Model [arXiv]
  • Improving Event Detection via Open-domain Trigger Knowledge
  • Improving Image Captioning Evaluation by Considering Inter References Variance
  • Improving Image Captioning with Better Use of Caption
  • Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [arXiv]
  • Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings
  • Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer
  • Improving Neural Machine Translation with Soft Template Prediction
  • Improving Segmentation for Technical Support Problems [arXiv]
  • Improving Transformer Models by Reordering their Sublayers [arXiv]
  • Improving Truthfulness of Headline Generation [arXiv]
  • In Layman’s Terms: Semi-Open Relation Extraction from Scientific Texts [arXiv]
  • In Neural Machine Translation, What Does Transfer Learning Transfer?
  • Inflecting when there’s no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals [arXiv]
  • Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models [arXiv]
  • Information-Theoretic Probing for Linguistic Structure [arXiv]
  • INFOTABS: Inference on Tables as Semi-structured Data [arXiv]
  • Injecting Numerical Reasoning Skills into Language Models [arXiv]
  • INSET: Sentence Infilling with INter-SEntential Transformer
  • Integrating Multimodal Information in Large Pretrained Transformers
  • Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection [arXiv]
  • Interactive Classification by Asking Informative Questions [arXiv]
  • Interactive Construction of User-Centric Dictionary for Text Analytics
  • Interactive Machine Comprehension with Information Seeking Agents [arXiv]
  • Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work? [arXiv]
  • Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings
  • Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions
  • Investigating Word-Class Distributions in Word Vector Spaces
  • iSarcasm: A Dataset of Intended Sarcasm [arXiv]
  • It Takes Two to Lie: One to Lie, and One to Listen
  • It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations [arXiv]
  • Iterative Edit-Based Unsupervised Sentence Simplification
  • Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge
  • Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging [arXiv]
  • Joint Modelling of Emotion and Abusive Language Detection [arXiv]
  • Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization
  • Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation
  • KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation [arXiv]
  • KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis [arXiv]
  • KLEJ: Comprehensive Benchmark for Polish Language Understanding [arXiv]
  • Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation [arXiv]
  • Knowledge Graph Embedding Compression
  • Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward [arXiv]
  • Language (Re)modelling: Towards Embodied Language Understanding [arXiv]
  • Language (technology) is power: The need to be explicit about NLP harms
  • Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese [arXiv]
  • Language to Network: Conditional Parameter Adaptation with Natural Language Descriptions
  • Large Scale Multi-Actor Generative Dialog Modeling [arXiv]
  • Learning a Multi-Domain Curriculum for Neural Machine Translation [arXiv]
  • Learning and Evaluating Emotion Lexicons for 91 Languages [arXiv]
  • Learning Architectures from an Extended Search Space for Language Modeling [arXiv]
  • Learning Constraints for Structured Prediction Using Rectifier Networks
  • Learning Dialog Policies from Weak Demonstrations [arXiv]
  • Learning Efficient Dialogue Policy from Demonstrations through Shaping
  • Learning Interpretable Relationships between Entities, Relations and Concepts via Bayesian Structure Learning on Open Domain Facts
  • Learning Source Phrase Representations for Neural Machine Translation [arXiv]
  • Learning to Ask More: Semi-Autoregressive Sequential Question Generation under Dual-Graph Interaction
  • Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling [arXiv]
  • Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks [arXiv]
  • Learning to Deceive with Attention-Based Explanations [arXiv]
  • Learning to execute instructions in a Minecraft dialogue
  • Learning to Faithfully Rationalize by Construction [arXiv]
  • Learning to Identify Follow-Up Questions in Conversational Question Answering
  • Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation
  • Learning to Segment Actions from Observation and Narration [arXiv]
  • Learning to Update Natural Language Comments Based on Code Changes [arXiv]
  • Learning Web-based Procedures by Reasoning over Explanations and Demonstrations in Context
  • Leveraging Graph to Improve Abstractive Multi-Document Summarization [arXiv]
  • Line Graph Enhanced AMR-to-Text Generation with Mix-Order Graph Attention Networks
  • Location Attention for Extrapolation to Longer Sequences [arXiv]
  • Logical Natural Language Generation from Open-Domain Tables [arXiv]
  • LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network [arXiv]
  • Low-Dimensional Hyperbolic Knowledge Graph Embeddings [arXiv]
  • Low-Resource Generation of Multi-hop Reasoning Questions
  • Machine Reading of Historical Events
  • Mapping Natural Language Instructions to Mobile UI Action Sequences [arXiv]
  • MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning [arXiv]
  • Masked Language Model Scoring [arXiv]
  • MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization [arXiv]
  • Max-Margin Incremental CCG Parsing
  • Measuring Forecasting Skill from Text
  • Meta-Reinforced Multi-Domain State Generator for Dialogue Systems
  • MIE: A Medical Information Extractor towards Medical Dialogues
  • Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [arXiv]
  • MIND: A Large-scale Dataset for News Recommendation
  • MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification [arXiv]
  • MLQA: Evaluating Cross-lingual Extractive Question Answering [arXiv]
  • MMPE: A Multi-Modal Interface for Post-Editing Machine Translation
  • MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices [arXiv]
  • Modeling Code-Switch Languages Using Bilingual Parallel Corpus
  • Modeling Morphological Typology for Unsupervised Learning of Language Morphology
  • Modelling Context and Syntactical Features for Aspect-based Sentiment Analysis
  • More Diverse Dialogue Datasets via Diversity-Informed Data Collection
  • Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders [arXiv]
  • Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning [arXiv]
  • Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition [arXiv]
  • Multi-Cell Compositional LSTM for NER Domain Adaptation
  • Multidirectional Associative Optimization of Function-Specific Word Representations
  • Multi-Domain Dialogue Acts and Response Co-Generation [arXiv]
  • Multi-Domain Named Entity Recognition with Genre-Aware and Agnostic Inference
  • Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing [arXiv]
  • Multi-Granularity Interaction Network for Extractive and Abstractive Multi-Document Summarization
  • Multi-Hypothesis Machine Translation Evaluation
  • Multi-Label and Multilingual News Framing Analysis
  • Multimodal Neural Graph Memory Networks for Visual Question Answering
  • MultiQT: Multimodal learning for real-time question tracking in speech [arXiv]
  • Multiscale Collaborative Deep Models for Neural Machine Translation [arXiv]
  • Multi-Sentence Argument Linking [arXiv]
  • Multi-source Meta Transfer for Low Resource Multiple-Choice Question Answering
  • MuTual: A Dataset for Multi-Turn Dialogue Reasoning [arXiv]
  • Named Entity Recognition without Labelled Data: A Weak Supervision Approach [arXiv]
  • NAT: Noise-Aware Training for Robust Neural Sequence Labeling [arXiv]
  • Negative Training for Neural Dialogue Response Generation [arXiv]
  • Neighborhood Matching Network for Entity Alignment [arXiv]
  • NeuInfer: Knowledge Inference on N-ary Facts
  • Neural CRF Model for Sentence Alignment in Text Simplification [arXiv]
  • Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence [arXiv]
  • Neural Generation of Dialogue Response Timings [arXiv]
  • Neural Mixed Counting Models for Dispersed Topic Discovery
  • Neural Reranking for Dependency Parsing: An Evaluation
  • Neural Syntactic Preordering for Controlled Paraphrase Generation [arXiv]
  • Neural Topic Modeling with Bidirectional Adversarial Training [arXiv]
  • NILE : Natural Language Inference with Faithful Natural Language Explanations [arXiv]
  • Norm-Based Curriculum Learning for Neural Machine Translation
  • Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses [arXiv]
  • Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection [arXiv]
  • Obtaining Faithful Interpretations from Compositional Neural Networks [arXiv]
  • On Faithfulness and Factuality in Abstractive Summarization [arXiv]
  • On the Cross-lingual Transferability of Monolingual Representations [arXiv]
  • On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond [arXiv]
  • On The Evaluation of Machine Translation Systems Trained With Back-Translation [arXiv]
  • On the Inference Calibration of Neural Machine Translation [arXiv]
  • On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation [arXiv]
  • On the Robustness of Language Encoders against Grammatical Errors [arXiv]
  • One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases [arXiv]
  • Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports [arXiv]
  • Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding [arXiv]
  • Out of the Echo Chamber: Detecting Countering Debate Speeches [arXiv]
  • ParaCrawl: Web-Scale Acquisition of Parallel Corpora
  • Parallel Corpus Filtering via Pre-trained Language Models [arXiv]
  • Paraphrase Augmented Task-Oriented Dialog Generation [arXiv]
  • Paraphrase Generation by Learning How to Edit from Samples
  • Parsing into Variable-in-situ Logico-Semantic Graphs
  • Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT [arXiv]
  • PeTra: A Sparsely Supervised Memory Model for People Tracking [arXiv]
  • Phone Features Improve Speech Translation [arXiv]
  • Phonetic and Visual Priors for Decipherment of Informal Romanization [arXiv]
  • PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable [arXiv]
  • Politeness Transfer: A Tag and Generate Approach
  • Posterior Control of Blackbox Generation [arXiv]
  • Predicting Declension Class from Form and Meaning [arXiv]
  • Predicting Depression in Screening Interviews from Latent Categorization of Interview Prompts
  • Predicting Performance for Natural Language Processing Tasks [arXiv]
  • Predicting the Focus of Negation: Model and Error Analysis
  • Predicting the Growth of Morphological Families from Social and Linguistic Factors
  • Predicting the Topical Stance and Political Leaning of Media using Tweets
  • Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview [arXiv]
  • Premise Selection in Natural Language Mathematical Texts
  • Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders [arXiv]
  • Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [arXiv]
  • Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models [arXiv]
  • Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering [arXiv]
  • Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order [arXiv]
  • Probing for referential information in language models
  • Probing Linguistic Features of Sentence-Level Representations in Relation Extraction [arXiv]
  • Probing Linguistic Systematicity [arXiv]
  • Programming in Natural Language with fuSE: Synthesizing Methods from Spoken Utterances Using Deep Natural Language Understanding
  • PuzzLing Machines: A Challenge on Learning From Small Data [arXiv]
  • Pyramid: A Layered Model for Nested Named Entity Recognition
  • QuASE: Question-Answer Driven Sentence Encoding [arXiv]
  • R^3: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge [arXiv]
  • Rationalizing Medical Relation Prediction from Corpus-level Statistics [arXiv]
  • Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport [arXiv]
  • RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers [arXiv]
  • Reasoning Over Semantic-Level Graph for Fact Checking [arXiv]
  • Reasoning with Latent Structure Refinement for Document-Level Relation Extraction [arXiv]
  • Reasoning with Multimodal Sarcastic Tweets via Modeling Cross-Modality Contrast and Semantic Association
  • (Re)construing Meaning in NLP [arXiv]
  • Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension [arXiv]
  • Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment
  • Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem [arXiv]
  • Refer360°: A Referring Expression Recognition Dataset in 360° Images
  • ReInceptionE: Relation-Aware Inception Network with Joint Local-Global Structural Information for Knowledge Graph Embedding
  • Relabel the Noise: Joint Extraction of Entities and Relations via Cooperative Multiagents [arXiv]
  • Relational Graph Attention Network for Aspect-based Sentiment Analysis [arXiv]
  • Relation-Aware Collaborative Learning for Unified Aspect-Based Sentiment Analysis
  • Representation Learning for Information Extraction from Form-like Documents
  • Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation [arXiv]
  • Rethinking Dialogue State Tracking with Reasoning [arXiv]
  • Review-based Question Generation with Adaptive Instance Transfer and Augmentation [arXiv]
  • Revisiting the Context Window for Cross-lingual Word Embeddings [arXiv]
  • Rigid Formats Controlled Text Generation [arXiv]
  • RikiNet: Reading Wikipedia Pages for Natural Question Answering [arXiv]
  • Robust Encodings: A Framework for Combating Adversarial Typos [arXiv]
  • Roles and Utilization of Attention Heads in Transformer-based Neural Language Models
  • S2ORC: The Semantic Scholar Open Research Corpus [arXiv]
  • SAS: Dialogue State Tracking via Slot Attention and Slot Information Sharing
  • SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations [arXiv]
  • schuBERT: Optimizing Elements of BERT [arXiv]
  • SciREX: A Challenge Dataset for Document-Level Information Extraction [arXiv]
  • Screenplay Summarization Using Latent Narrative Structure [arXiv]
  • ScriptWriter: Narrative-Guided Script Generation [arXiv]
  • SEEK: Segmented Embedding of Knowledge Graphs [arXiv]
  • Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation [arXiv]
  • Selective Question Answering under Domain Shift
  • Semantic Graphs for Generating Deep Questions [arXiv]
  • Semantic Parsing for English as a Second Language
  • Semantic Scaffolds for Pseudocode-to-Code Generation [arXiv]
  • Semi-supervised Contextual Historical Text Normalization
  • Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation [arXiv]
  • Semi-Supervised Semantic Dependency Parsing Using CRF Autoencoders
  • SenseBERT: Driving Some Sense into BERT [arXiv]
  • SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics [arXiv]
  • Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis
  • SeqVAT: Virtual Adversarial Training for Semi-Supervised Sequence Labeling
  • Should All Cross-Lingual Embeddings Speak English?
  • Similarity Analysis of Contextual Word Representation Models [arXiv]
  • Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora
  • Simplify the Usage of Lexicon in Chinese NER [arXiv]
  • SimulSpeech: End-to-End Simultaneous Speech to Text Translation
  • Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language [arXiv]
  • SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis [arXiv]
  • Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network
  • SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization [arXiv]
  • Social Bias Frames: Reasoning about Social and Power Implications of Language [arXiv]
  • Sources of Transfer in Multilingual Named Entity Recognition [arXiv]
  • Span Selection Pre-training for Question Answering [arXiv]
  • Span-based Localizing Network for Natural Language Video Localization [arXiv]
  • SpanMlt: A Span-based Multi-Task Learning Framework for Pair-wise Aspect and Opinion Terms Extraction
  • Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback [arXiv]
  • Speaker Sensitive Response Evaluation Model
  • Speakers enhance contextually confusable words
  • SPECTER: Document-level Representation Learning using Citation-informed Transformers [arXiv]
  • Speech Translation and the End-to-End Promise: Taking Stock of Where We Are [arXiv]
  • SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check [arXiv]
  • Spelling Error Correction with Soft-Masked BERT [arXiv]
  • Spying on your neighbors: Fine-grained probing of contextual embeddings for information about surrounding words [arXiv]
  • STARC: Structured Annotations for Reading Comprehension [arXiv]
  • Stock Embeddings Acquired from News Articles and Price History, and an Application to Portfolio Optimization
  • Storytelling with Dialogue: A Critical Role Dungeons and Dragons Dataset
  • Structural Information Preserving for Graph-to-Text Generation
  • Structured Tuning for Semantic Role Labeling [arXiv]
  • Structure-Level Knowledge Distillation For Multilingual Sequence Labeling [arXiv]
  • Suspense in Short Stories is Predicted By Uncertainty Reduction over Neural Story Representation [arXiv]
  • Synchronous Double-channel Recurrent Network for Aspect-Opinion Pair Extraction
  • Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation [arXiv]
  • Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks
  • TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data [arXiv]
  • TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task [arXiv]
  • TAG: Type Auxiliary Guiding for Code Comment Generation [arXiv]
  • Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics
  • TaPas: Weakly Supervised Table Parsing via Pre-training [arXiv]
  • Target Inference in Argument Conclusion Generation
  • Taxonomy Construction of Unseen Domains via Graph-based Cross-Domain Knowledge Transfer
  • Tchebycheff Procedure for Multi-task Text Classification
  • Temporal Common Sense Acquisition with Minimal Supervision [arXiv]
  • Temporally-Informed Analysis of Named Entity Recognition
  • Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates [arXiv]
  • Text-Based Ideal Points [arXiv]
  • That is a Known Lie: Detecting Previously Fact-Checked Claims [arXiv]
  • “The Boating Store Had Its Best Sail Ever”: Pronunciation-attentive Contextualized Pun Recognition [arXiv]
  • The Cascade Transformer: an Application for Efficient Answer Sentence Selection [arXiv]
  • The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents [arXiv]
  • The Paradigm Discovery Problem [arXiv]
  • The Right Tool for the Job: Matching Model and Instance Complexities [arXiv]
  • The Sensitivity of Language Models and Humans to Winograd Schema Perturbations [arXiv]
  • The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain
  • The State and Fate of Linguistic Diversity and Inclusion in the NLP World [arXiv]
  • The Summary Loop: Learning to Write Abstractive Summaries Without Examples
  • The TechQA Dataset [arXiv]
  • The Unstoppable Rise of Computational Linguistics in Deep Learning [arXiv]
  • To Boldly Query What No One Has Annotated Before? The Frontiers of Corpus Querying
  • To Test Machine Comprehension, Start by Defining Comprehension [arXiv]
  • Toward Gender-Inclusive Coreference Resolution [arXiv]
  • Towards Conversational Recommendation over Multi-Type Dialogs [arXiv]
  • Towards Debiasing Sentence Representations
  • Towards Emotion-aided Multi-modal Dialogue Act Classification
  • Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints [arXiv]
  • Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation
  • Towards Interpretable Clinical Diagnosis with Bayesian Network Ensembles Stacked on Entity-Aware CNNs
  • Towards Robustifying NLI Models Against Lexical Dataset Biases [arXiv]
  • Towards Transparent and Explainable Attention Models [arXiv]
  • Towards Understanding Gender Bias in Relation Extraction [arXiv]
  • Towards Unsupervised Language Understanding and Generation by Joint Dual Learning [arXiv]
  • Toxicity Detection: Does Context Really Matter?
  • Transition-based Directed Graph Construction for Emotion-Cause Pair Extraction
  • Transition-based Semantic Dependency Parsing with Pointer Networks [arXiv]
  • Translationese as a Language in “Multilingual” NMT [arXiv]
  • TransS-Driven Joint Learning Architecture for Implicit Discourse Relation Recognition
  • TVQA+: Spatio-Temporal Grounding for Video Question Answering [arXiv]
  • TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories [arXiv]
  • Uncertainty-Aware Curriculum Learning for Neural Machine Translation [arXiv]
  • Understanding Attention for Text Classification
  • Understanding the Language of Political Agreement and Disagreement in Legislative Texts
  • Universal Decompositional Semantic Parsing
  • Unknown Intent Detection Using Gaussian Mixture Model with an Application to Zero-shot Intent Classification
  • Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering [arXiv]
  • Unsupervised Cross-lingual Representation Learning at Scale [arXiv]
  • Unsupervised Domain Clusters in Pretrained Language Models [arXiv]
  • Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing [arXiv]
  • Unsupervised Morphological Paradigm Completion [arXiv]
  • Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting [arXiv]
  • Unsupervised Opinion Summarization as Copycat-Review Generation [arXiv]
  • Unsupervised Opinion Summarization with Noising and Denoising [arXiv]
  • Unsupervised Paraphrasing by Simulated Annealing [arXiv]
  • USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation [arXiv]
  • Weight Poisoning Attacks on Pretrained Models [arXiv]
  • What are the Goals of Distributional Semantics? [arXiv]
  • What determines the order of adjectives in English? Comparing efficiency-based theories using dependency treebanks
  • What Question Answering can Learn from Trivia Nerds [arXiv]
  • What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context [arXiv]
  • When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People? [arXiv]
  • “Who said it, and Why?” Provenance for Natural Language Claims
  • WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge [arXiv]
  • Word-level Textual Adversarial Attacking as Combinatorial Optimization [arXiv]
  • XtremeDistil: Multi-stage Distillation for Massive Multilingual Models [arXiv]
  • You Impress Me: Dialogue Generation via Mutual Persona Perception [arXiv]
  • Zero-shot Text Classification via Reinforced Self-training
  • Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking
  • ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages [arXiv]

Short Papers

  • A Complete Shift-Reduce Chinese Discourse Parser with Robust Dynamic Oracle
  • A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers
  • A Frame-based Sentence Representation for Machine Reading Comprehension
  • A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal [arXiv]
  • A Multi-Perspective Architecture for Semantic Code Search
  • A negative case analysis of visual grounding methods for VQA [arXiv]
  • A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing [arXiv]
  • A Re-evaluation of Knowledge Graph Completion Methods [arXiv]
  • A Relational Memory-based Embedding Model for Triple Classification and Search Personalization [arXiv]
  • A Relaxed Matching Procedure for Unsupervised BLI
  • A Retrieve-and-Rewrite Initialization Method for Unsupervised Machine Translation
  • A Simple and Effective Unified Encoder for Document-Level Machine Translation
  • A Tale of a Probe and a Parser [arXiv]
  • A Three-Parameter Rank-Frequency Relation in Natural Languages
  • A Transformer-based Approach for Source Code Summarization [arXiv]
  • A Two-Stage Masked LM Method for Term Set Expansion [arXiv]
  • A Two-Step Approach for Implicit Event Argument Detection
  • Active Learning for Coreference Resolution using Discrete Annotation [arXiv]
  • An Empirical Comparison of Unsupervised Constituency Parsing Methods
  • Analyzing the Persuasive Effect of Style in News Editorial Argumentation
  • Are we Estimating or Guesstimating Translation Quality?
  • Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization [arXiv]
  • Autoencoding Keyword Correlation Graph for Document Clustering
  • Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring
  • Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model
  • Bayesian Hierarchical Words Representation Learning [arXiv]
  • Benefits of Intermediate Annotations in Reading Comprehension
  • Camouflaged Chinese Spam Content Detection with Semi-supervised Generative Active Learning
  • Character-Level Translation with Self-attention [arXiv]
  • ClarQ: A large-scale and diverse dataset for Clarification Question Generation
  • Classification-Based Self-Learning for Weakly Supervised Bilingual Lexicon Induction
  • Clinical Concept Linking with Contextualized Neural Representations
  • Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain [arXiv]
  • Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling [arXiv]
  • Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection [arXiv]
  • Composing Elementary Discourse Units in Abstractive Summarization
  • Content Word Aware Neural Machine Translation
  • Contextual Embeddings: When Are They Worth It?
  • Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns [arXiv]
  • Contextualized Sparse Representations for Real-Time Open-Domain Question Answering [arXiv]
  • Contextualizing Hate Speech Classifiers with Post-hoc Explanation [arXiv]
  • Contrastive Self-Supervised Learning for Commonsense Reasoning [arXiv]
  • Controlled Crowdsourcing for High-Quality QA-SRL Annotation [arXiv]
  • Conversational Word Embedding for Retrieval-Based Dialog System [arXiv]
  • Crawling and Preprocessing Mailing Lists At Scale for Dialog Analysis
  • Crossing Variational Autoencoders for Answer Retrieval [arXiv]
  • DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference [arXiv]
  • Designing Precise and Robust Dialogue Response Evaluators [arXiv]
  • Dialogue State Tracking with Explicit Slot Connection Modeling
  • Do Transformers Need Deep Long-Range Memory?
  • Do you have the right scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods
  • Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation [arXiv]
  • Don’t Eclipse Your Arts Due to Small Discrepancies: Boundary Repositioning with a Pointer Network for Aspect Extraction
  • Dscorer: A Fast Evaluation Metric for Discourse Representation Structure Parsing
  • Dynamic Memory Induction Networks for Few-Shot Text Classification [arXiv]
  • Dynamic Sampling Strategies for Multi-Task Reading Comprehension
  • Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change [arXiv]
  • Efficient strategies for hierarchical text classification: external knowledge and auxiliary tasks [arXiv]
  • Embarrassingly Simple Unsupervised Aspect Extraction [arXiv]
  • Enabling Language Models to Fill in the Blanks [arXiv]
  • Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction [arXiv]
  • ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation [arXiv]
  • Enhancing Machine Translation with Dependency-Aware Self-Attention [arXiv]
  • Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention [arXiv]
  • Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing [arXiv]
  • Entity-Aware Dependency-Based Deep Graph Attention Network for Comparative Preference Classification
  • Estimating Mutual Information Between Dense Word Embeddings
  • Evaluating Dialogue Generation Systems via Response Selection [arXiv]
  • Evaluating Robustness to Input Perturbations for Neural Machine Translation [arXiv]
  • Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks [arXiv]
  • ExpBERT: Representation Engineering with Natural Language Explanations [arXiv]
  • Exploiting Personal Characteristics of Debaters for Predicting Persuasiveness
  • Exploring Content Selection in Summarization of Novel Chapters [arXiv]
  • Fact-based Content Weighting for Evaluating Abstractive Summarisation
  • Fatality Killed the Cat or: BabelPic, a Multimodal Dataset for Non-Concrete Concepts
  • Few-Shot NLG with Pre-Trained Language Model [arXiv]
  • FLAT: Chinese NER Using Flat-Lattice Transformer [arXiv]
  • GAN-BERT: Generative Adversarial Learning for Robust Text Classification with a Bunch of Labeled Examples
  • Geometry-aware domain adaptation for unsupervised alignment of word embeddings [arXiv]
  • Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? [arXiv]
  • Glyph2Vec: Learning Chinese Out-of-Vocabulary Word Embedding from Glyphs
  • GPT-too: A language-model-first approach for AMR-to-text generation [arXiv]
  • How Can We Accelerate Progress Towards Human-like Linguistic Generalization? [arXiv]
  • Hypernymy Detection for Low-Resource Languages via Meta Learning
  • Identifying Principals and Accessories in a Complex Case based on the Comprehension of Fact Description
  • Implicit Discourse Relation Classification: We Need to Talk about Evaluation
  • Improved Speech Representations with Multi-Target Autoregressive Predictive Coding [arXiv]
  • Improving Entity Linking through Semantic Reinforced Entity Embeddings
  • Improving Low-Resource Named Entity Recognition using Joint Sentence and Token Labeling
  • Improving Non-autoregressive Neural Machine Translation with Monolingual Data [arXiv]
  • Incorporating External Knowledge through Pre-training for Natural Language to Code Generation [arXiv]
  • Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition [arXiv]
  • Interpretable Operational Risk Classification with Semi-Supervised Variational Autoencoder
  • Interpreting Twitter User Geolocation
  • Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds [arXiv]
  • It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information [arXiv]
  • Keyphrase Generation for Scientific Document Retrieval
  • Knowledge Supports Visual Language Grounding: A Case Study on Colour Terms
  • Language-aware Interlingua for Multilingual Neural Machine Translation
  • Learning an Unreferenced Metric for Online Dialogue Evaluation [arXiv]
  • Learning Implicit Text Generation via Feature Matching [arXiv]
  • Learning Low-Resource End-To-End Goal-Oriented Dialog for Fast and Reliable System Deployment
  • Learning Robust Models for e-Commerce Product Search [arXiv]
  • Learning Spoken Language Representations with Neural Lattice Language Modeling
  • Learning to Tag OOV Tokens by Integrating Contextual Representation and Background Knowledge
  • Learning to Understand Child-directed and Adult-directed Speech [arXiv]
  • Let Me Choose: From Verbal Context to Font Selection [arXiv]
  • Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation [arXiv]
  • Lexically Constrained Neural Machine Translation with Levenshtein Transformer [arXiv]
  • Lipschitz Constrained Parameter Initialization for Deep Transformers [arXiv]
  • Logic-Guided Data Augmentation and Regularization for Consistent Question Answering [arXiv]
  • Low Resource Sequence Tagging using Sentence Reconstruction
  • Make Up Your Mind! Adversarial Generation of Inconsistent Natural Language Explanations [arXiv]
  • Masking Actor Information Leads to Fairer Political Claims Detection
  • Meta-Transfer Learning for Code-Switched Speech Recognition [arXiv]
  • Mitigating Gender Bias Amplification in Distribution by Posterior Regularization [arXiv]
  • Modeling Label Semantics for Predicting Emotional Reactions
  • Modeling Long Context for Task-Oriented Dialogue State Generation [arXiv]
  • Modeling Word Formation in English–German Neural Machine Translation
  • MOOCCube: A Large-scale Data Repository for NLP Applications in MOOCs
  • Multimodal and Multiresolution Speech Recognition with Transformers
  • Multimodal Quality Estimation for Machine Translation
  • Multimodal Transformer for Multimodal Machine Translation
  • Named Entity Recognition as Dependency Parsing [arXiv]
  • Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly [arXiv]
  • Neural Graph Matching Networks for Chinese Short Text Matching
  • Neural Temporal Opinion Modelling for Opinion Prediction on Twitter [arXiv]
  • Neural-DINF: A Neural Network based Framework for Measuring Document Influence
  • Non-Linear Instance-Based Cross-Lingual Mapping for Non-Isomorphic Embedding Spaces
  • “None of the Above”: Measure Uncertainty in Dialog Response Retrieval [arXiv]
  • On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation [arXiv]
  • On Forgetting to Cite Older Papers: An Analysis of the ACL Anthology
  • On Importance Sampling-Based Evaluation of Latent Language Models
  • On the Importance of Diversity in Question Generation for QA
  • On the Spontaneous Emergence of Discrete and Compositional Signals [arXiv]
  • OpinionDigest: A Simple Framework for Opinion Summarization [arXiv]
  • Opportunistic Decoding with Timely Correction for Simultaneous Translation [arXiv]
  • Overestimation of Syntactic Representation in Neural Language Models [arXiv]
  • Parallel Data Augmentation for Formality Style Transfer [arXiv]
  • Parallel Sentence Mining by Constrained Decoding
  • Posterior Calibrated Training on Sentence Classification Tasks [arXiv]
  • Predicting Degrees of Technicality in Automatic Terminology Extraction
  • Pretrained Transformers Improve Out-of-Distribution Robustness [arXiv]
  • Quantifying Attention Flow in Transformers [arXiv]
  • Query Graph Generation for Answering Multi-hop Complex Questions from Knowledge Bases
  • R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason [arXiv]
  • Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models
  • Recursive Template-based Frame Generation for Task Oriented Dialog
  • Regularized Context Gates on Transformer for Machine Translation [arXiv]
  • Relation Extraction with Explanation
  • Representations of Syntax [MASK] Useful: Effects of Constituency and Dependency Structure in Recursive LSTMs [arXiv]
  • Returning the N to NLP: Towards Contextually Personalized Classification Models
  • Reverse Engineering Configurations of Neural Text Generation Models [arXiv]
  • Revisiting Higher-Order Dependency Parsers
  • Revisiting Unsupervised Relation Extraction [arXiv]
  • SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions
  • Self-Attention Guided Copy Mechanism for Abstractive Summarization
  • Self-Attention with Cross-Lingual Position Representation [arXiv]
  • Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity [arXiv]
  • Shape of synth to come: Why we should use synthetic data for English surface realization [arXiv]
  • Shaping Visual Representations with Language for Few-Shot Classification [arXiv]
  • Showing Your Work Doesn’t Always Work [arXiv]
  • Simple and Effective Retrieve-Edit-Rerank Text Generation
  • Simultaneous Translation Policies: From Fixed to Adaptive [arXiv]
  • Single Model Ensemble using Pseudo-Tags and Distinct Vectors [arXiv]
  • Smart To-Do: Automatic Generation of To-Do Items from Emails
  • Social Biases in NLP Models as Barriers for Persons with Disabilities [arXiv]
  • Soft Gazetteers for Low-Resource Named Entity Recognition [arXiv]
  • Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations [arXiv]
  • Stolen Probability: A Structural Weakness of Neural Language Models [arXiv]
  • Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture [arXiv]
  • SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization [arXiv]
  • Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi [arXiv]
  • Syntactic Data Augmentation Increases Robustness to Inference Heuristics [arXiv]
  • Tagged Back-translation Revisited: Why Does It Really Work?
  • tBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection
  • Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering [arXiv]
  • Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference [arXiv]
  • Text Classification with Negative Supervision
  • To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks
  • Topological Sort for Sentence Ordering [arXiv]
  • Toward Better Storylines with Sentence-Level Language Models [arXiv]
  • Towards Better Non-Tree Argument Mining: Proposition-Level Biaffine Parsing with Task-Specific Parameterization
  • Towards end-2-end learning for predicting behavior codes from spoken utterances in psychotherapy conversations
  • Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness? [arXiv]
  • Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation [arXiv]
  • Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering [arXiv]
  • Treebank Embedding Vectors for Out-of-domain Dependency Parsing [arXiv]
  • Tree-Structured Neural Topic Model
  • TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition [arXiv]
  • Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data [arXiv]
  • Uncertain Natural Language Inference [arXiv]
  • Understanding Advertisements with BERT
  • Unsupervised FAQ Retrieval with Question Generation and BERT
  • Using Context in Neural Machine Translation Training Objectives [arXiv]
  • Variational Neural Machine Translation with Normalizing Flows [arXiv]
  • Verbal Multiword Expressions for Identification of Metaphor
  • Video-Grounded Dialogues with Pretrained Generation Language Models
  • What Does BERT with Vision Look At?
  • What is Learned in Visually Grounded Neural Syntax Acquisition [arXiv]
  • Why Overfitting Isn’t Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries [arXiv]
  • Will-They-Won’t-They: A Very Large Dataset for Stance Detection on Twitter [arXiv]
  • Words aren’t enough, their order matters: On the Robustness of Grounding Visual Referring Expressions [arXiv]
  • Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation [arXiv]
  • Would you Rather? A New Benchmark for Learning Machine Alignment with Cultural Values and Social Preferences
  • You Don’t Have Time to Read This: An Exploration of Document Reading Time Prediction
  • “You Sound Just Like Your Father” Commercial Machine Translation Systems Include Stylistic Biases
  • ZPR2: Joint Zero Pronoun Recovery and Resolution using Multi-Task Learning and BERT

System Demonstrations

  • ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents [arXiv]
  • BENTO: A Visual Platform for Building Clinical NLP Pipelines Based on CodaLab
  • Clinical-Coder: Assigning Interpretable ICD-10 Codes to Chinese Clinical Notes
  • CLIReval: Evaluating Machine Translation as a Cross-Lingual Information Retrieval Task
  • Conversation Learner - A Machine Teaching Tool for Building Dialog Managers for Task-Oriented Dialog Systems [arXiv]
  • ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems [arXiv]
  • DIALOGPT: Large-Scale Generative Pre-training for Conversational Response Generation [arXiv]
  • Embedding-based Scientific Literature Discovery in a Text Editor Application [arXiv]
  • ESPnet-ST: All-in-One Speech Translation Toolkit [arXiv]
  • EVIDENCEMINER: Textual Evidence Discovery for Life Sciences
  • exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformer Models [arXiv]
  • GAIA: A Fine-grained Multimedia Knowledge Extraction System
  • Interactive Task Learning from GUI-Grounded Natural Language Instructions and Demonstrations [arXiv]
  • jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
  • Label Noise in Context
  • LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation [arXiv]
  • LinggleWrite: a Coaching System for Essay Writing
  • MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform [arXiv]
  • MMPE: A Multi-Modal Interface using Handwriting, Touch Reordering, and Speech Commands for Post-Editing Machine Translation
  • Multilingual Universal Sentence Encoder for Semantic Retrieval [arXiv]
  • Nakdan: Professional Hebrew Diacritizer [arXiv]
  • NLP Scholar: An Interactive Visual Explorer for Natural Language Processing Literature
  • NSTM: Real-Time Query-Driven News Overview Composition at Bloomberg
  • OpusFilter: A Configurable Parallel Corpus Filtering Toolbox
  • Penman: An Open-Source Library and Tool for AMR Graphs
  • Personalized PageRank with Syntagmatic Information for Multilingual Word Sense Disambiguation
  • Photon: A Robust Cross-Domain Text-to-SQL System
  • Prta: A System to Support the Analysis of Propaganda Techniques in the News [arXiv]
  • pyBART: Evidence-based Syntactic Transformations for IE [arXiv]
  • Stanza: A Python Natural Language Processing Toolkit for Many Human Languages [arXiv]
  • Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines [arXiv]
  • SUPP.AI: finding evidence for supplement-drug interactions
  • Syntactic Search by Example
  • SyntaxGym: An Online Platform for Targeted Evaluation of Language Models
  • Tabouid: a Wikipedia-based word guessing game
  • Talk to Papers: Bringing Neural Question Answering to Academic Search [arXiv]
  • TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing [arXiv]
  • The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding [arXiv]
  • Torch-Struct: Deep Structured Prediction Library [arXiv]
  • Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time [arXiv]
  • Usnea: An Authorship Tool for Interactive Fiction using Retrieval Based Semantic Parsing
  • What’s The Latest? A Question-driven News Chatbot
  • Xiaomingbot: A Multilingual Robot News Reporter

Student Research Workshop

  • #NotAWhore! A Computational Linguistic Perspective of Rape Culture and Victimization on Social Media
  • A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples
  • A Simple and Effective Dependency parser for Telugu
  • Adaptive Transformers for Learning Multimodal Representations [arXiv]
  • AraDIC: Arabic Document Classification Using Image-Based Character Embeddings and Class-Balanced Loss
  • Building a Japanese Typo Dataset from Wikipedia’s Revision History
  • Checkpoint Reranking: An Approach To Select Better Hypothesis For Neural Machine Translation Systems
  • Combining Subword Representations into Word-level Representations in the Transformer Architecture
  • Compositional generalization by factorizing alignment and translation
  • Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling [arXiv]
  • Crossing the Line: Where do Demographic Variables Fit into Humor Detection?
  • Cross-Lingual Disaster-related Multi-label Tweet Classification with Manifold Mixup
  • Dominance as an Indicator of Rapport and Learning in Human-Agent Communication
  • Effectively Aligning and Filtering Parallel Corpora under Sparse Data Conditions
  • Efficient Neural Machine Translation for Low-Resource Languages via Exploiting Related Languages
  • Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition
  • Enhancing Word Embeddings with Knowledge Extracted from Lexical Resources [arXiv]
  • Exploring Interpretability in Event Extraction: Multitask Learning of a Neural Event Classifier and an Explanation Decoder
  • Exploring the Role of Context to Distinguish Rhetorical and Information-Seeking Questions
  • Feature Difference Makes Sense: A medical image captioning model exploiting feature difference and tag information
  • Grammatical Error Correction Using Pseudo Learner Corpus Considering Learner’s Error Tendency
  • HGCN4MeSH: Hybrid Graph Convolution Network for MeSH Indexing
  • How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
  • υBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems
  • Inducing Grammar from Long Short-Term Memory Networks by Shapley Decomposition
  • Let’s be Humorous: Knowledge Enhanced Humor Generation [arXiv]
  • Logical Inferences with Comparatives and Generalized Quantifiers [arXiv]
  • Media Bias, the Social Sciences, and NLP: Automating Frame Analyses to Identify Bias by Word Choice and Labeling
  • Multi-Task Neural Model for Agglutinative Language Translation
  • Noise-Based Augmentation Techniques for Emotion Datasets: What do we Recommend?
  • Non-Topical Coherence in Social Talk: A Call for Dialogue Model Enrichment
  • Pointwise Paraphrase Appraisal is Potentially Problematic [arXiv]
  • Pre-training via Leveraging Assisting Languages for Neural Machine Translation [arXiv]
  • Preventing Critical Scoring Errors in Short Answer Scoring with Confidence Estimation
  • Reflection-based Word Attribute Transfer
  • Research on Task Discovery for Transfer Learning in Deep Neural Networks
  • Research Replication Prediction Using Weakly Supervised Learning
  • RPD: A Distance Function Between Word Embeddings [arXiv]
  • SCAR: Sentence Compression using Autoencoders for Reconstruction
  • Self-Attention is Not Only a Weight: Analyzing BERT with Vector Norms [arXiv]
  • Story-level Text Style Transfer: A Proposal
  • To compress or not to compress? A Finite-State approach to Nen verbal morphology
  • Topic balancing with additive regularization of topic models
  • Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya
  • Understanding Points of Correspondence between Sentences for Abstractive Summarization
  • Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining
  • Unsupervised Paraphasia Classification in Aphasic Speech
  • Why is penguin more similar to polar bear than to sea gull? Analyzing conceptual knowledge in distributional models
  • Zero-shot North Korean to English Neural Machine Translation by Character Tokenization and Phoneme Decomposition