Introduction

Human reading comprehension is studied in cognitive psychology. Roughly, there are three types of comprehension: literal comprehension, inferential comprehension, and critical comprehension.

For machine reading comprehension (MRC), Deep Read: A Reading Comprehension System (ACL 1999) is the first study. Towards the Machine Comprehension of Text: An Essay from Microsoft provides a review. The EMNLP 2014 best paper, Modeling Biological Processes for Reading Comprehension, proposes symbolic models based on feature engineering. After that, many deep learning models appeared. Tencent AI part 1 illustrates the building blocks of deep learning MRC models; Tencent AI part 2 proposes their new Dual Ask-Answer Network. The bAbI datasets from Facebook introduce the AI-complete concept. Neural Machine Reading Comprehension: Methods and Trends presents a newer review of MRC.

MRC components:

  • Passage
    • Single or multiple
  • Question
    • Cloze or query
  • Candidate
  • Answer
    • Extraction or generation
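
The components above can be made concrete with a toy example. The field names here are illustrative (loosely modeled on the SQuAD JSON layout), not a standard schema:

```python
# A toy MRC example covering the components above; field names are
# illustrative, loosely modeled on the SQuAD JSON layout.
example = {
    "passages": [
        "BiDAF was proposed by the Allen Institute for AI.",
    ],                                   # single or multiple passages
    "question": "Who proposed BiDAF?",   # query-style (vs. cloze)
    "candidates": None,                  # only for multiple-choice datasets
    "answer": {
        "text": "the Allen Institute for AI",  # extractive answer
        "start": 22,                           # character offset in the passage
    },
}

# For an extractive answer, the text must be recoverable from the offset:
p = example["passages"][0]
a = example["answer"]
assert p[a["start"]:a["start"] + len(a["text"])] == a["text"]
```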

Deep learning Models

Model list

  • BiDAF from AllenNLP, baseline for MS-MARCO

    • Attention flow layer, built on a similarity matrix between context and query: context-to-query attention (i.e., which query words are most relevant to each context word; row-wise softmax) and query-to-context attention (i.e., which context words are closest to some query word; softmax over each row's maximum)
    • Similarity function
    • Model structure
    • Official implementation
    • Model illustration
    • BiDAF + Self attention + ELMo
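
The two attention directions above can be sketched in a few lines of numpy. The similarity function is simplified to a dot product here; BiDAF's actual similarity is a trainable w^T[h; u; h∘u]:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_flow(H, U):
    """BiDAF attention flow (similarity simplified to a dot product).

    H: context encodings, shape (T, d)
    U: query encodings,   shape (J, d)
    """
    S = H @ U.T                           # similarity matrix, (T, J)
    # Context-to-query: which query words matter for each context word.
    A = softmax(S, axis=1)                # row-wise softmax, (T, J)
    U_tilde = A @ U                       # attended query, (T, d)
    # Query-to-context: which context words are closest to some query word.
    b = softmax(S.max(axis=1), axis=0)    # softmax over each row's max, (T,)
    h_tilde = b @ H                       # attended context, (d,)
    H_tilde = np.tile(h_tilde, (H.shape[0], 1))        # tiled to (T, d)
    # G = [H; U~; H∘U~; H∘H~]: the query-aware context representation.
    return np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=1)

T, J, d = 5, 3, 4
G = attention_flow(np.random.randn(T, d), np.random.randn(J, d))
assert G.shape == (T, 4 * d)
```
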
  • R-Net for MS-MARCO

    • Core layer 1: a gated attention-based recurrent network (the gate is applied to the concatenation of the passage word and the attention-pooling of the question) matches passage and question to obtain a question-aware passage representation
    • Core layer 2: a self-matching layer aggregates information from the whole passage
    • Model structure
    • Implementation: pytorch, tensorflow
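
The gate in core layer 1 can be sketched as follows; this is a minimal numpy illustration (random weights, no RNN), not R-Net's full implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention_input(u_t, c_t, Wg):
    """R-Net gate: the RNN input [u_t; c_t] -- the passage word encoding
    concatenated with the attention-pooled question vector -- is scaled
    elementwise by a sigmoid gate before entering the recurrent cell.
    Wg is a learned (2d, 2d) weight matrix (random in this sketch)."""
    x = np.concatenate([u_t, c_t])  # (2d,)
    g = sigmoid(Wg @ x)             # gate values in (0, 1)
    return g * x                    # gated RNN input, (2d,)

d = 4
v = gated_attention_input(np.random.randn(d), np.random.randn(d),
                          np.random.randn(2 * d, 2 * d))
assert v.shape == (2 * d,)
```
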
  • S-Net for MS-MARCO

    • Step 1: extract evidence snippets by matching question and passage via a pointer network; passage ranking is added as an additional task for multi-task learning
    • Step 2: generate the answer by synthesizing the passage, question and evidence snippets via seq2seq; the evidence snippets are marked as features
  • QANet

    • Separable convolution + self-attention (each position acts as a query and is matched against all positions as keys)
    • Data augmentation via backtranslation
    • Model structure
    • Implementation
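
A minimal numpy sketch of the depthwise-separable 1-D convolution used in QANet's encoder blocks; the filter shapes are illustrative:

```python
import numpy as np

def separable_conv1d(x, depthwise, pointwise):
    """Depthwise-separable 1-D convolution.

    x:         (T, C_in) input sequence
    depthwise: (k, C_in) one length-k filter per input channel
    pointwise: (C_in, C_out) 1x1 convolution that mixes channels
    'same' padding preserves the sequence length (k assumed odd).
    """
    k, C_in = depthwise.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    T = x.shape[0]
    # Depthwise step: each channel is convolved with its own filter.
    dw = np.stack([(xp[t:t + k] * depthwise).sum(axis=0) for t in range(T)])
    # Pointwise step: a 1x1 convolution mixes the channels.
    return dw @ pointwise

T, C_in, C_out, k = 6, 4, 8, 3
y = separable_conv1d(np.random.randn(T, C_in),
                     np.random.randn(k, C_in),
                     np.random.randn(C_in, C_out))
assert y.shape == (T, C_out)
```

The depthwise + pointwise factorization is what makes the convolution cheap: k·C_in + C_in·C_out parameters instead of k·C_in·C_out for a full convolution.
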
  • Multi-answer Multi-task

    • Three losses for multiple answer spans
      • Average of the losses
      • Weighted average of the losses
      • Minimum of the losses
    • Passage ranking combined as a multi-task objective
      • Since an answer span can occur in multiple passages, a pointwise sigmoid is used instead of a softmax
    • Minimum risk training
      • Directly optimizes the evaluation metric instead of maximizing the likelihood (MLE)
    • Prediction is still a single answer span
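
The three span-loss combinations listed above can be sketched as follows; the per-span negative log-likelihoods are assumed to be given:

```python
import numpy as np

def multi_span_losses(span_nlls, weights=None):
    """Combine the loss over multiple gold answer spans in the three
    ways listed above. span_nlls: negative log-likelihood per span.
    Returns (average, weighted average, minimum)."""
    span_nlls = np.asarray(span_nlls, dtype=float)
    avg = span_nlls.mean()
    if weights is None:
        weights = np.ones_like(span_nlls)
    weights = np.asarray(weights, dtype=float)
    weighted = (span_nlls * weights).sum() / weights.sum()
    minimum = span_nlls.min()  # train only on the easiest matching span
    return avg, weighted, minimum

avg, weighted, minimum = multi_span_losses([2.0, 1.0, 4.0])
assert minimum == 1.0
```
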
  • Match-LSTM

  • U-Net

  • Dual Ask-Answer Network

  • Gated Self-Matching Networks

  • V-Net from Baidu NLP for MS-MARCO

  • FastQA, comment

  • Documentqa

  • Model reviews part 1 and part 2

Model structure

  • Embedding layer
  • Encoding layer
    • Concatenation of forward and backward hidden states of a BiRNN (BiDAF)
    • [convolution-layer * # + self-attention layer + feed-forward layer] (QANet)
  • Context-query attention layer
    • Context and query similarity matrix (BiDAF, QANet)
  • Model layer
    • BiRNN (BiDAF)
    • Gated attention-based recurrent network (R-Net)
    • Passage self-matching
    • [convolution-layer * # + self-attention layer + feed-forward layer] (QANet)
  • Output layer
    • Direct output (BiDAF, QANet)
    • Pointer network (R-Net)
      • Simplifies the seq2seq mechanism
      • It only assigns probabilities to the input elements, yielding positions (a permutation) of the inputs
      • Not all pointers are necessary; for MRC and summarization, for example, only two pointers (start and end) are needed
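
The two-pointer output can be sketched as follows: the model emits start and end distributions over passage positions, and decoding picks the span maximising their product under a length limit (`max_len` is an illustrative constraint):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # numerical stability
    return e / e.sum()

def decode_span(start_logits, end_logits, max_len=30):
    """Two-pointer decoding for extractive MRC: choose (i, j) with
    i <= j < i + max_len maximising P(start=i) * P(end=j)."""
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    best, best_p = (0, 0), -1.0
    for i in range(len(p_start)):
        for j in range(i, min(i + max_len, len(p_end))):
            p = p_start[i] * p_end[j]
            if p > best_p:
                best, best_p = (i, j), p
    return best, best_p

span, prob = decode_span(np.array([0.1, 5.0, 0.2, 0.1]),
                         np.array([0.1, 0.2, 6.0, 0.1]))
assert span == (1, 2)
```
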

Dataset

Evaluation metrics

  • Exact Match
    • Clean the text: lowercase it, then remove articles (a, an, the), punctuation and redundant whitespace
    • Implementation
  • F1
    • It measures the token overlap between the predicted answer and the ground truth
    • Implementation
  • BLEU
  • ROUGE-L
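
The two extractive metrics above can be sketched in the style of the official SQuAD evaluation script (normalization: lowercase, strip punctuation, drop articles, collapse whitespace):

```python
import re
import string
from collections import Counter

def normalize(text):
    """SQuAD-style answer normalization."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)  # drop articles
    return " ".join(text.split())                # collapse whitespace

def exact_match(pred, gold):
    """1 iff the normalized strings are identical."""
    return normalize(pred) == normalize(gold)

def f1(pred, gold):
    """Harmonic mean of token-level precision and recall."""
    p_toks, g_toks = normalize(pred).split(), normalize(gold).split()
    common = Counter(p_toks) & Counter(g_toks)   # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p_toks)
    recall = overlap / len(g_toks)
    return 2 * precision * recall / (precision + recall)

assert exact_match("The cat!", "cat")
assert abs(f1("the black cat", "black cat") - 1.0) < 1e-9
```
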

In action

  • Sogou MRC Toolkit, paper

  • 2019 Dureader competition

  • 2018 DuReader competition summary

  • Naturali video version, text version

  • Paperweekly seminar

  • Zhuiyi video, text 1 and text 2

    • Data preprocess
      • Filter out examples whose query or answer is None
      • Context normalization, i.e., lowercasing and punctuation normalization
      • Limit answer length and context length (thresholds determined from data statistics)
      • Data augmentation, i.e., back-translation or similar QA data
      • Check training data quality; e.g., when the same query type has inconsistent answer formats ("1963 year" vs. "1990"), stop using such data
    • Feature engineering
      • Query type
        • Who, when, where, how, number, why, how long
      • ELMo
        • Word level
    • Model (based on R-Net)
      • Embedding
        • ELMo only (without word2vec)
        • POS embedding
        • Query type embedding
        • Binary word-in-question feature
      • Encoding
        • Multi-layer BiGRU
      • Context-query attention
        • Gated dropout for the query (filters for the useful information)
      • Prediction
        • Pointer network
        • Span probability = P(start) × P(stop)
    • Training
      • Born-Again Neural Network (the student has the same architecture as its teacher)
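
The Born-Again training step can be sketched as a distillation loss; the `alpha` mixing weight and the exact loss form are assumptions of this sketch, not details from the talk:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def born_again_loss(student_logits, teacher_logits, labels, alpha=0.5):
    """Born-Again step: a student with the same architecture as the
    teacher is trained on a mix of hard-label cross-entropy and the
    KL divergence to the teacher's soft predictions.
    alpha is an assumed mixing weight, not from the source."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    ce = -np.log(p_s[np.arange(len(labels)), labels]).mean()
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean()
    return alpha * ce + (1.0 - alpha) * kl

logits = np.array([[2.0, 0.0], [0.0, 1.0]])
loss = born_again_loss(logits, logits, labels=np.array([0, 1]))
assert loss > 0.0  # KL term vanishes when teacher == student
```
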

Applications

Take-home messages