# Project Proposal: Machine Learning System Design and Implementation for Question Answering

**Author**: Nam Phung \
**Advisor**: Professor Susan Fox \
*Macalester College, Department of Mathematics, Statistics, and Computer Science* \
Fall 2019

**Keywords**: Natural Language Processing, Question Answering, Deep Learning, Software Engineering.
<!--TOC-->

## 1. Introduction

For this project, we will explore the task of **Question Answering** (QA), one of the primary tasks in Natural Language Processing. The goal of a QA system is to answer a specific query given a context (a paragraph, document, web page, etc.). In other words, the system should be able to extract relevant information from the context conditioned on a query issued by the user. QA systems have been widely applied across many domains, most notably in chatbots designed to streamline information gathering and to provide support and recommendations. In particular, we will focus on **Deep Learning** techniques for this problem, which have become extremely popular thanks to the large amount of available text data and the steady growth in computational power. The goal of the project is to build a complete pipeline for a QA system that can serve as the *minimum viable product* (MVP) for a client-facing service.

## 2. Related Works
Almost all proposed approaches to Question Answering (and NLP in general) use some variation of **Recurrent Neural Networks** (RNNs). RNNs generalize the feed-forward neural network architecture to sequence data by introducing feedback connections. In particular, the hidden state $h_t$ that an RNN computes at timestep $t$ of a sequence depends on both the input $x_t$ at that timestep and the hidden state $h_{t-1}$ computed at the previous timestep. The task of Question Answering can then be modeled using a sequence-to-sequence (seq2seq) architecture, where an Encoder RNN encodes the context and the query into a fixed-length vector, and a Decoder RNN then generates a response to the question [seq2seq paper]. Though this architecture showed some initial success in tasks such as translation, it must compress all the information from the input sequence into a single vector. This bottleneck often makes it difficult to model very long input sequences.

Bahdanau et al. introduced the concept of *attention*, which allows the model to choose which parts of the input sequence to focus on at each step of the decoding process [Bahdanau attention paper]. This attention mechanism has been a key factor in recent developments in QA. Chen et al. (2016) proposed the Stanford Attentive Reader, which uses a deep bidirectional LSTM along with an attention mechanism to predict the *start* and *end* indices of the span of the context that answers the question [Stanford attentive reader]. Seo et al. (2017) proposed a more complex architecture for QA called Bidirectional Attention Flow (BiDAF), where attention is applied to both the context and the question to produce new representations, which are then passed through a fully-connected layer to produce the start and end indices for the answer.

Many more works from recent years are mostly small refinements of these models, augmented with new variants of attention. However, RNN-based architectures still suffer from a lack of contextual modeling. One of the most prominent recent research directions in NLP is thus concerned with contextual representations, with some promising results making use of *convolutional networks* [cites]. In 2017, Vaswani et al. proposed the Transformer, a new encoder-decoder architecture based solely on attention that has since eclipsed RNN-based models in NLP: it was shown to be superior on various tasks while also speeding up training by almost an order of magnitude [attention is all you need paper]. We will focus on the Transformer architecture and the related BERT model [bert paper] for this project.
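The core operation of the Transformer is scaled dot-product attention, $\mathrm{softmax}(QK^\top / \sqrt{d_k})\,V$. The sketch below illustrates it in plain Python for a single attention head, without the learned projection matrices, multiple heads, masking, or positional encodings of the full architecture. The function names and the toy token vectors are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of vectors.

    Returns (outputs, weights): one output vector and one attention
    distribution over the keys for each query.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy self-attention: queries, keys, and values are all the same token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens)

for row in weights:
    print([round(w, 3) for w in row])
```

Each output vector mixes information from every position in the sequence in a single step, which is why this operation needs no recurrence and can be computed for all positions in parallel.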