# CORD-19 Q&A with BioBERT and domain expertise

## Overview
We present an extractive question-answering approach to the COVID-19 Open Research Dataset Challenge, using a pretrained BioBERT model fine-tuned on SQuAD 2.0 to extract information relevant to the challenge tasks from the available documents. Our goal is to produce a smart literature review in which question answering surfaces both the relevant answers in the document set and the evidence for each answer. We believe the ideal literature review should combine automated machine learning with domain expertise. Our final approach therefore pairs expert domain knowledge, used to expand the given tasks into precise questions, with systems for information retrieval and question answering.

## Methodology
The work splits into two parts, carried out in parallel where possible: one requiring machine learning expertise and one drawing on domain expertise.

*ML system building*

1. Filter the set of CORD-19 documents to a list relevant to COVID-19 (keyword-based).
2. Build an information retrieval system using TF-IDF to quickly retrieve the relevant set of documents for a particular part of the task.
3. Build and use a question answering system to produce answers to task-relevant questions.
4. Evaluate the answers using ROUGE and document recall against a set of clinician-written answers for a subset of questions, and use the results to set the thresholds for displaying answers.

*Clinical expertise*

1. With the aid of clinicians, convert the topics in the task to a set of precise medical questions (included in the data for this notebook, Cov19-questions).
2. For some of those questions, produce a set of model answers.
3. Sanity-check the output of the model for a subset of the questions.


## Q&A Model Background
BERT ([Devlin et al. 2018](https://arxiv.org/abs/1810.04805)) is a contextual word representation model learned from large-scale language model pretraining of a bidirectional Transformer ([Vaswani et al. 2017](https://arxiv.org/abs/1706.03762)). Following pretraining on general natural language corpora, BERT can readily be fine-tuned for many downstream NLP tasks, achieving high performance using much smaller datasets. Recent work has shown major improvements in a wide variety of tasks using BERT or similar Transformer models, and variants of BERT have been adapted to specific domains by pretraining on specialized corpora – e.g. BioBERT ([Lee et al. 2019](https://arxiv.org/abs/1901.08746)) augments the BERT pretraining corpus with scholarly biomedical literature from PubMed.

A common downstream NLP task is extractive question-answering: given a question and a corresponding passage, a model learns to extract the excerpt from the passage that best answers the question. The standard benchmark dataset for this task is [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/). Notably, in the latest version (SQuAD 2.0, [Rajpurkar et al. 2018](https://arxiv.org/abs/1806.03822)), approximately ⅓ of the questions in the dataset are intentionally unanswerable, so that high-quality models must learn to abstain from answering when provided with insufficient evidence. (We found that fine-tuning on this dataset improved the quality of our answers compared to SQuAD 1.1.)
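
To make the abstention behaviour concrete, the sketch below picks an answer span from per-token start and end scores and returns no answer when the null span at the `[CLS]` position scores higher. It is a simplified illustration rather than the decoding used in this notebook: `best_span`, `max_answer_len`, and the toy scores are hypothetical, and full SQuAD 2.0 decoders typically also consider top-k candidates and a tuned null threshold.

```python
from typing import List, Optional, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_answer_len: int = 30) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the best span, or None to abstain.

    Index 0 is assumed to be the [CLS] token, whose span score stands in for
    "no answer" in SQuAD 2.0-style models.
    """
    null_score = start_scores[0] + end_scores[0]
    best, best_score = None, float('-inf')
    for s in range(1, len(start_scores)):
        # Only consider spans that end at or after their start.
        for e in range(s, min(s + max_answer_len, len(end_scores))):
            score = start_scores[s] + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    # Abstain when the null span beats every real span.
    return best if best_score > null_score else None


# Toy scores for a 6-token input (hypothetical values).
answerable = best_span([0.1, 0.2, 4.0, 0.3, 0.1, 0.0],
                       [0.1, 0.0, 0.5, 3.5, 0.2, 0.1])
unanswerable = best_span([5.0, 0.2, 1.0, 0.3, 0.1, 0.0],
                         [5.0, 0.0, 0.5, 1.5, 0.2, 0.1])
print(answerable, unanswerable)
```

In the second call the `[CLS]` scores dominate, so the function abstains instead of returning a weak span.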

Given the biomedical content of the CORD-19 corpus and the sparse, uncertain nature of question answering with this dataset (i.e. most documents do not contain good answers to most questions), our Q&A system is powered by a pretrained BioBERT model fine-tuned for extractive question answering on SQuAD 2.0.

## Implementation
We use the excellent [Huggingface Transformers](https://huggingface.co/transformers/) library for running inference with our Q&A model. The model checkpoint is included with the submission and can be loaded below. To reproduce this checkpoint, use the run_squad.py script included in the [Huggingface Transformers examples](https://github.com/huggingface/transformers/tree/master/examples#squad) with the following command (takes ~8 hours on a GTX 1080):

```
python run_squad.py \
  --model_type bert \
  --model_name_or_path monologg/biobert_v1.1_pubmed \
  --do_train \
  --do_eval \
  --train_file $SQUAD_DIR/train-v2.0.json \
  --predict_file $SQUAD_DIR/dev-v2.0.json \
  --per_gpu_train_batch_size 8 \
  --learning_rate 3e-5 \
  --num_train_epochs 4 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /tmp/biobert_squad2/ \
  --version_2_with_negative
```

To improve accuracy and relevance of the results, we pre-filter the corpus using a list of keywords related to COVID-19, as shown in [this kernel](https://www.kaggle.com/mimisun/covid-19-articles). At query time, the top candidate documents are automatically retrieved based on TF-IDF cosine similarity with the query, with the option to manually filter using only keywords if desired. By default, we extract answers from the abstract, discussion, and conclusion sections. Answers are ranked based on the model's start score of the answer span. We also extract cohort size, where available, by simply asking the model “How many patients?” over the study abstracts.
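
As a rough illustration of the two retrieval steps described above, the sketch below applies a keyword pre-filter and then ranks the remaining texts by TF-IDF cosine similarity with the query. It uses only the standard library as a stand-in for scikit-learn's `TfidfVectorizer`, so the weights will not match scikit-learn's exactly. The keyword list, the toy abstracts, and the helper names (`tokenize`, `tfidf_vectors`, `cosine`, `retrieve`) are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def tfidf_vectors(docs: List[str]) -> Tuple[List[Dict[str, float]], Dict[str, float]]:
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    n_docs = len(docs)
    # Smoothed IDF so that terms present in every document keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
               for toks in tokenized]
    return vectors, idf


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: List[str], top_n: int = 2) -> List[int]:
    """Return indices of the top_n documents most similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the corpus are ignored.
    q_vec = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(q_vec, vectors[i]), reverse=True)
    return ranked[:top_n]


KEYWORDS = ('covid-19', 'sars-cov-2')  # hypothetical keyword list
abstracts = [
    'Early antiviral treatment of COVID-19 patients reduced hospital stay',
    'Influenza vaccination coverage among adults',
    'SARS-CoV-2 transmission in households',
]
# Step 1: keep only texts that mention at least one keyword.
relevant = [a for a in abstracts if any(k in a.lower() for k in KEYWORDS)]
# Step 2: rank the remaining texts against the question.
top = retrieve('early treatment of patients', relevant, top_n=1)
print(relevant[top[0]])
```

Only the retrieved candidates are passed to the Q&A model, which keeps inference time manageable on a large corpus.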


In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
import sys
import time
from transformers import AutoTokenizer, TFAutoModelForQuestionAnswering
import textwrap
import re
import attr
import abc
import string
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from IPython.display import HTML
from os import listdir
from os.path import isfile, join

import warnings  
warnings.filterwarnings('ignore')
MAX_ARTICLES = 1000
base_dir = '/kaggle/input'
data_dir = base_dir + '/covid-19-articles'
data_path = data_dir + '/covid19.csv'
model_path = base_dir + '/biobert-qa/biobert_squad2_cased'

class ResearchQA(object):
    def __init__(self, data_path, model_path):
        print('Loading data from', data_path)
        self.df = pd.read_csv(data_path)
        print('Initializing model from', model_path)
        self.model = TFAutoModelForQuestionAnswering.from_pretrained(model_path, from_pt=True)
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.retrievers = {}
        self.build_retrievers()
        self.main_question_dict = dict()
        
    
    def build_retrievers(self):
        df = self.df
        abstracts = df[df.abstract.notna()].abstract
        self.retrievers['abstract'] = TFIDFRetrieval(abstracts)
        body_text = df[df.body_text.notna()].body_text
        self.retrievers['body_text'] = TFIDFRetrieval(body_text)

    def retrieve_candidates(self, section_path, question, top_n):
        candidates = self.retrievers[section_path[0]].retrieve(question, top_n)
        return self.df.loc[candidates.index]

    def get_questions(self, question_path):
        print('Loading questions from', question_path)
        expert_question_answer = pd.read_csv(question_path, sep='\t')
        self.main_question_dict = dict()
        
        question_list_str = ''
        for _, row in expert_question_answer.iterrows():
          task = row['Task #']
          main_question = row['Main Question']
          questions = row['Question']
          answer = row['Answer']
          doi = row['DOI']
          cohort_size = row['Cohort Size']
          study_type = row['Study type']
          if pd.notna(main_question):
            # Get first 5 words in the main question
            main_question_abbr = '_'.join(main_question.replace('(', ' ').replace(')', ' ').split(' ')[:5])
            self.main_question_dict[main_question_abbr] = dict()
          if pd.notna(questions):
            answer_list = []
            question_list_str = questions.replace('(', '_').replace(')', '_')
          if pd.notna(answer):      
            answer_list.append((answer, doi, cohort_size, study_type))
          self.main_question_dict[main_question_abbr][question_list_str] = answer_list            
        
    def get_answers(self, question, section='abstract', keyword=None, max_articles=1000, batch_size=12):
        df = self.df
        answers = []
        section_path = section.split('/')

        if keyword:
            candidates = df[df[section_path[0]].str.contains(keyword, na=False, case=False)]
        else:
            candidates = self.retrieve_candidates(section_path, question, top_n=max_articles)
        if max_articles:
            candidates = candidates.head(max_articles)

        text_list = []
        indices = []
        for idx, row in candidates.iterrows():
            if section_path[0] == 'body_text':
                text = self.get_body_section(row.body_text, section_path[1])
            else:
                text = row[section]
            if text and isinstance(text, str):
                text_list.append(text)
                indices.append(idx)

        # Run the model over the texts in batches of `batch_size`.
        all_answers = []
        for i in range(0, len(text_list), batch_size):
            batch = text_list[i:i + batch_size]
            all_answers.extend(self.get_answers_from_text_list(question, batch))

        columns = ['doi', 'authors', 'journal', 'publish_time', 'title', 'cohort_size']
        processed_answers = []
        for i, a in enumerate(all_answers):
            if a:
                row = candidates.loc[indices[i]]
                new_row = [a.text, a.start_score, a.end_score, a.input_text]
                new_row.extend(row[columns].values)
                processed_answers.append(new_row)
        answer_df = pd.DataFrame(processed_answers, columns=(['answer', 'start_score',
                                                 'end_score', 'context'] + columns))
        return answer_df.sort_values(['start_score', 'end_score'], ascending=False)

    def get_body_section(self, body_text, section_name):
      sections = body_text.split('<SECTION>\n')
      for section in sections:
        lines = section.split('\n')
        if len(lines) > 1:
          if section_name.lower() in lines[0].lower():
            return section

    def get_answers_from_text_list(self, question, text_list, max_tokens=512):
      tokenizer = self.tokenizer
      model = self.model
      inputs = tokenizer.batch_encode_plus(
          [(question, text) for text in text_list], add_special_tokens=True, return_tensors='tf',
          max_length=max_tokens, truncation_strategy='only_second', pad_to_max_length=True)
      input_ids = inputs['input_ids'].numpy()
      answer_start_scores, answer_end_scores = model(inputs)
      answer_start = tf.argmax(
          answer_start_scores, axis=1
      ).numpy()  # Get the most likely beginning of each answer with the argmax of the score
      answer_end = (
          tf.argmax(answer_end_scores, axis=1) + 1
      ).numpy()  # Get the most likely end of each answer with the argmax of the score

      answers = []
      for i, text in enumerate(text_list):
        input_text = tokenizer.decode(input_ids[i, :], clean_up_tokenization_spaces=True)
        input_text = input_text.split('[SEP] ', 2)[1]
        answer = tokenizer.decode(
            input_ids[i, answer_start[i]:answer_end[i]], clean_up_tokenization_spaces=True)
        score_start = answer_start_scores.numpy()[i][answer_start[i]]
        score_end = answer_end_scores.numpy()[i][answer_end[i]-1]
        # Drop empty spans and spans that include [CLS] (the model's "no answer" position).
        if answer and '[CLS]' not in answer:
          answers.append(Answer(answer, score_start, score_end, input_text))
        else:
          answers.append(None)
      return answers

    def output_answers_for_task(self, task):
      question_path = base_dir + '/cov19questions/CORD-19-research-challenge-tasks - Question_{}.tsv'.format(task)
      # Get questions related to this task from the expert generated question file.
      self.get_questions(question_path)        
        
      for main_question, value in self.main_question_dict.items():
        print(f"Output for main Question: {main_question}") 
        output_csvfile = main_question + '.csv'
        new_main_question = True

        for questions, answers in value.items():
          question_list = (questions.split(','))
          for question in question_list:
            print(f"Writing answer for question: {question}")
            for sec in ['abstract', 'body_text/discussion', 'body_text/conclusion']:
              model_prediction = self.get_answers(question, section=sec, max_articles=MAX_ARTICLES, batch_size=12)
              # if this is a new main question, create a new answer file. 
              if new_main_question:
                model_prediction.to_csv(output_csvfile, header=model_prediction.columns.to_list(), index=False)
                new_main_question = False
              else:
                model_prediction.to_csv(output_csvfile, mode='a', header=False, index=False)
    

class Retrieval(abc.ABC):
  """Base class for retrieval methods."""

  def __init__(self, docs, keys=None):
    """
    Args:
      docs: a pd.Series of strings. The text to retrieve.
      keys: a pd.Series. Keys (e.g. ID, title) associated with each document.
    """
    self._docs = docs.copy()
    if keys is not None:
      self._docs.index = keys
    self._model = None
    self._doc_vecs = None

  def _top_documents(self, q_vec, top_n=10):
    similarity = cosine_similarity(self._doc_vecs, q_vec)
    rankings = np.argsort(np.squeeze(similarity))[::-1]
    ranked_indices = self._docs.index[rankings]
    return self._docs[ranked_indices][:top_n]

  @abc.abstractmethod
  def retrieve(self, query, top_n=10):
    pass

class TFIDFRetrieval(Retrieval):
  """Retrieve documents based on cosine similarity of TF-IDF vectors with query."""

  def __init__(self, docs, keys=None):
    """
    Args:
      docs: a list or pd.Series of strings. The text to retrieve.
      keys: a list or pd.Series. Keys (e.g. ID, title) associated with each document.
    """
    super(TFIDFRetrieval, self).__init__(docs, keys)
    self._model = TfidfVectorizer()
    self._doc_vecs = self._model.fit_transform(docs)

  def retrieve(self, query, top_n=10):
    q_vec = self._model.transform([query])
    return self._top_documents(q_vec, top_n)

@attr.s
class Answer(object):
    text = attr.ib()
    start_score = attr.ib()
    end_score = attr.ib()
    input_text = attr.ib()
    
def answer_questions(questions, qa, max_articles, section='abstract'):
  for question_group in questions:
    main_question = question_group[0]
    answers = {}
    for q in question_group[1:]:
      answers[q] = qa.get_answers(q, section=section, max_articles=max_articles)
    render_results(main_question, answers)
  # Return the last set for debugging.
  return main_question, answers


style = '''
<style>
.hilight {
  background-color:#cceeff;
}
a {
  color: #000 !important;
  text-decoration: underline;
}
.question {
  font-size: 20px;
  font-style: italic;
  margin: 10px 0;
}
.info {
  padding: 10px 0;
}
table.dataframe {
  max-height: 450px;
  text-align: left;
}
.meta {
  margin-top: 10px;
}
.journal {
  color: green;
}
.footer {
  position: absolute;
  bottom: 20px;
  left: 20px;
}
</style>
'''

def format_context(row):
  text = row.context
  answer = row.answer
  highlight_start = text.find(answer)

  def find_context_start(text):
    idx = len(text) - 1
    while idx >= 2:
      if text[idx].isupper() and re.match(r'\W ', text[idx - 2:idx]):
        return idx
      idx -= 1
    return 0 
  context_start = find_context_start(text[:highlight_start])
  highlight_end = highlight_start + len(answer)
  context_html = (text[context_start:highlight_start] + '<span class=hilight>' + 
                  text[highlight_start:highlight_end] + '</span>' + 
                  text[highlight_end:highlight_end + 1 + text[highlight_end:].find('. ')])
  context_html += f'<br><br>score: {row.start_score:.2f}'
  return context_html


def format_author(authors):
  if not authors or not isinstance(authors, str):
    return 'Unknown Authors'
  name = authors.split(';')[0]
  name = name.split(',')[0]
  return name + ' et al'

def format_info(row):
  meta = []
  authors = format_author(row.authors) 
  if authors:
    meta.append(authors)
  meta.append(row.publish_time)
  meta = ', '.join(meta)
 
  html = f'''\
  <a class="title" target=_blank href="http://doi.org/{row.doi}">{row.title}</a>\
  <div class="meta">{meta}</div>\
  '''

  journal = row.journal
  if journal and isinstance(journal, str):
    html += f'<div class="journal">{journal}</div>'

  return html

def render_results(main_question, answers):
  anchor_id = main_question[:20].replace(' ', '_')  # avoid shadowing the built-in id()
  html = f'<h1 id="{anchor_id}" style="font-size:20px;">{main_question}</h1>'
  has_answer = False
  for q, a in answers.items():
    # TODO: skipping empty answers. Maybe we should show
    # top retrieved docs.
    if a.empty:
      continue
    has_answer = True
    # clean up question
    if '?' in q:
        q = q.split('?')[0] + '?'
    html += f'<div class=question>{q}</div>' + format_answers(a)
  if has_answer:
      display(HTML(style + html))

def format_answers(a):
    a = a.sort_values('start_score', ascending=False)
    a.drop_duplicates('doi', inplace=True)
    out = []
    for i, row in a.iterrows():
      if row.start_score < 0:
        continue
      info = format_info(row)
      context = format_context(row)
      cohort = ''
      if pd.notna(row.cohort_size):
        cohort = int(row.cohort_size)
      out.append([context, info, cohort])
    out = pd.DataFrame(out, columns=['answer', 'article', 'cohort size'])
    return out.to_html(escape=False, index=False)

def render_answers(a):
    display(HTML(style + format_answers(a)))

In [None]:
qa = ResearchQA(data_path, model_path)

In [None]:
# Get cohort size for all articles.
get_cohort_size = False
if get_cohort_size:
    qa.df['cohort_size'] = ''
    cohort = qa.get_answers('How many patients?', section='abstract', max_articles=10000)
    cohort['cohort_size'] = pd.to_numeric(cohort.answer, errors='coerce')
    del qa.df['cohort_size']
    qa.df = qa.df.merge(cohort[['doi', 'cohort_size']], on='doi', how='left')
    qa.df.cohort_size.describe()

## Task 5: What has been published about medical care?

### Question Generation
Our Q&A model requires a natural language question as an input. Using the medical expertise of two clinician-scientists (C. Chen and S. Kang), we generated 24 questions based on the high-level overview for this challenge task. This strategy models that of researchers doing a literature review – starting with a broad task and then breaking it down to manageable pieces to further explore. We recruited the help of clinical researchers, our potential end users, to ensure that the questions were relevant and accurate.

### Evaluation
For a subset of the questions, our clinical experts manually identified relevant articles, keywords, and DOIs. We evaluated the quality of the model answers against the expert-selected answers using document recall and ROUGE (Recall-Oriented Understudy for Gisting Evaluation). We bucketed each model-produced answer by its score from the Q&A model, then computed the document recall and ROUGE scores for each bucket. We set the Q&A threshold by finding an appropriate recall/precision tradeoff based on these score buckets.
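
The bucketing idea can be sketched as follows. This is a minimal standard-library illustration, not the evaluation code behind our thresholds: `rouge1_f` is a simplified unigram-overlap F-measure without stemming, `doc_match_rate_by_bucket` is a hypothetical helper, and the rows are made-up values.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def rouge1_f(candidate: str, reference: str) -> float:
    """Unigram-overlap F-measure (a simplified ROUGE-1, no stemming)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def doc_match_rate_by_bucket(rows: Iterable[Tuple[float, bool]]) -> Dict[int, float]:
    """Map each rounded model score to the fraction of answers in that bucket
    whose DOI matches an expert-selected article."""
    totals, matches = defaultdict(int), defaultdict(int)
    for score, doi_match in rows:
        bucket = int(round(score))
        totals[bucket] += 1
        matches[bucket] += int(bool(doi_match))
    return {b: matches[b] / totals[b] for b in sorted(totals)}


# Hypothetical rows: (model start score, DOI matched an expert-selected article).
rows = [(-1.2, False), (0.4, False), (0.3, True),
        (2.1, True), (1.9, True), (2.2, False)]
rates = doc_match_rate_by_bucket(rows)
overlap_score = rouge1_f('remdesivir shortened recovery time',
                         'remdesivir shortened time to recovery')
print(rates)
print(round(overlap_score, 3))
```

A display threshold can then be chosen as the lowest score bucket whose match rate is still acceptable, trading recall against precision.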

### Model output
Try asking your own question here:

In [None]:
answers = qa.get_answers('How effective are early treatments', max_articles=10)
render_answers(answers)

View the answers for our chosen questions below:

In [None]:
def build_task_main_question_dict():
    task_main_question_dict = dict()
    for task in ['Task1', 'Task2', 'Task4', 'Task5']:
        question_path = base_dir + '/cov19questions/CORD-19-research-challenge-tasks - Question_{}.tsv'.format(task)
        if task not in task_main_question_dict:
            task_main_question_dict[task] = dict()
        expert_question_answer = pd.read_csv(question_path, sep='\t')
        for _, row in expert_question_answer.iterrows():
            main_question = row['Main Question']            
            questions = row['Question']
            if pd.notna(main_question):
                main_question_abbr = '_'.join(main_question.replace('(', ' ').replace(')', ' ').split(' ')[:5])
            if main_question_abbr not in task_main_question_dict[task]:
                questions_list = []
                task_main_question_dict[task][main_question_abbr] = (main_question, questions_list)            
            if pd.notna(questions):
                main_question, questions_list = task_main_question_dict[task][main_question_abbr]
                questions_list.append(questions)
                task_main_question_dict[task][main_question_abbr] = (main_question, questions_list)
    return task_main_question_dict
    
def get_task_and_main_question_question_list_for(model_prediction_file_name, task_main_question_dict):
    model_prediction_file_name_prefix = model_prediction_file_name.split('.')[0]
    for task, main_question_entry in task_main_question_dict.items():
        for main_question_abbr, main_question_question_list in main_question_entry.items():
            if model_prediction_file_name_prefix == main_question_abbr:
                main_question, questions = main_question_question_list
                return task, main_question, questions
    return None, None, None

def calculate_score_buckets():
    # Calculate the eval scores for each bucket of scores from the QA model. This is used to decide a threshold.
    model_result_dir = base_dir + '/model-eval'
    main_question_files = [f for f in listdir(model_result_dir) if isfile(join(model_result_dir, f))]
    score_doi_match = []
    for model_eval_file in main_question_files:
        if not model_eval_file.endswith('.csv'):
            continue
        model_prediction_path = join(model_result_dir, model_eval_file)
        model_prediction = pd.read_csv(model_prediction_path, sep=',')
        for _, row in model_prediction.iterrows():
            score_doi_match.append((row['start_score'], row['end_score'], row['doi_match'], row['best_rouge1_fmeasure'], row['best_rouge2_fmeasure'], 
                      row['best_rougeL_fmeasure'], row['best_dist_euclidean'], 
                      row['best_dist_cosine']))
    score_doi_match = np.asarray(score_doi_match, np.float32)
    sort_indices = np.argsort(score_doi_match[:, 0])
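    # Columns 2-6 of each row: doi_match, ROUGE-1/2/L F-measures, and Euclidean distance
    # (column 7, the cosine distance, is not included in this range).
    for compare_score_index in range(2, 7):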
        print('Comparing score index ' + str(compare_score_index))
        # Calculate % of DOI matches in each score bucket for the start_score.
        bucket_counts = {}
        for i in range(-8, 15):
            bucket_counts[str(i)] = [0, 0]
        print('Start score buckets')
        for row in score_doi_match[sort_indices]:
            score = int(round(row[0]))

            bucket_counts[str(score)][1] += 1
            if compare_score_index == 2:
                if row[2] > 0:
                    bucket_counts[str(score)][0] += 1
            else:
                bucket_counts[str(score)][0] += row[compare_score_index]
        for (score, count) in bucket_counts.items():
            if (count[1] > 0):
                print(score, count[0] / count[1])
        # Calculate % of DOI matches in each score bucket for the end_score.
        bucket_counts = {}
        for i in range(-8, 15):
            bucket_counts[str(i)] = [0, 0]
        print('End score buckets')
        for row in score_doi_match[sort_indices]:
            score = int(round(row[1]))
            bucket_counts[str(score)][1] += 1
            if compare_score_index == 2:
                if row[2] > 0:
                    bucket_counts[str(score)][0] += 1
            else:
                bucket_counts[str(score)][0] += row[compare_score_index]
        for (score, count) in bucket_counts.items():
            if (count[1] > 0):
                print(score, count[0] / count[1])

def get_all_answers_for_task(task, task_main_question_dict, start_threshold=0, end_threshold=2, max_results=20):  
    pregenerated_model_answers_path = base_dir + '/model-answers'
    model_prediction_files = [f for f in listdir(pregenerated_model_answers_path) if isfile(join(pregenerated_model_answers_path, f))]

    doc_columns = ['doi', 'authors', 'journal', 'publish_time', 'title','cohort_size']
    answer_columns = ['answer', 'start_score', 'end_score', 'context']    
    blacklist = ['Efforts to determine adjunctive and supportive interventions that can improve the clinical outcomes of infected patients (e.g. steroids, high flow oxygen)']
    
    for model_prediction_file in model_prediction_files:
        predict_task, main_question, questions_list = get_task_and_main_question_question_list_for(model_prediction_file, task_main_question_dict)
        if main_question in blacklist:
            continue
        if predict_task == task:
            model_prediction_path = join(pregenerated_model_answers_path, model_prediction_file)
            model_prediction = pd.read_csv(model_prediction_path, sep=',')
            main_question_abbr = model_prediction_file.split('.')[0]
            question_answer_dict = dict()
            for _, row in model_prediction.iterrows():
                if row['start_score'] < start_threshold or row['end_score'] < end_threshold:
                    continue
                model_question = row['question']
                cohort_size = row['cohort_size_from_abstract'] if pd.notna(row['cohort_size_from_abstract']) else row['cohort_size_from_body']      
            
                for questions_str in questions_list:
                    if model_question in questions_str:
                        if questions_str not in question_answer_dict:
                            answer_and_eval = pd.DataFrame(columns=(answer_columns + doc_columns))                    
                            question_answer_dict[questions_str] = answer_and_eval
                        answer_and_eval = question_answer_dict[questions_str]  
                        answer_and_eval.loc[len(answer_and_eval)] = [row['answer'], row['start_score'], row['end_score'], row['context'], row['doi'], 
                                                               row['authors'], row['journal'], row['publish_time'], row['title'], cohort_size]
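            # Sort by score, then keep only the best-scoring answer per DOI for each question.
            for questions_str, answer_and_eval in question_answer_dict.items():
                question_answer_dict[questions_str] = (
                    answer_and_eval.sort_values('start_score', ascending=False)
                    .drop_duplicates('doi'))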
            render_results(main_question, question_answer_dict)

In [None]:
task_main_question_dict = build_task_main_question_dict()
get_all_answers_for_task('Task5', task_main_question_dict, 0, 2)

Run this cell to generate the file containing all answers for our questions on the given task. (Cached answer files are provided with the notebook.)

In [None]:
# qa.output_answers_for_task('Task5')

## Discussion
Literature reviews, while time-consuming, are perhaps the most important step in understanding a particular research topic or question. During this pandemic, the amount of literature generated has been overwhelming, and manually digesting this volume of information is nearly impossible.  

By applying NLP methods, we are able to automatically filter out much of the noise and directly surface the most relevant information. Using a streamlined, reproducible process, our Q&A model can easily be rerun to incorporate new literature as it is published, giving researchers easy access to the most up-to-date results. Moreover, the questions we present here were generated with the needs of clinical users in mind, and formulated broadly enough to be reusable in future pandemics.

In the future, we would like to improve our system by adding summarization capabilities to aggregate results across articles and extract additional context from each article, including credibility or level of evidence supporting the conclusions. These are challenging open problems in NLP. The extracted studies should be critically reviewed by experts, and consensus should be built between healthcare professionals and the public.

## Contributors
This notebook was produced by the following contributors from the [Google Health medical records research team](https://github.com/Google-Health/records-research): 

*Modeling:* Andrew M. Dai, Jonas Kemp, Mimi Sun, Zhen Xu, and Emily Xue.
*Clinical:* Christina Chen and Shawn Kang. 