<a href="https://colab.research.google.com/github/aislinblack/CS6120-NLP-Project/blob/main/Albert_original_version.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Question and Answering with ALBERT

Based on the following jupyter notebook https://colab.research.google.com/github/spark-ming/albert-qa-demo/blob/master/Question_Answering_with_ALBERT.ipynb#scrollTo=1qfQAtRsMVl7

## Introduction to ALBERT


ALBERT stands for A Lite BERT and is a modified version of BERT NLP model. It builds on three key points such as Parameter Sharing, Embedding Factorization and Sentence Order Prediction (SOP). 





## 1.0 Setup

Let's check out what kind of GPU our friends at Google gave us. This notebook should be configured to give you a P100 😃 (saved in metadata)

In [1]:
!nvidia-smi

Sat Aug 13 18:02:15 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   38C    P0    28W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

First, we clone the Hugging Face transformer library from Github

In [2]:
!git clone https://github.com/huggingface/transformers \
&& cd transformers \
&& git checkout a3085020ed0d81d4903c50967687192e3101e770 

Cloning into 'transformers'...
remote: Enumerating objects: 105690, done.[K
remote: Counting objects: 100% (357/357), done.[K
remote: Compressing objects: 100% (161/161), done.[K
remote: Total 105690 (delta 199), reused 280 (delta 159), pack-reused 105333[K
Receiving objects: 100% (105690/105690), 98.13 MiB | 22.57 MiB/s, done.
Resolving deltas: 100% (78049/78049), done.
Note: checking out 'a3085020ed0d81d4903c50967687192e3101e770'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at a3085020e Added repetition penalty to PPLM example (#2436)


In [3]:
!pip install ./transformers
!pip install tensorboardX

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing ./transformers
[33m  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.[0m
Collecting tokenizers==0.0.11
  Downloading tokenizers-0.0.11-cp37-cp37m-manylinux1_x86_64.whl (4.7 MB)
[K     |████████████████████████████████| 4.7 MB 14.8 MB/s 
[?25hCollecting boto3
  Downloading boto3-1.24.51-py3-none-any.whl (132 kB)
[K     |████████████████████████████████| 132 kB 94.8 MB/s 
Collecting sentencepiece
  Downloading sentencepiece-0.1.97-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
[K     |███████████████████████████████

## 2.0 Train Model

Now, we could definitely train our own model (and you can see how to do that in the other linked jupyter notebook), but it would take a really long time, and because of this hugging face lets us borrow a pretrained albert model which was already trained on the SQuAD dataset.

The tutorial lets us know that it takes about 1.5 hours per epoch to train ALBERT on SQuAD because the dataset is so large.



## 3.0 Setup prediction code

Now we can use the Hugging Face library to make predictions using our newly trained model. Note that a lot of the code is pulled from `run_squad.py` in the Hugging Face repository, with all the training parts removed. This modified code allows to run predictions we pass in directly as strings, rather .json format like the training/test set.

NOTE if you decided train your own mode, change the flag `use_own_model` to `True`


In [4]:
import os
import torch
import time
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler

from transformers import (
    AlbertConfig,
    AlbertForQuestionAnswering,
    AlbertTokenizer,
    squad_convert_examples_to_features
)

from transformers.data.processors.squad import SquadResult, SquadV2Processor, SquadExample

from transformers.data.metrics.squad_metrics import compute_predictions_logits

# READER NOTE: Set this flag to use own model, or use pretrained model in the Hugging Face repository
use_own_model = False

if use_own_model:
  model_name_or_path = "/content/model_output"
else:
  model_name_or_path = "twmkn9/albert-base-v2-squad2" ## using this model because it is closest to the bert being used

output_dir = ""

# Config
n_best_size = 1
max_answer_length = 30
do_lower_case = True
null_score_diff_threshold = 0.0

def to_list(tensor):
    return tensor.detach().cpu().tolist()

# Setup model
config_class, model_class, tokenizer_class = (
    AlbertConfig, AlbertForQuestionAnswering, AlbertTokenizer)
config = config_class.from_pretrained(model_name_or_path)
tokenizer = tokenizer_class.from_pretrained(
    model_name_or_path, do_lower_case=True)
model = model_class.from_pretrained(model_name_or_path, config=config)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model.to(device)

processor = SquadV2Processor()

def run_prediction(question_texts, context_text):
    """Setup function to compute predictions"""
    examples = []

    for i, question_text in enumerate(question_texts):
        example = SquadExample(
            qas_id=str(i),
            question_text=question_text,
            context_text=context_text,
            answer_text=None,
            start_position_character=None,
            title="Predict",
            is_impossible=False,
            answers=None,
        )

        examples.append(example)

    features, dataset = squad_convert_examples_to_features(
        examples=examples,
        tokenizer=tokenizer,
        max_seq_length=384,
        doc_stride=128,
        max_query_length=64,
        is_training=False,
        return_dataset="pt",
        threads=1,
    )

    eval_sampler = SequentialSampler(dataset)
    eval_dataloader = DataLoader(dataset, sampler=eval_sampler, batch_size=10)

    all_results = []

    for batch in eval_dataloader:
        model.eval()
        batch = tuple(t.to(device) for t in batch)

        with torch.no_grad():
            inputs = {
                "input_ids": batch[0],
                "attention_mask": batch[1],
                "token_type_ids": batch[2],
            }

            example_indices = batch[3]

            outputs = model(**inputs)

            for i, example_index in enumerate(example_indices):
                eval_feature = features[example_index.item()]
                unique_id = int(eval_feature.unique_id)

                output = [to_list(output[i]) for output in outputs]

                start_logits, end_logits = output
                result = SquadResult(unique_id, start_logits, end_logits)
                all_results.append(result)

    output_prediction_file = "predictions.json"
    output_nbest_file = "nbest_predictions.json"
    output_null_log_odds_file = "null_predictions.json"

    predictions = compute_predictions_logits(
        examples,
        features,
        all_results,
        n_best_size,
        max_answer_length,
        do_lower_case,
        output_prediction_file,
        output_nbest_file,
        output_null_log_odds_file,
        False,  # verbose_logging
        True,  # version_2_with_negative
        null_score_diff_threshold,
        tokenizer,
    )

    return predictions

Downloading:   0%|          | 0.00/716 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/760k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/46.7M [00:00<?, ?B/s]

## 4.0 Run predictions on the Covid QA set

Now for the fun part... testing out your model on different inputs. Pretty rudimentary example here. But the possibilities are endless with this function.

### 4.1 Reading in the Covid QA set from json

In [5]:
import pandas as pd

def read_test_set():
    df = pd.read_json("Covid-QA-more-focused.json")
    return df['data']

data = read_test_set()

### 4.2 Setting up data collection 

Also making sure we are connected to GPU

In [6]:
## set up data
import time
import tensorflow as tf
print(tf.test.gpu_device_name())

num_right = 0 # giving credit for whenever it comes up with a subset of the string
total = 0
all_questions = []
all_answers = []

/device:GPU:0


### 4.3 Run prediction on each paragraph of Q&A test set

In [7]:
total_start = time.time()
for item in data:
    start = time.time()

    paragraph = item["paragraphs"][0]
    
    questions_with_answers = paragraph["qas"]
    context = paragraph["context"]
    questions = []
    answers = []

    for qa in questions_with_answers:
        questions.append(qa["question"])
        answers.append(qa["answers"])

    predictions = run_prediction(questions, context)
    idx = 0
    for key in predictions.keys():
      pos_answers = answers[idx]
      correct = False

      all_questions = all_questions + questions
      all_answers.append({'model_answer': predictions[key], 'valid_answers': pos_answers})
      for answer in pos_answers:
        answer_text = answer['text']
        correct = correct or (predictions[key] in answer_text)
      if correct:
        num_right += 1
      end = time.time()

      total += 1
      idx += 1
    end = time.time()
    print(end - start)

print("total time: ", total_start - time.time())

convert squad examples to features: 100%|██████████| 10/10 [00:03<00:00,  3.12it/s]
add example index and unique id: 100%|██████████| 10/10 [00:00<00:00, 20867.18it/s]


11.079823017120361


convert squad examples to features: 100%|██████████| 2/2 [00:00<00:00,  2.81it/s]
add example index and unique id: 100%|██████████| 2/2 [00:00<00:00, 9372.75it/s]


2.1874256134033203


convert squad examples to features: 100%|██████████| 1/1 [00:00<00:00,  4.84it/s]
add example index and unique id: 100%|██████████| 1/1 [00:00<00:00, 6786.90it/s]


0.7935643196105957


convert squad examples to features: 100%|██████████| 10/10 [00:02<00:00,  3.56it/s]
add example index and unique id: 100%|██████████| 10/10 [00:00<00:00, 24188.60it/s]


8.603464841842651


convert squad examples to features: 100%|██████████| 8/8 [00:01<00:00,  4.59it/s]
add example index and unique id: 100%|██████████| 8/8 [00:00<00:00, 18925.23it/s]


5.4849255084991455


convert squad examples to features: 100%|██████████| 5/5 [00:01<00:00,  3.14it/s]
add example index and unique id: 100%|██████████| 5/5 [00:00<00:00, 15196.75it/s]


4.7098963260650635


convert squad examples to features: 100%|██████████| 4/4 [00:00<00:00, 23.61it/s]
add example index and unique id: 100%|██████████| 4/4 [00:00<00:00, 44739.24it/s]


0.6992290019989014


convert squad examples to features: 100%|██████████| 11/11 [00:01<00:00,  5.61it/s]
add example index and unique id: 100%|██████████| 11/11 [00:00<00:00, 28944.38it/s]


6.380980968475342


convert squad examples to features: 100%|██████████| 8/8 [00:03<00:00,  2.63it/s]
add example index and unique id: 100%|██████████| 8/8 [00:00<00:00, 20068.44it/s]


8.893790245056152


convert squad examples to features: 100%|██████████| 5/5 [00:00<00:00,  6.08it/s]
add example index and unique id: 100%|██████████| 5/5 [00:00<00:00, 24216.54it/s]


2.9147472381591797


convert squad examples to features: 100%|██████████| 29/29 [00:08<00:00,  3.49it/s]
add example index and unique id: 100%|██████████| 29/29 [00:00<00:00, 29057.53it/s]


24.728123903274536


convert squad examples to features: 100%|██████████| 4/4 [00:00<00:00, 19.00it/s]
add example index and unique id: 100%|██████████| 4/4 [00:00<00:00, 27730.94it/s]


0.9033982753753662


convert squad examples to features: 100%|██████████| 1/1 [00:00<00:00, 11.09it/s]
add example index and unique id: 100%|██████████| 1/1 [00:00<00:00, 7667.83it/s]


0.37380385398864746


convert squad examples to features: 100%|██████████| 27/27 [00:02<00:00, 12.67it/s]
add example index and unique id: 100%|██████████| 27/27 [00:00<00:00, 55458.48it/s]


8.038163185119629


convert squad examples to features: 100%|██████████| 20/20 [00:04<00:00,  4.69it/s]
add example index and unique id: 100%|██████████| 20/20 [00:00<00:00, 29016.29it/s]


12.982990503311157


convert squad examples to features: 100%|██████████| 1/1 [00:00<00:00, 30.15it/s]
add example index and unique id: 100%|██████████| 1/1 [00:00<00:00, 11305.40it/s]


0.24875593185424805


convert squad examples to features: 100%|██████████| 30/30 [00:05<00:00,  5.40it/s]
add example index and unique id: 100%|██████████| 30/30 [00:00<00:00, 38211.09it/s]


18.013540983200073


convert squad examples to features: 100%|██████████| 59/59 [00:35<00:00,  1.68it/s]
add example index and unique id: 100%|██████████| 59/59 [00:00<00:00, 18449.56it/s]


92.49353623390198


convert squad examples to features: 100%|██████████| 15/15 [00:06<00:00,  2.24it/s]
add example index and unique id: 100%|██████████| 15/15 [00:00<00:00, 18547.92it/s]


19.331637144088745


convert squad examples to features: 100%|██████████| 13/13 [00:01<00:00,  6.71it/s]
add example index and unique id: 100%|██████████| 13/13 [00:00<00:00, 43343.36it/s]


6.268296718597412


convert squad examples to features: 100%|██████████| 5/5 [00:01<00:00,  4.72it/s]
add example index and unique id: 100%|██████████| 5/5 [00:00<00:00, 23590.01it/s]


3.4364125728607178


convert squad examples to features: 100%|██████████| 119/119 [03:33<00:00,  1.79s/it]
add example index and unique id: 100%|██████████| 119/119 [00:00<00:00, 11111.36it/s]


445.75199460983276


convert squad examples to features: 100%|██████████| 23/23 [00:09<00:00,  2.35it/s]
add example index and unique id: 100%|██████████| 23/23 [00:00<00:00, 20808.67it/s]


27.428820848464966


convert squad examples to features: 100%|██████████| 1/1 [00:00<00:00, 17.58it/s]
add example index and unique id: 100%|██████████| 1/1 [00:00<00:00, 5769.33it/s]


0.32392001152038574


convert squad examples to features: 100%|██████████| 10/10 [00:01<00:00,  6.45it/s]
add example index and unique id: 100%|██████████| 10/10 [00:00<00:00, 35365.13it/s]


5.171037435531616


convert squad examples to features: 100%|██████████| 1/1 [00:01<00:00,  1.15s/it]
add example index and unique id: 100%|██████████| 1/1 [00:00<00:00, 6026.30it/s]


2.761451482772827


convert squad examples to features: 100%|██████████| 24/24 [00:06<00:00,  3.91it/s]
add example index and unique id: 100%|██████████| 24/24 [00:00<00:00, 20068.44it/s]


18.64723253250122


convert squad examples to features: 100%|██████████| 2/2 [00:01<00:00,  1.13it/s]
add example index and unique id: 100%|██████████| 2/2 [00:00<00:00, 9811.24it/s]


4.311910152435303


convert squad examples to features: 100%|██████████| 9/9 [00:00<00:00, 13.18it/s]
add example index and unique id: 100%|██████████| 9/9 [00:00<00:00, 22822.69it/s]


2.4292891025543213


convert squad examples to features: 100%|██████████| 3/3 [00:00<00:00, 13.45it/s]
add example index and unique id: 100%|██████████| 3/3 [00:00<00:00, 2517.09it/s]


0.934079647064209


convert squad examples to features: 100%|██████████| 21/21 [00:03<00:00,  6.92it/s]
add example index and unique id: 100%|██████████| 21/21 [00:00<00:00, 50475.86it/s]


9.960541725158691


convert squad examples to features: 100%|██████████| 125/125 [01:18<00:00,  1.59it/s]
add example index and unique id: 100%|██████████| 125/125 [00:00<00:00, 17786.95it/s]


205.25598883628845


convert squad examples to features: 100%|██████████| 56/56 [01:31<00:00,  1.64s/it]
add example index and unique id: 100%|██████████| 56/56 [00:00<00:00, 10810.56it/s]


197.11518120765686


convert squad examples to features: 100%|██████████| 6/6 [00:00<00:00,  7.69it/s]
add example index and unique id: 100%|██████████| 6/6 [00:00<00:00, 26241.74it/s]


2.713545083999634


convert squad examples to features: 100%|██████████| 53/53 [00:09<00:00,  5.84it/s]
add example index and unique id: 100%|██████████| 53/53 [00:00<00:00, 45063.47it/s]


29.40903949737549


convert squad examples to features: 100%|██████████| 16/16 [00:03<00:00,  5.22it/s]
add example index and unique id: 100%|██████████| 16/16 [00:00<00:00, 22849.46it/s]


9.884322881698608


convert squad examples to features: 100%|██████████| 46/46 [00:09<00:00,  4.70it/s]
add example index and unique id: 100%|██████████| 46/46 [00:00<00:00, 34657.44it/s]


32.1949737071991


convert squad examples to features: 100%|██████████| 1/1 [00:01<00:00,  1.15s/it]
add example index and unique id: 100%|██████████| 1/1 [00:00<00:00, 6442.86it/s]


2.744664192199707


convert squad examples to features: 100%|██████████| 2/2 [00:00<00:00, 43.43it/s]
add example index and unique id: 100%|██████████| 2/2 [00:00<00:00, 22075.28it/s]


0.3238542079925537


convert squad examples to features: 100%|██████████| 2/2 [00:00<00:00,  9.40it/s]
add example index and unique id: 100%|██████████| 2/2 [00:00<00:00, 17810.21it/s]


0.9157736301422119


convert squad examples to features: 100%|██████████| 14/14 [00:03<00:00,  4.50it/s]
add example index and unique id: 100%|██████████| 14/14 [00:00<00:00, 28217.33it/s]


9.716186285018921


convert squad examples to features: 100%|██████████| 26/26 [00:01<00:00, 20.94it/s]
add example index and unique id: 100%|██████████| 26/26 [00:00<00:00, 91180.52it/s]


4.574407339096069
total time:  -1251.1406190395355


### 4.4 Export data to json for easy reuse

In [8]:
import json

print("num-right:", num_right)
print("total:", total)
# print(all_questions)
# print(all_answers)
with open("answers.json", "w") as f:
    json.dump(all_answers, f)

with open("questions.json", "w") as g:
    json.dump(all_questions, g)

num-right: 573
total: 828


### 4.5 Evaluate the model
We will evalue the model using https://huggingface.co/spaces/evaluate-metric/bertscore 


#### 4.5.1 Imports

In [9]:
## Imports and setup
!pip3 install evaluate
!pip3 install bert_score
from evaluate import load
import pandas as pd

bertscore = load("bertscore")


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting evaluate
  Downloading evaluate-0.2.2-py3-none-any.whl (69 kB)
[K     |████████████████████████████████| 69 kB 3.1 MB/s 
Collecting datasets>=2.0.0
  Downloading datasets-2.4.0-py3-none-any.whl (365 kB)
[K     |████████████████████████████████| 365 kB 8.4 MB/s 
Collecting xxhash
  Downloading xxhash-3.0.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)
[K     |████████████████████████████████| 212 kB 50.6 MB/s 
[?25hCollecting multiprocess
  Downloading multiprocess-0.70.13-py37-none-any.whl (115 kB)
[K     |████████████████████████████████| 115 kB 72.5 MB/s 
[?25hCollecting huggingface-hub>=0.7.0
  Downloading huggingface_hub-0.8.1-py3-none-any.whl (101 kB)
[K     |████████████████████████████████| 101 kB 13.4 MB/s 
[?25hCollecting responses<0.19
  Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Collecting fsspec[http]>=2021.05.0
  Downloading 

Downloading builder script:   0%|          | 0.00/7.79k [00:00<?, ?B/s]

#### 4.5.2 Loading the data

We saved the data to a json so lets load it in

In [None]:

model_answers = pd.read_json("model_answers.json")

answers = []
references = []

for answer, valid in zip(model_answers["model_answer"], model_answers["valid_answers"]):
  answers.append(answer)
  references.append(valid[0]["text"])
    

#### 4.5.3 Calculate the metrics

In [None]:
results = bertscore.compute(predictions=answers, references=references, lang="en")


In [None]:

total_precision = 0
total_recall = 0
total_f1 = 0

count = 0

for prec, recall, f1 in zip(results["precision"], results["recall"], results["f1"]):
  if prec != 0:
    count += 1
    total_precision += prec
    total_recall += recall
    total_f1 += f1

precision = total_precision / count
recall = total_recall / count
f1 = total_f1 / count

print("Precision: ",precision)
print("Recall: ", recall)
print("F1: ", f1)

## 5.0 So How does it Perform for questions using the CDCs covid advice website?

In [None]:
cdc_guidance = "IF YOU Were exposed to COVID-19 and are NOT up to date on COVID-19 vaccinations Quarantine for at least 5 days Stay home Stay home and quarantine for at least 5 full days. Wear a well-fitting mask if you must be around others in your home. Do not travel. Get tested Even if you don’t develop symptoms, get tested at least 5 days after you last had close contact with someone with COVID-19. After quarantine Watch for symptoms Watch for symptoms until 10 days after you last had close contact with someone with COVID-19. Avoid travel It is best to avoid travel until a full 10 days after you last had close contact with someone with COVID-19. If you develop symptoms Isolate immediately and get tested. Continue to stay home until you know the results. Wear a well-fitting mask around others. Take precautions until day 10 Wear a well-fitting mask Wear a well-fitting mask for 10 full days any time you are around others inside your home or in public. Do not go to places where you are unable to wear a well-fitting mask. If you must travel during days 6-10, take precautions. Avoid being around people who are more likely to get very sick from COVID-19. IF YOU Were exposed to COVID-19 and are up to date on COVID-19 vaccinations No quarantine You do not need to stay home unless you develop symptoms. Get tested Even if you don’t develop symptoms, get tested at least 5 days after you last had close contact with someone with COVID-19. Watch for symptoms Watch for symptoms until 10 days after you last had close contact with someone with COVID-19. If you develop symptoms Isolate immediately and get tested. Continue to stay home until you know the results. Wear a well-fitting mask around others. Take precautions until day 10 Wear a well-fitting mask Wear a well-fitting mask for 10 full days any time you are around others inside your home or in public. Do not go to places where you are unable to wear a well-fitting mask. Take precautions if traveling Avoid being around people who are more likely to get very sick from COVID-19. IF YOU were exposed to COVID-19 and had confirmed COVID-19 within the past 90 days (you tested positive using a viral test) No quarantine You do not need to stay home unless you develop symptoms. Watch for symptoms Watch for symptoms until 10 days after you last had close contact with someone with COVID-19.  If you develop symptoms Isolate immediately and get tested. Continue to stay home until you know the results. Wear a well-fitting mask around others. Take precautions until day 10 Wear a well-fitting mask Wear a well-fitting mask for 10 full days any time you are around others inside your home or in public. Do not go to places where you are unable to wear a well-fitting mask. Take precautions if traveling Avoid being around people who are more likely to get very sick from COVID-19. Calculating Isolation Day 0 is your first day of symptoms or a positive viral test. Day 1 is the first full day after your symptoms developed or your test specimen was collected. If you have COVID-19 or have symptoms, isolate for at least 5 days. IF YOU Tested positive for COVID-19 or have symptoms, regardless of vaccination status Stay home for at least 5 days Stay home for 5 days and isolate from others in your home. Wear a well-fitting mask if you must be around others in your home. Do not travel. Ending isolation if you had symptoms End isolation after 5 full days if you are fever-free for 24 hours (without the use of fever-reducing medication) and your symptoms are improving. Ending isolation if you did NOT have symptoms End isolation after at least 5 full days after your positive test. If you got very sick from COVID-19 or have a weakened immune system You should isolate for at least 10 days. Consult your doctor before ending isolation. Take precautions until day 10 Wear a well-fitting mask  Wear a well-fitting mask for 10 full days any time you are around others inside your home or in public. Do not go to places where you are unable to wear a mask. Do not travel Do not travel until a full 10 days after your symptoms started or the date your positive test was taken if you had no symptoms. Avoid being around people who are more likely to get very sick from COVID-19.  DEFINITIONS Exposure Contact with someone infected with SARS-CoV-2, the virus that causes COVID-19, in a way that increases the likelihood of getting infected with the virus. Close Contact A close contact is someone who was less than 6 feet away from an infected person (laboratory-confirmed or a clinical diagnosis) for a cumulative total of 15 minutes or more over a 24-hour period. For example, three individual 5-minute exposures for a total of 15 minutes. People who are exposed to someone with COVID-19 after they completed at least 5 days of isolation are not considered close contacts. Quarantine Quarantine is a strategy used to prevent transmission of COVID-19 by keeping people who have been in close contact with someone with COVID-19 apart from others. Who does not need to quarantine? If you had close contact with someone with COVID-19 and you are in one of the following groups, you do not need to quarantine. You are up to date with your COVID-19 vaccines. You had confirmed COVID-19 within the last 90 days (meaning you tested positive using a viral test). If you are up to date with COVID-19 vaccines, you should wear a well-fitting mask around others for 10 days from the date of your last close contact with someone with COVID-19 (the date of last close contact is considered day 0). Get tested at least 5 days after you last had close contact with someone with COVID-19. If you test positive or develop COVID-19 symptoms, isolate from other people and follow recommendations in the Isolation section below. If you tested positive for COVID-19 with a viral test within the previous 90 days and subsequently recovered and remain without COVID-19 symptoms, you do not need to quarantine or get tested after close contact. You should wear a well-fitting mask around others for 10 days from the date of your last close contact with someone with COVID-19 (the date of last close contact is considered day 0). If you have COVID-19 symptoms, get tested and isolate from other people and follow recommendations in the Isolation section below. Who should quarantine? If you come into close contact with someone with COVID-19, you should quarantine if you are not up to date on COVID-19 vaccines. This includes people who are not vaccinated. What to do for quarantine Stay home and away from other people for at least 5 days (day 0 through day 5) after your last contact with a person who has COVID-19. The date of your exposure is considered day 0. Wear a well-fitting mask when around others at home, if possible. For 10 days after your last close contact with someone with COVID-19, watch for fever (100.4◦F or greater), cough, shortness of breath, or other COVID-19 symptoms. If you develop symptoms, get tested immediately and isolate until you receive your test results. If you test positive, follow isolation recommendations. If you do not develop symptoms, get tested at least 5 days after you last had close contact with someone with COVID-19. If you test negative, you can leave your home, but continue to wear a well-fitting mask when around others at home and in public until 10 days after your last close contact with someone with COVID-19. If you test positive, you should isolate for at least 5 days from the date of your positive test (if you do not have symptoms). If you do develop COVID-19 symptoms, isolate for at least 5 days from the date your symptoms began (the date the symptoms started is day 0). Follow recommendations in the isolation section below. If you are unable to get a test 5 days after last close contact with someone with COVID-19, you can leave your home after day 5 if you have been without COVID-19 symptoms throughout the 5-day period. Wear a well-fitting mask for 10 days after your date of last close contact when around others at home and in public. Avoid people who are have weakened immune systems or are more likely to get very sick from COVID-19, and nursing homes and other high-risk settings, until after at least 10 days. If possible, stay away from people you live with, especially people who are at higher risk for getting very sick from COVID-19, as well as others outside your home throughout the full 10 days after your last close contact with someone with COVID-19. If you are unable to quarantine, you should wear a well-fitting mask for 10 days when around others at home and in public. If you are unable to wear a mask when around others, you should continue to quarantine for 10 days. Avoid people who have weakened immune systems or are more likely to get very sick from COVID-19, and nursing homes and other high-risk settings, until after at least 10 days. See additional information about travel. Do not go to places where you are unable to wear a mask, such as restaurants and some gyms, and avoid eating around others at home and at work until after 10 days after your last close contact with someone with COVID-19. After quarantine Watch for symptoms until 10 days after your last close contact with someone with COVID-19. If you have symptoms, isolate immediately and get tested. Quarantine in high-risk congregate settings In certain congregate settings that have high risk of secondary transmission (such as correctional and detention facilities, homeless shelters, or cruise ships), CDC recommends a 10-day quarantine for residents, regardless of vaccination and booster status. During periods of critical staffing shortages, facilities may consider shortening the quarantine period for staff to ensure continuity of operations. Decisions to shorten quarantine in these settings should be made in consultation with state, local, tribal, or territorial health departments and should take into consideration the context and characteristics of the facility. CDC’s setting-specific guidance provides additional recommendations for these settings. Isolation Isolation is used to separate people with confirmed or suspected COVID-19 from those without COVID-19. People who are in isolation should stay home until it’s safe for them to be around others. At home, anyone sick or infected should separate from others, or wear a well-fitting mask when they need to be around others. People in isolation should stay in a specific “sick room” or area and use a separate bathroom if available. Everyone who has presumed or confirmed COVID-19 should stay home and isolate from other people for at least 5 full days (day 0 is the first day of symptoms or the date of the day of the positive viral test for asymptomatic persons). They should wear a mask when around others at home and in public for an additional 5 days. People who are confirmed to have COVID-19 or are showing symptoms of COVID-19 need to isolate regardless of their vaccination status. This includes: People who have a positive viral test for COVID-19, regardless of whether or not they have symptoms. People with symptoms of COVID-19, including people who are awaiting test results or have not been tested. People with symptoms should isolate even if they do not know if they have been in close contact with someone with COVID-19. What to do for isolation Monitor your symptoms. If you have an emergency warning sign (including trouble breathing), seek emergency medical care immediately. Stay in a separate room from other household members, if possible. Use a separate bathroom, if possible. Take steps to improve ventilation at home, if possible. Avoid contact with other members of the household and pets. Don’t share personal household items, like cups, towels, and utensils. Wear a well-fitting mask when you need to be around other people. Learn more about what to do if you are sick and how to notify your contacts. Top of Page Ending isolation for people who had COVID-19 and had symptoms If you had COVID-19 and had symptoms, isolate for at least 5 days. To calculate your 5-day isolation period, day 0 is your first day of symptoms. Day 1 is the first full day after your symptoms developed. You can leave isolation after 5 full days. You can end isolation after 5 full days if you are fever-free for 24 hours without the use of fever-reducing medication and your other symptoms have improved (Loss of taste and smell may persist for weeks or months after recovery and need not delay the end of isolation ). You should continue to wear a well-fitting mask around others at home and in public for 5 additional days (day 6 through day 10) after the end of your 5-day isolation period. If you are unable to wear a mask when around others, you should continue to isolate for a full 10 days. Avoid people who have weakened immune systems or are more likely to get very sick from COVID-19, and nursing homes and other high-risk settings, until after at least 10 days. If you continue to have fever or your other symptoms have not improved after 5 days of isolation, you should wait to end your isolation until you are fever-free for 24 hours without the use of fever-reducing medication and your other symptoms have improved. Continue to wear a well-fitting mask through day 10. Contact your healthcare provider if you have questions. See additional information about travel. Do not go to places where you are unable to wear a mask, such as restaurants and some gyms, and avoid eating around others at home and at work until a full 10 days after your first day of symptoms. If an individual has access to a test and wants to test, the best approach is to use an antigen test1 towards the end of the 5-day isolation period. Collect the test sample only if you are fever-free for 24 hours without the use of fever-reducing medication and your other symptoms have improved (loss of taste and smell may persist for weeks or months after recovery and need not delay the end of isolation). If your test result is positive, you should continue to isolate until day 10. If your test result is negative, you can end isolation, but continue to wear a well-fitting mask around others at home and in public until day 10. Follow additional recommendations for masking and avoiding travel as described above. 1As noted in the labeling for authorized over-the counter antigen tests:   Negative results should be treated as presumptive. Negative results do not rule out SARS-CoV-2 infection and should not be used as the sole basis for treatment or patient management decisions, including infection control decisions. To improve results, antigen tests should be used twice over a three-day period with at least 24 hours and no more than 48 hours between tests. Note that these recommendations on ending isolation do not apply to people who are moderately ill or very sick from COVID-19 or have weakened immune systems. See section below for recommendations for when to end isolation for these groups. Ending isolation for people who tested positive for COVID-19 but had no symptoms If you test positive for COVID-19 and never develop symptoms, isolate for at least 5 days. Day 0 is the day of your positive viral test (based on the date you were tested) and day 1 is the first full day after the specimen was collected for your positive test. You can leave isolation after 5 full days. If you continue to have no symptoms, you can end isolation after at least 5 days. You should continue to wear a well-fitting mask around others at home and in public until day 10 (day 6 through day 10). If you are unable to wear a mask when around others, you should continue to isolate for 10 days. Avoid people who have weakened immune systems or are more likely to get very sick from COVID-19, and nursing homes and other high-risk settings, until after at least 10 days. If you develop symptoms after testing positive, your 5-day isolation period should start over. Day 0 is your first day of symptoms. Follow the recommendations above for ending isolation for people who had COVID-19 and had symptoms. See additional information about travel. Do not go to places where you are unable to wear a mask, such as restaurants and some gyms, and avoid eating around others at home and at work until 10 days after the day of your positive test. If an individual has access to a test and wants to test, the best approach is to use an antigen test1 towards the end of the 5-day isolation period. If your test result is positive, you should continue to isolate until day 10. If your test result is positive, you can also choose to test daily and if your test result is negative, you can end isolation, but continue to wear a well-fitting mask around others at home and in public until day 10. Follow additional recommendations for masking and avoiding travel as described above. 1As noted in the labeling for authorized over-the counter antigen tests external icon external icon  : Negative results should be treated as presumptive. Negative results do not rule out SARS-CoV-2 infection and should not be used as the sole basis for treatment or patient management decisions, including infection control decisions. To improve results, antigen tests should be used twice over a three-day period with at least 24 hours and no more than 48 hours between tests. Ending isolation for people who were moderately or very sick from COVID-19 or have a weakened immune system People who are moderately ill from COVID-19 (experiencing symptoms that affect the lungs like shortness of breath or difficulty breathing) should isolate for 10 days and follow all other isolation precautions.  To calculate your 10-day isolation period, day 0 is your first day of symptoms. Day 1 is the first full day after your symptoms developed. If you are unsure if your symptoms are moderate, talk to a healthcare provider for further guidance. People who are very sick from COVID-19 (this means people who were hospitalized or required intensive care or ventilation support) and people who have weakened immune systems might need to isolate at home longer. They may also require testing with a viral test to determine when they can be around others. CDC recommends an isolation period of at least 10 and up to 20 days for people who were very sick from COVID-19 and for people with weakened immune systems. Consult with your healthcare provider about when you can resume being around other people. If you are unsure if your symptoms are severe or if you have a weakened immune system, talk to a healthcare provider for further guidance. People who have a weakened immune system should talk to their healthcare provider about the potential for reduced immune responses to COVID-19 vaccines and the need to continue to follow current prevention measures (including wearing a well-fitting mask and avoiding crowds and poorly ventilated indoor spaces) to protect themselves against COVID-19 until advised otherwise by their healthcare provider. Close contacts of immunocompromised people—including household members—should also be encouraged to receive all recommended COVID-19 vaccine doses to help protect these people.  Isolation in high-risk congregate settings In certain high-risk congregate settings that have high risk of secondary transmission and where it is not feasible to cohort people (such as correctional and detention facilities, homeless shelters, and cruise ships), CDC recommends a 10-day isolation period for residents. During periods of critical staffing shortages, facilities may consider shortening the isolation period for staff to ensure continuity of operations. Decisions to shorten isolation in these settings should be made in consultation with state, local, tribal, or territorial health departments and should take into consideration the context and characteristics of the facility. CDC’s setting-specific guidance provides additional recommendations for these settings. This CDC guidance is meant to supplement—not replace—any federal, state, local, territorial, or tribal health and safety laws, rules, and regulations. "

prediction = run_prediction(["What do I do if I am exposed to COVID-19?", "What should I do if I have a weakened immune system?", "How long should I isolate if I get Covid?"], cdc_guidance)

In [None]:
for key, value in prediction.items(): 
  print(value)

## 6.0 Conclusion

ALBERT is a highly accesible classifier for anyone who has enough RAM and GPU to run it. It is marketed as a lighter, higher speed version of the BERT classifier, but from what I found it's speed is highly dependent on the amount of context provided. With smaller amounts of context, BERT ran much faster. Overall testing the model took around 2 hours on Google Colab Pro GPU but could not run locally on my machine. 