# About:
- This notebook illustrates the implementation of a COVID-19 QA system.

In [1]:
import sys
sys.path.append(r"C:\Users\tanch\Documents\GitHub\URECA-CovidQA-Research\Implementation\Custom Modules")
import Config
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore
from haystack.retriever.sparse import ElasticsearchRetriever
from haystack.reader.farm import FARMReader
from haystack.pipeline import ExtractiveQAPipeline
from haystack.utils import print_answers
import pandas as pd

In [2]:
tuned_model_path = r"C:\Users\tanch\Documents\GitHub\URECA-CovidQA-Research\Implementation\Reader"

## Initiate Reader

In [3]:
reader = FARMReader(model_name_or_path=tuned_model_path,
                    context_window_size=500,
                    max_seq_len=280,
                    doc_stride=100,
                    return_no_answer=False)

06/27/2021 22:13:29 - INFO - farm.utils -   Using device: CUDA 
06/27/2021 22:13:29 - INFO - farm.utils -   Number of GPUs: 1
06/27/2021 22:13:29 - INFO - farm.utils -   Distributed Training: False
06/27/2021 22:13:29 - INFO - farm.utils -   Automatic Mixed Precision: None
The git executable must be specified in one of the following ways:
    - be included in your $PATH
    - be set via $GIT_PYTHON_GIT_EXECUTABLE
    - explicitly set via git.refresh()

All git commands will error until this is rectified.

$GIT_PYTHON_REFRESH environment variable. Use one of the following values:
    - error|e|raise|r|2: for a raised exception

Example:
    export GIT_PYTHON_REFRESH=quiet

06/27/2021 22:13:35 - INFO - farm.utils -   Using device: CUDA 
06/27/2021 22:13:35 - INFO - farm.utils -   Number of GPUs: 1
06/27/2021 22:13:35 - INFO - farm.utils -   Distributed Training: False
06/27/2021 22:13:35 - INFO - farm.utils -   Automatic Mixed Precision: None
06/27/2021 22:13:35 - INFO - farm.infer -   G
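
The `max_seq_len` and `doc_stride` arguments passed to the reader control how long passages are split into overlapping windows before inference. The sketch below illustrates the idea on a plain list of tokens. `sliding_windows` is a hypothetical helper, not part of FARM or Haystack; it treats the stride as the step between window starts (an assumption about the library's behaviour) and ignores the tokens the real model reserves for the question and special symbols.

```python
def sliding_windows(tokens, window_len, stride):
    """Split tokens into overlapping windows whose starts are `stride` apart.

    Hypothetical illustration only; not the library's own chunking code.
    """
    if window_len <= 0 or stride <= 0:
        raise ValueError("window_len and stride must be positive")
    windows = []
    start = 0
    while True:
        end = start + window_len
        windows.append(tokens[start:end])
        # Stop once the current window reaches the end of the passage.
        if end >= len(tokens):
            break
        start += stride
    return windows


# A made-up 600-token passage, split with the values used for the reader above.
tokens = [f"t{i}" for i in range(600)]
windows = sliding_windows(tokens, window_len=280, stride=100)
print(len(windows), [len(w) for w in windows])
```

With a stride smaller than the window length, consecutive windows overlap, so an answer that straddles one window boundary is likely to appear whole in a neighbouring window.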

## Connect the retriever to Elasticsearch

In [4]:
document_store = ElasticsearchDocumentStore(index=Config.INDEX_NAME,
                                            username=Config.AUTH["username"],
                                            password=Config.AUTH["password"],
                                            host="localhost",
                                            port=9200,
                                            similarity="dot_product",
                                            search_fields=["text", "name"],
                                            text_field="text",
                                            name_field="name",
                                            embedding_field="embedding",
                                            embedding_dim=768)
retriever = ElasticsearchRetriever(document_store=document_store)

06/27/2021 22:13:35 - INFO - elasticsearch -   HEAD http://localhost:9200/covid_datastore [status:200 request:0.016s]
06/27/2021 22:13:35 - INFO - elasticsearch -   GET http://localhost:9200/covid_datastore [status:200 request:0.003s]
06/27/2021 22:13:35 - INFO - elasticsearch -   PUT http://localhost:9200/covid_datastore/_mapping [status:200 request:0.020s]
06/27/2021 22:13:35 - INFO - elasticsearch -   HEAD http://localhost:9200/label [status:200 request:0.004s]
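
The sparse retriever delegates ranking to Elasticsearch, whose default text similarity is BM25. As a rough illustration of what that ranking does, the sketch below scores a few made-up sentences against a query with the textbook BM25 formula. `bm25_scores`, the whitespace tokenizer, and the sample sentences are assumptions for illustration; Elasticsearch's analyzers and exact scoring details differ.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook BM25.

    Hypothetical helper: lowercased whitespace tokens, no stemming or stop words.
    """
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Guard against a corpus of empty strings.
    avg_len = sum(len(t) for t in tokenized) / n_docs or 1.0

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Made-up sentences, not taken from the indexed corpus.
docs = [
    "ART kits give results in less than 20 minutes",
    "the vaccine can cause mild side effects",
    "PCR tests take longer than ART kits",
]
scores = bm25_scores("how fast can ART kits give results", docs)
best = scores.index(max(scores))
print(docs[best])
```

Rare query terms contribute more than common ones, and longer documents are penalised slightly, which is why the first sentence ranks highest here.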


In [75]:
document_store.get_all_documents(filters = {"category":['advisories']})

06/27/2021 23:00:25 - INFO - elasticsearch -   POST http://localhost:9200/covid_datastore/_search?scroll=1d&size=10000 [status:200 request:0.011s]
06/27/2021 23:00:25 - INFO - elasticsearch -   POST http://localhost:9200/_search/scroll [status:200 request:0.003s]
06/27/2021 23:00:25 - INFO - elasticsearch -   DELETE http://localhost:9200/_search/scroll [status:200 request:0.002s]


[{'text': 'Requirements for Safe Management Measures at the workplace Read the sector-specific guidelines and infographic on Safe Management Measures at the workplace. From 16 May 2021 to 13 June 2021, the Safe Management Measures for the workplace will be tightened. Previously, up to 50% of employees28 who are able to work from home could be at the workplace at any time. Now, employers must ensure that all employees who are able to work from home do so. Social gatherings at the workplace are disallowed. These measures help lower transmission risks by reducing the levels of interaction at common spaces at or near the workplace, and in public places, including public transport. Issued on 9 May 2020 Updated as of 14 May 2021 The tripartite partners (MOM, SNEF, and NTUC) have updated the workplace safe management measures to allow greater flexibility for businesses, while mitigating the risk of widespread COVID-19 transmission. Effective implementation of these measures will help to avoid

In [76]:
document_store.get_all_documents(filters = {"category":['articles']})

06/27/2021 23:00:33 - INFO - elasticsearch -   POST http://localhost:9200/covid_datastore/_search?scroll=1d&size=10000 [status:200 request:0.357s]
06/27/2021 23:00:33 - INFO - elasticsearch -   POST http://localhost:9200/_search/scroll [status:200 request:0.003s]
06/27/2021 23:00:33 - INFO - elasticsearch -   DELETE http://localhost:9200/_search/scroll [status:200 request:0.002s]


[{'text': "gradual re opening to phase 3 heightened alert from 14 june The Multi-Ministry Taskforce (MTF) has announced that Singapore will return to Phase 3 (Heightened Alert) from 14 June 2021, in two steps.From 14 June 2021:•Social gathering group sizes will increase from 2 to 5 persons•Event size limits and capacity limits of certain sectors will also increase, with pre-event testing (PET)•Resumption of personal services without masks, e.g. facials and saunasFrom 21 June 2021 (updated 18 June 2021):Due to the persistence of undetected community transmission cases, higher-risk activities – such as dining-in at F&B establishments and indoor mask-off sports activities – will be allowed in group sizes of up to 2 persons, instead of 5 persons.Group sizes of up to 5 persons will be allowed from mid-July 2021, should there be no further supers-spreader event or big clusters.Please refer to this table for full details: Current Measures for Phase 3 (Heightened Alert) Phase 3 (Heightened Ale

## Connect reader to retriever

In [5]:
pipe = ExtractiveQAPipeline(reader, retriever) 

## Making predictions on CovidQA

In [95]:
questions = [
    "when was the official launch of covid 19 vaccination for women?",
    "what is the likelihood of getting serious illness due to moderna vaccine?",
    "how fast can ART kits give results?",
    "why do some seniors avoid taking the vaccine?",
    "how much does nasal swab cost??",
    "what was MOH's response after the Victoria Junior College student contracted covid?",
    "what is a serology test",
    "what is the capacity of recreational facilities?",
    "what are the benefits of COVID-19 Driver Relief Fund?",
    "what is the capacity of live performances without PET?",
]
answers = []
contexts = []

In [96]:
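for question in questions:
    output = pipe.run(query=question, top_k_retriever=10, top_k_reader=1)
    # Fall back to empty strings when no answer is returned, so that
    # answers and contexts stay aligned with questions.
    try:
        top = output['answers'][0]
        answer, context = top['answer'], top['context']
    except (IndexError, KeyError, TypeError):
        answer, context = "", ""
    answers.append(answer)
    contexts.append(context)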

06/27/2021 23:10:54 - INFO - elasticsearch -   POST http://localhost:9200/covid_datastore/_search [status:200 request:0.022s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.21s/ Batches]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  5.23 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  4.94 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  5.05 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  4.62 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  6.40 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  3.41 Batches/s]
Inferencing Samples: 100%|████████

Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.91 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.42 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.30 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.86 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.87 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.59 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.39 Batches/s]
Inferencing Samples: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  3.66 Batches/s]
Inferencing Samples: 100%|██████████████
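
The loop above reads the top answer from `output['answers'][0]`. A small helper can make the "no answer" case explicit and keep the answer and context lists the same length. The sketch below assumes only the dictionary layout used in that loop (an `answers` list of dicts with `answer` and `context` keys); `top_answer` is a hypothetical name, not a Haystack function.

```python
def top_answer(output):
    """Return (answer, context) for the first answer, or empty strings if none.

    Hypothetical helper; assumes the output layout used in the loop above.
    """
    answers = (output or {}).get("answers") or []
    if not answers:
        return "", ""
    first = answers[0]
    # A missing or None field is reported as an empty string.
    return first.get("answer") or "", first.get("context") or ""


# Hand-written dictionaries standing in for pipeline outputs.
print(top_answer({"answers": [{"answer": "$10", "context": "sample context"}]}))
print(top_answer({"answers": []}))
```

Inside the loop this would be used as `answer, context = top_answer(output)`.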

In [98]:
import pandas as pd
pd.DataFrame({"Question":questions, "Answer":answers,"Context":contexts})

| | Question | Answer | Context |
|---|---|---|---|
| 0 | when was the official launch of covid 19 vaccination for women? | June 14 | f foetal anomaly reported was 2.6 per cent, Prof Tan said. For miscarriage, ... |
| 1 | what is the likelihood of getting serious illness due to moderna vaccine? | 0.004 per cent | effects are caused by the body’s immune response to the vaccine. Out of mor... |
| 2 | how fast can ART kits give results? | less than 20 minutes | Singapore's director of medical services, Associate Professor Kenneth Mak, s... |
| 3 | why do some seniors avoid taking the vaccine? | fear of complications or side effects | fear of complications or side effects main reason some seniors spurn covid 1... |
| 4 | how much does nasal swab cost?? | $10 | rtable. An average of over 55,000 PCR and ART tests were carried out each da... |
| 5 | what was MOH's response after the Victoria Junior College student contracted... | quarantined 95 students and eight staff | rm of active case finding. Experts have noted the more aggressive approach i... |
| 6 | what is a serology test | detects the presence of antibodies and can show if the person might have bee... | part of community surveillance testing for residents in the Bukit Merah View... |
| 7 | what is the capacity of recreational facilities? | 50 | 1 Refers to 18 years (born in 2003) and below 2 Issued: 14 May 2021 Updated... |
| 8 | what are the benefits of COVID-19 Driver Relief Fund? | $500 | ss of complete or partial (at least 10%) household income due to COVID-19 • ... |
| 9 | what is the capacity of live performances without PET? | 50 pax | o cross-deployment Social gatherings not allowed Event Parameters Funerals U... |
