# Challenge-4: Question-Answering System

- Designing this was quite a task because, as described in **Challenge-4**, the system has to answer within **50 ms**.

- At first I decided to use **RoBERTa**, the state-of-the-art NLP transformer model developed by **Facebook**.

- But fitting the _neural network_ on the given paragraph, _masking_ the (question, paragraph) pair, and then splitting the paragraph according to the question in real time was too **time-costly**.

So to make the system give **real-time outputs** within **50 ms**, I decided to use another approach.

- I perform **sentence embedding** on the given paragraph, building one vector per sentence from its word vectors.

- Then I convert the question to the same vectorised form.

- To map the question onto the paragraph, I compute the **cosine similarity** between the question vector and each sentence vector.

- Finally I return the sentence that has the highest **cosine similarity** with the question.
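
The steps above can be sketched end to end with a toy vocabulary. This is only an illustration under stated assumptions: `TOY_VECTORS`, `embed`, `cosine` and `best_sentence` are hypothetical names, and the 3-dimensional vectors are made up to stand in for a real embedding model.

```python
import math

# Hypothetical 3-dimensional word vectors standing in for a real embedding model.
TOY_VECTORS = {
    "drones": [1.0, 0.0, 0.0],
    "flew": [0.8, 0.2, 0.0],
    "missions": [0.9, 0.1, 0.0],
    "virus": [0.0, 1.0, 0.0],
    "symptoms": [0.0, 0.9, 0.1],
    "causes": [0.1, 0.8, 0.1],
}


def embed(sentence):
    """Sum the vectors of the known words; unknown words contribute nothing."""
    total = [0.0, 0.0, 0.0]
    for word in sentence.lower().split():
        vec = TOY_VECTORS.get(word)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
    return total


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_sentence(question, sentences):
    """Return the sentence whose embedding is most similar to the question's."""
    question_vec = embed(question)
    return max(sentences, key=lambda s: cosine(embed(s), question_vec))


sentences = ["drones flew missions", "virus causes symptoms"]
print(best_sentence("which virus causes it", sentences))
```

The notebook follows the same outline, with real word2vec vectors in place of the toy table.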

With this method I was able to **generate real-time outputs** within **_10 ms_**.

_Let's see how I achieved it!_

### Downloading Dependencies

In [None]:
!pip install gensim

In [None]:
import re
import gensim 
from gensim.parsing.preprocessing import remove_stopwords
import pandas as pd
import numpy
import nltk
from nltk import sent_tokenize
nltk.download('punkt')

### Downloading Google's Word2Vec Model

I used Google's pretrained word2vec model (`word2vec-google-news-300`) to build the sentence vectors.

In [None]:
from gensim.models import Word2Vec 
import gensim.downloader as api

#Load the model from Drive if it was cached earlier; otherwise download it and cache it
v2w_model=None
try:
    print("Loading w2v model")
    v2w_model = gensim.models.KeyedVectors.load("/content/drive/My Drive/Colab Projects/Lincode VIrtual Hackathon Submission/Database/w2vecmodel.mod")
    print("Loaded w2v model")
except Exception:
    v2w_model = api.load('word2vec-google-news-300')
    v2w_model.save("/content/drive/My Drive/Colab Projects/Lincode VIrtual Hackathon Submission/Database/w2vecmodel.mod")
    print("Saved w2v model")

#Dimensionality of the word vectors (300 for this model)
w2vec_embedding_size=len(v2w_model['computer'])

### Getting Word Vectorizations 

In [None]:
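#This function returns the vector of a word, or a zero vector if the word is not in the model's vocabulary
def getWordVec(word,model):
        #Reference vector, used only to get the embedding size
        samp=model['computer'];
        vec=[0]*len(samp);
        try:
                vec=model[word];
        except KeyError:
                vec=[0]*len(samp);
        return (vec)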
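
The same fallback can be written as a membership test instead of exception handling. The sketch below is an assumption-laden illustration: `ToyVectors` is a hypothetical stand-in that exposes only `vector_size`, `in` and indexing, which is the small part of a KeyedVectors-like interface this code relies on, and `word_vector` is a hypothetical name.

```python
class ToyVectors:
    """Hypothetical stand-in for a word-vector model (not the real gensim class)."""

    vector_size = 3

    def __init__(self, table):
        self._table = table

    def __contains__(self, word):
        return word in self._table

    def __getitem__(self, word):
        return self._table[word]


def word_vector(word, model):
    """Return the word's vector, or a zero vector for out-of-vocabulary words."""
    if word in model:
        return list(model[word])
    return [0.0] * model.vector_size


toy = ToyVectors({"computer": [0.1, 0.2, 0.3]})
print(word_vector("computer", toy))
print(word_vector("qwertyuiop", toy))
```

Reading the size from the model avoids looking up a reference word such as `'computer'` on every call.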

### Getting Sentence Embeddings For Given Sentences

In [None]:
#Getting Sentence Embeddings
def getPhraseEmbedding(phrase,embeddingmodel):

        #Reference vector from the function above, used only to get the embedding size
        samp=getWordVec('computer', embeddingmodel);
        vec=numpy.array([0]*len(samp));
        den=0;
        #Summing the vectors of all the words in the sentence to get the sentence embedding
        for word in phrase.split():
            #print(word)
            den=den+1;
            vec=vec+numpy.array(getWordVec(word,embeddingmodel));
        #den counts the words but is not used: the sum is not averaged
        return vec.reshape(1, -1)
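
Note that `getPhraseEmbedding` counts the words in `den` but never divides by it, so the embedding is a sum rather than an average. For ranking by cosine similarity this makes no difference, because cosine similarity depends only on the direction of a vector and not on its length. A small check, using made-up vectors and a hypothetical `cosine` helper:

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


summed = [3.0, 6.0, 9.0]               # sum of three hypothetical word vectors
averaged = [x / 3 for x in summed]     # the same sentence, averaged instead
question = [1.0, 0.5, 2.0]

# Both similarities agree up to floating-point rounding.
print(cosine(summed, question), cosine(averaged, question))
```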

### Cleaning Sentences

In [None]:
def clean_sentence(sentence, stopwords=False):
    
    #Cleaning the sentence by keeping only letters, digits and whitespace
    sentence = sentence.strip()
    sentence = re.sub(r'[^a-zA-Z0-9\s]', '', sentence)
    
    if stopwords:
         sentence = remove_stopwords(sentence)
    
  
    return sentence
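
To see what the cleaning step does to a sentence, here is a self-contained sketch. It reuses the same regular expression, but `TOY_STOPWORDS` is a small hypothetical stopword set standing in for gensim's `remove_stopwords`, so the stopword output will differ from what the notebook produces.

```python
import re

# Hypothetical, deliberately tiny stopword set (gensim's list is much longer).
TOY_STOPWORDS = {"the", "of", "is", "a", "what", "are"}


def clean(sentence, drop_stopwords=False):
    """Keep only letters, digits and whitespace; optionally drop stopwords."""
    sentence = re.sub(r'[^a-zA-Z0-9\s]', '', sentence.strip())
    if drop_stopwords:
        sentence = " ".join(
            word for word in sentence.split() if word.lower() not in TOY_STOPWORDS
        )
    return sentence


raw = "On Jan. 11, the Department of Defense published its 2018 statistics."
print(clean(raw))
print(clean(raw, drop_stopwords=True))
```

Punctuation is removed rather than replaced by a space, so a token such as `MQ-9` becomes `MQ9`.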

### This function takes a DataFrame and cleans every sentence in its `Sentences` column.

The DataFrame contains all the sentences of the given paragraph.

In [None]:
def get_cleaned_sentences(df,stopwords=False):    
    #sents=df[["questions"]];
    cleaned_sentences=[]

    for index,row in df.iterrows():
        #print(index,row)
        cleaned=clean_sentence(row["Sentences"],stopwords);
        cleaned_sentences.append(cleaned);
    return cleaned_sentences;

### This Function Prints The Results.

In [None]:
# We use sklearn's cosine_similarity function to find the similarity between the question and each sentence.
import sklearn
from sklearn.metrics.pairwise import cosine_similarity;

#This function finds the cosine similarities, maps the question to the paragraph & finds the result.
def retrieveAndPrintFAQAnswer(question_embedding,sentence_embeddings,FAQdf,sentences,q,prnt=True):
    max_sim=-1;
    index_sim=-1;
    dictionary= {}
    for index,faq_embedding in enumerate(sentence_embeddings):
        sim=cosine_similarity(faq_embedding,question_embedding)[0][0];
        #print(index, sim, sentences[index]) #Uncomment this to print the similarity scores of the question with all the cleaned sentences
        if sim>max_sim:
            max_sim=sim;
            index_sim=index;

        #Creating Dictionary containing indexes as keys and similarities as values.    
        dictionary.update( {index : sim})
    sort_orders = sorted(dictionary.items(), key=lambda x: x[1], reverse=True)

    #print(sort_orders) #Uncomment this to see (indexes,similarities) of the most similar sentences

    #Creating a list of the most relevant answers (the sentences of the paragraph with the highest cosine similarity)
    Sep_Paras = []
    for i in range(min(3,len(sort_orders))):
      Sep_Paras.append([FAQdf.iloc[sort_orders[i][0],0]])
    #print(Sep_Paras)

    #Print Answers (only if prnt parameter is true)
    if prnt:   
      print("\n")
      print("Question: ",q.strip())
      print("\n");
      print("Answer: ",FAQdf.iloc[sort_orders[0][0],0].strip())
      print("\n")
      #2nd and 3rd most similar sentences (fewer if the paragraph is short)
      print("Other Less Appropriate Answers:\n")
      for i in range(1,min(3,len(sort_orders))):
         print("\t=> ",FAQdf.iloc[sort_orders[i][0],0].strip())

    #Returning Predicted Answer
    return FAQdf.iloc[sort_orders[0][0],0].strip()      
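
The ranking step can also be done in a single matrix operation instead of one `cosine_similarity` call per sentence. The following is a sketch of that alternative rather than the code used in this notebook; `top_k_similar` is a hypothetical name, and it assumes the sentence embeddings have been stacked into one 2-D array with one row per sentence.

```python
import numpy as np


def top_k_similar(question_vec, sentence_matrix, k=3):
    """Return (row index, cosine similarity) pairs for the k most similar rows."""
    q = np.asarray(question_vec, dtype=float)
    m = np.asarray(sentence_matrix, dtype=float)
    norms = np.linalg.norm(m, axis=1) * np.linalg.norm(q)
    # Zero-norm rows (or a zero question vector) get similarity 0 instead of a division by zero.
    sims = np.divide(m @ q, norms, out=np.zeros(len(m)), where=norms > 0)
    order = np.argsort(-sims)[:k]  # slicing copes with fewer than k sentences
    return [(int(i), float(sims[i])) for i in order]


matrix = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
print(top_k_similar([1.0, 0.0], matrix, k=2))
```

Because the slice simply returns what is available, a paragraph with fewer than `k` sentences does not raise an `IndexError`.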

## Making Model

This function performs the question-answering task.

In [None]:
def FinalModel(Context, Question, prnt = True):
  Data = Context
  df = pd.DataFrame()

  #Making a dataframe containing all the sentences of the given context (paragraph).
  #Splitting the paragraph into sentences. nltk's sent_tokenize function separates sentences intelligently.
  df['Sentences'] = sent_tokenize(Data)
  cleaned_sentences=get_cleaned_sentences(df,stopwords=True)
  question = Question

  #Generating List of Sentence Embeddings
  sent_embeddings=[];
  for sent in cleaned_sentences:
      sent_embeddings.append(getPhraseEmbedding(sent,v2w_model));

  #Generating the sentence embedding of the question
  question_embedding=getPhraseEmbedding(question,v2w_model);

  #Printing The Result
  retrieveAndPrintFAQAnswer(question_embedding,sent_embeddings,df, cleaned_sentences, Question, prnt);


## Seeing the Model In Action

In [None]:
#Takes 3 inputs: context, question & whether to print the output (default True)
FinalModel('''Interest in natural language processing (NLP) began in earnest in 1950 when Alan Turing published his paper entitled “Computing Machinery and Intelligence,” from which the so-called Turing Test emerged. Turing basically asserted that a computer could be considered intelligent if it could carry on a conversation with a human being without the human realizing they were talking to a machine.
The goal of natural language processing is to allow that kind of interaction so that non-programmers can obtain useful information from computing systems. This kind of interaction was popularized in the 1968 movie “2001: A Space Odyssey” and in the Star Trek television series. Natural language processing also includes the ability to draw insights from data contained in emails, videos, and other unstructured material. “In the future,” writes Marc Maxson, “the most useful data will be the kind that was is too unstructured to be used in the past.” [“The future of big data is quasi-unstructured,” Chewy Chunks, 23 March 2013] Maxson believes, “The future of Big Data is neither structured nor unstructured. Big Data will be structured by intuitive methods (i.e., ‘genetic algorithms’), or using inherent patterns that emerge from the data itself and not from rules imposed on data sets by humans.
''', 'What are the abilities of Natural Language Processing?')



Question:  What are the abilities of Natural Language Processing?


Answer:  Natural language processing also includes the ability to draw insights from data contained in emails, videos, and other unstructured material.


Other Less Appropriate Answers:

	=>  The goal of natural language processing is to allow that kind of interaction so that non-programmers can obtain useful information from computing systems.
	=>  Interest in natural language processing (NLP) began in earnest in 1950 when Alan Turing published his paper entitled “Computing Machinery and Intelligence,” from which the so-called Turing Test emerged.


### Seeing the Time Cost of the Function

As you can see, the model takes 8-9 ms on average without printing the output and ~15 ms when printing it.

In [None]:
%%timeit

FinalModel('''Interest in natural language processing (NLP) began in earnest in 1950 when Alan Turing published his paper entitled “Computing Machinery and Intelligence,” from which the so-called Turing Test emerged. Turing basically asserted that a computer could be considered intelligent if it could carry on a conversation with a human being without the human realizing they were talking to a machine.
The goal of natural language processing is to allow that kind of interaction so that non-programmers can obtain useful information from computing systems. This kind of interaction was popularized in the 1968 movie “2001: A Space Odyssey” and in the Star Trek television series. Natural language processing also includes the ability to draw insights from data contained in emails, videos, and other unstructured material. “In the future,” writes Marc Maxson, “the most useful data will be the kind that was is too unstructured to be used in the past.” [“The future of big data is quasi-unstructured,” Chewy Chunks, 23 March 2013] Maxson believes, “The future of Big Data is neither structured nor unstructured. Big Data will be structured by intuitive methods (i.e., ‘genetic algorithms’), or using inherent patterns that emerge from the data itself and not from rules imposed on data sets by humans.
''', 'What are the abilities of Natural Language Processing?',False)

100 loops, best of 3: 6.07 ms per loop
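
`%%timeit` reports the best run, while a 50 ms budget is really about the typical and worst cases. Outside a notebook, those can be measured with the standard library. This is a sketch only: `time_call_ms` is a hypothetical helper, and `stand_in` is a placeholder for `FinalModel` so that the example runs on its own.

```python
import statistics
import time


def time_call_ms(func, *args, repeats=100, **kwargs):
    """Call func repeatedly and return (mean, worst) wall-clock time in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), max(samples)


def stand_in(context, question):
    """Placeholder for the real model: returns the first sentence of the context."""
    return context.split(".")[0]


mean_ms, worst_ms = time_call_ms(stand_in, "First sentence. Second sentence.", "Any question?", repeats=50)
print(f"mean {mean_ms:.4f} ms, worst {worst_ms:.4f} ms")
```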


In [None]:
FinalModel('''The Pentagon’s drones are an iconic symbol of war abroad, plane-sized matchsticks with wings lurking over cities and countrysides waiting for the moment routine patrol becomes un-routine. For the most part, the missions of those drones have remained abroad, but over the years the Department of Defense has flown drones a handful of times over the United States in support of civil authorities. From 2011 to 2017, the Pentagon reports just 11 total domestic drone missions.
But in 2018, that total doubled, with 11 domestic missions flown by military drones.
On Jan. 11, the Department of Defense published its 2018 statistics. The drones involved include everything from MQ-9 Reapers down to DJI Phantoms, and involvement in missions ranging from training exercises to border security and emergency response. (Notably, drone operations by the Department of Homeland Security are excluded from these statistics). These numbers are helpfully collected and contrasted with domestic drone use by the military from 2011 to 2017 by the Center for the Study of the Drone at Bard University.
In 2018, military MQ-9 Reapers flew five missions over the United States, four of which were in support of forest firefighting in California and Oregon. One Reaper mission, flown from May 7-10, was described as an incident and awareness exercise in the state of New York. RQ-11B Ravens flew two missions: one a base installation in Bangor, Kitsap, Washington, and the other a Defense Support of Civil Authorities mission in response to Hurricane Florence and requested by the South Carolina National Guard.
''', 'When did the Department of Defense publish its 2018 statistics?',True)



Question:  When did the Department of Defense publish its 2018 statistics?


Answer:  On Jan. 11, the Department of Defense published its 2018 statistics.


Other Less Appropriate Answers:

	=>  (Notably, drone operations by the Department of Homeland Security are excluded from these statistics).
	=>  For the most part, the missions of those drones have remained abroad, but over the years the Department of Defense has flown drones a handful of times over the United States in support of civil authorities.


In [None]:
FinalModel('''The Pentagon’s drones are an iconic symbol of war abroad, plane-sized matchsticks with wings lurking over cities and countrysides waiting for the moment routine patrol becomes un-routine. For the most part, the missions of those drones have remained abroad, but over the years the Department of Defense has flown drones a handful of times over the United States in support of civil authorities. From 2011 to 2017, the Pentagon reports just 11 total domestic drone missions.
But in 2018, that total doubled, with 11 domestic missions flown by military drones.
On Jan. 11, the Department of Defense published its 2018 statistics. The drones involved include everything from MQ-9 Reapers down to DJI Phantoms, and involvement in missions ranging from training exercises to border security and emergency response. (Notably, drone operations by the Department of Homeland Security are excluded from these statistics). These numbers are helpfully collected and contrasted with domestic drone use by the military from 2011 to 2017 by the Center for the Study of the Drone at Bard University.
In 2018, military MQ-9 Reapers flew five missions over the United States, four of which were in support of forest firefighting in California and Oregon. One Reaper mission, flown from May 7-10, was described as an incident and awareness exercise in the state of New York. RQ-11B Ravens flew two missions: one a base installation in Bangor, Kitsap, Washington, and the other a Defense Support of Civil Authorities mission in response to Hurricane Florence and requested by the South Carolina National Guard.
''', 'Who flew missions over the united states?',True)



Question:  Who flew missions over the united states?


Answer:  But in 2018, that total doubled, with 11 domestic missions flown by military drones.


Other Less Appropriate Answers:

	=>  For the most part, the missions of those drones have remained abroad, but over the years the Department of Defense has flown drones a handful of times over the United States in support of civil authorities.
	=>  In 2018, military MQ-9 Reapers flew five missions over the United States, four of which were in support of forest firefighting in California and Oregon.


### Summary

As you can see, the system I created is **very fast** & **fairly accurate** (considering the time limit given).

But it has 2 limitations.
- It can only answer in whole sentences.
- Its accuracy can be further improved.

To address these limitations I used another NLP model, a pre-implemented **transformer model called _BERT_**.

# Question-Answer Model Using BERT

This model tokenizes the given context (paragraph) & predicts **very accurate results**, even for complex questions.

But it takes somewhat more time than my previous question-answering system.

It takes **150 ms** on average to predict the answer, but the answers are to the point and more accurate.

### Installing Dependencies

In [None]:
!pip install cdqa

### Importing Libraries

In [None]:
import os
import pandas as pd
from ast import literal_eval

#cdQA utilities: paragraph filtering and the question-answering pipeline
from cdqa.utils.filters import filter_paragraphs
from cdqa.pipeline import QAPipeline



### Downloading the Required Datasets and Pretrained Model

In [None]:
from cdqa.utils.download import download_squad, download_model, download_bnpp_data

directory = '/content/drive/My Drive/Colab Projects/Lincode VIrtual Hackathon Submission/Database'

# Downloading data
download_squad(dir=directory)
download_bnpp_data(dir=directory)

# Downloading pre-trained DistilBERT fine-tuned on SQuAD 1.1
download_model('distilbert-squad_1.1', dir=directory)

Downloading SQuAD v1.1 data...
train-v1.1.json already downloaded
dev-v1.1.json already downloaded

Downloading SQuAD v2.0 data...
train-v2.0.json already downloaded
dev-v2.0.json already downloaded

Downloading BNP data...
bnpp_newsroom-v1.1.csv already downloaded

Downloading trained model...
distilbert_qa.joblib already downloaded


### Creating a List from the Paragraphs Given in Challenge-4

These paragraphs are used to fit the retriever.

In [None]:
Para_list = [ '''The Pentagon’s drones are an iconic symbol of war abroad, plane-sized matchsticks with wings lurking over cities and countrysides waiting for the moment routine patrol becomes un-routine. For the most part, the missions of those drones have remained abroad, but over the years the Department of Defense has flown drones a handful of times over the United States in support of civil authorities. From 2011 to 2017, the Pentagon reports just 11 total domestic drone missions.
But in 2018, that total doubled, with 11 domestic missions flown by military drones.
On Jan. 11, the Department of Defense published its 2018 statistics. The drones involved include everything from MQ-9 Reapers down to DJI Phantoms, and involvement in missions ranging from training exercises to border security and emergency response. (Notably, drone operations by the Department of Homeland Security are excluded from these statistics). These numbers are helpfully collected and contrasted with domestic drone use by the military from 2011 to 2017 by the Center for the Study of the Drone at Bard University.
In 2018, military MQ-9 Reapers flew five missions over the United States, four of which were in support of forest firefighting in California and Oregon. One Reaper mission, flown from May 7-10, was described as an incident and awareness exercise in the state of New York. RQ-11B Ravens flew two missions: one a base installation in Bangor, Kitsap, Washington, and the other a Defense Support of Civil Authorities mission in response to Hurricane Florence and requested by the South Carolina National Guard.'''
,

'''One of the most striking features of SARS-CoV-2, the virus that causes COVID-19, is the exceptional breadth of symptoms it causes in people. Of the nearly 30 million recorded infections to date, the vast majority of people experienced mild or moderate disease—which itself can range from no symptoms at all to pneumonia or long-term, debilitating neurological symptoms. A minority ended up with severe respiratory symptoms but eventually recovered. And some—nearly 940,000 worldwide, of which 196,000 are in the US—took a turn for the worse and died.
Why some people die while others recover is thought to depend in large part on the human immune response, which spirals out of control in severe disease. Over the past few months, researchers have developed a better understanding of this dysfunctional immune response. By comparing patients with varying degrees of disease severity, they’ve catalogued a number of dramatic changes across the human immune arsenal that are often apparent when patients first come into the hospital—from signaling cytokine proteins and first-responder cells of the innate immune system, to the B cells and T cells that confer pathogen-specific adaptive immunity.
The factors that trigger this immune dysregulation have so far remained elusive due to the complexity of the immune system, which consists of seemingly endless biological pathways that twist and turn and feed back on one another like a ball of spaghetti. But researchers—drawing on knowledge from other conditions such as sepsis, cancer, and autoimmune disease—are gradually building coherent theories of what puts patients en route to severe disease. Along the way, they’re also uncovering signals that clinicians could use to predict disease prognosis and identify potential new treatment avenues.'''
,
'''The grinning voice of John Denver caroling "Rocky Mountain High" may never again seem quite so innocent once you've consumed "Final Destination," the leaden teenage horror film in which the song is repeatedly used to announce the arrival of death (with a capital D).
The first time you hear the anthem by the perky folk-pop singer, who died in a plane crash, it is being piped over the sound system at Kennedy Airport minutes before Alex Browning (Devon Sawa), a jittery high school senior, is to board a jet for Paris on his class trip. For weeks, Alex has been having premonitions of disaster, and as he quakes with terror in a men's room stall, the Denver song sneaks into the background to taunt him with the reminder that what goes up must come down.
Once on the plane, Alex is seized by a fantasy (the movie's scariest scene) in which the aircraft, seconds after takeoff, shudders with a death rattle as an explosion rips through the cabin, creating pandemonium. Berserk with panic, Alex snaps out of his nightmare and screams that the plane is going to crash, even though it still hasn't left the gate.
Escorted back to the terminal, he ends up one of seven who stay behind. When the plane carrying most of his classmates finally takes off and seconds later explodes in midair, killing everyone aboard, Alex is shattered but not surprised.
The disaster and Alex's premonitions set up a heavy-handed fable about death and teenage illusions of invulnerability. Having cheated death, Alex and his six fellow survivors discover that the Grim Reaper is in a major snit. And for the rest of the movie, it sets about picking them off, one by one. But if you imagined that death would dispatch them as quickly and efficiently as possible, think again. Being a teenage horror film, "Final Destination" is not about to let anybody go gentle into that good night.'''
,


'''Interest in natural language processing (NLP) began in earnest in 1950 when Alan Turing published his paper entitled “Computing Machinery and Intelligence,” from which the so-called Turing Test emerged. Turing basically asserted that a computer could be considered intelligent if it could carry on a conversation with a human being without the human realizing they were talking to a machine.
The goal of natural language processing is to allow that kind of interaction so that non-programmers can obtain useful information from computing systems. This kind of interaction was popularized in the 1968 movie “2001: A Space Odyssey” and in the Star Trek television series. Natural language processing also includes the ability to draw insights from data contained in emails, videos, and other unstructured material. “In the future,” writes Marc Maxson, “the most useful data will be the kind that was is too unstructured to be used in the past.” [“The future of big data is quasi-unstructured,” Chewy Chunks, 23 March 2013] Maxson believes, “The future of Big Data is neither structured nor unstructured. Big Data will be structured by intuitive methods (i.e., ‘genetic algorithms’), or using inherent patterns that emerge from the data itself and not from rules imposed on data sets by humans.'''
]


### Creating a List of Lists from the List of Strings

The model needs a list of lists (one list of paragraphs per document) as input.

In [None]:
Sep_Paras = []
for i in range(len(Para_list)):
  Sep_Paras.append([Para_list[i]])
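
The same wrapping can be written as a list comprehension. A minimal sketch with hypothetical placeholder strings instead of the real paragraphs:

```python
# Hypothetical placeholder paragraphs standing in for Para_list.
paragraphs = ["first paragraph", "second paragraph"]

# Wrap every paragraph in its own single-element list.
wrapped = [[paragraph] for paragraph in paragraphs]
print(wrapped)
```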

In [None]:
Sep_Paras

[['The Pentagon’s drones are an iconic symbol of war abroad, plane-sized matchsticks with wings lurking over cities and countrysides waiting for the moment routine patrol becomes un-routine. For the most part, the missions of those drones have remained abroad, but over the years the Department of Defense has flown drones a handful of times over the United States in support of civil authorities. From 2011 to 2017, the Pentagon reports just 11 total domestic drone missions.\nBut in 2018, that total doubled, with 11 domestic missions flown by military drones.\nOn Jan. 11, the Department of Defense published its 2018 statistics. The drones involved include everything from MQ-9 Reapers down to DJI Phantoms, and involvement in missions ranging from training exercises to border security and emergency response. (Notably, drone operations by the Department of Homeland Security are excluded from these statistics). These numbers are helpfully collected and contrasted with domestic drone use by th

### Creating the DataFrame that we will give to the model

In [None]:
df = pd.DataFrame()
df['title'] = ['The Pentagon’s drones','Covid-19','The grinning voice of John Denver','natural language processing (NLP)']
df['paragraphs'] = Sep_Paras
print(df)

                               title                                         paragraphs
0              The Pentagon’s drones  [The Pentagon’s drones are an iconic symbol of...
1                           Covid-19  [One of the most striking features of SARS-CoV...
2  The grinning voice of John Denver  [The grinning voice of John Denver caroling "R...
3  natural language processing (NLP)  [Interest in natural language processing (NLP)...


### Initialising a Pipeline for the Question-Answering Model

The pipeline is created from a model called **DistilBERT** that is pretrained for question-answering tasks.

In [None]:
cdqa_pipeline = QAPipeline(reader='/content/drive/My Drive/Colab Projects/Lincode VIrtual Hackathon Submission/Database/distilbert_qa.joblib') 
#Fitting the retriever on Challenge-4's data
cdqa_pipeline.fit_retriever(df=df)

QAPipeline(reader=BertQA(adam_epsilon=1e-08,
                         bert_model='distilbert-base-uncased',
                         do_lower_case=True, fp16=False,
                         gradient_accumulation_steps=1, learning_rate=5e-05,
                         local_rank=-1, loss_scale=0, max_answer_length=30,
                         n_best_size=20, no_cuda=False,
                         null_score_diff_threshold=0.0, num_train_epochs=3.0,
                         output_dir=None, predict_batch_size=8, seed=42,
                         server_ip='', ser...size=8,
                         verbose_logging=False, version_2_with_negative=False,
                         warmup_proportion=0.1, warmup_steps=0),
           retrieve_by_doc=False,
           retriever=BM25Retriever(b=0.75, floor=None, k1=2.0, lowercase=True,
                                   max_df=0.85, min_df=2, ngram_range=(1, 2),
                                   preprocessor=None, stop_words='english',
           

## Predicting Answers from the Data Given in Challenge-4

Gives answers from the pretrained model, with the retriever fitted on Challenge-4's data.

In [None]:
cdqa_pipeline.predict(query='What is the name of the virus that causes Covid-19')[0]

'SARS-CoV-2'

In [None]:
cdqa_pipeline.predict(query='What are the abilities of Natural Language Processing?')[0]

'ability to draw insights from data contained in emails, videos, and other unstructured material'

### Seeing Time Cost

It takes 145 ms on average: a bit more costly than my previous implementation, but much more accurate.

In [None]:
%timeit cdqa_pipeline.predict(query='What is the name of the virus that causes Covid-19')[0]

### For Custom Testing Paragraphs

In [None]:
cdqa_pipeline = QAPipeline(reader='/content/drive/My Drive/Colab Projects/Lincode VIrtual Hackathon Submission/Database/distilbert_qa.joblib') 

In [None]:
def Predict(Paragraph, Question, PipeLine):
  #Splitting given paragraph into list of sentences.
  Para_list = sent_tokenize(Paragraph)
  
  #Wrapping each sentence in its own list, so that every row of the dataframe holds a list of paragraphs
  Sep_Paras = []
  for i in range(len(Para_list)):
    Sep_Paras.append([Para_list[i]])

  print("Fitting Paragraph...")
  df = pd.DataFrame()
  df['title'] = [i for i in range(len(Sep_Paras))] #giving indexing
  df['paragraphs'] = Sep_Paras
  
  #Fitting the retriever of the pipeline passed in (not the global one) to the newly created dataframe
  PipeLine.fit_retriever(df=df)

  print("Predicting Answer...")
  print("Question: ",Question)
  print("Answer: ",PipeLine.predict(query=Question)[0])


In [None]:
Predict('''The Pentagon’s drones are an iconic symbol of war abroad, plane-sized matchsticks with wings lurking over cities and countrysides waiting for the moment routine patrol becomes un-routine. For the most part, the missions of those drones have remained abroad, but over the years the Department of Defense has flown drones a handful of times over the United States in support of civil authorities. From 2011 to 2017, the Pentagon reports just 11 total domestic drone missions.
But in 2018, that total doubled, with 11 domestic missions flown by military drones.
On Jan. 11, the Department of Defense published its 2018 statistics. The drones involved include everything from MQ-9 Reapers down to DJI Phantoms, and involvement in missions ranging from training exercises to border security and emergency response. (Notably, drone operations by the Department of Homeland Security are excluded from these statistics). These numbers are helpfully collected and contrasted with domestic drone use by the military from 2011 to 2017 by the Center for the Study of the Drone at Bard University.
In 2018, military MQ-9 Reapers flew five missions over the United States, four of which were in support of forest firefighting in California and Oregon. One Reaper mission, flown from May 7-10, was described as an incident and awareness exercise in the state of New York. RQ-11B Ravens flew two missions: one a base installation in Bangor, Kitsap, Washington, and the other a Defense Support of Civil Authorities mission in response to Hurricane Florence and requested by the South Carolina National Guard.
''',"When did the Department of Defense publish its 2018 statistics?", cdqa_pipeline)

Fitting Paragraph...
Predicting Answer...
Question:  When did the Department of Defense publish its 2018 statistics?
Answer:  Jan. 11


In [None]:
%%timeit

Predict('''The Pentagon’s drones are an iconic symbol of war abroad, plane-sized matchsticks with wings lurking over cities and countrysides waiting for the moment routine patrol becomes un-routine. For the most part, the missions of those drones have remained abroad, but over the years the Department of Defense has flown drones a handful of times over the United States in support of civil authorities. From 2011 to 2017, the Pentagon reports just 11 total domestic drone missions.
But in 2018, that total doubled, with 11 domestic missions flown by military drones.
On Jan. 11, the Department of Defense published its 2018 statistics. The drones involved include everything from MQ-9 Reapers down to DJI Phantoms, and involvement in missions ranging from training exercises to border security and emergency response. (Notably, drone operations by the Department of Homeland Security are excluded from these statistics). These numbers are helpfully collected and contrasted with domestic drone use by the military from 2011 to 2017 by the Center for the Study of the Drone at Bard University.
In 2018, military MQ-9 Reapers flew five missions over the United States, four of which were in support of forest firefighting in California and Oregon. One Reaper mission, flown from May 7-10, was described as an incident and awareness exercise in the state of New York. RQ-11B Ravens flew two missions: one a base installation in Bangor, Kitsap, Washington, and the other a Defense Support of Civil Authorities mission in response to Hurricane Florence and requested by the South Carolina National Guard.
''',"When did the Department of Defense publish its 2018 statistics?", cdqa_pipeline)

### Summary

As you can see, this model gives very accurate answers. But at the same time it has a much higher time cost (~145 ms when the retriever is already fitted and ~2 s when it first has to be fitted on a new paragraph).



# Final Summary

So if I want **quick real-time answers under 50 ms** I will predict **using the first model**. But if I want more accuracy I will use **the second model** (DistilBERT).
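
That choice can be expressed as a small dispatcher on the latency budget. This is a hypothetical sketch: `choose_model` and the returned labels are illustrative names, and the default costs are only the rough timings reported in this notebook (about 10 ms and about 150 ms).

```python
def choose_model(latency_budget_ms, fast_cost_ms=10.0, accurate_cost_ms=150.0):
    """Pick the most accurate model whose approximate cost fits the latency budget."""
    if latency_budget_ms >= accurate_cost_ms:
        return "distilbert"
    if latency_budget_ms >= fast_cost_ms:
        return "word2vec"
    raise ValueError("no model fits a budget of %s ms" % latency_budget_ms)


print(choose_model(50))    # the Challenge-4 budget
print(choose_model(200))   # a looser budget
```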

Hope you like my Submission.