<a href="https://colab.research.google.com/github/SohailaDiab/Question-Generation-and-Answering/blob/main/Question_Answering_HayStack.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1. Installations

In [None]:
%%bash
pip install --upgrade pip
pip install git+https://github.com/deepset-ai/haystack.git#egg=farm-haystack[colab]

# 2. Imports

In [None]:
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import TfidfRetriever, TransformersReader
from haystack.pipelines import ExtractiveQAPipeline
from haystack.utils import print_answers  # helper for displaying predictions

import pandas as pd

# 3. Load CSV File

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
df = pd.read_csv('/content/drive/MyDrive/DeepLearning_allen.csv')

In [None]:
df.head()

Unnamed: 0,phrase,t5_question,t5_answer,allen_answer,haystack_answer
0,no universally agreedupon threshold of depth divides shallow learning from d...,What is the term for a cap depth higher than cap of depth?,universal approximator,universal approximator,neural network
1,no universally agreedupon threshold of depth divides shallow learning from d...,Is there a universally agreed upon threshold of depth that divides shallow l...,no,no universally agreedupon threshold of depth,deep models cap are able to extract better features than shallow models
2,deep learning is modern variation which is concerned with an unbounded numbe...,What is modern variation concerned with an unbounded number of layers of bou...,deep learning,deep learning,deep learning
3,deep learning is modern variation which is concerned with an unbounded numbe...,Under what conditions does deep learning retain theoretical universality?,mild,mild conditions,mild
4,in deep learning the layers are also permitted to be heterogeneous and to de...,What are layers allowed to deviate widely from biologically informed connect...,the structured part,to be heterogeneous,heterogeneous


In [None]:
phraselist = list(dict.fromkeys(df['phrase'].values))
print(phraselist)

['no universally agreedupon threshold of depth divides shallow learning from deep learning, but most researchers agree that deep learning involves cap depth higher than cap of depth has been shown to be universal approximator in the sense that it can emulate any function.', 'deep learning is modern variation which is concerned with an unbounded number of layers of bounded size, which permits practical application and optimized implementation, while retaining theoretical universality under mild conditions.', 'in deep learning the layers are also permitted to be heterogeneous and to deviate widely from biologically informed connectionist models, for the sake of efficiency, trainability and understandability, hence the structured part.', 'deep learning also known as deep structured learning is part of broader family of machine learning methods based on artificial neural networks with representation learning.', 'definition\ndeep learning is class of machine learning algorithms that\u200aâ€
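The `list(dict.fromkeys(...))` idiom above removes repeated phrases while keeping them in their original order, which a plain `set` would not guarantee. A minimal sketch of that behavior, using a short hypothetical list in place of `df['phrase'].values`:

```python
# Hypothetical stand-in for df['phrase'].values: each phrase repeats once
# per question that was generated from it.
phrases_with_repeats = [
    "deep learning is modern variation",
    "deep learning is modern variation",
    "in deep learning the layers are heterogeneous",
    "deep learning is modern variation",
]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so converting back to a list gives an order-preserving de-duplication.
unique_phrases = list(dict.fromkeys(phrases_with_repeats))
print(unique_phrases)
```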

# 4. Question Answering

> Use the questions generated by the T5 model, together with the extracted phrases, to generate answers.

[AllenNLP Model Notebook](https://colab.research.google.com/drive/1Z36CF3CeNGY4IV7XmhH8Hw8hfrKEzecb?usp=sharing)

## HayStack

### Pipeline

In [None]:
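# Join with a space so the last word of one phrase does not fuse with the first word of the next
phrases = ' '.join(phraselist)
# All phrases are stored as one single document
doc = [{"content": phrases}]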

document_store2 = InMemoryDocumentStore()
document_store2.write_documents(doc)

reader = TransformersReader(
    model_name_or_path="distilbert-base-uncased-distilled-squad",
    tokenizer="distilbert-base-uncased",
    use_gpu=-1,  # negative value requests CPU inference (integer-style flag)
)
retriever = TfidfRetriever(document_store=document_store2)

pipe = ExtractiveQAPipeline(reader, retriever)
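In this pipeline the retriever first selects candidate documents for a query, and the reader then extracts an answer span from them. The sketch below illustrates the general idea of TF-IDF ranking in plain Python. It is not Haystack's implementation: the helper names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`), the smoothed IDF formula, and the three sample documents (shortened from the phrases above) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed variant; other formulas are common too)
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in doc_freq.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs, top_k=3):
    """Return up to top_k (document, score) pairs, best match first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the corpus carry no weight
    q_vec = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = sorted(
        ((cosine(q_vec, vec), i) for i, vec in enumerate(vectors)), reverse=True
    )
    return [(docs[i], score) for score, i in scored[:top_k]]


# Illustrative documents, shortened from the phrases in this notebook
docs = [
    "deep learning is part of broader family of machine learning methods",
    "the layers are permitted to be heterogeneous for the sake of efficiency",
    "retaining theoretical universality under mild conditions",
]
results = rank("under what conditions is universality retained", docs)
print(results[0][0])
```

Ranking only has an effect when the store holds several documents. The cell above writes all phrases as one document, so the retriever there appears to have a single candidate to return regardless of `top_k`.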

### Prediction

In [None]:
prediction = pipe.run(
    query="", params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
)

In [None]:
prediction['answers'][0].answer

'deep learning'

In [None]:
hay_answers = []
# Ask every T5-generated question and keep only the top-ranked answer
for question in df['t5_question']:
    prediction = pipe.run(
        query=question,
        params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}},
    )
    hay_answers.append(prediction['answers'][0].answer)

In [None]:
print(hay_answers)

['neural network', 'deep models cap are able to extract better features than shallow models', 'deep learning', 'mild', 'heterogeneous', 'deep structured learning', 'artificial neural networks', 'deep models cap', 'deep learning', 'compact intermediate representations', 'deep learning', 'the number of layers through which the data is transformed', 'deep in deep learning', 'the use of multiple layers in the network', 'artificial neural networks', 'artificial neural networks', 'deep learning', 'rina dechter', 'artificial neural', 'rina dechter', 'static and symbolic', 'structured', 'representation learning', 'artificial neural networks', 'deep belief networks', 'deep belief networks', 'frank rosenblatt', 'multiple', 'deep belief networks', 'deep belief networks', 'greedy layerbylayer method', 'deep belief networks', 'deep belief networks', 'networks', 'networks', 'deeplearning architectures', 'deep models', 'dropout', 'deep models', 'eight', 'deep learning', 'edges', 'handtuning', 'more l
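One simple way to compare these answers with the T5 answers is an exact-match rate. The notebook does not compute this itself; the sketch below is an assumed illustration, and the helper names `normalize` and `exact_match_rate` are hypothetical. The sample values are the five `t5_answer` and `haystack_answer` pairs shown in the `df.head()` output.

```python
def normalize(answer):
    """Lowercase and trim whitespace so trivial differences are ignored."""
    return answer.strip().lower()


def exact_match_rate(reference_answers, predicted_answers):
    """Fraction of positions where both answers are identical after normalizing."""
    pairs = list(zip(reference_answers, predicted_answers))
    if not pairs:
        return 0.0
    matches = sum(normalize(ref) == normalize(pred) for ref, pred in pairs)
    return matches / len(pairs)


# First five rows of t5_answer and haystack_answer from df.head()
t5_answers = [
    "universal approximator",
    "no",
    "deep learning",
    "mild",
    "the structured part",
]
haystack_answers = [
    "neural network",
    "deep models cap are able to extract better features than shallow models",
    "deep learning",
    "mild",
    "heterogeneous",
]

rate = exact_match_rate(t5_answers, haystack_answers)
print(f"Exact-match rate on the first five rows: {rate:.2f}")
```

Exact match is strict: "mild" and "mild conditions" would count as a mismatch, so a token-overlap measure may be a better fit when answers differ only in span length.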

In [None]:
df['haystack_answer'] = hay_answers

In [None]:
df.head()

Unnamed: 0,phrase,t5_question,t5_answer,allen_answer,haystack_answer
0,no universally agreedupon threshold of depth divides shallow learning from d...,What is the term for a cap depth higher than cap of depth?,universal approximator,universal approximator,neural network
1,no universally agreedupon threshold of depth divides shallow learning from d...,Is there a universally agreed upon threshold of depth that divides shallow l...,no,no universally agreedupon threshold of depth,deep models cap are able to extract better features than shallow models
2,deep learning is modern variation which is concerned with an unbounded numbe...,What is modern variation concerned with an unbounded number of layers of bou...,deep learning,deep learning,deep learning
3,deep learning is modern variation which is concerned with an unbounded numbe...,Under what conditions does deep learning retain theoretical universality?,mild,mild conditions,mild
4,in deep learning the layers are also permitted to be heterogeneous and to de...,What are layers allowed to deviate widely from biologically informed connect...,the structured part,to be heterogeneous,heterogeneous


In [None]:
# path = '/content/drive/My Drive/DeepLearning_qna.csv'

# with open(path, 'w', encoding = 'utf-8-sig') as f:
#   df.to_csv(f, index = False)