# Open-retrieval Conversational Question Answering
Based on the paper _Open-Retrieval Conversational Question Answering_ by _Qu et al_.

Since ConverSE is built upon Haystack, this notebook is very similar to the original Haystack notebook on Dense Passage Retrieval: https://colab.research.google.com/github/deepset-ai/haystack/blob/master/tutorials/Tutorial6_Better_Retrieval_via_DPR.ipynb#scrollTo=kFwiPP60A6N7

## Prepare environment

In [None]:
# Make sure you have a GPU running
!nvidia-smi


!pip install git+https://github.com/deepset-ai/haystack.git # Install the latest master of Haystack
!pip install git+https://github.com/giguru/converse.git  # Install the latest master of Converse

In [None]:
from haystack import Finder
from haystack.preprocessor.cleaning import clean_wiki_text
from haystack.preprocessor.utils import convert_files_to_dicts, fetch_archive_from_http
from haystack.reader.farm import FARMReader
from haystack.reader.transformers import TransformersReader
from haystack.utils import print_answers

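# The ConverSE imports below rebind FARMReader and TransformersReader,
# so the Haystack classes imported above under the same names are no longer used.
from converse.src.reader.farm import FARMReader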
from converse.src.reader.transformers import TransformersReader
from converse.src.retriever.dense_passage_retriever import DensePassageRetriever
from converse.src.converse import Converse

## Indexer and data

In [None]:
# Add the document collection to a DocumentStore. The original text will be indexed.
# Conversion into embeddings is done below.
from haystack.document_store.faiss import FAISSDocumentStore
document_store = FAISSDocumentStore()


In [None]:
from converse.src.retriever.dense_passage_retriever import DensePassageRetriever
retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",  # TODO replace with ORConvQA model
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",  # TODO replace with ORConvQA model
    use_gpu=True,
    embed_title=True,
    max_seq_len=256,
    batch_size=16,
    remove_sep_tok_from_untitled_passages=True
)
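The `embed_title` and `remove_sep_tok_from_untitled_passages` options control how a passage and its title are combined before encoding. The sketch below illustrates the behaviour as described in Haystack's DPR documentation, on the assumption that ConverSE's retriever follows it. `build_passage_input` is a hypothetical helper, not part of either library, and it works on plain strings rather than real tokenizer output.

```python
from typing import Optional

CLS, SEP = "[CLS]", "[SEP]"


def build_passage_input(
    text: str,
    title: Optional[str] = None,
    embed_title: bool = True,
    remove_sep_tok_from_untitled_passages: bool = True,
) -> str:
    """Illustrative only: show the special-token layout of a passage input."""
    if not embed_title:
        # Titles are ignored entirely: encode the passage as a single text.
        return f"{CLS} {text} {SEP}"
    if title:
        # Title and passage are encoded as a text pair.
        return f"{CLS} {title} {SEP} {text} {SEP}"
    if remove_sep_tok_from_untitled_passages:
        # No title available: fall back to the single-text layout.
        return f"{CLS} {text} {SEP}"
    # No title available: keep the pair layout with an empty title.
    return f"{CLS} {SEP} {text} {SEP}"


titled = build_passage_input("Paris is the capital of France.", title="Paris")
untitled = build_passage_input("Paris is the capital of France.")
untitled_pair = build_passage_input(
    "Paris is the capital of France.",
    remove_sep_tok_from_untitled_passages=False,
)
print(titled)
print(untitled)
print(untitled_pair)
```

With the settings used in the cell above, untitled passages would be encoded like passages embedded without a title, rather than as a pair with an empty first segment.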

## Embed passages
Since retrieval is done on the embeddings, the embedding representation of the documents needs to be computed.
This only needs to be done once.

In [None]:
# document_store.update_embeddings(retriever)
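To make "retrieval on the embeddings" concrete, here is a minimal sketch of dense retrieval by dot-product similarity, the scoring function used by DPR. The vectors are tiny hand-written stand-ins for real encoder outputs, and `retrieve` is a hypothetical helper. In the notebook itself, this step is handled by the retriever and the FAISS index.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Dot product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(
    query_embedding: List[float],
    passage_embeddings: Dict[str, List[float]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every passage against the query and return the top_k best."""
    scored = [
        (passage_id, dot(query_embedding, embedding))
        for passage_id, embedding in passage_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Passage embeddings are computed once and stored; only the query
# is embedded at question time.
passage_embeddings = {
    "p1": [1.0, 0.0],
    "p2": [0.5, 0.5],
    "p3": [0.0, 1.0],
}
results = retrieve([1.0, 0.2], passage_embeddings, top_k=2)
print(results)
```

This brute-force loop scores every passage for every query. An index such as FAISS exists to make the same nearest-neighbour lookup fast on large collections.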

In [None]:
# Load a local model or any of the QA models on Hugging Face's model hub (https://huggingface.co/models)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=True)

In [None]:
finder = Converse(reader, [retriever])

## Evaluate pipeline

In [None]:
# Evaluate combination of Reader and Retriever through Finder
finder_eval_results = finder.eval(top_k_retriever=1, top_k_reader=10)
finder.print_eval_results(finder_eval_results)
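As a rough guide to reading results of this kind, the sketch below computes two metrics commonly reported for retriever-reader pipelines: retriever recall at k and reader exact match. It is an illustration on made-up examples, not the implementation behind `finder.eval`, and all function names and data in it are hypothetical.

```python
from typing import Dict, List


def recall_at_k(retrieved_ids: List[str], relevant_id: str, k: int) -> float:
    """1.0 if the relevant passage is among the top k retrieved, else 0.0."""
    return 1.0 if relevant_id in retrieved_ids[:k] else 0.0


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing answers."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized prediction equals the normalized gold answer."""
    return 1.0 if normalize(prediction) == normalize(gold) else 0.0


def evaluate(examples: List[Dict], k: int) -> Dict[str, float]:
    """Average both metrics over all examples."""
    n = len(examples)
    recall = sum(recall_at_k(e["retrieved"], e["relevant"], k) for e in examples) / n
    em = sum(exact_match(e["prediction"], e["gold"]) for e in examples) / n
    return {"recall@k": recall, "exact_match": em}


# Made-up examples: ranked passage ids, the relevant id, and the answers.
examples = [
    {"retrieved": ["d3", "d1"], "relevant": "d1", "prediction": "Paris", "gold": "paris"},
    {"retrieved": ["d2", "d4"], "relevant": "d5", "prediction": "1990", "gold": "1989"},
]
metrics_k1 = evaluate(examples, k=1)
metrics_k2 = evaluate(examples, k=2)
print(metrics_k1, metrics_k2)
```

In the first example the relevant passage is ranked second, so it counts as a miss at k=1 and a hit at k=2. A small `top_k_retriever` value therefore gives the reader fewer chances to see the right passage.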