# Tutorial: Generative QA with Retrieval Augmented Generation

In this tutorial, you'll learn how to run generative question answering by connecting a retriever to a generative LLM. You'll also learn how to use prompts to steer the generative model's answers. When the retrieved passages contain no supporting evidence, the system should respond with something like "Unanswerable".

This tutorial works with most models on the HuggingFace model hub as well as OpenAI LLMs. Supported models include:
 - FLAN UL2-20B
 - FLAN T5
 - OpenAI ChatGPT (gpt-3.5-turbo)
 - InstructGPT (text-davinci-003)
 - and many more

# Installing PrimeQA
First, we need to install the required packages.

In [None]:
%%bash

pip install --upgrade pip
pip install primeqa

## Initialize the Retriever

### Pre-process your document collection so it is ready to be stored in your Neural Search Index.
In this step, we download a publicly available .csv file from Google Drive and save it as a .tsv file.

In [None]:
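# Save your input documents as a .tsv file
import pandas as pd

# Google Drive share link for the sample .csv file
share_url = 'https://drive.google.com/file/d/1LULJRPgN_hfuI2kG-wH4FUwXCCdDh9zh/view?usp=sharing'
# Build a direct-download link from the file ID (the second-to-last '/'-separated segment)
download_url = 'https://drive.google.com/uc?id=' + share_url.split('/')[-2]
df = pd.read_csv(download_url)
# Write the table back out with tab separators
df.to_csv('input.tsv', sep='\t')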
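
The cell above turns a Google Drive share link into a direct-download link with a plain string split. The sketch below does the same thing with `urllib.parse`, which is a little more robust if the link carries extra query parameters. The helper name `drive_share_to_direct` is made up for this illustration and is not part of PrimeQA or pandas.

```python
from urllib.parse import urlparse

def drive_share_to_direct(share_url: str) -> str:
    """Turn a Google Drive '/file/d/<id>/view' share link into a direct-download link."""
    # The URL path looks like /file/d/<file_id>/view, so the ID follows the 'd' segment.
    parts = urlparse(share_url).path.strip("/").split("/")
    file_id = parts[parts.index("d") + 1]
    return "https://drive.google.com/uc?id=" + file_id

share_url = "https://drive.google.com/file/d/1LULJRPgN_hfuI2kG-wH4FUwXCCdDh9zh/view?usp=sharing"
direct_url = drive_share_to_direct(share_url)
print(direct_url)
```

The resulting link can be passed to `pd.read_csv` in the same way as in the cell above.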

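### Initialize the model

In PrimeQA, we use the SearchableCorpus class to search through your corpus.

For DPR, you need to point to question and context encoder models available on the HuggingFace model hub. In the cell below, `model_name` takes the context encoder and `query_encoder_model_name_or_path` takes the question encoder.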

In [None]:
from primeqa.util import SearchableCorpus
retriever = SearchableCorpus(model_name="PrimeQA/XOR-TyDi_monolingual_DPR_ctx_encoder", 
                              query_encoder_model_name_or_path="PrimeQA/XOR-TyDi_monolingual_DPR_qry_encoder", 
                              batch_size=64, top_k=10)
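
To give a feel for what `top_k` means in a dense retriever such as DPR, here is a minimal, library-free sketch: the question and every passage are represented as vectors, each passage is scored by its inner product with the question vector, and the `k` best passages are kept. The vectors are made up for illustration, and the function names are hypothetical; this is not how SearchableCorpus is implemented internally, only the general idea.

```python
def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def top_k_by_inner_product(query_vec, passage_vecs, k):
    """Return (passage_index, score) pairs for the k highest-scoring passages."""
    scored = [(inner_product(query_vec, p), i) for i, p in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest score first
    return [(i, score) for score, i in scored[:k]]

# Made-up 3-dimensional embeddings; a real encoder produces much larger vectors.
passage_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [1.0, 0.2, 0.0]

hits = top_k_by_inner_product(query_vec, passage_vecs, k=2)
print([index for index, _ in hits])  # passages 0 and 2 score highest
```

In a real system, the two encoders produce the vectors and an index makes the search fast; the ranking principle stays the same.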

### Add your documents into the searchable corpus through PrimeQA's built-in pre-processor.

PrimeQA has a built-in class called DocumentCollection which pre-processes input.tsv to match the following format as needed by DPR:

`id \t text \t title_of_document`

Note: DPR is based on an encoder language model, which typically accepts at most 512 sub-word tokens per sequence. Make sure your documents are split into passages of roughly 220 words each.
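
If your documents are longer than that, one simple option is to cut them into fixed-size word windows before writing the .tsv file. The sketch below is one way to do it, under the assumption that a whitespace word count is a good enough proxy for sub-word tokens. The helpers `split_into_passages` and `to_tsv_rows` are hypothetical names for this example, not PrimeQA functions, and no header row is written, so check whether your loader expects one.

```python
def split_into_passages(text, max_words=220):
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def to_tsv_rows(documents, max_words=220):
    """Render (title, text) pairs as 'id<TAB>text<TAB>title' rows, one row per passage."""
    rows = []
    for title, text in documents:
        clean_title = " ".join(title.split())  # drop tabs/newlines that would break a row
        for passage in split_into_passages(text, max_words):
            rows.append(f"{len(rows) + 1}\t{passage}\t{clean_title}")
    return rows

# A made-up 500-word document is cut into passages of 220, 220, and 60 words.
long_text = " ".join(["word"] * 500)
rows = to_tsv_rows([("Example title", long_text)])
print(len(rows))
print([len(row.split("\t")[1].split()) for row in rows])
```

Splitting on word counts can cut sentences in half; splitting on sentence or paragraph boundaries is a reasonable refinement if answer quality suffers.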

In [None]:
from primeqa.ir.util.corpus_reader import DocumentCollection
doc_collection = DocumentCollection("input.tsv")

retriever.add_documents(doc_collection.get_processed_collection())

# Initialize the Reader 

In this step, you set up a generative LLM that can be prompted. The reader can be any generative model available on the HuggingFace model hub or an OpenAI model.

In [None]:
from primeqa.components.reader import GenerativeReader

reader = GenerativeReader(model_type='HuggingFace', model_name='google/flan-ul2')
# To set up an OpenAI generative reader instead (gpt-3.5-turbo and text-davinci-003 are supported):
# reader = GenerativeReader(model_type='openai', model_name='gpt-3.5-turbo', api_key='API KEY HERE')


# Set up the RAG pipeline

Attach a retriever to a generative LLM. You can then prompt it to answer questions.

In [None]:
from primeqa.pipelines import RAG
pipeline = RAG(retriever, reader)
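
Conceptually, a retrieval-augmented pipeline does two things per question: fetch passages, then hand the question and passages to a generator. The toy sketch below shows that control flow with stand-in functions. Everything in it (`run_rag`, `toy_retrieve`, `toy_generate`, the two-sentence corpus, and the shape of the result dictionaries) is invented for illustration and says nothing about how PrimeQA's RAG class is implemented or what it returns.

```python
from typing import Callable, Dict, List

def run_rag(questions: List[str],
            retrieve: Callable[[str], List[str]],
            generate: Callable[[str, List[str]], str]) -> List[Dict]:
    """For each question: retrieve passages, then generate an answer from them."""
    results = []
    for question in questions:
        passages = retrieve(question)
        answer = generate(question, passages)
        results.append({"question": question, "passages": passages, "answer": answer})
    return results

CORPUS = [
    "Paris is the capital of France.",
    "The Nile flows through Egypt.",
]

def content_words(text: str) -> set:
    """Lowercased words longer than three characters, with punctuation stripped."""
    stripped = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in stripped if len(w) > 3}

def toy_retrieve(question: str) -> List[str]:
    """Stand-in retriever: keep passages that share a content word with the question."""
    return [p for p in CORPUS if content_words(question) & content_words(p)]

def toy_generate(question: str, passages: List[str]) -> str:
    """Stand-in reader: echo the top passage, or give up when nothing was retrieved."""
    return passages[0] if passages else "Unanswerable"

results = run_rag(["What is the capital of France?", "Who wrote Hamlet?"],
                  toy_retrieve, toy_generate)
print([r["answer"] for r in results])
```

Keeping the retriever and the reader behind separate interfaces is what lets you swap either one without touching the other.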

# Start asking questions

We run the pipeline we just created and pass a prompt prefix along with the questions.

In [None]:
import json

questions = ['When was Idaho split in two?', 'Who was Danny Nozel?']
prompt_prefix = "Answer the following question after looking at the text."
answers = pipeline.run(questions, prefix=prompt_prefix)
print(json.dumps(answers, indent=4))
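
To see how a prompt prefix, retrieved passages, and a question might fit together, the sketch below assembles them into a single prompt string and adds an instruction to answer "Unanswerable" when the text lacks evidence. The `build_prompt` helper and its layout are assumptions made for illustration; PrimeQA may format its prompts differently, and a model is not guaranteed to follow the fallback instruction.

```python
def build_prompt(prefix, passages, question, fallback="Unanswerable"):
    """Combine an instruction prefix, retrieved passages, and a question into one prompt."""
    instruction = f'{prefix} If the text does not contain the answer, reply "{fallback}".'
    # Number the passages so the model (and the reader of the prompt) can tell them apart.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return f"{instruction}\n\nText:\n{context}\n\nQuestion: {question}\nAnswer:"

prompt_prefix = "Answer the following question after looking at the text."
passages = ["Paris is the capital of France.", "The Nile flows through Egypt."]
prompt = build_prompt(prompt_prefix, passages, "What is the capital of France?")
print(prompt)
```

A longer prefix that already contains the fallback instruction could be passed as `prefix` to `pipeline.run` in the same way as `prompt_prefix` in the cell above.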