# Tutorial: Generative QA with Retrieval Augmented Generation

In this tutorial, you'll learn how to run generative question answering by connecting a retriever to a generative LLM. You'll also learn how to use prompts with a generative model to tune your answers. The system should also generate a response like "Unanswerable" if no evidence is found.

You can run this tutorial with most models on the HuggingFace model hub as well as OpenAI LLMs. Supported models include:
 - FLAN UL2-20B
 - FLAN T5
 - OpenAI ChatGPT (gpt-3.5-turbo)
 - InstructGPT (text-davinci-003)
 - and many more
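
The "Unanswerable" behaviour described in the introduction can be sketched without any library. This is only an illustration, not PrimeQA code: `answer_with_fallback`, `fake_llm`, and the rule that an empty passage list means "no evidence" are all assumptions made for the example.

```python
from typing import Callable, List


def answer_with_fallback(
    question: str,
    passages: List[str],
    generate: Callable[[str], str],
) -> str:
    """Return a generated answer, or "Unanswerable" when no evidence was retrieved."""
    # Assumed rule: empty or whitespace-only passages count as "no evidence".
    evidence = [p for p in passages if p.strip()]
    if not evidence:
        return "Unanswerable"
    # Put the evidence in front of the question so the model can condition on it.
    prompt = "\n".join(evidence) + "\nQuestion: " + question
    return generate(prompt)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call: it only reports how much text it received."""
    return f"(answer based on {len(prompt)} characters of prompt)"


print(answer_with_fallback("Who wrote Hamlet?", [], fake_llm))
print(answer_with_fallback("Who wrote Hamlet?", ["Hamlet is a play by Shakespeare."], fake_llm))
```

A real system might instead apply a retrieval-score threshold or let the LLM itself decide; the empty-list rule is simply the easiest variant to show.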

# Preparing a Colab Environment to run this tutorial
Make sure to enable a GPU runtime: in Colab, open Runtime > Change runtime type and select a GPU as the hardware accelerator. TODO: link to a page with screenshots showing how to do this.
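
To confirm that the GPU runtime is actually active, a quick check like the one below may help. It is a best-effort sketch that assumes an NVIDIA GPU exposed through the `nvidia-smi` command-line tool; `gpu_available` is a hypothetical helper name, not part of PrimeQA.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Best-effort check for an NVIDIA GPU by calling the `nvidia-smi` tool."""
    if shutil.which("nvidia-smi") is None:
        return False  # the tool is not on PATH, so no NVIDIA driver is visible
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # "-L" lists the detected GPUs, one per line
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU runtime detected" if gpu_available() else "No GPU detected")
```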

# Installing PrimeQA
First, we need to install the required packages.

In [None]:
%%bash
pip install --upgrade pip

pip install primeqa

# Pre-processing your document collection

Pre-process your document collection so that it is ready to be stored in your neural search index.

TODO: add steps here to ingest the sample Wikipedia documents.
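
Until the ingestion steps are written, the following sketch shows one common form of pre-processing: splitting each document into overlapping passages before indexing. Everything here is an assumption for illustration, including the function name `split_into_passages`, the `id`/`title`/`text` field names, and the window sizes; it does not describe the format PrimeQA expects.

```python
from typing import Dict, List


def split_into_passages(
    doc_id: str,
    title: str,
    text: str,
    max_words: int = 100,
    overlap: int = 20,
) -> List[Dict[str, str]]:
    """Split one document into overlapping word windows."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    passages: List[Dict[str, str]] = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        passages.append({
            "id": f"{doc_id}-{len(passages)}",
            "title": title,
            "text": " ".join(window),
        })
        if start + max_words >= len(words):
            break  # this window already reached the end of the document
    return passages


sample = "word " * 250
for passage in split_into_passages("doc0", "Sample", sample):
    print(passage["id"], len(passage["text"].split()))
```

The overlap keeps sentences that straddle a window boundary retrievable from at least one passage, at the cost of a slightly larger index.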

## Initializing the Retriever
We initialize a DPR (Dense Passage Retrieval) model to embed the documents in the collection. Note: because we will later ask questions over this collection, the questions need to be embedded too, which is why the retriever takes both a query encoder and a passage encoder.

In [None]:
from primeqa.retrieve import DPR
from primeqa.embed import DocumentStore

# TODO: remove any unnecessary imports to keep this example simple
# Use FAISS as the vector database for the document store
document_store = DocumentStore(vector_db='FAISS')

# Declare the retriever
retriever = DPR(
    document_store=document_store,
    # TODO: change the encoders to NQ
    query_embedding_model="PrimeQA/XOR-TyDi_monolingual_DPR_qry_encoder",
    passage_embedding_model="PrimeQA/XOR-TyDi_monolingual_DPR_ctx_encoder",
    use_gpu=True,
    embed_title=True,
)
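
To illustrate what embedding both passages and questions buys you, here is a toy dense-retrieval sketch. It is not DPR: the hashed bag-of-words `embed` function is a stand-in for a neural encoder, it uses one encoder for both sides where DPR uses two, and the names `embed`, `top_k`, and `DIM` are hypothetical. Only the scoring idea (rank passages by the dot product between question and passage vectors) carries over.

```python
import hashlib
import math
from typing import List, Tuple

DIM = 64  # toy embedding size


def embed(text: str) -> List[float]:
    """Toy stand-in for a neural encoder: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for raw in text.lower().split():
        token = raw.strip(".,?!")
        if not token:
            continue
        # Map each token to a fixed bucket so equal tokens always align.
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def top_k(question: str, passages: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Rank passages by dot product with the question embedding."""
    q = embed(question)
    scored = [(sum(a * b for a, b in zip(q, embed(p))), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


passages = [
    "Area codes identify telephone numbering regions.",
    "Bananas are rich in potassium.",
]
for score, passage in top_k("What do area codes identify?", passages, k=1):
    print(f"{score:.2f}  {passage}")
```

In a real setup the passage vectors are computed once and stored in the index, so only the question needs to be embedded at query time.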

# Initialize the Reader 

In this step, you set up a generative LLM that can be prompted. The reader can be any generative model available on the HuggingFace model hub or an OpenAI model.

In [None]:
from primeqa.read import GenerativeReader

# Use a HuggingFace model as the reader
reader = GenerativeReader(model_type='HuggingFace', model_name='google/flan-ul2')
# Alternatively, set up an OpenAI generative reader (gpt-3.5-turbo and text-davinci-003 are supported):
# reader = GenerativeReader(model_type='openai', model_name='gpt-3.5-turbo', api_key='API KEY HERE')
reader.load()
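
Rather than pasting an API key into the notebook, you may prefer to read it from the environment. The sketch below builds the keyword arguments used in the cell above; `reader_kwargs` is a hypothetical helper, and the `OPENAI_API_KEY` variable name is an assumed convention rather than something this tutorial's reader is known to require.

```python
import os
from typing import Dict, Mapping, Optional


def reader_kwargs(use_openai: bool, env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the reader's keyword arguments without hard-coding secrets."""
    env = os.environ if env is None else env
    if not use_openai:
        return {"model_type": "HuggingFace", "model_name": "google/flan-ul2"}
    api_key = env.get("OPENAI_API_KEY")  # assumed variable name
    if not api_key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first")
    return {"model_type": "openai", "model_name": "gpt-3.5-turbo", "api_key": api_key}


print(reader_kwargs(use_openai=False))
```

The result could then be unpacked into the constructor, for example `GenerativeReader(**reader_kwargs(use_openai=True))`, assuming the constructor accepts exactly the arguments shown in the cell above.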

# Set up the RAG pipeline

Attach the retriever to the generative LLM. You can then prompt it to answer questions.

In [None]:
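# RAG is assumed to have been imported from PrimeQA already; the import is not shown in this tutorial.
pipeline = RAG(retriever, reader)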
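
Conceptually, attaching a retriever to a reader means running them in sequence for each question. The class below is a hypothetical stand-in written for illustration, with toy retriever and reader functions; it does not show how PrimeQA's `RAG` is implemented, and the shape of the returned dictionaries is an assumption.

```python
from typing import Callable, Dict, List


class ToyRAG:
    """Hypothetical stand-in showing how a retriever and a reader can be chained."""

    def __init__(
        self,
        retrieve: Callable[[str], List[str]],
        read: Callable[[str, List[str], str], str],
    ) -> None:
        self.retrieve = retrieve
        self.read = read

    def run(self, questions: List[str], prefix: str = "") -> List[Dict[str, object]]:
        results: List[Dict[str, object]] = []
        for question in questions:
            passages = self.retrieve(question)              # step 1: fetch evidence
            answer = self.read(question, passages, prefix)  # step 2: generate an answer
            results.append({"question": question, "passages": passages, "answer": answer})
        return results


def toy_retrieve(question: str) -> List[str]:
    """Return every corpus passage that shares at least one word with the question."""
    corpus = [
        "Paris is the capital of France.",
        "Water boils at 100 degrees Celsius at sea level.",
    ]
    terms = set(question.lower().split())
    return [p for p in corpus if terms & set(p.lower().split())]


def toy_read(question: str, passages: List[str], prefix: str) -> str:
    """Stand-in for an LLM: report what it was given instead of generating text."""
    return f"{prefix} | {len(passages)} passage(s) for: {question}"


toy_pipeline = ToyRAG(toy_retrieve, toy_read)
print(toy_pipeline.run(["What is the capital of France?"], prefix="Answer from the text."))
```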

# Start asking questions

We run the pipeline we just created and pass it a prompt prefix.

In [None]:
import json

questions = ["How many area codes are in New Hampshire?"]
prompt_prefix = "Answer the following question after looking at the text."
answers = pipeline.run(questions, prefix=prompt_prefix)
print(json.dumps(answers, indent=4))
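
To give a feel for how a prompt prefix might be combined with the retrieved text, here is one possible template. The layout (prefix, numbered passages, then the question) and the name `build_prompt` are assumptions for illustration; the tutorial does not specify how the pipeline assembles its prompt internally.

```python
from typing import List


def build_prompt(prefix: str, question: str, passages: List[str]) -> str:
    """Assemble one prompt string: instruction, numbered evidence, then the question."""
    lines = [prefix.strip(), ""]
    for i, passage in enumerate(passages, start=1):
        lines.append(f"[{i}] {passage}")
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)


print(build_prompt(
    "Answer the following question after looking at the text.",
    "How many area codes are in New Hampshire?",
    ["(retrieved passage 1 goes here)", "(retrieved passage 2 goes here)"],
))
```

Changing the prefix is the simplest way to tune the answers, for example by asking for a short answer or for "Unanswerable" when the text does not contain one.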