# Tutorial: Generative QA with Retrieval Augmented Generation

In this tutorial, you'll learn how to run generative question answering by connecting a retriever to a generative LLM. You'll also learn how to use a prompt to steer the answers the model generates. Ideally, the system should respond with something like "Unanswerable" when the retrieved passages contain no evidence.

You can adapt this tutorial to most models on the HuggingFace model hub as well as OpenAI LLMs. Some supported models include:
 - FLAN UL2-20B
 - FLAN T5
 - OpenAI ChatGPT (gpt-3.5-turbo)
 - InstructGPT (text-davinci-003)
 - and many more
 
 
 
## Step 0: Prepare a Colab Environment to run this tutorial on GPUs
Make sure to "Enable GPU Runtime" by following [these instructions](https://drive.google.com/file/d/1jhE8CkieQXoW0gvz9IherTDdJY54Q4Yz/view?usp=sharing). With a GPU runtime the tutorial runs faster.

## Step 1: Install PrimeQA
First, install the PrimeQA package.

In [1]:
! pip install --upgrade primeqa
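
After the install cell finishes, it can help to confirm that the package is visible to the current runtime before moving on. The snippet below is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of PrimeQA.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the import path."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("primeqa"):
    # In Colab, a runtime restart is sometimes needed after installing packages.
    print("primeqa not found; re-run the install cell or restart the runtime.")
```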



## Step 2: Initialize the Retriever

### Pre-process your document collection so it can be stored in your Neural Search Index.
In this step we download a publicly available .csv file from a Google Drive location and save it as .tsv.

In [3]:
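# Save your input documents as a .tsv file
import pandas as pd

# Turn the Google Drive share link into a direct-download link
url = 'https://drive.google.com/file/d/1LULJRPgN_hfuI2kG-wH4FUwXCCdDh9zh/view?usp=sharing'
url = 'https://drive.google.com/uc?id=' + url.split('/')[-2]
df = pd.read_csv(url)
# The DataFrame index is written as the first (id) column, followed by text and title
df.to_csv('input.tsv', sep='\t', columns=['text', 'title'])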
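
The cell above rewrites a Google Drive share link into a direct-download link by taking the second-to-last path segment as the file id. The same idea can be sketched as a small reusable function; `drive_share_to_download` is a hypothetical helper, and it assumes the link has the `.../file/d/<id>/...` shape used in this tutorial.

```python
def drive_share_to_download(share_url: str) -> str:
    """Convert a '.../file/d/<id>/view' share link into a direct-download URL."""
    parts = share_url.split("/")
    # The file id is the path segment that follows the "d" segment.
    file_id = parts[parts.index("d") + 1]
    return "https://drive.google.com/uc?id=" + file_id


share = "https://drive.google.com/file/d/1LULJRPgN_hfuI2kG-wH4FUwXCCdDh9zh/view?usp=sharing"
print(drive_share_to_download(share))
# https://drive.google.com/uc?id=1LULJRPgN_hfuI2kG-wH4FUwXCCdDh9zh
```

Looking up the segment after `d` rather than counting from the end keeps the helper working if the link has no trailing `/view` part.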

In [4]:
df

Unnamed: 0,title,text
0,"""Albert Einstein""",to Einstein in 1922. Footnotes Citations Alber...
1,"""Albert Einstein""",Albert Einstein Albert Einstein (; ; 14 March ...
2,"""Albert Einstein""",observations were published in the internation...
3,"""Albert Einstein""",model for depictions of mad scientists and abs...
4,"""Alfred Nobel""",was adopted as the standard technology for min...
...,...,...
70,"""The Ashes""","In England and Wales, the grounds used are: Ol..."
71,"""The Ashes""",1978–79; 1981; 1985; 1989; 1993 and 1997). Aus...
72,"""The Ashes""","Ground (MCG) (1876–77), and the Sydney Cricket..."
73,"""The Ashes""",therefore will not host an Ashes Test until at...


### Initialize the model

In PrimeQA, the SearchableCorpus class is used to search through your corpus.

For DPR, you need to point to question and context encoder models available via the HuggingFace model hub.

In [5]:
from primeqa.components import SearchableCorpus
retriever = SearchableCorpus(context_encoder_name_or_path="PrimeQA/XOR-TyDi_monolingual_DPR_ctx_encoder",
                             query_encoder_name_or_path="PrimeQA/XOR-TyDi_monolingual_DPR_qry_encoder",
                             batch_size=64, top_k=10)

{"time":"2023-11-17 15:02:15,508", "name": "faiss.loader", "level": "INFO", "message": "Loading faiss with AVX2 support."}
{"time":"2023-11-17 15:02:15,525", "name": "faiss.loader", "level": "INFO", "message": "Successfully loaded faiss with AVX2 support."}
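
To make the role of the two encoders concrete: in DPR, the question encoder and the context encoder each map text to a vector, and passages are ranked by the dot product between the question vector and each passage vector. The toy sketch below illustrates only that ranking step. The three-dimensional vectors are made up for illustration and `top_k_passages` is a hypothetical helper, not PrimeQA's implementation.

```python
from typing import Dict, List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_passages(
    query_vec: List[float], passage_vecs: Dict[str, List[float]], k: int
) -> List[Tuple[str, float]]:
    """Score every passage against the query and return the k best (id, score) pairs."""
    scored = [(pid, dot(query_vec, vec)) for pid, vec in passage_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings standing in for the outputs of the two encoders.
passage_vecs = {
    "p0": [0.9, 0.1, 0.0],
    "p1": [0.1, 0.8, 0.1],
    "p2": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.0, 0.0]

ranked = top_k_passages(query_vec, passage_vecs, k=2)
print([pid for pid, _ in ranked])  # ['p0', 'p1']
```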


### Add your documents into the searchable corpus.

The input.tsv file can now be added to the searchable corpus. DPR expects the following tab-separated format:

`id \t text \t title_of_document`

Note: DPR is based on an encoder language model whose maximum sequence length is typically 512 sub-word tokens. Make sure your documents are split into passages of roughly 220 words.
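
One simple way to meet the ~220-word guideline is to split each document on whitespace and group the words into fixed-size passages. This is a minimal sketch under that assumption. It ignores sentence boundaries, and `split_into_passages` is a hypothetical helper rather than a PrimeQA function.

```python
from typing import List


def split_into_passages(text: str, max_words: int = 220) -> List[str]:
    """Split text into consecutive passages of at most max_words words each."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# A synthetic 500-word document, used only to show the passage sizes.
sample = " ".join(f"w{i}" for i in range(500))
chunks = split_into_passages(sample)
print([len(c.split()) for c in chunks])  # [220, 220, 60]
```

Each resulting passage would then become one row of the .tsv file, with the title repeated for every passage of the same document.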

In [6]:
retriever.add_documents("input.tsv")

{"time":"2023-11-17 15:03:25,360", "name": "primeqa.ir.dense.dpr_top.dpr.index_simple_corpus", "level": "INFO", "message": "wrote passages_1_of_1.json.gz.records in 0 seconds"}
{"time":"2023-11-17 15:03:25,361", "name": "primeqa.ir.dense.dpr_top.dpr.faiss_index", "level": "INFO", "message": "building index, reading data from dpr_index_dir/passages_1_of_1.json.gz.records, writing to dpr_index_dir/index_1_of_1.faiss"}
{"time":"2023-11-17 15:03:25,364", "name": "primeqa.ir.dense.dpr_top.dpr.faiss_index", "level": "INFO", "message": "processed 0 passages"}
{"time":"2023-11-17 15:03:25,370", "name": "primeqa.ir.dense.dpr_top.dpr.faiss_index", "level": "INFO", "message": "calling index.add with 76 vectors"}
{"time":"2023-11-17 15:03:25,373", "name": "primeqa.ir.dense.dpr_top.dpr.faiss_index", "level": "INFO", "message": "processed 76 passages"}
{"time":"2023-11-17 15:03:25,374", "name": "primeqa.ir.dense.dpr_top.dpr.faiss_index", "level": "INFO", "message": "finished building index, writing 

{"time":"2023-11-17 15:04:11,168", "name": "primeqa.ir.dense.dpr_top.dpr.searcher", "level": "INFO", "message": "Using sharded faiss, reading shards from dpr_index_dir"}
{"time":"2023-11-17 15:04:11,169", "name": "primeqa.ir.dense.dpr_top.dpr.searcher", "level": "INFO", "message": "Reading passages_1_of_1.json.gz.records"}
{"time":"2023-11-17 15:04:11,171", "name": "primeqa.ir.dense.dpr_top.dpr.searcher", "level": "INFO", "message": "Using sharded faiss with 1 shards."}


## Step 3: Initialize the Reader 

In this step you set up a promptable generative LLM as the reader. The reader can be any generative model available on the HuggingFace model hub or an OpenAI model.

In [7]:
from primeqa.components import GenerativeReader

reader = GenerativeReader(model_type='HuggingFace', model_name='google/flan-t5-small')
# To set up an OpenAI generative reader instead (gpt-3.5-turbo and text-davinci-003 are supported):
# reader = GenerativeReader(model_type='OpenAI', model_name='gpt-3.5-turbo', api_key='API KEY HERE')
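
Rather than pasting an API key into the notebook, one option is to read it from an environment variable. The sketch below assumes the key is stored under `OPENAI_API_KEY`, which is a common convention and not something PrimeQA requires; `get_api_key` is a hypothetical helper.

```python
import os


def get_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Read an API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(var_name, "")
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before creating the reader."
        )
    return key


# Usage with the OpenAI reader from the cell above (left commented out here):
# reader = GenerativeReader(model_type='OpenAI', model_name='gpt-3.5-turbo', api_key=get_api_key())
```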

## Step 4: Setup the RAG pipeline

Attach a retriever to a generative LLM. You can then prompt it to answer questions.

In [8]:
from primeqa.pipelines import RAG
pipeline = RAG(retriever, reader)
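
Conceptually, a retrieve-and-generate pipeline fetches passages for each question, places them in a prompt, and asks the generative model to answer. The sketch below shows that control flow with stand-in functions. It is not PrimeQA's implementation: `rag_sketch`, `fake_retrieve`, and `fake_generate` are hypothetical, and the prompt layout is an assumption. The record keys mirror the columns displayed in Step 5.

```python
from typing import Callable, Dict, List


def rag_sketch(
    questions: List[str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
    prefix: str = "",
) -> List[Dict]:
    """For each question: retrieve passages, build a prompt, generate an answer."""
    results = []
    for question in questions:
        passages = retrieve(question)
        # Assumed prompt layout: prefix, then passages, then the question.
        prompt = "\n".join([prefix, *passages, f"Question: {question}", "Answer:"])
        results.append(
            {"question": question, "answer": generate(prompt), "passages": passages}
        )
    return results


def fake_retrieve(question: str) -> List[str]:
    """Stand-in retriever that returns one placeholder passage."""
    return ["Passage about " + question]


def fake_generate(prompt: str) -> str:
    """Stand-in generator that only reports the prompt size."""
    return f"(generated from {len(prompt)} characters)"


records = rag_sketch(["q1", "q2"], fake_retrieve, fake_generate, prefix="Answer the question.")
print(records[0]["passages"])  # ['Passage about q1']
```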

## Step 5: Start asking questions

Run the pipeline we just created, passing the questions together with a prompt prefix.

In [9]:
questions = ['When was Idaho split in two?', 'Who was Danny Nozel']
prompt_prefix = "Answer the following question after looking at the text."

answers = pipeline.run(questions, prefix=prompt_prefix)

Token indices sequence length is longer than the specified maximum sequence length for this model (1449 > 512). Running this sequence through the model will result in indexing errors
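
The warning above reports a prompt of 1449 tokens against a 512-token limit, which suggests the retrieved passages together exceed what this model can read. One possible mitigation is to keep only as many top-ranked passages as fit a budget, or to create the retriever with a smaller `top_k`. The sketch below uses word counts as a rough proxy for tokens, which is an assumption; `fit_passages` is a hypothetical helper, not a PrimeQA function.

```python
from typing import List


def fit_passages(passages: List[str], max_words: int) -> List[str]:
    """Keep passages in rank order until adding the next one would exceed max_words."""
    kept: List[str] = []
    used = 0
    for passage in passages:
        n_words = len(passage.split())
        if used + n_words > max_words:
            break
        kept.append(passage)
        used += n_words
    return kept


# Three synthetic 200-word passages, best-ranked first.
ranked_passages = ["a " * 200, "b " * 200, "c " * 200]
print(len(fit_passages(ranked_passages, max_words=450)))  # 2
```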


In [10]:
import pandas as pd
from IPython.display import display, HTML

output = pd.DataFrame.from_records(answers)
display(HTML(output.to_html()))

Unnamed: 0,question,answer,passages
0,When was Idaho split in two?,"American Citizens"" and ""American Citizens"" were the same. The Treaty of Peace and Amity of September 5, 1795, between the United States and the Barbary States contains the usages ""the United States of North America"", ""citizens of the United States"" and ""American Citizens"" respectively. The Treaty of Peace and Amity of September 5, 1795, between the United States and the Barbary States contains the usages ""the United States of North America"", ""citizens of the United States"" and ""American Citizens""","[Passage: 14, 1784. Copies were sent back to Europe for ratification by the other parties involved, the first reaching France in March 1784. British ratification occurred on April 9, 1784, and the ratified vers..., Passage: been achieved and that Reconstruction should end. They ran a presidential ticket in 1872 but were decisively defeated. In 1874, Democrats, primarily Southern, took control of Congress and opposed any ..., Passage: signed the Treaty of Paris in which Great Britain agreed to recognize the sovereignty of the United States and formally end the war. French involvement had proven decisive, but France made few gains a...]"
1,Who was Danny Nozel,Alfred's brother Ludvig. Alfred's brother Ludvig died while visiting Cannes and a French newspaper erroneously published Alfred's obituary. Alfred's brother Ludvig died while visiting Cannes and a French newspaper erroneously published Alfred's obituary. Alfred's brother Ludvig died while visiting Cannes and a French newspaper erroneously published Alfred's obituary.,"[Passage: Jack London's novel and starring Ethan Hawke, was filmed in and around Haines. Steven Seagal's 1994 ""On Deadly Ground"", starring Michael Caine, was filmed in part at the Worthington Glacier near Valde..., Passage: status as: ""I am not one of those hyphenated Americans who claim allegiance to two countries."" Despite this declaration, Bell has been proudly claimed as a ""native son"" by all three countries he resid..., Passage: react with carbon dioxide to form the alkali metal carbonate and oxygen gas, which allows them to be used in submarine air purifiers; the presence of water vapour, naturally present in breath, makes t...]"


Congratulations 🎉✨🎊🥳 !! You can now perform retrieval-augmented generation (RAG) with PrimeQA!