# Generative QA with "Retrieval-Augmented Generation"

> As of version 1.16, `RAGenerator` is deprecated in Haystack and will be removed entirely in v1.18. We recommend following the tutorial on [Creating a Generative QA Pipeline with PromptNode](https://haystack.deepset.ai/tutorials/22_pipeline_with_promptnode) instead.

While extractive QA highlights the span of text that answers a query,
generative QA can return a novel text answer that it has composed.
In this tutorial, you will learn how to set up a generative QA system using the
[RAG model](https://arxiv.org/abs/2005.11401), which conditions the
answer generator on a set of retrieved documents.
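To make "conditions the answer generator on a set of retrieved documents" more concrete, here is a toy sketch of the token-level mixing that the linked paper describes for its RAG-Token variant: the probability of each next token is a weighted average over the retrieved documents, with each document's retrieval probability as the weight. The function name and all numbers are made up for illustration; this is not Haystack or model code.

```python
from typing import Dict, List


def marginal_next_token(
    doc_probs: List[float], token_probs_per_doc: List[Dict[str, float]]
) -> Dict[str, float]:
    """Mix per-document next-token distributions, weighted by retrieval probability."""
    mixed: Dict[str, float] = {}
    for p_doc, token_probs in zip(doc_probs, token_probs_per_doc):
        for token, p_token in token_probs.items():
            mixed[token] = mixed.get(token, 0.0) + p_doc * p_token
    return mixed


# Two retrieved documents with made-up retrieval probabilities.
doc_probs = [0.75, 0.25]
# Made-up next-token distributions, one per document.
token_probs_per_doc = [
    {"paris": 0.5, "london": 0.5},
    {"paris": 1.0},
]

mixed = marginal_next_token(doc_probs, token_probs_per_doc)
print(mixed)  # {'paris': 0.625, 'london': 0.375}
```

A document the retriever scores highly therefore has more influence on every generated token than a weakly matching one.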


## Preparing the Colab Environment

- [Enable GPU Runtime](https://docs.haystack.deepset.ai/docs/enabling-gpu-acceleration#enabling-the-gpu-in-colab)


## Installing Haystack

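To start, let's install Haystack with `pip`. Because `RAGenerator` is scheduled for removal in v1.18 (see the note above), we install a release below that version:

In [1]:
%%bash

pip install --upgrade pip
pip install "farm-haystack[colab,faiss]<1.18"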



### Enabling Telemetry
Knowing you're using this tutorial helps us decide where to invest our efforts to build a better product, but you can always opt out by commenting out the following line. See [Telemetry](https://docs.haystack.deepset.ai/docs/telemetry) for more details.

In [2]:
from haystack.telemetry import tutorial_running

tutorial_running(7)

## Logging

Before importing Haystack, we configure how log messages are displayed and which log level is used.
Example log message:
INFO - haystack.utils.preprocessing -  Converting data/tutorial1/218_Olenna_Tyrell.txt
The default log level in `basicConfig` is WARNING, so the explicit `level` parameter is not strictly necessary, but spelling it out makes the level easy to change:

In [3]:
import logging

logging.basicConfig(format="%(levelname)s - %(name)s -  %(message)s", level=logging.WARNING)
logging.getLogger("haystack").setLevel(logging.INFO)
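To see how the two levels interact, the following standalone sketch uses only the standard library and captures log output in memory. It mirrors the format string from the cell above; the logger name `other_library` is a made-up placeholder.

```python
import io
import logging

# Capture log output in memory so the effect of the levels is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s - %(name)s -  %(message)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.WARNING)  # default threshold for every logger
logging.getLogger("haystack").setLevel(logging.INFO)  # override for haystack.*

# "haystack.utils.preprocessing" inherits INFO from its parent logger "haystack".
logging.getLogger("haystack.utils.preprocessing").info("Converting a file")
# "other_library" has no level of its own, so it falls back to the root level.
logging.getLogger("other_library").info("This message is dropped")

captured = stream.getvalue()
print(captured)
```

Only the message from the `haystack` hierarchy is printed; the INFO message from the other logger is filtered out by the WARNING threshold.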

## Fetching and Cleaning Documents

Let's download a CSV file containing some sample text and preprocess the data.


In [4]:
import pandas as pd
from haystack.utils import fetch_archive_from_http

# Download sample
doc_dir = "data/tutorial7/"
s3_url = "https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/small_generator_dataset.csv.zip"
fetch_archive_from_http(url=s3_url, output_dir=doc_dir)

# Create dataframe with columns "title" and "text"
df = pd.read_csv(f"{doc_dir}/small_generator_dataset.csv", sep=",")
# Minimal cleaning
df.fillna(value="", inplace=True)

print(df.head())

INFO:haystack.utils.import_utils:Fetching from https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/small_generator_dataset.csv.zip to 'data/tutorial7/'


               title  \
0  "Albert Einstein"   
1  "Albert Einstein"   
2  "Albert Einstein"   
3  "Albert Einstein"   
4     "Alfred Nobel"   

                                                                              text  
0  to Einstein in 1922. Footnotes Citations Albert Einstein Albert Einstein (; ...  
1  Albert Einstein Albert Einstein (; ; 14 March 1879 – 18 April 1955) was a Ge...  
2  observations were published in the international media, making Einstein worl...  
3  model for depictions of mad scientists and absent-minded professors; his exp...  
4  was adopted as the standard technology for mining in the "Age of Engineering...  


We can cast our data into Haystack `Document` objects.
Alternatively, we can just use dictionaries with "content" and "meta" fields.

In [5]:
from haystack import Document

# Use data to initialize Document objects
titles = list(df["title"].values)
texts = list(df["text"].values)
documents = []
for title, text in zip(titles, texts):
    documents.append(Document(content=text, meta={"name": title or ""}))
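For comparison, here is a minimal sketch of the dictionary alternative. It assumes the Haystack 1.x dictionary layout with a `content` key and a `meta` key; the sample rows are hypothetical stand-ins for the `(title, text)` pairs in the dataframe.

```python
# Hypothetical sample rows standing in for the (title, text) pairs in the dataframe.
rows = [
    ("Albert Einstein", "Albert Einstein was a German-born theoretical physicist."),
    ("", "A passage whose missing title was filled with an empty string."),
]

# Same structure as the Document objects above, expressed as plain dictionaries.
dict_documents = [
    {"content": text, "meta": {"name": title or ""}} for title, text in rows
]

print(dict_documents[0]["meta"]["name"])  # Albert Einstein
```

Using `Document` objects instead gives you attribute access and validation, while dictionaries are convenient when the data already comes from JSON or a dataframe.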

## Initializing the DocumentStore

Here we initialize the `FAISSDocumentStore`. We set `return_embedding` to `True` so that the Generator doesn't have to re-embed the retrieved documents.

In [6]:
from haystack.document_stores import FAISSDocumentStore

document_store = FAISSDocumentStore(faiss_index_factory_str="Flat", return_embedding=True)
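The `"Flat"` index factory string requests an exhaustive index: every stored vector is compared against the query, so results are exact rather than approximate. The sketch below illustrates that idea in plain Python, assuming inner-product similarity; `flat_search` and the tiny vectors are made up for illustration and are not part of FAISS or Haystack.

```python
from typing import List, Tuple


def flat_search(
    query: List[float], vectors: List[List[float]], top_k: int
) -> List[Tuple[int, float]]:
    """Score every stored vector against the query and return the top_k (index, score) pairs."""
    scores = [
        (i, sum(q * v for q, v in zip(query, vec))) for i, vec in enumerate(vectors)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


# Made-up 3-dimensional embeddings; real DPR embeddings are much larger.
stored = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
hits = flat_search([1.0, 2.0, 0.0], stored, top_k=2)
print(hits)  # [(1, 1.5), (0, 1.0)]
```

Exhaustive search is a reasonable choice for a dataset as small as this one; approximate index types trade some accuracy for speed on larger collections.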

## Initializing the Retriever

We initialize a `DensePassageRetriever`, which encodes the documents and the question into embeddings and then queries the DocumentStore for the documents closest to the question.

In [7]:
from haystack.nodes import RAGenerator, DensePassageRetriever

retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
    use_gpu=True,
    embed_title=True,
)

INFO:haystack.modeling.utils:Using devices: CUDA:0 - Number of GPUs: 1




## Initializing the Generator

We initialize RAGenerator to generate answers from retrieved Documents.

In [8]:
generator = RAGenerator(
    model_name_or_path="facebook/rag-token-nq",
    use_gpu=True,
    top_k=1,
    max_length=200,
    min_length=2,
    embed_title=True,
    num_beams=2,
)

INFO:haystack.modeling.utils:Using devices: CUDA:0 - Number of GPUs: 1



## Writing Documents

We write the documents to the DocumentStore: first we delete any existing documents, then we call `write_documents()`.
The `update_embeddings()` method then uses the retriever to create an embedding for each document.


In [9]:
# Delete existing documents in documents store
document_store.delete_documents()

# Write documents to document store
document_store.write_documents(documents)

# Add documents embeddings to index
document_store.update_embeddings(retriever=retriever)


INFO:haystack.document_stores.faiss:Updating embeddings for 68 docs...
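The two-step flow described above (store the text first, compute the vectors afterwards) can be sketched with a toy in-memory store. `ToyStore` and `toy_embed` are hypothetical stand-ins, not Haystack classes, and the "embedding" is deliberately trivial.

```python
from typing import Callable, Dict, List


class ToyStore:
    """Hypothetical in-memory stand-in for a document store; not a Haystack class."""

    def __init__(self) -> None:
        self.docs: List[Dict] = []

    def delete_documents(self) -> None:
        self.docs.clear()

    def write_documents(self, documents: List[Dict]) -> None:
        # Writing stores the text only; no embedding exists yet.
        self.docs.extend({**doc, "embedding": None} for doc in documents)

    def update_embeddings(self, embed: Callable[[str], List[float]]) -> None:
        # A second pass computes one vector per stored document.
        for doc in self.docs:
            doc["embedding"] = embed(doc["content"])


def toy_embed(text: str) -> List[float]:
    """Made-up embedding: [number of words, number of characters]."""
    return [float(len(text.split())), float(len(text))]


store = ToyStore()
store.delete_documents()
store.write_documents([{"content": "Alfred Nobel invented dynamite."}])
missing_before = sum(doc["embedding"] is None for doc in store.docs)
store.update_embeddings(toy_embed)
missing_after = sum(doc["embedding"] is None for doc in store.docs)
print(missing_before, missing_after)  # 1 0
```

Separating the two steps means the embeddings can be recomputed later, for example after switching to a different retriever model, without writing the documents again.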



## Initializing the Pipeline

With a Haystack `Pipeline`, you can combine your building blocks into a search pipeline.
Under the hood, `Pipelines` are Directed Acyclic Graphs (DAGs) that you can easily customize for your own use cases.
To speed things up, Haystack also comes with a few predefined Pipelines. One of them is the `GenerativeQAPipeline`, which combines a retriever and a generator to answer our questions.
You can learn more about `Pipelines` in the [docs](https://docs.haystack.deepset.ai/docs/pipelines).

In [10]:
from haystack.pipelines import GenerativeQAPipeline

pipe = GenerativeQAPipeline(generator=generator, retriever=retriever)
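As a rough mental model of this two-node graph, the sketch below chains a toy retriever and a toy generator and routes per-node parameters by node name, in the same `{"Retriever": {...}, "Generator": {...}}` shape that the `run()` call in this tutorial uses. All classes and the `run_pipeline` function are hypothetical and greatly simplified; they are not Haystack's implementation.

```python
from typing import Any, Dict, List


class ToyRetriever:
    """Hypothetical retriever: ranks documents by word overlap with the query."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = documents

    def run(self, query: str, top_k: int = 10) -> List[str]:
        query_words = set(query.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda doc: len(query_words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


class ToyGenerator:
    """Hypothetical generator: 'answers' by echoing the best documents it was given."""

    def run(self, query: str, documents: List[str], top_k: int = 1) -> List[str]:
        return documents[:top_k]


def run_pipeline(
    query: str,
    retriever: ToyRetriever,
    generator: ToyGenerator,
    params: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    # Edge 1: Query -> Retriever. Edge 2: Retriever -> Generator.
    documents = retriever.run(query, **params.get("Retriever", {}))
    answers = generator.run(query, documents, **params.get("Generator", {}))
    return {"query": query, "documents": documents, "answers": answers}


corpus = [
    "Alfred Nobel established the Nobel Prize",
    "Old Trafford is a football stadium",
    "Pandas eat bamboo",
]
result = run_pipeline(
    "who established the nobel prize",
    ToyRetriever(corpus),
    ToyGenerator(),
    params={"Retriever": {"top_k": 2}, "Generator": {"top_k": 1}},
)
print(result["answers"])  # ['Alfred Nobel established the Nobel Prize']
```

The retriever's `top_k` controls how many documents reach the generator, while the generator's `top_k` controls how many answers come back.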

## Asking a Question

Now let's ask our system some questions!
The Retriever will pick out a small subset of documents that it finds relevant.
These are used to condition the Generator as it generates the answer.
What it returns should then be novel text spans that form an answer to your question!

In [11]:
from haystack.utils import print_answers

QUESTIONS = [
    "who got the first nobel prize in physics",
    "when is the next deadpool movie being released",
    "which mode is used for short wave broadcast service",
    "who is the owner of reading football club",
    "when is the next scandal episode coming out",
    "when is the last time the philadelphia won the superbowl",
    "what is the most current adobe flash player version",
    "how many episodes are there in dragon ball z",
    "what is the first step in the evolution of the eye",
    "where is gall bladder situated in human body",
    "what is the main mineral in lithium batteries",
    "who is the president of usa right now",
    "where do the greasers live in the outsiders",
    "panda is a national animal of which country",
    "what is the name of manchester united stadium",
]

for question in QUESTIONS:
    res = pipe.run(query=question, params={"Generator": {"top_k": 1}, "Retriever": {"top_k": 5}})
    print_answers(res, details="minimum")



'Query: who got the first nobel prize in physics'
'Answers:'
[{'answer': ' albert einstein'}]
'Query: when is the next deadpool movie being released'
'Answers:'
[{'answer': ' september 22, 2017'}]
'Query: which mode is used for short wave broadcast service'
'Answers:'
[{'answer': ' amplitude modulation'}]
'Query: who is the owner of reading football club'
'Answers:'
[{'answer': ' stefan persson'}]
'Query: when is the next scandal episode coming out'
'Answers:'
[{'answer': ' april 20, 2018'}]
'Query: when is the last time the philadelphia won the superbowl'
'Answers:'
[{'answer': ' the 1970s'}]
'Query: what is the most current adobe flash player version'
'Answers:'
[{'answer': ' 7.1. 2'}]
'Query: how many episodes are there in dragon ball z'
'Answers:'
[{'answer': ' 13'}]
'Query: what is the first step in the evolution of the eye'
'Answers:'
[{'answer': ' step by step'}]
'Query: where is gall bladder situated in human body'
'Answers:'
[{'answer': ' stomach'}]
'Query: what is the main mi

In [16]:
QUESTIONS = [
    "For the last 8 years of his life, Galileo was under house arrest for espousing this man's theory",
    "No. 2: 1912 Olympian; football star at Carlisle Indian School; 6 MLB seasons with the Reds, Giants & Braves",
    "The city of Yuma in this state has a record average of 4,055 hours of sunshine each year",
    "In 1963, live on ""The Art Linkletter Show"", this company served its billionth burger",
    "Signer of the Dec. of Indep., framer of the Constitution of Mass., second President of the United States",
    "In the title of an Aesop fable, this insect shared billing with a grasshopper",
    "Built in 312 B.C. to link Rome & the South of Italy, it's still in use today",
    "No. 8: 30 steals for the Birmingham Barons; 2,306 steals for the Bulls",
    "In the winter of 1971-72, a record 1,122 inches of snow fell at Rainier Paradise Ranger Station in this state",
    "This housewares store was named for the packaging its merchandise came in & was first displayed on"
]

for question in QUESTIONS:
    res = pipe.run(query=question, params={"Generator": {"top_k": 1}, "Retriever": {"top_k": 5}})
    print_answers(res, details="minimum")

('Query: For the last 8 years of his life, Galileo was under house arrest for '
 "espousing this man's theory")
'Answers:'
[{'answer': ' albert einstein'}]
('Query: No. 2: 1912 Olympian; football star at Carlisle Indian School; 6 MLB '
 'seasons with the Reds, Giants & Braves')
'Answers:'
[{'answer': ' nolan ryan'}]
('Query: The city of Yuma in this state has a record average of 4,055 hours of '
 'sunshine each year')
'Answers:'
[{'answer': ' azerbaijan'}]
('Query: In 1963, live on The Art Linkletter Show, this company served its '
 'billionth burger')
'Answers:'
[{'answer': ' ``'}]
('Query: Signer of the Dec. of Indep., framer of the Constitution of Mass., '
 'second President of the United States')
'Answers:'
[{'answer': ','}]
('Query: In the title of an Aesop fable, this insect shared billing with a '
 'grasshopper')
'Answers:'
[{'answer': ','}]
("Query: Built in 312 B.C. to link Rome & the South of Italy, it's still in "
 'use today')
'Answers:'
[{'answer': ' the colosseum'}]
'Quer