# Testing Haystack Pipelines

This notebook runs several experiments that test Haystack pipelines for information retrieval (IR) tasks on a corpus of documents.

In [1]:
!pip install farm-haystack==1.17.1
!pip install faiss-cpu==1.7.4

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [2]:
import os
from pprint import pprint
from haystack.pipelines.standard_pipelines import TextIndexingPipeline
from haystack.utils import (
    fetch_archive_from_http, print_documents, print_answers, convert_files_to_docs
)
from haystack.document_stores import InMemoryDocumentStore, FAISSDocumentStore
from haystack.nodes import (
    BM25Retriever, FARMReader, EmbeddingRetriever, TransformersSummarizer, 
    PDFToTextConverter, TextConverter, PreProcessor, 
    SentenceTransformersRanker, MultihopEmbeddingRetriever,
    PromptNode, PromptTemplate,
)
from haystack.pipelines import (
    Pipeline, ExtractiveQAPipeline, SearchSummarizationPipeline
)

# Data Ingestion

In [3]:
# fetch GoT data for testing
DOC_DIR = "data/got"
fetch_archive_from_http(
    url="https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt1.zip",
    output_dir=DOC_DIR
)


False

In [4]:
files_to_index = [DOC_DIR + "/" + f for f in os.listdir(DOC_DIR)]

# in-memory, bm25 document store
document_store = InMemoryDocumentStore(use_bm25=True)
indexing_pipeline = TextIndexingPipeline(document_store)
indexing_pipeline.run_batch(file_paths=files_to_index)

# in-memory, embedding document store
embed_document_store = InMemoryDocumentStore(embedding_dim=384)
embed_indexing_pipeline = TextIndexingPipeline(embed_document_store)
embed_indexing_pipeline.run_batch(file_paths=files_to_index)

# faiss document store with flat index
vector_store = FAISSDocumentStore(faiss_index_factory_str="Flat")


Converting files:   0%|          | 0/183 [00:00<?, ?it/s]

Preprocessing:   0%|          | 0/183 [00:00<?, ?docs/s]



Updating BM25 representation...:   0%|          | 0/2356 [00:00<?, ? docs/s]

Converting files:   0%|          | 0/183 [00:00<?, ?it/s]

Preprocessing:   0%|          | 0/183 [00:00<?, ?docs/s]



# Queries

These are the queries that will be used to evaluate the IR methods.

Q1 is a basic question with an answer that can be found directly in the source text.

Q2 is an advanced question that requires inference from the source text.

In [5]:
q1 = "Who is Arya Stark's father?"
q2 = "Why does Sansa Stark hate Joffrey Baratheon?"

## Document Search

Find documents related to the query, ranked by relevance.  This type of search does not answer the question; it only returns the documents most relevant to it.

### Experiment 1

Find relevant documents from keyword search.

Pipeline Components:
* Retriever = BM25 keyword search
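
As a rough illustration of what keyword scoring does, here is a minimal sketch of the textbook Okapi BM25 formula in plain Python. It is not Haystack's implementation: the whitespace tokenizer, the `k1` and `b` defaults, and the three toy documents are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; real retrievers use better tokenizers.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the Okapi BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenized for term in set(t))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus, made up for illustration.
docs = [
    "arya stark is the daughter of eddard stark",
    "joffrey baratheon is the king",
    "winterfell is the seat of house stark",
]
scores = bm25_scores("arya stark father", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

Note that the query term "father" matches nothing in the toy corpus, so it contributes no score. Keyword search can only reward documents that share literal terms with the query.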

In [6]:
def print_doc_results(q, res):
    print("Query:", q, "\n\n")
    for doc in res['documents']:
        print("Score:", doc.score, "\n\n", doc.content, "\n\n--------\n\n")


In [48]:
retriever = BM25Retriever(
    document_store=document_store,
    top_k=3,  # return the top k results
    all_terms_must_match=False,  # if True, only return documents that contain all of the terms
    scale_score=True  # scales relevancy score to (0,1)
)

p = Pipeline()
p.add_node(component=retriever, name="BM25Retriever", inputs=["Query"])


In [49]:
res = p.run(query=q1)
print_doc_results(q1, res)

Query: Who is Arya Stark's father? 


Score: 0.7863373287288284 

 She names her direwolf Lady; she is the smallest of the pack and the first to die, sentenced to death by Cersei after Arya's direwolf, Nymeria, bit a violent Joffrey.

===Arya Stark===
Maisie Williams

'''Arya Stark''' portrayed by Maisie Williams. Arya Stark of House Stark is the younger daughter and third child of Lord Eddard and Catelyn Stark of Winterfell. Ever the tomboy, Arya would rather be training to use weapons than sewing with a needle. She names her direwolf Nymeria, after a legendary warrior queen.

===Robb Stark===
Richard Madden

'''Robb Stark''' (seasons 1–3) portrayed by Richard Madden. Robb Stark of House Stark is the eldest son of Eddard and Catelyn Stark and the heir to Winterfell. His dire wolf is called Grey Wind. Robb becomes involved in the war against the Lannisters after his father, Ned Stark, is arrested for treason. Robb summons his bannermen for war against House Lannister and marches to the

In [38]:
res = p.run(query=q2)
print_doc_results(q2, res)

Query: Why does Sansa Stark hate Joffrey Baratheon? 


Score: 0.8889901526544065 

 ===In the Riverlands===
At the Stark army camp, Robb vows revenge on the Lannisters after Ned's death, but Catelyn says they must first rescue Arya and Sansa. The Starks followers now support Northern independence, proclaiming Robb the "King in the North", rather than support Stannis or Renly Baratheon, who have both claimed the Iron Throne. Jaime tells Catelyn he pushed Bran out of the tower window, but does not explain why.

At the Lannister army camp, Tywin, unable to sue for peace with the Starks after Ned's execution, orders Tyrion to go to King's Landing in his stead as "Hand of the King" to keep Joffrey under control. Against his father's orders, Tyrion brings Shae with him.

===In King's Landing===
Joffrey forces Sansa to look at Ned and his household staff's severed heads on spikes. When Sansa says she wishes to see Joffrey's head mounted there after Joffrey says Robb's head will be, Joffrey ha

### Experiment 2

Find relevant documents from keyword search, ranked by cross encoder logit.

Pipeline Components:
* Retriever = BM25 keyword search
* Ranker = sentence transformers cross encoder
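
A cross encoder produces an unbounded logit per (query, document) pair. The sketch below assumes that scaling to (0,1) is done with a logistic sigmoid; that is an assumption about the ranker's behavior, not something verified against the Haystack source, and the logit values are made up.

```python
import math


def sigmoid(logit):
    """Map a raw logit to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical raw cross-encoder logits for three candidate documents.
logits = [6.35, 0.0, -4.2]
scaled = [sigmoid(x) for x in logits]

# The mapping is monotonic, so ranking by scaled score equals ranking by logit.
order_raw = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
order_scaled = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
print(scaled, order_raw == order_scaled)
```

Under this assumption, scaling changes only the numeric range of the scores, not the order of the results, so ranker scores are not directly comparable with scaled BM25 scores.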

In [50]:
retriever = BM25Retriever(
    document_store=document_store,
    top_k=10,  # return the top k results
    all_terms_must_match=False,  # if True, only return documents that contain all of the terms
    scale_score=False  # no need to scale here if ranker does it instead
)
ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2",
    top_k=3,
    scale_score=True  # scales relevancy score to (0,1)
)

p = Pipeline()
p.add_node(component=retriever, name="BM25Retriever", inputs=["Query"])
p.add_node(component=ranker, name="Ranker", inputs=["BM25Retriever"])

In [51]:
res = p.run(query=q1)
print_doc_results(q1, res)

Query: Who is Arya Stark's father? 


Score: 0.9982544779777527 

 == Storylines ==
=== Novels ===
==== ''A Game of Thrones'' ====
Coat of arms of House Stark

Arya adopts a direwolf cub, which she names Nymeria after a legendary warrior queen. She travels with her father, Eddard, to King's Landing when he is made Hand of the King. Before she leaves, her half-brother Jon Snow has a smallsword made for her as a parting gift, which she names "Needle" after her least favorite ladylike activity.

While taking a walk together, Prince Joffrey and her sister Sansa happen upon Arya and her friend, the low-born butcher apprentice Mycah, sparring in the woods with broomsticks.  Arya defends Mycah from Joffrey's torments and her direwolf Nymeria helps Arya fight off Joffrey, wounding his arm in the process.  Knowing that Nymeria will likely be killed in retribution, Arya chases her wolf away; but Sansa's direwolf Lady is killed in Nymeria's stead and Mycah is hunted down and killed by Sandor Cleg

In [41]:
res = p.run(query=q2)
print_doc_results(q2, res)

Query: Why does Sansa Stark hate Joffrey Baratheon? 


Score: 0.9976001381874084 

 Following Ned's initial resignation as Hand of the King, Sansa is devastated to hear she must return to Winterfell. She likens Joffrey to a lion and says he is nothing like Robert Baratheon. This statement inspires Ned to investigate the Baratheon family line, prompting him to realise that Cersei's children are bastards fathered by her twin brother Jaime Lannister, not Robert Baratheon.

Following Robert's death and Ned's arrest for treason, all Stark servants in King's Landing are executed. Cersei exhorts Sansa to write Robb and Catelyn, imploring them to swear fealty to Joffrey. At court, Sansa pleads for her father's life; all agree on the condition Ned confesses his treason and swears fealty. Sansa is present at the Great Sept of Baelor and is horrified when Joffrey orders Ned's execution, fainting as Ned is beheaded.

Grieving the death of her father, Sansa is forced by Joffrey to look upon the spi

### Experiment 3

Find relevant documents from semantic search.

Pipeline Components:
* Retriever = embedding model
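
Semantic search compares a query vector with document vectors instead of matching keywords. The following is a minimal sketch of that idea using cosine similarity. The tiny 3-dimensional vectors are made up, and cosine similarity is only one common choice; the similarity function the document store actually uses may differ (for example, dot product).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return (document index, score) pairs for the k most similar documents."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [(i, s) for s, i in scored[:k]]


# Made-up 3-dimensional embeddings; the store in this notebook expects 384 dimensions.
doc_vecs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.7, 0.6, 0.1]]
query_vec = [1.0, 0.2, 0.0]
print(top_k(query_vec, doc_vecs))
```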

In [53]:
retriever = EmbeddingRetriever(
    document_store=embed_document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    model_format="sentence_transformers",
    max_seq_len=512,  # context window for the model, text longer than this will be truncated
    top_k=10,  # return the top k results
    scale_score=False,  # return raw similarity scores; this pipeline has no ranker to rescale them
    use_gpu=True
)
embed_document_store.update_embeddings(retriever)

p = Pipeline()
p.add_node(component=retriever, name="EmbeddingRetriever", inputs=["Query"])

Updating Embedding:   0%|          | 0/2356 [00:00<?, ? docs/s]

Batches:   0%|          | 0/74 [00:00<?, ?it/s]

In [57]:
res = p.run(query=q1)
print_doc_results(q1, res)

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Query: Who is Arya Stark's father? 


Score: 0.7149572968482971 

 If I find out who the dad is and why I need to know who the dad is, then let Jon know quick.'"The scene reveals neither Lyanna Stark's son's name nor his father's, with the transition between the newborn's face and Jon Snow's visually conveying the identity. HBO released an infographic shortly after the episode aired, confirming Ned Stark as Jon's guardian, and Lyanna Stark and Rhaegar Targaryen as his parents.

In regards to Arya Stark's transformation following her time as a disciple of the Many-Faced God, Weiss noted in the "Inside the Episode" featurette, "We all see where she's coming from, she's seen so many atrocities. It's a worrisome narrative; she started as this tough and plucky girl and turned into someone who's capable of slitting a man's throat and smiling as she watches him as he bleeds out." 

--------


Score: 0.6798129081726074 

 Williams was nominated for a Primetime Emmy Award for Outstanding Suppor

In [58]:
res = p.run(query=q2)
print_doc_results(q2, res)

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Query: Why does Sansa Stark hate Joffrey Baratheon? 


Score: 0.7272138595581055 

 She has her hair dyed dark brown later on while in the Vale, disguised as Alayne Stone, the bastard daughter of Petyr Baelish.  Sansa is 11 years old in ''A Game of Thrones'' and nearly 14 in ''A Feast for Crows''. Arguably the most naive of the Stark children at the start of the series, Sansa often finds herself used as a pawn in the machinations of the other characters. However, as the story progresses, she matures and becomes more of a player of the game rather than a pawn for other characters. She is the most beautiful woman in Westeros at the time of the events of "A Song of Ice and Fire".

==Storylines==
Coat of arms of House Stark

===''A Game of Thrones''===

Sansa Stark begins the novel by being betrothed to Crown Prince Joffrey Baratheon, believing Joffrey to be a gallant prince. While Joffrey and Sansa are walking through the woods, Joffrey notices Arya sparring with the butcher's boy, Mycah.

### Experiment 4

Find relevant documents from semantic search and rank the results.

Pipeline Components:
* Retriever = embedding model
* Ranker = sentence transformers cross encoder

In [63]:
# retriever = EmbeddingRetriever(
#     document_store=embed_document_store,
#     embedding_model="sentence-transformers/all-MiniLM-L6-v2",
#     model_format="sentence_transformers",
#     max_seq_len=512,  # context window for the model, text longer than this will be truncated
#     top_k=10,  # return the top k results
#     scale_score=False,  # no need to scale here if ranker does it instead
#     use_gpu=True
# )
# embed_document_store.update_embeddings(retriever)
ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2",
    top_k=3,
    scale_score=True  # scales relevancy score to (0,1)
)

p = Pipeline()
p.add_node(component=retriever, name="EmbeddingRetriever", inputs=["Query"])
p.add_node(component=ranker, name="Ranker", inputs=["EmbeddingRetriever"])

In [61]:
res = p.run(query=q1)
print_doc_results(q1, res)

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Query: Who is Arya Stark's father? 


Score: 0.9995865225791931 

 If I find out who the dad is and why I need to know who the dad is, then let Jon know quick.'"The scene reveals neither Lyanna Stark's son's name nor his father's, with the transition between the newborn's face and Jon Snow's visually conveying the identity. HBO released an infographic shortly after the episode aired, confirming Ned Stark as Jon's guardian, and Lyanna Stark and Rhaegar Targaryen as his parents.

In regards to Arya Stark's transformation following her time as a disciple of the Many-Faced God, Weiss noted in the "Inside the Episode" featurette, "We all see where she's coming from, she's seen so many atrocities. It's a worrisome narrative; she started as this tough and plucky girl and turned into someone who's capable of slitting a man's throat and smiling as she watches him as he bleeds out." 

--------


Score: 0.9982544779777527 

 == Storylines ==
=== Novels ===
==== ''A Game of Thrones'' ====
Coat of 

In [62]:
res = p.run(query=q2)
print_doc_results(q2, res)

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Query: Why does Sansa Stark hate Joffrey Baratheon? 


Score: 0.9969053864479065 

 == Storylines ==
Joffrey Baratheon's personal coat of arms

==== ''A Game of Thrones'' ====

Prince Joffrey is taken by his parents to Winterfell and is betrothed to Sansa Stark in order to create an alliance between House Baratheon and House Stark. At first, Joffrey is kind and polite to Sansa. However, he refuses to show sympathy with the family when Bran Stark falls from a tower, until physically forced to by his uncle, Tyrion Lannister. While on the Kingsroad to King's Landing, Joffrey and Sansa come across Arya Stark practicing swordplay with a commoner Mycah. Joffrey accuses Mycah of assaulting a noble girl and makes a cut on his face with a sword. This causes Arya to hit Joffrey, allowing Mycah to escape. When Joffrey then turns on Arya, her direwolf Nymeria attacks Joffrey, injuring him. Later, Joffrey lies about the attack, saying it was unprovoked and demands Nymeria to be killed; however, San

### Conclusions

Keyword search with ranking seems to produce the best qualitative results for these two questions.  Semantic search not only takes longer, but its results do not answer the questions directly.  For q1, documents that directly answer the question exist but are not ranked first.  For q2, the documents most helpful for finding the answer sit toward the bottom of the top 10.

<b>All future experiments will use BM25 + semantic ranker as the retriever, since it seems to perform best and is faster than full semantic search.</b>
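
The comparison above is qualitative. One way it could be made more concrete is reciprocal rank, which records how far down the list the first useful document appears. This is a sketch only: the ranked list is hypothetical and the relevance check is a hand-written stand-in for manual judgment.

```python
def reciprocal_rank(ranked_docs, is_relevant):
    """Return 1/rank of the first relevant document, or 0.0 if none is relevant."""
    for rank, doc in enumerate(ranked_docs, start=1):
        if is_relevant(doc):
            return 1.0 / rank
    return 0.0


# Hypothetical ranked results for q1; relevance is judged by hand in practice.
ranked = [
    "Lyanna Stark and Rhaegar Targaryen are Jon's parents",
    "Arya travels with her father, Eddard, to King's Landing",
    "Maisie Williams was nominated for an Emmy",
]
rr = reciprocal_rank(ranked, lambda doc: "her father, Eddard" in doc)
print(rr)
```

A value of 1.0 means the first result is useful; lower values mean the reader has to scan further down the list.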

## Extractive Question Answering

Answer questions using the info contained in the documents.  This type of search pulls the answer out of the context from documents.  It may use retrievers to find relevant documents for answering the question, possibly a ranker, and then a reader or summarizer to produce the answer.  

### Experiment 1

Extractive QA pipeline with keyword search + ranking.

Components:
* Retriever = BM25 keyword search
* Ranker = sentence transformers cross encoder
* Reader = RoBERTa fine-tuned on SQuAD 2.0


In [65]:
retriever = BM25Retriever(
    document_store=document_store,
    top_k=10,  # return the top k results
    all_terms_must_match=False,  # if True, only return documents that contain all of the terms
    scale_score=False  # no need to scale here if ranker does it instead
)
ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2",
    top_k=10,
    scale_score=True  # scales relevancy score to (0,1)
)
reader = FARMReader(
    model_name_or_path="deepset/roberta-base-squad2", 
    top_k=5, 
    use_gpu=True
)

p = Pipeline()
p.add_node(component=retriever, name="BM25Retriever", inputs=["Query"])
p.add_node(component=ranker, name="Ranker", inputs=["BM25Retriever"])
p.add_node(component=reader, name="Reader", inputs=["Ranker"])

In [66]:
res = p.run(query=q1)
print_answers(
    res,
    details="minimum"
)

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]

"Query: Who is Arya Stark's father?"
'Answers:'
[   {   'answer': 'Eddard',
        'context': 's Nymeria after a legendary warrior queen. She travels '
                   "with her father, Eddard, to King's Landing when he is made "
                   'Hand of the King. Before she leaves,'},
    {   'answer': 'Ned',
        'context': 'k in the television series.\n'
                   '\n'
                   '====Season 1====\n'
                   'Arya accompanies her father Ned and her sister Sansa to '
                   "King's Landing. Before their departure, Arya's h"},
    {   'answer': 'Ned Stark',
        'context': 'b becomes involved in the war against the Lannisters after '
                   'his father, Ned Stark, is arrested for treason. Robb '
                   'summons his bannermen for war against '},
    {   'answer': 'Lord Eddard Stark',
        'context': 'rk daughters.\n'
                   '\n'
                   'During the Tourney of the Hand to honour her fa

In [67]:
res = p.run(query=q2)
print_answers(
    res,
    details="minimum"
)

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]

'Query: Why does Sansa Stark hate Joffrey Baratheon?'
'Answers:'
[   {   'answer': 'likens Joffrey to a lion and says he is nothing like Robert '
                  'Baratheon',
        'context': ' hear she must return to Winterfell. She likens Joffrey to '
                   'a lion and says he is nothing like Robert Baratheon. This '
                   'statement inspires Ned to investi'},
    {   'answer': "she doesn't fit the narrow 'strong female character' mold "
                  "we're used to rooting for",
        'context': 'nsa "arguably gets a disproportionate amount of fan hate '
                   "because she doesn't fit the narrow 'strong female "
                   'character\' mold we\'re used to rooting for."'},
    {   'answer': 'Joffrey is so publicly monstrous',
        'context': 'reen,\n'
                   'Robinson explains, "it would have helped explain why '
                   'Joffrey is so publicly monstrous to his uncle at his '
                   'weddin


### Experiment 2

Extractive QA pipeline with multi-hop semantic search.

Components:
* Retriever = multihop embedding model
* Reader = RoBERTa fine-tuned on SQuAD 2.0

In [69]:
retriever = MultihopEmbeddingRetriever(
    document_store=embed_document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    top_k=10,  # return the top k results
    scale_score=False  # return raw similarity scores; this pipeline has no ranker
)
reader = FARMReader(
    model_name_or_path="deepset/roberta-base-squad2", 
    top_k=5, 
    use_gpu=True
)

p = Pipeline()
p.add_node(component=retriever, name="MultihopRetriever", inputs=["Query"])
p.add_node(component=reader, name="Reader", inputs=["MultihopRetriever"])




In [70]:
res = p.run(query=q1)
print_answers(
    res,
    details="minimum"
)

Querying:   0%|          | 0/1 [00:00<?, ?it/s]

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]

"Query: Who is Arya Stark's father?"
'Answers:'
[   {   'answer': 'Eddard and Catelyn Stark',
        'context': 'Background ===\n'
                   'Arya is the third child and younger daughter of Eddard and '
                   'Catelyn Stark and is nine years old at the beginning of '
                   'the book series.  Sh'},
    {   'answer': 'Rhaegar',
        'context': ' posted on the HBO-controlled website '
                   "MakingGameofThrones.com confirmed Rhaegar as Jon's father. "
                   'Journalists later commented on the significance of tw'},
    {   'answer': 'Eddard and Catelyn Stark',
        'context': 'Stark of House Stark is the first daughter and second '
                   'child of Eddard and Catelyn Stark. She was also the future '
                   'bride of Prince Joffrey, and thus the'},
    {   'answer': 'Jon Snow',
        'context': "unfinished business that needs to be resolved there. I'm "
                   "obviously not Jon Snow's 

In [71]:
res = p.run(query=q2)
print_answers(
    res,
    details="minimum"
)

Querying:   0%|          | 0/1 [00:00<?, ?it/s]

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]

'Query: Why does Sansa Stark hate Joffrey Baratheon?'
'Answers:'
[   {   'answer': 'she will fail to give Joffrey a male heir',
        'context': 's time passes, Sansa wears her hair like a southerner and '
                   'is more flippant with Mordane, expressing fears she will '
                   'fail to give Joffrey a male heir.\n'
                   '\n'},
    {   'answer': "cruelty and ignorance of the commoners' suffering",
        'context': 'requently orders his Kingsguard to beat Sansa. His cruelty '
                   "and ignorance of the commoners' suffering makes him "
                   'unpopular after he orders the City Watc'},
    {   'answer': 'hatred of the Lannisters',
        'context': 'hae form a friendship in which Sansa is able to vent about '
                   'her hatred of the Lannisters without fear of being '
                   'betrayed. Sansa is present when the roya'},
    {   'answer': 'spoiled, sadistic bully',
        'context': "998) and ''A

### Conclusions

The multi-hop retriever does an excellent job answering q2, which requires making inferences from the text.  It does this by finding relevant text in iteration 1 and including that text as context for the question in iteration 2.  For q1, however, it returns both of Arya's parents rather than answering the question directly.  The answer to q1 is stated directly in the text and requires no inference, so the extra hop appears to add little.
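
The two-iteration idea can be sketched in a few lines. This is a toy under stated assumptions, not the `MultihopEmbeddingRetriever` implementation: it swaps embeddings for simple token overlap, and the four documents are made up for illustration.

```python
def overlap_score(query, doc):
    """Count the distinct tokens shared by the query and the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve(query, docs, k=1, exclude=()):
    """Return the k best-scoring documents that are not already excluded."""
    candidates = [d for d in docs if d not in exclude]
    ranked = sorted(candidates, key=lambda d: overlap_score(query, d), reverse=True)
    return ranked[:k]


def multihop_retrieve(query, docs, hops=2, k=1):
    """Retrieve iteratively, feeding earlier hits back into the query."""
    found = []
    current_query = query
    for _ in range(hops):
        hits = retrieve(current_query, docs, k=k, exclude=found)
        found.extend(hits)
        # Append the retrieved text so the next hop can follow new entities.
        current_query = query + " " + " ".join(found)
    return found


docs = [
    "joffrey ordered ned beheaded while sansa watched",
    "ned was beheaded at the sept of baelor",
    "arya named her direwolf nymeria",
    "sansa named her direwolf lady",
]
query = "why does sansa hate joffrey"
print(multihop_retrieve(query, docs))
```

In this toy corpus the second document shares no words with the query, so a single pass would not surface it. It is found only once the first hit adds "ned" and "beheaded" to the query.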

Would a generative model be better at making inferences from relevant docs?

## Generative Question Answering

Answer questions by synthesizing the documents into an answer.  This type of search may use retrievers to find relevant documents for answering the question, and then a generative reader to produce the answer.

### Experiment 1

Generative QA pipeline with keyword search.

Components:
* Retriever = BM25 keyword search
* Ranker = sentence transformers cross encoder
* Generator = Flan T5 Large
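
Because the question sits at the end of the prompt, an over-long prompt risks losing the question to truncation. The sketch below shows one way to cap the retrieved text before prompting. The helper names are hypothetical, the template only mirrors the shape of the one used in this experiment, and counting whitespace-separated words is a rough stand-in for real model tokens. The 512 budget is taken from the model's maximum sequence length.

```python
def build_prompt(documents, query):
    """Fill a template shaped like the one used in this experiment."""
    related = " ".join(documents)
    return (
        "Use the related text to answer the question.\n\n"
        f"Related text: {related}\n\nQuestion: {query}\n\nAnswer:"
    )


def fit_documents(documents, query, max_tokens=512):
    """Keep the highest-ranked documents whose prompt stays within the budget.

    Whitespace-separated words are only a rough proxy for model tokens.
    """
    kept = []
    for doc in documents:
        candidate = build_prompt(kept + [doc], query)
        if len(candidate.split()) > max_tokens:
            break  # stop at the first document that would overflow the budget
        kept.append(doc)
    return kept


# Three placeholder documents of 300, 300, and 50 words.
docs = ["word " * 300, "word " * 300, "word " * 50]
question = "Who is Arya Stark's father?"
kept = fit_documents(docs, question)
print(len(kept))
```

A real implementation would count tokens with the generator's own tokenizer, since word counts can underestimate token counts considerably.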

In [9]:
retriever = BM25Retriever(
    document_store=document_store,
    top_k=10,  # return the top k results
    all_terms_must_match=False,  # if True, only return documents that contain all of the terms
    scale_score=False  # no need to scale here if ranker does it instead
)
ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2",
    top_k=1,  # keep this low enough to prevent prompt truncation, as the query comes at the very end
    scale_score=True  # scales relevancy score to (0,1)
)
lfqa_prompt = PromptTemplate(
    name="lfqa",
    prompt_text="""Use the related text to answer the question.  
    \n\n Related text: {join(documents)} \n\n Question: {query} \n\n Answer:""",
)
prompt_node = PromptNode(model_name_or_path="google/flan-t5-large", default_prompt_template=lfqa_prompt)


p = Pipeline()
p.add_node(component=retriever, name="BM25Retriever", inputs=["Query"])
p.add_node(component=ranker, name="Ranker", inputs=["BM25Retriever"])
p.add_node(component=prompt_node, name="PromptNode", inputs=["Ranker"])


In [10]:
res = p.run(query=q1)
print(res["results"])

['Eddard']


In [11]:
res = p.run(query=q2)
print(res["results"])

['he is nothing like Robert Baratheon']


### Experiment 2

Generative QA pipeline with multi-hop semantic search.

Components:
* Retriever = multihop embedding model
* Generator = Flan T5 Large

In [12]:
retriever = MultihopEmbeddingRetriever(
    document_store=embed_document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    model_format="sentence_transformers",
    top_k=2,  # return the top k results, needs to be short for the prompt not to get cut off
    scale_score=False  # return raw similarity scores; this pipeline has no ranker
)
embed_document_store.update_embeddings(retriever)
lfqa_prompt = PromptTemplate(
    name="lfqa",
    prompt_text="""Use the related text to answer the question.  
    \n\n Related text: {join(documents)} \n\n Question: {query} \n\n Answer:""",
)
prompt_node = PromptNode(model_name_or_path="google/flan-t5-large", default_prompt_template=lfqa_prompt)


p = Pipeline()
p.add_node(component=retriever, name="MultihopRetriever", inputs=["Query"])
p.add_node(component=prompt_node, name="PromptNode", inputs=["MultihopRetriever"])


Updating Embedding:   0%|          | 0/2356 [00:00<?, ? docs/s]

Batches:   0%|          | 0/74 [00:00<?, ?it/s]

In [13]:
res = p.run(query=q1)
print(res["results"])

Querying:   0%|          | 0/1 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Token indices sequence length is longer than the specified maximum sequence length for this model (578 > 512). Running this sequence through the model will result in indexing errors


["Jon Snow's"]


In [13]:
res = p.run(query=q2)
print(res["results"])

Querying:   0%|          | 0/1 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

['Sansa Stark hates Joffrey Baratheon because he is a bully.']


### Conclusions

Generative QA's performance is prompt dependent, and it seems better than extractive QA at questions that require inference.  Multi-hop + generative QA works well for answering inference questions, but it would be faster and just as accurate to answer extractive questions with BM25 + ranker.  How can we determine which pipeline to use for different types of questions?

## Search Summarization

Summarize the relevant documents.  This type of search may use retrievers to find relevant documents, possibly a ranker, and then an abstractive summarizer. 

### Experiment 1

Search Summarizer pipeline with keyword search.

Components:
* Retriever = BM25 keyword search
* Ranker = sentence transformers cross encoder
* Summarizer = DistilBART abstractive summarizer tuned on CNN/DailyMail

In [14]:
retriever = BM25Retriever(
    document_store=document_store,
    top_k=10,  # return the top k results
    all_terms_must_match=False,  # if True, only return documents that contain all of the terms
    scale_score=False  # no need to scale here if ranker does it instead
)
ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2",
    top_k=1,  # pass only the top-ranked document to the summarizer
    scale_score=True  # scales relevancy score to (0,1)
)
summarizer = TransformersSummarizer(model_name_or_path="sshleifer/distilbart-cnn-12-6")

p = Pipeline()
p.add_node(component=retriever, name="Retriever", inputs=["Query"])
p.add_node(component=ranker, name="Ranker", inputs=["Retriever"])
p.add_node(component=summarizer, name="Summarizer", inputs=["Ranker"])


In [15]:
def print_answer(preds, question):
    answer_text = preds['documents'][0].meta['summary']
    print(f"\nQuestion: {question}\n\nAnswer: {answer_text}\n-----\n")
    return


In [16]:
pred = p.run(query=q1)
print_answer(pred, q1)


Question: Who is Arya Stark's father?

Answer:  Arya adopts a direwolf cub, which she names Nymeria after a legendary warrior queen. She travels with her father, Eddard, to King's Landing when he is made Hand of the King.
-----



In [18]:
pred = p.run(query=q2)
print_answer(pred, q2)


Question: Why does Sansa Stark hate Joffrey Baratheon?

Answer:  Following Ned's initial resignation as Hand of the King, Sansa is devastated to hear she must return to Winterfell. She likens Joffrey to a lion and says he is nothing like Robert Baratheon.
-----



### Experiment 2

Search Summarizer pipeline with semantic search.

Components:
* Retriever = multihop embedding model
* Summarizer = DistilBART abstractive summarizer tuned on CNN/DailyMail

In [19]:
retriever = MultihopEmbeddingRetriever(
    document_store=embed_document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    model_format="sentence_transformers",
    top_k=2,  # return the top k results, kept small so the summarizer input stays short
    scale_score=False  # return raw similarity scores; this pipeline has no ranker
)
#embed_document_store.update_embeddings(retriever)
summarizer = TransformersSummarizer(model_name_or_path="sshleifer/distilbart-cnn-12-6")

p = Pipeline()
p.add_node(component=retriever, name="Retriever", inputs=["Query"])
p.add_node(component=summarizer, name="Summarizer", inputs=["Retriever"])


In [20]:
pred = p.run(query=q1)
print_answer(pred, q1)

Querying:   0%|          | 0/1 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Your max_length is set to 200, but your input_length is only 190. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=95)



Question: Who is Arya Stark's father?

Answer:  HBO released an infographic shortly after the episode aired, confirming Ned Stark and Rhaegar Targaryen as Jon's parents. Arya started as a tough girl and turned into someone who's capable of slitting a man's throat and smiling as he bleeds out.
-----



In [21]:
pred = p.run(query=q2)
print_answer(pred, q2)

Querying:   0%|          | 0/1 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Batches:   0%|          | 0/1 [00:00<?, ?it/s]


Question: Why does Sansa Stark hate Joffrey Baratheon?

Answer:  Sansa Stark is the most naive of the Stark children at the start of the series. She often finds herself used as a pawn in the machinations of the other characters. She is most beautiful woman in Westeros at the time of the events of "A Song of Ice and Fire"
-----



### Conclusions

The summarization pipeline is not as effective at answering questions as the QA pipelines.  It can still be useful for summarizing documents collected through a document search.

# Final Thoughts

Given that this was a qualitative, subjective evaluation on a limited dataset, I concluded the following:
* For document retrieval and relevancy scoring, BM25 + a cross-encoder ranking model gave the best balance of speed and relevancy, essentially combining a partial keyword match with semantic ranking.
* Summarization is more useful for condensing ranked results than for answering questions or exploring the data.
* Extractive QA with BM25 + ranking performed best on questions whose answers could be pulled directly out of the text.  Extractive QA with multi-hop semantic search performed well at identifying pieces of information that could be combined to answer complex questions that require inference from the text.
* Generative QA combined with multi-hop semantic search performed best on questions that required inference from the text, but it did poorly on easy questions whose answers appear verbatim in the text.  Generative QA with BM25 + ranking performed well for both types of questions.

It would be nice to have 2 capabilities:
1. Document search with BM25 + semantic ranking
2. QA search with BM25 + semantic ranking + generative LLM
    * Extractive QA for simple questions would be nice, but that requires a way to determine which questions are simple.
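
One possible starting point for deciding which questions are simple is a keyword heuristic on the first word of the question. This is a hypothetical sketch, not something evaluated in this notebook: the cue list and the pipeline labels are assumptions, and a trained query classifier would likely be more reliable.

```python
# Question openers assumed to signal that inference is needed (hypothetical list).
INFERENCE_CUES = ("why", "how", "explain", "compare")


def choose_pipeline(question):
    """Return a pipeline label based on the first word of the question."""
    words = question.strip().lower().split()
    first_word = words[0] if words else ""
    if first_word in INFERENCE_CUES:
        return "generative"
    return "extractive"


for q in ["Who is Arya Stark's father?", "Why does Sansa Stark hate Joffrey Baratheon?"]:
    print(q, "->", choose_pipeline(q))
```

With the two test queries from this notebook, q1 would be routed to the extractive pipeline and q2 to the generative one.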